Only retrieved the 50 most recent posts #39

Open
tjluoma opened this issue Mar 10, 2021 · 1 comment

tjluoma commented Mar 10, 2021

I did this:

export _GROUP="bbedit"
./crawler.sh -sh > curl.sh
/opt/homebrew/bin/bash curl.sh

It saved more than 4,000 files, but the 'mbox' folder only has 50 items in it, while https://groups.google.com/g/bbedit says there are 4,444 messages in the group.

Is there something else I need to do? This is a public Google group so I thought that what I did was sufficient.

Thanks!


icy commented Mar 12, 2021

@tjluoma Right, that's fine. Let me see whether I can reproduce your issue on my laptop. Thanks.


icy commented Mar 12, 2021

@tjluoma I gave it a try, and I have 759 messages in the mbox folder so far; the count is still growing.

$ pwd
/home/foo/projects/icy/google-group-crawler/bbedit/mbox

$ ls | wc -l
826
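To watch progress while the download runs, the count above can be wrapped in a small helper. This is only a sketch: `count_entries` and `report_progress` are hypothetical names, not part of google-group-crawler, and the expected total has to be supplied by hand (for example, the figure shown on the group page).

```shell
# Hypothetical helpers, not part of google-group-crawler.

# Print the number of entries directly inside a directory.
# Unlike `ls | wc -l`, this also counts dotfiles.
count_entries() {
  local dir="$1"
  find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d '[:space:]'
}

# Print how many entries exist compared to an expected total.
report_progress() {
  local dir="$1" expected="$2" have
  have="$(count_entries "$dir")"
  echo "${have} of ${expected} entries downloaded"
}
```

For the group in this issue, a call might look like `report_progress bbedit/mbox 4444`, assuming the crawler was started from the directory that contains `bbedit/`.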

I'd suggest turning on verbose output as shown below to see what your script is doing and whether it runs into any issues. You can also open the script curl.sh and adjust the curl options if necessary. Please let me know if your next attempt works better. Thanks.

$ /opt/homebrew/bin/bash -x curl.sh
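Because `bash -x` writes its trace to stderr, it can help to save that trace to a file and inspect it afterwards. The following is a minimal sketch; `run_traced` and `count_traced` are hypothetical helper names, and the log file name is an assumption.

```shell
# Hypothetical helpers, not part of google-group-crawler.

# Run a script with tracing enabled and save the trace (stderr) to a log file.
# With -x, bash prints each command, prefixed with "+", before running it.
run_traced() {
  local script="$1" log="$2"
  bash -x "$script" 2> "$log"
}

# Count the traced command lines in a log that mention a given word.
count_traced() {
  local word="$1" log="$2"
  grep -c -- "^[+].*${word}" "$log"
}
```

For example, `run_traced curl.sh trace.log` followed by `count_traced curl trace.log` would show how many traced commands mention curl. Note that the log also contains any error messages the script's commands write to stderr.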

icy pushed a commit that referenced this issue Mar 12, 2021
The previous output script downloaded things silently;
now we have more information to see what is going on.

Also fix a confusing message
icy added the question label May 6, 2021