Add overview slide to consider the big picture/alternate strategies #6

Open
aculich opened this issue Apr 19, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

aculich commented Apr 19, 2023

After debriefing this consult with @tomvannuenen:
#1816 Trouble bypassing captcha verification while doing data scraping

Let's think about how we can help people step back and consider the big picture: what they are ultimately trying to accomplish, and whether there are more effective strategies than scraping. For example:

  • check whether there is a simpler technical approach (e.g. use an API instead of scraping)
  • check whether an already-scraped database of the data is available (e.g. Reddit data on BigQuery)
  • check if there is a low-tech (contact by email or sneakernet) solution for acquiring the data easily/quickly in bulk
  • consider the legal/ethical implications of scraping (see Library Text Mining pages/workshops and CLTC Scraping for Research Purposes)
  • mention the purpose of captchas and pitfalls of attempting to circumvent them
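
To make the first bullet concrete for the slide, here is a minimal sketch of why an API is usually the simpler route. Everything in it is an assumption for illustration: the JSON payload, the HTML page, the `posts`/`title` field names, and the `h2 class="title"` markup are all made up and do not correspond to any real site or API. It only uses the Python standard library and never touches the network.

```python
import json
from html.parser import HTMLParser

# Hypothetical payloads: the same two post titles, once as an API-style JSON
# response and once as the HTML a scraper would have to parse.
API_RESPONSE = '{"posts": [{"title": "First post"}, {"title": "Second post"}]}'
HTML_PAGE = (
    '<html><body>'
    '<div class="post"><h2 class="title">First post</h2></div>'
    '<div class="post"><h2 class="title">Second post</h2></div>'
    '</body></html>'
)


def titles_from_api(payload: str) -> list[str]:
    """Read titles from structured JSON: one parse call and a key lookup."""
    return [post["title"] for post in json.loads(payload)["posts"]]


class TitleScraper(HTMLParser):
    """Collect the text of every <h2 class="title"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.titles: list[str] = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "h2" and ("class", "title") in attrs:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.titles.append(data.strip())


def titles_from_html(page: str) -> list[str]:
    """Read titles by scraping markup; depends on the page layout staying put."""
    scraper = TitleScraper()
    scraper.feed(page)
    scraper.close()
    return scraper.titles


if __name__ == "__main__":
    print(titles_from_api(API_RESPONSE))
    print(titles_from_html(HTML_PAGE))
```

Both functions recover the same titles, but the scraping path needs a custom parser tied to the page's markup, while the API path is a single `json.loads` call. That contrast may be enough for one overview slide.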

aculich added the enhancement (New feature or request) label Apr 19, 2023