
Attempt to bypass Google's CAPTCHA using a third party service #915

Open
wants to merge 12 commits into
base: main

Conversation

Moist-Cat
Contributor

refs #211

Using a proxy or making any "suspicious" request triggers Google's reCAPTCHA v2 (with callback). I'm using https://deathbycaptcha.com/ to try to solve it.
I tested it using a blacklisted proxy, and the integration seems to be working fine. The problem is that Google just sends another CAPTCHA after we solve the first one, so I'm not 100% sure whether I'm sending the token to the wrong endpoint or whether Google is rejecting my request for some other reason.
Manually solving the CAPTCHA gives the same result.
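
For readers following along, here is a minimal sketch of the pieces involved, not the code in this PR. It relies on two things that are standard for reCAPTCHA v2: the widget carries its site key in a `data-sitekey` attribute, and the solved token is submitted in a `g-recaptcha-response` form field. The helper names (`extract_site_key`, `build_payload`, `is_still_blocked`) are hypothetical, and treating any final URL under `/sorry/` as "still blocked" is an assumption based on the `https://google.com/sorry/index` endpoint mentioned in this thread.

```python
from html.parser import HTMLParser
from typing import Dict, Optional
from urllib.parse import urlparse


class _SiteKeyParser(HTMLParser):
    """Remembers the first data-sitekey attribute seen in the page."""

    def __init__(self) -> None:
        super().__init__()
        self.site_key: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Keep scanning until some tag carries a data-sitekey attribute.
        if self.site_key is None:
            self.site_key = dict(attrs).get("data-sitekey")


def extract_site_key(html: str) -> Optional[str]:
    """Return the reCAPTCHA site key embedded in the page, or None."""
    parser = _SiteKeyParser()
    parser.feed(html)
    return parser.site_key


def build_payload(token: str, hidden_fields: Dict[str, str]) -> Dict[str, str]:
    """Combine the page's hidden form fields with the solved token."""
    # reCAPTCHA v2 expects the token in the g-recaptcha-response field.
    return {**hidden_fields, "g-recaptcha-response": token}


def is_still_blocked(final_url: str) -> bool:
    """Assumption: ending up under /sorry/ means another challenge was served."""
    return urlparse(final_url).path.startswith("/sorry/")
```

A check like `is_still_blocked` on the URL reached after submitting the payload is one way to tell "token rejected" apart from "token accepted but rate limit still active", which is the open question here.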

Owner

@benbusby benbusby left a comment

Thanks for your effort on this. I'm personally not the biggest fan of CAPTCHA solving services, so I can't test this myself. But as long as it doesn't interfere with core Whoogle functionality (which it doesn't appear to) then I'm open to merging it. Are you still working on it though? From your comment it sounds like this integration doesn't actually solve the CAPTCHA, but maybe I misinterpreted something.

app/utils/captcha.py
@Moist-Cat
Contributor Author

Moist-Cat commented Jan 3, 2023

are you still working on it though? From your comment it sounds like this integration doesn't actually solve the CAPTCHA, but maybe I misinterpreted something.

It does solve the CAPTCHA and sends the response to Google's servers. I'm just not sure it lifts the rate limit, because Google sent another CAPTCHA as soon as I solved the first one. This happened both when testing by hand on the site and internally in the application, which is why I'm not convinced it solves the issue.
Feel free to hold off on merging until someone who is actually hitting the rate limit tests the integration and confirms it works. I tested it with a free proxy, so the results might not be the same.

Looks like this doesn't get used anywhere. Did you mean to raise this exception instead of the AttributeError later on?

No, I meant to use this in the solve function but in the end I decided to just return False whenever we failed to solve the CAPTCHA. I removed it.
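
The "return False on failure" approach described above could look roughly like the sketch below. This is an illustration under assumptions, not the function from this PR: `request_token` and `submit_token` are hypothetical callables standing in for the solving-service call and the submission to Google.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def solve(request_token: Callable[[], Optional[str]],
          submit_token: Callable[[str], bool]) -> bool:
    """Try to solve and submit a CAPTCHA; report failure as False, never raise."""
    try:
        # Hypothetical: ask the third-party service for a solved token.
        token = request_token()
    except Exception:
        logger.exception("CAPTCHA solving service failed")
        return False

    if not token:
        # The service answered but could not produce a token.
        return False

    try:
        # Hypothetical: send the token to Google and report acceptance.
        return bool(submit_token(token))
    except Exception:
        logger.exception("Submitting the CAPTCHA token failed")
        return False
```

Returning a plain boolean keeps the caller simple: a failed solve can fall through to the normal rate-limited response without a dedicated exception type.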

@dominickp
Contributor

I wonder if the CAPTCHA could be returned to the Whoogle user. If solving a few of these really does cause Google's rate limiting to back off a bit, that would be awesome.

@Moist-Cat
Contributor Author

Every site protected by Google's reCAPTCHA has a unique site key that needs to be posted in the form. If the host doesn't match the site key, the CAPTCHA won't load.
We could:

  • spoof the host by editing /etc/hosts to resolve google.com as 127.0.0.1
  • make a form to post the params to https://google.com/sorry/index and let the user solve the CAPTCHA

In theory this should work, but I'm not sure if spoofing the host would be desirable. We might have to give the process sudo privileges to edit the hosts file before and after the CAPTCHA is solved.
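
To make the second option concrete, here is a rough sketch of a form that embeds the reCAPTCHA widget and posts to `https://google.com/sorry/index`. The function name `render_captcha_form` and the `hidden_fields` mapping are hypothetical; the hidden fields would be whatever Google's block page supplied, and using POST follows the wording above rather than anything verified. As noted, the widget would only load if the page is served from a host that matches the site key, so this depends on the host spoofing in the first option.

```python
from html import escape
from typing import Mapping

# Endpoint taken from the discussion above.
SORRY_ENDPOINT = "https://google.com/sorry/index"
# Standard loader script for the reCAPTCHA v2 widget.
RECAPTCHA_JS = "https://www.google.com/recaptcha/api.js"


def render_captcha_form(site_key: str, hidden_fields: Mapping[str, str]) -> str:
    """Build an HTML form that lets the user solve the CAPTCHA themselves."""
    # Pass through the parameters scraped from the block page as hidden inputs.
    inputs = "\n".join(
        f'  <input type="hidden" name="{escape(name)}" value="{escape(value)}">'
        for name, value in hidden_fields.items()
    )
    return (
        f'<form action="{SORRY_ENDPOINT}" method="post">\n'
        f"{inputs}\n"
        f'  <div class="g-recaptcha" data-sitekey="{escape(site_key)}"></div>\n'
        '  <button type="submit">Submit</button>\n'
        "</form>\n"
        f'<script src="{RECAPTCHA_JS}" async defer></script>\n'
    )
```

The widget adds the solved token to the form as `g-recaptcha-response` on submit, so the form itself only needs to carry the site key and the passed-through parameters.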

@4194304

4194304 commented Sep 20, 2023

#707 and #763 are still happening, so I have to switch Whoogle instances every few days, which is insanely annoying.

Successfully merging this pull request may close these issues.

[BUG] Instance has been ratelimited [FEATURE] anti-captcha support.
4 participants