WiP: Fix looping Cloudflare challenge, Resolves #1036 #1163
base: master
FWIW, after each solve a Chrome subprocess remains that spins up to 15% CPU, and I have to kill these off manually. |
Another thing I've noticed concerns the user-agent "Headless" replacement:
self.execute_cdp_cmd(
    "Network.setUserAgentOverride",
    {
        "userAgent": self.execute_script(
            "return navigator.userAgent"
        ).replace("Headless", "")
    },
)
I don't know why, but if I hardcode the user agent to the exact one my computer has, like this:
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36"
options.add_argument(f"--user-agent={user_agent}")
it bypasses Cloudflare, but if I set it automatically like you have it on line 533 from
So an alternative could be to set up a driver only to get the user agent:
def get_user_agent(driver):
    return driver.execute_script("return navigator.userAgent;").replace("Headless", "")
And then pass the user agent to the definitive driver.
PS: I can only tell you what I've discovered, to see if we can work toward a solution, because I'm having trouble getting the project installed/set up 😅 |
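The two-driver idea in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from this PR: `clean_user_agent`, `user_agent_argument`, and `FakeDriver` are hypothetical names, and `driver` is assumed to be any object exposing a Selenium-style `execute_script` method.

```python
def clean_user_agent(raw_user_agent: str) -> str:
    """Remove the 'Headless' marker that headless Chrome puts in its user agent."""
    return raw_user_agent.replace("Headless", "")


def get_user_agent(driver) -> str:
    """Ask a throwaway driver for its user agent string and clean it."""
    return clean_user_agent(driver.execute_script("return navigator.userAgent;"))


def user_agent_argument(user_agent: str) -> str:
    """Build the Chrome command-line flag to pass to the definitive driver."""
    return f"--user-agent={user_agent}"


class FakeDriver:
    """Hypothetical stand-in so the sketch runs without launching a browser."""

    def execute_script(self, script: str) -> str:
        return (
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) HeadlessChrome/123.0.0.0 Safari/537.36"
        )


# Step 1: read the user agent from a temporary driver.
# Step 2: hand the cleaned value to the options of the real driver.
flag = user_agent_argument(get_user_agent(FakeDriver()))
print(flag)
```

With a real setup, the resulting string would be passed to `options.add_argument(...)` before the definitive driver is created, as in the hardcoded example.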
I didnt actually use this branch, it worked fine after I switched to it. Thanks |
@garfield69 yea this seems to be an issue with Chrome v124. You can revert to v123 in the meantime if it's easier - #1161 Alternatively, build your own binaries, which will use Chromium v123:
|
@m33ts4k0z were you doing this on Windows? |
Yes on a Windows 11 VM on Unraid but it did work in the end. I updated my first post here with the cause. |
Oh cool, did not know I could build on windows. |
@juanfrilla sorry for the delay in replying, been busy and only got to a few quick ones on my phone. I'll have a look at the UA idea when I next get a chance, thanks. Assuming you're following the run from source instructions, what issue are you having? https://github.com/FlareSolverr/FlareSolverr#from-source-code |
@ilike2burnthing my main problem is that i cannot install Xvfb on MacOS |
Tried XQuartz? |
yessir now the project is set up, let's see what I can fix |
What exactly is left to do on this to get it merged? I tried to guess from the comments here and some different issues, but I can't work out the current status of this. It seems to have been stale for quite some time, so what's needed? |
|
Well, I made my own implementation of this "new tab" idea and I was able to make it work with every website I could (ext.to, www3.yggtorrent.cool, dodi-repacks.site, hd-torrents.me/login.php, nhentai.net) on my Linux system using a VPN / socks5 proxy and also with my container image on my own remote Linux server, which was blocked by cloudflare too. Public image with my edits: 21hsmw/flaresolverr:fixlooping |
That's working 95% of the time on Windows for me, even with a proxy, but failing 95% of the time on Docker. Usual error:
Seems it's related to |
When you say it fails on Docker, is it still on Windows or Linux? I got this error on Linux while doing my implementation, but have not been able to replicate it since. For the looping challenges, it seems to be a timing issue. Playing with the timer values can make it work in some cases, but it's not easy to know what works for everyone since it seems to take network latency into account. For example, if I use a proxy close to my location, it works 100% of the time with the sites I listed earlier, but if I use a proxy very far from me, it works 50% of the time. |
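One way to experiment with the timer values mentioned above is to stretch each wait by the measured network latency rather than using fixed numbers. The sketch below is purely hypothetical: nothing in this thread says FlareSolverr works this way, and `scaled_wait`, `wait_until`, and the `factor` parameter are names invented for illustration.

```python
import time
from typing import Callable


def scaled_wait(base_seconds: float, latency_seconds: float, factor: float = 2.0) -> float:
    """Stretch a fixed wait by a multiple of the measured round-trip latency."""
    return base_seconds + factor * latency_seconds


def wait_until(
    condition: Callable[[], bool],
    timeout: float,
    interval: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `condition` every `interval` seconds until it holds or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# A nearby proxy (50 ms) and a distant one (400 ms) get different waits.
print(scaled_wait(1.0, 0.05), scaled_wait(1.0, 0.4))
```

Here `condition` would be whatever check decides that the challenge page has gone away, for example a title or selector test.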
Linux. I'll play around with timings again (I did a bunch yesterday), see if I can get something that works both on my Docker and Windows. |
Strange then. I'm able to solve the challenges of all sites I try on my Debian and Fedora systems with different VPNs/Proxies with and without Docker involved. Here's an example with dodi-repacks.site using the docker image I shared previously: |
Thanks for your workaround @21hsmw Working with @aevrard the solution you provide will kill the killswitch if you're using something like gluetun... |
Thanks @21hsmw ! |
Worked for me on whatbox.ca services:
flaresolverr:
image: 21hsmw/flaresolverr:fixlooping
environment:
- LOG_LEVEL=${LOG_LEVEL:-info}
- LOG_HTML=${LOG_HTML:-false}
- CAPTCHA_SOLVER=${CAPTCHA_SOLVER:-none}
- TZ=UTC
- PORT=25000
- HOST=127.0.0.1
network_mode: host
pull_policy: always
restart: unless-stopped |
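With the compose file above, FlareSolverr should be listening on 127.0.0.1:25000. A minimal way to exercise it from Python, assuming the standard `/v1` endpoint and `request.get` command, might look like this. `build_request` and `solve` are hypothetical helper names, and the 70-second HTTP timeout is an arbitrary margin over `maxTimeout`.

```python
import json
import urllib.request

# HOST and PORT taken from the compose file above.
FLARESOLVERR_URL = "http://127.0.0.1:25000/v1"


def build_request(target_url: str, max_timeout_ms: int = 60000) -> urllib.request.Request:
    """Build the POST request asking FlareSolverr to fetch `target_url`."""
    payload = {"cmd": "request.get", "url": target_url, "maxTimeout": max_timeout_ms}
    return urllib.request.Request(
        FLARESOLVERR_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def solve(target_url: str) -> dict:
    """Send the request and return the parsed JSON response."""
    # Allow slightly longer than maxTimeout for the HTTP call itself.
    with urllib.request.urlopen(build_request(target_url), timeout=70) as response:
        return json.load(response)
```

Calling `solve("https://example.com")` against a running container should return a JSON object whose `status` field indicates whether the challenge was passed.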
replacing the image of the dockerfile for this: I tested as well on a centOS server with the previous image ( |
I can access all trackers in Jackett with the latest Flaresolverr, but not with these changes. |
'latest' being v3.3.17? What OS? |
Flaresolverr v3.3.17, Linux x86_64 running on DigitalOcean. Tested on my macOS and that version also works fine with my normal IP. Any tracker I should try that you have issues with? |
https://github.com/search?q=repo%3AJackett%2FJackett+configuring-flaresolverr&type=code Some may not be currently using CF, there may be some missed, some only use it for login or keyword searches, but that should give you an idea. See also those sites mentioned in #1036. |
juanfrillaaa/flaresolverr:latest solves the challenges greatly on my ubuntu with docker! Thanks @juanfrilla! |
Update: Just tested, it works with an xvfb display inside a container. I'll see if I can find some time to implement part of nodriver for flaresolverr, but I'm not sure if I should put both undetected-chromedriver and nodriver in the same files, or if it would be better to create separate files like |
Separate sounds good. |
@ilike2burnthing I spent some time the last few days implementing the |
Looks good, I'll test this when I get some free time. Can you also push a Docker build of this so I can test it more easily as well? Thanks! |
Sure, it's available here on docker hub: |
I don't know if this helps, but YGG has been behind Cloudflare since yesterday (June 2, 2024). |
i'm not able to get it working with this url (i dont know what happened): |
Can you tell which image you used to get this result? I tried the current latest tag which got stuck in a loop, then I tried |
I'm doing it with
|
When flaresolverr fails to return cookies, what error exactly is being reported? |
@21hsmw I've been testing your nodriver version on Windows 10. It works, but there are two issues I've found with it. First, for some reason it seems to leave behind zombie Chrome processes that peg one core at 100% each. Not sure if this is a nodriver bug or something related to how FlareSolverr handles the lifespan of driver/tab instances. Second, whenever it creates a driver instance to solve a challenge it steals focus from other windows, which is incredibly annoying when trying to use it locally while doing other work. TBF this was an issue with chromedriver too, but there it could be solved by passing a
Yes, I am aware of this problem. I experienced it when I tested it with Windows, and I found out in the nodriver documentation that the parent process could stay running in the background. I did some testing, including testing an internal nodriver function that is not part of the official documentation, and it did not kill it. It's actually on my list of things to fix before doing a PR. I'll see what I can do.
For the nodriver addition, I decided to keep the original headless implementation for Windows. Since it doesn't use a driver here, maybe the headless option built into nodriver could work, to be honest I haven't tested it, so I'll try that later.
That's what sessions are for, but it's not available in nodriver yet. This is something I need to add, and it will probably be added before the end of the week. |
Apologies, still yet to get the time to look at this. There are a few open issues about memory leaks with nodriver, might be worth looking at:
We've also dealt with zombie processes in the past, most recently in #1193, but might be worth looking at 7d84f1b and 9b2c602 in case there's anything useful and equivalent that could be done. |
I'm using this rather ugly workaround for it at the moment using
import time
import psutil

def kill_chrome_processes(delay: float = 5.0) -> None:
    # Give Chrome a moment to exit on its own before killing leftovers.
    if delay > 0:
        time.sleep(delay)
    for proc in psutil.process_iter(['pid', 'name', 'cmdline']):
        if proc.info['name'] in ('chrome', 'chrome.exe'):
            cmdline = proc.info['cmdline'] or []  # None when access is denied
            if (
                any(c.startswith('--no-zygote') for c in cmdline)
                and any(c.startswith('--user-agent') for c in cmdline)
            ):
                try:
                    proc.kill()
                except psutil.NoSuchProcess:
                    pass  # the process exited in the meantime
But it is using the headless option built into nodriver (
Oh, right. For some reason I thought that was something related to cookies. Still, wouldn't it make sense for the default behavior to be reusing driver instances unless explicitly choosing to manage sessions? |
If by that you mean running the browser by default all the time, I don't think that's a good idea. Flaresolverr is mostly used via container images, and some people run it on very small setups with low ram. A lot of people who do not know how containers and environment variables work would be stuck with a slow system, and I can already see a lot of problems coming up here because of that. The session option was made for people who want to keep their browser running when they know they can handle the resources it's using, and I think that's the best way to do it.
It's fine, take your time, there's more to do anyway so it's not urgent. I've seen those issues when I started implementing nodriver, I'll try to implement something like @Hyperz shared or I'll try to fix how nodriver detects the processes and deletes them. |
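For reference, the opt-in session flow described above is driven entirely by the client. The sketch below builds the ordered payloads for that flow, assuming the `sessions.create`, `request.get` (with a `session` field), and `sessions.destroy` commands of the FlareSolverr API. The helper names and the `"my-session"` identifier are hypothetical, and as noted above sessions are not yet available in the nodriver branch.

```python
from typing import Dict, List


def create_session(session_id: str) -> Dict:
    """Payload that starts a browser instance and keeps it running."""
    return {"cmd": "sessions.create", "session": session_id}


def get_with_session(url: str, session_id: str, max_timeout_ms: int = 60000) -> Dict:
    """Payload that reuses the already running browser for one request."""
    return {
        "cmd": "request.get",
        "url": url,
        "session": session_id,
        "maxTimeout": max_timeout_ms,
    }


def destroy_session(session_id: str) -> Dict:
    """Payload that closes the browser and frees its memory."""
    return {"cmd": "sessions.destroy", "session": session_id}


def session_workflow(urls: List[str], session_id: str = "my-session") -> List[Dict]:
    """Ordered payloads: open one browser, reuse it per URL, then release it."""
    payloads = [create_session(session_id)]
    payloads.extend(get_with_session(url, session_id) for url in urls)
    payloads.append(destroy_session(session_id))
    return payloads


for payload in session_workflow(["https://example.com/a", "https://example.com/b"]):
    print(payload)
```

Keeping the destroy step explicit reflects the concern raised above: the browser only stays resident for clients that ask for it and can afford the memory.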
Ah. Yes that seems to work without stealing focus, thanks. |
I keep having If I increase the timeout it doesn't work either. I'm testing with 21hsmw/flaresolverr:nodriver
|
Can you switch Edit: |
Okay, now it's working, even on the server. I needed to fix the proxy and the DRIVER=nodriver |
@21hsmw I have two questions
to change proxies when the attempt limit is reached
|
Attempts are just an indication of how many times flaresolverr tries to pass the challenge on the page, so there's no way to control anything based on this value right now. It will keep trying to pass the challenge until it hits the timeout, and then it will fail.
It depends on what the page looks like and all the elements that need to be loaded, but on my side (Linux) it generally uses around 1 GB of memory while trying to pass the challenge. |
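Since FlareSolverr itself only retries until the timeout, proxy rotation would have to happen on the client side. A rough sketch, assuming the per-request `proxy` field of the FlareSolverr API: each proxy gets one full `maxTimeout`, and the client moves on when that request fails. `solve_with_rotation` and `send` are hypothetical names, with `send` standing for any callable that POSTs the payload to `/v1` and returns the parsed JSON.

```python
from typing import Callable, Dict, Iterable, Optional


def build_payload(url: str, proxy_url: str, max_timeout_ms: int) -> Dict:
    """request.get payload that routes this single request through `proxy_url`."""
    return {
        "cmd": "request.get",
        "url": url,
        "maxTimeout": max_timeout_ms,
        "proxy": {"url": proxy_url},
    }


def solve_with_rotation(
    url: str,
    proxies: Iterable[str],
    send: Callable[[Dict], Dict],
    max_timeout_ms: int = 60000,
) -> Optional[Dict]:
    """Try each proxy in turn; return the first successful response, else None."""
    for proxy_url in proxies:
        try:
            response = send(build_payload(url, proxy_url, max_timeout_ms))
        except OSError:
            # Network failure or HTTP error (for example a timeout): next proxy.
            continue
        if response.get("status") == "ok":
            return response
    return None
```

Lowering `max_timeout_ms` shortens how long each proxy is given before the next one is tried.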
Thanks for your work, it got me through the challenge
|
Thanks to @juanfrilla for #1036 (comment).
Unfortunately, currently this only works on Windows, and the looping challenges return if using proxies or VPNs.