Add a fetcher that uses a real Chrome browser to download the html #237

Open
wants to merge 1 commit into
base: master

johanoskarsson
Contributor

Adds a new Fetcher that uses a real Chrome browser to fetch the HTML. This solved a problem where I could not fetch a page that was partially generated by JavaScript with any of the existing fetchers. (I assume the page requires a modern, real browser, for a reason I did not investigate further.)

This change uses the cdt-java-client library to launch and communicate with a Chrome browser: https://github.com/kklisura/chrome-devtools-java-client
However, because of a breaking change in Chrome that has not yet been fixed in this library, I am using a fork with that one patch applied: io.fluidsonic.mirror:cdt-java-client:4.0.0-fluidsonic-1. Hopefully the fix gets merged back into the main library.
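
For anyone who wants to try the fork locally, the coordinate above could be declared roughly as follows. This is only a sketch: it assumes a Gradle build and that the fork is published to a repository the build already resolves from, neither of which is stated in this PR. The `implementation` configuration is also an assumption.

```kotlin
dependencies {
    // Fork of cdt-java-client with the Chrome compatibility patch applied.
    // Coordinate taken from the PR description; the configuration name is an assumption.
    implementation("io.fluidsonic.mirror:cdt-java-client:4.0.0-fluidsonic-1")
}
```

The string notation is accepted by both the Kotlin and Groovy Gradle DSLs. Once the fix lands upstream, the coordinate would presumably switch back to the main library.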

WIP warning: I figured I would publish this PR in its current state in case it helps anyone else. However, it does not yet fulfill all the expectations of a fetcher: it returns only the body, not the correct HTTP status and so on. There is a Network class that can probably be used to extract those.

codecov bot commented Mar 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.56%. Comparing base (382f21b) to head (475065d).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #237   +/-   ##
=======================================
  Coverage   89.56%   89.56%           
=======================================
  Files          38       38           
  Lines         986      986           
  Branches       69       69           
=======================================
  Hits          883      883           
  Misses         81       81           
  Partials       22       22           

