add "selector" function #9
The example used in this proposal also uses an array for selectors:

    selector: [
      guide.movieElementSelector,
      '.item-list a[href^="/reservation"]'
    ]

That's simply for chaining multiple selectors. I guess it could be written as … This needs a separate proposal.
Closed
Sometimes different parts of the scraper script need to access the same element.
Consider this example:
- `scrapeMovies` gets a list of movie names: https://gist.github.com/gajus/68f9da3b27a51a58db990ae67e9acdae#file-mk2-js-L49-L62
- `scrapeShowtimes` parses additional information about the parsed movies: https://gist.github.com/gajus/68f9da3b27a51a58db990ae67e9acdae#file-mk2-js-L83-L106

The information is scraped from the same URL (therefore, the same document).
`scrapeMovies` selects movie elements, then passes an instance of the resulting `cheerio` selector to `scrapeShowtimes`; `scrapeShowtimes` then uses the parent selector `tr` to find the corresponding movie table row.

Using the parent selector is bad because `scrapeShowtimes` should work only on the information it is provided (e.g., the identifier of an element); it shouldn't be able to traverse the DOM upwards. Furthermore, this makes logging useless.

A better alternative would be to derive a unique selector that can be shared between the processes. The above example could then be rewritten to:
The idea is that `tr::selector()` returns a CSS selector that, given the same document, will select the same element.
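To make the idea concrete, here is a minimal sketch of how such a selector could be derived. It is not the proposed implementation: `deriveSelector` is a hypothetical helper name, and the code assumes nodes shaped like the htmlparser2/domhandler nodes that cheerio wraps (`type`, `name`, `parent`, `children`). It walks from the element up to the document root and records each element's position among its element siblings.

```javascript
// Sketch only: `deriveSelector` is a hypothetical helper, not part of any library.
// Assumption: nodes look like domhandler nodes ({ type, name, parent, children }).
const ELEMENT_TYPES = new Set(['tag', 'script', 'style']);

const isElement = (node) => Boolean(node) && ELEMENT_TYPES.has(node.type);

const deriveSelector = (node) => {
  if (!isElement(node)) {
    throw new TypeError('Expected an element node.');
  }

  const steps = [];
  let current = node;

  while (isElement(current)) {
    const parent = current.parent;

    if (isElement(parent)) {
      // :nth-child counts element siblings only, so text and comment nodes are skipped.
      const index = parent.children.filter(isElement).indexOf(current) + 1;
      steps.unshift(`${current.name}:nth-child(${index})`);
    } else {
      // Top-level element: the bare tag name is used. This is ambiguous for
      // fragments with several top-level elements of the same name.
      steps.unshift(current.name);
    }

    current = parent;
  }

  // Child combinators keep every step anchored to its direct parent.
  return steps.join(' > ');
};
```

With cheerio, the raw node could presumably be obtained with something like `$(movieElement).closest('tr').get(0)` inside `scrapeMovies`, so that only the resulting string is handed to `scrapeShowtimes` and written to the logs. The selector is positional, so it stays valid only as long as both processes parse the same document with the same parser.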
returns a CSS selector that given the same document will select the same element.The text was updated successfully, but these errors were encountered: