Crawler detection improvement #50
Comments
This issue concerns the code present within CrawlerDetection.java |
Is there any quick solution or suggestion right now?
|
This is not an easy issue to tackle. I've been writing web crawlers (and search engines) for years, so I have experienced this issue from both sides of the table. That said, we've actually modified Apache Nutch to interact with JavaScript, so we can (with Nutch) bypass JavaScript download verification as well. I think this is a difficult issue, and there is a good deal of research in this area. I will try to find some and post it here. |
Another article I started reading http://searchengineland.com/7-fundamental-technical-seo-questions-to-answer-with-a-log-analysis-and-how-to-easily-do-it-245903 |
@Yongyao we need to make this a priority. Right now it takes forever. |
Write the implementations and also write tests to validate.
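As a starting point for discussion, here is a minimal sketch of two common heuristics: a user-agent keyword check and a sliding-window request-rate check. It is not the code in CrawlerDetection.java. The class name `CrawlerHeuristics`, the method names, the keyword list, and the thresholds are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the existing code base.
final class CrawlerHeuristics {

    // Assumed keyword list; a real deployment would maintain a longer one.
    private static final String[] AGENT_KEYWORDS = {"bot", "crawler", "spider", "slurp"};

    // Returns true if the user-agent string contains a known crawler keyword.
    static boolean hasCrawlerAgent(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String lower = userAgent.toLowerCase(Locale.ROOT);
        for (String keyword : AGENT_KEYWORDS) {
            if (lower.contains(keyword)) {
                return true;
            }
        }
        return false;
    }

    // Returns true if more than maxRequests timestamps fall inside any
    // window of windowMillis milliseconds.
    static boolean exceedsRate(List<Long> timestampsMillis, int maxRequests, long windowMillis) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("windowMillis must be positive");
        }
        // Sort a copy so the caller's list is left untouched.
        List<Long> sorted = new ArrayList<>(timestampsMillis);
        Collections.sort(sorted);
        int start = 0;
        for (int end = 0; end < sorted.size(); end++) {
            // Shrink the window from the left until it spans less than windowMillis.
            while (sorted.get(end) - sorted.get(start) >= windowMillis) {
                start++;
            }
            if (end - start + 1 > maxRequests) {
                return true;
            }
        }
        return false;
    }
}
```

Sorting once and then advancing two indices keeps the rate check at O(n log n) per client, which may help if the current approach rescans the log for every request. Tests could cover both methods with a handful of known user agents and synthetic timestamp lists.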