Skip to content

A simple news scraper built with newspaper3k, it can be modified to suite any news based application. Scarping the internet for articles couldn't be much easier

Notifications You must be signed in to change notification settings

clarknull/newsscraper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 

Repository files navigation

NIGERIAN NEWS SCRAPER (Python 3)

This project demonstrate how easy it is to build a simple news web scraper using python and newspaper3k library.

Motivation Behind This

I was working on a project that needed to keep users up-to date with specific news information. As a python developer, scraping was the only option. But as a lazy person, the work involved was daunting. After some 30min of research, i came across newspaper3k and my problem was solved. I've decided to share this hoping to motivate some hard working programmers. The newspaper3k library also has some NLP features which can be used to further filter out some articles.

Dependencies

pip3 install newspaper3k

#required for nlp
curl https://raw.githubusercontent.com/codelucas/newspaper/master/download_corpora.py | python3

for full installation guide, please visit the newspaper3k documentation

Extra Note

In some part of the world, having a slow computer system and a slow internet connection is very very common. I had to make the code sleep a bit to free up resources for other system processes. Feel free to make if fly as you wish.

#sleep for 2 seconds after fetching 50 articles
 if counter > 50:
    time.sleep(2)
    counter = 0

License

MIT

About

A simple news scraper built with newspaper3k, it can be modified to suite any news based application. Scarping the internet for articles couldn't be much easier

Topics

Resources

Stars

Watchers

Forks

Languages