Higokami: a tiny HTML scraper that depends on Nokogiri
$ wget https://github.com/mitakeck/higokami/releases/download/v0.1.0/higokami-0.1.0.gem
$ gem install higokami-0.1.0.gem
Let's scrape Hacker News with Higokami
require 'higokami'
# Parse Hacker News
higokami = Higokami.new('sample/news.ycombinator.com/index.json')
puts higokami.parse('https://news.ycombinator.com/')
{
"title": "Hacker News",
"article": [
{
"title": "Deep Photo Style Transfergithub.com",
"link": "https://github.com/luanfuj..."
},
{
"title": "Gcam, the computational photogra...",
"link": "https://blog.x.company/mee..."
},
...
]
}
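The output above can be consumed with Ruby's standard JSON library. The following is a minimal sketch, assuming that `higokami.parse` yields a JSON string shaped like the sample shown (if it returns a Hash instead, the `JSON.parse` step can be skipped). `SAMPLE_OUTPUT` and `article_pairs` are hypothetical names used for illustration only; they are not part of Higokami.

```ruby
require 'json'

# Sample data copied from the output above; titles and links are
# truncated there, so they are truncated here as well.
# Assumption: higokami.parse produces a JSON string of this shape.
SAMPLE_OUTPUT = <<~JSON
  {
    "title": "Hacker News",
    "article": [
      { "title": "Deep Photo Style Transfergithub.com", "link": "https://github.com/luanfuj..." },
      { "title": "Gcam, the computational photogra...", "link": "https://blog.x.company/mee..." }
    ]
  }
JSON

# Hypothetical helper (not part of Higokami): returns [title, link] pairs,
# or an empty array when the "article" key is missing.
def article_pairs(json_text)
  data = JSON.parse(json_text)
  data.fetch('article', []).map { |article| [article['title'], article['link']] }
end

# Print one line per scraped article.
article_pairs(SAMPLE_OUTPUT).each do |title, link|
  puts "#{title} -> #{link}"
end
```

In real use, the string passed to `article_pairs` would be the result of `higokami.parse('https://news.ycombinator.com/')` rather than the hard-coded sample.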
More samples are available in the sample directory of the repository.