Spiderboi

A web crawling library written in TypeScript.

Example

import Crawler from 'spiderboi';

async function run() {
    const crawler = new Crawler('https://google.com');

    // fetch the site's robots.txt first so that the crawler can respect it
    await crawler.readyUp();

    const out = await crawler.crawl('/search/about');
    console.log(out);
}

// log any rejection instead of leaving the promise unhandled
run().catch(console.error);
/**
 * The code above should print something like:
 * [ 'https://google.com/search/about/',
 * 'https://google.com/search/about/',
 * 'https://google.com/#app-store',
 * 'https://google.com/#app-store',
 * 'https://google.com/#image-texts' ]
 *
 * unless, of course, Google changes the /search/about page and breaks this example.
 */
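
For readers wondering what "respecting robots.txt" means in practice, here is a minimal sketch of the general idea. It is not Spiderboi's actual implementation, and `parseDisallowed` and `isAllowed` are hypothetical helpers that are not part of the library's API. The sketch assumes the crawler only honours `Disallow` rules in groups addressed to the wildcard user agent (`*`), and it deliberately ignores `Allow` lines and wildcard patterns.

```typescript
// Hypothetical helpers for illustration only; not part of spiderboi.

// Collect the Disallow path prefixes from groups that apply to "User-agent: *".
function parseDisallowed(robotsTxt: string): string[] {
    const disallowed: string[] = [];
    let applies = false;      // does the current group target "*"?
    let lastWasAgent = false; // consecutive User-agent lines share one group

    for (const rawLine of robotsTxt.split(/\r?\n/)) {
        const line = rawLine.replace(/#.*$/, '').trim(); // strip comments
        const sep = line.indexOf(':');
        if (sep === -1) continue; // blank or malformed line

        const field = line.slice(0, sep).trim().toLowerCase();
        const value = line.slice(sep + 1).trim();

        if (field === 'user-agent') {
            const isWildcard = value === '*';
            applies = lastWasAgent ? applies || isWildcard : isWildcard;
            lastWasAgent = true;
        } else {
            lastWasAgent = false;
            // An empty Disallow value means "nothing is disallowed".
            if (field === 'disallow' && applies && value !== '') {
                disallowed.push(value);
            }
        }
    }
    return disallowed;
}

// A path is allowed when no disallowed prefix matches its start.
function isAllowed(path: string, disallowed: string[]): boolean {
    return !disallowed.some((prefix) => path.startsWith(prefix));
}

// Example with a made-up robots.txt body.
const exampleRules = parseDisallowed(
    'User-agent: *\nDisallow: /private\n\nUser-agent: otherbot\nDisallow: /'
);
console.log(isAllowed('/search/about', exampleRules));
console.log(isAllowed('/private/page', exampleRules));
```

A crawler following this approach would fetch `/robots.txt` once up front, which is presumably why the example calls `readyUp()` before `crawl()`, and would then check each candidate path before requesting it.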