feat: HTTP request tool #9228

Open
wants to merge 29 commits into
base: master

michael-radency
Contributor

Summary

Tool to visit a website

Related tickets and issues

https://linear.app/n8n/issue/AI-162/tool-to-visit-a-website

michael-radency added the node/new (Creation of an entirely new node) and n8n team (Authored by the n8n team) labels on Apr 26, 2024
);
}
const returnData: string[] = [];
const html = cheerio.load(response);
Member

We could use @mozilla/readability and jsdom here to cleanly extract the content that's likely relevant to an end-user.

Something like this perhaps:

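import { JSDOM } from 'jsdom'
import { Readability } from '@mozilla/readability'

// Fetch the page at `url` and build a DOM from the response
const dom = await JSDOM.fromURL(url)
// parse() returns null when no article-like content can be extracted
const article = new Readability(dom.window.document, {
    keepClasses: true,
}).parse()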

and then use article.content.

We could also consider using turndown to convert the HTML into Markdown, which LLMs tend to handle better than HTML, IMO.

import Turndown from 'turndown'
const turndown = new Turndown()
const markdown = turndown.turndown(article.content)
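
Putting the two suggestions together, here is a rough sketch of how the steps might compose, including a fallback for pages where extraction yields nothing. The helper name `pageToMarkdown`, the `Extract` and `ToMarkdown` types, and the `maxLength` cap are all hypothetical and not part of this PR. The extraction and conversion steps are passed in as functions so the sketch does not assume either library's exact API.

```typescript
// Subset of the fields an extracted article is assumed to carry.
interface Article {
  title: string | null;
  content: string | null; // cleaned article HTML
}

// Hypothetical injection points for the extraction and conversion steps.
type Extract = (html: string, url: string) => Article | null;
type ToMarkdown = (html: string) => string;

// Hypothetical helper: converts the extracted main content to Markdown,
// falling back to the full page HTML when extraction returns nothing.
function pageToMarkdown(
  html: string,
  url: string,
  extract: Extract,
  toMarkdown: ToMarkdown,
  maxLength: number = 10000, // assumed cap to keep the tool output small
): string {
  const article = extract(html, url);
  // Use the raw page if no article was found or its content is empty.
  const source = article && article.content ? article.content : html;
  const markdown = toMarkdown(source).trim();
  // Truncate so a very long page cannot flood the model's context.
  return markdown.length > maxLength ? markdown.slice(0, maxLength) : markdown;
}
```

In the node, `extract` could wrap `new Readability(document).parse()` on a jsdom document, and `toMarkdown` could wrap the `turndown` method of a Turndown instance. Keeping them injectable would also make the fallback and truncation easy to unit test.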
