Extend Crawler queries by a custom "data-orama" attribute #722

Open
fabiobiondi opened this issue May 14, 2024 · 0 comments


fabiobiondi commented May 14, 2024

Problem Description

We are trying out the Crawler and noticed that our Next.js 14 site is not being indexed.

The likely cause is that many of our nested components render text inside <div> elements instead of <p>.
I realize this is not ideal for accessibility and semantics, but it is a constraint we have.

Looking at the source code (general-purpose.ts), we found that the contents of <div> elements are ignored entirely.

https://github.com/askorama/crawly/blob/2892e473775a408495d07a0dea016ec23a85d362/src/general-purpose.ts#L34-L51

@gioboa and I tested a modified version of your function that adds <div> to the query, but noise and irrelevant DOM elements were indexed as well, so that doesn't seem like a decent solution.

Proposed Solution

We thought an interesting idea might be to let users decide what content to index outside of your rules.

A very simple hypothetical solution: add a data-orama attribute to the elements of the site that should be indexed, and extend the crawler to query those elements as well.

<div data-orama> content </div>

I think it might be a simple, clean and powerful way to extend it.
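
On the crawler side, the change could be as small as appending an attribute selector to the existing query. The following is only a rough sketch: the tag list and the `buildSelector` helper are hypothetical and are not taken from general-purpose.ts, which may structure its query differently.

```typescript
// Hypothetical default tags, for illustration only. The real list lives in
// the crawler's general-purpose.ts and is not reproduced here.
const ASSUMED_DEFAULT_TAGS: readonly string[] = ["h1", "h2", "h3", "p", "li"];

// Name of the opt-in attribute proposed in this issue.
const OPT_IN_ATTRIBUTE = "data-orama";

// Build a CSS selector that matches the default tags plus any element
// carrying the opt-in attribute, whatever its tag name is.
function buildSelector(
  tags: readonly string[],
  optInAttribute: string = OPT_IN_ATTRIBUTE
): string {
  return [...tags, `[${optInAttribute}]`].join(", ");
}

const selector = buildSelector(ASSUMED_DEFAULT_TAGS);
console.log(selector); // h1, h2, h3, p, li, [data-orama]
```

The resulting string could then be passed to whatever the crawler already uses to query the page (for example `document.querySelectorAll(selector)` in a browser context). A <div> without the attribute would still be skipped, which avoids the noise we saw when adding every <div> to the query.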

What do you think?

Alternatives

Another option for the future would be to let users fully customize the crawler's extraction function.

Additional Context

No response
