meta: Replace lunr (middleman-search) with DocSearch #691

Open
Tracked by #652
tnir opened this issue Jul 15, 2022 · 17 comments · May be fixed by #706 or #702
Labels
architecture/cleanup architecture/legacy or broken Legacy and/or broken architecture architecture/maintenance Daily architecture update design good first issue help wanted javascript Pull requests that update Javascript code performance Issues/PRs that improve performance
Milestone

Comments

@tnir
Collaborator

tnir commented Jul 15, 2022

Problems

The Bundler.io website team has to maintain its own search.js and search_arrow.js on top of the abandoned middleman-search (see, for example, manastech/middleman-search#38) and its dependency, a legacy version of lunr.js.

We cannot upgrade lunr to the latest version without doing the work ourselves (or relying on community members other than the maintainers).

This also blocks merging #661.

Solutions

https://github.com/algolia/docsearch (not to be confused with https://docsearch.algolia.com/) can be used instead.

Pros

  • It is free for open-source projects.
  • lunr-index.json is no longer required.
  • The Bundler.io website team no longer has to spend time maintaining its own search.js and search_arrow.js.
  • The UI components search.js and search_arrows.js can be replaced with Algolia's, which look polished to me.
  • We avoid reinventing the wheel, as in the items above.
  • Fuzzy search in DocSearch is also nice.
  • Loading the index on every page (with no browser-side caching as of writing) can be avoided.

Cons

  • Proprietary (though free of charge)
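
For reference, wiring up the proposed DocSearch v3 widget is roughly a single call. The following is a minimal sketch, assuming the @docsearch/js v3 browser bundle has already been loaded and exposes a global docsearch function. The container selector, application ID, index name, and API key are placeholders for illustration, not values from this project.

```javascript
// Build the options object for Algolia DocSearch v3.
// Every value passed in below is a placeholder (assumption); real ones are
// issued by Algolia once a site is accepted into the DocSearch program.
function buildDocSearchOptions({ appId, indexName, apiKey }) {
  return {
    container: '#docsearch', // hypothetical element that hosts the search box
    appId,
    indexName,
    apiKey, // should be a search-only key, since it ships to browsers
  };
}

const options = buildDocSearchOptions({
  appId: 'YOUR_APP_ID',
  indexName: 'YOUR_INDEX_NAME',
  apiKey: 'YOUR_SEARCH_API_KEY',
});

// Initialize only in a browser where the DocSearch bundle is present.
if (typeof window !== 'undefined' && typeof window.docsearch === 'function') {
  window.docsearch(options);
}
```

With this approach the page no longer downloads lunr-index.json; queries go to the hosted index instead.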
@deivid-rodriguez
Member

There's also typesense which describes itself as "The Open Source Algolia Alternative".

@tnir
Collaborator Author

tnir commented Jul 15, 2022

Yes, Typesense Cloud is not free even for open-source projects, although the Typesense software itself is fully open source. DocSearch also ships a user-facing UI kit specialized for documentation search, as its name suggests.

DocSearch (and, to a lesser extent, Algolia) is to Typesense Cloud what GitHub is to GitLab. GitLab is an open-source project, but GitLab as a cloud service offers less and less to open-source projects (a personal opinion).

Anyway, thanks for replying to my "Cons".

@tnir tnir added this to the MaintainerMonth2022-design milestone Jul 16, 2022
@simi
Member

simi commented Jul 17, 2022

Maybe we can ask typesense for some kind of "free" community program.

/cc @kishorenc

@kishorenc

@simi We'll be delighted to help!

@tnir
Collaborator Author

tnir commented Jul 18, 2022

https://typesense.org/docs/guide/docsearch.html looks like a good place to get started 😁

@tnir
Collaborator Author

tnir commented Jul 18, 2022

I implemented #702 with a 30-day free cluster (0.5 GB RAM, 2 vCPUs, 1 hr burst per day) on Typesense Cloud, tied to an account outside the @rubygems org.

@tnir tnir self-assigned this Jul 18, 2022
@tnir
Collaborator Author

tnir commented Jul 18, 2022

With the proof of concept in #702 done, at this moment I prefer Algolia's DocSearch: it is free, crawls automatically every 24 hours, and offers the new look and feel known as DocSearch v3...

@jasonbosco

jasonbosco commented Jul 18, 2022

@tnir I work with @kishorenc on Typesense. DocSearch v3 also works with Typesense. Here’s one site that uses it: https://rushjs.io/.

Happy to also host the scraper for you, and set it up to crawl on-demand on each deploy, so the index is updated in near real-time. Let me know.

P.S.: I’m a big Ruby fan, the Typesense Cloud management console is built on Rails, so I’m very excited to be able to support the Ruby ecosystem in any way I can!
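
As a rough illustration of what the scraper mentioned above consumes: the DocSearch scraper is driven by a JSON config that lists the pages to crawl and the CSS selectors to index. The sketch below builds such a config as a plain object and serializes it. The index name and all selectors are assumptions chosen for illustration; the real values would depend on the site's markup.

```javascript
// Hypothetical DocSearch scraper config for a documentation site.
// index_name and the selectors are illustrative assumptions, not project values.
const scraperConfig = {
  index_name: 'bundler-docs',          // hypothetical collection/index name
  start_urls: ['https://bundler.io/'], // entry point for the crawl
  selectors: {
    lvl0: 'h1',     // top-level heading of a page
    lvl1: 'h2',     // section headings
    lvl2: 'h3',     // subsection headings
    text: 'p, li',  // body text to index
  },
};

// The scraper reads JSON, so serialize the object before handing it over.
const configJson = JSON.stringify(scraperConfig, null, 2);
console.log(configJson);
```

A deploy job could write this JSON to a file and pass it to the scraper so the index is refreshed after each publish.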

@tnir tnir linked a pull request Jul 18, 2022 that will close this issue
2 tasks
@tnir
Collaborator Author

tnir commented Jul 18, 2022

@jasonbosco Thanks for your reply. I could not find any trace of a DocSearch v3-compatible typesense-docsearch.js npm package at https://github.com/microsoft/rushjs.io-website. How did the rushjs team do it?

While working on this, I also found that https://github.com/typesense/typesense-docsearch.js seems to have had no updates for more than 18 months 😭

Thanks for offering to host the scraper, but can we monitor it the way we can with Algolia's crawler?

P.S. I was a (big) Typesense fan last summer. When I logged in today, a year later, I found that my personal account had been disabled 😭.

@tnir
Collaborator Author

tnir commented Jul 18, 2022

@jasonbosco Three questions in summary:

  1. How can I get a Typesense DocSearch.js that is compatible with DocSearch v3 (the modern UI)?
  2. How will Typesense DocSearch.js stay compatible with the latest DocSearch in the future?
  3. How can we get an account for this case? Under the rubygems namespace?

@jasonbosco

@tnir

How can I get a Typesense DocSearch.js that is compatible with DocSearch v3 (the modern UI)?

Rushjs.io uses Docusaurus, so it uses the Docusaurus Typesense theme, which in turn uses typesense-docsearch-react.

I haven't yet published a non-React version of typesense-docsearch v3. So we can do one of two things:

  1. Use something like this with middleman: https://github.com/plasticine/middleman-react
  2. I can look into publishing a vanilla JS version of typesense-docsearch v3

Let me know which one you'd prefer.

How will Typesense DocSearch.js stay compatible with the latest DocSearch in the future?

I intend to maintain the Typesense fork of DocSearch well into the future, as new versions of DocSearch come out.

While working on this, I also found that https://github.com/typesense/typesense-docsearch.js seems to have had no updates for more than 18 months 😭

The next branch of typesense-docsearch.js has more recent updates; it is the branch where DocSearch v3 also lives upstream. Algolia simply made the next branch the default one on GitHub, whereas I left the default branch as is.

Thanks for offering to host the scraper, but can we monitor it the way we can with Algolia's crawler?

We don't have a monitoring UI for the scraper specifically, but we're happy to share access to the logs that the scraper generates.

When I logged in today, a year later, I found that my personal account had been disabled
How can we get an account for this case? Under the rubygems namespace?

I've gone ahead and reinstated your account.

In any case, you would want to provision clusters under a team account in case cluster access needs to be shared. The best way to do this would be to create a new (free) GitHub org just for this purpose, called, say, rubygems-typesense, and add anyone who needs access to the cluster to this org. When any member of this GitHub org logs in to Typesense Cloud via GitHub, they will then have access to this cluster.

Alternatively, you could authorize access to the rubygems org when logging in to Typesense Cloud, and then, from the Typesense Cloud account page, specify a particular GitHub team within the rubygems GitHub org that needs access to Typesense Cloud.

To reauthorize your own account, visit https://github.com/settings/applications, remove Typesense Cloud, and then log out of and back in to Typesense Cloud. GitHub will then prompt you to allow access to the new GitHub org.

@tnir
Collaborator Author

tnir commented Jul 20, 2022

@jasonbosco Thank you. I will reconsider the choice of architecture between Algolia and Typesense. I hope you will prepare docs for Typesense's open-source program; otherwise, it is hard to compare it with competitors.

@jasonbosco

@tnir Following up from this:

I haven't yet published a non-React version of typesense-docsearch v3. So we can do one of two things:

  1. Use something like this with middleman: https://github.com/plasticine/middleman-react
  2. I can look into publishing a vanilla JS version of typesense-docsearch v3

I just published v3 of DocSearch which has the modal layout ((2) above), customized to work with Typesense. Here's how to use it: https://typesense.org/docs/guide/docsearch.html#option-c-custom-docs-framework-with-docsearch-js-v3-modal-layout
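
A minimal sketch of what the initialization might look like with the Typesense-flavored DocSearch.js v3, assuming its browser bundle has been loaded and exposes a global docsearch function. The collection name, host, and API key below are placeholders for illustration; the linked guide remains the authoritative reference for the options.

```javascript
// Build options for typesense-docsearch.js v3 (modal layout).
// All concrete values passed in below are placeholders (assumptions).
function buildTypesenseDocSearchOptions({ collection, host, apiKey }) {
  return {
    container: '#searchbar', // hypothetical element that hosts the search box
    typesenseCollectionName: collection, // must match the collection the scraper writes to
    typesenseServerConfig: {
      nodes: [{ host, port: '443', protocol: 'https' }],
      apiKey, // should be a key with search-only permissions
    },
    typesenseSearchParameters: {}, // optional extra search parameters
  };
}

const typesenseOptions = buildTypesenseDocSearchOptions({
  collection: 'docs',
  host: 'xxx.a1.typesense.net',
  apiKey: '<SEARCH_API_KEY>',
});

// Initialize only in a browser where the bundle is present.
if (typeof window !== 'undefined' && typeof window.docsearch === 'function') {
  window.docsearch(typesenseOptions);
}
```

The shape mirrors the Algolia call, with the application ID and index name replaced by a server configuration and a collection name.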

@tnir
Collaborator Author

tnir commented Jul 22, 2022

This issue is about replacing the home-made search system based on the unmaintained middleman-search. I would like to ask @jasonbosco and everyone interested to move Typesense-specific discussion to the dedicated issue #731. Thank you 🚀

@tnir
Collaborator Author

tnir commented Jul 23, 2022

Loading the index on every page (with no browser-side caching as of writing) can be avoided with this improvement. Adding this to the pros.

I have to say I'm more and more convinced that a local search solution would be best, [...]

by @deivid-rodriguez in #731 (comment)

@tnir tnir added architecture/legacy or broken Legacy and/or broken architecture javascript Pull requests that update Javascript code performance Issues/PRs that improve performance design architecture/cleanup good first issue help wanted architecture/maintenance Daily architecture update labels Jul 23, 2022
@tnir tnir removed their assignment Jul 23, 2022
@tnir tnir changed the title Replace lunr (middleman-search) with DocSearch meta: Replace lunr (middleman-search) with DocSearch Jul 23, 2022
@tnir
Collaborator Author

tnir commented Jul 23, 2022

Un-assigning myself as it is now meta-ish.

@tnir tnir modified the milestones: Design 3Q, Design 4Q Dec 11, 2022
@tnir tnir modified the milestones: Design 4Q, Design 1Q 23 Mar 16, 2023
@tnir
Collaborator Author

tnir commented Mar 16, 2023

This issue was postponed to 1Q 23, but is likely to be postponed to 2Q 23.

5 participants