Feature: SQLite WASM builds #375

Open
2 of 3 tasks
jlarmstrongiv opened this issue Apr 1, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@jlarmstrongiv
Contributor

jlarmstrongiv commented Apr 1, 2024

Describe what you are looking for

Run the USearch extensions for SQLite in the browser with SQLite WASM, similar to https://github.com/nalgeon/sqlean.js

Can you contribute to the implementation?

  • I can contribute

Is your feature request specific to a certain interface?

Other bindings

Contact Details

Ping me in Discord

Is there an existing issue for this?

  • I have searched the existing issues

Code of Conduct

  • I agree to follow this project's Code of Conduct
@jlarmstrongiv jlarmstrongiv added the enhancement New feature or request label Apr 1, 2024
@ashvardanian
Contributor

ashvardanian commented Apr 1, 2024

I think we should update the release.yml to use WASI SDK 21 over the currently used 20. Moreover, we should avoid using the SDK directly, and instead pass it as a CMake toolchain, as documented.

@jonathanpv

If we get this, my app for local RAG would be solved. I am concerned about data persistence, though. My use case is that users upload textbooks and can run vector similarity search on individual textbooks. Rather than searching across everything when, say, a user uploads 10,000 textbooks, my project currently goes textbook by textbook to avoid overloading memory.

How would (1) data persistence and (2) memory be handled with a WASM + SQLite + USearch based approach?

Conceptually I would like something like

Query: who are the authors
Textbook: Calculus textbook selected, the sqlite id is textbook_id_here
Sqlite: target the table textbook_id_here and query it using unum search
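
To make the memory side of this concrete, here is a minimal sketch of the "one textbook at a time" idea: only the selected textbook's vectors are held in memory, and selecting another textbook drops the previous one first. Everything here is hypothetical (`SingleTextbookCache`, `TextbookLoader`, and the loader itself are illustrative names, not part of USearch or SQLite), and it does not handle overlapping calls.

```typescript
// Hypothetical loader: returns the vectors for one textbook,
// e.g. read from OPFS or from a per-textbook SQLite table.
type TextbookLoader = (textbookId: string) => Promise<Float32Array>;

// Keeps at most one textbook's vectors referenced at a time.
class SingleTextbookCache {
  private load: TextbookLoader;
  private currentId: string | null = null;
  private current: Float32Array | null = null;

  constructor(load: TextbookLoader) {
    this.load = load;
  }

  async get(textbookId: string): Promise<Float32Array> {
    // Reuse the vectors if the same textbook is still selected.
    if (this.currentId === textbookId && this.current !== null) {
      return this.current;
    }
    // Drop the old reference before loading so it can be garbage collected.
    this.currentId = null;
    this.current = null;
    const loaded = await this.load(textbookId);
    this.currentId = textbookId;
    this.current = loaded;
    return loaded;
  }
}
```

With this shape, peak memory is roughly bounded by the largest single textbook rather than by the whole library, at the cost of a reload whenever the user switches textbooks.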

I've read that OPFS + WASM may be a great solution to this, but in general, will the entire DB be populated in memory? There does not seem to be a single non-memory solution for vector search in the browser. DiskANN works in C and other languages, I'm assuming solely because they have file system access and no browser restrictions. However, I assume a browser-native JS implementation could work, complete with HNSW and vector search, but without multithreading, as I believe multithreading is not possible in browser JS. Given that OPFS is file-based and supports reading bytes, it basically gives us the freedom to talk to unum indexes directly, avoiding the need for WASM altogether.

Similar projects I've found are here:
https://github.com/askorama/orama

They allow client-side vector search by being JS native
https://github.com/askorama/orama/blob/main/packages/docs/open-source/usage/search/vector-search.md#L5L102

And here's how they persist data
https://github.com/askorama/orama/blob/44836b3f2132061b907015f18bea334f9dd4478b/packages/docs/open-source/plugins/plugin-data-persistence.md#L5L104

However, for the SQLite approach there's also this repo that saves SQLite storage in IndexedDB (which OPFS apparently supersedes in performance, but here's the repo nonetheless)
https://github.com/jlongster/absurd-sql

There's also work by the official SQLite team to address persistent storage
https://sqlite.org/wasm/doc/trunk/persistence.md

Given this information, perhaps we can just have a native JS, browser-based OPFS solution to query unum index files (index.unumsearch, calculus.unumsearch), and my app can just call query("my calculus question", opfs.file(namehere)) on one unum file per textbook.
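
As a rough illustration of that idea, here is a sketch that reads one file from OPFS and runs a brute-force cosine search over it. It assumes a made-up file layout (a flat run of float32 values in platform byte order, `dims` per vector, no header), which is not the real USearch index format, and it does a linear scan rather than HNSW. The names `readOpfsFile`, `cosine`, and `topK` are hypothetical; the OPFS calls (`navigator.storage.getDirectory`, `getFileHandle`, `getFile`) are the standard browser ones.

```typescript
// Read a whole file from the origin private file system (browser only).
async function readOpfsFile(name: string): Promise<ArrayBuffer> {
  const root = await (globalThis as any).navigator.storage.getDirectory();
  const handle = await root.getFileHandle(name);
  const file = await handle.getFile();
  return file.arrayBuffer();
}

// Cosine similarity; returns 0 if either vector has zero length.
function cosine(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Assumed layout: consecutive float32 vectors of length `dims`, no header.
function topK(
  bytes: ArrayBufferLike,
  dims: number,
  query: Float32Array,
  k: number
): { id: number; score: number }[] {
  const vectors = new Float32Array(bytes);
  const count = Math.floor(vectors.length / dims);
  const hits: { id: number; score: number }[] = [];
  for (let i = 0; i < count; i++) {
    // subarray creates a view, not a copy, of row i.
    const row = vectors.subarray(i * dims, (i + 1) * dims);
    hits.push({ id: i, score: cosine(query, row) });
  }
  hits.sort((x, y) => y.score - x.score);
  return hits.slice(0, k);
}
```

In the browser this would be used as `topK(await readOpfsFile("calculus.unumsearch"), dims, queryVector, 5)`, where the query text has already been embedded into `queryVector` by some model. Note that this still loads the whole file into memory for the selected textbook; it only avoids loading every textbook at once.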

Interested in more discussion here

@jlarmstrongiv
Contributor Author

While it’s still being developed, it appears this library https://github.com/asg017/sqlite-vec will support vector search with sqlite in wasm. I hope usearch sqlite extensions support wasm in the future too!


3 participants