
Add backend kwargs for pgml.embed #1181

Closed
wants to merge 3 commits into from
Conversation

@kczimm (Contributor) commented Nov 21, 2023

This PR originally intended to fix #1169. Some models do not work with InstructorEmbedding or sentence-transformers and should instead use transformers.AutoModel. This PR introduces a backend field in the kwargs argument that lets you opt in to using transformers.AutoModel.

Now, you should be able to do the following:

SELECT pgml.embed(
    transformer => 'jinaai/jina-embeddings-v2-base-en',
    text => 'Dynamical Scalar Degree of Freedom in Horava-Lifshitz Gravity',
    kwargs => '{"trust_remote_code": true, "backend": "transformers"}'
);
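To illustrate the idea behind the change, here is a minimal, hypothetical sketch of how a `backend` entry in `kwargs` might select the embedding implementation. The function and return values are illustrative only and do not reflect PostgresML's actual internals.

```python
# Hedged sketch: dispatch on an optional "backend" key in kwargs.
# "select_backend" is a hypothetical helper, not part of pgml.
def select_backend(kwargs: dict) -> str:
    # Pop the key so the remaining kwargs can be forwarded to the model loader.
    backend = kwargs.pop("backend", None)
    if backend == "transformers":
        # Opt-in path introduced by this PR: load via transformers.AutoModel.
        return "transformers.AutoModel"
    # Default path: sentence-transformers / InstructorEmbedding.
    return "sentence-transformers"

print(select_backend({"trust_remote_code": True, "backend": "transformers"}))
```

Remaining entries such as `trust_remote_code` would then be passed through to the chosen model loader unchanged.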

As an aside, I moved the use and caching of embedding models out of Python and into Rust.

@kczimm kczimm closed this May 20, 2024
Successfully merging this pull request may close these issues.

pgml.embed trust_remote_code