[DOCS]: similarity_threshold documentation is wrong #578

Open
mayalinetsky-kryon opened this issue Nov 26, 2023 · 0 comments

Documentation Link

https://gptcache.readthedocs.io/en/latest/references/gptcache.html?highlight=config#module-gptcache.config

Describe the problem

The documentation says:

similarity_threshold (float) – a threshold ranged from 0 to 1 to filter search results with similarity score higher than the threshold. When it is 0, there is no hits. When it is 1, all search results will be returned as hits.

But when I initialize the cache like so:

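from gptcache import cache
from gptcache.config import Config
from gptcache.embedding import Onnx
from gptcache.manager import CacheBase, VectorBase, get_data_manager
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation

# Assumed setup: the original report does not show how embedding_onnx is created.
embedding_onnx = Onnx()

cache_base = CacheBase("sqlite")
vector_base = VectorBase("faiss", dimension=embedding_onnx.dimension)
data_manager = get_data_manager(cache_base, vector_base, max_size=100000)
cache.init(
    embedding_func=embedding_onnx.to_embeddings,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(max_distance=1),
    config=Config(similarity_threshold=1),  # documented as "all search results will be returned as hits"
)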

I don't get any hits.

Describe the improvement

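Reverse the description of similarity_threshold so that it matches the observed behavior: when it is 0, all search results are returned as hits; when it is 1, there are no hits.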

Anything else?

Also, the wording "filter search results with similarity score higher than the threshold" is ambiguous. Are we filtering OUT results with a similarity higher than the threshold and discarding them? Or are we keeping ONLY the results with a similarity higher than the threshold?
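
To make the two readings concrete, here is a minimal sketch in plain Python. It is not GPTCache's implementation: the function names `keep_at_or_above` and `drop_above` and the sample scores are hypothetical, and the scores are assumed to be normalized to the range 0 to 1.

```python
from typing import List


def keep_at_or_above(scores: List[float], threshold: float) -> List[float]:
    """Reading A (hypothetical): only results scoring at least `threshold` are hits."""
    return [s for s in scores if s >= threshold]


def drop_above(scores: List[float], threshold: float) -> List[float]:
    """Reading B (hypothetical): results scoring above `threshold` are discarded."""
    return [s for s in scores if s <= threshold]


# Made-up similarity scores for three search results, assumed to lie in [0, 1].
scores = [0.2, 0.6, 0.95]

reading_a_at_0 = keep_at_or_above(scores, 0.0)  # every result is a hit
reading_a_at_1 = keep_at_or_above(scores, 1.0)  # no result is a hit
reading_b_at_0 = drop_above(scores, 0.0)        # no result is a hit
reading_b_at_1 = drop_above(scores, 1.0)        # every result is a hit

print("Reading A:", reading_a_at_0, reading_a_at_1)
print("Reading B:", reading_b_at_0, reading_b_at_1)
```

Under these assumptions, Reading B reproduces what the documentation currently says (0 gives no hits, 1 gives all hits), while Reading A reproduces what I actually observe with similarity_threshold=1.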
