Multiple Filters using Chroma().as_retriever #10537
-
Hello everyone! I started using LangChain and ChromaDB a few days ago, and I'm facing an issue I cannot solve. I have a VectorStore that contains multiple PDFs and associated metadata. My goal is to pre-filter in multiple ways. In the documentation, the only example about filters uses a single filter. There are other methods, like "get", that accept a "where" argument and have examples using more than one filter.

I tested this code with one filter:

    retriever = DB.as_retriever(search_kwargs={"k": int(get_env_var('TARGET_SOURCE_CHUNKS')),
                                               "filter": {"user_id": user_id}})

This worked without any issues. The real problem appears when I try to use more than one filter. I have tried it in different ways.

First:

    retriever = DB.as_retriever(search_kwargs={"k": int(get_env_var('TARGET_SOURCE_CHUNKS')),
                                               "filter": {"user_id": user_id, "category_id": cat_id}})

Second:

    retriever = DB.as_retriever(search_kwargs={"k": int(get_env_var('TARGET_SOURCE_CHUNKS')),
                                               "filter": [{"user_id": user_id}, {"category_id": cat_id}]})

I also wanted to retrieve documents for more than one user at the same time:

    retriever = DB.as_retriever(search_kwargs={"k": int(get_env_var('TARGET_SOURCE_CHUNKS')),
                                               "filter": {"user_id": [user_id_1, user_id_2]}})

Nothing worked. I'd appreciate your help with filtering on different metadata keys, and also on more than one value for each metadata key. Thanks in advance!
Replies: 7 comments 9 replies
-
🤖 Hello,

Thank you for using LangChain and ChromaDB. I understand you're having trouble with multiple filters using the as_retriever method.

Based on the issues and solutions I found in the LangChain repository, it seems that the filter argument in the as_retriever method should be able to handle multiple filters. However, the syntax you're using might not be correct.

Here's how you can use multiple filters:

    retriever = DB.as_retriever(search_kwargs={"k": int(get_env_var('TARGET_SOURCE_CHUNKS')),
                                               "filter": {'$or': [{'user_id': {'$eq': user_id}},
                                                                  {'category_id': {'$eq': cat_id}}]}})

This will return documents that match either the user_id or the category_id. If you want to filter documents that match both:

    retriever = DB.as_retriever(search_kwargs={"k": int(get_env_var('TARGET_SOURCE_CHUNKS')),
                                               "filter": {'$and': [{'user_id': {'$eq': user_id}},
                                                                   {'category_id': {'$eq': cat_id}}]}})

For filtering on a list of values, you can use the following format:

    retriever = DB.as_retriever(search_kwargs={"k": int(get_env_var('TARGET_SOURCE_CHUNKS')),
                                               "filter": {'user_id': {'$in': [user_id_1, user_id_2]}}})

This will return documents that have a user_id equal to either user_id_1 or user_id_2.

Please note that the …

I hope this helps! If you have any other questions, feel free to ask.

Best,
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
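The filter shapes in the reply above can also be built programmatically. The following is only a minimal sketch: `build_filter` is a hypothetical helper (not part of LangChain or Chroma), the sample IDs and `k` value are placeholders, and it assumes the `$eq`, `$in`, and `$and` operators behave as described in the reply. It wraps conditions in `$and` only when there are two or more, and maps a list of values to `$in`.

```python
from typing import Any, Dict, List


def build_filter(conditions: Dict[str, Any]) -> Dict[str, Any]:
    """Build a Chroma-style metadata filter from {key: value-or-list} pairs.

    Hypothetical helper for illustration; not a LangChain or Chroma API.
    """
    clauses: List[Dict[str, Any]] = []
    for key, value in conditions.items():
        if isinstance(value, (list, tuple)):
            # Several accepted values for one key -> $in
            clauses.append({key: {"$in": list(value)}})
        else:
            # A single accepted value -> $eq
            clauses.append({key: {"$eq": value}})
    if not clauses:
        raise ValueError("at least one condition is required")
    # One clause is returned as-is; two or more are combined with $and
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


# Placeholder values standing in for user_id_1, user_id_2 and cat_id
search_kwargs = {
    "k": 4,  # stands in for int(get_env_var('TARGET_SOURCE_CHUNKS'))
    "filter": build_filter({"user_id": ["u1", "u2"], "category_id": "c7"}),
}
print(search_kwargs["filter"])
```

The resulting dictionary would then be passed as `DB.as_retriever(search_kwargs=search_kwargs)`, in the same way as the snippets in the thread.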
-
@dosu-bot.
How to fix that?
-
retriever = DB.as_retriever(search_kwargs={"k": int(get_env_var('TARGET_SOURCE_CHUNKS')),
-
Here is what I want to do
And if I run the following code,
How to solve this?
-
use
-
What is this functionality based on? Is it native Python, custom, or from some package?
-
How does this work for Qdrant?
Do I have to use the Qdrant-specific syntax for filtering?