Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: PostgresSyntaxError when using the "in" operator and MetadataFilters #13111

Open
spovedavoladi opened this issue Apr 25, 2024 · 1 comment
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized

Comments

@spovedavoladi
Copy link

Bug Description

Using MetadataFilters with the "in" operator and a list of IDs throws an error, implemented as so:

if document_ids:
    filters = MetadataFilters(
        filters=[
            MetadataFilter(key='id', value=document_ids, operator=FilterOperator.IN),
        ],
    )

index = get_index(company_id)

response = index.as_query_engine(
    similarity_top_k=SIMILARITY_TOP_K_CHAT,
    filters=filters,
)

Initially tried with as_chat_engine, but fails with as_retriever as well as with as_query_engine

Version

0.10.31

Steps to Reproduce

  1. Index a set of documents by creating custom TextNodes and assign a custom node ID (id_)
  2. Query your index using by filtering with a list of IDs

Relevant Logs/Tracbacks

ERROR [2024-04-25T15:03:13] | Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/http/httptools_impl.py", line 375, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 75, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/fastapi/applications.py", line 208, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.9/site-packages/starlette/middleware/cors.py", line 86, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "/usr/local/lib/python3.9/site-packages/starlette/middleware/cors.py", line 142, in simple_response
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/usr/local/lib/python3.9/site-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 580, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 241, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 52, in app
    response = await func(request)
  File "/usr/local/lib/python3.9/site-packages/fastapi/routing.py", line 226, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.9/site-packages/fastapi/routing.py", line 159, in run_endpoint_function
    return await dependant.call(**values)
  File "/app/modules/queries/controllers.py", line 69, in chat_company_index
    response = await query_engine.aquery(data.content)
  File "/usr/local/lib/python3.9/site-packages/llama_index/core/base/base_query_engine.py", line 46, in aquery
    return await self._aquery(str_or_query_bundle)
  File "/usr/local/lib/python3.9/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 201, in _aquery
    nodes = await self.aretrieve(query_bundle)
  File "/usr/local/lib/python3.9/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 146, in aretrieve
    nodes = await self._retriever.aretrieve(query_bundle)
  File "/usr/local/lib/python3.9/site-packages/llama_index/core/base/base_retriever.py", line 249, in aretrieve
    nodes = await self._aretrieve(query_bundle)
  File "/usr/local/lib/python3.9/site-packages/llama_index/core/indices/vector_store/retrievers/retriever.py", line 105, in _aretrieve
    return await self._aget_nodes_with_embeddings(query_bundle)
  File "/usr/local/lib/python3.9/site-packages/llama_index/core/indices/vector_store/retrievers/retriever.py", line 177, in _aget_nodes_with_embeddings
    query_result = await self._vector_store.aquery(query, **self._kwargs)
  File "/usr/local/lib/python3.9/site-packages/llama_index/vector_stores/postgres/base.py", line 650, in aquery
    results = await self._aquery_with_score(
  File "/usr/local/lib/python3.9/site-packages/llama_index/vector_stores/postgres/base.py", line 476, in _aquery_with_score
    res = await async_session.execute(stmt)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/ext/asyncio/session.py", line 452, in execute
    result = await greenlet_spawn(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 197, in greenlet_spawn
    result = context.throw(*sys.exc_info())
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 2306, in execute
    return self._execute_internal(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 2191, in _execute_internal
    result: Result[Any] = compile_state_cls.orm_execute_statement(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/context.py", line 293, in orm_execute_statement
    result = conn.execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1421, in execute
    return meth(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 514, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1643, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1849, in _execute_context
    return self._exec_single_context(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1989, in _exec_single_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2356, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1970, in _exec_single_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 924, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 572, in execute
    self._adapt_connection.await_(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 127, in await_only
    return current.driver.switch(awaitable)  # type: ignore[no-any-return]
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 192, in greenlet_spawn
    value = await result
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 550, in _prepare_and_execute
    self._handle_exception(error)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 501, in _handle_exception
    self._adapt_connection._handle_exception(error)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/dialects/postgresql/asyncpg.py", line 789, in _handle_exception
    raise translated_error from error
sqlalchemy.exc.ProgrammingError: (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax error at or near "MDQyNzZhNzgtY2IxOC00Y2QxLTg1NGMtZDU0Mzk3ZWVlYzc1"
[SQL: SELECT public."data_f251f0c0-83b7-498b-b2ba-d92340754a91".id, public."data_f251f0c0-83b7-498b-b2ba-d92340754a91".node_id, public."data_f251f0c0-83b7-498b-b2ba-d92340754a91".text, public."data_f251f0c0-83b7-498b-b2ba-d92340754a91".metadata_, public."data_f251f0c0-83b7-498b-b2ba-d92340754a91".embedding <=> $1 AS distance
FROM public."data_f251f0c0-83b7-498b-b2ba-d92340754a91"
WHERE metadata_::jsonb->'id' @> '["['MDQyNzZhNzgtY2IxOC00Y2QxLTg1NGMtZDU0Mzk3ZWVlYzc1', 'YThlZTRjMTAtNjViMy00OGY1LWIwMzEtZGIwNGQ3MzdiN2Iy', 'YTczOTY5Y2YtYmIwYy00YWRmLWJkYjctNWYxZmIxYWI3MThm', 'MjM1MTY1ODItYTRlMS00MDZjLTg5OWEtNjkwZjMxYjkxZGVm', 'ZTRmZTcyODItMzRhYi00YjIwLWE3MzItOWQ1MTkxMjU2ZmQz', 'YWY5MWRjYzAtODg2Mi00ZGQ3LWI4MzQtY2VlNGM3MDZjNmMx', 'MDdhNDk2ZmUtZDMyNC00Y2E2LTljOWEtODc2ZTFkNjNiYTgy', 'MDI2M2ZjNGYtMTMwOS00NTRhLWI3MTYtM2ZjYWQyYWVkMTMz', 'OTc2MzhjNjktNTYzOS00YjIzLWJmN2ItNjZhNGFhODg5ODI5', 'OTY0MWUxOGEtYTVhYi00ZmZhLWExNjktNGM1MWQ2ZGQ3NjNl', 'NjBjYWM4OWMtNzI4ZC00MTY5LTg0ZmUtODRkZmQ2NjJhMDA2', 'ZTUyMzMyNWYtOWQzMi00ODc5LTk1ODEtMTI1Yzk0OGVhMmUz', 'MmZmZWY1NDQtNzU2Ny00ZGUyLWFkZDktNmM2ZTdiN2EyYTNk', 'MTEwZDY2ZTItYjc5MC00YjA1LTljZjYtNDdiYWNhYWUyNjY0', 'MzVhZDhhMjYtZTE2MC00YmE3LWEyODItZmYyOGY5YmViYTMx', 'N2NkZGMzNTEtOGZiYi00YjUyLWI0OGItY2RhNjE2YzE1ZmQz', 'MjljMWM0YjYtMmY3MC00MzU4LWI1ODMtZWZkNWNjOTcxZTAy', 'MDlkNTg2OTItMTU2Ni00YzAzLThhMzYtNTgxY2IyZTQ2Nzkx', 'MTg2OTI1OGUtNzNmNy00NGVlLWI3ZjEtNDU2ZDRmMzIxZDIx', 'ZjM4Yzk3NDAtYzIzNi00MDc0LWJkNjQtMDUzZmUyNGJhNjRk', 'YzlkZTliOTktOWFjZS00YjgzLTg4MmMtM2Y4YjhkZjFlNGI3', 'OThkOTRiZTMtOTBmMS00M2ZlLTk4ZTYtNzFiMTY1ZDZlMmFj', 'YTc0NThlOGMtMWZmYS00ODI0LWI2ZGMtNzRhNDY4Nzk5MTkw', 'YmE5MDViZTEtNmY2Ni00NWVmLThhOTgtNjNiMDBmZDc1ZTFk', 'Y2NkZGQ5ZmEtNzc3Ny00MzA2LTgyOTktMDc0ZGYwODRlNzhk', 'Nzg1YWM0M2MtNGZiYS00OTMwLThkY2UtZDgxOTM3Mjk0MzQ5', 'YTVjNDQyYTQtNmQzNy00MjU5LWFjMzgtYmY5ZWJlNGYzOTAw', 'YmNhN2FiYTEtZmNlNy00ODAzLWE3NWUtOGRmM2E0M2MyYzdm', 'YWUwMGJhYzgtZDRkMC00YTBjLWE2Y2YtMDhhYTUxOWJjYmQ4', 'YzU4NDY1N2ItZGZkZS00YWYwLTkzMjEtZTYxNDE3OTJjYjRk', 'MzNmMjA2MjgtODM4MC00NDkxLWJmYTYtMThhYzAzYTg0MDQw', 'YzU0Y2Y5ZDktOWFiNi00OWJmLWE3MGEtMjM5MDRiZGFhZThk', 'YTc5YWQ5MjItZTA5MS00ZDIwLWJjODItYjJmMTg5N2M0Yzc3', 'OWFiNTBjMzAtMjU1NS00ZDdkLTllZTktYTMxODJjYzg3N2Mw', 'OGVmODIwZGEtNzc4ZS00NzZhLTkzODYtMDE1YzVjMDE3NzRk', 'YzFlNDU1YTAtNmFjNS00MzM1LWJhMGQtNDBiYWEwNTkyMjcx', 'NWFlOTAyNDYtZjViNS00Y2RiLWEzM2QtYTI2ODkwZTBmMTk1', 'MzJhMzRkYmItY2FiYy00OTEyLThiNTYtZjFkYTU5OGExNGRl', 'YjIzOWRmMjUtNTcxZC00OGIzLWE1Y2UtY2Q5MDcxZWVkNmFl', 'ODZjMzQzYjgtNGNhYy00NzdiLThmYmMtZjEwMWI3YzRlZjhl', 'NWU4ZDQ3NjItNjc0Ny00YzdiLWEwMTUtZWNlYWMzMmQ5MzQz', 'Y2MwNmI0MzItMDdjZC00ZGUwLTk1MzktMzFlMGZmMjJkYzMx', 'ZmZmYWZiYmQtY2I4ZS00MGZmLWFlMjktNzM2MWM1NGUzNDcy', 'YjE1YWY4YzUtZDAxMy00NDA2LWI1OGQtMzdiODIwZjQ0M2I0']"]' ORDER BY distance asc
 LIMIT $2::INTEGER]
[parameters: (None, 5)]
(Background on this error at: https://sqlalche.me/e/20/f405)
@spovedavoladi spovedavoladi added bug Something isn't working triage Issue needs to be triaged/prioritized labels Apr 25, 2024
Copy link

dosubot bot commented Apr 25, 2024

The PostgresSyntaxError you're encountering seems to be related to the syntax of the SQL query generated by the use of MetadataFilters with the "in" operator. This error often occurs when the format of the list of IDs doesn't match what PostgreSQL expects, especially in the context of an "IN" clause.

To address this issue:

  1. Check the Format of ID List: Ensure the list of IDs passed to MetadataFilters is correctly formatted. PostgreSQL expects a tuple for the "IN" clause. If there's only one ID, it should still be formatted as a tuple, like (id1,).

  2. Parameterized Queries: If constructing SQL queries manually, use parameterized queries to avoid syntax errors. Instead of inserting the list directly into the query string, use placeholders and provide the list as a parameter.

  3. Logging: Implement logging around the query generation to capture the exact SQL query being sent to PostgreSQL. This can help pinpoint where the syntax error is introduced.

  4. SQL Generation Logic: If MetadataFilters translate to SQL queries internally, review this conversion process to ensure it handles lists of IDs correctly, particularly for the "IN" operator.

By following these steps, you should be able to identify and correct the syntax error causing the PostgresSyntaxError.

Details

Sources


About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

To continue the conversation, mention @dosu.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized
Projects
None yet
Development

No branches or pull requests

1 participant