Intermittent playground crash #46

louis030195 · 2023-04-15T07:13:27Z

🐛 Bug Report

The playground crashes intermittently. It is a chain of issues that starts with the backend crashing (the default Supabase timeout is 5 seconds).

🔬 How To Reproduce

Steps to reproduce the behavior:

Ask a question that involves a large context, I guess (for example, ask a question against the whole Ethereum documentation).

📎 Additional context

Dashboard/playground

TypeError: Cannot read properties of undefined (reading 'data')
  File "app:///_next/server/pages/api/createContext.js", line 139, col 18, in <anonymous>
    return r.data;
  File "<anonymous>", in Array.map
  File "app:///_next/server/pages/api/createContext.js", line 132, col 30, in createContext
    const datas = topResults.map((r)=>{
  File "node:internal/process/task_queues", line 95, col 5, in process.processTicksAndRejections
  File "app:///_next/server/pages/api/createContext.js", line 170, col 21, in buildPrompt
    const context = await createContext(prompt, datasetIds, apiKey);
  File "/var/task/dashboard/node_modules/@sentry/nextjs/cjs/server/wrapApiHandlerWithSentry.js", line 143, col 33, in <anonymous>
    const handlerResult = await wrappingTarget.apply(thisArg, args);
  File "/var/task/dashboard/node_modules/next/dist/server/api-utils/node.js", line 372, col 9, in Object.apiResolver
    await resolver(req, res);
  File "/var/task/dashboard/node_modules/next/dist/server/next-server.js", line 513, col 9, in NextNodeServer.runApi
    await (0, _node1).apiResolver(req.originalRequest, res.originalResponse, query, pageModule, {
  File "/var/task/dashboard/node_modules/next/dist/server/next-server.js", line 815, col 35, in Object.fn
    handled = await this.handleApiRequest(req, res, query, // TODO: see if we can add a runtime check for this
  File "/var/task/dashboard/node_modules/next/dist/server/router.js", line 243, col 32, in Router.execute
    const result = await route.fn(req, res, params, parsedUrlUpdated, upgradeHead);

Backend

different-ai/embedbase-hosted#3

Solutions:

increase the timeout (see "setting timeout parameter does not change timeout", supabase-community/supabase-py#376)
SQL optimisation (better)


louis030195 · 2023-04-15T07:14:04Z

PS: need to improve error handling when using embedbase-js sdk at some level to determine

louis030195 · 2023-04-15T10:31:49Z

need to try dot product instead of cosine similarity in pgvector:

...
  select
    STUFFHERE,
    (documents.embedding <#> embedding) * -1 as similarity
  from documents

  -- The dot product is negative because of a Postgres limitation, so we negate it
  and (documents.embedding <#> embedding) * -1 > match_threshold

  -- OpenAI embeddings are normalized to length 1, so
  -- cosine similarity and dot product will produce the same results.
  -- Using dot product which can be computed slightly faster.
  --
  -- For the different syntaxes, see https://github.com/pgvector/pgvector
  order by documents.embedding <#> embedding
  
  limit match_count;
end;
$$;

need to set up performance monitoring beforehand though
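A minimal sketch of the claim in the SQL comment above: for unit-length vectors, cosine similarity and dot product give the same ranking. The vectors are random stand-ins (not real OpenAI embeddings) and all function names are made up for illustration.

```python
import math
import random


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to length 1, as the SQL comment says OpenAI embeddings are."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


random.seed(0)
# Hypothetical 8-dimensional embeddings, only for illustration.
query = normalize([random.gauss(0, 1) for _ in range(8)])
docs = [normalize([random.gauss(0, 1) for _ in range(8)]) for _ in range(5)]

# Rank by cosine similarity, most similar first.
by_cosine = sorted(
    range(len(docs)),
    key=lambda i: cosine_similarity(query, docs[i]),
    reverse=True,
)

# pgvector's <#> operator returns the negative inner product, so sorting
# ascending on it (as in "order by documents.embedding <#> embedding")
# also puts the most similar rows first.
by_neg_dot = sorted(range(len(docs)), key=lambda i: -dot(query, docs[i]))

print(by_cosine == by_neg_dot)
```

If the embeddings were not normalized, the two rankings could differ, so the swap only makes sense for unit-length vectors.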

louis030195 · 2023-04-20T12:19:37Z

pgvector/pgvector#82

louis030195 · 2023-04-24T12:23:44Z

other things that could be tried and will very likely improve perf:

https://github.com/pgvector/pgvector#query-options
increase the number of lists (the table is growing beyond the size for which 100 lists is optimal): CREATE INDEX ON items USING ivfflat (embedding vector_ip_ops) WITH (lists = 220);
use ScaNN/FAISS + Supabase
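For picking the lists value, the pgvector README suggests starting from rows / 1000 for up to 1M rows and sqrt(rows) above that, with probes starting around sqrt(lists). A rough sketch of that heuristic follows; the function names and the row counts are hypothetical, not taken from this project's database.

```python
import math


def suggested_lists(row_count):
    """Starting point for ivfflat `lists`, following the pgvector README heuristic."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return math.isqrt(row_count)


def suggested_probes(lists):
    """Starting point for `ivfflat.probes`: roughly sqrt(lists)."""
    return max(1, math.isqrt(lists))


# Hypothetical table size, chosen only to show the arithmetic.
rows = 220_000
lists = suggested_lists(rows)
print(lists, suggested_probes(lists))
```

These are starting points only; the right values depend on the recall and latency actually measured.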

louis030195 · 2023-04-24T13:53:01Z

To update the index to use 220 lists, you'll need to first drop the existing index and then create a new index with the desired lists value. Here are the SQL commands to do that:

-- Drop the existing index
DROP INDEX documents_embedding_vector_cosine_ops_idx;

-- Create a new index with 220 lists
CREATE INDEX documents_embedding_vector_cosine_ops_idx
ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 220);

Replace documents_embedding_vector_cosine_ops_idx with the actual name of your index if it's different.

Dropping and recreating an index can have some impact on your users, depending on your database's current usage and workload. Here's how it might affect your users:

Query performance: While the index is being dropped and recreated, any queries that rely on this index may experience slower performance because the database will need to do a full table scan instead of using the index.
Table lock: A plain CREATE INDEX blocks writes (but not reads) on the table while the index builds, and a plain DROP INDEX takes a brief exclusive lock on the table. This can cause delays for users trying to access the table during the index operation.

To minimize the impact on your users, consider performing the index update during a maintenance window or a period of low database usage. Additionally, you can use the CONCURRENTLY keyword when creating the new index to avoid locking the table:

-- Create a new index with 220 lists, concurrently
CREATE INDEX CONCURRENTLY documents_embedding_vector_cosine_ops_idx
ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 220);

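Note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block. DROP INDEX also accepts the CONCURRENTLY keyword (PostgreSQL 9.2 and later), although a plain drop is generally a quick operation and should not cause significant disruption. To keep queries indexed throughout, create the new index under a different name first and drop the old one afterwards.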

louis030195 · 2023-04-28T07:58:41Z

Never mind all of this. The select query just needs to be made distinct so that duplicates are removed.
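To illustrate the effect of a distinct select, here is a small sketch that drops duplicate rows from a result set. This is not the actual fix, which according to the referenced commit is done in the database query itself; the row shape (`id`, `data`, `similarity`) and the function name are assumptions for illustration.

```python
def distinct_by_data(rows):
    """Keep one row per distinct `data` value, best similarity first.

    Mimics what a DISTINCT select achieves, but on the client side.
    """
    seen = set()
    unique = []
    for row in sorted(rows, key=lambda r: r["similarity"], reverse=True):
        if row["data"] in seen:
            continue  # same text already kept with an equal or higher score
        seen.add(row["data"])
        unique.append(row)
    return unique


# Hypothetical search results where the same document was inserted twice.
rows = [
    {"id": "a", "data": "Ethereum is a blockchain.", "similarity": 0.91},
    {"id": "b", "data": "Ethereum is a blockchain.", "similarity": 0.91},
    {"id": "c", "data": "Gas is the fee unit.", "similarity": 0.84},
]

print([r["id"] for r in distinct_by_data(rows)])  # ['a', 'c']
```

Fewer duplicate rows means less data returned per query, which is presumably why this made the index tuning above unnecessary.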


louis030195 · 2023-05-01T16:49:45Z

fixed 🚢🚢🚢🚢🚢

louis030195 added the bug label Apr 15, 2023

louis030195 added a commit that referenced this issue Apr 28, 2023

Release 1.0.6: fix: embedbase-core: distinct select in database by default fixing #46 (8afc5f3)

louis030195 closed this as completed May 1, 2023
