fix(nodejs): add better error handling when missing embedding functions #1290

Merged (10 commits) on May 14, 2024

universalmind303
Contributor

Note: running the default lint command, npm run lint -- --fix, seems to have made a lot of unrelated changes.

Comment on lines 928 to 945
      async add(
        data: Array<Record<string, unknown>> | ArrowTable
      ): Promise<number> {
    -   const schema = await this.schema
    -   let tbl: ArrowTable
    +   const schema = await this.schema;
    +
    +   let tbl: ArrowTable;
    +
    +   const schemaWithoutEmbeddings = validateSchemaEmbeddings(
    +     schema,
    +     data,
    +     this._embeddings
    +   );
    +
        if (data instanceof ArrowTable) {
    -     tbl = data
    +     tbl = data;
        } else {
    -     tbl = makeArrowTable(data, { schema })
    +     tbl = makeArrowTable(data, { schema: schemaWithoutEmbeddings });
        }
Contributor Author

This is the relevant code change. Everything else is just from running the linter.

}
}

function validateSchemaEmbeddings(
Contributor Author

This is the relevant code change. Everything else is just from running the linter.

@universalmind303 universalmind303 changed the title fix: add better error handling when missing embedding functions fix(node, vectordb): add better error handling when missing embedding functions May 10, 2024
@universalmind303 universalmind303 changed the title fix(node, vectordb): add better error handling when missing embedding functions fix(nodejs): add better error handling when missing embedding functions May 13, 2024
@universalmind303
Contributor Author

@wjones127 @westonpace, I'm unable to reproduce the CI failures locally. Could one of you take a look and see if you can reproduce them?

@westonpace
Contributor

> @wjones127 @westonpace, I'm unable to reproduce the CI failures locally. Could one of you take a look and see if you are able to reproduce.

@universalmind303 It does reproduce locally for me. It appears that it's this line:

    if (data[0][field.name] === undefined) {

I suspect data[0] is undefined.
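
The suspected failure mode can be reproduced in isolation. The sketch below is not the PR's code: `hasMissingField` and `hasMissingFieldSafe` are hypothetical helpers, and it assumes the check can be reached with an empty array, in which case `data[0]` is `undefined` and the property lookup throws a `TypeError`. Treating an empty batch as "nothing to validate" is also an assumption, not something the thread confirms.

```typescript
type Row = Record<string, unknown>;

// Mirrors the check quoted above. With an empty array, `data[0]` is
// undefined, so indexing into it throws a TypeError at runtime.
function hasMissingField(data: Row[], fieldName: string): boolean {
  return data[0][fieldName] === undefined;
}

// Guarded variant (hypothetical): look at the first row only if it exists.
function hasMissingFieldSafe(data: Row[], fieldName: string): boolean {
  const first = data[0];
  if (first === undefined) {
    // Assumption: an empty batch has nothing to validate.
    return false;
  }
  return first[fieldName] === undefined;
}
```

Whether an empty batch should pass validation or be rejected is a design choice; the guard only ensures the check itself does not crash.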

@github-actions github-actions bot added the bug and typescript labels May 13, 2024
Contributor

@westonpace westonpace left a comment

I might be reading this wrong, but does this prevent a user from providing a fixed-size-list column directly without using the embeddings function? In other words, if a user is calculating their own embeddings (not using lancedb to do it), will it still work?

Also, any change added to node/src/arrow.ts is probably appropriate for nodejs/src/arrow.ts.

@universalmind303
Contributor Author

> I might be reading this wrong but does this prevent a user from providing a fixed-size-list column directly without using the embeddings function? In other words, if a user is calculating their own embeddings (not using lancedb to do it) will it still work?

Users can still provide their own embeddings manually without using the embed functionality. It only errors out if the schema contains a vector field that has no matching implementation, meaning neither manually supplied vectors nor an embedding function.

Example:

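    // Arrow types used by the examples below.
    import { Schema, Field, Int32, Utf8, FixedSizeList, Float32 } from "apache-arrow";

    const schema = new Schema([
      new Field("id", new Int32()),
      new Field("text", new Utf8()),
      new Field(
        "vector",
        new FixedSizeList(2, new Field("item", new Float32(), true))
      ),
    ]);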

User provides a schema and manual embeddings (OK):

    const data = [
      { id: 1, text: "foo", vector: [0.1, 0.2] },
      { id: 2, text: "bar", vector: [0.3, 0.4] },
    ];
    let table = await con.createTable({
      name: "test",
      data,
      schema,
    });

User provides a schema and an embedding function (OK):

    const embeddingFunction = {...}
    const data = [
      { id: 1, text: "foo" },
      { id: 2, text: "bar" },
    ];
    let table = await con.createTable({
      name: "test",
      data,
      schema,
      embeddingFunction,
    });

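User provides a schema, but neither embeddings nor an embedding function (not OK):

    const data = [
      { id: 1, text: "foo" },
      { id: 2, text: "bar" },
    ];
    // Errors out: the schema has a vector field with no matching implementation.
    let table = await con.createTable({
      name: "test",
      data,
      schema,
    });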
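
The three cases above can be sketched as a small standalone check. This is only an illustration of the rule described in this comment, not the PR's `validateSchemaEmbeddings`: `FieldSpec`, `EmbeddingSpec`, `destColumn`, and `fieldsExpectedInData` are hypothetical names, plain objects stand in for the Arrow schema, and returning the fields without the embedding-filled column is a guess based on the variable name `schemaWithoutEmbeddings`.

```typescript
type Row = Record<string, unknown>;

// Hypothetical stand-in for an Arrow field: its name and whether it is a
// vector (fixed-size-list) column.
interface FieldSpec {
  name: string;
  isVector: boolean;
}

// Hypothetical stand-in for an embedding function: the column it fills.
interface EmbeddingSpec {
  destColumn: string;
}

// Returns the fields that the raw data itself must supply. Throws when a
// vector field has neither data nor a matching embedding function.
function fieldsExpectedInData(
  fields: FieldSpec[],
  data: Row[],
  embedding?: EmbeddingSpec
): FieldSpec[] {
  const first = data[0];
  if (first === undefined) {
    // Assumption: an empty batch has nothing to validate.
    return fields;
  }
  const kept: FieldSpec[] = [];
  for (const field of fields) {
    const present = first[field.name] !== undefined;
    if (!field.isVector || present) {
      kept.push(field); // regular column, or vectors supplied manually
    } else if (embedding?.destColumn === field.name) {
      continue; // filled later by the embedding function
    } else {
      throw new Error(
        `Vector field "${field.name}" has no data and no embedding function`
      );
    }
  }
  return kept;
}

const fields: FieldSpec[] = [
  { name: "id", isVector: false },
  { name: "text", isVector: false },
  { name: "vector", isVector: true },
];

// Case 1: manual embeddings, all three fields come from the data.
const manual = fieldsExpectedInData(fields, [
  { id: 1, text: "foo", vector: [0.1, 0.2] },
]);

// Case 2: embedding function fills "vector", so only two fields remain.
const embedded = fieldsExpectedInData(
  fields,
  [{ id: 1, text: "foo" }],
  { destColumn: "vector" }
);
```

Calling `fieldsExpectedInData(fields, [{ id: 1, text: "foo" }])` with no embedding function corresponds to the third case and throws.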

Contributor

@westonpace westonpace left a comment

Got it, looks good.

@universalmind303 universalmind303 merged commit bc582bb into lancedb:main May 14, 2024
10 checks passed
@universalmind303 universalmind303 deleted the iss-1289 branch May 14, 2024 13:43
universalmind303 added a commit that referenced this pull request May 15, 2024
i accidentally left a console.log when doing #1290