Support UUID as arrow type #1433

Open
eddyxu opened this issue Oct 19, 2023 · 6 comments
Labels
arrow Apache Arrow related issues

Comments

@eddyxu
Contributor

eddyxu commented Oct 19, 2023

  • Support specifying UUID as a type in an Arrow schema
  • Store it as a fixed-size data type (128 bits)
  • Provide Python and JavaScript definitions in both LanceDB SDKs
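
To illustrate the fixed-size storage in the list above, here is a minimal sketch using only the Python standard library. It is not Lance or Arrow code; the helper names `pack_uuids` and `unpack_uuid` and the constant `UUID_WIDTH` are hypothetical. It only shows why a 128-bit fixed width is convenient: every value occupies exactly 16 bytes, so slot `i` starts at byte offset `16 * i` and no offsets buffer is needed.

```python
import uuid

UUID_WIDTH = 16  # 128 bits, the fixed width proposed in this issue


def pack_uuids(values):
    """Concatenate UUIDs into one contiguous buffer, 16 bytes per value."""
    return b"".join(v.bytes for v in values)


def unpack_uuid(buffer, index):
    """Read the UUID stored in slot `index` by computing its byte offset."""
    start = index * UUID_WIDTH
    return uuid.UUID(bytes=buffer[start:start + UUID_WIDTH])


# Deterministic sample values so the layout is easy to inspect.
ids = [uuid.UUID(int=i) for i in (1, 2, 3)]
buf = pack_uuids(ids)
second = unpack_uuid(buf, 1)
```

A fixed-size binary column in Arrow follows the same idea, which is presumably why the issue asks for a fixed-size type rather than a string.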
eddyxu added the arrow (Apache Arrow related issues) label Oct 19, 2023
@rok
Contributor

rok commented Oct 24, 2023

Is this meant as an extension type (e.g. https://github.com/apache/arrow/pull/37298) or as part of the Lance format?

@eddyxu
Contributor Author

eddyxu commented Oct 27, 2023

It'd be great if we could natively support the UUID type from pyarrow. We also want to support interoperability in JavaScript.

@rok
Contributor

rok commented Oct 27, 2023

I have an upstream PR in Arrow (apache/arrow#37298) that may get merged for 15.0.0.
Should we wait a quarter to get it, or implement a temporary equivalent in pylance (e.g. #1471)?

I'll start working on the js one.

@rok
Contributor

rok commented Dec 20, 2023

@eddyxu I was planning to continue work on this after the upstream extension type is merged (apache/arrow#37298). If you want the UUID type in Python before that, #1471 can be merged. If not, #1471 can be closed out.

@eddyxu
Contributor Author

eddyxu commented Dec 20, 2023

Once the upstream PR is merged, do we need to change anything in Lance to support it?
I can take it over from there.

@rok
Contributor

rok commented Dec 20, 2023

It'll be available in pyarrow in the same way that e.g. FixedShapeTensorArray is; see test here. I'm not sure whether you'll need changes on the Rust side of things.
