SVE vector optimization for halfvectors dot product calculation #536

Open
pashkinelfe wants to merge 2 commits into master

Conversation

pashkinelfe
Contributor

Hi, @ankane and @jkatz!

I optimized the dot product calculation for half vectors using the SVE extension, which is available on many of the ARM machines suitable for vector search.
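
To give an idea of the approach, below is a minimal sketch of a predicated SVE half-precision inner product loop. The function and variable names are illustrative, not the exact code in the patch, and a production version might widen the accumulator to float32 for precision:

    #include <arm_sve.h>

    /* Illustrative SVE half-precision inner product. Accumulates in
     * float16 for brevity; a production version might widen to float32. */
    static float
    HalfvecInnerProductSVE(const float16_t *ax, const float16_t *bx, int dim)
    {
        svfloat16_t sum = svdup_f16(0);
        int64_t     i = 0;
        svbool_t    pg = svwhilelt_b16(i, (int64_t) dim);

        while (svptest_any(svptrue_b16(), pg))
        {
            svfloat16_t a = svld1_f16(pg, ax + i);
            svfloat16_t b = svld1_f16(pg, bx + i);

            /* sum += a * b for active lanes; inactive lanes keep sum */
            sum = svmla_f16_m(pg, sum, a, b);

            i += svcnth();      /* number of 16-bit lanes per SVE vector */
            pg = svwhilelt_b16(i, (int64_t) dim);
        }

        /* Horizontal add across all lanes, widened to float on return */
        return (float) svaddv_f16(svptrue_b16(), sum);
    }

The predicate generated by svwhilelt_b16 handles the tail elements, so no separate scalar cleanup loop is needed and the same code works for any SVE vector length.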

  1. Testing HNSW build time (same test as in "Can I help get tinyint or half branches released?" #326 (comment) and "Parallel index builds for HNSW" #409 (comment)):

[image: HNSW build time benchmark results]

Results show that on the same machine (Graviton3):

  • index build time for half vectors using the default inner product function is better than for float32 vectors only at a high number of cores, possibly due to fewer locks, memory accesses, and disk I/O
  • index build time with the SVE inner product function for half vectors is better than with the default inner product function at any number of cores, even for a serial build
  2. I tried to add an SVE optimization for converting float32 <-> float16 but found no performance gain in
    insert into emb_f16_3 (vector) select vector::halfvec(1536) from emb;

Possibly this is because SVE intrinsics pay off only when many float16 numbers are processed together, as in the dot product calculation, so the conversion change is not included in the patch.
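
For context, one way such a float32 -> float16 conversion could be written with SVE intrinsics is sketched below; this is illustrative only, not the exact code from my experiment:

    #include <arm_sve.h>
    #include <stdint.h>

    /* Illustrative SVE float32 -> float16 conversion; dst receives the raw
     * binary16 bits. svcvt_f16_f32 places each result in the low 16 bits of
     * its 32-bit container, so svst1h stores exactly the converted values. */
    static void
    Float4ToHalfSVE(const float *src, uint16_t *dst, int count)
    {
        int64_t     i = 0;
        svbool_t    pg = svwhilelt_b32(i, (int64_t) count);

        while (svptest_any(svptrue_b32(), pg))
        {
            svfloat32_t v = svld1_f32(pg, src + i);
            svfloat16_t h = svcvt_f16_f32_x(pg, v);

            /* Narrowing store: keep the low 16 bits of each active lane */
            svst1h_u32(pg, dst + i, svreinterpret_u32_f16(h));

            i += svcntw();      /* number of 32-bit lanes per SVE vector */
            pg = svwhilelt_b32(i, (int64_t) count);
        }
    }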

I suggest that if the performance gain for (1) is good enough, we could add architecture checks to the patch. I'm not convinced the way I did it is optimal, so if anyone more experienced with these checks can suggest improvements, I would appreciate it very much.
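
As a starting point, the kind of runtime check I have in mind is sketched below; it is Linux/AArch64-specific and the function name is illustrative. The SVE code itself would additionally need a compile-time __ARM_FEATURE_SVE guard (or a separate compilation unit built with an -march value that enables SVE):

    #include <stdbool.h>

    #if defined(__linux__) && defined(__aarch64__)
    #include <sys/auxv.h>       /* getauxval, AT_HWCAP */
    #include <asm/hwcap.h>      /* HWCAP_SVE */
    #endif

    /* Illustrative runtime check for SVE support (Linux/AArch64 only);
     * other platforms would need their own detection or simply fall back
     * to the default scalar code. */
    static bool
    SupportsSVE(void)
    {
    #if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_SVE)
        return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
    #else
        return false;
    #endif
    }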

It may also be worth adding SVE-optimized cosine and L2 distance functions and testing those cases separately.
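
For reference, an SVE L2 distance would follow the same structure as the inner product sketch above, accumulating squared differences instead of products; again the names are illustrative and the caller would take the square root of the result:

    #include <arm_sve.h>

    /* Illustrative SVE squared L2 distance for half vectors. */
    static float
    HalfvecL2SquaredDistanceSVE(const float16_t *ax, const float16_t *bx, int dim)
    {
        svfloat16_t sum = svdup_f16(0);
        int64_t     i = 0;
        svbool_t    pg = svwhilelt_b16(i, (int64_t) dim);

        while (svptest_any(svptrue_b16(), pg))
        {
            svfloat16_t diff = svsub_f16_x(pg, svld1_f16(pg, ax + i),
                                           svld1_f16(pg, bx + i));

            /* sum += diff * diff for active lanes */
            sum = svmla_f16_m(pg, sum, diff, diff);

            i += svcnth();
            pg = svwhilelt_b16(i, (int64_t) dim);
        }

        return (float) svaddv_f16(svptrue_b16(), sum);
    }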
