SVE vector optimization for halfvectors dot product calculation #536
Hi @ankane and @jkatz!
I optimized the dot product calculation for half vectors using the SVE extension, which is present on many ARM machines suitable for vector search.
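For readers unfamiliar with SVE, here is a minimal sketch of what a vector-length-agnostic half-precision dot product can look like. It is not the code from the patch: the function name `halfvec_dot_sketch`, the choice to pass halves as raw `uint16_t` bit patterns, and the float32 accumulation are assumptions made for illustration. When the compiler does not target SVE, the sketch falls back to a portable scalar loop so it still builds on other machines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>

/* Hypothetical name; halves are passed as raw IEEE 754 binary16 bits. */
float halfvec_dot_sketch(size_t dim, const uint16_t *a, const uint16_t *b)
{
    svfloat32_t acc = svdup_n_f32(0.0f);
    uint64_t step = svcntw(); /* 32-bit lanes per vector, known only at run time */

    for (uint64_t i = 0; i < dim; i += step)
    {
        /* Predicate masks off lanes past the end, so no scalar tail loop is needed */
        svbool_t pg = svwhilelt_b32_u64(i, (uint64_t) dim);

        /* Load 16-bit values zero-extended into 32-bit lanes */
        svuint32_t ua = svld1uh_u32(pg, a + i);
        svuint32_t ub = svld1uh_u32(pg, b + i);

        /* Widen the half stored in the low 16 bits of each lane to float32 */
        svfloat32_t fa = svcvt_f32_f16_x(pg, svreinterpret_f16_u32(ua));
        svfloat32_t fb = svcvt_f32_f16_x(pg, svreinterpret_f16_u32(ub));

        /* acc += fa * fb on active lanes; inactive lanes keep their old value */
        acc = svmla_f32_m(pg, acc, fa, fb);
    }

    /* Horizontal sum across all lanes */
    return svaddv_f32(svptrue_b32(), acc);
}

#else

/* Portable decode of binary16 bits to float, used when SVE is unavailable */
static float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t) (h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    float f;

    if (exp == 0 && mant == 0)
        bits = sign;                                   /* signed zero */
    else if (exp == 0)
    {
        int shift = 0;                                 /* subnormal: renormalize */
        while ((mant & 0x400u) == 0) { mant <<= 1; shift++; }
        bits = sign | ((uint32_t) (113 - shift) << 23) | ((mant & 0x3FFu) << 13);
    }
    else if (exp == 0x1F)
        bits = sign | 0x7F800000u | (mant << 13);      /* infinity or NaN */
    else
        bits = sign | ((exp + 112) << 23) | (mant << 13);

    memcpy(&f, &bits, sizeof f);
    return f;
}

float halfvec_dot_sketch(size_t dim, const uint16_t *a, const uint16_t *b)
{
    float sum = 0.0f;

    for (size_t i = 0; i < dim; i++)
        sum += half_bits_to_float(a[i]) * half_bits_to_float(b[i]);
    return sum;
}

#endif
```

The SVE path needs a compiler flag such as `-march=armv8.2-a+sve`. Because lanes are summed in a different order than in a scalar loop, results may differ from the scalar version in the last bits for long vectors.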
Results show that on the same machine (Graviton3):
insert into emb_f16_3 (vector) select vector::halfvec(1536) from emb;
Possibly because SVE intrinsics make sense when we deal with many float16 numbers at once, as in the dot product calculation. So it's not in the patch.
I suggest that if the performance gain for (1) is good enough, we could try adding some architecture checks to the patch. I'm not convinced that the way I did it is optimal, so if anyone more experienced with these checks can suggest improvements, I would appreciate it very much.
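As one possible shape for such a check, here is a hedged sketch of run-time SVE detection on Linux/AArch64 using `getauxval(AT_HWCAP)`. The function name `cpu_has_sve_sketch` is hypothetical and this is not what the patch currently does. On any other platform the sketch simply reports that SVE is unavailable.

```c
#include <stdbool.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h> /* getauxval, AT_HWCAP */

/* Normally provided by the system headers; the value matches the
 * Linux arm64 uapi definition and is only a safety net here. */
#ifndef HWCAP_SVE
#define HWCAP_SVE (1UL << 22)
#endif
#endif

/* Hypothetical helper: true if the running CPU and kernel expose SVE */
bool cpu_has_sve_sketch(void)
{
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return false; /* unknown platform: assume no SVE */
#endif
}
```

A check like this could be evaluated once at load time to pick between an SVE kernel and the existing scalar code. Note that the SVE kernel itself would then have to be compiled with SVE enabled for that function only, while the rest of the build keeps the baseline target.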
It may also be worth adding SVE-optimized cosine and L2 distance functions and testing those cases separately.