Optimize visiting neighbors #386

hlinnaka · 2023-12-22T14:39:30Z

Here are two more micro-optimizations of HnswSearchLayer, for HNSW index build

1st Commit: Add a bulk-variant of AddToVisited

The idea is to move code around so that we collect all the 'hash' values to an array in a tight loop, before performing the hash table lookups. This codepath causes a lot of CPU cache misses, as the elements are scattered around memory, and performing all the fetches upfront allows the CPU to schedule fetching those cachelines sooner. That's my theory of why this works, anyway :-).

This gives a 5%-10% speedup on my laptop, on HNSW index build of a subset of the 'angular-100' dataset (same test I used on my previous PRs). I'd love to hear how this performs on other systems, as this could be very dependent on CPU details.

2nd & 3rd commits: Calculate 4 neighbor distances at a time in HNSW search

This is just a proof of concept at this stage, but shows promising results. The idea is to have a variant of the distance function that calculates the distance from one point 'q' to 4 other points in one call. That gives the vectorized loop in the distance function more work to do in each iteration. If you think this is worthwhile, I can spend more time polishing this, adding these array-variants as proper AM support functions, etc.

This gives another 10% speedup on the same test on my laptop. It could possibly be optimized further, by providing variants with different array sizes, or a variable-length version and let the function itself vectorize it in the optimal way. With some refactoring, I think we could also use this in CheckElementCloser(). This might also work well together with PR #311, but I haven't tested that.

jkatz · 2023-12-22T18:21:46Z

🚀 I still have on my TODO to run some serious benchmarks on these patches; I'll work to get those up and running to see how it performs under different contexts on some larger machines. Given they're focused on index building, my thought process is to test:

"Regression" tests using the ANN Benchmark suite to capture both performance/recall -- need to do two sets: with and without parallel build
Increasing concurrent building benchmarks (discussed here)
Testing with the parallel build benchmark on different large data sets (10MM + 100MM) with a focus on timing

ankane · 2024-01-08T20:55:51Z

Hey @hlinnaka, thanks for more PRs! I'm currently working on in-memory parallel index builds, but will dig into this and #388 once that's finished (as it may affect the impact on build time).

Thanks to CPU cache effects (I think), it's faster to fetch all the hashes first. This gives a 5%-10% speedup in HNSW build of a 100-dimension on my laptop.

Introduce a helper function to calculate the distances of all candidates in an array. It doesn't change much on its own, but paves the way for further optimizations, in next commit.

In my testing, this gives a further 10% speedup in HNSW index build.

hlinnaka · 2024-01-29T09:15:12Z

Rebased this

hlinnaka mentioned this pull request Jan 18, 2024

Parallel index builds for HNSW #409

Closed

hlinnaka added 3 commits January 29, 2024 11:06

Add a bulk-variant of AddToVisited

c426894

Thanks to CPU cache effects (I think), it's faster to fetch all the hashes first. This gives a 5%-10% speedup in HNSW build of a 100-dimension on my laptop.

Refactor distance calculations in HnswSearchlayer

9c9b95f

Introduce a helper function to calculate the distances of all candidates in an array. It doesn't change much on its own, but paves the way for further optimizations, in next commit.

PoC: Calculate 4 neighbor distances at a time in HNSW search

6e6e1c7

In my testing, this gives a further 10% speedup in HNSW index build.

hlinnaka force-pushed the optimize-visit-neighbors branch from 822af36 to 6e6e1c7 Compare January 29, 2024 09:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize visiting neighbors #386

Optimize visiting neighbors #386

hlinnaka commented Dec 22, 2023

jkatz commented Dec 22, 2023

ankane commented Jan 8, 2024

hlinnaka commented Jan 29, 2024

Optimize visiting neighbors #386

Are you sure you want to change the base?

Optimize visiting neighbors #386

Conversation

hlinnaka commented Dec 22, 2023

1st Commit: Add a bulk-variant of AddToVisited

2nd & 3rd commits: Calculate 4 neighbor distances at a time in HNSW search

jkatz commented Dec 22, 2023

ankane commented Jan 8, 2024

hlinnaka commented Jan 29, 2024