Bug: Deadlock in concurrent update()s #396

chun0nick · 2024-04-12T20:33:30Z

Describe the bug

A deadlock came up during testing; it appears to originate in index.hpp. We're using the Rust bindings for v2.8.16, so the calls go through index_dense.hpp. The test deleted a bunch of vectors and then called add() from multiple threads, which in turn called update() on the deleted nodes.

In update() we acquire the lock on the deleted node. We then call connect_node_across_levels_, which invokes search_for_one_ and search_to_insert_; each of those locks every candidate node it looks at.

This update lock on the deleted node seems to be the problem. If another thread performing an update selects that deleted node as a "next" candidate, it has to wait until the first thread is completely done updating, because the update lock isn't dropped until the end of the call. What we've seen fairly consistently (with 16 threads updating) is a deadlock in which each thread needs a candidate lock that another updating thread already holds, and that thread in turn needs a candidate lock held by yet another thread.
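To make the circular wait concrete, here is a minimal standalone C++ sketch. It does not use USearch at all: `count_blocked_updates` and `node_locks` are hypothetical names, and the setup assumes the worst case, where each worker's "next" candidate is exactly the node another worker is updating. It probes with try_lock() so the program can report the cycle instead of hanging.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical model, not USearch code. Each worker stands in for one update():
// it holds the lock of the node it is re-inserting for the whole call, then
// reaches for the lock of a candidate node that another worker is updating.
// Returns how many workers found their candidate's lock already held.
std::size_t count_blocked_updates(std::size_t workers) {
    assert(workers >= 2); // with one worker the candidate would be its own node

    std::vector<std::mutex> node_locks(workers);
    std::atomic<std::size_t> holding{0};  // workers that hold their own node lock
    std::atomic<std::size_t> finished{0}; // workers that have probed a candidate
    std::atomic<std::size_t> blocked{0};

    auto update = [&](std::size_t self) {
        // The "update lock": taken first and kept until the end of the call.
        std::lock_guard<std::mutex> own(node_locks[self]);
        holding.fetch_add(1);
        while (holding.load() < workers)
            std::this_thread::yield(); // wait until every update lock is held

        // Assumed worst case: the candidate is the neighbour's node.
        std::size_t candidate = (self + 1) % workers;
        if (node_locks[candidate].try_lock())
            node_locks[candidate].unlock();
        else
            blocked.fetch_add(1); // a blocking lock() would wait here forever

        // Keep our own lock until everyone has probed, as update() does.
        finished.fetch_add(1);
        while (finished.load() < workers)
            std::this_thread::yield();
    };

    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < workers; ++i)
        threads.emplace_back(update, i);
    for (auto& t : threads)
        t.join();
    return blocked.load();
}
```

With 16 workers every probe fails, so the call returns 16. If the probe were a blocking lock(), no worker could ever release its own node, which matches the hang described above.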

This may be the issue described in #354?

Steps to reproduce

Create an index and add a reasonable number of vectors. Delete them, then concurrently call add (update) from a bunch of threads.
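Since a deadlocked run never returns, a reproduction is easier to automate with a timeout around the worker threads. Below is a rough sketch of such a harness in plain C++. `finishes_within` and `run_state` are hypothetical names, and the `work` callback is a placeholder where the concurrent add()/update() calls against the index would go; no USearch API is assumed here.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: shared bookkeeping for the worker threads.
struct run_state {
    std::mutex mutex;
    std::condition_variable all_done;
    std::size_t done = 0;
};

// Runs work(thread_index) on `threads` detached threads and returns true only
// if all of them finish before `timeout` expires. Threads are detached on
// purpose: if they deadlock, joining them would hang the harness as well.
bool finishes_within(std::size_t threads, std::chrono::milliseconds timeout,
                     std::function<void(std::size_t)> work) {
    auto state = std::make_shared<run_state>();
    for (std::size_t i = 0; i < threads; ++i) {
        std::thread([state, work, i] {
            work(i); // placeholder for the concurrent add()/update() calls
            {
                std::lock_guard<std::mutex> guard(state->mutex);
                ++state->done;
            }
            state->all_done.notify_one();
        }).detach();
    }
    std::unique_lock<std::mutex> lock(state->mutex);
    return state->all_done.wait_for(
        lock, timeout, [&] { return state->done == threads; });
}
```

A false result means at least one thread was still stuck when the timeout expired. The stuck threads are leaked, so a test using this should fail and exit the process at that point.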

Expected behavior

The concurrent updates should not conflict with each other.

USearch version

v2.8.16

Operating System

Amazon Linux 2

Hardware architecture

x86

Which interface are you using?

Other bindings

Contact Details

[email protected]

Is there an existing issue for this?

I have searched the existing issues

Code of Conduct

I agree to follow this project's Code of Conduct


ashvardanian · 2024-04-12T21:02:43Z

Good catch! Looking into it! Feel free to submit a test-case or patches via PR while I do 🤗

ashvardanian · 2024-04-15T00:57:53Z

Replicated it, will make it part of the primary testing suite in C++. Thanks for finding and reporting the issue, @chun0nick! Would you be open to being mentioned as a contributor for this patch? 🤗

chun0nick · 2024-04-15T17:17:33Z

Awesomely fast response! And sorry - was planning to give you a test case today. Sure, would be happy to be mentioned as a contributor.

sef43 · 2024-04-23T13:45:27Z

I had what looked like a deadlock when using the Python API (as part of the usearch_molecules indexing step) with 32 threads. Could this bug be the cause of that?

ashvardanian · 2024-04-23T15:01:21Z

Yes, @sef43, the issue is coming from the core in update-heavy workloads. Haven't finished patching yet, was busy the last few days with new bindings for UForm and new algos in StringZilla.

chun0nick added the "bug" label (Something isn't working) on Apr 12, 2024
