Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assert the validity of the milli databases in the tests #4586

Open
irevoire opened this issue Apr 18, 2024 · 0 comments
Open

Assert the validity of the milli databases in the tests #4586

irevoire opened this issue Apr 18, 2024 · 0 comments
Labels
maintenance Issue about maintenance (CI, tests, refacto...)
Milestone

Comments

@irevoire
Copy link
Member

Last sprint, we had a bunch of database corruptions going on that were really hard to implement.
One way we avoided this kind of issue on other projects was by running a complete scan of the database in the tests after each insertion/commit.
Milli contains a lot of databases, but we should implement the same kind of function.

I believe @dureuill already made something like that for the filters. It could be a good first step to integrate it into our test suite.
And then grows the number of checks.


This strategy has already been implemented in the index scheduler and arroy.
I think the most aggressive checks are made in arroy, here’s the code: https://github.com/meilisearch/arroy/blob/19e0a07d40fd2b7685b70c941fee00400e9dda24/src/reader.rs#L361-L441
But as a TLDR, here’s what I actually check:

All the trees must be valid

That means I can come from the root of a tree and follow every node till I reach every leaf without ever encountering an unknown node or item ID.

All the item IDs are used exactly once in every tree

While traversing a tree, I also ensure that every item ID the user provides is in the tree.
So, there is no extraneous ID and no missing ID.

The node of one tree cannot be used in another tree

A tree should never contain a reference to another tree; if that happens, it means something got corrupted somewhere, and we re-used a wrong node ID in an incremental build process.
It could be really hard to fix if caught too late.

Nothing is left unknown in the database

After checking all of that, I just went through an exhaustive list of everything in the database.
If I find anything else in the database, that’s a bug. It means that some nodes are leaked in the database, and over multiple indexing processes, the database could grow for no reason or, worse, cause corruption.

This check is called everywhere

In the end, in arroy, every time I update a database, I snapshot its content on disk just to be sure it never changes in an unexpected way.
And the function in charge of snapshotting the database calls the assert validity function:
https://github.com/meilisearch/arroy/blob/19e0a07d40fd2b7685b70c941fee00400e9dda24/src/tests/mod.rs#L41

@irevoire irevoire added the maintenance Issue about maintenance (CI, tests, refacto...) label Apr 18, 2024
@irevoire irevoire changed the title Check the validity of the milli databases in the tests Assert the validity of the milli databases in the tests Apr 18, 2024
@curquiza curquiza added this to the v1.9.0 milestone Apr 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Issue about maintenance (CI, tests, refacto...)
Projects
None yet
Development

No branches or pull requests

2 participants