
Interesting corruption case: RuntimeError: bucket being iterated changed size and AssertionError: Bucket next pointer is damaged but only with C extension #68

Open
jamadden opened this issue Apr 13, 2017 · 2 comments
jamadden commented Apr 13, 2017

We have a corrupt BTree in a ZODB. I don't know exactly how we got here, but it just happened recently with the most recent releases of BTree and its C extension, so I suspect there's a lurking bug.

This happens to be an IOBTree in a catalog, so it's probably pretty large.

The first symptom is that doing list(btree.keys()) results in RuntimeError: the bucket being iterated changed size. But this is a single-threaded program, and that's an atomic call implemented in C, so changing size is not possible.

When we subsequently do btree._check() we get this happy error: AssertionError: Bucket next pointer is damaged. Those two errors seem to fit correctly together (I think).

Here's where it gets interesting. If we do the same two operations with the Python implementation instead of the C implementation, they both pass. The iteration doesn't do that kind of checking, so the absence of a RuntimeError is not surprising. But the Python _check does look for next pointers being damaged, and doesn't raise an error on our tree:

            for i in range(len(data)-1):
                assert_(data[i].child._next is data[i+1].child,
                       "Bucket next pointer is damaged")

So we appear to have a case where Python and C are interpreting the same pickle data in different ways. (That's bad.)
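
To make the invariant behind that assertion concrete, here is a minimal toy model. It is only a sketch: `Bucket`, `Item`, `build`, and `check_next_pointers` are hypothetical names invented for illustration, not the BTrees classes. Only the `.child` / `._next` attribute names are taken from the snippet quoted above.

```python
class Bucket:
    """Toy leaf node: some keys plus a pointer to the next leaf (hypothetical)."""

    def __init__(self, keys):
        self.keys = list(keys)
        self._next = None


class Item:
    """Toy parent entry exposing `.child`, as in the quoted check."""

    def __init__(self, child):
        self.child = child


def check_next_pointers(data):
    """Same shape as the quoted loop: adjacent children must be chained."""
    for i in range(len(data) - 1):
        assert data[i].child._next is data[i + 1].child, \
            "Bucket next pointer is damaged"


def build(*key_groups):
    """Create buckets, chain them left to right, and wrap them in parent items."""
    buckets = [Bucket(keys) for keys in key_groups]
    for left, right in zip(buckets, buckets[1:]):
        left._next = right
    return [Item(bucket) for bucket in buckets]


data = build([1, 2], [3, 4], [5, 6])
check_next_pointers(data)  # consistent chain: no error raised

# Simulate damage: the first bucket now skips over the second one.
data[0].child._next = data[2].child

damaged = None
try:
    check_next_pointers(data)
except AssertionError as exc:
    damaged = str(exc)

print(damaged)
```

In this model the check only compares what the parent believes its children are against what the leaves believe their successors are, so any disagreement between the two views trips the assertion.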

It's highly probable that this BTree was written to by both the Python and C implementations at different times over the past few weeks. I think we removed the BTree C extension while we investigated zopefoundation/persistent#62 (BTree being another library with a C extension that had recently changed). It seems the obvious compatibility issues of mixing C and Python implementations may be gone, but there's still something going on there.

I can try to take a look at this (if no one wants to beat me to it 😄 ), but it'll probably be a while before I can get back to it in depth.

jamadden added the bug label Apr 13, 2017
jamadden self-assigned this Apr 13, 2017
@papachoco

Thanks Jason

@d-maurer
Contributor

"#118 (comment)" contains the analysis of a ZODB provided by @matthchr. There, a bucket deletion updates the bucket chain correctly but "forgets" to update data in the parent. As a consequence, the bucket chain gets damaged.
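
The failure mode described here can be sketched with a toy model. This is an assumption-laden illustration, not the BTrees code: `Bucket`, `Parent`, `buggy_remove_bucket`, and the helper functions are all hypothetical names, and the real deletion path in the C extension may differ in detail.

```python
class Bucket:
    """Toy leaf node with keys and a next-leaf pointer (hypothetical)."""

    def __init__(self, keys):
        self.keys = list(keys)
        self._next = None


class Parent:
    """Toy interior node: an ordered list of child buckets plus the chain head."""

    def __init__(self, children):
        self.children = list(children)
        for left, right in zip(self.children, self.children[1:]):
            left._next = right
        self.firstbucket = self.children[0]


def keys_via_chain(parent):
    """Collect keys by following `_next` pointers from the first bucket."""
    keys, bucket = [], parent.firstbucket
    while bucket is not None:
        keys.extend(bucket.keys)
        bucket = bucket._next
    return keys


def chain_is_consistent(parent):
    """True when each child's `_next` is the following child in the parent."""
    return all(left._next is right
               for left, right in zip(parent.children, parent.children[1:]))


def buggy_remove_bucket(parent, index):
    """Unlink a bucket from the chain but leave the parent's child list alone."""
    victim = parent.children[index]
    if index > 0:
        parent.children[index - 1]._next = victim._next
    else:
        parent.firstbucket = victim._next
    # Deliberately missing (the assumed bug): del parent.children[index]


tree = Parent([Bucket([1, 2]), Bucket([3]), Bucket([5, 6])])
consistent_before = chain_is_consistent(tree)

tree.children[1].keys.clear()   # the middle bucket becomes empty
buggy_remove_bucket(tree, 1)    # chain is updated, parent is not

consistent_after = chain_is_consistent(tree)
print(consistent_before, consistent_after, keys_via_chain(tree))
```

In this sketch the chain itself is still walkable after the deletion, yet the parent still lists the unlinked bucket, so a parent-versus-chain comparison like the one in `_check` would report a damaged next pointer.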
