RCORE-2141 RCORE-2142 Clean up a bunch of old encryption cruft #7698

Merged
tgoyne merged 2 commits into master from tg/file-map-cache
Jun 6, 2024

Conversation

tgoyne
Member

@tgoyne tgoyne commented May 15, 2024

The global shared cache of encrypted file maps was originally required because we actually opened Realm files multiple times in normal usage, so each of the open files had to know about each other to copy things around. #4839 made it so that in normal usage we only ever have one DB instance per file per process, so it became dead code. Multiprocess encryption made it unnecessary even when the one-DB-per-process rule is violated, as the multiprocess code path covers that.

This eliminates our last reliance on file UniqueIDs, so it lets us get rid of hacks related to that.

The encryption page reclaimer mostly never actually worked. It used a very conservative page reclamation rule that meant that pages would never be reclaimed if there was a long-lived Transaction, even if it was frozen or kept refreshed. This is very common in practice, and when it doesn't happen the DB usually isn't kept open either, making it redundant.

Encryption used to rely on handling BAD_EXEC signals (or mach exceptions) rather than explicit barriers, so it had to read and write in page-sized chunks. That's no longer the case, so we can eliminate a lot of complexity by always reading and writing in 4k blocks.
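
To illustrate what working in fixed 4k blocks means for addressing, here is a minimal sketch. `block_size`, `BlockRange` and `blocks_for` are hypothetical names for this example, not identifiers from the Realm codebase; the only thing taken from the description above is the fixed 4096-byte block.

```cpp
#include <cstdint>

// Assumption: a fixed 4 KiB encryption block, independent of the OS page size.
constexpr std::uint64_t block_size = 4096;

struct BlockRange {
    std::uint64_t first; // index of the first block touched
    std::uint64_t last;  // index of the last block touched (inclusive)
};

// Hypothetical helper: which blocks does the byte range [pos, pos + size) touch?
// Requires size > 0.
constexpr BlockRange blocks_for(std::uint64_t pos, std::uint64_t size)
{
    return BlockRange{pos / block_size, (pos + size - 1) / block_size};
}
```

With a fixed block size the same arithmetic applies on every platform, whatever the OS page size happens to be.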

Our use of off_t meant that on Windows we didn't support >2GB files because off_t is 32-bit even on x64 Windows. The encryption layer now theoretically supports files up to 8 TB on 32-bit (which isn't relevant because SlabAlloc doesn't).
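
A small sketch of why a 32-bit signed offset type caps file positions just below 2 GiB. `fits_in_offset` is a hypothetical helper written for this example, and `std::int32_t` merely stands in for the 32-bit `off_t` described above.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: can a file position be represented in the given signed offset type?
template <typename Offset>
constexpr bool fits_in_offset(std::uint64_t pos)
{
    return pos <= static_cast<std::uint64_t>(std::numeric_limits<Offset>::max());
}

constexpr std::uint64_t GiB = std::uint64_t(1) << 30;

// std::int32_t stands in for a 32-bit off_t: positions at or past 2 GiB do not fit.
static_assert(fits_in_offset<std::int32_t>(2 * GiB - 1), "last byte below 2 GiB fits");
static_assert(!fits_in_offset<std::int32_t>(2 * GiB), "2 GiB does not fit in 32 bits");
// A 64-bit offset type has no such problem.
static_assert(fits_in_offset<std::int64_t>(3 * GiB), "3 GiB fits in 64 bits");
```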

This makes it so that the multiprocess encryption codepaths can be tested in a single process, and in fact UNITTEST_ENCRYPT_ALL=1 will incidentally test it in a bunch of places. This revealed a preexisting bug:

  1. Process 1 reads page X
  2. Process 2 writes to one byte range in page X
  3. Process 1 refreshes the reader mapping and marks the page as StaleIV
  4. Process 1 writes to a different byte range in page X
  5. This byte range is copied to the read mapping and the page is marked as UpToDate
  6. Process 1 reads from the byte range written by process 2 and gets garbage data

When copying data to a StaleIV page we need to copy the entire page rather than just the modified bytes. We can't just mark the page as Clean because while it's fine for the reader mapping to see the data on disk rather than the newly written data while the write is still in progress, we wouldn't know when to actually reread the page.
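
The six steps and the fix can be replayed in a toy model. This is only a sketch under stated assumptions: every name below (`PageState`, `ReaderPage`, `propagate_write`, `replay`) is hypothetical, encryption is left out entirely, and it assumes the writing mapping holds the current on-disk contents of the page before step 4.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Toy model, not Realm's implementation: every name here is hypothetical.
enum class PageState { UpToDate, StaleIV };

constexpr std::size_t page_size = 4096;
using Page = std::array<char, page_size>;

struct ReaderPage {
    Page data{};
    PageState state = PageState::UpToDate;
};

// Propagate a write of [offset, offset + len) from the writer's page to the reader's copy.
// copy_whole_page_if_stale selects between the buggy and the fixed behaviour.
inline void propagate_write(ReaderPage& reader, const Page& writer, std::size_t offset,
                            std::size_t len, bool copy_whole_page_if_stale)
{
    if (reader.state == PageState::StaleIV && copy_whole_page_if_stale) {
        // Fixed: bytes outside the written range may be stale too, so take the whole page.
        reader.data = writer;
    }
    else {
        // Buggy when the page is StaleIV: only the modified bytes are refreshed.
        std::memcpy(reader.data.data() + offset, writer.data() + offset, len);
    }
    reader.state = PageState::UpToDate;
}

// Replays steps 1-6 and returns the byte process 1 reads from process 2's range.
inline char replay(bool fixed)
{
    Page disk{};                       // page X as stored on disk, all zero
    ReaderPage reader;
    reader.data = disk;                // 1. process 1 reads page X

    disk[0] = 'B';                     // 2. process 2 writes byte 0
    reader.state = PageState::StaleIV; // 3. the refresh marks the page as StaleIV

    Page writer = disk;                // assumption: the write mapping sees current data
    writer[100] = 'A';                 // 4. process 1 writes a different byte
    propagate_write(reader, writer, 100, 1, fixed); // 5. copy and mark UpToDate
    return reader.data[0];             // 6. process 1 reads process 2's byte
}
```

In this model `replay(false)` returns the old zero byte rather than `'B'`, while `replay(true)` returns `'B'`. The real bug yields garbage rather than a tidy stale value, but the cause is the same: the page is marked UpToDate while part of it was never refreshed.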

We no longer use file seeking anywhere and instead pass explicit position offsets to every read and write. File seeking is spooky when multiple threads are involved, since the file position is shared state, and it involved a lot of extra syscalls.

IV refreshing now involves fewer read() calls. I don't think this is actually a meaningful perf gain since they would all have been warm cache hits anyway. Might be faster, though.

The global mapping_mutex is gone and encryption operations on two different DBs can now happen concurrently.
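
The difference between one global mutex and per-file locking can be sketched as follows. `EncryptedFile` is a hypothetical stand-in for one file's mapping state, not a Realm class, and the byte vector only takes the place of real mapping work.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the encryption state of one file.
class EncryptedFile {
public:
    void write_byte(std::size_t pos, char value)
    {
        // The lock belongs to this object, so it only serialises work on this file.
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_data.size() <= pos)
            m_data.resize(pos + 1);
        m_data[pos] = value;
    }

    char read_byte(std::size_t pos)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return pos < m_data.size() ? m_data[pos] : '\0';
    }

private:
    std::mutex m_mutex; // one mutex per file instead of a single process-wide one
    std::vector<char> m_data;
};
```

Two threads operating on two different `EncryptedFile` objects never contend for the same mutex, whereas a single global mutex would make them take turns.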

The error messages when decryption fails now include a little more information.

Fixes #7743. Fixes #7744.

@tgoyne tgoyne added the no-jira-ticket Skip checking the PR title for Jira reference label May 15, 2024
@tgoyne tgoyne self-assigned this May 15, 2024
@cla-bot cla-bot bot added the cla: yes label May 15, 2024
@finnschiermer
Contributor

I applaud the wisdom found here :-)

@tgoyne tgoyne force-pushed the tg/file-map-cache branch 9 times, most recently from 8f9e7f4 to aa3d9c0 Compare May 21, 2024 01:08
coveralls-official bot commented May 21, 2024

Pull Request Test Coverage Report for Build thomas.goyne_396

Details

  • 1048 of 1098 (95.45%) changed or added relevant lines in 27 files are covered.
  • 132 unchanged lines in 20 files lost coverage.
  • Overall coverage increased (+0.1%) to 90.956%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/realm/alloc_slab.cpp | 20 | 21 | 95.24% |
| src/realm/db.cpp | 17 | 18 | 94.44% |
| test/test_json.cpp | 0 | 1 | 0.0% |
| test/test_encrypted_file_mapping.cpp | 347 | 349 | 99.43% |
| test/test_file.cpp | 99 | 101 | 98.02% |
| src/realm/util/file_mapper.cpp | 35 | 38 | 92.11% |
| test/util/test_path.hpp | 0 | 3 | 0.0% |
| src/realm/util/encrypted_file_mapping.hpp | 5 | 11 | 45.45% |
| src/realm/util/file.cpp | 98 | 109 | 89.91% |
| src/realm/util/encrypted_file_mapping.cpp | 359 | 379 | 94.72% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/realm/sync/instructions.hpp | 1 | 76.03% |
| src/realm/util/encrypted_file_mapping.hpp | 1 | 41.27% |
| src/realm/util/serializer.cpp | 1 | 90.43% |
| src/realm/uuid.cpp | 1 | 98.48% |
| test/test_all.cpp | 1 | 76.47% |
| test/test_dictionary.cpp | 1 | 99.83% |
| src/realm/alloc_slab.cpp | 2 | 90.56% |
| src/realm/object-store/shared_realm.cpp | 2 | 91.89% |
| test/test_file.cpp | 2 | 97.4% |
| test/test_lang_bind_helper.cpp | 2 | 93.2% |
Totals Coverage Status
Change from base Build 2388: 0.1%
Covered Lines: 214536
Relevant Lines: 235869

💛 - Coveralls

@tgoyne tgoyne force-pushed the tg/file-map-cache branch 15 times, most recently from 4e6e8b1 to c468758 Compare May 24, 2024 23:08
@tgoyne tgoyne force-pushed the tg/file-map-cache branch 2 times, most recently from 80c8d66 to e2e78b2 Compare May 25, 2024 01:46
@tgoyne tgoyne marked this pull request as ready for review May 25, 2024 03:41
Comment on lines -609 to +687
flush();
sync();
do_flush();
Contributor

What is the reasoning for removing the call to sync? Is it because we can rely on the IV un-bumping strategy for consistency?

Member Author

Removing a map is the wrong granularity for syncing. Either we need to be syncing between the IV write and the data write for every page, or we need to be syncing once (or twice) per transaction as part of committing. This was making us sync at fairly random times in the middle of the commit which weren't connected to anything logical, and is why some of the tests had to do a lot less work on Windows to not be unreasonably slow (FlushFileBuffers() is closer to F_FULLFSYNC than fsync()).

static void memcpy_if_changed(void* dst, const void* src, size_t n)
{
#if REALM_SANITIZE_THREAD
// Because our copying is page-level granularity, we have some benign races
Contributor

Can we remove the suppression for this case from test/tsan.suppress?

Member Author

Looks like we can.

src/realm/util/encrypted_file_mapping.hpp (outdated, resolved)
@@ -1649,23 +1488,6 @@ FileDesc File::dup_file_desc(FileDesc fd)
return fd_duped;
}

File::UniqueID File::get_unique_id()
Contributor

Great to finally be rid of this 💯

test/test_file.cpp (outdated, resolved)
test/test_lang_bind_helper.cpp (outdated, resolved)
src/realm/util/overload.hpp (outdated, resolved)
if (n == 0)
break;
used_size += n;
}
return std::string(buffer.data(), used_size); // Throws
}


std::string util::load_file_and_chomp(const std::string& path)
Contributor

Could you move the simple deletions/cleanups like this one into a separate PR? A lot of the file/mapping stuff is probably too interconnected to be worth extracting, but it would be nice to reduce the number of non-encryption-related cleanups here. (These are all great though!)

Member Author

There's a few steps in between, but this is actually connected to the encryption changes. Making File::seek() work for encrypted files was previously done with a global mutex which I wanted to kill. Rather than trying to figure out some locking scheme that would make seeking work, I updated all of our uses of File to not rely on seeking and instead issue atomic read/write calls at specific offsets. For each thing I had to update for this, I first checked if it was actually still used and just deleted it if not.

Ideally the File changes would all be a separate commit that goes before the encryption changes but they are pretty entangled, largely because of the map_flags thing that was passed around through every layer without ever being used for anything.

@tgoyne tgoyne changed the title Clean up a bunch of old encryption cruft RCORE-2141 Clean up a bunch of old encryption cruft May 28, 2024
Contributor

@ironage ironage left a comment

Nice improvements across the board 👍 I'm glad to see that the encryption code has been simplified as well. I gave this a fairly detailed review, but given the large scope of changes it may be best to get @finnschiermer to review as well in case I missed something.

@tgoyne tgoyne changed the title RCORE-2141 Clean up a bunch of old encryption cruft RCORE-2141 RCORE-2142 Clean up a bunch of old encryption cruft May 29, 2024
@tgoyne
Member Author

tgoyne commented May 29, 2024

I did a bit of benchmarking of this and concluded that while there's some things that are faster, it was really hard to actually hit the cases where you could hit the performance pitfalls of the old code, which is a good thing I guess. As a result the only real performance change from this is that operations on unrelated encrypted files no longer sometimes block each other.

Contributor

@finnschiermer finnschiermer left a comment

Very, very nice.

@tgoyne tgoyne merged commit 42e4a85 into master Jun 6, 2024
39 checks passed
@tgoyne tgoyne deleted the tg/file-map-cache branch June 6, 2024 17:02
Successfully merging this pull request may close these issues.

Realm files are limited to 2GB on x64 Windows
Multiprocess encryption can sometimes read stale values
3 participants