Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docs: Explain ResetRemove #124

Open
Kiiyya opened this issue Aug 29, 2021 · 3 comments
Open

Docs: Explain ResetRemove #124

Kiiyya opened this issue Aug 29, 2021 · 3 comments

Comments

@Kiiyya
Copy link

Kiiyya commented Aug 29, 2021

I have read most of the A comprehensive study of Convergent and Commutative Replicated Data Types paper, but as far as I can tell, it contains no notion similar to ResetRemove, and I can't figure out what purpose that trait serves.

The following explanation can be found in the docs for Map, but not for ResetRemove:

Reset-remove means that if one replica removes an entry while another actor concurrently edits that entry, once we sync these two maps, we will see that the entry is still in the map but all edits seen by the removing actor will be gone.

  1. For some reason, only CRDTs which fulfill ResetRemove are a Val and can be used in Map. Why?
  2. The above quote should be in ResetRemove probably?
  3. I don't understand why the property as described in the quote makes sense. There is no info on it in the paper either. How does that help maintain the CRDT properties? I am confused.
@davidrusu
Copy link
Member

davidrusu commented Aug 31, 2021

Reset-remove semantics was introduced with the Map CRDT. Map is a re-implementation of the Riak riak_dt_map CRDT that you can read more about here: https://github.com/basho/riak_dt/blob/develop/src/riak_dt_map.erl#L53

But more generally, It's what you end up with when you extend the ideas from the ORSWOT CRDT to model a key-value CRDT where the value is also a CRDT.

Reset-remove comes in when you need to deal with the case where a key is remove while concurrently the value associated with that key is updated by another replica.

Reset-remove semantics says that when those two states merge, we should see that the removed key now exists and that it's value is only the update that the other replica had made, all other information in that value CRDT has been removed. It's as if the second replica was operating on the empty CRDT.

Concretely given an initial common state of a nested Map CRDT: { a: { b: 1 } }

Replica 1 executes REMOVE(a) -> { }
Replica 2 executes UPDATE(a/c, 2) -> { a: { b: 1, c: 2 } }

Replica 1 and 2 merge -> { a: { c: 2 } }

All information seen by replica 1 is removed and it's as if Replica 2 was operating on the reset value { a: {} }.

Hopefully that's makes things clearer, let me know if not.

You're right that the ResetRemove trait is only used by the Map CRDT and that the map values must implement this trait. I was hoping we'd discover more CRDT's that would require this trait.. and we probably will find some, I just haven't had the time to explore that angle. I would wager any other CRDT that supports CRDT nesting will also require ResetRemove on it's nested CRDT's (i.e. consider an LSeq that supports holding CRDT's as elements, this would likely follow reset-remove semantics)

The above quote should be in ResetRemove probably?

This explanation of reset-remove is dealing specifically with the Map CRDT (It's talking about map entries and such) so I think we keep it in map.rs for now. I'll see about a more general definition for the ResetRemove trait.

@Kiiyya
Copy link
Author

Kiiyya commented Aug 31, 2021

Reset-remove semantics says that when those two states merge, we should see that the removed key now exists and that it's value is only the update that the other replica had made, all other information in that value CRDT has been removed. It's as if the second replica was operating on the empty CRDT.

Ahhhh, this is the key idea behind it then. Yeah that makes sense. Thanks for the explanation. Initially I thought ResetRemove was a way to garbage collect tombstones.

You could probably copy most of your comment into the module-level docs in the map module, at least until more nested CRDTs pop up. I'd also add a link at the doc for ResetRemove to look at the module-level docs for map. Or should I make a PR with that?

@davidrusu
Copy link
Member

A PR would be very welcome

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants