Abstract
Replicated data types (RDTs) are specialized data structures that allow for concurrent modification of multiple replicas, even when they are geographically dispersed, without requiring coordination between them. However, constructing efficient and correct RDTs is challenging due to the complexity involved in reasoning about independently evolving states of replicas and resolving conflicts between them.
In this talk, I will introduce Mergeable Replicated Data Types (MRDTs), a practical approach to constructing and verifying efficient, correct RDTs. MRDTs build on the model of a distributed version control system like Git, but extend it from files to arbitrary data types. The key idea is to make sequential data types suitable for distribution by equipping them with a three-way merge function that reconciles two concurrently modified versions against their common ancestor. I will show how this merge function captures the complexities of distribution, simplifying both implementation and verification. I will also discuss the critical role played by persistent data structures in MRDTs, as well as the inherent trade-off between persistence and efficiency in distributed data stores.
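To make the three-way merge idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions rather than the implementation presented in the talk: the function names `merge_counter` and `merge_set` are hypothetical, and the merge rules shown (sum both replicas' deltas for a counter; keep what both replicas kept and add what either replica added for a set) are one reasonable choice of semantics. Each function takes the lowest common ancestor `lca` and the two diverged versions `a` and `b`.

```python
from typing import FrozenSet, TypeVar

T = TypeVar("T")


def merge_counter(lca: int, a: int, b: int) -> int:
    """Three-way merge for an integer counter (hypothetical helper).

    Each replica's change is its delta from the common ancestor;
    the merged value applies both deltas to the ancestor.
    """
    return lca + (a - lca) + (b - lca)


def merge_set(lca: FrozenSet[T], a: FrozenSet[T], b: FrozenSet[T]) -> FrozenSet[T]:
    """Three-way merge for a set (hypothetical helper).

    Keeps elements of the ancestor that neither replica removed,
    plus elements newly added by either replica.
    """
    kept_by_both = lca & a & b
    added_by_a = a - lca
    added_by_b = b - lca
    return kept_by_both | added_by_a | added_by_b


if __name__ == "__main__":
    # Ancestor is 5; one replica added 2, the other added 3.
    print(merge_counter(5, 7, 8))

    # Replica a removed 3 and added 4; replica b removed 1 and added 5.
    ancestor = frozenset({1, 2, 3})
    print(sorted(merge_set(ancestor, frozenset({1, 2, 4}), frozenset({2, 3, 5}))))
```

Note that the data types themselves stay purely sequential: an ordinary integer and an ordinary set. All reasoning about concurrency is confined to the merge functions, which is the separation the approach relies on.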