Distributed, fault-tolerant in-place consensus sequence on innovative hardware as building block for data management.
Subject Area
Security and Dependability, Operating-, Communication- and Distributed Systems
Term
from 2017 to 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 361478098
Quorum consensus algorithms such as Paxos are widely used as basic building blocks for fault tolerance in distributed systems. Unfortunately, distributed quorum consensus incurs considerable overhead to negotiate and safely store each decision. We therefore plan to optimize Paxos-based fault tolerance for sequences of consensus decisions in three ways:
1. exploit the multicast and reduce operations of modern interconnects to reduce latency and the number of messages,
2. use remote direct memory access (RDMA) in combination with NVRAM to manage a distributed shared state,
3. modify Paxos to support a sequence of consensus decisions in-place and avoid separate memory resources for each Paxos instance.
We will then build efficient custom data types on top of consensus sequences that support partial updates, multiple-reader-single-writer locks, or compare-and-swap semantics.
The resulting distributed fault-tolerant consensus will provide low-latency, high-throughput decisions. It will make recoverable distributed consensus applicable in new scenarios where it was previously avoided because of its high latency. The optimized consensus can be used as a building block in current and future distributed data management and database systems, including those developed in the SPP, which often rely on a sequence of decisions to process locks and transactions, to make atomic changes such as compare-and-swap, to support replicated state machines, or to elect the next master.
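For orientation, the following is a minimal in-memory sketch of classical single-decree Paxos, the baseline that the project sets out to optimize. It is not the project's in-place variant and involves no RDMA, NVRAM, or network communication. The names `Acceptor` and `propose` are hypothetical and chosen only for illustration, and all acceptors are assumed to be reachable.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class Acceptor:
    """State of one acceptor for ONE consensus instance (illustrative only)."""
    promised: int = -1         # highest round this acceptor has promised
    accepted_round: int = -1   # round in which a value was last accepted
    accepted_value: Any = None

    def prepare(self, rnd: int) -> Optional[Tuple[int, Any]]:
        # Phase 1b: promise to ignore lower rounds and report any accepted value.
        if rnd > self.promised:
            self.promised = rnd
            return (self.accepted_round, self.accepted_value)
        return None

    def accept(self, rnd: int, value: Any) -> bool:
        # Phase 2b: accept unless a higher round has been promised meanwhile.
        if rnd >= self.promised:
            self.promised = rnd
            self.accepted_round = rnd
            self.accepted_value = value
            return True
        return False


def propose(acceptors: List[Acceptor], rnd: int, value: Any) -> Optional[Any]:
    """Run one proposer round; return the chosen value, or None on failure."""
    quorum = len(acceptors) // 2 + 1

    # Phase 1a: ask every acceptor for a promise.
    replies = (a.prepare(rnd) for a in acceptors)
    promises = [p for p in replies if p is not None]
    if len(promises) < quorum:
        return None

    # If any acceptor already accepted a value, adopt the one from the
    # highest round instead of the proposer's own value.
    prev_round, prev_value = max(promises, key=lambda p: p[0])
    if prev_round >= 0:
        value = prev_value

    # Phase 2a: ask every acceptor to accept the value.
    acks = sum(a.accept(rnd, value) for a in acceptors)
    return value if acks >= quorum else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, rnd=1, value="A")   # decides "A"
second = propose(acceptors, rnd=2, value="B")  # must adopt the decided "A"
stale = propose(acceptors, rnd=1, value="C")   # round too low, rejected
```

In this sketch a later proposal with a different value still yields the first decision, because one instance decides exactly one value. A sequence of decisions therefore conventionally needs separate acceptor state for each instance, which is the per-instance memory cost that item 3 above aims to avoid.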
DFG Programme
Priority Programmes