Consensus Protocols Exploiting Network Programmability

Staff - Faculty of Informatics

Date: / -

USI Lugano Campus, room SI-003, Informatics building (Via G. Buffi 13)

You are cordially invited to attend the PhD Dissertation Defense of Huynh Tu Dang on Thursday January 31st, 2019 at 14:30 in room SI-003 (Informatics building).

Services rely on replication mechanisms to be available at all time. The service demanding high availability is replicated on a set of machines called replicas. To maintain the consistency of replicas, a consensus protocol such as Paxos or Raft is used to synchronize the replicas' state. As a result, failures of a minority of replicas will not affect the service as other non-faulty replicas continue serving requests. A consensus protocol is a procedure to achieve an agreement among processors in a distributed system involving unreliable processors. Unfortunately, achieving such an agreement involves extra processing on every request, imposing a substantial performance degradation. Consequently, performance has long been a concern for consensus protocols. Although many efforts have been made to improve consensus performance, it continues to be an important problem for researchers. This dissertation presents a novel approach to improving consensus performance. Essentially, it exploits the programmability of a new breed of network devices to accelerate consensus protocols that traditionally run on commodity servers. The benefits of using programmable network devices to run consensus protocols are twofold: The network switches process packets faster than commodity servers and consensus messages travel fewer hops in the network. It means that the system throughput is increased and the latency of requests is reduced. The evaluation of our network-accelerated consensus approach shows promising results. Individual components of our FPGA-based and switch-based consensus implementations can process 10 million and 2.5 billion consensus messages per second, respectively. Our FPGA-based system as a whole delivers 4.3 times performance of a traditional software consensus implementation. The latency is also better for our system and is only one third of the latency of the software consensus implementation when both systems are under half of their maximum throughputs. In order to drive even higher performance, we apply a partition mechanism to our switch-based system, leading to 11 times better throughput and 5 times better latency. By dynamically switching between software-based and network-based implementations, our consensus systems not only improve performance but also use energy more efficiently. Encouraging by those benefits, we developed a fault-tolerant non-volatile memory system. A prototype using software memory controller demonstrated reasonable overhead over local memory access, showing great promise as scalable main memory. Our network-based consensus approach would have a great impact in data centers. It not only improves performance of replication mechanisms which relied on consensus but also enhances performance of services built on top of those replication mechanisms. Our approach also motivates the others to move new functionalities into the network, such as key-value store and stream processing. We expect that in the near future, applications that typically run on traditional servers will be folded into networks for performance.

Dissertation Committee:

  • Prof. Robert Soulé, Università della Svizzera italiana, Switzerland (Research Advisor)
  • Prof. Fernando Pedone, Università della Svizzera italiana, Switzerland (Research co-Advisor
  • Prof. Antonio Carzaniga, Università della Svizzera italiana, Switzerland (Internal Member)
  • Prof. Patrick Eugster, Università della Svizzera italiana, Switzerland (Internal Member)
  • Prof. Marco Canini, KAUST, Saudi Arabia (External Member)