Byzantine Fault Tolerant Storage for the Cloud

Staff - Faculty of Informatics

Start date: 12 November 2014

End date: 13 November 2014

You are cordially invited to attend the PhD Dissertation Defense of Ricardo PADILHA on Wednesday, November 12th 2014 at 10h30 in room 351 (Main building)

Abstract:
Cloud computing has changed the way distributed systems are designed, built, and maintained. Economies of scale made large deployments viable, mainly through hardware commoditization. Hardware became cheaper, but not more reliable. In the current cloud environment, more machines mean more failures. A broad failure model is required to cope with these failures, the Byzantine failure model. Unfortunately, Byzantine fault-tolerance (BFT) has been known traditionally for being non-scalable, unaware of confidentiality issues, and prone to contention. In this thesis, we address the issues of confidentiality, scalability, and contention in BFT distributed systems.

BFT, in its original description, sought to detect and tolerate faulty data coming from replicas. It did not foresee a situation where faulty replicas would be exploited to leak information. We address the issue of confidentiality by applying secret sharing techniques, making it impossible for any faulty replica to leak information since no replica contains a complete copy of the data.

From an pragmatic perspective, BFT has become a synonym for state-machine replication (SMR). It is well-known that SMR is inherently not scalable in its simplest approaches. To make BFT scalable, we break the system in several self-contained, independent BFT partitions. The safety, liveness, and strong serializability properties are maintained by a BFT multi-partition transaction execution protocol.

We generalize our BFT multi-partition transaction execution protocol to support multi-round execution. Instead of a single round to execute and vote on the outcome of a transaction, replicas are able to exchange data between rounds. This data exchange mechanism is exploited to manage the problem of contention. Replicas establish a partial order on conflicting transactions with the help of timestamps. A sophisticated locking mechanism allows conflicting transactions to be re-ordered in case of timestamp change, thus effectively eliminating aborts caused by contention.

We extensively evaluated these contributions with fully functional prototypes. The applications implemented include (in no particular order): micro-benchmarks, the TPC-B benchmark, an NFS v2 client and server, a transactional key-value store, an SQL engine for the Apache Derby RDBMS, a Twitter-like social network service, and an Apache Kafka-like message queue system.

Dissertation Committee:

  • Prof. Fernando Pedone, Università della Svizzera italiana, Switzerland (Research Advisor)
  • Prof. Antonio Carzaniga, Università della Svizzera italiana, Switzerland (Internal Member)
  • Prof. Miroslaw Malek, Università della Svizzera italiana, Switzerland (Internal Member)
  • Prof. Christian Cachin, IBM Zurich Research Laboratory, Switzerland (External Member)
  • Prof. Fabian Kuhn, Albert-Ludwigs-Universität, Germany (External Member)
  • Prof. Nuno Neves, Universidade de Lisboa, Portugal (External Member)