Size-aware Sharding For Improving Tail Latencies in In-memory Key-value Stores


Date: 26 October 2018 / 13:30 - 14:30

USI Lugano Campus, room A-24, Red building (Via G. Buffi 13)

Speaker:

Diego Didona

 

École Polytechnique Fédérale de Lausanne (EPFL), Switzerland

 

 

Abstract:

In this talk we introduce size-aware sharding, a technique for improving tail latencies in in-memory key-value stores, and describe its implementation in the Minos key-value store. Size-aware sharding distributes requests for keys to cores according to the size of the item associated with each key. In particular, requests for small and large items are sent to disjoint subsets of cores. This improves tail latencies by preventing a request for a small item from being queued behind a request for a large item. The challenge in implementing size-aware sharding is to maintain high throughput by minimizing the cost of software dispatching and by balancing load across cores. To achieve high throughput, Minos uses hardware dispatch for all requests for small items, which form the vast majority of all requests. It balances load by adapting the number of cores that handle requests for small and for large items to their relative presence in the workload. We compare Minos to three state-of-the-art designs of in-memory key-value stores. Compared to its closest competitor, Minos achieves a 99th percentile latency that is up to two orders of magnitude lower. Put differently, for a target 99th percentile latency equal to 10 times the mean service time, Minos achieves a throughput that is up to 7.4 times higher.
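
The dispatching idea described in the abstract can be illustrated with a minimal Python sketch. This is not the Minos implementation: the function names, the size threshold, the use of CRC32 to pick a core within a pool, and the rule that sizes the large-item pool in proportion to an assumed load fraction are all hypothetical choices made for illustration. The sketch also does everything in software, whereas the abstract states that Minos dispatches requests for small items in hardware.

```python
from zlib import crc32


def split_cores(num_cores, large_load_fraction):
    """Partition core ids into disjoint pools for small and large items.

    Assumption: the large-item pool gets a share of the cores roughly
    proportional to large_load_fraction (a value in [0, 1]), and each
    pool always keeps at least one core. Requires num_cores >= 2.
    """
    n_large = round(num_cores * large_load_fraction)
    n_large = max(1, min(num_cores - 1, n_large))
    small_cores = list(range(num_cores - n_large))
    large_cores = list(range(num_cores - n_large, num_cores))
    return small_cores, large_cores


def pick_core(key, item_size, threshold, small_cores, large_cores):
    """Choose a core for a request based on the size of the requested item.

    Assumption: items of at most `threshold` bytes count as small. Within
    a pool, the core is picked by hashing the key (bytes) with CRC32.
    """
    pool = small_cores if item_size <= threshold else large_cores
    return pool[crc32(key) % len(pool)]


# Hypothetical setup: 8 cores, a quarter of the load caused by large items.
small_cores, large_cores = split_cores(8, 0.25)
core_for_small = pick_core(b"user:42", 100, 1024, small_cores, large_cores)
core_for_large = pick_core(b"blob:7", 1 << 20, 1024, small_cores, large_cores)
```

With these illustrative numbers, six cores serve small items and two serve large items, so a request for a small item never waits behind a request for a large one on the same core. Calling split_cores again with an updated load fraction would model adapting the pool sizes as the workload changes.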

 

 

Biography:

Diego Didona received an MS degree in computer engineering from Sapienza Università di Roma in 2010 and a PhD degree in computer engineering from Instituto Superior Técnico, Universidade de Lisboa, in 2015. He is a postdoctoral researcher at EPFL, where he works on data center job scheduling, key-value stores, consistency in geo-replicated data platforms, and performance modeling applied to self-tuning systems.

 

 

Host:

Prof. Fernando Pedone