Analyzing System Performance with Probabilistic Performance Annotations

Dean's Office - Faculty of Informatics

Date: 9 December 2020 / 17:00 - 19:00

Online

You are cordially invited to attend the PhD Dissertation Defence of Daniele Rogora on Wednesday 9 December 2020 at 17:00 on MS Teams.

Abstract:
Understanding the performance of software is complicated. For several performance metrics, in addition to the algorithmic complexity, one must also consider the dynamics of running a program within different combinations of hardware and software environments. Such dynamic aspects are not visible from the code alone. Moreover, software systems themselves continue to grow in size and complexity, and although modularity works well for functionality, it is less helpful for performance. In fact, while functional behaviors and problems can be reasonably isolated, performance problems are often interaction problems, and they are pervasive. We introduce and develop the concept of probabilistic performance annotations to understand, debug, and predict the performance of software. A performance annotation describes the expected performance of a software component as a function of features of the input and features of the system on which the software is running. In particular, performance annotations use concrete metrics, such as running time, and concrete features, such as the real variables defined in the source code of the program. Performance annotations are easy to read and understand for the developer or performance analyst, and can also be used by automated tools, for example for performance regression testing. We also introduce Freud, a tool that creates performance annotations automatically for real-world C/C++ software. Freud uses dynamic analysis: it instruments a binary program written in C/C++ and collects information about performance metrics and features from the running program. Freud then statistically analyzes this information to derive probabilistic performance annotations. In particular, Freud computes regressions and clusters to create regression trees and mixture models that describe complex, multi-modal performance behaviors.
We illustrate our approach to performance analysis, and in particular the use of Freud, on three complex systems (the ownCloud distributed storage service, the MySQL database system, and the x264 video encoder library and application), producing non-trivial characterizations of their performance.
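
To make the idea of an annotation "as a function of features" concrete, the following is a minimal sketch, not Freud's implementation: it fits a single least-squares line relating running time to one input feature and reports how well the line explains the measurements. The function name `fit_linear_annotation` and the sample measurements are hypothetical, chosen only for illustration; the abstract describes richer models (regression trees and mixture models) that this sketch does not attempt.

```python
from statistics import mean


def fit_linear_annotation(samples):
    """Least-squares fit of time ~ intercept + slope * feature.

    `samples` is a list of (feature_value, running_time) pairs.
    Returns (intercept, slope, r_squared).
    """
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    x_bar, y_bar = mean(xs), mean(ys)

    # Closed-form simple linear regression.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # R^2: fraction of the variance in running time explained by the fit.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Hypothetical measurements: (input size n, running time in ms).
samples = [(100, 2.1), (200, 4.0), (400, 8.2), (800, 15.9), (1600, 32.1)]
intercept, slope, r2 = fit_linear_annotation(samples)
print(f"time_ms(n) ~ {intercept:.3f} + {slope:.5f} * n  (R^2 = {r2:.4f})")
```

A fitted expression of this kind can be read directly by a developer, and an automated check could compare new measurements against it, which is the sort of use in performance regression testing that the abstract mentions.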

Dissertation Committee:

  • Prof. Antonio Carzaniga, Università della Svizzera italiana, Switzerland (Research Advisor)
  • Prof. Robert Soulé, Università della Svizzera italiana / Yale University, Switzerland/USA (Research co-Advisor)
  • Prof. Matthias Hauswirth, Università della Svizzera italiana, Switzerland (Internal Member)
  • Prof. Fernando Pedone, Università della Svizzera italiana, Switzerland (Internal Member)
  • Prof. Amer Diwan, Google, USA (External Member)
  • Prof. Timothy Roscoe, ETH Zurich, Switzerland (External Member)