Self-healing Cloud Applications

Dean's Office - Faculty of Informatics

Date: 24 January 2020 / 14:30 - 16:00

USI Lugano Campus, room SI-003, Informatics building (Via G. Buffi 13)

You are cordially invited to attend the PhD Dissertation Defense of Rui Xin on Friday, January 24th, 2020, at 14:30 in room SI-003 (Informatics building).

Abstract:
Cloud computing enables ubiquitous, convenient, on-demand network access to configurable computing resources, and provides resource pooling, elasticity, and on-demand service. Cloud systems are composed of many distributed machines, feature multi-layer architectures, and offer different types of services. Cloud computing reduces costs and improves resource utilization, but it also introduces considerable complexity and dynamism that challenge the reliability of the system. These challenges motivate a new holistic self-healing approach, which must be accurate, lightweight, and proactive, to ensure reliable cloud applications. Self-healing techniques work at runtime, and thus offer automatic and flexible ways to increase reliability by detecting errors, diagnosing them, and either fixing them or mitigating their effects. Self-healing systems exploit the time between the activation of a fault and the resulting failure to take actions that avoid the failure: they shall predict failures, localize the faults, and fix or mask them before the failure occurs. In my Ph.D., I focused on predicting failures and localizing faults. In this thesis I present an approach, \projectNoSpace, that predicts failures by detecting anomalous system states early enough to diagnose the causing errors and fix them before the failure occurs, and localizes faults by leveraging the collected data to pinpoint the location of the error and possibly the type of the fault.

The contributions of my Ph.D. work include:
- an approach to accurately predict failures and localize faults that requires training with fault seeding
- an approach to predict failures and localize faults that requires training with data from normal execution only
- a prototype implementation of the two approaches
- a set of experimental results that evaluate the proposed approaches of \projectNoSpace

Dissertation Committee:
- Prof. Mauro Pezzè, Università della Svizzera italiana, Switzerland (Research Advisor)
- Prof. Walter Binder, Università della Svizzera italiana, Switzerland (Internal Member)
- Prof. Cesare Pautasso, Università della Svizzera italiana, Switzerland (Internal Member)
- Prof. Rogério de Lemos, University of Kent, UK (External Member)
- Prof. Holger Giese, Hasso-Plattner Institute, Germany (External Member)