Topological Data Analysis: an introduction with examples and applications
Staff - Faculty of Informatics
Date: 8 July 2024 / 10:30 - 11:30
USI East Campus, Room D0.02
Speaker: Matteo Pegoraro, Politecnico di Milano and Università di Bologna
Abstract:
In this talk I will introduce some techniques from Topological Data Analysis (TDA), exploring the advantages and disadvantages of the presented approaches through examples and applications that I developed in my own research. TDA is a relatively new field of research that is successfully establishing itself as a viable tool in a range of data analysis scenarios, particularly those requiring consideration of invariance properties or, equivalently, shape-related information. We will see applications involving three kinds of data, with increasing levels of complexity and novelty:
- After covering the basics of the most widely used TDA pipelines, we will consider a data set of aorta meshes from different patients and see how topological descriptors effectively encode significant shape information about the meshes.
- We will discuss a benchmark data set used to test and validate several Functional Data Analysis (FDA) techniques that address the problem of aligning functional data. We will show how TDA representations solve this problem and trivialize some steps of the analysis.
- Lastly, we consider a metastatic cancer case study, where high-throughput feature extraction methods (radiomics) are used to represent cancer lesions as high-dimensional embeddings. Considering each lesion as a point in the radiomic space, we will transform the resulting point clouds (one per patient) into hierarchical clustering dendrograms and stratify them with an ad hoc metric. We will show how the results correlate with disease-free survival information.
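
To give a concrete feel for the last item, here is a minimal sketch of turning a point cloud into the merge heights of a hierarchical clustering dendrogram. It is an illustration under stated assumptions, not the speaker's method: it assumes single-linkage clustering with Euclidean distance, uses made-up data, and does not attempt to reproduce the ad hoc metric mentioned in the abstract. The function name `single_linkage_heights` and the variables are hypothetical. It relies on the standard fact that single-linkage merge heights coincide with the edge lengths of a minimum spanning tree.

```python
import math
import random


def single_linkage_heights(points):
    """Return the sorted merge heights of the single-linkage dendrogram.

    The heights equal the edge lengths of a Euclidean minimum spanning
    tree, built here with Prim's algorithm.
    """
    n = len(points)
    if n < 2:
        return []
    # best[k]: distance from point k to the closest point already in the tree
    best = [math.dist(points[0], p) for p in points]
    in_tree = [False] * n
    in_tree[0] = True
    heights = []
    for _ in range(n - 1):
        # Pick the point outside the tree that is closest to it.
        j = min((k for k in range(n) if not in_tree[k]), key=best.__getitem__)
        heights.append(best[j])
        in_tree[j] = True
        # Update distances to the tree using the newly added point.
        for k in range(n):
            if not in_tree[k]:
                best[k] = min(best[k], math.dist(points[j], points[k]))
    return sorted(heights)


# Hypothetical data: one "patient" is a cloud of 6 lesions in a 3-D feature space.
random.seed(0)
patient_a = [[random.gauss(0, 1) for _ in range(3)] for _ in range(6)]
# The same cloud after a rigid shift of every coordinate.
patient_a_shifted = [[x + 10.0 for x in p] for p in patient_a]

heights_a = single_linkage_heights(patient_a)
heights_shifted = single_linkage_heights(patient_a_shifted)
print(heights_a)
print(heights_shifted)
```

The two printed lists agree up to floating-point rounding, a small instance of the invariance properties the abstract refers to: the dendrogram depends only on pairwise distances, not on where the cloud sits in the feature space.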
Biography: Matteo Pegoraro earned his bachelor’s and master’s degrees in Mathematics from the Università degli Studi di Milano, focusing on algebraic geometry and the geometry of Banach spaces. After a year as a Data Scientist, he won a double-degree PhD scholarship at Politecnico di Milano and Università di Bologna, supervised by Prof. Piercesare Secchi. His PhD research is centered on Merge Trees in Topological Data Analysis. In particular, he defined a novel metric for such trees with proven stability properties. He then obtained a postdoc position at Aalborg University with Prof. Lisbeth Fajstrup, working on an interdisciplinary project correlating geometric and topological features of glasses with their transportation properties. In this project, he established a max-flow notion compatible with the periodic boundary conditions in simulated materials science data. Recently, he visited Inria centers at Saclay and Sophia-Antipolis to further develop this work. Over the years, he has also developed an interest in Optimal Transport, aiming to define regression and dimensionality reduction techniques for data sets of finite measures/densities/histograms.