Improving variable selection properties with transfer learning and data integration

Faculty of Informatics - Study Secretariat

Date: 6 November 2025 / 12:15 - 14:45

USI East Campus, Room D5.01

Speaker: Dr. Paul Rognon-Vael, Bocconi University

Abstract: Sparse high-dimensional signal recovery is only possible under certain conditions on the number of parameters, sample size, signal strength and underlying sparsity. We study how these mathematical limits can be pushed by leveraging external information on the likelihood of variables to be associated with the outcome. As a particular case, we consider a flexible transfer learning framework where the set of variables selected in the source is transferred to ease variable selection in the target. We introduce a family of penalized linear regression methods, motivated by connections to Bayesian variable selection, where the penalties depend on the external information or the transferred set of variables. We show that they attain variable selection consistency where it would otherwise not be possible, or attain it at a faster rate. We first precisely quantify the gains that are achievable for ideal penalties set by an oracle. Subsequently, we propose computationally fast algorithms for the incorporation of external information and transfer learning that do not require an oracle and are inspired by empirical Bayes techniques. We prove that these algorithms recover most of the gains achievable by oracle penalties and are robust to negative transfer. We show that the proposed algorithms improve on standard variable selection methods on simulated and empirical data.

Biography: Paul Rognon-Vael is a postdoctoral researcher in Botond Szabo's research group at Bocconi University (Milan). He obtained his Ph.D. degree from Universitat Pompeu Fabra (UPF) and Universitat Politècnica de Catalunya (UPC), where he was advised by David Rossell and Piotr Zwiernik. Before his doctoral studies, he was a research intern in Marta Melé's lab at the Barcelona Supercomputing Center, working on transcriptomics. He also worked as a quantitative finance analyst covering many kinds of risk (climate, market, credit, liquidity, operational, fair lending) for large and midsize banks in the US.

Host: Prof. Deborah Sulem