Low-Rank Parametrized Transformers and Saddle-to-Saddle Dynamics
Faculty of Informatics - Study Secretariat
Date: 14 November 2025 / 12:30 - 13:30
USI East Campus, Room D5.01
Speaker: Dr. Katerina Papagiannouli
Abstract: In this talk, we will characterise the gradient flow dynamics of an almost-linear, low-rank parametrized transformer trained for regression in the limit of vanishing initialization. We explicitly characterise the saddles visited by the flow as well as the jump times, each of which corresponds to the activation of a new direction within the attention-weighted subspace. Starting from near-zero initialization, coordinates are activated one after another, revealing the well-known phenomenon of incremental learning. Our analysis relies on a reparameterization of the gradient flow that enables us to track the transitions between plateaus of increasing rank. (Joint work with Hana Tseran, RIKEN Institute, University of Tokyo.)
Biography: Katerina Papagiannouli is an assistant professor (Ricercatore a Tempo Determinato di Tipo A) in Statistics and Machine Learning at the Department of Mathematics, University of Pisa, and a visiting researcher at the Max Planck Institute for Mathematics in the Sciences (MPI MiS). Prior to this, she was a postdoctoral researcher in the Inference and Learning group at MPI MiS. She obtained her Ph.D. in Mathematical Statistics from Humboldt University of Berlin.
Host: Prof. Deborah Sulem