Designing modular and efficient neural networks with conditional computation

Dean's Office - Faculty of Informatics

Date: 12 October 2023 / 11:00 - 12:00

USI East Campus, room D0.03

This seminar takes place as part of the Advanced Topics in Machine Learning course.

Speaker: Prof. Simone Scardapane, Sapienza University, Rome

Abstract: As neural networks increase in size and complexity, it is of paramount importance to imagine novel strategies and architectural biases to improve their efficiency, interpretability, power usage, and accuracy. In this talk, we look at a sparse modularity principle: how to design networks that learn to flexibly activate or deactivate their components and computations to adapt to the user’s requirements. This requires techniques to take discrete decisions in a differentiable way, e.g., routing tokens in a mixture-of-experts, deactivating components with conditional computation techniques, or merging and reassembling separate components from different models. We will see common solutions to this problem (e.g., Gumbel-Softmax tricks), and how modularity allows us to design networks with novel characteristics, such as dynamic adaptability to energy constraints, specialization of components in a continual learning setting, and more. We will conclude with an overview of open challenges and issues.
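For readers unfamiliar with the Gumbel-Softmax trick mentioned in the abstract, the following is a minimal sketch of its forward pass and is not taken from the talk. It perturbs the routing logits with Gumbel noise and applies a temperature-scaled softmax, giving a relaxed, nearly one-hot choice among components. The function names gumbel_softmax and hard_route, the parameter tau, and the example logits are all hypothetical. Plain Python has no automatic differentiation, so the sketch only shows sampling; in an autodiff framework, gradients would flow through the softmax.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw a relaxed one-hot sample from the categorical given by `logits`.

    `tau` is the temperature: smaller values give samples closer to one-hot.
    `rng` is any object with a `random()` method (hypothetical parameter,
    added here so that sampling can be seeded).
    """
    noisy = []
    for logit in logits:
        # Uniform sample, clamped away from 0 and 1 to keep the logs finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise via the inverse CDF: g = -log(-log(u)).
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / tau)

    # Numerically stable softmax over the perturbed, scaled logits.
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def hard_route(soft):
    """Return the index of the selected component (argmax of the soft sample)."""
    return max(range(len(soft)), key=soft.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical routing logits for three experts.
    soft = gumbel_softmax([1.0, 2.0, 0.5], tau=0.5, rng=rng)
    print("soft routing weights:", soft)
    print("selected expert:", hard_route(soft))
```

In a straight-through variant, the hard index would be used in the forward pass while the soft weights supply the gradient; that part requires an autodiff library and is omitted here.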

Biography: Simone Scardapane is a tenure-track assistant professor at Sapienza University of Rome. His research is focused on graph neural networks, explainability, continual learning and, more recently, modular and efficient deep networks. He has published more than 100 papers on these topics in top-tier journals and conferences. Currently, he is an associate editor for the IEEE Transactions on Neural Networks and Learning Systems (IEEE), Neural Networks (Elsevier), Industrial Artificial Intelligence (Springer), and Cognitive Computation (Springer). He is a member of multiple groups and societies, including the ELLIS society, the IEEE Task Force on Reservoir Computing, the “Machine learning in geodesy” joint study group of the International Association of Geodesy, and the Statistical Pattern Recognition Techniques TC of the International Association for Pattern Recognition.

Host: Prof. Cesare Alippi