A Hybrid Approach to Visual Perceptual Intelligence

Dean's Office - Faculty of Informatics

Date: 6 June 2024 / 13:45 - 14:30

USI East Campus, Room C1.03

Speaker: Christos Sakaridis, ETH Zürich

There is solid empirical evidence that both semantic and geometric visual perceptual cues are crucial for executing actions in embodied intelligent systems. Solving visual perception is therefore a pressing need, as the large-scale deployment of such systems can greatly benefit humanity, for instance by eliminating the road traffic fatalities that occur worldwide every year, on the order of a million. However, the agnostic data-driven approach taken by the vast majority of recent works on visual perception, which attempts to learn the associated complex compositional input-to-output mappings in their entirety from individual observations and observation-level objectives, exhibits fundamental limitations in generalization and yields physically and geometrically infeasible outputs. By contrast, this talk will advocate for a hybrid data-driven approach to visual perceptual intelligence, which still embraces the well-proven representational power of generic parametric compositional mappings while simultaneously integrating fundamental principles of optics, geometry and mechanics into the learned models and data, thus injecting valid inductive biases and granting these models a broader capability for generalization. The merit of this hybrid approach will be supported empirically by showing how it underlies a diverse set of our exemplary works on visual perception, which cover a variety of semantic and geometric tasks on diverse types of visual data, as well as various instantiations of the prior-knowledge part of our hybrid framework in the form of architectural modules, representations, losses, and input data.

Dr. Christos Sakaridis is a lecturer at ETH Zürich and a senior postdoctoral researcher at the Computer Vision Lab of ETH. His broad research fields are computer vision and artificial intelligence. The focus of his research is on semantic and geometric visual perception, involving multiple domains, visual conditions, and modalities. Since 2021, he has been the Principal Engineer of TRACE-Zürich, a large-scale project on visual intelligence for autonomous cars and robots. He received the ETH Zürich Career Seed Award in 2022. He obtained his PhD from ETH Zürich in 2021, having worked in the Computer Vision Lab. Prior to that, he received his MSc in Computer Science from ETH Zürich in 2016 and his Diploma in Electrical and Computer Engineering from the National Technical University of Athens in 2014.

Host: Prof. Laura Pozzi