Seminars at the Faculty of Informatics

Talks@IDSIA: Dr Matthieu Geist - Kalman Temporal Differences

Dear Colleagues,

on Tuesday, 22nd of December, Dr Matthieu Geist will give us a talk titled: 

Kalman Temporal Differences

The place is: Sala Primavera, Galleria 2, 6928 Manno 


Generalization is an important problem in reinforcement learning (RL), and value function approximation (VFA) is a way to handle it. A value function approximator should exhibit several features: being sample efficient (data can be expensive, especially in an industrial context), handling nonlinearities (nonlinear parameterizations such as a multilayer perceptron, or the Bellman optimality equation), handling nonstationarities (the system itself may be nonstationary, but above all generalized policy iteration induces nonstationarities) and providing uncertainty information about its estimates (which should prove useful for the dilemma between exploration and exploitation). After a quick survey of VFA, it will be shown that casting value function approximation as a filtering problem yields a framework that handles all these problems at the same time.