Talks@IDSIA: Dr Matthieu Geist - Kalman Temporal Differences
Dalle Molle Institute for Artificial Intelligence
Start date: 22 December 2009
End date: 23 December 2009
Generalization is an important problem in reinforcement learning (RL), and value function approximation (VFA) is a way to handle it. A value function approximator should exhibit several features: being sample efficient (data can be expensive, especially in an industrial context), handling nonlinearities (nonlinear parameterizations such as the multilayer perceptron, the Bellman optimality equation), handling nonstationarities (nonstationary systems, but above all the nonstationarities induced by generalized policy iteration) and providing uncertainty information about its estimates (which should prove useful for the dilemma between exploration and exploitation). After a quick survey of VFA, it will be shown that casting value function approximation as a filtering problem allows introducing a framework which handles all these problems at the same time.
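To give a flavor of the filtering view described above, here is a minimal sketch of a Kalman-filter update for policy evaluation with a linear parameterization. The parameter vector is treated as a hidden state following a random walk, and each observed reward is related to the parameters through the temporal-difference observation model. The function name, feature vectors, and noise values are illustrative assumptions, not the speaker's actual implementation.

```python
import numpy as np

def ktd_update(theta, P, phi_s, phi_s_next, reward, gamma=0.95,
               process_noise=1e-3, obs_noise=1.0):
    """One Kalman-style temporal-difference update (linear features).

    theta: current parameter estimate of the value function V(s) = phi(s) @ theta
    P: parameter covariance (the uncertainty information mentioned in the abstract)
    """
    n = len(theta)
    # Prediction step: random-walk parameter evolution handles nonstationarity
    P = P + process_noise * np.eye(n)
    # Observation model: reward ~ (phi(s) - gamma * phi(s')) @ theta + noise
    h = phi_s - gamma * phi_s_next
    innovation = reward - h @ theta          # temporal-difference error
    S = h @ P @ h + obs_noise                # innovation variance (scalar)
    K = P @ h / S                            # Kalman gain
    theta = theta + K * innovation           # correction step
    P = P - np.outer(K, h) @ P               # covariance update
    return theta, P
```

The covariance P directly provides the uncertainty estimate the abstract refers to, which can then drive exploration. Handling nonlinear parameterizations requires a nonlinear filtering scheme (for example, sigma-point approximations) rather than this plain linear update.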