[email protected]: Dr Matthieu Geist - Kalman Temporal Differences
Istituto Dalle Molle di studi sull’intelligenza artificiale
Start date: 22 December 2009
End date: 23 December 2009
Generalization is an important problem in reinforcement learning (RL), and value function approximation (VFA) is a way to handle it. A value function approximator should exhibit several features: being sample efficient (data can be expensive, especially in an industrial context), handling nonlinearities (whether from a nonlinear parameterization such as a multilayer perceptron, or from the Bellman optimality equation itself), handling nonstationarities (the system may be nonstationary, but above all generalized policy iteration induces nonstationarities) and providing uncertainty information about estimates (which should prove useful for the dilemma between exploration and exploitation). After a quick survey of VFA, it will be shown that casting value function approximation as a filtering problem allows the introduction of a framework which handles all these problems at the same time.
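To illustrate the filtering view described above, here is a minimal sketch of a Kalman-filter-style temporal-difference update for a linear parameterization. This is an illustrative simplification, not the full Kalman Temporal Differences algorithm of the talk (which also covers nonlinear parameterizations via derivative-free sigma-point approximations): the value is modeled as V(s) = phi(s)^T theta, the reward is treated as a noisy linear observation of the parameters, and the names, noise levels, and random-walk evolution model below are assumptions.

```python
import numpy as np

def kalman_td_update(theta, P, phi_s, phi_s_next, r,
                     gamma=0.95, obs_noise=1.0, proc_noise=1e-3):
    """One Kalman-style TD update for a linear value function V(s) = phi(s)^T theta.

    theta : parameter estimate (mean of the posterior)
    P     : parameter covariance (the uncertainty information)
    Observation model (assumed): r = (phi_s - gamma * phi_s_next)^T theta + noise.
    """
    # Prediction step: a random-walk evolution model injects process noise,
    # which lets the filter track nonstationary (e.g. changing-policy) targets.
    P = P + proc_noise * np.eye(len(theta))
    # Linear observation row derived from the Bellman evaluation equation.
    H = phi_s - gamma * phi_s_next
    # Innovation variance, Kalman gain, and TD error (the innovation).
    S = H @ P @ H + obs_noise
    K = P @ H / S
    td_error = r - H @ theta
    # Correction step: update mean and covariance.
    theta = theta + K * td_error
    P = P - np.outer(K, H @ P)
    return theta, P

# Toy usage: a single transition to a terminal state (phi_s_next = 0) with
# reward 1, so theta should converge to 1 and the uncertainty should shrink.
theta, P = np.zeros(1), 10.0 * np.eye(1)
for _ in range(20):
    theta, P = kalman_td_update(theta, P, np.array([1.0]), np.array([0.0]), 1.0)
```

Because P is carried along with theta, the approximator directly exposes the uncertainty information mentioned above, which is what makes exploration schemes based on confidence intervals possible in this framework.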