Sep 30
Model Selection in Reinforcement Learning
Csaba Szepesvári and I have a new paper on the model selection problem in reinforcement learning. The paper, published in the Machine Learning journal, considers the batch (offline, non-interactive) reinforcement learning setting, where the goal is to find the action-value function with the smallest Bellman error among a countable set of candidate functions. We prove an oracle-like inequality and show that, under some additional conditions, this leads to an adaptive algorithm.
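To make the selection problem concrete, here is a minimal sketch of what "pick the candidate with the smallest Bellman error" could look like in a toy setting. It is not the algorithm from the paper. Everything in it is an assumption made for illustration: the tabular two-state MDP, the deterministic transitions, the discount factor, the three candidates, and the function names `empirical_bellman_error` and `select_candidate`. The sketch scores each candidate by its mean squared one-step Bellman residual on the batch. With deterministic transitions this matches the squared Bellman error on the sampled state-action pairs; with stochastic transitions the same quantity is known to be a biased estimate, so this naive scoring should not be read as a general solution.

```python
import numpy as np


def empirical_bellman_error(q, batch, gamma):
    """Mean squared one-step Bellman residual of a tabular Q on a batch.

    q is an array of shape (n_states, n_actions); batch is a list of
    (state, action, reward, next_state) tuples.
    """
    residuals = [
        q[s, a] - (r + gamma * np.max(q[s_next]))
        for s, a, r, s_next in batch
    ]
    return float(np.mean(np.square(residuals)))


def select_candidate(candidates, batch, gamma):
    """Return the index of the lowest-scoring candidate and all scores."""
    errors = [empirical_bellman_error(q, batch, gamma) for q in candidates]
    return int(np.argmin(errors)), errors


# Hypothetical deterministic MDP: action 0 stays, action 1 switches state,
# and the reward is 1 whenever the next state is state 1.
gamma = 0.5
batch = [
    (0, 0, 0.0, 0),
    (0, 1, 1.0, 1),
    (1, 0, 1.0, 1),
    (1, 1, 0.0, 0),
]

# Three hypothetical candidates; the second is the optimal action-value
# function of this toy MDP, so its Bellman residual is zero everywhere.
candidates = [
    np.zeros((2, 2)),
    np.array([[1.0, 2.0], [2.0, 1.0]]),
    np.ones((2, 2)),
]

best, errors = select_candidate(candidates, batch, gamma)
print("selected candidate:", best)
print("scores:", errors)
```

In this toy case the batch covers every state-action pair once and the transitions are deterministic, which is why the plain residual is a reasonable score here.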
For more information about the results, take a look at the paper here (or here for the version on the MLJ website, subscription required).