Papers
Speedy Q-Learning
NIPS 2011
The Fixed Points of Off-Policy TD
NIPS 2011
Relative Entropy Inverse Reinforcement Learning
AISTATS 2011
Reward Design via Online Gradient Ascent
NIPS 2010
Double Q-learning
NIPS 2010
Learning Policy Improvements with Path Integrals
AISTATS 2010
Efficient Reductions for Imitation Learning
AISTATS 2010