Papers
Regularized Off-Policy TD-Learning
NIPS 2012
Value Pursuit Iteration
NIPS 2012
The Fixed Points of Off-Policy TD
NIPS 2011
Transfer from Multiple MDPs
NIPS 2011
Generalized TD Learning
JMLR 2011
Policy Gradient Coagent Networks
NIPS 2011
Continuous Rapid Action Value Estimates
ACML 2011