Papers
Policy Gradient in Continuous Time
JMLR 2006
Learning Operational Space Control
RSS 2006
Least-Squares Policy Iteration
JMLR 2003
ε-MDPs: Learning in Varying Environments
JMLR 2002
Policy Search using Paired Comparisons
JMLR 2002