Papers
BRPO: Batch Residual Policy Optimization
IJCAI 2020
A Nonparametric Off-Policy Policy Gradient
AISTATS 2020
Adaptive Trade-Offs in Off-Policy Learning
AISTATS 2020
Attentive Experience Replay
AAAI 2020