ACL 2016

On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems