Discovering Options That Minimize Average Planning Time
Abstract
We present an option discovery algorithm that accelerates planning by minimizing the average shortest distance between any two states in the MDP. The proposed algorithm produces options that approximately minimize planning time in the multi-goal setting: we show that it is a worst-case (4-alpha, 2)-approximation of the optimal option set, where alpha is the approximation ratio of the k-medians with penalties subroutine. We then present a variation, "Fast Average Options", with improved run-time, and describe a general means of producing similar algorithms by choosing a different k-medians subroutine. We empirically evaluate our method on four discrete and two continuous control planning domains and show that it outperforms other leading option discovery algorithms.
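To make the objective concrete, the following minimal sketch computes the average shortest-path distance over all ordered state pairs of a small deterministic MDP, and shows how one added option can lower it. It is an illustration of the quantity being minimized, not the proposed algorithm: the chain-shaped state space, the helper names (`chain`, `with_option`, `average_distance`), and the assumption that executing an option costs one planning step are all hypothetical choices made for this example.

```python
from collections import deque
from itertools import permutations


def bfs_distances(adj, source):
    """Shortest-path distances (in steps) from source to every reachable state."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_distance(adj):
    """Mean shortest-path distance over all ordered (start, goal) pairs.

    Assumes every state can reach every other state; otherwise a lookup fails.
    """
    states = list(adj)
    all_dist = {s: bfs_distances(adj, s) for s in states}
    pairs = list(permutations(states, 2))
    return sum(all_dist[s][g] for s, g in pairs) / len(pairs)


def chain(n):
    """Hypothetical toy MDP: n states in a line, moves to adjacent states only."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def with_option(adj, start, end):
    """Copy of adj with one extra directed transition start -> end.

    Assumption: an option is modeled as a single unit-cost jump.
    """
    new_adj = {s: set(nbrs) for s, nbrs in adj.items()}
    new_adj[start].add(end)
    return new_adj


base = chain(6)
shortcut = with_option(base, 0, 5)

print("average distance without options:", average_distance(base))
print("average distance with one option:", average_distance(shortcut))
```

Under these assumptions, the option from one end of the chain to the other reduces the average distance, because several start states can now reach far goals in fewer steps. Choosing which few options give the largest reduction is the hard part, and that is where the k-medians subroutine mentioned above comes in.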