ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search

Shangtong Zhang; Hengshuai Yao

AAAI 2019

Abstract

In this paper, we propose an actor ensemble algorithm, named ACE, for continuous control with a deterministic policy in reinforcement learning. In ACE, we use an actor ensemble (i.e., multiple actors) to search for the global maximum of the critic. Besides the ensemble perspective, we also formulate ACE in the option framework by extending the option-critic architecture with deterministic intra-option policies, revealing a relationship between ensembles and options. Furthermore, we perform a look-ahead tree search with those actors and a learned value prediction model, resulting in a refined value estimate. We demonstrate a significant performance boost of ACE over DDPG and its variants in challenging physical robot simulators.
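To make the ensemble idea concrete, the following is a minimal sketch of how several deterministic actors could propose actions and the critic could pick among them. It is an illustration under stated assumptions, not the authors' implementation: the names `select_action`, `toy_critic`, and `toy_actors` are hypothetical, the critic is a hand-written function with two peaks rather than a learned network, and the tree search and option-framework parts of ACE are not shown.

```python
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]
Action = Sequence[float]


def select_action(
    state: State,
    actors: List[Callable[[State], Action]],
    critic: Callable[[State, Action], float],
) -> Tuple[int, Action, float]:
    """Return (index, action, value) of the proposal the critic rates highest."""
    # Each actor proposes one action for the current state.
    proposals = [actor(state) for actor in actors]
    # The critic scores every proposal.
    values = [critic(state, action) for action in proposals]
    # Keep the proposal with the largest critic value.
    best = max(range(len(actors)), key=values.__getitem__)
    return best, proposals[best], values[best]


def toy_critic(state: State, action: Action) -> float:
    """Hypothetical critic: a local peak near a = -1, the global peak near a = 2."""
    a = action[0]
    return max(1.0 - (a + 1.0) ** 2, 2.0 - (a - 2.0) ** 2)


# Hypothetical actors, each sitting in a different region of the action space.
toy_actors = [lambda s: [-1.0], lambda s: [0.5], lambda s: [1.9]]

best_index, best_action, best_value = select_action([0.0], toy_actors, toy_critic)
print(best_index, best_action)
```

In this toy setup a single actor stuck at the local peak near a = -1 would never find the better action, whereas taking the maximum over several proposals recovers the one closest to the global peak.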

Authors

Shangtong Zhang, Hengshuai Yao

Topics

Reinforcement Learning > Methods > Deep RL

Reinforcement Learning > Applications > Robotics

Artificial Intelligence > Core AI > Robotics

Deep Learning > Learning Types > Reinforcement Learning

Keywords

reinforcement learning, continuous control, model-based reinforcement learning, value estimation, tree search, deterministic policy, option framework, value prediction, actor ensemble
