MOSER: Learning Sensory Policy for Task-specific Viewpoint via View-conditional World Model

Shenghua Wan; Hai-Hang Sun; Le Gan; De-Chuan Zhan

2024 IJCAI IJCAI 2024

MOSER: Learning Sensory Policy for Task-specific Viewpoint via View-conditional World Model

Abstract

Reinforcement learning from visual observations is a challenging problem with many real-world applications. Existing algorithms mostly rely on a single observation from a well-designed fixed camera that requires human knowledge. Recent studies learn from different viewpoints with multiple fixed cameras, but this incurs high computation and storage costs and may not guarantee the coverage of the optimal viewpoint. To alleviate these limitations, we propose a straightforward View-conditional Partially Observable Markov Decision Processes (VPOMDPs) assumption and develop a new method, the MOdel-based SEnsor controlleR (MOSER). MOSER jointly learns a view-conditional world model (VWM) to simulate the environment, a sensory policy to control the camera, and a motor policy to complete tasks. We design intrinsic rewards from the VWM without additional modules to guide the sensory policy to adjust the camera parameters. Experiments on locomotion and manipulation tasks demonstrate that MOSER autonomously discovers task-specific viewpoints and significantly outperforms most baseline methods.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Reinforcement Learning

🧭 Keyword Pioneer — view-conditional world model

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics

Authors

Shenghua Wan , Hai-Hang Sun , Le Gan , De-Chuan Zhan

Topics

Artificial Intelligence > Core AI > Multimodal Learning Reinforcement Learning > Applications > Robotics

Keywords

partially observable markov decision process intrinsic reward view-conditional world model sensory policy motor policy viewpoint control

Download PDF

Related papers

Langshaw: Declarative Interaction Protocols Based on Sayso and Conflict 2024

A Successful Strategy for Multichannel Iterated Prisoner’s Dilemma 2024

Bring Metric Functions into Diffusion Models 2024

Fast One-Stage Unsupervised Domain Adaptive Person Search 2024

FreqFormer: Frequency-aware Transformer for Lightweight Image Super-resolution 2024