2024 L4DC L4DC 2024

On task-relevant loss functions in meta-reinforcement learning

Abstract

Designing a competent meta-reinforcement learning (meta-RL) algorithm in terms of data usage remains a central challenge to be tackled for its successful real-world applications. In this paper, we propose a sample-efficient meta-RL algorithm that learns a model of the system or environment at hand in a task-directed manner. As opposed to the standard model-based approaches to meta-RL, our method exploits the value information in order to rapidly capture the decision-critical part of the environment. The key component of our method is the loss function for learning both the task inference module and the system model. This systematically couples the model discrepancy and the value estimate, thereby enabling our proposed algorithm to learn the policy and task inference module with a significantly smaller amount of data compared to the existing meta-RL algorithms. The proposed method is evaluated in high-dimensional robotic control, empirically verifying its effectiveness in extracting information indispensable for solving the tasks from observations in a sample-efficient manner.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning and Reinforcement Learning
🐝 Cross-Pollinator — Artificial Intelligence, Computer Vision, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics