Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation

John Martin; Jinkun Wang; Brendan Englot

CoRL 2018

Abstract

We present a method for Temporal Difference (TD) learning that addresses several challenges faced by robots learning to navigate in a marine environment. For improved data efficiency, our method reduces TD updates to Gaussian Process regression. To make predictions amenable to online settings, we introduce a sparse approximation with improved quality over current rejection-based methods. We derive the predictive value function posterior and use the moments to obtain a new algorithm for model-free policy evaluation, SPGP-SARSA. With simple changes, we show SPGP-SARSA can be reduced to a model-based equivalent, SPGP-TD. We perform comprehensive simulation studies and also conduct physical learning trials with an underwater robot. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.
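To illustrate what reducing TD updates to Gaussian Process regression can look like, here is a minimal sketch of exact (non-sparse) GP-based value estimation. It is not the paper's SPGP-SARSA algorithm and includes no sparse approximation. It assumes the common observation model r_t = V(s_t) - gamma * V(s_{t+1}) + noise with a zero-mean GP prior on V. The squared-exponential kernel, the 1-D states, the hyperparameter values, and the names `rbf_kernel` and `gptd_posterior` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D states (assumed kernel)."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(1, -1)
    return np.exp(-0.5 * ((a - b) / lengthscale) ** 2)


def gptd_posterior(states, rewards, query, gamma=0.9, noise=0.1, lengthscale=1.0):
    """Posterior mean and variance of V at the query states.

    states  : T+1 visited states along one trajectory
    rewards : T rewards, rewards[t] observed on states[t] -> states[t+1]
    """
    states = np.asarray(states, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    T = rewards.shape[0]

    # H encodes the assumed model r_t = V(s_t) - gamma * V(s_{t+1}) + noise.
    H = np.zeros((T, T + 1))
    H[np.arange(T), np.arange(T)] = 1.0
    H[np.arange(T), np.arange(T) + 1] = -gamma

    K = rbf_kernel(states, states, lengthscale)       # prior covariance of V
    k_star = rbf_kernel(states, query, lengthscale)   # shape (T+1, Q)

    # Rewards are a linear map of V plus noise, so this is ordinary GP regression.
    G = H @ K @ H.T + noise ** 2 * np.eye(T)
    Hk = H @ k_star
    mean = Hk.T @ np.linalg.solve(G, rewards)
    cov = rbf_kernel(query, query, lengthscale) - Hk.T @ np.linalg.solve(G, Hk)
    return mean, np.diag(cov)


if __name__ == "__main__":
    mean, var = gptd_posterior([0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0], [0.0, 10.0])
    print("posterior mean:", mean)
    print("posterior variance:", var)
```

The linear solve is cubic in the number of transitions, which is the kind of cost a sparse approximation is meant to reduce in online settings.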

Topics

Artificial Intelligence > Bayesian & Probabilistic > Bayesian Learning
Reinforcement Learning > Applications > Robotics
Robotics > Capabilities > Navigation

Keywords

temporal difference learning, policy evaluation, Gaussian process, sparse approximation, marine navigation, underwater robot

Related papers

Batch Active Preference-Based Learning of Reward Functions 2018

Personalized Dynamics Models for Adaptive Assistive Navigation Systems 2018

Neural Modular Control for Embodied Question Answering 2018

Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents 2018

Deep Drone Racing: Learning Agile Flight in Dynamic Environments 2018