Transformer-based Localization from Embodied Dialog with Large-scale Pre-training

Meera Hahn; James M. Rehg

AACL 2022

Abstract

We address the challenging task of Localization via Embodied Dialog (LED). Given a dialog between two agents, an Observer navigating through an unknown environment and a Locator who is attempting to identify the Observer’s location, the goal is to predict the Observer’s final location in a map. We develop a novel LED-Bert architecture and present an effective pretraining strategy. We show that a graph-based scene representation is more effective than the top-down 2D maps used in prior works. Our approach outperforms previous baselines.

Topics

Artificial Intelligence > Core AI > Agent Systems; Artificial Intelligence > Core AI > Multimodal Learning; Artificial Intelligence > Learning Paradigms > Transfer Learning

Keywords

trajectory prediction; graph representation; pretraining strategy; spatial localization; embodied dialog
