Toward Interactive Grounded Language Acquisition

Thomas Kollar; Jayant Krishnamurthy; Grant Strimel

RSS 2013

Abstract

This paper addresses the problem of enabling robots to interactively learn visual and spatial models from multi-modal interactions involving speech, gesture and images. Our approach, called Logical Semantics with Perception (LSP), provides a natural and intuitive interface by significantly reducing the amount of supervision that a human is required to provide. This paper demonstrates LSP in an interactive setting. Given speech and gesture input, LSP is able to learn object and relation classifiers for objects like mugs and relations like left and right. We extend LSP to generate complex natural language descriptions of selected objects using adjectives, nouns and relations, such as "the orange mug to the right of the green book." Furthermore, we extend LSP to incorporate determiners (e.g., "the") into its training procedure, enabling the model to generate acceptable relational language 20% more often than the unaugmented model.

Authors

Thomas Kollar, Jayant Krishnamurthy, Grant Strimel

Topics

Artificial Intelligence > Core AI > Human-AI Interaction

Artificial Intelligence > Core AI > Multimodal Learning

Keywords

natural language generation, multi-modal learning, visual spatial reasoning, grounded language acquisition

Related papers

Realtime Registration-Based Tracking via Approximate Nearest Neighbour Search 2013

Metastability for High-Dimensional Walking Systems on Stochastically Rough Terrain 2013

Deep Learning for Detecting Robotic Grasps 2013

Sorry Dave, I'm Afraid I Can't Do That: Explaining Unachievable Robot Tasks Using Natural Language 2013

Convex Optimization of Nonlinear Feedback Controllers via Occupation Measures 2013