Modeling Referential Gaze in Task-oriented Settings of Varying Referential Complexity

Özge Alaçam; Eugen Ruppert; Ganeshan Malhotra; Chris Biemann; Sina Zarrieß

2022 AACL AACL 2022

Modeling Referential Gaze in Task-oriented Settings of Varying Referential Complexity

Abstract

AbstractReferential gaze is a fundamental phenomenon for psycholinguistics and human-human communication. However, modeling referential gaze for real-world scenarios, e.g. for task-oriented communication, is lacking the well-deserved attention from the NLP community. In this paper, we address this challenging issue by proposing a novel multimodal NLP task; namely predicting when the gaze is referential. We further investigate how to model referential gaze and transfer gaze features to adapt to unseen situated settings that target different referential complexities than the training environment. We train (i) a sequential attention-based LSTM model and (ii) a multivariate transformer encoder architecture to predict whether the gaze is on a referent object. The models are evaluated on the three complexity datasets. The results indicate that the gaze features can be transferred not only among various similar tasks and scenes but also across various complexity levels. Taking the referential complexity of a scene into account is important for successful target prediction using gaze parameters especially when there is not much data for fine-tuning.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning

🧭 Keyword Pioneer — referential gaze

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Özge Alaçam , Eugen Ruppert , Ganeshan Malhotra , Chris Biemann , Sina Zarrieß

Topics

Artificial Intelligence > Core AI > Human-AI Interaction Artificial Intelligence > Core AI > Multimodal Learning Deep Learning > Architectures > Transformers

Keywords

attention mechanism multimodal learning transformer encoder referential gaze gaze transfer

Download PDF

Related papers

A Japanese Corpus of Many Specialized Domains for Word Segmentation and Part-of-Speech Tagging 2022

Enhancing Tabular Reasoning with Pattern Exploiting Training 2022

Re-contextualizing Fairness in NLP: The Case of India 2022

Adversarially Improving NMT Robustness to ASR Errors with Confusion Sets 2022

Promoting Pre-trained LM with Linguistic Features on Automatic Readability Assessment 2022