JTAV: Jointly Learning Social Media Content Representation by Fusing Textual, Acoustic, and Visual Features

Hongru Liang; Haozheng Wang; Jun Wang; Shaodi You; Zhe Sun; Jin-Mao Wei; Zhenglu Yang

COLING 2018

Abstract

Learning social media content is the basis of many real-world applications, including information retrieval and recommendation systems. In contrast with previous works that focus mainly on single-modal or bi-modal learning, we propose to learn social media content by jointly fusing textual, acoustic, and visual information (JTAV). Effective strategies, namely attBiGRU and DCRNN, are proposed to extract fine-grained features of each modality. We also introduce cross-modal fusion and attentive pooling techniques to integrate multi-modal information comprehensively. Extensive experimental evaluation conducted on real-world datasets demonstrates that our proposed model outperforms the state-of-the-art approaches by a large margin.
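As a rough illustration of the attentive pooling idea named in the abstract, the sketch below combines one feature vector per modality into a single vector using softmax attention weights. It is not the JTAV implementation: the abstract gives no formulas, so the dot-product scoring, the `query` vector, the function names, and the toy feature values are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(features, query):
    """Pool equal-length modality vectors into one vector.

    features: list of vectors, one per modality (assumed already projected
              to a shared dimension).
    query:    hypothetical scoring vector; a real model would learn it.
    """
    # Score each modality by its dot product with the query (an assumption).
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Weighted sum of the modality vectors, dimension by dimension.
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights


# Toy feature vectors standing in for textual, acoustic, and visual encoders.
textual = [0.2, 0.7, 0.1]
acoustic = [0.5, 0.1, 0.4]
visual = [0.3, 0.3, 0.9]
query = [1.0, 0.0, 1.0]

pooled, weights = attentive_pool([textual, acoustic, visual], query)
```

The weights sum to one, so the pooled vector is a convex combination of the three modality vectors; the modality whose features align best with the query contributes most.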

Authors

Hongru Liang, Haozheng Wang, Jun Wang, Shaodi You, Zhe Sun, Jin-Mao Wei, Zhenglu Yang

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Learning Types > Self-Supervised Learning

Keywords

attention mechanism, social media analysis, text representation, multi-modal learning, feature fusion, acoustic feature
