MT-Speech at SemEval-2022 Task 10: Incorporating Data Augmentation and Auxiliary Task with Cross-Lingual Pretrained Language Model for Structured Sentiment Analysis

Cong Chen; Jiansong Chen; Cao Liu; Fan Yang; Guanglu Wan; Jinxiong Xia

2022 SEMEVAL SemEval 2022

MT-Speech at SemEval-2022 Task 10: Incorporating Data Augmentation and Auxiliary Task with Cross-Lingual Pretrained Language Model for Structured Sentiment Analysis

Abstract

AbstractSentiment analysis is a fundamental task, and structure sentiment analysis (SSA) is an important component of sentiment analysis. However, traditional SSA is suffering from some important issues: (1) lack of interactive knowledge of different languages; (2) small amount of annotation data or even no annotation data. To address the above problems, we incorporate data augment and auxiliary tasks within a cross-lingual pretrained language model into SSA. Specifically, we employ XLM-Roberta to enhance mutually interactive information when parallel data is available in the pretraining stage. Furthermore, we leverage two data augment strategies and auxiliary tasks to improve the performance on few-label data and zero-shot cross-lingual settings. Experiments demonstrate the effectiveness of our models. Our models rank first on the cross-lingual sub-task and rank second on the monolingual sub-task of SemEval-2022 task 10.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio