Comparison of methods for explicit discourse connective identification across various domains

Merel Scholman; Tianai Dong; Frances Yung; Vera Demberg

EMNLP 2021

Abstract

Existing parsing methods take varying approaches to identifying explicit discourse connectives, but their performance has not been evaluated consistently against one another, nor on text other than newspaper articles. We assess the explicit connective identification performance of three parsing methods (PDTB e2e, Lin et al., 2014; the winner of the CoNLL 2015 shared task, Wang et al., 2015; and DisSent, Nie et al., 2019), along with a simple heuristic. We also examine how well these systems generalize across datasets: written newspaper text (PDTB), written scientific text (BioDRB), prepared spoken text (TED-MDB), and spontaneous spoken text (Disco-SPICE). The results show that the e2e parser outperforms the other parsing methods on all datasets. However, performance drops significantly from the PDTB to all other datasets. We provide a more fine-grained analysis of domain differences and of connectives that prove difficult to parse, in order to highlight the areas where gains can be made.
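The abstract does not say how the simple heuristic works. As an illustration only, one common baseline for this task is lexicon lookup: mark every token span that matches a known connective, then score the predictions against gold annotations. The sketch below assumes that setup; the lexicon `CONNECTIVES` and the functions `find_candidates` and `precision_recall_f1` are hypothetical names introduced here, not the paper's implementation.

```python
import re

# Hypothetical, deliberately tiny lexicon; a real connective list would be far larger.
CONNECTIVES = ["because", "however", "but", "although", "as a result", "in contrast"]


def find_candidates(text, lexicon=CONNECTIVES):
    """Return sorted (start, end, surface) spans that match a lexicon entry."""
    spans = []
    # Try longer entries first so multi-word connectives win over their substrings.
    for conn in sorted(lexicon, key=len, reverse=True):
        pattern = r"\b" + re.escape(conn) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Keep a match only if it does not overlap a span already claimed.
            if all(m.end() <= s or m.start() >= e for s, e, _ in spans):
                spans.append((m.start(), m.end(), m.group(0)))
    return sorted(spans)


def precision_recall_f1(predicted, gold):
    """Score predicted spans against gold spans by exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sentence = "He stayed home because it rained. But she left."
    for start, end, surface in find_candidates(sentence):
        print(start, end, surface)
```

A lookup of this kind tends to over-generate, because many connective strings also have non-discourse uses. That ambiguity is one reason trained classifiers are compared against such a baseline.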

Topics

Machine Learning > Application Areas > Domain Adaptation
Natural Language Processing > Understanding > Semantic Analysis
Natural Language Processing > Applications > Information Extraction
Interdisciplinary > Linguistics > Computational Linguistics
Artificial Intelligence > Core AI > Natural Language Processing

Keywords

domain generalization, domain adaptation, natural language processing, discourse parsing, cross-domain evaluation, text parsing, neural network, explicit discourse connective, parsing method, explicit connective
