Using Discourse Connectives to Test Genre Bias in Masked Language Models

Heidrun Dorgeloh; Lea Kawaletz; Simon Stein; Regina Stodden; Stefan Conrad

EACL 2024

Abstract

This paper presents evidence for an effect of genre on the use of discourse connectives in argumentation. Drawing from discourse processing research on reasoning-based structures, we use fill-mask computation to measure genre-induced expectations of argument realisation, and beta regression to model the probabilities of these realisations against a set of predictors. Contrasting fill-mask probabilities for the presence or absence of a discourse connective in baseline and finetuned language models reveals that genre introduces biases for the realisation of argument structure. These outcomes suggest that cross-domain discourse processing, as well as argument mining, should take into account generalisations about specific features, such as connectives, and their probability in relation to the genre context.
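To illustrate the kind of contrast the abstract describes, the following minimal sketch sums the fill-mask probability that falls on discourse connectives at a masked position and compares a baseline model with a finetuned one. It is not the authors' implementation. The connective inventory, the function names `connective_mass` and `genre_shift`, and all scores are hypothetical. The input format (a list of candidates with `token_str` and `score` keys) is assumed to match what a typical fill-mask pipeline returns.

```python
from typing import Dict, Iterable, List

# Hypothetical connective inventory; the paper's actual list is not given here.
CONNECTIVES = frozenset({"because", "therefore", "so", "thus", "since"})


def connective_mass(
    predictions: List[Dict[str, object]],
    connectives: Iterable[str] = CONNECTIVES,
) -> float:
    """Sum the fill-mask scores of candidates that are discourse connectives."""
    inventory = {c.lower() for c in connectives}
    total = 0.0
    for candidate in predictions:
        # Tokenizers may emit leading spaces or capitalised forms; normalise both.
        token = str(candidate["token_str"]).strip().lower()
        if token in inventory:
            total += float(candidate["score"])
    return total


def genre_shift(
    baseline: List[Dict[str, object]],
    finetuned: List[Dict[str, object]],
) -> float:
    """Connective probability mass after finetuning minus the baseline mass."""
    return connective_mass(finetuned) - connective_mass(baseline)


# Made-up scores for illustration only; these are not results from the paper.
baseline_preds = [
    {"token_str": "because", "score": 0.30},
    {"token_str": "and", "score": 0.25},
    {"token_str": "so", "score": 0.10},
]
finetuned_preds = [
    {"token_str": " because", "score": 0.45},
    {"token_str": "and", "score": 0.15},
    {"token_str": "Thus", "score": 0.15},
]

shift = genre_shift(baseline_preds, finetuned_preds)
print(f"Shift in connective probability mass: {shift:+.2f}")
```

A positive shift would mean that the genre-finetuned model expects a connective at the masked position more strongly than the baseline does. Modelling such probabilities against predictors, as the abstract does with beta regression, is a separate step that this sketch does not cover.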



Topics

Natural Language Processing > Understanding > Semantic Analysis
Artificial Intelligence > Core AI > Fairness
Natural Language Processing > Resources & Methods > Language Modeling

Keywords

argument mining, masked language model, discourse connective, genre bias, cross-domain processing, argument structure, fill-mask computation
