EMNLP 2024
Who Wrote When? Author Diarization in Social Media Discussions
Abstract
We propose a novel framework for author diarization, i.e., attributing comments in online discussions to individual authors. Our approach combines pre-trained neural representations of writing style with author-conditional encoder-decoder diarization, and refines the resulting alignment using a Conditional Random Field with Viterbi decoding. We also introduce two new large-scale German-language datasets, one for authorship verification and one for author diarization. We evaluate the diarization framework on these datasets and discuss the strengths and limitations of the approach.
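To illustrate the kind of refinement that Viterbi decoding over a linear-chain CRF can provide, the following is a minimal sketch, not the authors' implementation. It assumes each comment already has a per-author log-score (the emissions) and that a hypothetical transition matrix scores moving from one author label to the next; the function name `viterbi` and all numbers are made up for illustration.

```python
import math


def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions:   T x K log-scores, one row per comment, one column per author.
    transitions: K x K log-scores for moving from author i to author j.
    Both are hypothetical inputs, not quantities defined in the paper.
    """
    num_labels = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each label
    backpointers = []

    for step in emissions[1:]:
        new_score, pointers = [], []
        for j in range(num_labels):
            # Best previous label for arriving at label j.
            best_i = max(range(num_labels),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + step[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final label.
    path = [max(range(num_labels), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example: four comments, two candidate authors (illustrative values only).
emissions = [[math.log(p) for p in row]
             for row in [[0.9, 0.1], [0.45, 0.55], [0.8, 0.2], [0.1, 0.9]]]
transitions = [[math.log(0.8), math.log(0.2)],
               [math.log(0.2), math.log(0.8)]]

independent = [max(range(2), key=lambda j: row[j]) for row in emissions]
decoded = viterbi(emissions, transitions)
print(independent, decoded)
```

In this toy setting, labeling each comment independently assigns the second comment to the other author on a weak margin, while the sequence-level decoding keeps it with the first author because the assumed transition scores favor continuity.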
Authors
Topics
Machine Learning > Core Methods > Classification
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Resources & Methods > Text Representation
Interdisciplinary > Linguistics > Computational Linguistics
Interdisciplinary > Social > Social Media Analysis
Machine Learning > Core Methods > Graphical Models
Natural Language Processing > Applications > Named Entity Recognition
Artificial Intelligence > Core AI > Information Extraction