2023 ACL ACL 2023

Glossy Bytes: Neural Glossing using Subword Encoding

Abstract

AbstractThis paper presents several different neural subword modelling based approaches to interlinear glossing for seven under-resourced languages as a part of the 2023 SIGMORPHON shared task on interlinear glossing. We experiment with various augmentation and tokenization strategies for both the open and closed tracks of data. We found that while byte-level models may perform well for greater amounts of data, character based approaches remain competitive in their performance in lower resource settings.

🧭 Keyword Pioneer — neural glossing
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Speech & Audio