INTERSPEECH 2024
Oversampling, Augmentation and Curriculum Learning for Speaking Assessment with Limited Training Data
Abstract
Automated assessment systems for spontaneous speech are an increasingly important component of language proficiency tests and learning platforms. These systems have seen remarkable development in recent years, driven by advances in self-supervised learning. Nevertheless, in languages such as Finnish and Finland Swedish, their performance is still limited by the low-resource and imbalanced nature of their data. To alleviate these issues, this work evaluates two data-level methods: oversampling and curriculum learning. Our results show that combining these methods yields the greatest boost to model performance, achieved without additional data or modification to the model structure.
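As a rough illustration of the two data-level ideas named in the abstract, the Python sketch below shows random oversampling of under-represented proficiency labels followed by an easy-to-hard ordering of the training items. The abstract does not say how either method is implemented, so the function names (`oversample`, `curriculum_order`), the toy data, and the difficulty score are all assumptions made for illustration, not the authors' method.

```python
import random
from collections import Counter, defaultdict


def oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until every class
    has as many items as the largest class (assumed strategy)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for label, group in by_label.items():
        # Draw the missing items with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        balanced.extend((s, label) for s in group + extra)
    return balanced


def curriculum_order(pairs, difficulty):
    """Sort (sample, label) pairs from easiest to hardest using a
    caller-supplied difficulty score (hypothetical here)."""
    return sorted(pairs, key=lambda pair: difficulty(pair[0]))


# Toy data: hypothetical response IDs and proficiency labels.
samples = ["r1", "r2", "r3", "r4", "r5", "r6"]
labels = [3, 3, 3, 3, 1, 5]

balanced = oversample(samples, labels)
counts = Counter(label for _, label in balanced)

# Placeholder difficulty: the numeric part of the ID. A real system
# would need a meaningful score, which the abstract does not define.
ordered = curriculum_order(balanced, difficulty=lambda s: int(s[1:]))
```

In this sketch neither step adds new data or changes the model: oversampling only repeats existing items, and the curriculum only changes the order in which they are presented.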