IJCNLP 2025

Thesis Proposal: Efficient Methods for Natural Language Generation/Understanding Systems

Abstract

While Large Language Models (LLMs) have shown remarkable performance in various Natural Language Processing (NLP) tasks, their effectiveness seems to be heavily biased toward high-resource languages. This proposal aims to address this gap by developing efficient training strategies for low-resource languages. We propose various techniques for efficient learning in simulated low-resource settings for English. We then plan to adapt these methods for low-resource languages. We plan to experiment with both natural language generation and understanding models. We evaluate the models on benchmarks similar to those of the BabyLM challenge for English. For other languages, we plan to use treebanks and translation techniques to create our own silver test set to evaluate the low-resource LMs.

Authors