RETUYT-INCO at MLSP 2024: Experiments on Language Simplification using Embeddings, Classifiers and Large Language Models

Ignacio Sastre; Leandro Alfonso; Facundo Fleitas; Federico Gil; Andrés Lucas; Tomás Spoturno; Santiago Góngora; Aiala Rosá; Luis Chiruzzo

NAACL 2024

Abstract

In this paper we present the participation of the RETUYT-INCO team at the BEA-MLSP 2024 shared task. We followed different approaches, from Multilayer Perceptron models with word embeddings to Large Language Models fine-tuned on different datasets: already existing, crowd-annotated, and synthetic. Our best models are based on fine-tuning Mistral-7B, either with a manually annotated dataset or with synthetic data.

Topics

Artificial Intelligence > Core AI > Foundation Models; Machine Learning > Core Methods > Classification; Machine Learning > Core Methods > Embedding Learning

Keywords

multilayer perceptron; word embedding; large language model; language simplification
