Papers
290 papers found
BabyHGRN: Exploring RNNs for Sample-Efficient Language Modeling
Patrick Haller, Jonas Golde, Alan Akbik
BabyLlama-2: Ensemble-Distilled Models Consistently Outperform Teachers With Limited Data
Jean-Loup Tastet, Inar Timiryasov
BabyLM Challenge: Experimenting with Self-Distillation and Reverse-Distillation for Language Model Pre-Training on Constrained Datasets
Aakarsh Nair, Alina Hancharova, Mayank Kumar et al.
BabyLM Challenge: Exploring the effect of variation sets on language model training efficiency
Akari Haga, Akiyo Fukatsu, Miyu Oba et al.
BERTtime Stories: Investigating the Role of Synthetic Story Data in Language Pre-training
Nikitas Theodoropoulos, Giorgos Filandrianos, Vassilis Lyberatos et al.
Causal ATE Mitigates Unintended Bias in Controlled Text Generation
Rahul Madhavan, Kahini Wadhawan
Choosy Babies Need One Coach: Inducing Mode-Seeking Behavior in BabyLlama with Reverse KL Divergence
Shaozhen Shi, Yevgen Matusevych, Malvina Nissim
ConcreteGPT: A Baby GPT-2 Based on Lexical Concreteness and Curriculum Learning
Luca Capone, Alessandro Bondielli, Alessandro Lenci
Continuous Attentive Multimodal Prompt Tuning for Few-Shot Multimodal Sarcasm Detection
Soumyadeep Jana, Animesh Dey, Ranbir Singh Sanasam
Critical Questions Generation: Motivation and Challenges
Blanca Calvo Figueras, Rodrigo Agerri
CrowdCounter: A benchmark type-specific multi-target counterspeech dataset
Punyajoy Saha, Abhilash Datta, Abhik Jana et al.
Developmentally Plausible Multimodal Language Models Are Highly Modular
Alina Klerings, Christian Bartelt, Aaron Mueller
Different Ways to Forget: Linguistic Gates in Recurrent Neural Networks
Cristiano Chesi, Veronica Bressan, Matilde Barbini et al.
Dreaming Out Loud: A Self-Synthesis Approach For Training Vision-Language Models With Developmentally Plausible Data
Badr AlKhamissi, Yingtian Tang, Abdülkadir Gökce et al.
EditEval: An Instruction-Based Benchmark for Text Improvements
Jane Dwivedi-Yu, Timo Schick, Zhengbao Jiang et al.
Explaining the Hardest Errors of Contextual Embedding Based Classifiers
Claudio Moisés Valiense De Andrade, Washington Cunha, Guilherme Fonseca et al.
Exploring Curriculum Learning for Vision-Language Tasks: A Study on Small-Scale Multimodal Training
Rohan Saha, Abrar Fahim, Alona Fyshe et al.
Extending the BabyLM Initiative: Promoting Diversity in Datasets and Metrics through High-Quality Linguistic Corpora
Laurent Prévot, Sheng-Fu Wang, Jou-An Chi et al.
Findings of the Second BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Michael Y. Hu, Aaron Mueller, Candace Ross et al.
From Babble to Words: Pre-Training Language Models on Continuous Streams of Phonemes
Zébulon Goriely, Richard Diehl Martinez, Andrew Caines et al.
Further Compressing Distilled Language Models via Frequency-aware Partial Sparse Coding of Embeddings
Kohki Tamura, Naoki Yoshinaga, Masato Neishi
Generalizations across filler-gap dependencies in neural language models
Katherine Howitt, Sathvik Nair, Allison Dods et al.
Global Learning with Triplet Relations in Abstractive Summarization
Fengyu Lu, Jiaxin Duan, Junfei Liu
Global-Pruner: A Stable and Efficient Pruner for Retraining-Free Pruning of Encoder-Based Language Models
Guangzhen Yao, Yuehan Wang, Hui Xu et al.