Uncertainty in Semantic Language Modeling with PIXELS

Stefania Radu; Marco Zullich; Matias Valdenegro-Toro

EMNLP 2025

Abstract

Pixel-based language models aim to solve the vocabulary bottleneck problem in language modeling, but uncertainty quantification for these models remains an open challenge. This work analyses uncertainty and confidence in pixel-based language models across 18 languages and 7 scripts on 3 semantically challenging tasks, using Monte Carlo Dropout, Transformer Attention, and Ensemble Learning. The results suggest that pixel-based models underestimate uncertainty when reconstructing patches. Uncertainty is also influenced by the script, with languages written in Latin script displaying lower uncertainty. For ensemble learning, the findings show better performance when hyperparameter tuning is applied on the named entity recognition and question-answering tasks across 16 languages.
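As a rough illustration of the Monte Carlo Dropout idea named in the abstract, the sketch below keeps dropout active at prediction time, repeats the forward pass several times, and reads the spread of the outputs as an uncertainty estimate. It is a toy under stated assumptions, not the authors' implementation: the linear model, the function names (`dropout`, `forward`, `mc_dropout_predict`), the dropout rate, and the number of passes are all hypothetical and chosen only to keep the example self-contained.

```python
import random
import statistics


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def forward(x, weights, rate, rng):
    """One stochastic forward pass of a toy linear model (hypothetical stand-in
    for a real network) with dropout applied to the inputs."""
    dropped = dropout(x, rate, rng)
    return sum(w * d for w, d in zip(weights, dropped))


def mc_dropout_predict(x, weights, rate=0.1, n_passes=100, seed=0):
    """Run `n_passes` forward passes with dropout left on.

    Returns (mean, std): the mean is the prediction, the standard deviation
    across passes is used as the uncertainty estimate.
    """
    rng = random.Random(seed)
    samples = [forward(x, weights, rate, rng) for _ in range(n_passes)]
    return statistics.fmean(samples), statistics.pstdev(samples)


if __name__ == "__main__":
    mean, std = mc_dropout_predict([1.0, 2.0, 3.0], [0.5, -0.25, 0.75])
    print(f"prediction={mean:.3f} uncertainty(std)={std:.3f}")
```

With the dropout rate set to zero every pass is identical and the reported spread collapses to zero, so in this sketch any nonzero spread comes only from the dropout noise.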

Topics

Machine Learning > Optimization & Theory > Bayesian Inference
Natural Language Processing > Resources & Methods > Large Language Models
Natural Language Processing > Resources & Methods > Language Modeling
Deep Learning > Models > Large Language Models
Machine Learning > Learning Types > Uncertainty Quantification

Keywords

ensemble learning, uncertainty quantification, question answering, named entity recognition, Monte Carlo dropout, pixel-based language model
