Consistent Bidirectional Language Modelling: Expressive Power and Representational Conciseness

Georgi Shopov; Stefan Gerdjikov

2024 EMNLP EMNLP 2024

Consistent Bidirectional Language Modelling: Expressive Power and Representational Conciseness

Abstract

AbstractThe inability to utilise future contexts and the pre-determined left-to-right generation order are major limitations of unidirectional language models. Bidirectionality has been introduced to address those deficiencies. However, a crucial shortcoming of bidirectional language models is the potential inconsistency of their conditional distributions. This fundamental flaw greatly diminishes their applicability and hinders their capability of tractable sampling and likelihood computation. In this work, we introduce a class of bidirectional language models, called latent language models, that are consistent by definition and can be efficiently used both for generation and scoring of sequences. We define latent language models based on the well-understood formalism of bisequential decompositions from automata theory. This formal correspondence allows us to precisely charaterise the abilities and limitations of a subclass of latent language models, called rational language models. As a result, we obtain that latent language models are exponentially more concise and significantly more expressive than unidirectional language models.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — latent language model

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Georgi Shopov , Stefan Gerdjikov

Topics

Machine Learning > Core Methods > Representation Learning Natural Language Processing > Generation > Language Modeling Machine Learning > Learning Types > Representation Learning Deep Learning > Models > Large Language Models Deep Learning > Learning Types > Representation Learning Deep Learning > Models > Language Modeling

Keywords

representation learning sequence modeling language modeling automata theory bidirectional language model latent language model bisequential decomposition representational conciseness

Download PDF

Related papers

EmbodiedBERT: Cognitively Informed Metaphor Detection Incorporating Sensorimotor Information 2024

Mitigating Matthew Effect: Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning for Conversational Recommendation 2024

Learning to Extract Structured Entities Using Language Models 2024

Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis 2024

CSSL: Contrastive Self-Supervised Learning for Dependency Parsing on Relatively Free Word Ordered and Morphologically Rich Low Resource Languages 2024