When is a Language Process a Language Model?
Abstract
A language model may be viewed as an 𝐴-valued stochastic process for some alphabet 𝐴. However, in some pathological situations, such a stochastic process may "leak" probability mass onto the set of infinite strings, and hence is not equivalent to the conventional view of a language model as a distribution over ordinary (finite) strings. Such ill-behaved language processes are referred to as *non-tight* in the literature. In this work, we study conditions for tightness through the lens of stochastic processes. In particular, by regarding the end-of-sequence symbol as marking a stopping time and using results from martingale theory, we give characterizations of tightness that generalize our previous work [(Du et al. 2023)](https://arxiv.org/abs/2212.10502).
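
To make the notion of leaked probability mass concrete, here is a minimal sketch under a strong simplifying assumption that is not part of the general setting: the probability of emitting the terminating symbol (written EOS here) at step t depends only on t, not on the preceding context. The function names `termination_mass`, `tight_eos`, and `leaky_eos` are hypothetical and chosen only for illustration. Under this assumption, the mass placed on finite strings of length at most T is one minus the product of the per-step survival probabilities.

```python
from fractions import Fraction


def termination_mass(eos_prob, horizon):
    """Probability that EOS is emitted within the first `horizon` steps.

    Assumes (for illustration only) that the EOS probability at step t
    is eos_prob(t), independent of the symbols generated so far.
    """
    survive = Fraction(1)  # probability that no EOS has been emitted yet
    for t in range(1, horizon + 1):
        survive *= 1 - eos_prob(t)
    return 1 - survive


def tight_eos(t):
    # Constant EOS probability: survival decays geometrically to 0.
    return Fraction(1, 2)


def leaky_eos(t):
    # EOS probability 1/(t+1)^2 shrinks fast enough that survival stays
    # bounded away from 0: the product telescopes to (T+2) / (2(T+1)).
    return Fraction(1, (t + 1) ** 2)


for horizon in (10, 100, 1000):
    print(
        horizon,
        float(termination_mass(tight_eos, horizon)),
        float(termination_mass(leaky_eos, horizon)),
    )
```

In this toy setting the constant-rate process places mass 1 - 2^(-T) on strings of length at most T, which tends to 1, so it is tight. The second process places only T / (2(T+1)) on such strings, which tends to 1/2, so half of the probability mass ends up on infinite strings and the process is non-tight.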