Factuality Evaluation Using Reasoning and World Modeling

Sachin Vashistha

AAAI 2026

Abstract

Large language models (LLMs) have rapidly become primary tools for searching and generating information from a carefully designed prompt, which may contain few-shot examples. However, these models frequently produce statements that are inconsistent with verifiable facts and reliable sources, raising fundamental questions about how they store, update, and reason with facts. Improving factuality therefore requires more than surface-level mitigation: it demands a deeper understanding of how LLMs construct and maintain world models, and of how reasoning processes can be guided to remain faithful to verifiable information. Existing strategies, such as retrieval-augmented generation, training-time alignment, and post hoc verification, partly address these challenges but do not provide a holistic account of how facts are internally stored, updated, or grounded in external knowledge sources. My research addresses this gap by studying factuality through the dual lens of reasoning and world modeling, asking how LLMs encode facts, how adversarial or linguistic perturbations compromise factual reasoning, and how interpretability tools can reveal and correct model vulnerabilities. In this work, I aim to develop a framework in which an LLM interacts with an explicit external knowledge source, thereby forming a robust world model for factuality evaluation.

Topics

Artificial Intelligence > Core AI > Interpretability

Natural Language Processing > Applications > Fact-Checking

Keywords

factuality evaluation, factual reasoning, large language model, world modeling, interpretability tool
