SLOT: Structuring the Output of Large Language Models

Zhengyuan Shen; Darren Yow-Bang Wang; Soumya Smruti Mishra; Zhichao Xu; Yifei Teng; Haibo Ding

EMNLP 2025


Abstract

Structured outputs are essential for large language models (LLMs) in critical applications such as agents and information extraction. Despite their capabilities, LLMs often generate outputs that deviate from predefined schemas, which significantly hampers reliable application development. We present SLOT (Structured LLM Output Transformer), a model-agnostic approach that transforms unstructured LLM outputs into precise structured formats. While existing solutions predominantly rely on constrained decoding techniques or are tightly coupled with specific models, SLOT employs a fine-tuned lightweight language model as a post-processing layer, achieving flexibility across various LLMs and schema specifications. We introduce SLOTBench, curated by a data synthesis pipeline, alongside a formal evaluation methodology that quantifies both schema accuracy and content fidelity. Our results demonstrate that a fine-tuned Mistral-7B model with constrained decoding achieves near-perfect schema accuracy (99.5%) and content similarity (94.0%), outperforming Claude-3.5-Sonnet by substantial margins (+25 and +20 percentage points, respectively). Notably, even compact models like Llama-3.2-1B can match or exceed the structured output capabilities of much larger proprietary models when equipped with SLOT, enabling reliable structured generation in resource-constrained environments. SLOTBench will be released upon legal approval.
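As a rough illustration of what a schema accuracy score might measure, the sketch below counts the fraction of model outputs that parse as JSON and match a target schema. This is an assumption-laden example, not the paper's metric: the abstract does not give the formal definition, and the names `SCHEMA`, `conforms`, and `schema_accuracy` are hypothetical, as is the simplified field-to-type schema format.

```python
import json

# Hypothetical mini-schema (not from the paper): field name -> expected type.
SCHEMA = {"name": str, "age": int, "skills": list}

def conforms(raw: str, schema: dict) -> bool:
    """True if raw parses as a JSON object with exactly the schema's keys and types."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict) or set(obj) != set(schema):
        return False
    # Exact type match, so a JSON boolean is not accepted where an int is expected.
    return all(type(obj[key]) is expected for key, expected in schema.items())

def schema_accuracy(outputs: list, schema: dict) -> float:
    """Fraction of outputs that conform to the schema (0.0 for an empty list)."""
    if not outputs:
        return 0.0
    return sum(conforms(o, schema) for o in outputs) / len(outputs)

outputs = [
    '{"name": "Ada", "age": 36, "skills": ["math"]}',  # valid
    '{"name": "Ada", "age": "36", "skills": []}',      # wrong type for age
    'Sure! Here is the JSON you asked for...',         # not JSON at all
    '{"name": "Ada"}',                                 # missing fields
]
print(schema_accuracy(outputs, SCHEMA))
```

A production check would more likely validate against a full JSON Schema document; the exact-key, exact-type rule here is only a stand-in for that idea.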



Topics

Artificial Intelligence > Core AI > Foundation Models
Artificial Intelligence > Core AI > Large Language Models
Deep Learning > Models > Large Language Models
Deep Learning > Learning Types > Fine-Tuning
Machine Learning > Application Areas > Efficient Computing
Machine Learning > Learning Types > Fine-Tuning
Natural Language Processing > Generation > Text Generation
Natural Language Processing > Applications > Information Extraction
Natural Language Processing > Applications > Text Generation

Keywords

information extraction; text generation; structured output; language model; constrained decoding; structured output generation; language model fine-tuning; large language model; schema accuracy; content fidelity; post-processing layer; structured generation
