Aligning LLMs for Thai Legal Question Answering with Efficient Semantic-Similarity Rewards

Pawitsapak Akarajaradwong; Chompakorn Chaksangchaichot; Pirat Pothavorn; Ekapol Chuangsuwanich; Attapol Rutherford; Sarana Nutanong

2025 EMNLP EMNLP 2025

Aligning LLMs for Thai Legal Question Answering with Efficient Semantic-Similarity Rewards

Abstract

AbstractThe Retrieval-Augmented Generation (RAG) systems’ performance on Thai legal question answering is still limited, especially for questions requiring extensive, complex legal reasoning. To address these limitations, we introduce a resource-efficient approach that aligns Large Language Models (LLMs) for improved citation accuracy and response quality using Group-Relative Policy Optimization (GRPO). Our proposed method leverages BGE-M3 embeddings as a cost-efficient semantic-similarity reward, significantly reducing computational expenses up to 2.5x compared to an LLM-based reward model. Experiments on the NitiBench benchmark demonstrate substantial improvements: GRPO achieves up to 90% citation-F1 gains relative to the base model and a 31% increase in joint quality metrics over instruction tuning. Crucially, our approach provides a practical and effective solution for enhancing legal LLMs in resource-constrained environments.

🧭 Keyword Pioneer — group-relative policy optimization

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Pawitsapak Akarajaradwong , Chompakorn Chaksangchaichot , Pirat Pothavorn , Ekapol Chuangsuwanich , Attapol Rutherford , Sarana Nutanong

Topics

Natural Language Processing > Applications > Question Answering Natural Language Processing > Resources & Methods > Large Language Models

Keywords

Download PDF

Related papers

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing 2025

Model-based Large Language Model Customization as Service 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design 2025