IJCNLP 2025

Bridging Health Literacy Gaps in Indian Languages: Multilingual LLMs for Clinical Text Simplification

Abstract

We demonstrate how open multilingual LLMs (mT5, IndicTrans2) can simplify complex medical documents into culturally sensitive, patient-friendly text in Indian languages, advancing equitable healthcare communication and multilingual scientific accessibility.

Clinical documents such as discharge summaries, consent forms, and medication instructions are essential for patient care but are often written in complex, jargon-heavy language. This barrier is intensified in multilingual, low-literacy contexts such as India, where linguistic diversity meets limited health literacy. We present a multilingual clinical text simplification pipeline that uses open large language models (mT5 and IndicTrans2) to automatically rewrite complex medical text into accessible, culturally appropriate, and patient-friendly versions in English, Hindi, Tamil, and Telugu. On a synthetic dataset of 2,000 discharge summaries, our models achieve up to a 42% improvement in readability while maintaining factual accuracy. The framework demonstrates how open, reproducible LLMs can reduce linguistic inequities in healthcare communication and support inclusive, patient-centric digital health access in India.
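As an illustration of how a percentage readability improvement could be computed, the following is a minimal sketch. The abstract does not state which readability metric was used, so this sketch assumes a crude, language-agnostic proxy (mean tokens per sentence, where lower is easier) and a relative-reduction formula. The function names `avg_sentence_length` and `relative_improvement` are hypothetical and are not taken from the paper.

```python
import re

# Assumption: sentences end with Latin punctuation or the danda (U+0964) used in Hindi.
_SENTENCE_END = re.compile(r"[.!?\u0964]+")


def avg_sentence_length(text: str) -> float:
    """Mean number of whitespace-separated tokens per sentence (lower reads easier)."""
    sentences = [s for s in _SENTENCE_END.split(text) if s.strip()]
    if not sentences:
        raise ValueError("text contains no sentences")
    return sum(len(s.split()) for s in sentences) / len(sentences)


def relative_improvement(before: float, after: float) -> float:
    """Percentage reduction of a lower-is-better score, relative to the original."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Hypothetical example texts, not drawn from the paper's dataset.
original = (
    "The patient was administered intravenous antibiotics postoperatively "
    "and should continue oral therapy."
)
simplified = "You got antibiotics after surgery. Keep taking the tablets."

score_before = avg_sentence_length(original)    # 12 tokens in 1 sentence
score_after = avg_sentence_length(simplified)   # 9 tokens over 2 sentences
print(f"{relative_improvement(score_before, score_after):.1f}% shorter sentences")
```

Sentence length alone says nothing about vocabulary difficulty or factual accuracy, and the naive splitter will miscount text containing decimals such as dosages, so a proxy like this would only complement the kind of evaluation the abstract describes.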

🌉 Interdisciplinary Bridge — Healthcare & Medicine and Natural Language Processing
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio

Authors