I know you are different! Towards Persona Driven Knowledge-infused Dialogue Assistant
Abstract
Despite advances in large language models (LLMs), Task-Oriented Dialogue (TOD) systems often fall short in delivering personalized, context-rich responses, especially in low-resource, code-mixed, and multimodal settings such as Hinglish (Hindi-English). To bridge this gap, we introduce HiVisTask, the first Hinglish multimodal, multidomain, persona-based TOD dataset, which captures user-agent interactions across text and visual modalities. We also propose G3 TOD, a generalizable framework that enhances personalization using three structured knowledge graphs, covering entity context, user persona, and commonsense reasoning, all extracted from the conversation history. Extensive experiments with LLMs (e.g., LLaMA3.2, Phi3, GPT4, Mistral7b, Qwen3, Gemma3) show that G3 TOD consistently outperforms both standard and ablated baselines. We observe substantial gains over existing models on both quantitative (BLEU ↑) and qualitative (Human Eval ↑) evaluation metrics. These improvements underscore the value of structured and selective contextualization in generating personalized and engaging multimodal responses.