2025 COLING COLING 2025

Adapt LLM for Multi-turn Reasoning QA using Tidy Data

Abstract

AbstractThis paper presents our submission to the Fin-DBQA shared task at the 9th FinNLP workshop. The task involves answering finance-focused questions in a multi-turn environment, requiring step-by-step reasoning and Python code generation. We propose a novel approach to tackle this multidimensional problem by pre-processing the data into tidy data format so that each column represents a variable and each row an observation. Our experiments demonstrate that using the tidy data format allows all models to surpass SOTA, with GPT-4o achieving a 50.62% accuracy on the DBQR-QA benchmark achieving second place on the shared task leaderboard. These findings suggest that transforming data into the tidy data format enhances reasoning capabilities, reduces syntax errors, and improves performance on table-reasoning QA tasks. The code is available online.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing
🧭 Keyword Pioneer — tidy data format
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy

Authors