Saama Technologies at SemEval-2025 Task 8: Few-shot prompting with LLM-generated examples for question answering on tabular data
Abstract
For SemEval 2025 Task 8, addressing tabular data question answering, we introduce a novel few-shot prompting system that guides large language models (LLMs) to generate Python code representing the reasoning process. Our system automatically creates a library of exemplar code snippets from training data, which are then used for few-shot prompting. Crucially, we incorporate a selection prompt to choose the best candidate code from multiple LLM-generated options, improving robustness and accuracy. Our system achieved competitive results, ranking 17th in the Open Model track and 25th overall. Ablation studies demonstrate the effectiveness of our exemplar generation and code selection strategies. We conclude with a discussion of limitations and promising avenues for future research.
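As a rough illustration of the two prompting stages summarized above, the following minimal sketch assembles a few-shot code-generation prompt from an exemplar library and a selection prompt over several candidate programs. It is not the system's actual implementation: the prompt wording, the names `Exemplar`, `build_few_shot_prompt`, and `build_selection_prompt`, and the example question and columns are all hypothetical, and the LLM calls themselves are omitted.

```python
from dataclasses import dataclass


@dataclass
class Exemplar:
    """One library entry: a training question, its table columns, and code that answers it."""
    question: str
    columns: list[str]
    code: str


def build_few_shot_prompt(exemplars, question, columns):
    """Concatenate an instruction, the exemplars, and the new question (hypothetical wording)."""
    parts = ["Write a Python function `answer(df)` that answers the question "
             "using the pandas DataFrame `df`."]
    for ex in exemplars:
        parts.append(f"Columns: {', '.join(ex.columns)}\n"
                     f"Question: {ex.question}\nCode:\n{ex.code}")
    # The final block is left open so the model completes the code.
    parts.append(f"Columns: {', '.join(columns)}\nQuestion: {question}\nCode:")
    return "\n\n".join(parts)


def build_selection_prompt(question, candidates):
    """Number each candidate program and ask the model to pick one (hypothetical wording)."""
    parts = [f"Question: {question}", "Candidate programs:"]
    for i, code in enumerate(candidates, start=1):
        parts.append(f"[{i}]\n{code}")
    parts.append("Reply with the number of the best candidate.")
    return "\n\n".join(parts)


# Illustrative data only; not drawn from the task's training set.
library = [
    Exemplar(
        question="How many rows are there?",
        columns=["price", "city"],
        code="def answer(df):\n    return len(df)",
    )
]
question = "What is the maximum price?"
prompt = build_few_shot_prompt(library, question, ["price", "city"])
selection = build_selection_prompt(
    question,
    [
        "def answer(df):\n    return df['price'].max()",
        "def answer(df):\n    return df['price'].min()",
    ],
)
```

In a sketch like this, `prompt` would be sent to the model several times to obtain multiple candidate programs, and `selection` would then be sent to choose among them.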