Dunamu ML at the Financial Misinformation Detection Challenge Task: Improving Supervised Fine-Tuning with LLM-based Data Augmentation
Abstract
AbstractIn this paper, we describe Dunamu MLâs submission to the Financial Misinformation Detection (FMD) 2025 shared task. To address the low-resource challenge in FMD, we augmented a general domain misinformation detection dataset for training. We first collected claims, contexts, and misinformation labels from a public dataset. Then, we generated evidence for each label based on a closed LLM with few-shot examples extracted from the FMD training dataset. Finally, we oversampled the training data specific to the financial domain and augmented it with the generated data to perform supervised fine-tuning (SFT) on the LLM. When evaluated on the blind test dataset, our model achieved an F1 score of 84.67 in misinformation classification and a ROUGE-1 score of 81.21 in evidence generation, ranking first on the leaderboard in both aspects.