MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning

Chengpeng Li; Zheng Yuan; Hongyi Yuan; Guanting Dong; Keming Lu; Jiancan Wu; Chuanqi Tan; Xiang Wang; Chang Zhou

ACL 2024

Abstract

In math reasoning with large language models (LLMs), augmenting fine-tuning data through query evolution and diverse reasoning paths has been empirically shown to be effective, substantially narrowing the gap between open-source LLMs and cutting-edge proprietary LLMs. In this paper, we investigate such data augmentation in math reasoning and aim to answer three questions: (1) Which data augmentation strategies are more effective? (2) What is the scaling relationship between the amount of augmented data and model performance? (3) Can data augmentation incentivize generalization to out-of-domain mathematical reasoning tasks? To this end, we create two new datasets, AugGSM8K and AugMATH, by complicating and diversifying the queries and sampling multiple reasoning paths from GSM8K and MATH. We obtain a series of LLMs called MuggleMath by fine-tuning LLaMA models on AugGSM8K and AugMATH. MuggleMath achieves a new state of the art on GSM8K and MATH by a substantial margin. We present a log-linear relationship between MuggleMath's performance and the amount of augmented data on GSM8K, and a segmented log-linear relationship on MATH. We also find that MuggleMath generalizes weakly to out-of-domain math reasoning, both from AugGSM8K to MATH and from AugMATH to GSM8K, which suggests that augmenting queries that cover a broader range of subjects is more beneficial for generalization.
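
To make the log-linear scaling claim concrete, the following is a minimal sketch of how such a relationship could be fitted, assuming the form accuracy = a + b * log(n), where n is the number of augmented samples. The function name `fit_log_linear`, the coefficients, and the data points are hypothetical illustrations: the points are generated synthetically from a known line and are not results from the paper.

```python
import numpy as np


def fit_log_linear(n_samples, accuracy):
    """Least-squares fit of accuracy = a + b * log(n_samples).

    Hypothetical helper for illustration; not the authors' code.
    """
    x = np.log(np.asarray(n_samples, dtype=float))
    y = np.asarray(accuracy, dtype=float)
    # np.polyfit returns coefficients from highest degree down: slope, then intercept.
    b, a = np.polyfit(x, y, deg=1)
    return a, b


# Synthetic placeholder points drawn from a known line (a=10, b=5).
# These are NOT measurements reported in the paper.
sizes = np.array([1_000, 10_000, 100_000])
acc = 10.0 + 5.0 * np.log(sizes)

a, b = fit_log_linear(sizes, acc)
```

Under this form, each multiplicative increase in data size adds a constant amount of accuracy. A segmented log-linear relationship, as the abstract describes for MATH, could be sketched by applying the same fit separately to each range of data sizes.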

Topics

Machine Learning > Application Areas > Data Augmentation
Machine Learning > Learning Types > Transfer Learning
Artificial Intelligence > Core AI > Large Language Models

Keywords

transfer learning, mathematical reasoning, data augmentation, parameter efficient fine-tuning, chain of thought, reasoning path, large language model, math reasoning, query evolution
