2025 NAACL NAACL 2025

A Dataset of Ancient Chinese Math Word Problems and an Application for Research in Historic Mathematics

Abstract

AbstractSolving math word problems, i.e. mathemati-cal problems stated in natural language, has re-ceived much attention in the Artificial Intelli-gence (AI) community over the last years. Un-surprisingly, research has focused on problems stated in contemporary languages. In contrast to this, in this article, we introduce a dataset of math word problems that is extracted from an-cient Chinese mathematical texts. The dataset is made available.1 We report a baseline per-formance for GPT-4o solving the problems in the dataset using a Program-of-Thought paradigm that translates the mathematical pro-cedures in the original texts into Python code, giving acceptable performance but showing that the model often struggles with understand-ing the pre-modern language. Finally, we de-scribe how the generated code can be used for research into the history of mathematics, by of-fering a way to search the texts by abstract op-erations instead of specific lexemes.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Interdisciplinary and Machine Learning and Natural Language Processing
🧭 Keyword Pioneer — history of mathematics
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors