The State of the Art of Large Language Models on Chartered Financial Analyst Exams

Mahmoud Mahfouz; Ethan Callanan; Mathieu Sibue; Antony Papadimitriou; Zhiqiang Ma; Xiaomo Liu; Xiaodan Zhu

2024 EMNLP EMNLP 2024

The State of the Art of Large Language Models on Chartered Financial Analyst Exams

Abstract

AbstractThe Chartered Financial Analyst (CFA) program is one of the most widely recognized financial certifications globally. In this work, we test a variety of state-of-the-art large language models (LLMs) on mock CFA exams to provide an overview of their financial analysis capabilities using the same evaluation standards applied for human professionals. We benchmark five leading proprietary models and eight open-source models on all three levels of the CFA through challenging multiple-choice and essay questions. We find that flagship proprietary models perform relatively well and can solidly pass levels I and II exams, but fail at level III due to essay questions. Open-source models generally fall short of estimated passing scores, but still show strong performance considering their size, cost, and availability advantages. We also find that using textbook data helps bridge the gap between open-source and proprietary models to a certain extent, despite reduced gains in CFA levels II and III. By understanding the current financial analysis abilities of LLMs, we aim to guide practitioners on which models are best suited for enhancing automation in the financial industry.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Mahmoud Mahfouz , Ethan Callanan , Mathieu Sibue , Antony Papadimitriou , Zhiqiang Ma , Xiaomo Liu , Xiaodan Zhu

Topics

Artificial Intelligence > Core AI > Foundation Models Natural Language Processing > Applications > Question Answering Artificial Intelligence > Core AI > Large Language Models Deep Learning > Models > Large Language Models Machine Learning > Learning Types > Evaluation

Keywords

benchmark evaluation text classification question answering financial analysis large language model

Download PDF

Related papers

EmbodiedBERT: Cognitively Informed Metaphor Detection Incorporating Sensorimotor Information 2024

Mitigating Matthew Effect: Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning for Conversational Recommendation 2024

Learning to Extract Structured Entities Using Language Models 2024

Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis 2024

CSSL: Contrastive Self-Supervised Learning for Dependency Parsing on Relatively Free Word Ordered and Morphologically Rich Low Resource Languages 2024