On Evaluating LLMs’ Capabilities as Functional Approximators: A Bayesian Evaluation Framework

Shoaib Ahmed Siddiqui; Yanzhi Chen; Juyeon Heo; Menglin Xia; Adrian Weller

2025 COLING COLING 2025

On Evaluating LLMs’ Capabilities as Functional Approximators: A Bayesian Evaluation Framework

Abstract

AbstractRecent works have successfully applied Large Language Models (LLMs) to function modeling tasks. However, the reasons behind this success remain unclear. In this work, we propose a new evaluation framework to comprehensively assess LLMs’ function modeling abilities. By adopting a Bayesian perspective of function modeling, we discover that LLMs are relatively weak in understanding patterns in raw data, but excel at utilizing prior knowledge about the domain to develop a strong understanding of the underlying function. Our findings offer new insights about the strengths and limitations of LLMs in the context of function modeling.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Shoaib Ahmed Siddiqui , Yanzhi Chen , Juyeon Heo , Menglin Xia , Adrian Weller

Topics

Artificial Intelligence > Bayesian & Probabilistic > Bayesian Learning Machine Learning > Optimization & Theory > Theory Natural Language Processing > Resources & Methods > Large Language Models

Keywords

bayesian learning function approximation evaluation framework prior knowledge large language model

Download PDF

Related papers

Navigating Dialectal Bias and Ethical Complexities in Levantine Arabic Hate Speech Detection 2025

TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution 2025

Positive Text Reframing under Multi-strategy Optimization 2025

RAM2C: A Liberal Arts Educational Chatbot based on Retrieval-augmented Multi-role Multi-expert Collaboration 2025

Two-stage Incomplete Utterance Rewriting on Editing Operation 2025