Can Large Language Models Reason about Program Invariants?

Kexin Pei; David Bieber; Kensen Shi; Charles Sutton; Pengcheng Yin

2023 ICML ICML 2023

Can Large Language Models Reason about Program Invariants?

Abstract

Identifying invariants is an important program analysis task with applications towards program understanding, bug finding, vulnerability analysis, and formal verification. Existing tools for identifying program invariants rely on dynamic analysis, requiring traces collected from multiple executions in order to produce reliable invariants. We study the application of large language models to invariant prediction, finding that models trained on source code and fine-tuned for invariant generation can perform invariant prediction as static rather than dynamic analysis. Using a scratchpad approach where invariants are predicted sequentially through a program gives the best performance, finding invariants statically of quality comparable to those obtained by a dynamic analysis tool with access to five program traces.

❓ The Questioner

🌉 Interdisciplinary Bridge — Artificial Intelligence and Computer Science and Knowledge & Reasoning and Natural Language Processing

🧭 Keyword Pioneer — program invariant

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio