The 2nd Automated Verification of Textual Claims (AVeriTeC) Shared Task: Open-weights, Reproducible and Efficient Systems

Mubashara Akhtar; Rami Aly; Yulong Chen; Zhenyun Deng; Michael Schlichtkrull; Chenxi Whitehouse; Andreas Vlachos

ACL 2025

Abstract

In the first Automated Verification of Textual Claims (AVeriTeC) shared task, participating teams developed systems that, for each claim, retrieve evidence from the web and predict its veracity. While there was progress in automated fact-checking for real-world claims, the majority of the systems proposed relied on closed-weights large language models, which rendered them expensive to run and less reproducible. To ameliorate this issue, in this year’s edition of the AVeriTeC shared task we required systems to use only open-weights models that could be run using a single GPU with 23GB of RAM, and to take one minute or less per claim to return verdicts accompanied by evidence retrieved from a precompiled knowledge store. The shared task received 7 submissions, 6 of which exceeded the accuracy of our baseline on the test set while running in under a minute per claim on the hardware we had specified. The winning team was CTU AIC with an AVeriTeC score of 33.17%. In this paper we describe the shared task in detail and highlight key findings.
