FinDVer: Explainable Claim Verification over Long and Hybrid-content Financial Documents

Yilun Zhao; Yitao Long; Tintin Jiang; Chengye Wang; Weiyuan Chen; Hongjun Liu; Xiangru Tang; Yiming Zhang; Chen Zhao; Arman Cohan

EMNLP 2024

Abstract

We introduce FinDVer, a comprehensive benchmark specifically designed to evaluate the explainable claim verification capabilities of LLMs in the context of understanding and analyzing long, hybrid-content financial documents. FinDVer contains 4,000 expert-annotated examples across four subsets, each focusing on a type of scenario that frequently arises in real-world financial domains. We assess a broad spectrum of 25 LLMs under long-context and RAG settings. Our results show that even the current best-performing system (i.e., GPT-4o) significantly lags behind human experts. Our detailed findings and insights highlight the strengths and limitations of existing LLMs in this new task. We believe FinDVer can serve as a valuable benchmark for evaluating LLM capabilities in claim verification over complex, expert-domain documents.

Topics

Natural Language Processing > Applications > Fact-Checking
Natural Language Processing > Applications > Machine Reading Comprehension
Artificial Intelligence > Core AI > Natural Language Processing
Deep Learning > Learning Types > Retrieval-Augmented Generation

Keywords

claim verification, explainable AI, retrieval-augmented generation, financial document, large language model
