LJPCheck: Functional Tests for Legal Judgment Prediction

Yuan Zhang; Wanhong Huang; Yi Feng; Chuanyi Li; Zhiwei Fei; Jidong Ge; Bin Luo; Vincent Ng

ACL 2024

Abstract

Legal Judgment Prediction (LJP) refers to the task of automatically predicting judgment results (e.g., charges, law articles, and term of penalty) given the fact descriptions of cases. While SOTA models have achieved high accuracy and F1 scores on public datasets, existing datasets fail to evaluate specific aspects of these models, such as legal fairness, that significantly affect their application in real scenarios. Inspired by functional testing in software engineering, we introduce LJPCHECK, a suite of functional tests for LJP models, to comprehend LJP models’ behaviors and offer diagnostic insights. We illustrate the utility of LJPCHECK on five SOTA LJP models. Extensive experiments reveal vulnerabilities in these models, prompting an in-depth discussion of the underlying reasons for their shortcomings.
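To make the idea of a functional test for an LJP model concrete, the following is a minimal sketch of one invariance-style fairness check: changing only a legally irrelevant attribute of the fact description (here, the defendant's gender) should not change the predicted charge. This is an illustration under stated assumptions, not the test suite defined in LJPCHECK. The names `invariance_test` and `toy_predictor`, the example cases, and the model interface (a function from fact text to a charge label) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model interface: maps a fact description to a predicted charge.
Predictor = Callable[[str], str]


def invariance_test(predict: Predictor,
                    cases: List[Tuple[str, str]]) -> Dict[str, object]:
    """Run an invariance-style functional test.

    Each case is (original_fact, perturbed_fact). The perturbation changes
    only a legally irrelevant attribute, so both predictions should match.
    """
    failures = []
    for original, perturbed in cases:
        if predict(original) != predict(perturbed):
            failures.append((original, perturbed))
    total = len(cases)
    pass_rate = (total - len(failures)) / total if total else 0.0
    return {"pass_rate": pass_rate, "failures": failures}


def toy_predictor(fact: str) -> str:
    """Deliberately biased toy stand-in for an LJP model (illustration only)."""
    tokens = fact.lower().split()
    if "stole" in tokens:
        # Intentional flaw: the prediction depends on the defendant's gender.
        return "theft" if "he" in tokens else "fraud"
    return "other"


# Hypothetical test cases: the pairs differ only in the pronoun.
cases = [
    ("He stole a phone from the store.", "She stole a phone from the store."),
    ("He drove home after work.", "She drove home after work."),
]

report = invariance_test(toy_predictor, cases)
print(report["pass_rate"])
print(report["failures"])
```

A real predictor would replace `toy_predictor`. The report separates the aggregate pass rate from the individual failing pairs, so a failure can be traced back to the specific perturbation that triggered it.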

Topics

Artificial Intelligence > Core AI > Interpretability; Machine Learning > Core Methods > Classification; Machine Learning > Application Areas > Fairness; Healthcare & Medicine > Clinical > Clinical NLP; Machine Learning > Learning Types > Evaluation; Natural Language Processing > Applications > Clinical NLP; Artificial Intelligence > Core AI > Natural Language Processing

Keywords

model evaluation; software engineering; legal NLP; legal judgment prediction; functional testing; legal fairness
