Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Haonan Li; Xudong Han; Zenan Zhai; Honglin Mu; Hao Wang; Zhenxuan Zhang; Yilin Geng; Shom Lin; Renxi Wang; Artem Shelmanov; Xiangyu Qi; Yuxia Wang; Donghai Hong; Youliang Yuan; Meng Chen; Haoqin Tu; Fajri Koto; Cong Zeng; Tatsuki Kuribayashi; Rishabh Bhardwaj; Bingchen Zhao; Yawen Duan; Yi Liu; Emad A. Alghamdi; Yaodong Yang; Yinpeng Dong; Soujanya Poria; Pengfei Liu; Zhengzhong Liu; Hector Xuguang Ren; Eduard Hovy; Iryna Gurevych; Preslav Nakov; Monojit Choudhury; Timothy Baldwin

NAACL 2025

Abstract

As large language models (LLMs) continue to evolve, leaderboards play a significant role in steering their development. Existing leaderboards often prioritize model capabilities while overlooking safety concerns, leaving a significant gap in responsible AI development. To address this gap, we introduce Libra-Leaderboard, a comprehensive framework designed to rank LLMs through a balanced evaluation of performance and safety. Combining a dynamic leaderboard with an interactive LLM arena, Libra-Leaderboard encourages the joint optimization of capability and safety. Unlike traditional approaches that average performance and safety metrics, Libra-Leaderboard uses a distance-to-optimal-score method to calculate the overall rankings. This approach incentivizes models to balance the two dimensions rather than excel in one at the expense of the other. In its first release, Libra-Leaderboard evaluates 26 mainstream LLMs from 14 leading organizations, identifying critical safety challenges even in state-of-the-art models.
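
The abstract does not give the exact formula behind the distance-to-optimal-score method, so the following is only a minimal sketch of the general idea under stated assumptions: both scores are normalized to [0, 1], the optimal point is (1, 1), and the overall score is one minus the normalized Euclidean distance to that point. The function name `balanced_score` and the normalization are hypothetical and are not taken from the paper.

```python
import math


def balanced_score(performance: float, safety: float) -> float:
    """Illustrative distance-to-optimal score (assumed form, not the paper's).

    Both inputs are assumed to lie in [0, 1]; the optimal point is (1, 1).
    The distance is divided by its maximum (sqrt(2)) so the result is in [0, 1].
    """
    for name, value in (("performance", performance), ("safety", safety)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    # Mean squared gap to the optimum, then root: a normalized Euclidean distance.
    distance = math.sqrt(((1.0 - performance) ** 2 + (1.0 - safety) ** 2) / 2.0)
    return 1.0 - distance


def average_score(performance: float, safety: float) -> float:
    """Plain average, shown only for comparison."""
    return (performance + safety) / 2.0


if __name__ == "__main__":
    # Hypothetical models with the same average but different balance.
    for label, perf, safe in (("balanced", 0.7, 0.7), ("lopsided", 0.9, 0.5)):
        print(label, round(average_score(perf, safe), 4), round(balanced_score(perf, safe), 4))
```

Under these assumptions, the two hypothetical models share the same plain average, but the lopsided one lands farther from the optimal point and therefore receives the lower overall score. This is the incentive toward balance that the abstract describes.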

Topics

Artificial Intelligence > Core AI > Responsible AI
Machine Learning > Application Areas > Fairness

Keywords

model safety, responsible AI, large language model, leaderboard evaluation, capability safety balance

Related papers

Few-shot Personalization of LLMs with Mis-aligned Responses 2025

NLI under the Microscope: What Atomic Hypothesis Decomposition Reveals 2025

Understanding Figurative Meaning through Explainable Visual Entailment 2025

CogLM: Tracking Cognitive Development of Large Language Models 2025
