Papers

17,973 papers found
2025 EMNLP
A Probabilistic Inference Scaling Theory for LLM Self-Correction
Zhe Yang, Yichang Zhang, Yudong Wang et al.
2025 EMNLP
ArabicWeb-Edu: Educational Quality Data for Arabic LLM Training
Majd Hawasly, Tasnim Mohiuddin, Hamdy Mubarak et al.
2025 EMNLP
AraEval: An Arabic Multi-Task Evaluation Suite for Large Language Models
Alhanoof Althnian, Norah A. Alzahrani, Shaykhah Z. Alsubaie et al.
2025 EMNLP
AraHealthQA 2025: The First Shared Task on Arabic Health Question Answering
Hassan Alhuzali, Walid Al-Eisawi, Muhammad Abdul-Mageed et al.
2025 EMNLP
AraReasoner: Evaluating Reasoning-Based LLMs for Arabic NLP
Ahmed Abul Hasanaath, Aisha Alansari, Ahmed Ashraf et al.
2025 EMNLP
AraSafe: Benchmarking Safety in Arabic LLMs
Hamdy Mubarak, Abubakr Mohamed, Majd Hawasly
2025 EMNLP
Are Checklists Really Useful for Automatic Evaluation of Generative Tasks?
Momoka Furuhashi, Kouta Nakayama, Takashi Kodama et al.
2025 EMNLP
Are Language Models Consequentialist or Deontological Moral Reasoners?
Keenan Samway, Max Kleiman-Weiner, David Guzman Piedrahita et al.
2025 EMNLP