Theory

4950 directly classified papers

Papers

Assessing Large Language Models on Islamic Legal Reasoning: Evidence from Inheritance Law Evaluation EMNLP 2025

Local Normalization Distortion and the Thermodynamic Formalism of Decoding Strategies for Large Language Models EMNLP 2025

Logical forms complement probability in understanding language model (and human) performance ACL 2025

FADE: Why Bad Descriptions Happen to Good Features ACL 2025

Benchmarking the Benchmarks: Reproducing Climate-Related NLP Tasks ACL 2025

Beyond Surface-Level Patterns: An Essence-Driven Defense Framework Against Jailbreak Attacks in LLMs ACL 2025

Why Uncertainty Estimation Methods Fall Short in RAG: An Axiomatic Analysis ACL 2025

Voting or Consensus? Decision-Making in Multi-Agent Debate ACL 2025

Reasoning Circuits in Language Models: A Mechanistic Interpretation of Syllogistic Inference ACL 2025

Subjectivity in the Annotation of Bridging Anaphora ACL 2025

Current Semantic-change Quantification Methods Struggle with Discovery in the Wild EMNLP 2025

Model Consistency as a Cheap yet Predictive Proxy for LLM Elo Scores EMNLP 2025

The 2025 ReproNLP Shared Task on Reproducibility of Evaluations in NLP: Overview and Results ACL 2025

How Persuasive Is Your Context? EMNLP 2025

Discursive Circuits: How Do Language Models Understand Discourse Relations? EMNLP 2025

Statistical inference on black-box generative models in the data kernel perspective space ACL 2025

VoiceBBQ: Investigating Effect of Content and Acoustics in Social Bias of Spoken Language Model EMNLP 2025

Rethinking Evaluation Metrics for Grammatical Error Correction: Why Use a Different Evaluation Process than Human? ACL 2025

A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny ACL 2025

CausalGraphBench: a Benchmark for Evaluating Language Models capabilities of Causal Graph discovery ACL 2025

A Measure of the System Dependence of Automated Metrics ACL 2025

SPECS: Specificity-Enhanced CLIP-Score for Long Image Caption Evaluation EMNLP 2025

Unique Hard Attention: A Tale of Two Sides ACL 2025

Sandcastles in the Storm: Revisiting the (Im)possibility of Strong Watermarking ACL 2025

A Theory of Response Sampling in LLMs: Part Descriptive and Part Prescriptive ACL 2025