Large Language Models

6405 directly classified papers

Papers

The Hallucination Tax of Reinforcement Finetuning EMNLP 2025

Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications EMNLP 2025

Dynamic Evaluation for Oversensitivity in LLMs EMNLP 2025

ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework EMNLP 2025

DCRM: A Heuristic to Measure Response Pair Quality in Preference Optimization EMNLP 2025

ReflAct: World-Grounded Decision Making in LLM Agents via Goal-State Reflection EMNLP 2025

NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls EMNLP 2025

Benchmarking and Mitigating MCQA Selection Bias of Large Vision-Language Models EMNLP 2025

Can Large Language Models Unlock Novel Scientific Research Ideas? EMNLP 2025

seqBench: A Tunable Benchmark to Quantify Sequential Reasoning Limits of LLMs EMNLP 2025

LiteraryQA: Towards Effective Evaluation of Long-document Narrative QA EMNLP 2025

Augmenting Compliance-Guaranteed Customer Service Chatbots: Context-Aware Knowledge Expansion with Large Language Models EMNLP 2025

Slim-SC: Thought Pruning for Efficient Scaling with Self-Consistency EMNLP 2025

AI Argues Differently: Distinct Argumentative and Linguistic Patterns of LLMs in Persuasive Contexts EMNLP 2025

EduVidQA: Generating and Evaluating Long-form Answers to Student Questions based on Lecture Videos EMNLP 2025

Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions EMNLP 2025

FoREST: Frame of Reference Evaluation in Spatial Reasoning Tasks EMNLP 2025

so much depends / upon / a whitespace: Why Whitespace Matters for Poets and LLMs EMNLP 2025

Quantifying Logical Consistency in Transformers via Query-Key Alignment EMNLP 2025

Evaluating Large Language Models for Detecting Antisemitism EMNLP 2025

Towards Robust Mathematical Reasoning EMNLP 2025

Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Fine-tuning EMNLP 2025

Introducing Spotlight: A Novel Approach for Generating Captivating Key Information from Documents EMNLP 2025

Which Word Orders Facilitate Length Generalization in LMs? An Investigation with GCG-Based Artificial Languages EMNLP 2025

Chinese Toxic Language Mitigation via Sentiment Polarity Consistent Rewrites EMNLP 2025