Research Explorer

On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering

Linyong Nan, Ellen Zhang, Weijin Zou et al.

2024 NAACL

CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments

Kung-Hsiang Huang, Akshara Prabhakar, Sidharth Dhawan et al.

2025 NAACL

AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents

Zhe Su, Xuhui Zhou, Sanketh Rangreji et al.

2025 NAACL

CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories

Yijia Xiao, Runhui Wang, Luyang Kong et al.

2025 NAACL

Adapting LLM Agents with Universal Communication Feedback

Kuan Wang, Yadong Lu, Michael Santacroce et al.

2025 NAACL

Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents

Qiusi Zhan, Richard Fang, Henil Shalin Panchal et al.

2025 NAACL

Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents

Shrinidhi Kumbhar, Venkatesh Mishra, Kevin Coutinho et al.

2025 NAACL

Self Knowledge-Tracing for Tool Use (SKT-Tool): Helping LLM Agents Understand Their Capabilities in Tool Use

Joshua Vigel, Renpei Cai, Eleanor Chen et al.

2025 NAACL

TableWise at SemEval-2025 Task 8: LLM Agents for TabQA

Harsh Bansal, Aman Raj, Akshit Sharma et al.

2025 SEMEVAL

QleverAnswering-PUCRS at SemEval-2025 Task 8: Exploring LLM agents, code generation and correction for Table Question Answering

André Bergmann Lisboa, Lucas Cardoso Azevedo, Lucas Rafael Costella Pessutto

2025 SEMEVAL

Teams of LLM Agents can Exploit Zero-Day Vulnerabilities

Yuxuan Zhu, Antony Kellermann, Akul Gupta et al.

2026 EACL

H-MEM: Hierarchical Memory for High-Efficiency Long-Term Reasoning in LLM Agents

Haoran Sun, Shaoning Zeng, Bob Zhang

2026 EACL

Automating Android Build Repair: Bridging the Reasoning-Execution Gap in LLM Agents with Domain-Specific Tools

Ha Min Son, Huan Ren, Xin Liu et al.

2026 EACL

Beyond Blind Following: Evaluating Robustness of LLM Agents under Imperfect Guidance

Yao Fu, Ran Qiu, Xinhe Wang et al.

2026 EACL

Communication Enables Cooperation in LLM Agents: A Comparison with Curriculum-Based Approaches

Hachem Madmoun, Salem Lahlou

2026 EACL

PersonaTrace: Synthesizing Realistic Digital Footprints with LLM Agents

Minjia Wang, Yunfeng Wang, Xiao Ma et al.

2026 EACL

Beyond IVR: Benchmarking Customer Support LLM Agents for Business-Adherence

Sumanth Balaji, Piyush Mishra, Aashraya Sachdeva et al.

2026 EACL

SIRAJ: Diverse and Efficient Red-Teaming for LLM Agents via Distilled Structured Reasoning

Kaiwen Zhou, Ahmed Elgohary, A S M Iftekhar et al.

2026 EACL

Label-Consistent Data Generation for Aspect-Based Sentiment Analysis Using LLM Agents

Mohammad Hossein Akbari Monfared, Lucie Flek, Akbar Karimi

2026 EACL

RAG-Enhanced Collaborative LLM Agents for Drug Discovery

Namkyeong Lee, Edward De Brouwer, Ehsan Hajiramezanali et al.

2026 AAAI

AgentSense: Virtual Sensor Data Generation Using LLM Agents in Simulated Home Environments

Zikang Leng, Megha Thukral, Yaqi Liu et al.

2026 AAAI

Investigating Prosocial Behavior Theory in LLM Agents Under Policy-Induced Inequities

Yujia Zhou, Hexi Wang, Qingyao Ai et al.

2026 AAAI

MAGIC: Mastering Physical Adversarial Generation in Context Through Collaborative LLM Agents

Yun Xing, Nhat Chung, Jie Zhang et al.

2026 AAAI

Conformal Constrained Policy Optimization for Cost-Effective LLM Agents

Wenwen Si, Sooyong Jang, Insup Lee et al.

2026 AAAI

DEPO: Dual-Efficiency Preference Optimization for LLM Agents

Sirui Chen, Mengshi Zhao, Lei Xu et al.

2026 AAAI
