Software Engineering

567 directly classified papers

Papers

FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation ACL 2025

CodeRAG-Bench: Can Retrieval Augment Code Generation? NAACL 2025

TestEval: Benchmarking Large Language Models for Test Case Generation NAACL 2025

VisualCoder: Guiding Large Language Models in Code Execution with Fine-grained Multimodal Chain-of-Thought Reasoning NAACL 2025

What can Large Language Models Capture about Code Functional Equivalence? NAACL 2025

Unmasking Database Vulnerabilities: Zero-Knowledge Schema Inference Attacks in Text-to-SQL Systems NAACL 2025

AssertionBench: A Benchmark to Evaluate Large-Language Models for Assertion Generation NAACL 2025

LLM-Assisted Translation of Legacy FORTRAN Codes to C++: A Cross-Platform Study NAACL 2025

Towards Effectively Leveraging Execution Traces for Program Repair with Code LLMs NAACL 2025

CAD-Recode: Reverse Engineering CAD Code from Point Clouds ICCV 2025

From Knowledge to Noise: CTIM-Rover and the Pitfalls of Episodic Memory in Software Engineering Agents ACL 2025

ExeCoder: Empowering Large Language Models with Executability Representation for Code Translation EMNLP 2025

CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images CVPR 2025

Automated CAD Modeling Sequence Generation from Text Descriptions via Transformer-Based Large Language Models ACL 2025

Automating the Expansion of Instrument Typicals in Piping and Instrumentation Diagrams (P&IDs) AAAI 2025

No Size Fits All: The Perils and Pitfalls of Leveraging LLMs Vary with Company Size COLING 2025

Deriving Semantic Checkers from Tests to Detect Silent Failures in Production Distributed Systems OSDI 2025

Overlapping Context with Variable-Length Stride Increases Diversity when Training Large Language Model for Code ACL 2025

UniDebugger: Hierarchical Multi-Agent Framework for Unified Software Debugging EMNLP 2025

Supporting Online Discussions: Integrating AI Into the adhocracy+ Participation Platform To Enhance Deliberation EMNLP 2025

Benchmarking Long-Context Language Models on Long Code Understanding ACL 2025

CodeScientist: End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation ACL 2025

EquiBench: Benchmarking Large Language Models’ Reasoning about Program Semantics via Equivalence Checking EMNLP 2025

PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving EMNLP 2025

A Survey on Model Repair in AI Planning IJCAI 2025