Papers
Long-form RewardBench: Evaluating Reward Models for Long-form Generation
Hui Huang, Yancheng He, Wei Liu et al.
Think-J: Learning to Think for Generative LLM-as-a-Judge
Hui Huang, Yancheng He, Hongli Zhou et al.
Hybrid Routing for a Mixture of LoRA Experts
Yitong Huang, Ziqi Yang, Zihui Wang et al.
Large Language Model Unlearning for Source Code
Xue Jiang, Yihong Dong, Huangzhao Zhang et al.
Confidence-Guided Stepwise Model Routing for Cost-Efficient Reasoning
Sangmook Lee, Dohyung Kim, Hyukhun Koh et al.
OSVBench: Benchmarking LLMs on Specification Generation Tasks for Operating System Verification
Shangyu Li, Juyong Jiang, Tiancheng Zhao et al.
VerifyBench: A Systematic Benchmark for Evaluating Reasoning Verifiers Across Domains
Xuzhao Li, Xuchen Li, Shiyu Hu et al.
SepPrune: Structured Pruning for Efficient Deep Speech Separation
Yuqi Li, Kai Li, Xin Yin et al.
RLMR: Reinforcement Learning with Mixed Rewards for Creative Writing
JianXing Liao, Tian Zhang, Xiao Feng et al.
Talk2Image: A Multi-Agent System for Multi-Turn Image Generation and Editing
Shichao Ma, Yunhe Guo, Jiahao Su et al.
GateRA: Token-aware Modulation for Parameter-Efficient Fine-tuning
Jie Ou, Shuaihong Jiang, Yingjun Du et al.
RetrySQL: Text-to-SQL Training with Retry Data for Self-Correcting Query Generation
Alicja Rączkowska, Riccardo Belluzzo, Piotr Zieliński et al.
Scaling LLM Speculative Decoding: Non-Autoregressive Forecasting in Large-Batch Scenarios
Luohe Shi, Zuchao Li, Lefei Zhang et al.
GUI-G²: Gaussian Reward Modeling for GUI Grounding
Fei Tang, Zhangxuan Gu, Zhengxi Lu et al.
PRIME: Planning and Retrieval-Integrated Memory for Enhanced Reasoning
Hieu Tran, Zonghai Yao, Nguyen Luong Tran et al.
GRAM-R²: Self-Training Generative Foundation Reward Models for Reward Reasoning
Chenglong Wang, Yongyu Mu, Hang Zhou et al.
ICL-Router: In-Context Learned Model Representations for LLM Routing
Chenxu Wang, Hao Li, Yiqun Zhang et al.
Rethinking Flow and Diffusion Bridge Models for Speech Enhancement
Dahan Wang, Jun Gao, Tong Lei et al.
OptScale: Probabilistic Optimality for Inference-time Scaling
Youkang Wang, Jian Wang, Rubing Chen et al.
DeepOR: A Deep Reasoning Foundation Model for Optimization Modeling
Ziyang Xiao, Yuan Jessica Wang, Xiongwei Han et al.
Multiplicative Orthogonal Sequential Editing for Language Models
Hao-Xiang Xu, Jun-Yu Ma, Ziqi Peng et al.
HyCoRA: Hyper-Contrastive Role-Adaptive Learning for Role-Playing
Shihao Yang, Zhicong Lu, Yong Yang et al.
SASST: Leveraging Syntax-Aware Chunking and LLMs for Simultaneous Speech Translation
Zeyu Yang, Lai Wei, Roman Koshkin et al.
ExPairT-LLM: Exact Learning for LLM Code Selection by Pairwise Queries
Tom Yuviler, Dana Drachsler-Cohen
Prune4Web: DOM Tree Pruning Programming for Web Agent
Jiayuan Zhang, Kaiquan Chen, Zhihao Lu et al.