Papers
VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning
Ji Soo Lee, Jongha Kim, Jeehye Na et al.
Radiology Report Generation via Multi-objective Preference Optimization
Ting Xiao, Lei Shi, Peng Liu et al.
Forward KL Regularized Preference Optimization for Aligning Diffusion Policies
Zhao Shan, Chenyou Fan, Shuang Qiu et al.
Multi-Reference Preference Optimization for Large Language Models
Hung Le, Quan Hung Tran, Dung Nguyen et al.
Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization
Jianing Wang, Yang Zhou, Xiaocheng Zhang et al.
Enhancing Audiovisual Speech Recognition Through Bifocal Preference Optimization
Yihan Wu, Yichen Lu, Yifan Peng et al.
KnowPO: Knowledge-Aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models
Ruizhe Zhang, Yongxin Xu, Yuzhen Xiao et al.
Advancing Audio-Based Text Generation with Imbalance Preference Optimization
Zhenghao Zhou, Yongjie Liu, Chen Cao
WEPO: Web Element Preference Optimization for LLM-based Web Navigation
Jiarun Liu, Jia Hao, Chunhong Zhang et al.
JailPO: A Novel Black-Box Jailbreak Framework via Preference Optimization Against Aligned LLMs
Hongyi Li, Jiawei Ye, Jie Wu et al.
Atomic Consistency Preference Optimization for Long-Form Question Answering
Jingfeng Chen, Raghuveer Thirukovalluru, Junlin Wang et al.
MAPO: Advancing Multilingual Reasoning through Multilingual-Alignment-as-Preference Optimization
Shuaijie She, Wei Zou, Shujian Huang et al.
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
Tianduo Wang, Shichen Li, Wei Lu
Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game
Pengyu Cheng, Yifan Yang, Jian Li et al.
Disentangling Length from Quality in Direct Preference Optimization
Ryan Park, Rafael Rafailov, Stefano Ermon et al.
Direct Preference Optimization with an Offset
Afra Amini, Tim Vieira, Ryan Cotterell
Fine-grained Video Dubbing Duration Alignment with Segment Supervised Preference Optimization
Chaoqun Cui, Liangbin Huang, Shijing Wang et al.
RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation
Shi-Qi Yan, Quan Liu, Zhen-Hua Ling
SDPO: Segment-Level Direct Preference Optimization for Social Agents
Aobo Kong, Wentao Ma, Shiwan Zhao et al.
Enhancing Safe and Controllable Protein Generation via Knowledge Preference Optimization
Yuhao Wang, Keyan Ding, Kehua Feng et al.
DiffPO: Diffusion-styled Preference Optimization for Inference Time Alignment of Large Language Models
Ruizhe Chen, Wenhao Chai, Zhifei Yang et al.
AutoMixAlign: Adaptive Data Mixing for Multi-Task Preference Optimization in LLMs
Nicholas E. Corrado, Julian Katz-Samuels, Adithya M Devraj et al.
Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL
Hanbing Liu, Haoyang Li, Xiaokang Zhang et al.
Optimal Transport-Based Token Weighting scheme for Enhanced Preference Optimization
Meng Li, Guangda Huzhang, Haibo Zhang et al.