Papers
Refining Text Generation for Realistic Conversational Recommendation via Direct Preference Optimization
Manato Tajiri, Michimasa Inaba
Image Difference Captioning via Adversarial Preference Optimization
Zihan Huang, Junda Wu, Rohan Surana et al.
Learning to Translate Ambiguous Terminology by Preference Optimization on Post-Edits
Nathaniel Berger, Johannes Eschbach-Dymanus, Miriam Exel et al.
Auto-Weighted Group Relative Preference Optimization for Multi-Objective Text Generation Tasks
Yuki Ichihara, Yuu Jinnai
DCRM: A Heuristic to Measure Response Pair Quality in Preference Optimization
Chengyu Huang, Tanya Goyal
SPO: Self Preference Optimization with Self Regularization
Yuhao Sun, Yifan Zhang, Quandong Wang et al.
Creative Preference Optimization
Mete Ismayilzada, Antonio Laverghetta Jr., Simone A. Luchini et al.
ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization
Zhensheng Jin, Xinze Li, Yifan Ji et al.
Captioning for Text-Video Retrieval via Dual-Group Direct Preference Optimization
Ji Soo Lee, Byungoh Ko, Jaewon Cho et al.
SeaPO: Strategic Error Amplification for Robust Preference Optimization of Large Language Models
Jun Rao, Yunjie Liao, Xuebo Liu et al.
MidPO: Dual Preference Optimization for Safety and Helpfulness in Large Language Models via a Mixture of Experts Framework
Yupeng Qi, Ziyu Lyu, Min Yang et al.
Adaptive Preference Optimization with Uncertainty-aware Utility Anchor
Xiaobo Wang, Zixia Jia, Jiaqi Li et al.
Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation
Jihao Gu, Yingyao Wang, Meng Cao et al.
CoTD-PO: Chain-of-Thought Distillation with Preference Optimization
Lujie Niu, Haochen Sun, Fangkun Zhao et al.
DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization
Chao Zhang, Xin Shi, Xueqiao Zhang et al.
Perspective-driven Preference Optimization with Entropy Maximization for Diverse Argument Generation
Yilin Cao, Ruike Zhang, Penghui Wei et al.
Instruction-Tuned English to Bhojpuri Neural Machine Translation Using Contrastive Preference Optimization
Kshetrimayum Boynao Singh, Deepak Kumar, Asif Ekbal
MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization
Hengjia Li, Lifan Jiang, Xi Xiao et al.
Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization
Kesen Zhao, Beier Zhu, Qianru Sun et al.
Scalable Ranked Preference Optimization for Text-to-Image Generation
Shyamgopal Karthik, Huseyin Coskun, Zeynep Akata et al.
Group Preference Optimization: Few-Shot Alignment of Large Language Models
Siyan Zhao, John Dang, Aditya Grover
Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints
Chaoqi Wang, Yibo Jiang, Chenghao Yang et al.
Statistical Rejection Sampling Improves Preference Optimization
Tianqi Liu, Yao Zhao, Rishabh Joshi et al.
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Shicong Cen, Jincheng Mei, Katayoon Goshvadi et al.
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization
Audrey Huang, Wenhao Zhan, Tengyang Xie et al.