Papers
Indirect Online Preference Optimization via Reinforcement Learning
En Wang, Xingyu Lin, Du Su et al.
Atomic Consistency Preference Optimization for Long-Form Question Answering
Jingfeng Chen, Raghuveer Thirukovalluru, Junlin Wang et al.
CAPO: Confidence Aware Preference Optimization Learning for Multilingual Preferences
Rhitabrat Pokharel, Yufei Tao, Ameeta Agrawal
NHK Submission to WAT 2025: Leveraging Preference Optimization for Article-level Japanese–English News Translation
Hideya Mino, Rei Endo, Yoshihiko Kawai
Direct Preference Optimization for Neural Machine Translation with Minimum Bayes Risk Decoding
Guangyu Yang, Jinghong Chen, Weizhe Lin et al.
RS-DPO: A Hybrid Rejection Sampling and Direct Preference Optimization Method for Alignment of Large Language Models
Saeed Khaki, JinJin Li, Lan Ma et al.
Improving Socratic Question Generation using Data Augmentation and Preference Optimization
Nischal Ashok Kumar, Andrew Lan
Team NP_PROBLEM at SemEval-2024 Task 7: Numerical Reasoning in Headline Generation with Preference Optimization
Pawan Rajpoot, Nut Chukamphaeng
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward
Ruohong Zhang, Liangke Gui, Zhiqing Sun et al.
LiPO: Listwise Preference Optimization through Learning-to-Rank
Tianqi Liu, Zhen Qin, Junru Wu et al.
Style Transfer with Multi-iteration Preference Optimization
Shuai Liu, Jonathan May
Mitigating Hallucinated Translations in Large Language Models with Hallucination-focused Preference Optimization
Zilu Tang, Rajen Chatterjee, Sarthak Garg
BPO: Towards Balanced Preference Optimization between Knowledge Breadth and Depth in Alignment
Sizhe Wang, Yongqi Tong, Hengyuan Zhang et al.
PA-RAG: RAG Alignment via Multi-Perspective Preference Optimization
Jiayi Wu, Hengyi Cai, Lingyong Yan et al.
PORT: Preference Optimization on Reasoning Traces
Salem Lahlou, Abdalgader Abubaker, Hakim Hacid
Sequence-level Large Language Model Training with Contrastive Preference Optimization
Zhili Feng, Dhananjay Ram, Cole Hawkins et al.
Understanding Reference Policies in Direct Preference Optimization
Yixin Liu, Pengfei Liu, Arman Cohan
2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision
Shilong Li, Yancheng He, Hui Huang et al.
Sakura at SemEval-2025 Task 2: Enhancing Named Entity Translation with Fine-Tuning and Preference Optimization
Alberto Poncelas, Ohnmar Htun
Dataground at SemEval-2025 Task 8: Small LLMs and Preference Optimization for Tabular QA
Giuseppe Attardi, Andrea Nelson Mauro, Daniele Sartiano
Atyaephyra at SemEval-2025 Task 4: Low-Rank Negative Preference Optimization
Jan Bronec, Jindřich Helcl
Align Video Diffusion Model with Online Video-Centric Preference Optimization
Jiacheng Zhang, Jie Wu, Weifeng Chen et al.
Offline Preference Optimization via Maximum Marginal Likelihood Estimation
Saeed Najafi, Alona Fyshe
Joint Multimodal Preference Optimization for Fine-Grained Visual-Textual Alignment
Jiwon Kim, Hyunsoo Yoon