Papers

Geometric-Averaged Preference Optimization for Soft Preference Labels
Hiroki Furuta, Kuang-Huei Lee, Shixiang Shane Gu et al.
2024 NIPS
No Preference Left Behind: Group Distributional Preference Optimization
Binwei Yao, Zefan Cai, Yun-Shiuan Chuang et al.
2025 ICLR
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov, Archit Sharma, Eric Mitchell et al.
2023 NIPS
On Softmax Direct Preference Optimization for Recommendation
Yuxin Chen, Junfei Tan, An Zhang et al.
2024 NIPS
Group Robust Preference Optimization in Reward-free RLHF
Shyam Sundhar Ramesh, Yifan Hu, Iason Chaimalas et al.
2024 NIPS
Iterative Reasoning Preference Optimization
Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho et al.
2024 NIPS
$\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
Junkang Wu, Yuexiang Xie, Zhengyi Yang et al.
2024 NIPS
2025 AAAI