ICML 2024
Human Alignment of Large Language Models through Online Preference Optimisation
Authors
Daniele Calandriello, Zhaohan Daniel Guo, Rémi Munos, Mark Rowland, Yunhao Tang, Bernardo Avila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot