ICML 2025

PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model