2025 ICML ICML 2025

MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking