ICML 2025

Larger or Smaller Reward Margins to Select Preferences for LLM Alignment?

The Questioner