ICML 2024

Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective

The Questioner