ICML 2024

Position: Do pretrained Transformers Learn In-Context by Gradient Descent?

The Questioner