ICML 2024

What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks

The Questioner