IJCNLP 2021

MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers