Research Explorer
Evaluation
345 directly classified papers
Papers per year
2014: 1
2016: 3
2017: 1
2018: 9
2019: 21
2020: 34
2021: 32
2022: 50
2023: 28
2024: 90
2025: 76
Papers
Probing Across Time: What Does RoBERTa Know and When?
EMNLP 2021
Discretized Integrated Gradients for Explaining Language Models
EMNLP 2021
What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think
EMNLP 2021
Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation
EMNLP 2021
We Need to Talk About train-dev-test Splits
EMNLP 2021
TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation
EMNLP 2021
On the Limits of Minimal Pairs in Contrastive Evaluation
EMNLP 2021
How Does BERT Rerank Passages? An Attribution Analysis with Information Bottlenecks
EMNLP 2021
Controlled tasks for model analysis: Retrieving discrete information from sequences
EMNLP 2021
HateCheck: Functional Tests for Hate Speech Detection Models
ACL 2021
When Do You Need Billions of Words of Pretraining Data?
ACL 2021
Lower Perplexity is Not Always Human-Like
ACL 2021
Dissecting Generation Modes for Abstractive Summarization Models via Ablation and Attribution
ACL 2021
Are VQA Systems RAD? Measuring Robustness to Augmented Data with Focused Interventions
ACL 2021
Not all parameters are born equal: Attention is mostly what you need
EMNLP 2021
Not All Models Localize Linguistic Knowledge in the Same Place: A Layer-wise Probing on BERToids’ Representations
EMNLP 2021
A Novel Visual Interpretability for Deep Neural Networks by Optimizing Activation Maps with Perturbation
AAAI 2021
Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models
NeurIPS 2021
CIDEr-R: Robust Consensus-based Image Description Evaluation
EMNLP 2021
DELA Corpus - A Document-Level Corpus Annotated with Context-Related Issues
EMNLP 2021
To Ship or Not to Ship: An Extensive Evaluation of Automatic Metrics for Machine Translation
EMNLP 2021
The CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis Resolution in Dialogue: A Cross-Team Analysis
EMNLP 2021
Anaphora Resolution in Dialogue: Cross-Team Analysis of the DFKI-TalkingRobots Team Submissions for the CODI-CRAC 2021 Shared-Task
EMNLP 2021
Exploratory Model Analysis Using Data-Driven Neuron Representations
EMNLP 2021
On the Difficulty of Membership Inference Attacks
CVPR 2021