Evaluation
1654 directly classified papers
Papers per year
2005: 1
2006: 1
2007: 1
2008: 2
2009: 1
2010: 3
2011: 2
2012: 3
2013: 5
2014: 4
2015: 4
2016: 11
2017: 19
2018: 32
2019: 39
2020: 72
2021: 110
2022: 202
2023: 222
2024: 351
2025: 569
Papers
Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation
EMNLP 2018
Deep learning evaluation using deep linguistic processing
NAACL 2018
Annotation Artifacts in Natural Language Inference Data
NAACL 2018
Behavior Analysis of NLI Models: Uncovering the Influence of Three Factors on Robustness
NAACL 2018
The Importance of Calibration for Estimating Proportions from Annotations
NAACL 2018
Whistle-blowing ASRs: Evaluating the Need for More Inclusive Speech Recognition Systems
INTERSPEECH 2018
A Boo(n) for Evaluating Architecture Performance
ICML 2018
Sentence-Level Fluency Evaluation: References Help, But Can Be Spared!
CONLL 2018
Taylor’s law for Human Linguistic Sequences
ACL 2018
The price of debiasing automatic metrics in natural language evaluation
ACL 2018
Inherent Biases in Reference-based Evaluation for Grammatical Error Correction
ACL 2018
Improving Text-to-SQL Evaluation Methodology
ACL 2018
Efficient Online Scalar Annotation with Bounded Support
ACL 2018
Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task
ACL 2018
Locally Private Hypothesis Testing
ICML 2018
Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation
NIPS 2018
In Search of Coherence and Consensus: Measuring the Interpretability of Statistical Topics
JMLR 2018
Neural Quality Estimation of Grammatical Error Correction
EMNLP 2018
On the Impact of Various Types of Noise on Neural Machine Translation
ACL 2018
Exploring gap filling as a cheaper alternative to reading comprehension questionnaires when evaluating machine translation for gisting
EMNLP 2018
A Methodology for Evaluating Interaction Strategies of Task-Oriented Conversational Agents
EMNLP 2018
Practical Methods for Graph Two-Sample Testing
NIPS 2018
Online control of the false discovery rate with decaying memory
NIPS 2017
On Optimal Generalizability in Parametric Learning
NIPS 2017
Finite Sample Analysis of the GTD Policy Evaluation Algorithms in Markov Setting
NIPS 2017