Evaluation

1654 directly classified papers

Papers

A Meta-Analysis of Overfitting in Machine Learning NeurIPS 2019

Evaluating Question Answering Evaluation EMNLP 2019

A Closer Look at Data Bias in Neural Extractive Summarization Models EMNLP 2019

Evaluating Research Novelty Detection: Counterfactual Approaches EMNLP 2019

Findings of the WMT 2019 Shared Tasks on Quality Estimation ACL 2019

The MuCoW Test Suite at WMT 2019: Automatically Harvested Multilingual Contrastive Word Sense Disambiguation Test Sets for Machine Translation ACL 2019

Narrative Generation in the Wild: Methods from NaNoGenMo ACL 2019

Evaluating Automatic Term Extraction Methods on Individual Documents ACL 2019

Confirming the Non-compositionality of Idioms for Sentiment Analysis ACL 2019

Explaining Simple Natural Language Inference ACL 2019

Are Red Roses Red? Evaluating Consistency of Question-Answering Models ACL 2019

Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets ACL 2019

Aiming beyond the Obvious: Identifying Non-Obvious Cases in Semantic Similarity Datasets ACL 2019

Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study ACL 2019

Interpretable Predictive Modeling for Climate Variables with Weighted Lasso AAAI 2019

On the Efficiency of Data Collection for Crowdsourced Classification IJCAI 2018

Breaking NLI Systems with Sentences that Require Simple Lexical Inferences ACL 2018

Tackling the Story Ending Biases in The Story Cloze Test ACL 2018

Intersection-Validation: A Method for Evaluating Structure Learning without Ground Truth AISTATS 2018

MeSH-based dataset for measuring the relevance of text retrieval ACL 2018

Towards a Better Metric for Evaluating Question Generation Systems EMNLP 2018

Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation EMNLP 2018

Semantic Structural Evaluation for Text Simplification NAACL 2018

Lessons from the Bible on Modern Topics: Low-Resource Multilingual Topic Model Evaluation NAACL 2018

Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research JMLR 2018