Research Explorer
Evaluation
10 directly classified papers
Papers per year
2021: 2
2022: 2
2023: 1
2024: 3
2025: 2
Papers
Evaluating Text Style Transfer Evaluation: Are There Any Reliable Metrics?
NAACL 2025
Towards a Principled Evaluation of Knowledge Editors
ACL 2025
ToMBench: Benchmarking Theory of Mind in Large Language Models
ACL 2024
BenchIE^FL: A Manually Re-Annotated Fact-Based Open Information Extraction Benchmark
ACL 2024
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits
ACL 2024
The Devil is in the Details: On the Pitfalls of Event Extraction Evaluation
ACL 2023
ADBench: Anomaly Detection Benchmark
NeurIPS 2022
BenchIE: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation
ACL 2022
TabPert: An Effective Platform for Tabular Perturbation
EMNLP 2021
Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach
EMNLP 2021