Amelia Glaese

5 papers · 2021–2025 · 3 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (3) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (12) 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🐝 Cross-Pollinator (15)

Conferences

EMNLP (2) NIPS (2) ICML (1)

Top co-authors

Johannes Welbl (2) Sumanth Dathathri (2) John Mellor (2) Jonathan Uesato (2) Po-Sen Huang (2) Geoffrey Irving (2) Lisa Anne Hendricks (2) John Aslanides (2) Nat McAleese (2) Roman Ring (1)

Keywords

large language model (3) harmful content (2) toxicity detection (2) language model (2) responsible ai (1) bias mitigation (1) reward model (1) safety evaluation (1) red teaming (1) harmful content detection (1) human preference (1) adversarial testing (1) automatic evaluation (1) model fairness (1) offensive content detection (1) model bias (1) toxicity mitigation (1) reinforcement learning (1) offensive content (1) prompt engineering (1)

Papers

PaperBench: Evaluating AI’s Ability to Replicate AI Research ICML 2025

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models NIPS 2022

Fine-tuning language models to find agreement among humans with diverse preferences NIPS 2022

Red Teaming Language Models with Language Models EMNLP 2022

Challenges in Detoxifying Language Models EMNLP 2021