Nora Belrose
5 papers · 2023–2025 · 3 conferences · across top CS/AI conferences
Achievements
🌍 Conference Polyglot (3)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐝 Cross-Pollinator (15)
❓ The Questioner
Conferences
ICML (3)
AAAI (1)
NIPS (1)
Keywords
representation learning (1)
game playing (1)
adversarial attack (1)
linear classifier (1)
recurrent neural network (1)
language model (1)
zero-shot transfer (1)
activation manipulation (1)
concept erasure (1)
bias reduction (1)
interpretability method (1)
model steering (1)
transformer model (1)
adversarial policies (1)
agent vulnerability (1)
activation addition (1)