How do we get there? Evaluating transformer neural networks as cognitive models for English past tense inflection

Xiaomeng Ma; Lingyu Gao

2022 IJCNLP IJCNLP 2022

How do we get there? Evaluating transformer neural networks as cognitive models for English past tense inflection

Abstract

AbstractThere is an ongoing debate of whether neural network can grasp the quasi-regularities in languages like humans. In a typical quasi-regularity task, English past tense inflections, the neural network model has long been criticized that it learns only to generalize the most frequent pattern, but not the regular pattern, thus can not learn the abstract categories of regular and irregular and is dissimilar to human performance. In this work, we train a set of transformer models with different settings to examine their behavior on this task. The models achieved high accuracy on unseen regular verbs and some accuracy on unseen irregular verbs. The models’ performance on the regulars are heavily affected by type frequency and ratio but not token frequency and ratio, and vice versa for the irregulars. The different behaviors on the regulars and irregulars suggest that the models have some degree of symbolic learning on the regularity of the verbs. In addition, the models are weakly correlated with human behavior on nonce verbs. Although the transformer model exhibits some level of learning on the abstract category of verb regularity, its performance does not fit human data well suggesting that it might not be a good cognitive model.

❓ The Questioner

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Interdisciplinary and Machine Learning and Natural Language Processing

🐣 Hot Topic Early Bird — cognitive modeling

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Xiaomeng Ma , Lingyu Gao

Topics

Artificial Intelligence > Core AI > Interpretability Machine Learning > Optimization & Theory > Learning Theory Deep Learning > Architectures > Transformers Natural Language Processing > Generation > Language Modeling Interdisciplinary > Cognitive Science > Cognitive Modeling

Keywords

cognitive modeling human behavior transformer neural network cognitive model past tense inflection transformer model symbolic learning

Download PDF

Related papers

Chasing the Tail with Domain Generalization: A Case Study on Frequency-Enriched Datasets 2022

Double Trouble: How to not Explain a Text Classifier’s Decisions Using Counterfactuals Synthesized by Masked Language Models? 2022

Leveraging Key Information Modeling to Improve Less-Data Constrained News Headline Generation via Duality Fine-Tuning 2022

Graph-augmented Learning to Rank for Querying Large-scale Knowledge Graph 2022

Missing Modality meets Meta Sampling (M3S): An Efficient Universal Approach for Multimodal Sentiment Analysis with Missing Modality 2022