Can You Answer This? – Exploring Zero-Shot QA Generalization Capabilities in Large Language Models (Student Abstract)

Saptarshi Sengupta; Shreya Ghosh; Preslav Nakov; Prasenjit Mitra

2023 AAAI AAAI 2023

Can You Answer This? – Exploring Zero-Shot QA Generalization Capabilities in Large Language Models (Student Abstract)

Abstract

Abstract The buzz around Transformer-based language models (TLM) such as BERT, RoBERTa, etc. is well-founded owing to their impressive results on an array of tasks. However, when applied to areas needing specialized knowledge (closed-domain), such as medical, finance, etc. their performance takes drastic hits, sometimes more than their older recurrent/convolutional counterparts. In this paper, we explore zero-shot capabilities of large LMs for extractive QA. Our objective is to examine performance change in the face of domain drift i.e. when the target domain data is vastly different in semantic and statistical properties from the source domain and attempt to explain the subsequent behavior. To this end, we present two studies in this paper while planning further experiments later down the road. Our findings indicate flaws in the current generation of TLM limiting their performance on closed-domain tasks.

❓ The Questioner

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Saptarshi Sengupta , Shreya Ghosh , Preslav Nakov , Prasenjit Mitra

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning Machine Learning > Learning Types > Zero-Shot Learning Natural Language Processing > Applications > Question Answering

Keywords

zero-shot learning domain generalization transfer learning question answering large language model

Download PDF

Related papers

A Model-Agnostic Heuristics for Selective Classification 2023

Tackling Safe and Efficient Multi-Agent Reinforcement Learning via Dynamic Shielding (Student Abstract) 2023

Head-Free Lightweight Semantic Segmentation with Linear Transformer 2023

Hierarchical ConViT with Attention-Based Relational Reasoner for Visual Analogical Reasoning 2023

Deep Spiking Neural Networks with High Representation Similarity Model Visual Pathways of Macaque and Mouse 2023