Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis

Amey Hengle; Atharva Kulkarni; Shantanu Deepak Patankar; Madhumitha Chandrasekaran; Sneha D’silva; Jemima S. Jacob; Rashmi Gupta

EMNLP 2024

Abstract

In this study, we introduce ANGST, a novel, first-of-its-kind benchmark for depression-anxiety comorbidity classification from social media posts. Unlike contemporary datasets, which often oversimplify the intricate interplay between mental health disorders by treating them as isolated conditions, ANGST enables multi-label classification, allowing each post to be identified as indicating depression, anxiety, or both. Comprising 2,876 posts meticulously annotated by expert psychologists and an additional 7,667 silver-labeled posts, ANGST provides a more representative sample of online mental health discourse. Moreover, we benchmark ANGST using various state-of-the-art language models, ranging from Mental-BERT to GPT-4. Our results provide significant insights into the capabilities and limitations of these models in complex diagnostic scenarios. While GPT-4 generally outperforms other models, none achieves an F1 score exceeding 72% in multi-class comorbid classification, underscoring the ongoing challenges in applying language models to mental health diagnostics.
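To make the multi-label setup concrete, the following is a minimal sketch of how posts carrying depression and/or anxiety labels could be encoded and scored with per-label and macro-averaged F1. It is not the authors' evaluation code: the label names, the helper functions `to_indicator`, `f1_per_label`, and `macro_f1`, and the toy gold and predicted labels are all assumptions made for illustration, and the abstract does not state which F1 averaging scheme the paper uses.

```python
from typing import Dict, List, Set

# Hypothetical label names; the benchmark's actual label strings may differ.
LABELS = ("depression", "anxiety")


def to_indicator(labels: Set[str]) -> List[int]:
    """Encode a set of labels as a 0/1 vector, one slot per entry in LABELS."""
    return [1 if name in labels else 0 for name in LABELS]


def f1_per_label(gold: List[List[int]], pred: List[List[int]]) -> Dict[str, float]:
    """Compute F1 independently for each label column."""
    scores = {}
    for i, name in enumerate(LABELS):
        tp = sum(1 for g, p in zip(gold, pred) if g[i] == 1 and p[i] == 1)
        fp = sum(1 for g, p in zip(gold, pred) if g[i] == 0 and p[i] == 1)
        fn = sum(1 for g, p in zip(gold, pred) if g[i] == 1 and p[i] == 0)
        denom = 2 * tp + fp + fn
        # Define F1 as 0.0 when a label never appears in gold or predictions.
        scores[name] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold: List[List[int]], pred: List[List[int]]) -> float:
    """Unweighted mean of the per-label F1 scores."""
    scores = f1_per_label(gold, pred)
    return sum(scores.values()) / len(scores)


# Toy data (invented): four posts, including one comorbid and one with neither label.
gold = [to_indicator(s) for s in
        [{"depression"}, {"anxiety"}, {"depression", "anxiety"}, set()]]
pred = [to_indicator(s) for s in
        [{"depression"}, {"depression", "anxiety"}, {"depression"}, set()]]

print(f1_per_label(gold, pred))
print(macro_f1(gold, pred))
```

Because each post is a vector rather than a single class, a comorbid post can be partially correct: in the toy data, the third post is credited for depression but counted as a miss for anxiety.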

Topics

Machine Learning > Core Methods > Classification

Natural Language Processing > Applications > Text Classification

Healthcare & Medicine > Clinical > Clinical NLP

Healthcare & Medicine > Clinical > Mental Health

Natural Language Processing > Applications > Natural Language Understanding

Keywords

benchmark evaluation, multi-label classification, clinical NLP, large language model, mental health diagnosis, depression anxiety classification, comorbid classification, depression anxiety comorbidity
