CReSE: Benchmark Data and Automatic Evaluation Framework for Recommending Eligibility Criteria from Clinical Trial Information

Siun Kim; Jung-Hyun Won; David Lee; Renqian Luo; Lijun Wu; Tao Qin; Howard Lee

EACL 2024


Abstract

Eligibility criteria (EC) are the set of conditions an individual must meet to participate in a clinical trial; they define the study population and minimize potential risks to patients. Previous research in clinical trial design has focused primarily on searching for similar trials and generating EC within manual instructions, and has relied on similarity-based performance metrics, which may not fully reflect human judgment. In this study, we propose a novel task of recommending EC based on clinical trial information, including trial titles, and introduce an automatic evaluation framework to assess the clinical validity of EC recommendation models. Our new approach, known as CReSE (Contrastive learning and Rephrasing-based and Clinical Relevance-preserving Sentence Embedding), represents EC through contrastive learning and rephrasing via large language models (LLMs). The CReSE model outperforms existing language models pre-trained on the biomedical domain in EC clustering. Additionally, we have curated a benchmark dataset comprising 3.2M high-quality EC-title pairs extracted from 270K clinical trials available on ClinicalTrials.gov. The EC recommendation models achieve 49.0% precision@1 and 44.2% MAP@5 on our evaluation framework. We expect that our evaluation framework built on the CReSE model will contribute significantly to the development and assessment of EC recommendation models in terms of clinical validity.
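For readers unfamiliar with the two ranking metrics quoted above, the following is a minimal sketch of how precision@k and MAP@k are commonly computed. It is not the paper's evaluation code. It assumes each recommended EC has already been given a binary relevance label (in the paper this judgment comes from the CReSE-based framework, which is not reproduced here), and it uses one common convention for average precision, normalizing by the number of relevant items found in the top k. The function names and the toy labels are hypothetical.

```python
from typing import Sequence


def precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Fraction of the top-k recommendations that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(bool(r) for r in relevance[:k]) / k


def average_precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Average of precision@i over the ranks i <= k that hold a relevant item.

    Assumption: normalizes by the number of relevant items found in the
    top k (other conventions divide by min(k, total relevant)).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_over_trials(scores: Sequence[float]) -> float:
    """Mean of a per-trial metric across all evaluated trials."""
    if not scores:
        raise ValueError("need at least one trial")
    return sum(scores) / len(scores)


# Toy, made-up relevance labels for the ranked EC lists of two trials.
toy_trials = [
    [True, False, True, False, False],
    [False, False, False, True, False],
]

p_at_1 = mean_over_trials([precision_at_k(t, 1) for t in toy_trials])
map_at_5 = mean_over_trials([average_precision_at_k(t, 5) for t in toy_trials])
print(f"precision@1 = {p_at_1:.3f}, MAP@5 = {map_at_5:.3f}")
```

Under this convention, precision@1 only asks whether the single top-ranked EC is relevant, while MAP@5 also rewards placing relevant EC earlier within the top five.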



Topics

Deep Learning > Architectures > Transformers
Natural Language Processing > Applications > Information Extraction
Healthcare & Medicine > Clinical > Clinical NLP
Data Science & Analytics > Applications > Information Retrieval
Machine Learning > Application Areas > Recommender Systems

Keywords

contrastive learning; biomedical domain; recommendation system; clinical trial; sentence embedding; clinical eligibility criterion; large language model; eligibility criterion

