TelBench: A Benchmark for Evaluating Telco-Specific Large Language Models

Sunwoo Lee; Dhammiko Arya; Seung-Mo Cho; Gyoung-eun Han; Seokyoung Hong; Wonbeom Jang; Seojin Lee; Sohee Park; Sereimony Sek; Injee Song; Sungbin Yoon; Eric Davis

EMNLP 2024

Abstract

The telecommunications industry, characterized by its vast customer base and complex service offerings, necessitates a high level of domain expertise and proficiency in customer service center operations. Consequently, there is a growing demand for Large Language Models (LLMs) to augment the capabilities of customer service representatives. This paper introduces a methodology for developing a specialized Telecommunications LLM (Telco LLM) designed to enhance the efficiency of customer service agents and promote consistency in service quality across representatives. We present the construction process of TelBench, a novel dataset created to evaluate customer service expertise in the telecommunications domain. We also evaluate various LLMs and demonstrate that both proprietary and open-source LLMs can be benchmarked on predefined telecommunications-related tasks, thereby establishing metrics that define telecommunications performance.

Topics

Natural Language Processing > Applications > Question Answering; Natural Language Processing > Resources & Methods > Large Language Models; Artificial Intelligence > Core AI > Large Language Models; Machine Learning > Learning Types > Domain Adaptation

Keywords

benchmark evaluation; domain adaptation; question answering; customer service; domain-specific language model; large language model
