TeleChat: An Open-source Bilingual Large Language Model

Zihan Wang; XinZhang Liu; Shixuan Liu; Yitong Yao; Yunyao Huang; Mengxiang Li; Zhongjiang He; Yongxian Li; Luwen Pu; Huinan Xu; Chao Wang; Shuangyong Song

ACL 2024

Abstract

In this paper, we present TeleChat, a collection of large language models (LLMs) with 7 billion and 12 billion parameters. TeleChat is first pretrained on an extensive corpus of diverse English and Chinese texts comprising trillions of tokens. The model is then fine-tuned to align with human preferences, following a detailed methodology that we describe. We evaluate TeleChat on a variety of tasks, including general dialogue generation, language understanding, mathematics, reasoning, code generation, and knowledge-based question answering. Our findings indicate that TeleChat achieves state-of-the-art performance compared with other open-source models of similar size across a wide range of public benchmarks. To support future research and applications that use LLMs, we publicly release the fine-tuned model checkpoints of TeleChat-7B and TeleChat-12B, along with code and a portion of our filtered high-quality pretraining data.

Topics

Deep Learning > Architectures > Transformers

Natural Language Processing > Resources & Methods > Large Language Models

Artificial Intelligence > Core AI > Large Language Models

Deep Learning > Models > Large Language Models

Deep Learning > Learning Types > Fine-Tuning

Keywords

code generation, bilingual model, large language model

Related papers

Reinforcement Learning-Driven LLM Agent for Automated Attacks on LLMs 2024

EtymoLink: A Structured English Etymology Dataset 2024

Turkish Delights: A Dataset on Turkish Euphemisms 2024

Subjectivity Detection in English News using Large Language Models 2024

Does DetectGPT Fully Utilize Perturbation? Bridging Selective Perturbation to Fine-tuned Contrastive Learning Detector would be Better 2024