Chain-of-Thought Tuning: Masked Language Models can also Think Step By Step in Natural Language Understanding

Caoyun Fan; Jidong Tian; Yitian Li; Wenqing Chen; Hao He; Yaohui Jin

2023 EMNLP EMNLP 2023

Chain-of-Thought Tuning: Masked Language Models can also Think Step By Step in Natural Language Understanding

Abstract

AbstractChain-of-Thought (CoT) is a technique that guides Large Language Models (LLMs) to decompose complex tasks into multi-step reasoning through intermediate steps in natural language form. Briefly, CoT enables LLMs to think step by step. However, although many Natural Language Understanding (NLU) tasks also require thinking step by step, LLMs perform less well than small-scale Masked Language Models (MLMs). To migrate CoT from LLMs to MLMs, we propose Chain-of-Thought Tuning (CoTT), a two-step reasoning framework based on prompt tuning, to implement step-by-step thinking for MLMs on NLU tasks. From the perspective of CoT, CoTT’s two-step framework enables MLMs to implement task decomposition; CoTT’s prompt tuning allows intermediate steps to be used in natural language form. Thereby, the success of CoT can be extended to NLU tasks through MLMs. To verify the effectiveness of CoTT, we conduct experiments on two NLU tasks: hierarchical classification and relation extraction, and the results show that CoTT outperforms baselines and achieves state-of-the-art performance.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Natural Language Processing

🧭 Keyword Pioneer — chain-of-thought tuning

🐣 Hot Topic Early Bird — task decomposition

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Caoyun Fan , Jidong Tian , Yitian Li , Wenqing Chen , Hao He , Yaohui Jin

Topics

Artificial Intelligence > Core AI > Foundation Models Natural Language Processing > Understanding > Semantic Analysis

Keywords

hierarchical classification relation extraction masked language model task decomposition prompt tuning chain-of-thought tuning

Download PDF

Related papers

Exploring Linguistic Probes for Morphological Generalization 2023

NameGuess: Column Name Expansion for Tabular Data 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning 2023

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation 2023

On the Calibration of Large Language Models and Alignment 2023