Learning to Define Terms in the Software Domain

Vidhisha Balachandran; Dheeraj Rajagopal; Rose Catherine Kanjirathinkal; William Cohen

EMNLP 2018

Abstract

One way to test a person’s knowledge of a domain is to ask them to define domain-specific terms. Here, we investigate the task of automatically generating definitions of technical terms by reading text from the technical domain. Specifically, we learn definitions of software entities from a large corpus built from the user forum Stack Overflow. To model definitions, we train a language model and incorporate additional domain-specific information such as word co-occurrence and ontological category information. Our approach improves over previous baselines by 2 BLEU points on the definition generation task. Our experiments also reveal additional challenges associated with the task and the shortcomings of language-model-based architectures for definition generation.

Topics

Natural Language Processing > Generation > Text Generation
Natural Language Processing > Resources & Methods > Lexical Semantics

Keywords

language model, definition generation, software domain, technical term
