2017 IJCAI IJCAI 2017

Enhancing the Unified Features to Locate Buggy Files by Exploiting the Sequential Nature of Source Code

Abstract

Bug reports provide an effective way for end-users to disclose potential bugs hidden in a software system, while automatically locating the potential buggy source files according to a bug report remains a great challenge in software maintenance. Many previous approaches represent bug reports and source code from lexical and structural information correlated their relevance by measuring their similarity, and recently a CNN-based model is proposed to learn the unified features for bug localization, which overcomes the difficulty in modeling natural and programming languages with different structural semantics. However, previous studies fail to capture the sequential nature of source code, which carries additional semantics beyond the lexical and structural terms and such information is vital in modeling program functionalities and behaviors. In this paper, we propose a novel model LS-CNN, which enhances the unified features by exploiting the sequential nature of source code. LS-CNN combines CNN and LSTM to extract semantic features for automatically identifying potential buggy source code according to a bug report. Experimental results on widely-used software projects indicate that LS-CNN significantly outperforms the state-of-the-art methods in locating buggy files.

🌉 Interdisciplinary Bridge — Computer Science and Machine Learning
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors