AAAI 2019

Find Me if You Can: Deep Software Clone Detection by Exploiting the Contest between the Plagiarist and the Detector

Abstract

Code clones are common in software development and often lead to software defects or copyright infringement. Researchers have paid significant attention to code clone detection, and many methods have been proposed. However, the patterns by which code clones are generated do not always remain the same. To fool clone detection systems, plagiarists, also known as clone creators, usually apply a series of tricky modifications to code fragments to make the clones difficult to detect. Existing clone detection approaches neglect the dynamics of the “contest” between the plagiarist and the detector, and are therefore not robust to adversarial revision of the code. In this paper, we propose a novel clone detection approach, namely ACD, that mimics the adversarial process between the plagiarist and the detector. This enables us not only to build a strong clone detector but also to model the behavior of plagiarists. Such a plagiarist model may in turn help to understand the vulnerabilities of current software clone detection tools. Experiments show that the learned plagiarist policy helps us build a stronger clone detector, which outperforms existing clone detection methods.


Authors