Detecting Accounting Frauds in Publicly Traded U.S. Firms: A Machine Learning Approach

Bin Li; Julia Yu; Jie Zhang; Bin Ke

ACML 2015

Abstract

This paper studies how machine learning techniques can facilitate the detection of accounting fraud in publicly traded US firms. Existing studies often mimic human experts and use financial or nonfinancial ratios as the features for their systems. We depart from these studies by adopting raw accounting variables, which are directly available from a firm’s financial statements and can therefore be applied to new firms at low cost. Further, we collected the most complete fraud dataset of US publicly traded firms and labeled the fraud and non-fraud firm-years. One key issue with the dataset is that it is extremely imbalanced: fraud firm-years often account for less than one percent of the observations. Rather than re-sampling the data, we propose to tackle the imbalance with techniques from imbalanced learning. In particular, we employ the linear and nonlinear Biased Penalty Support Vector Machine and Ensemble Methods, both of which have been shown in the machine learning literature to handle class imbalance successfully. Finally, we evaluate our approach through extensive empirical studies. The results show that the proposed scheme achieves much better performance, in terms of balanced accuracy, than the state of the art. Beyond predictive performance, our approaches are also computationally fast, which further supports their practical deployment.
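To make the abstract's two technical ideas concrete, here is a minimal sketch, not the authors' implementation. It assumes labels of +1 (fraud) and -1 (non-fraud), a common cost-sensitive form of the linear SVM objective, and hypothetical cost values `c_pos` and `c_neg`; the toy data is invented for illustration. It shows why plain accuracy is misleading when fraud is about one percent of firm-years, and how a biased penalty weights errors on the rare class more heavily without re-sampling.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of firm-years classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of the true-positive rate and the true-negative rate.

    Labels are assumed to be +1 (fraud) and -1 (non-fraud).
    """
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == -1]
    tpr = sum(1 for p in pos if p == 1) / len(pos)
    tnr = sum(1 for p in neg if p == -1) / len(neg)
    return 0.5 * (tpr + tnr)


def biased_penalty_objective(
    w: Sequence[float],
    b: float,
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    c_pos: float,
    c_neg: float,
) -> float:
    """Linear SVM objective with a separate hinge-loss cost per class.

    0.5 * ||w||^2 + c_pos * (hinge on fraud) + c_neg * (hinge on non-fraud).
    This is one common cost-sensitive formulation, assumed here.
    """
    reg = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        hinge = max(0.0, 1.0 - margin)
        loss += (c_pos if yi == 1 else c_neg) * hinge
    return reg + loss


# Hypothetical toy labels: 10 fraud firm-years among 1000 (1% fraud).
y_true = [1] * 10 + [-1] * 990
all_non_fraud = [-1] * 1000  # trivial classifier that never flags fraud

plain = accuracy(y_true, all_non_fraud)
balanced = balanced_accuracy(y_true, all_non_fraud)

# Hypothetical one-feature data: one fraud and two non-fraud firm-years,
# evaluated at w = 0, b = 0, where every example has hinge loss 1.
X_toy = [[1.0], [-1.0], [-2.0]]
y_toy = [1, -1, -1]
uniform_cost = biased_penalty_objective([0.0], 0.0, X_toy, y_toy, 1.0, 1.0)
biased_cost = biased_penalty_objective([0.0], 0.0, X_toy, y_toy, 99.0, 1.0)
```

The trivial classifier scores 99% plain accuracy but only 50% balanced accuracy, which is why balanced accuracy is the more informative metric here. With the biased costs, the single fraud example dominates the objective, so an optimizer is pushed to classify it correctly.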

Topics

Machine Learning > Core Methods > Classification
Computer Science > Applications > Cybersecurity

Keywords

fraud detection; support vector machine; ensemble method; imbalanced learning
