Breaking Audio CAPTCHAs

Jennifer Tam; Jiri Simsa; Sean Hyde; Luis V. Ahn

2008 NIPS NeurIPS 2008

Breaking Audio CAPTCHAs

Abstract

CAP T C H A s are computer-generated tests that humans can pass but current computer systems cannot. CAP T C H A s provide a method for automatically distinguishing a human from a computer program, and therefore can protect Web services from abuse by so-called "bots." Most CAP T C H A s consist of distorted images, usually text, for which a user must provide some description. Unfortunately, visual CAP T C H A s limit access to the millions of visually impaired people using the Web. Audio CAP T C H A s were created to solve this accessibility issue; however, the security of audio CAP T C H A s was never formally tested. Some visual CAP T C H A s have been broken using machine learning techniques, and we propose using similar ideas to test the security of audio CAP T C H A s . Audio CAP T C H A s are generally composed of a set of words to be identified, layered on top of noise. We analyzed the security of current audio CAP T CH A s from popular Web sites by using AdaBoost, SVM, and k-NN, and achieved correct solutions for test samples with accuracy up to 71%. Such accuracy is enough to consider these CAPTCHAs broken. Training several different machine learning algorithms on different types of audio CAP T C H A s allowed us to analyze the strengths and weaknesses of the algorithms so that we could suggest a design for a more robust audio CAPTCHA. 1 Int r o d u c t i o n CAP T C H A s [1] are automated tests designed to tell computers and humans apart by presenting users with a problem that humans can solve but current computer programs cannot. Because CAPTCHAs can distinguish between humans and computers with high probability, they are used for many different security applications: they prevent bots from voting continuously in online polls, automatically registering for millions of spam email accounts, automatically purchasing tickets to buy out an event, etc. Once a CAP T C H A is broken (i.e., computer programs can successfully pass the test), bots can impersonate h

🌱 Topic Pioneer — Cybersecurity

🌉 Interdisciplinary Bridge — Computer Science and Machine Learning

🧭 Keyword Pioneer — audio captcha

🐣 Hot Topic Early Bird — speech recognition

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio

📈 Trend Setter — Speech Recognition

Authors

Jennifer Tam , Jiri Simsa , Sean Hyde , Luis V. Ahn

Topics

Machine Learning > Core Methods > Classification Speech & Audio > Recognition > Speech Recognition Computer Science > Applications > Cybersecurity Machine Learning > Learning Types > Classification Machine Learning > Application Areas > Classification

Keywords

adversarial robustness speech recognition nearest neighbor audio captcha machine learning adversarial classification audio classification support vector machine k-nearest neighbor

Download PDF

Related papers

On the Efficient Minimization of Classification Calibrated Surrogates 2008

Hebbian Learning of Bayes Optimal Decisions 2008

Biasing Approximate Dynamic Programming with a Lower Discount Factor 2008

Counting Solution Clusters in Graph Coloring Problems Using Belief Propagation 2008

Domain Adaptation with Multiple Sources 2008