Learning Distributions Generated by Single-Layer ReLU Networks in the Presence of Arbitrary Outliers

Saikiran Bulusu; Geethu Joseph; M. Cenk Gursoy; Pramod Varshney

NeurIPS 2022

Abstract

We consider a set of data samples such that a fraction of the samples are arbitrary outliers, and the rest are the output samples of a single-layer neural network with rectified linear unit (ReLU) activation. Our goal is to estimate the parameters (weight matrix and bias vector) of the neural network, assuming the bias vector to be non-negative. We estimate the network parameters using the gradient descent algorithm combined with either the median- or trimmed mean-based filters to mitigate the effect of the arbitrary outliers. We then prove that $\tilde{O}\left( \frac{1}{p^2}+\frac{1}{\epsilon^2p}\right)$ samples and $\tilde{O}\left( \frac{d^2}{p^2}+ \frac{d^2}{\epsilon^2p}\right)$ time are sufficient for our algorithm to estimate the neural network parameters within an error of $\epsilon$ when the outlier probability is $1-p$, where $2/3

Topics

Machine Learning > Core Methods > Regression
Machine Learning > Optimization & Theory > Learning Theory
Machine Learning > Optimization & Theory > Neural Network Optimization
Machine Learning > Learning Types > Robustness

Keywords

parameter estimation, robust statistics, outlier detection, gradient descent, robust estimation, ReLU network, single-layer ReLU network
