ACML 2021

Video Action Recognition with Neural Architecture Search

Abstract

Recently, deep convolutional neural networks have been widely used in the field of video action recognition. Current approaches tend to concentrate on the structure design for different backbone networks, but despite encouraging progress, which network structures can process video both effectively and quickly remains an open question. With the help of neural architecture search (NAS), we search over three hyperparameters of the video processing network: the number of frames, the number of layers per residual stage, and the channel number for all layers. We relax the entire search space into a continuous one and search for a set of network architectures that balance accuracy and computational efficiency, treating accuracy as the primary optimization goal and computational complexity as the secondary optimization goal. We conduct experiments on the UCF101 and Kinetics400 datasets, where the proposed NAS-based scheme achieves new state-of-the-art results for video action recognition.
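
To make the idea of a continuous relaxation concrete, the following is a minimal sketch and not the paper's implementation. It assumes a softmax-weighted relaxation over each hyperparameter's candidate values and a weighted-sum penalty for complexity; the candidate values, the cost proxy, and all names (`SEARCH_SPACE`, `relaxed_cost`, `search_objective`, `cost_weight`) are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical candidate values for the three searched hyperparameters.
SEARCH_SPACE = {
    "frames": [4, 8, 16],
    "layers_per_stage": [2, 3, 4],
    "channels": [32, 64, 128],
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_choice(logits, candidates):
    """Continuous relaxation: softmax-weighted average of discrete candidates."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, candidates))


def relaxed_cost(alphas):
    """Toy complexity proxy (an assumption): frames * layers * channels^2."""
    f = expected_choice(alphas["frames"], SEARCH_SPACE["frames"])
    d = expected_choice(alphas["layers_per_stage"], SEARCH_SPACE["layers_per_stage"])
    c = expected_choice(alphas["channels"], SEARCH_SPACE["channels"])
    return f * d * c * c


def search_objective(task_loss, alphas, cost_weight=1e-6):
    """Accuracy term first, complexity as a small secondary penalty (assumed form)."""
    return task_loss + cost_weight * relaxed_cost(alphas)


def discretize(alphas):
    """After the search, keep the highest-scoring candidate per hyperparameter."""
    return {
        name: SEARCH_SPACE[name][max(range(len(a)), key=a.__getitem__)]
        for name, a in alphas.items()
    }
```

In a sketch like this, the architecture logits would be updated by gradient descent alongside the network weights, and a small `cost_weight` keeps complexity subordinate to the accuracy term. Whether the paper combines the two goals this way is not stated in the abstract.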
