2018 INTERSPEECH INTERSPEECH 2018

Temporal Noise Shaping with Companding

Abstract

Audio codecs are typically transform-domain based and efficiently code stationary musical signals but they struggle with speech and signals with dense transients such as applause. The temporal noise shaping (TNS) tool standardized in HE-AAC alleviates the issue of noise unmasking in these troublesome cases via signal-adaptive filtering of the transform domain quantization noise, albeit at the cost of significant additional side information in the bitstream. We present a novel alternative referred to as companding that involves QMF domain pre- and post-processing around the core transform-domain coding system: prior to transform encoding, the dynamic range of the signal is reduced locally within a QMF time slot and restored again post decoding, which naturally shapes the coding noise temporally. A primary advantage is that the companding function is fixed and hence enables signal-adaptive noise shaping with just 1-2 bits of side-information per frame. Subjective tests illustrate that the proposed tool improves the quality of hard-to-code applause excerpts compared to TNS while achieving comparable performance on speech signals. The coding tool described in this paper is part of the Dolby AC-4 audio coding system standardized by ETSI and included in ATSC 3.0.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning
🧭 Keyword Pioneer — temporal noise shaping
🐣 Hot Topic Early Bird — signal processing
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio