The performance of speech systems is degraded in the presence of noise. The principle of modulation filtering attempts to remove the spectro-temporal modulations that are susceptible to noise. While traditional approaches use modulation filters that are hand-crafted, we propose a novel method for modulation filter learning using deep variational models. Specifically, we pose the filter learning problem in an unsupervised generative modeling framework where the convolutional filters capture the modulations that are shown to improve speech recognition performance significantly.
Faculty Member: Sriram Ganapathy [EE]