Robust Speech Recognition Based on Mapping Noisy Features to Clean Features
Abstract
The conventional view of robustness in automatic speech recognition (ASR) is that performance degrades because of a mismatch between training and test conditions; under this view, the solution is to reduce that mismatch. Common approaches fall into two groups: data-driven methods, such as speech signal enhancement and robust feature extraction, and model-based methods, which alter or adapt the model of the speech signal. In this paper, we study a model of the environment and use it to derive a relation between noisy and clean speech features. We propose two techniques for mapping noisy features to clean features in the cepstral domain. We implement the proposed methods alongside several existing data-driven methods, including spectral subtraction, cepstral mean normalization, cepstral mean and variance normalization, and SNR-dependent cepstral normalization. We show that the proposed methods outperform these existing methods and are effective for robust speech recognition in noisy environments.
Keywords
noise, robustness, neural network, map, data-driven methods, speech recognition