Several recent studies point to the importance of the analytic phase of the speech signal in human perception, especially in noisy conditions. However, phase information is still not used in state-of-the-art speech recognition systems. In this paper, we illustrate the importance of the analytic phase of the speech signal for automatic speech recognition. Because the computation of the analytic phase suffers from the inevitable phase-wrapping problem, we extract features from its time derivative, referred to as the instantaneous frequency (IF). We highlight the issues involved in IF extraction from speech-like signals and propose suitable modifications for IF extraction from speech signals. We used the deep neural netw...
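The relation between the analytic phase and the IF can be illustrated with a minimal sketch. It is not the modified IF extraction proposed in this paper; it is the textbook approach, assuming an FFT-based analytic signal and a first-order phase difference as the discrete stand-in for the time derivative. The function names `analytic_signal` and `instantaneous_frequency`, the 8 kHz sampling rate, and the 440 Hz test tone are illustrative assumptions. Taking the angle of the product of one sample with the conjugate of the previous one gives the phase increment directly, so the wrapped phase itself never has to be unwrapped.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal of a real 1-D sequence, built in the frequency domain."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # One-sided weighting: keep DC (and Nyquist when n is even), double the
    # positive frequencies, and zero the negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def instantaneous_frequency(x, fs):
    """IF in Hz from the phase increment between consecutive analytic samples."""
    z = analytic_signal(x)
    # angle(z[n+1] * conj(z[n])) is the phase difference between neighbours.
    # It lies in (-pi, pi], so the wrapped phase is never differentiated directly.
    dphi = np.angle(z[1:] * np.conj(z[:-1]))
    return dphi * fs / (2.0 * np.pi)


# Illustrative input (an assumption, not data from the paper): a pure tone.
fs = 8000.0
t = np.arange(800) / fs
tone = np.cos(2.0 * np.pi * 440.0 * t)
if_hz = instantaneous_frequency(tone, fs)
print(np.median(if_hz))
```

For a single stationary tone the estimate sits at the tone frequency. For multicomponent signals such as speech, the IF of the full-band analytic signal is generally much harder to interpret, which is the kind of issue the abstract refers to.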
Despite many technological advances, hearing aids still amplify the background sounds together with ...
State-of-the-art automatic speech recognition systems (ASRs) use only the short-time magnitude spect...
Speaker identification with deep learning commonly uses a time-frequency representation of the voice si...
The analytic phase of the speech signal plays an important role in human speech perception, especially in...
The objective of this work is to study the speaker-specific nature of analytic phase of speech signa...
The objective of this paper is to establish the importance of phase of analytic signal of speech, re...
This paper investigates the use of Instantaneous Frequency Distributions for the analysis of Speech ...
Speech recognition by machines is an important technology for the 21st century. Speech signals are p...
The major impulse-like excitation in the speech signal is due to abrupt closure of the vocal folds, ...
Recurrent neural networks (RNNs) and their variants have achieved significant success in speech recogn...
The speech signal is inherently characterized by its variations in time, which get reflected as vari...
Accurate estimation of the instantaneous frequency of speech resonances is a hard problem mainly due...
In this paper, we investigate the performance of modulation related features and normalized spectral...
An analytic signal s(t) is modeled over a T second duration by a pole-zero model by considering its...
The objective of this paper is to critically evaluate the performance of a nonstationary analysis me...