The paper reviews several techniques that are used in conjunction with short-term analysis and that are reported to be more robust in the presence of noise or other non-linguistic factors. We show that one property common to all such techniques is that they effectively extract speech features from segments of speech longer than 10-20 ms.

II. Introduction to the Problem

The communication channel and its noise level most often remain fixed or vary only slowly during a conversation. On the other hand, steady configurations of the vocal tract are rare and carry little linguistic information. The description of the speech signal as a succession of equally spaced short-term samples originated in speech coding. It assumes ...
A very large data base consisting of over thirty-six hours of unconstrained extemporaneous speech, f...
A variety of methods for audio-visual integration, which integrate audio and visual information at ...
Extending previous work on prediction of phoneme recognition error from unlabelled data, corrupted ...
Some speech analysis techniques used in automatic speech recognition utilize temporal processing of ...
Speech signal is redundant and non-stationary by nature. Because of vocal tract inertness these vari...
Studies from multiple disciplines show that spectro-temporal units of natural languages and human sp...
In this paper, we analyze the temporal modulation characteristics of speech and noise from a speech...
Nearly perfect speech recognition under conditions of severe reduction of spectral informati...
The speech signal is inherently characterized by its variations in time, which get reflected as vari...
Data-driven temporal filtering approaches based on a specific optimization technique hav...
In speech processing the short-time magnitude spectrum is believed to contain most of the informatio...
Speech recognition systems extract textual data from the speech signal. The research in speech re...
The performance of automatic speech recognition (ASR) is known to degrade under noise corruption. Su...
This paper addresses the problem of temporal constraints in the Viterbi algorithm in speaker-depende...