We address far-field speaker verification with a deep neural network (DNN) based speaker embedding extractor, where mismatch between enrollment and test data often comes from convolutive effects (e.g. room reverberation) and noise. To mitigate these effects, we focus on two parametric normalization methods: per-channel energy normalization (PCEN) and parameterized cepstral mean normalization (PCMN). Both methods contain differentiable parameters and can thus be conveniently integrated into, and jointly optimized with, the DNN using automatic differentiation. We consider both fixed and trainable (data-driven) variants of each method. We evaluate the performance on Hi-MIA, a recent large-scale far-field speech corpus,...
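For readers unfamiliar with PCEN, the following is a minimal sketch of its commonly published form, not the authors' implementation. Each channel's energy E is divided by a smoothed version of itself, M[t] = (1 - s) * M[t-1] + s * E[t], raised to a power alpha, and the result is root-compressed with offset delta and exponent r. The function name `pcen`, the first-frame initialisation of the smoother, and the default parameter values are illustrative assumptions. The parameters are fixed here; in a trainable variant they would be optimized jointly with the DNN in an automatic differentiation framework.

```python
def pcen(energies, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Apply PCEN to a [time][channel] list of non-negative filterbank energies.

    Hypothetical helper for illustration; default values are assumptions.
    """
    if not energies:
        return []
    # Assumption: initialise the smoother with the first frame.
    smooth = list(energies[0])
    out = []
    for t, frame in enumerate(energies):
        if t > 0:
            # Per-channel first-order IIR smoothing:
            # M[t] = (1 - s) * M[t-1] + s * E[t]
            smooth = [(1.0 - s) * m + s * e for m, e in zip(smooth, frame)]
        # Adaptive gain control (division by M**alpha), then root compression.
        out.append([(e / (eps + m) ** alpha + delta) ** r - delta ** r
                    for e, m in zip(frame, smooth)])
    return out


if __name__ == "__main__":
    # Two frames, two channels of toy energies.
    for row in pcen([[1.0, 4.0], [2.0, 4.0]]):
        print(row)
```

With alpha close to 1, the division by the smoothed energy makes the output largely insensitive to a constant gain on the input, which is one reason this kind of normalization is considered for channel mismatch.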
Advancements in automatic speaker verification (ASV) can be considered to be primarily limited to im...
The objective of this work is to study state-of-the-art deep neural networks based speaker verificat...
• Implement a high-accuracy text-dependent/short-duration speaker id system • Exploit Deep Neural Ne...
Modern automatic speaker verification relies largely on deep neural networks (...
Speaker verification (SV) suffers from unsatisfactory performance in far-field...
This paper presents an investigation of far field speech recognition using beamforming and channel ...
Acoustic modeling based on deep architectures has recently gained remarkable success, with substanti...
Today's smart devices using speaker verification are getting equipped with mul...
A method for speaker normalization in deep neural network (DNN) based discriminative feature estimat...
Some of the experiments presented in this manuscript were performed on Grid5000, a server supported ...
The parametric Bayesian Feature Enhancement (BFE) and a data-driven Denoising Autoencoder (DA) both ...
This paper proposes a neural network based system for multi-channel speech enhancement and dereverbe...
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Comp...
Most of the speech processing applications use triangular filters spaced in me...
Multi-taper estimators provide low-variance power spectrum estimates that can ...