Ph.D. This thesis investigates DNN acoustic model adaptation. The performance of automatic speech recognition (ASR) systems may be degraded by mismatch between the systems and the target speech. This mismatch is caused by acoustic variability in the speech under different input conditions, e.g., speaker, environment, speaking style, and emotional state. Acoustic model adaptation techniques play a key role in reducing the mismatch. Currently, most of the widely used adaptation methods are designed for speaker variability. These speaker adaptation techniques can also be applied to acoustic variability contributed by other identifiable input conditions. However, acoustic variability may be the joint effect of latent input conditions, wh...
Ph.D. The brain constantly monitors the auditory environment and detects unexpected changes, even wit...
Discriminative training techniques define state-of-the-art performance for automatic speech recognit...
Ph.D. To facilitate access to a human dataset consisting of a large number of 3D human models, ...
Ph.D. This thesis proposes an end-to-end neural framework for expressive text-to-speech (E-TTS) synth...
Ph.D. Two sets of modern studies have largely shaped the contemporary understanding of the human spee...
Ph.D. Neural mechanisms underlying visual word recognition in reading, especially whether and how oth...
Novel motor task learning by one hand unilaterally results in an auto-gain of performance in the unt...
Ph.D. Over the past few years, the computer vision community has witnessed great success achieved i...
M.Phil. Evidence shows that the systems of speech perception and production are intrinsically linked ...
Ph.D. This thesis mainly investigates the use of posteriorgram-to-acoustic modeling for unconstrained ...
Understanding of single-modality media, including audio, visual, and language, has achieved great succ...
Deep learning in visual understanding and editing tasks has witnessed great success in recent years,...
Ph.D. Aspect-Based Sentiment Analysis (ABSA) is the process of analyzing user-expressed opinions/sen...
Ph.D. Neural Machine Translation (NMT) aims at sequentially generating a target sentence with the sam...
M.Phil. Learning to distinguish objects in our world using their attributes requires both common sens...