Visual speech, referring to the visual domain of speech, has attracted increasing attention due to its wide applications, such as public security, medical treatment, military defense, and film entertainment. As a powerful AI strategy, deep learning techniques have extensively promoted the development of visual speech learning. Over the past five years, numerous deep learning based methods have been proposed to address various problems in this area, especially automatic visual speech recognition and generation. To push forward future research on visual speech, this paper aims to present a comprehensive review of recent progress in deep learning methods on visual speech analysis. We cover different aspects of visual speech, including fundamen...
In this paper we propose a new learning-based representation that is referred to as Visual Speech Un...
In this paper we present a deep learning architecture for extracting word embeddings for visual spee...
Deep bottleneck features (DBNFs) have been used successfully in the past for acoustic speech recogni...
Speech is the most natural means of communication for humans. Therefore, since the beginning of comp...
The objective of this work is visual recognition of speech and gestures. Solving this problem opens ...
In visual speech recognition (VSR), speech is transcribed using only visual information to interpret...
This paper discusses a transition from the traditional methods to novel deep learning architectures ...
Deep learning is an artificial intelligence (AI) technique that simulates humans’ learning behavio...
The project proposes an end-to-end deep learning architecture for word-level visual speech recogniti...
This paper proposes and compares a range of methods to improve the naturalness of visual speech synt...
Speech enhancement, which aims to recover the clean speech of the corrupted signal, plays an importa...
The development of deep learning and the continuous progress of artificial intelligence have contrib...
Visual speech recognition (VSR) aims to recognize the content of speech based on lip movements, with...
Speech Recognition is the translation of spoken words into text. Speech recognition invol...
In the last few decades, there has been a considerable amount of research on the use of Machine Learni...