Video captioning is the process of conveying the content of video clips through automatically generated natural language sentences. The unprecedented success of deep learning approaches in Computer Vision and Natural Language Processing has spurred significant progress in video captioning research. Video captioning now has extensive applications in video surveillance, video subtitling and human-robot interaction. Most existing video captioning methods adopt a pure encoder-decoder framework, in which the encoder extracts video features and the decoder generates captions. However, even though current state-of-the-art models have achieved high scores on the evaluation metrics, a significant proportion...
This paper strives to find amidst a set of sentences the one best describing the content of a given ...
Video analysis and understanding have attracted increasing attention in recent years. Th...
A long-standing goal of artificial intelligence is to enable machines to perceive the visual world a...
Deep learning has become a very prevalent field in recent years, and many applications are coming out...
Video captioning refers to the task of generating a natural language sentence that explains the cont...
In recent times, digital media content has become inherently multimedia in type, consisting of the form te...
In the modern era, image captioning has become one of the most widely required tools. Moreover, ther...
Dense video captioning (DVC) detects multiple events in an input video and generates natura...
Understanding visual media, i.e. images and videos, has been a cornerstone topic in computer vision ...
Nowadays, due to the vast number of camera-equipped devices, a large amount of data in terms of image and v...
In this thesis, we propose novel deep learning algorithms for the vision and language tasks, includi...
A common problem linking computer vision and natural language processing is the ability to generate ...
Vision-to-language problems, such as video annotation or visual question answering, stand out from ...
Automatic image captioning, which involves describing the contents of an image, is a challenging pro...
Linking natural language to visual data is an important topic at the intersection of Natural Languag...