We propose an unsupervised learning algorithm for automatically inferring the mappings between English nouns and corresponding video objects. Given a se-quence of natural language instructions and an un-aligned video recording, we simultaneously align each instruction to its corresponding video segment, and also align nouns in each instruction to their corresponding objects in video. While existing grounded language ac-quisition algorithms rely on pre-aligned supervised data (each sentence paired with corresponding image frame or video segment), our algorithm aims to automatically infer the alignment from the temporal structure of the video and parallel text instructions. We propose two generative models that are closely related to the HMM ...
Word alignments identify translational correspondences between words in a parallel sentence pair and...
Word alignment is an important natural language processing task that indicates the correspondence be...
Tree-based approaches to alignment model translation as a sequence of probabilistic op-erations tran...
We propose an unsupervised learning algorithm for automatically inferring the mappings between Engli...
We address the problem of automatically aligning natural language sentences with cor-responding vide...
Thesis (Ph. D.)--University of Rochester. Department of Computer Science, 2016.Today we encounter la...
Video-language pre-training for text-based video retrieval tasks is vitally important. Previous pre-...
Vision-language alignment learning for video-text retrieval arouses a lot of attention in recent yea...
The objective of this paper is a temporal alignment network that ingests long term video sequences, ...
International audienceSuppose that we are given a set of videos, along with natural language descrip...
We describe a method to align ASL video subtitles with a closed-caption transcript. Our alignments a...
To solve video-and-language grounding tasks, the key is for the network to understand the connection...
Word alignment is an important natural language pro-cessing task that indicates the correspondence b...
Tree-based approaches to alignment model translation as a sequence of probabilistic operations tra...
The goal of this work is to temporally align asynchronous subtitles in sign language videos. In part...
Word alignments identify translational correspondences between words in a parallel sentence pair and...
Word alignment is an important natural language processing task that indicates the correspondence be...
Tree-based approaches to alignment model translation as a sequence of probabilistic op-erations tran...
We propose an unsupervised learning algorithm for automatically inferring the mappings between Engli...
We address the problem of automatically aligning natural language sentences with cor-responding vide...
Thesis (Ph. D.)--University of Rochester. Department of Computer Science, 2016.Today we encounter la...
Video-language pre-training for text-based video retrieval tasks is vitally important. Previous pre-...
Vision-language alignment learning for video-text retrieval arouses a lot of attention in recent yea...
The objective of this paper is a temporal alignment network that ingests long term video sequences, ...
International audienceSuppose that we are given a set of videos, along with natural language descrip...
We describe a method to align ASL video subtitles with a closed-caption transcript. Our alignments a...
To solve video-and-language grounding tasks, the key is for the network to understand the connection...
Word alignment is an important natural language pro-cessing task that indicates the correspondence b...
Tree-based approaches to alignment model translation as a sequence of probabilistic operations tra...
The goal of this work is to temporally align asynchronous subtitles in sign language videos. In part...
Word alignments identify translational correspondences between words in a parallel sentence pair and...
Word alignment is an important natural language processing task that indicates the correspondence be...
Tree-based approaches to alignment model translation as a sequence of probabilistic op-erations tran...