This paper integrates techniques in natural language processing and computer vision to improve recognition and description of entities and activities in real-world videos. We propose a strategy for generating textual descriptions of videos by using a factor graph to combine visual detections with language statistics. We use state-of-the-art visual recognition systems to obtain confidences on entities, activities, and scenes present in the video. Our factor graph model combines these detection confidences with probabilistic knowledge mined from text corpora to estimate the most likely subject, verb, object, and place. Results on YouTube videos show that our approach im-proves both the joint detection of these latent, diverse sentence compone...
We present a holistic data-driven technique that gener-ates natural-language descriptions for videos...
There has been continuous growth in the volume and ubiquity of video material. It has become essenti...
The problem of describing images through natural lan-guage has gained importance in the computer vis...
This paper integrates techniques in natural language processing and computer vision to improve recog...
This paper integrates techniques in natural language processing and computer vision to improve recog...
Generating natural language descriptions for visual data links computer vision and computational lin...
Generating natural language descriptions for visual data links computer vision and computational lin...
Humans use rich natural language to describe and communicate visual perceptions. In order to provide...
Humans use rich natural language to describe and communicate visual perceptions. In order to provide...
Humans use rich natural language to describe and communicate visual perceptions. In order to provide...
We present a holistic data-driven technique that generates natural-language descriptions for videos....
Humans use rich natural language to describe and com-municate visual perceptions. In order to provid...
Humans use rich natural language to describe and com-municate visual perceptions. In order to provid...
Linking natural language to visual data is an important topic at the intersection of Natural Languag...
This electronic version was submitted by the student author. The certified thesis is available in th...
We present a holistic data-driven technique that gener-ates natural-language descriptions for videos...
There has been continuous growth in the volume and ubiquity of video material. It has become essenti...
The problem of describing images through natural lan-guage has gained importance in the computer vis...
This paper integrates techniques in natural language processing and computer vision to improve recog...
This paper integrates techniques in natural language processing and computer vision to improve recog...
Generating natural language descriptions for visual data links computer vision and computational lin...
Generating natural language descriptions for visual data links computer vision and computational lin...
Humans use rich natural language to describe and communicate visual perceptions. In order to provide...
Humans use rich natural language to describe and communicate visual perceptions. In order to provide...
Humans use rich natural language to describe and communicate visual perceptions. In order to provide...
We present a holistic data-driven technique that generates natural-language descriptions for videos....
Humans use rich natural language to describe and com-municate visual perceptions. In order to provid...
Humans use rich natural language to describe and com-municate visual perceptions. In order to provid...
Linking natural language to visual data is an important topic at the intersection of Natural Languag...
This electronic version was submitted by the student author. The certified thesis is available in th...
We present a holistic data-driven technique that gener-ates natural-language descriptions for videos...
There has been continuous growth in the volume and ubiquity of video material. It has become essenti...
The problem of describing images through natural lan-guage has gained importance in the computer vis...