We study a fundamental question for developing storytelling machines: what vocabulary is suited for machines to tell the story of a video? We start by manually specifying the vocabulary concepts and their annotations. To handcraft the vocabulary effectively, we empirically study the best practices for handcrafting a vocabulary for video storytelling. From our analysis, we conclude that for effective storytelling the vocabulary should encompass thousands of concepts of various types, which are trained and normalized in an appropriate way. Creating such a handcrafted vocabulary of concepts is labor intensive. We alleviate the manual labor by addressing the next research question: can a machine learn novel video c...
There is a great need to automatically segment, categorize, and annotate video data, and to develop ...
Story generation systems rely heavily on their knowledge base in order to come up with stories. Most...
Online video constitutes the largest, continuously growing portion of the Web content. Web users dri...
Humans spend a large amount of time listening, watching, and reading stories. We argue that the abil...
openaire: EC/H2020/780069/EU//MeMAD. This chapter focuses on the recent surge of interest in automat...
In this paper, we combine automatic learning of a large lexicon of semantic concepts with traditional...
Stories are diverse and highly personalized, resulting in a large possible output space for story ge...
Humans can easily describe what they see in a coherent way and at varying levels of detail. However, ...
Videos are often used for communicating ideas, concepts, experiences, and situations, because of the ...
Humans can easily describe what they see in a coherent way and at varying levels of detail. However,...
People typically learn through exposure to visual facts associated with linguistic descriptions. For...
This paper focuses on developing a creative storytelling agent that makes use of commonsense knowled...
Linking natural language to visual data is an important topic at the intersection of Natural Languag...
In the past, video production has had three distinct phases: content collection, logging, and video ...
In this paper, we propose an automatic video retrieval method based on high-level concept...