In this paper, we present a new approach for classifying video content into semantic classes at a high level of abstraction by exploiting the connoted visual code. The method is based on the concept of supervised learning algorithms that have already been applied for the classification of written text and spoken language quite successfully. In order to extend this approach for classifying video content, a visual analog to words is constructed from signal-level visual features. A common bag-of-words approach is applied in order to represent video documents. Subsequently, support vector machines are trained to categorize the documents into known classes by using the proposed visual words. Experimental results indicating the classification per...
Human action recognition in videos is becoming more and more popular in applications such as intelli...
This paper describes the second participation of the SINAI research group in the VideoCLEF track. Th...
Abstract Video data are usually represented by high dimensional features. The performance of video s...
Vision to language problems, such as video annotation, or visual question answering, stand out from ...
Automated content analysis has been growing popular in the research field given the vast and increas...
With the explosive growth of online videos, automatic real-time categorization of Web videos plays a...
The work presented here introduces a real-time automatic scene classifier within content-based video...
Generating natural language descriptions for visual data links computer vision and computational lin...
In the recent years, computer vision has been undergoing a period of great development, testified by...
We address the problem of classifying complex videos based on their content. A typical approach to t...
The advent of MOOC platforms brought an abundance of video educational content that made the selecti...
For online medical education purposes, we have developed a novel scheme to incorporate the results o...
In this paper we introduce two novel methods for object recognition from video. Our major contributi...
With the number of videos growing rapidly in modern society, automatically recognizing objects from ...
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Com...
Human action recognition in videos is becoming more and more popular in applications such as intelli...
This paper describes the second participation of the SINAI research group in the VideoCLEF track. Th...
Abstract Video data are usually represented by high dimensional features. The performance of video s...
Vision to language problems, such as video annotation, or visual question answering, stand out from ...
Automated content analysis has been growing popular in the research field given the vast and increas...
With the explosive growth of online videos, automatic real-time categorization of Web videos plays a...
The work presented here introduces a real-time automatic scene classifier within content-based video...
Generating natural language descriptions for visual data links computer vision and computational lin...
In the recent years, computer vision has been undergoing a period of great development, testified by...
We address the problem of classifying complex videos based on their content. A typical approach to t...
The advent of MOOC platforms brought an abundance of video educational content that made the selecti...
For online medical education purposes, we have developed a novel scheme to incorporate the results o...
In this paper we introduce two novel methods for object recognition from video. Our major contributi...
With the number of videos growing rapidly in modern society, automatically recognizing objects from ...
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Com...
Human action recognition in videos is becoming more and more popular in applications such as intelli...
This paper describes the second participation of the SINAI research group in the VideoCLEF track. Th...
Abstract Video data are usually represented by high dimensional features. The performance of video s...