This work deals with audio-visual video recognition using machine learning. A general audio-visual video recognition system first extracts auditory and visual feature descriptors, then represents the extracted bi-modal features using feature encoding techniques, and finally performs recognition using a machine learning classifier. This work adapts a similar pipe-line, contributing to the first two major components: visual feature extraction and global feature representation. Visual feature extraction is a vital step in video recognition. In general, the visual feature extraction starts by detecting spatio-temporal interest points where the features are most discriminative in a video. There are a few problems associated with existing spatio-...
Feature point detection and local feature extraction are the two critical steps in trajectory-based ...
Many applications need to deal with a scene that evolves over time. This thesis addresses the dynami...
Human action recognition plays a crucial role in visual learning applications such as video understa...
Audio-visual recognition systems rely on efficient feature extraction. Many spatio-temporal interest...
Representation learning is a fundamental research problem in the area of machine learning, refining ...
The aim of this PhD thesis is to make a step forward towards teaching computers to understand videos...
Video sequence classification is the process of recognizing the semantic labels of the given video s...
University of Technology Sydney. Faculty of Engineering and Information Technology.Video understandi...
Feature encoding has been extensively studied for the task of visual action recognition (VAR). The r...
Computing descriptors for videos is a crucial task in computer vision. In this paper, we propose a g...
Computing descriptors for videos is a crucial task in computer vision. In this paper, we propose a g...
© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for a...
This paper has been presented at : 25th IEEE International Conference on Image Processing (ICIP 2018...
This dissertation proposes a general framework to efficiently identify the objects of interest (OI) ...
This article presents a new method for violent scene detection using super descriptor tensor decompo...
Feature point detection and local feature extraction are the two critical steps in trajectory-based ...
Many applications need to deal with a scene that evolves over time. This thesis addresses the dynami...
Human action recognition plays a crucial role in visual learning applications such as video understa...
Audio-visual recognition systems rely on efficient feature extraction. Many spatio-temporal interest...
Representation learning is a fundamental research problem in the area of machine learning, refining ...
The aim of this PhD thesis is to make a step forward towards teaching computers to understand videos...
Video sequence classification is the process of recognizing the semantic labels of the given video s...
University of Technology Sydney. Faculty of Engineering and Information Technology.Video understandi...
Feature encoding has been extensively studied for the task of visual action recognition (VAR). The r...
Computing descriptors for videos is a crucial task in computer vision. In this paper, we propose a g...
Computing descriptors for videos is a crucial task in computer vision. In this paper, we propose a g...
© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for a...
This paper has been presented at : 25th IEEE International Conference on Image Processing (ICIP 2018...
This dissertation proposes a general framework to efficiently identify the objects of interest (OI) ...
This article presents a new method for violent scene detection using super descriptor tensor decompo...
Feature point detection and local feature extraction are the two critical steps in trajectory-based ...
Many applications need to deal with a scene that evolves over time. This thesis addresses the dynami...
Human action recognition plays a crucial role in visual learning applications such as video understa...