We address the problem of capturing temporal information for video classification in 2D networks, without increasing their computational cost. Existing approaches focus on modifying the architecture of 2D networks (e.g. by including filters in the temporal dimension to turn them into 3D networks, or using optical flow, etc.), which increases computation cost. Instead, we propose a novel sampling strategy, where we re-order the channels of the input video, to capture short-term frame-to-frame changes. We observe that without bells and whistles, the proposed sampling strategy improves performance on multiple architectures (e.g. TSN, TRN, TSM, and MVFNet) and datasets (CATER, Something-Something-V1 and V2), up to 24% over the baseline of using...
Though action recognition in videos has achieved great success recently, it remains a challenging ta...
This paper addresses the recognitions of human actions in videos. Human action recognition can be se...
Human action recognition is nowadays within the most active computer vision research areas. The pro...
In this dissertation, I present my work towards exploring temporal information for better video unde...
This thesis focuses on video understanding for human action and interaction recognition. We start by...
© 1991-2012 IEEE. Encouraged by the success of convolutional neural networks (CNNs) in image classif...
Efficiency is an important issue in designing video architectures for action recognition. 3D CNNs ha...
IEEE The explosive growth in video streaming requires video understanding at high accuracy and low c...
Technological innovation in the field of video action recognition drives the development of video-ba...
Current deep learning based video classification architectures are typically trained end-to-end on l...
Encouraged by the success of Convolutional Neural Networks (CNNs) in image classification, recently ...
The task of spatial-temporal action detection has attracted increasing researchers. Existing dominan...
Despite their great predictive capability, Convolutional Neural Networks (CNNs) are computational-ex...
Most video based action recognition approaches create the video-level representation by temporally p...
In this work, the authors propose several techniques for accelerating a modern action recognition pi...
Though action recognition in videos has achieved great success recently, it remains a challenging ta...
This paper addresses the recognitions of human actions in videos. Human action recognition can be se...
Human action recognition is nowadays within the most active computer vision research areas. The pro...
In this dissertation, I present my work towards exploring temporal information for better video unde...
This thesis focuses on video understanding for human action and interaction recognition. We start by...
© 1991-2012 IEEE. Encouraged by the success of convolutional neural networks (CNNs) in image classif...
Efficiency is an important issue in designing video architectures for action recognition. 3D CNNs ha...
IEEE The explosive growth in video streaming requires video understanding at high accuracy and low c...
Technological innovation in the field of video action recognition drives the development of video-ba...
Current deep learning based video classification architectures are typically trained end-to-end on l...
Encouraged by the success of Convolutional Neural Networks (CNNs) in image classification, recently ...
The task of spatial-temporal action detection has attracted increasing researchers. Existing dominan...
Despite their great predictive capability, Convolutional Neural Networks (CNNs) are computational-ex...
Most video based action recognition approaches create the video-level representation by temporally p...
In this work, the authors propose several techniques for accelerating a modern action recognition pi...
Though action recognition in videos has achieved great success recently, it remains a challenging ta...
This paper addresses the recognitions of human actions in videos. Human action recognition can be se...
Human action recognition is nowadays within the most active computer vision research areas. The pro...