Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. Recent methods attempt to capture this structure and learn action representations with convolutional neural networks. Such representations, however, are typically learned at the level of a few video frames, failing to model actions at their full temporal extent. In this work we learn video representations using neural networks with long-term temporal convolutions (LTC). We demonstrate that LTC-CNN models with increased temporal extents improve the accuracy of action recognition. We also study the impact of different low-level representations, such as raw values of video pixels and optical flow vector fields, and demonstrate t...
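As a rough illustration of what a convolution spanning time as well as space looks like, the sketch below runs a naive 3D convolution over a synthetic clip in pure Python. It is not the authors' implementation: the clip lengths (5 and 12 frames), the 3x3x3 kernel, and the helper names `conv3d_valid` and `make_clip` are all assumptions chosen for this example. It only shows that a longer input clip gives the same filter more temporal positions to respond to.

```python
def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D cross-correlation of a (T, H, W) clip with a (kt, kh, kw) kernel."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):              # slide over time
        plane = []
        for y in range(H - kh + 1):          # slide over rows
            row = []
            for x in range(W - kw + 1):      # slide over columns
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def make_clip(num_frames, height=4, width=4):
    """Hypothetical synthetic clip: every pixel of frame t has intensity t."""
    return [[[float(t)] * width for _ in range(height)] for t in range(num_frames)]


# Assumed kernel: spatial 3x3 average combined with a temporal difference (-1, 0, +1).
kernel = [[[w / 9.0] * 3 for _ in range(3)] for w in (-1.0, 0.0, 1.0)]

short_out = conv3d_valid(make_clip(5), kernel)    # short temporal extent
long_out = conv3d_valid(make_clip(12), kernel)    # longer temporal extent

# Output sizes as (time, height, width)
short_shape = (len(short_out), len(short_out[0]), len(short_out[0][0]))
long_shape = (len(long_out), len(long_out[0]), len(long_out[0][0]))
print(short_shape, long_shape)
```

With these assumed sizes the spatial output is identical for both clips, while the temporal output grows with the number of input frames. Practical models would use a deep learning framework with learned multi-channel filters rather than explicit loops.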
Lack of temporal information in still images is a major obstacle in still image action recognition. ...
This paper introduces a fusion convolutional architecture for efficient learning of spatio-temporal ...
In this paper we address the problem of human action recognition from video sequences. Inspired by t...
Typical human actions last several seconds and exhibit characteristic spatio-t...
Recognizing actions according to video features is an important problem in a wide scope of applicati...
Convolutional neural network (CNN) models have been extensively used in recent years to solve the pro...
This thesis focuses on video understanding for human action and interaction recognition. We start by...
Deep convolutional neural networks have lately dominated scene understanding tasks, particularly tho...
Human actions captured in video sequences contain two crucial factors for action recognition, i.e., ...
Video-based action recognition with deep neural networks has shown remarkable progress. However, mos...
Human action recognition plays a crucial role in various applications, including video surveillance,...
Hand-crafted feature functions are usually designed based on the domain knowledge of a presumably co...
Action recognition requires the accurate analysis of action elements in the form of a video clip and...