Most action recognition models today are highly parameterized, and evaluated on datasets with predominantly spatially distinct classes. It has also been shown that 2D Convolutional Neural Networks (CNNs) tend to be biased toward texture rather than shape in still image recognition tasks. Taken together, this raises suspicion that large video models partly learn spurious correlations rather than to track relevant shapes over time to infer generalizable semantics from their movement. A natural way to avoid parameter explosion when learning visual patterns over time is to make use of recurrence. In this article, we empirically study whether the choice of low-level temporal modeling has consequences for texture bias and cross-domain robustness....
International audienceTypical human actions last several seconds and exhibit characteristic spatio-t...
Effective processing of video input is essential for the recognition of temporally varying events su...
The object of this research work is to address some of the issues affecting vision based human acti...
This thesis focuses on video understanding for human action and interaction recognition. We start by...
3D convolutional networks, as direct inheritors of 2D convolutional networks for images, have placed...
3D convolutional networks, as direct inheritors of 2D convolutional networks for images, have placed...
3D convolutional networks, as direct inheritors of 2D convolutional networks for images, have placed...
Convolutional neural networks (CNNs) have achieved high accuracy on several different perceptual tas...
Action classification has made great progress, but segmenting and recognizing actions from long untr...
Action recognition has been an active research topic for over three decades. There are various appli...
Convolutional neural network(CNN) models have been extensively used in recent years to solve the pro...
Convolutional neural network(CNN) models have been extensively used in recent years to solve the pro...
Effective processing of video input is essential for the recognition of temporally varying events su...
International audienceTypical human actions last several seconds and exhibit characteristic spatio-t...
Fine-grained action recognition is a challenging task in computer vision. As fine-grained datasets h...
International audienceTypical human actions last several seconds and exhibit characteristic spatio-t...
Effective processing of video input is essential for the recognition of temporally varying events su...
The object of this research work is to address some of the issues affecting vision based human acti...
This thesis focuses on video understanding for human action and interaction recognition. We start by...
3D convolutional networks, as direct inheritors of 2D convolutional networks for images, have placed...
3D convolutional networks, as direct inheritors of 2D convolutional networks for images, have placed...
3D convolutional networks, as direct inheritors of 2D convolutional networks for images, have placed...
Convolutional neural networks (CNNs) have achieved high accuracy on several different perceptual tas...
Action classification has made great progress, but segmenting and recognizing actions from long untr...
Action recognition has been an active research topic for over three decades. There are various appli...
Convolutional neural network(CNN) models have been extensively used in recent years to solve the pro...
Convolutional neural network(CNN) models have been extensively used in recent years to solve the pro...
Effective processing of video input is essential for the recognition of temporally varying events su...
International audienceTypical human actions last several seconds and exhibit characteristic spatio-t...
Fine-grained action recognition is a challenging task in computer vision. As fine-grained datasets h...
International audienceTypical human actions last several seconds and exhibit characteristic spatio-t...
Effective processing of video input is essential for the recognition of temporally varying events su...
The object of this research work is to address some of the issues affecting vision based human acti...