This thesis presents a novel self-supervised approach of learning visual representations from videos containing human actions. Our approach tackles the complex problem of learning without the need of labeled data by exploring to what extent the ideas successfully used for images can be transferred, adapted and extended to videos for action recognition purposes. We begin by giving a brief introduction to the topic of learning features without having access to a labeled corpora, providing the motivation of our work. We continue with presenting the related research in terms of contrastive learning, action recognition from videos with 3D convolutions and self-supervised techniques for both images and videos. Next, we formalize our approach with...
Unsupervised domain adaptation (UDA) methods have become very popular in computer vision. However, w...
International audienceActivity recognition in video sequences is a difficult problem due to the comp...
International audienceThis paper exploits the context of natural dynamic scenes for human action rec...
With the rapid advancement of deep learning techniques in computer vision, researchers have achieved...
International audienceIn this paper, we propose a self-supervised method for video representation le...
The remarkable success of deep learning in various domains relies on the availability of large-scale...
Static image action recognition, which aims to recognize action based on a single image, usually rel...
The objective of this paper is visual-only self-supervised video representation learning. We make th...
Modern Computer Vision systems learn visual concepts through examples (i.e. images) which have been ...
In the image domain, excellent representations can be learned by inducing invariance to content-pres...
Human behavior understanding is a fundamental problem of computer vision. It is an important compone...
The aim of this paper is to address recognition of natural human actions in diverse and realistic vi...
Human action recognition in videos draws strong research interest in computer vision because of its ...
International audienceThe aim of this paper is to address recognition of natural human actions in di...
This work addresses the problem of recognizing actions and interactions in realistic video settings ...
Unsupervised domain adaptation (UDA) methods have become very popular in computer vision. However, w...
International audienceActivity recognition in video sequences is a difficult problem due to the comp...
International audienceThis paper exploits the context of natural dynamic scenes for human action rec...
With the rapid advancement of deep learning techniques in computer vision, researchers have achieved...
International audienceIn this paper, we propose a self-supervised method for video representation le...
The remarkable success of deep learning in various domains relies on the availability of large-scale...
Static image action recognition, which aims to recognize action based on a single image, usually rel...
The objective of this paper is visual-only self-supervised video representation learning. We make th...
Modern Computer Vision systems learn visual concepts through examples (i.e. images) which have been ...
In the image domain, excellent representations can be learned by inducing invariance to content-pres...
Human behavior understanding is a fundamental problem of computer vision. It is an important compone...
The aim of this paper is to address recognition of natural human actions in diverse and realistic vi...
Human action recognition in videos draws strong research interest in computer vision because of its ...
International audienceThe aim of this paper is to address recognition of natural human actions in di...
This work addresses the problem of recognizing actions and interactions in realistic video settings ...
Unsupervised domain adaptation (UDA) methods have become very popular in computer vision. However, w...
International audienceActivity recognition in video sequences is a difficult problem due to the comp...
International audienceThis paper exploits the context of natural dynamic scenes for human action rec...