Self-supervised learning has demonstrated remarkable capability in representation learning for skeleton-based action recognition. Existing methods mainly focus on applying global data augmentation to generate different views of the skeleton sequence for contrastive learning. However, although skeleton sequences contain rich action clues, these methods may take only a global perspective, learning to discriminate between different skeletons without thoroughly leveraging the local relationship between different skeleton joints and video frames, which is essential for real-world applications. In this work, we propose a Partial Spatio-Temporal Learning (PSTL) framework to exploit the local relationship from partial skeleton sequences built by a uni...
Skeleton-based human action recognition has made great progress, especially with the development of ...
Self-supervised learning has proved effective for skeleton-based human action understanding, which i...
Spatio-temporal convolution often fails to learn motion dynamics in videos and thus an effective mot...
Self-supervised skeleton-based action recognition with contrastive learning has attracted much atten...
In this work, we study self-supervised representation learning for 3D skeleton-based action recognit...
This paper presents a new representation of skeleton sequences for 3D action recognition. Existing m...
© 2017 IEEE. This paper presents a new method for 3D action recognition with skeleton sequences (i.e...
In recent years, skeleton-based action recognition has become an increasingly attractive alternativ...
In recent years, skeleton-based action recognition has achieved remarkable performance in understand...
Over the past few years, skeleton-based action recognition has achieved great success because the s...
Skeleton‐based neural networks have been considered a focus for human action recognition (H...
Contrastive learning has received increasing attention in the field of skeleton-based action represe...
Human action recognition (HAR) by skeleton data is considered a potential research aspect in compute...
In this paper, we studied the influence of adding skeleton data on top of human action videos when p...