Recently, the vision transformer and its variants have played an increasingly important role in both monocular and multi-view human pose estimation. Treating image patches as tokens, transformers can model global dependencies within an entire image or across images from other views. However, global attention is computationally expensive. As a consequence, it is difficult to scale these transformer-based methods up to high-resolution features and many views. In this paper, we propose the token-Pruned Pose Transformer (PPT) for 2D human pose estimation, which locates a rough human mask and performs self-attention only within the selected tokens. Furthermore, we extend our PPT to multi-view human pose estimation. Built upon PPT, we pr...
3D human pose estimation is a widely researched computer vision task that could be applied in scenar...
We present an innovative approach to 3D Human Pose Estimation (3D-HPE) by integrating cutting-edge d...
Recently, vision transformers have shown great success in 2D human pose estimation (2D HPE), 3D huma...
Multi-person Pose Estimation is essential for several computer vision tasks related to motion analys...
The state-of-the-art for monocular 3D human pose estimation in videos is dominated by the paradigm...
We propose a direct, regression-based approach to 2D human pose estimation from single images. We fo...
In this paper, we introduce a set of effective TOken REduction (TORE) strategies for Transformer-bas...
Multi-person pose estimation generally follows top-down and bottom-up paradigms. Both of them use an...
This paper proposes a unified framework dubbed Multi-view and Temporal Fusing Transformer (MTF-Trans...
Human motion capture either requires multi-camera systems or is unreliable using single-view input d...
Human pose estimation (HPE) is a classical task in the field of computer vision. Applications develo...
While the voxel-based methods have achieved promising results for multi-person 3D pose estimation fr...
This paper presents Volumetric Transformer Pose estimator (VTP), the first 3D volumetric transformer...
Estimating 3D human poses from monocular videos is a challenging task due to depth ambiguity and sel...
Existing volumetric methods for predicting 3D human pose estimation are accurate, but computationall...