Estimating human poses from videos is critical in human-computer interaction. With precise pose estimates, a robot can respond appropriately to the human. Most existing approaches use optical flow, RNNs, or CNNs to extract temporal features from videos. Despite their positive results, most of these approaches simply integrate features along the temporal dimension and ignore temporal correlations between joints. In contrast to previous methods, we propose a plug-and-play kinematics modeling module (KMM) based on the domain-cross attention mechanism to explicitly model the temporal correlation between joints across different frames. Specifically, the proposed KMM models the temporal correlation betwe...
We present the first deep learning approach to estimate the human skeletal system of the musculoskel...
Denoising diffusion probabilistic models that were initially proposed for realistic image generation...
This thesis presents new methods in two closely related areas of computer vision: human pose estimat...
Video-based human pose estimation (VHPE) is a vital yet challenging task. While deep learning method...
Estimating 3D poses from a monocular video is still a challenging task, despite the significant prog...
We address action recognition in videos by modeling the spatial-temporal structures of human poses. ...
When analyzing human motion videos, the output jitters from existing pose estimators are highly-unba...
We address the problem of articulated human pose estimation in videos using an ensemble of tractabl...
Our objective is to efficiently and accurately estimate human upper body pose in gesture videos. To ...
Estimating 3D human body shapes and poses from videos is a challenging computer vision task. The int...
Human pose forecasting is a complex structured-data sequence-modelling task, which has received incr...
Most state-of-the-art methods for action recognition rely on a two-stream arch...
The objective of this work is human pose estimation in videos, where multiple frames are available. ...