We tackle the task of reconstructing hand-object interactions from short video clips. Given an input video, our approach casts 3D inference as a per-video optimization and recovers a neural 3D representation of the object shape, as well as the time-varying motion and hand articulation. While the input video naturally provides some multi-view cues to guide 3D inference, these are insufficient on their own due to occlusions and limited viewpoint variations. To obtain accurate 3D, we augment the multi-view signals with generic data-driven priors to guide reconstruction. Specifically, we learn a diffusion network to model the conditional distribution of (geometric) renderings of objects conditioned on hand configuration and category label, and ...
We propose a method for object-aware 3D egocentric pose estimation that tightly integrates kinematic...
Egocentric cameras are becoming more popular, introducing increasing volumes of video in which the ...
3D hand pose estimation aims at recovering 3D coordinates of joints or mesh vertices of hand from vi...
Modeling hand-object manipulations is essential for understanding how humans i...
Reconstructing interacting hands from monocular RGB data is a challenging task, as it involves many ...
Reconstructing two-hand interactions from a single image is a challenging problem due to ambiguities...
We aim to teach robots to perform simple object manipulation tasks by watching a single video demons...
Learning deformable 3D objects from 2D images is often an ill-posed problem. Existing methods rely o...
Grasping with anthropomorphic robotic hands involves many more hand-object interactions compared to ...
This paper addresses a novel task of anticipating 3D human-object interactions (HOIs). Most existing...
Previous works concerning single-view hand-held object reconstruction typically utilize supervision ...
Recent progress in 3D scene understanding enables scalable learning of representations across large ...
A large number of works in egocentric vision have concentrated on action and object recognition. Det...
We introduce a simple and effective network architecture for monocular 3D hand pose estimation consi...
Reconstructing two-hand interactions from a single image is a challenging problem due to ambiguities ...