The past decade has witnessed remarkable progress in image-based, data-driven vision and graphics. However, existing approaches often treat images as pure 2D signals rather than as 2D projections of the physical 3D world. As a result, they require many training examples to cover sufficiently diverse appearances and inevitably suffer from limited generalization capability. In this thesis, I propose "inference-by-composition" approaches to overcome these limitations by modeling and interpreting visual signals in terms of physical surfaces, objects, and scenes. I show how we can incorporate physically grounded constraints such as scene-specific geometry in a non-parametric optimization framework for (1) revealing the missing parts of an imag...
This paper addresses scene understanding in the context of a moving camera, integrating semantic rea...
This dissertation investigates the general structure from motion problem. That is, how to compute in...
In this document, we study how to infer 3D from images captured by a single camera, without assuming...
Visual analysis is concerned with the problems of identifying object status or scene layout in images or vi...
265 pages. Physics-based computer vision can be formulated as an inverse process of graphics rendering...
Many computer vision algorithms limit their performance by ignoring the underlying 3D geometric stru...
Visual data are what make our daily life fun. Oftentimes, we consume those data created by experts ...
This thesis studies how to infer the time-varying 3D structures of generic, deformable objects, and ...
Humans are able to recognize objects in a scene almost effortlessly. Our visual system can easily ha...
Reconstruction happens in the human brain every day. When humans watch their surrounding scene, they...
Understanding the shape of a scene from a single color image is a formidable computer vision task. H...
Generating new, photorealistic views of a scene given only a single video is a difficult task that c...