We present a generative model of images that explicitly reasons over the set of objects they show. Our model learns a structured latent representation that separates objects from each other and from the background; unlike prior work, it explicitly represents the 2D position and depth of each object, as well as an embedding of its segmentation mask and appearance. The model can be trained from images alone in a purely unsupervised fashion, without the need for object masks or depth information. Moreover, it always generates complete objects, even though a significant fraction of training images contain occlusions. Finally, we show that our model can infer decompositions of novel images into their constituent objects, including accurate predi...
A layout to image (L2I) generation model aims to generate a complicated image containing multiple ob...
We study the problem of learning generative models of 3D shapes. Voxels or 3D parts have been widely...
A natural approach to generative modeling of videos is to represent them as a composition of moving ...
We introduce a framework to learn object segmentation from a collection of images without any manual...
Natural images arise from complicated processes involving many factors of variation. They reflect th...
We study unsupervised learning in a probabilistic generative model for occlusion. The model uses two...
The goal of scene understanding is to capture the full content of an image in a human-interpretable ...
Learning how to model complex scenes in a modular way with recombinable components is a pre-requisit...
We study unsupervised learning of occluding objects in images of visual scenes. The derived learning...
One of the fundamental problems of computer vision is to detect and localize objects such as humans a...
Deep learning has been widely used in real-life applications during the last few decades, such as fa...
Visual perception is a fundamental task of computer vision. Subtasks within perception can be decomp...
Visual object recognition is one of the key human capabilities that we would like machines to have. ...
Many scene understanding tasks are formulated as a labelling problem that tries to assign a label to...