Creating 3D content generally requires highly specialized skills to design and model objects and other assets by hand. We address this problem through high-quality 3D asset retrieval from multi-modal inputs, including 2D sketches, images and text. We use CLIP because it provides a bridge to higher-level latent features, and we fuse these features across modalities to address the lack of artistic control that affects common data-driven approaches. Our approach allows for multi-modal, conditional, feature-driven retrieval from a 3D asset database by combining input latent embeddings. We explore the effects of different combinations of feature embeddings across different input types and weighting ...
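As a rough illustration of weighted embedding fusion followed by retrieval, the sketch below combines per-modality query embeddings into a single vector and ranks database assets by cosine similarity. It is a minimal sketch under stated assumptions, not the authors' implementation: the embeddings are assumed to be precomputed (for example by a CLIP encoder), the fusion rule is assumed to be a weighted sum of unit vectors, and the names `fuse_embeddings`, `retrieve`, and the toy 3-dimensional vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def fuse_embeddings(
    embeddings: Dict[str, Sequence[float]], weights: Dict[str, float]
) -> Vector:
    """Weighted sum of unit-normalized modality embeddings (assumed fusion rule)."""
    if not embeddings:
        raise ValueError("at least one modality embedding is required")
    dim = len(next(iter(embeddings.values())))
    fused = [0.0] * dim
    for name, emb in embeddings.items():
        if len(emb) != dim:
            raise ValueError(f"embedding '{name}' has a different dimension")
        w = weights.get(name, 0.0)  # modalities without a weight are ignored
        for i, x in enumerate(normalize(emb)):
            fused[i] += w * x
    return normalize(fused)


def retrieve(
    query: Sequence[float], database: Dict[str, Sequence[float]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Rank assets by cosine similarity between the query and each asset embedding."""
    q = normalize(query)
    scored = [
        (asset_id, sum(a * b for a, b in zip(q, normalize(emb))))
        for asset_id, emb in database.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data standing in for real encoder outputs (hypothetical values).
database = {"chair": [1.0, 0.0, 0.0], "table": [0.0, 1.0, 0.0], "lamp": [0.0, 0.0, 1.0]}
query_embeddings = {"sketch": [1.0, 0.2, 0.0], "text": [0.1, 1.0, 0.0]}

# Shifting weight from the sketch to the text changes which asset ranks first.
sketch_led = retrieve(fuse_embeddings(query_embeddings, {"sketch": 0.8, "text": 0.2}), database)
text_led = retrieve(fuse_embeddings(query_embeddings, {"sketch": 0.2, "text": 0.8}), database)
print(sketch_led[0][0], text_led[0][0])
```

In this toy setup the per-modality weights act as the user's control: the same sketch and text inputs retrieve a different top asset depending on which modality is emphasized.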
This study proposes a novel cascaded 3D model retrieval framework for automatically and ...
Hand drawn figures are the imprints of shapes in human's mind. How a human exp...
Sketch/Image-based 3D scene retrieval is to retrieve man-made 3D scene models given a user's hand...
Sketch and speech are intuitive interaction methods that convey complementary information and have b...
We present a technique for zero-shot generation of a 3D model using only a target text prompt. Witho...
Retrieving 3D models from 2D human sketches has received considerable attention in the areas of gra...
In this paper we study, for the first time, the problem of fine-grained sketch-based 3D shape retrie...
Language is one of the primary means by which we describe the 3D world around us. While rapid progre...
We introduce Point-Bind, a 3D multi-modality model aligning point clouds with 2D image, language, au...
Semantic-driven 3D shape generation aims to generate 3D objects conditioned on text. Previous works ...
Sketch-based 3D model retrieval focuses on retrieving relevant 3D models using sketch(es) as input. It...
In this thesis, we address the problem of returning target images that match user queries in image r...
Visual imagery is ubiquitous in society and can take various formats: from 2D sketches and photograp...
We explore the task of text to 3D object generation using CLIP. Specifically, we use CLIP for guidan...
We introduce the task of localizing a flexible number of objects in real-world 3D scenes using natur...