Despite great progress in object detection, most existing methods work only on a limited set of object categories, due to the tremendous human effort needed for bounding-box annotations of training data. To alleviate the problem, recent open-vocabulary and zero-shot detection methods attempt to detect novel object categories beyond those seen during training. They achieve this goal by training on a set of pre-defined base categories to induce generalization to novel objects. However, their potential is still constrained by the small set of base categories available for training. To enlarge the set of base classes, we propose a method to automatically generate pseudo bounding-box annotations of diverse objects from large-scale image-caption pairs. ...
Open-vocabulary object detection (OVD) aims to scale up vocabulary size to detect objects of novel c...
We address the problem of training Object Detection models using significantly less bounding box ann...
We propose a new setting for detecting unseen objects called Zero-shot Annotation object Detection (...
Building robust and generic object detection frameworks requires scaling to larger label spaces and ...
The goal of this work is to establish a scalable pipeline for expanding an object detector towards n...
Existing object detection methods are bounded in a fixed-set vocabulary by costly labeled data. When...
In the field of visual scene understanding, deep neural networks have made impressive advancements i...
We aim at advancing open-vocabulary object detection, which detects objects described by arbitrary t...
Combining simple architectures with large-scale pre-training has led to massive improvements in imag...
Open-vocabulary detection (OVD) is a new object detection paradigm, aiming to localize and recognize...
Open-set object detection aims at detecting arbitrary categories beyond those seen during training. ...
A central task in computer vision is detecting object classes such as cars and horses in complex sc...
Inspired by the success of vision-language methods (VLMs) in zero-shot classification, recent works ...
Open-vocabulary object detection, which is concerned with the problem of detecting novel objects gui...
Object detection is a fundamental computer vision task that estimates object classification labels a...