Fully supervised semantic segmentation learns from dense masks, which requires heavy annotation cost for closed set. In this paper, we use natural language as supervision without any pixel-level annotation for open world segmentation. We call the proposed framework as FreeSeg, where the mask is freely available from raw feature map of pretraining model. Compared with zero-shot or openset segmentation, FreeSeg doesn't require any annotated masks, and it widely predicts categories beyond class-agnostic unsupervised segmentation. Specifically, FreeSeg obtains free mask from Image-Text Similarity Map (ITSM) of Interpretable Contrastive Language-Image Pretraining (ICLIP). And our core improvements are the smoothed min pooling for dense ICLIP, wi...
Weakly supervised semantic segmentation is a challenging task as it only takes image-level informati...
Recently, CLIP-based approaches have exhibited remarkable performance on generalization and few-shot...
For the semantic segmentation of images, state-of-the-art deep neural networks (DNNs) achieve high s...
Open-vocabulary semantic segmentation aims to segment an image into semantic regions according to te...
Contrastive Language-Image Pre-training (CLIP) has made a remarkable breakthrough in open-vocabulary...
Self-attention is of vital importance in semantic segmentation as it enables modeling of long-range ...
We introduce Patch Aligned Contrastive Learning (PACL), a modified compatibility function for CLIP's...
Recently, the contrastive language-image pre-training, e.g., CLIP, has demonstrated promising result...
We design an open-vocabulary image segmentation model to organize an image into meaningful regions i...
Fully supervised methods for semantic segmentation require pixel-level class masks to train, the cre...
Recent years have seen a rapid growth in new approaches improving the accuracy of semantic segmentat...
In recent years, the development of instance segmentation has garnered significant attention in a wi...
When trained at a sufficient scale, self-supervised learning has exhibited a notable ability to solv...
To bridge the gap between supervised semantic segmentation and real-world applications that acquires...
Weakly-supervised semantic segmentation (WSSS) aims to train a semantic segmentation network using w...
Weakly supervised semantic segmentation is a challenging task as it only takes image-level informati...
Recently, CLIP-based approaches have exhibited remarkable performance on generalization and few-shot...
For the semantic segmentation of images, state-of-the-art deep neural networks (DNNs) achieve high s...
Open-vocabulary semantic segmentation aims to segment an image into semantic regions according to te...
Contrastive Language-Image Pre-training (CLIP) has made a remarkable breakthrough in open-vocabulary...
Self-attention is of vital importance in semantic segmentation as it enables modeling of long-range ...
We introduce Patch Aligned Contrastive Learning (PACL), a modified compatibility function for CLIP's...
Recently, the contrastive language-image pre-training, e.g., CLIP, has demonstrated promising result...
We design an open-vocabulary image segmentation model to organize an image into meaningful regions i...
Fully supervised methods for semantic segmentation require pixel-level class masks to train, the cre...
Recent years have seen a rapid growth in new approaches improving the accuracy of semantic segmentat...
In recent years, the development of instance segmentation has garnered significant attention in a wi...
When trained at a sufficient scale, self-supervised learning has exhibited a notable ability to solv...
To bridge the gap between supervised semantic segmentation and real-world applications that acquires...
Weakly-supervised semantic segmentation (WSSS) aims to train a semantic segmentation network using w...
Weakly supervised semantic segmentation is a challenging task as it only takes image-level informati...
Recently, CLIP-based approaches have exhibited remarkable performance on generalization and few-shot...
For the semantic segmentation of images, state-of-the-art deep neural networks (DNNs) achieve high s...