Referring expression grounding is an important and challenging task in computer vision. To avoid the laborious annotation required by conventional referring grounding, unpaired referring grounding has been introduced, where the training data contains only images and queries without correspondences. The few existing solutions to unpaired referring grounding remain preliminary, owing to the difficulty of learning image-text matching and the lack of top-down guidance with unpaired data. In this paper, we propose a novel bidirectional cross-modal matching (BiCM) framework to address these challenges. In particular, we design a query-aware attention map (QAM) module that introduces a top-down perspective by generating query-specific visual attention...
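The query-specific visual attention described above can be sketched, under assumed shapes and with hypothetical names (this is an illustrative reduction, not the paper's actual QAM module), as a dot-product similarity between a query embedding and a grid of visual features, followed by a spatial softmax:

```python
import numpy as np

def query_aware_attention(visual_feats, query_emb):
    """Toy query-specific spatial attention map.

    visual_feats: (H, W, D) array of visual features.
    query_emb:    (D,) query embedding.
    Returns an (H, W) map of softmax-normalized similarity scores.
    """
    scores = visual_feats @ query_emb        # (H, W) dot-product similarity
    flat = scores.reshape(-1)
    flat = flat - flat.max()                 # subtract max for numerical stability
    weights = np.exp(flat) / np.exp(flat).sum()
    return weights.reshape(scores.shape)

# Toy example: a 2x2 feature grid with 3-dim features.
feats = np.array([[[1., 0., 0.], [0., 1., 0.]],
                  [[0., 0., 1.], [1., 1., 0.]]])
query = np.array([1., 0., 0.])
attn = query_aware_attention(feats, query)
# The map sums to 1 and peaks at cells whose features align with the query.
```

The softmax makes the map a distribution over spatial locations, so regions matching the query receive proportionally more attention; a real module would learn the projections producing `visual_feats` and `query_emb` rather than take them as given.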
Given a textual phrase and an image, the visual grounding problem is the task of locating the conten...
Previous vision-language pre-training models mainly construct multi-modal inputs with tokens and obj...
Panoptic Narrative Grounding (PNG) is an emerging cross-modal grounding task, which locates the targ...
© 2019 Association for Computational Linguistics. Grounding referring expressions to objects in an e...
Recently, the cross-modal pre-training task has been a hotspot because of its wide application in va...
Visual Grounding (VG) is a task of locating a specific object in an image semantically matching a gi...
The large adoption of the self-attention (i.e. transformer model) and BERT-like training principles ...
We propose a margin-based loss for vision-language model pretraining that encourages gradient-based ...
Visual grounding, i.e., localizing objects in images according to natural language queries, is an im...
Cross-modal attention mechanisms have been widely applied to the image-text matching task and have a...
Visual grounding is a ubiquitous building block in many vision-language tasks and yet remains challe...
In this paper, we are tackling the weakly-supervised referring expression grounding task, for the lo...
In this paper, we introduce a contextual grounding approach that captures the context in correspondi...
Cross-modal alignment is essential for vision-language pre-training (VLP) models to learn the correc...
Despite recent progress towards scaling up multimodal vision-language models, these models are still...