Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from both modalities, may be trivial for humans but is challenging for software systems. In this dissertation, we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal content, and empirically show their effectiveness in four interconnected problems: Image Retrieval, Image Tag Refinement, Image-Text Alignment, and Image Captioning. Our promising results and observations open up interesting avenues for future research involving text-image d...
Both interpretations of the title Retrieving Images as Text are considered in this thesis. We use te...
This article presents a generalized system of image--text relations which applies to different genre...
Advanced image-based application systems such as image retrieval and visual question answering depen...
Texts and images provide alternative, yet orthogonal views of the same underlying cognitive concept....
In this paper, we introduce a novel approach to image-based information retrieval by comb...
This research explores the interaction of linguistic and photographic information in an integrated t...
The combination of different media types is a defining characteristic of multimedia yet much researc...
The World Wide Web has become a common place for finding information for all kinds of purposes. The amount of da...
In this chapter, we present an approach to handle multi-modality in image retrieval using a Vector S...
In the past few years, cross-modal image-text retrieval (ITR) has experienced increased interest in ...
Progress in semantic media adaptation and personalisation requires that we know more about how diffe...
The number of images available has grown over the years, as well as the number of techniques to aid ...
Due to the omnipresence of digital cameras and mobile phones, the number of images stored in image...
Common image-text joint understanding techniques presume that images and the associated text can uni...