While many approaches exist in the literature to learn low-dimensional representations for data collections in multiple modalities, the generalizability of multi-modal nonlinear embeddings to previously unseen data is a rather overlooked subject. In this work, we first present a theoretical analysis of learning multi-modal nonlinear embeddings in a supervised setting. Our performance bounds indicate that for successful generalization in multi-modal classification and retrieval problems, the regularity of the interpolation functions extending the embedding to the whole data space is as important as the between-class separation and cross-modal alignment criteria. We then propose a multi-modal nonlinear representation learning algorithm that i...
Most machine learning applications involve a domain shift between data on which a model has initiall...
Abstract—In this paper we study the problem of learning from multiple modal data for purpose of docu...
Most cross-modal retrieval methods based on subspace learning just focus on learning the projection ...
In many problems in machine learning there exist relations between data collections from different m...
In practical machine learning settings, there often exist relations or links between data from diffe...
Multi-modal data analysis methods often learn representations that align different modalities in a n...
© 2017 Elsevier Inc. The heterogeneity-gap between different modalities brings a significant challen...
In this paper, the problem of multi-view embed-ding from different visual cues and modalities is con...
The cross-modal retrieval task can return different modal nearest neighbors, such as image or text. ...
textabstractDifferences in scanning parameters or modalities can complicate image analysis based on ...
A better similarity mapping function across heterogeneous high-dimensional features is very desirabl...
Pre-trained large-scale models provide a transferable embedding, and they show promising performance...
Cross-modal recognition and matching with privileged information are important challenging problems ...
Cross-modal recognition and matching with privileged information are important challenging problems ...
Most machine learning applications involve a domain shift between data on which a model has initiall...
Most machine learning applications involve a domain shift between data on which a model has initiall...
Abstract—In this paper we study the problem of learning from multiple modal data for purpose of docu...
Most cross-modal retrieval methods based on subspace learning just focus on learning the projection ...
In many problems in machine learning there exist relations between data collections from different m...
In practical machine learning settings, there often exist relations or links between data from diffe...
Multi-modal data analysis methods often learn representations that align different modalities in a n...
© 2017 Elsevier Inc. The heterogeneity-gap between different modalities brings a significant challen...
In this paper, the problem of multi-view embed-ding from different visual cues and modalities is con...
The cross-modal retrieval task can return different modal nearest neighbors, such as image or text. ...
textabstractDifferences in scanning parameters or modalities can complicate image analysis based on ...
A better similarity mapping function across heterogeneous high-dimensional features is very desirabl...
Pre-trained large-scale models provide a transferable embedding, and they show promising performance...
Cross-modal recognition and matching with privileged information are important challenging problems ...
Cross-modal recognition and matching with privileged information are important challenging problems ...
Most machine learning applications involve a domain shift between data on which a model has initiall...
Most machine learning applications involve a domain shift between data on which a model has initiall...
Abstract—In this paper we study the problem of learning from multiple modal data for purpose of docu...
Most cross-modal retrieval methods based on subspace learning just focus on learning the projection ...