Modeling various aspects that make a music piece unique is a challenging task, requiring the combination of multiple sources of information. Deep learning is commonly used to obtain representations using various sources of information, such as the audio, interactions between users and songs, or associated genre metadata. Recently, contrastive learning has led to representations that generalize better compared to traditional supervised methods. In this paper, we present a novel approach that combines multiple types of information related to music using cross-modal contrastive learning, allowing us to learn an audio feature from heterogeneous data simultaneously. We align the latent representations obtained from playlists-track interactions, ...
Nowadays, music genre classification is becoming an interesting area and attracting lots of research...
This work has been accepted at the 23rd International Society for Music Information Retrieval Confer...
In music domain, feature learning has been conducted mainly in two ways: unsupervised learning based...
Modeling various aspects that make a music piece unique is a challenging task, requiring the combina...
Music genre labels are useful to organize songs, albums, and artists into broader groups that share ...
As one of the most intuitive interfaces known to humans, natural language has the potential to media...
Music genre labels are useful to organize songs, albums, and artists into broader groups that share ...
While deep learning has enabled great advances in many areas of music, labeled music datasets remain...
Contrastive learning is a powerful way of learning multimodal representations across various domains...
This paper revisits the idea of music representation learning supervised by editorial metadata, cont...
Inspired by the success of deploying deep learning in the fields of Computer Vision and Natural Lang...
Audio representation learning based on deep neural networks (DNNs) emerged as an alternative approac...
A fundamental and general representation of audio and music which integrates multi-modal data source...
Deep cross-modal learning has successfully demonstrated excellent performance in cross-modal multime...
As one of the most intuitive interfaces known to humans, natural language has the potential to media...
Nowadays, music genre classification is becoming an interesting area and attracting lots of research...
This work has been accepted at the 23rd International Society for Music Information Retrieval Confer...
In music domain, feature learning has been conducted mainly in two ways: unsupervised learning based...
Modeling various aspects that make a music piece unique is a challenging task, requiring the combina...
Music genre labels are useful to organize songs, albums, and artists into broader groups that share ...
As one of the most intuitive interfaces known to humans, natural language has the potential to media...
Music genre labels are useful to organize songs, albums, and artists into broader groups that share ...
While deep learning has enabled great advances in many areas of music, labeled music datasets remain...
Contrastive learning is a powerful way of learning multimodal representations across various domains...
This paper revisits the idea of music representation learning supervised by editorial metadata, cont...
Inspired by the success of deploying deep learning in the fields of Computer Vision and Natural Lang...
Audio representation learning based on deep neural networks (DNNs) emerged as an alternative approac...
A fundamental and general representation of audio and music which integrates multi-modal data source...
Deep cross-modal learning has successfully demonstrated excellent performance in cross-modal multime...
As one of the most intuitive interfaces known to humans, natural language has the potential to media...
Nowadays, music genre classification is becoming an interesting area and attracting lots of research...
This work has been accepted at the 23rd International Society for Music Information Retrieval Confer...
In music domain, feature learning has been conducted mainly in two ways: unsupervised learning based...