Recent advances in music source separation have achieved high-quality vocal isolation from mixed audio, paving the way for various applications in music information retrieval (MIR). In this paper, we propose a method to learn a cross-domain embedding space between isolated vocals and mixed audio for vocal-centric MIR tasks, leveraging a pre-trained music source separation model. Learning such a cross-domain embedding was previously attempted with a triplet-based similarity model in which vocal and mixed audio are encoded by two different convolutional neural networks. We improve on this approach with a structure-preserving triplet loss that exploits not only cross-domain similarity between vocal and mixed audio but also intra-do...
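The loss described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract is cut off before the second term is fully named, so the sketch assumes that term is an intra-domain triplet over vocal embeddings. The function names, the cosine-similarity hinge form, `margin`, and `intra_weight` are all hypothetical choices made for illustration, and the embeddings are assumed to come from two already-trained encoders.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def triplet_hinge(anchor, positive, negative, margin):
    """Mean of max(0, margin - sim(a, p) + sim(a, n)) over the batch."""
    sim_pos = np.sum(anchor * positive, axis=-1)
    sim_neg = np.sum(anchor * negative, axis=-1)
    return float(np.mean(np.maximum(0.0, margin - sim_pos + sim_neg)))


def structure_preserving_triplet_loss(vocal_anchor, mix_pos, mix_neg,
                                      vocal_pos, vocal_neg,
                                      margin=0.2, intra_weight=0.5):
    """Cross-domain triplet term plus a weighted intra-domain term.

    All inputs are (batch, dim) embedding arrays. The intra-domain term and
    its weight are assumptions of this sketch, not details from the paper.
    """
    v_a, m_p, m_n, v_p, v_n = (
        l2_normalize(e)
        for e in (vocal_anchor, mix_pos, mix_neg, vocal_pos, vocal_neg)
    )
    # Cross-domain: a vocal anchor should be closer to the matching mix.
    cross = triplet_hinge(v_a, m_p, m_n, margin)
    # Intra-domain (assumed): a vocal anchor should be closer to a matching vocal.
    intra = triplet_hinge(v_a, v_p, v_n, margin)
    return cross + intra_weight * intra
```

In a training setup, the five inputs would be batches produced by the vocal and mixed-audio encoders, and the loss would be written in an autodiff framework rather than NumPy; NumPy is used here only to keep the arithmetic visible.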
The incredibly large amount of music that is available online nowadays has created a need for powerf...
Choral singing is a widely practiced form of ensemble singing wherein a group of people sing simulta...
Music and speech exhibit striking similarities in the communication of emotions in the acoustic doma...
Previous approaches to singer identification have used either monophonic vocal tracks or mixed track...
Informed source separation has recently gained renewed interest with the introduction of neural netw...
Content-based Music Information Retrieval (MIR) systems seek to automatically extract meaningful inf...
Modeling various aspects that make a music piece unique is a challenging task, requiring the combina...
In recent years, music source separation has been one of the most intensively studied research areas...
An interesting problem in accessing music digital libraries is how to combine the information of dif...
The integration of additional side information to improve music source separation has been investiga...
This paper discusses the concept of transfer learning and its potential applications to MIR tasks su...
In this paper, we tackle the problem of domain-adaptive representation learning for music processing...
This paper presents two systems for extracting the vocals from a musical piece. Vocals extraction fi...
Very few large-scale music research datasets are publicly available. There is an increasing need for...