Many music information retrieval tasks involve the comparison of a symbolic score representation with an audio recording. A typical strategy is to compare score–audio pairs based on a common mid-level representation, such as chroma features. Several recent studies have demonstrated the effectiveness of deep learning models that learn task-specific mid-level representations from temporally aligned training pairs. In practice, however, strongly aligned training data is often lacking, in particular for real-world scenarios. In our study, we use weakly aligned score–audio pairs for training, where only the beginning and end of a score excerpt are annotated in an audio recording, without aligned correspondences in between. To exploit such weak...
Modeling various aspects that make a music piece unique is a challenging task, requiring the combina...
Audio-text retrieval aims at retrieving a target audio clip or caption from a pool of candidates giv...
Despite the success of end-to-end approaches, chroma (or pitch-class) features remain a useful mid-l...
Chroma or pitch-class representations of audio recordings are an essential tool in music information...
Although audio-to-score alignment is a classic Music Information Retrieval problem, it has not been ...
This work addresses the problem of matching musical audio directly to sheet music, without any highe...
Similarity measures are indispensable in music information retrieval. In recent years, various propo...
Deep cross-modal learning has successfully demonstrated excellent performance in cross-modal multime...
We present an approach for recommending a music track for a given video, and vice versa, based on bo...
Given an audio query, such as a polyphonic musical piece, this thesis addresses the problem of retrievin...
In this paper we review the acoustic features used for music-to-score alignment and study their infl...
While deep learning has enabled great advances in many areas of music, labeled music datasets remain...
We investigate the problem of matching symbolic representations directly to audio-based representati...
Inspired by the success of deploying deep learning in the fields of Computer Vision and Natural Lang...