In this paper, we introduce LA-Chorus, a chorus detection model based on latent feature augmentation and a ResNet FPN architecture. Our contributions are three-fold. First, we propose a method for implicitly augmenting chorus data in the latent space during the training stage. Compared with augmentations applied to the audio surface, such as time stretching and pitch shifting, latent augmentations correspond to higher-level changes in the original audio, thereby increasing the diversity and sufficiency of the training data. Second, we apply a Feature Pyramid Network (FPN) to generate additional embeddings from low to high dimension, thereby achieving a multi-scale training paradigm. Lastly, we release Di-Chorus, a new open-source dataset o...
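The abstract does not say how the latent augmentation is computed, so the following is only a minimal sketch of the general idea of perturbing features in latent space rather than on the audio surface. It assumes a simple Gaussian perturbation scaled by each dimension's spread; the function name `augment_latent` and the parameters `noise_scale` and `copies` are hypothetical and are not taken from LA-Chorus.

```python
import numpy as np


def augment_latent(features, labels, noise_scale=0.1, copies=2, seed=0):
    """Return the original latent features plus noisy copies.

    Illustrative assumption only: Gaussian noise in latent space,
    not the augmentation scheme used by LA-Chorus.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)

    # Per-dimension standard deviation, so the noise follows each
    # latent dimension's own scale.
    std = features.std(axis=0, keepdims=True)

    augmented = [features]
    for _ in range(copies):
        noise = rng.normal(0.0, 1.0, size=features.shape) * std * noise_scale
        augmented.append(features + noise)

    # Labels are preserved: each noisy copy keeps its source label.
    return np.concatenate(augmented, axis=0), np.tile(labels, copies + 1)
```

With 4 latent vectors and `copies=2`, the call returns 12 vectors: the 4 originals first, followed by two perturbed sets carrying the same chorus/non-chorus labels.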
This paper presents a crowdsourcing-based self-improvement framework for vocal activity detection (V...
Paper and poster presented at Interspeech 2017, held 20 to 24 August in Stockholm, Sweden...
Singing melody extraction essentially involves two tasks: one is detecting the activity of a singing...
In computer vision, state-of-the-art object recognition systems rely on label-preserving image tran...
Singing voice detection is still a challenging task because the voice can be obscured by instruments...
This paper explores sequential modelling of polyphonic music with deep neural networks. While recent...
We recently presented a new model for singing synthesis based on a modified version of the WaveNet a...
Identifying musical instruments in a polyphonic music recording is a difficult yet crucial problem i...