General-purpose audio tagging refers to classifying sounds that are of a diverse nature, and is relevant in many applications where domain-specific information cannot be exploited. The DCASE 2018 challenge introduces Task 2 for this very problem. In this task, there are a large number of classes and the audio clips vary in duration. Moreover, a subset of the labels are noisy. In this paper, we propose a system to address these challenges. The basis of our system is an ensemble of convolutional neural networks trained on log-scaled mel spectrograms. We use preprocessing and data augmentation methods to improve the performance further. To reduce the effects of label noise, two techniques are proposed: loss function weighting and pseudo-labeli...
Weakly labeled sound event detection (WSED) is an important task as it can facilitate the data colle...
We present Music Tagging Transformer that is trained with a semi-supervised approach. The proposed m...
In this paper, we present a gated convolutional neural network and a temporal attention-based locali...
This paper introduces Task 2 of the DCASE2019 Challenge, titled "Audio tagging with noisy labels and...
In this paper we present our audio tagging system for the DCASE 2019 Challenge Task 2. We propose a ...
Label noise refers to the presence of inaccurate target labels in a dataset. It is an impediment to ...
In this paper, we describe our system for Task 2 of the Detection and Classification of Acoustic Sce...
Label noise refers to the presence of inaccurate target labels in a dataset. It is an impediment to ...
Paper presented at: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and ...
This thesis focuses on the aspect of label noise for real-life datasets. Due to the upcoming growing...
Audio tagging is the task of predicting the presence or absence of sound classes within an audio cli...
Audio tagging has attracted increasing attention over the last decade and has various potential applic...
Paper presented at: Workshop Machine Learning for Audio Signal Processing at NIPS 2017 (ML4Aud...
Audio tagging aims to detect the types of sound events occurring in an audio recording. To tag the p...