The Sørensen–Dice coefficient has recently gained popularity as a loss function (known as the Dice loss) owing to its robustness in tasks where negative samples greatly outnumber positive samples, such as semantic segmentation, natural language processing, and sound event detection. Conventional training of polyphonic sound event detection systems with binary cross-entropy loss often yields suboptimal detection performance because the training is overwhelmed by updates from negative samples. In this paper, we investigated the effect of the Dice loss, intra- and inter-modal transfer learning, data augmentation, and recording formats on the performance of polyphonic sound event detection systems with multi...
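To make the class-imbalance argument concrete, the following is a minimal sketch of a soft Dice loss next to binary cross-entropy, evaluated on a toy frame-level target in which only 2 of 100 frames are active. It is not taken from the paper: the function names, the smoothing constant `eps`, the flattening of all frames into one list, and the toy data are all assumptions made for illustration.

```python
import math


def soft_dice_loss(probs, targets, eps=1e-7):
    """Soft Dice loss: 1 - (2 * sum(p * t) + eps) / (sum(p) + sum(t) + eps).

    `eps` is an assumed smoothing term that avoids division by zero.
    """
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all frames, for comparison."""
    losses = []
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        losses.append(-(t * math.log(p) + (1.0 - t) * math.log(1.0 - p)))
    return sum(losses) / len(losses)


# Hypothetical imbalanced target: 2 active frames out of 100.
targets = [1.0, 1.0] + [0.0] * 98

# A degenerate model that predicts "inactive" for every frame.
all_negative = [0.01] * 100

bce = binary_cross_entropy(all_negative, targets)
dice = soft_dice_loss(all_negative, targets)
print(f"BCE  = {bce:.3f}")
print(f"Dice = {dice:.3f}")
```

In this toy case the all-negative prediction receives a small cross-entropy (about 0.10) because the 98 correctly rejected frames dominate the mean, whereas the Dice loss stays close to 1 (about 0.99) because it depends only on the overlap with the active frames. This illustrates the motivation stated above; it makes no claim about the systems or results in the paper.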
In this paper, a system for polyphonic sound event detection and tracking is ...
Polyphonic sound event detection (SED) is the task of detecting the time stamps and the class of sou...
Although L1 and L2 loss functions do not represent any perceptually related information besides ...
Detecting the class and the start and end times of sound events in real-world recordings is a challengi...
Sound event detection is the task of automatically identifying the presence and temporal boundaries ...
We study the merit of transfer learning for two sound recognition problems, i.e., audio tagging and ...
Recently, deep recurrent neural networks have achieved great success in various machine learning tas...
Polyphonic sound event detection aims to detect the types of sound events that occur in given audio ...
This paper presents and discusses various metrics proposed for evaluation of polyphonic sound event ...
Polyphonic sound event localization and detection is not only detecting what sound events are happen...
Polyphonic sound event localization and detection (SELD), which jointly performs sound event detecti...
Sound event detection (SED) and localization refer to recognizing sound events and estimating their ...
Polyphonic Sound Event Detection (SED) in real-world recordings is a challenging task because of the...
State-of-the-art polyphonic sound event detection (SED) systems function as frame-level multi-label ...