Speaker Recognition (SR) is a common task in AI-based sound analysis, involving structurally different methodologies such as Deep Learning or "traditional" Machine Learning (ML). In this paper, we explore and compare the two methodologies on the DEMoS dataset, which consists of 8869 audio files of 58 speakers in different emotional states. A custom CNN is compared to several pre-trained nets using image inputs of spectrograms and Cepstral-temporal (MFCC) graphs. An ML approach based on acoustic feature extraction, selection and multi-class classification by means of a Naive Bayes model is also considered. Results show that a custom, less deep CNN trained on grayscale spectrogram images obtains the most accurate results, 90.15% on grayscale spectro...
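As a rough illustration of the kind of grayscale spectrogram input the abstract refers to, the sketch below turns a waveform into an 8-bit image using only the Python standard library. It is not the authors' pipeline: the function name `grayscale_spectrogram`, the frame length, the hop size, the Hann window, and the min-max scaling are all assumptions chosen for readability, and the synthetic tone merely stands in for a DEMoS recording.

```python
import cmath
import math


def grayscale_spectrogram(signal, frame_len=64, hop=32):
    """Return a 2D list (frames x frequency bins) of 0-255 pixel values.

    Hypothetical helper: the parameters are illustrative assumptions,
    not values taken from the paper.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep only non-negative frequencies
    log_mag = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT for clarity; a real pipeline would use an FFT.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            row.append(math.log10(abs(coeff) + 1e-10))
        log_mag.append(row)
    # Min-max scale the log magnitudes to 8-bit grayscale.
    lo = min(min(r) for r in log_mag)
    hi = max(max(r) for r in log_mag)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[int(round((v - lo) * scale)) for v in r] for r in log_mag]


# Synthetic tone (8 cycles per 64-sample frame) standing in for speech.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
image = grayscale_spectrogram(tone)
```

Each row of `image` is one time frame and each column one frequency bin, so for this pure tone the brightest pixel of every row falls in the column matching the tone's frequency. An image like this is the sort of single-channel input a CNN classifier could be trained on.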
Research has shown the efficacy of using convolutional neural networks (CNN) with audio spectrograms...
Deep learning, especially in the form of convolutional neural networks (CNNs), has triggered substan...
Speech recognition is a popular research topic that analyzes human speech. In addition to understand...
The objective of this paper is speaker recognition under noisy and unconstrained conditions. We mak...
The field of artificial intelligence (AI) has long found that it is the things that humans find very...
This paper discusses a transition from the traditional methods to novel deep learning architectures ...
Speaker identification with deep learning commonly uses time-frequency representation of the voice si...
Speaker recognition is a technique of identifying the person talking to a machine using the voice fe...
The objective of this work is speaker recognition under noisy and unconstrained conditions. We make ...
As an important information carrier, sound carries abundant information about the environment, which...
In Automatic Speech Recognition (ASR) the non-linear data projection provided by a one hidden layer ...
Deep learning-based machine learning models have shown significant results in speech recognition and...
Speech emotion recognition is a challenging task in the speech processing field. For this reason, featur...
This work considers training neural networks for speaker recognition with a much smaller dataset siz...