The growing digital storage and transmission of speech and audio necessitates codecs that can reduce the size of audio files. Knowledge about the limits of human hearing allows the creation of perceptual models that enable the removal of information while the perceived audio distortion remains minimal. Applying such perceptual models can be computationally complex and may become a bottleneck in the coding process. This thesis aims to improve the efficiency of applying perceptual models in speech and audio codecs. Two approaches are taken: the first is to use neural networks to approximate the action of the perceptual model, and the second is to improve the efficiency wit...
Abstract: In [I], we have proposed a perceptual distortion measure for speech coders using an audito...
Since early perceptual audio coders such as MP3, the underlying psychoacoustic model that controls t...
This paper presents a new forward masking model for perceptual audio coding. This model ...
The research contained in this thesis provides an investigation into a new method of minimising the ...
Sophisticated audio coding paradigms incorporate human perceptual effects in order to reduce data ra...
© 2019 Association for Computing Machinery. Generative audio models based on neural networks have le...
The human hearing system is the most robust speech processor despite noisy environments. This work p...
A growing need for on-device machine learning has led to an increased interest in light-weight neura...
Machine-learning based approaches to speech enhancement have recently shown great promise for improv...
This paper introduces high-quality audio coding using psychoacoustic models. This technology is now ...
New applications such as Internet broadcast and communications, consumer multimedia products, digita...
Hearing loss research has traditionally been based on perceptual criteria, speech intelligibility an...
This work investigates alternate pre-emphasis filters used as part of the loss function during neura...