Human voices consist of specific patterns of acoustic features that are considerably enhanced during affective vocalizations. Listeners presumably use these acoustic features to discriminate accurately between acoustically or emotionally similar vocalizations. Here we used high-field 7T functional magnetic resonance imaging in human listeners together with an experimental ‘feature elimination approach’ to investigate the neural decoding of three important voice features in two affective valence categories (i.e., aggressive and joyful vocalizations). We found a valence-dependent sensitivity to vocal pitch (f0) dynamics and to spectral high-frequency cues already at the level of the auditory thalamus. Furthermore, pitch dynamics a...