The diversity of today’s playback systems requires a flexible, efficient, and immersive reproduction of sound scenes in digital media. Spatial audio reproduction based on primary-ambient extraction (PAE) fulfills this objective, where accurate extraction of primary and ambient components from sound mixtures in channel-based audio is crucial. Severe extraction error was found in existing PAE approaches when dealing with sound mixtures that contain a relatively strong ambient component, a commonly encountered case in the sound scenes of digital media. In this paper, we propose a novel ambient spectrum estimation (ASE) framework to improve the performance of PAE. The ASE framework exploits the equal magnitude of the uncorrelated ambient compon...