We describe a novel approach for determining the audio-visual synchrony of a monologue video sequence, using vocal pitch and facial landmark trajectories as descriptors of the audio and visual modalities, respectively. The visual component is represented by the horizontal and vertical displacements of corresponding facial landmarks between consecutive frames. These facial landmarks are obtained with a statistical modeling technique known as the Active Shape Model (ASM). The audio component is represented by the fundamental frequency, or pitch, estimated using the subharmonic-to-harmonic ratio (SHR). The synchrony between the audio and visual feature vectors is computed using Gaussian mutual information. The raw synchrony estimates obtai...
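As a concrete illustration of the final step, the sketch below (a minimal Python example; the variable names, feature dimensions, and the small regularization term are our own assumptions, not the authors' implementation) estimates Gaussian mutual information between a per-frame pitch track and per-frame landmark displacements, using the standard jointly-Gaussian formula I(A;V) = 1/2 log(|Σ_A| |Σ_V| / |Σ_AV|):

```python
import numpy as np

def gaussian_mutual_information(audio_feats, visual_feats, eps=1e-8):
    """Estimate Gaussian mutual information between two aligned feature streams.

    audio_feats  : (T, Da) array, e.g. per-frame pitch values from an SHR tracker.
    visual_feats : (T, Dv) array, e.g. per-frame x/y landmark displacements.
    Both streams are assumed to be sampled at the same frame rate and time-aligned.
    """
    joint = np.hstack([audio_feats, visual_feats])   # (T, Da + Dv)
    cov = np.cov(joint, rowvar=False)                # joint covariance matrix
    da = audio_feats.shape[1]
    cov_a = cov[:da, :da]                            # audio covariance block
    cov_v = cov[da:, da:]                            # visual covariance block
    # I(A;V) = 0.5 * log(|Σ_A| |Σ_V| / |Σ_AV|) for jointly Gaussian features;
    # eps * I regularizes near-singular covariance blocks.
    _, logdet_a = np.linalg.slogdet(cov_a + eps * np.eye(cov_a.shape[0]))
    _, logdet_v = np.linalg.slogdet(cov_v + eps * np.eye(cov_v.shape[0]))
    _, logdet_j = np.linalg.slogdet(cov + eps * np.eye(cov.shape[0]))
    return 0.5 * (logdet_a + logdet_v - logdet_j)

# Hypothetical usage (helper functions are placeholders, not real library calls):
# f0   = shr_pitch_track(audio)            # shape (T, 1)
# disp = landmark_displacements(frames)    # shape (T, 2 * num_landmarks)
# score = gaussian_mutual_information(f0, disp)
```

Evaluated over short sliding windows rather than the whole sequence, the same computation yields local synchrony scores, in line with the windowed estimates the abstract refers to.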
The phenomenon of anticipatory coarticulation provides a basis for the observed asynchrony between ...
In this paper, we propose a novel method that exploits correlation between audio-visual dynamics of ...
In this paper, we address the problem of lip-voice synchronisation in videos containing human face a...
Detecting local audio-visual synchrony in monologues utilizing vocal pitch and facial landmark traje...
Previous research suggests that people are rather poor at perceiving auditory-visual (AV) speech asy...
This thesis presents a computational framework to jointly analyze auditory and visual information. T...
The role of audio–visual speech synchrony for speaker diarisation is investigated on the multiparty ...
Audiovisual speech synchrony detection is an important part of talking-face verification s...
This paper proposes a method for recovering audio-visual synchronization of multimedia content. It expl...
This paper presents a novel method to correlate audio and visual data generated by the same physical...
Humans can extract the speech signals they need to understand from a mixture of background noise, in...
In this paper, we address the problem of automatic discovery of speech patterns using audio-visual i...
The fine temporal structure of the relations between acoustic and visual features has been investigated to im...
Psychophysical and physiological evidence shows that sound localization of acoustic signals is stro...
In our approach, we aim at an objective measurement of synchrony in multimodal behavior. The use of ...