Audio segmentation and speaker localization in meeting videos

Himanshu Vajaria
Tanmoy Islam
Sudeep Sarkar
Ravi Sankar
Ranga Kasturi

Publication date

January 2006

DOI

10.1109/icpr.2006.283

Abstract

Segmenting different individuals in a group meeting and their speech is an important first step for various tasks such as meeting transcription, automatic camera panning, multimedia retrieval and monologue detection. In this effort, given a meeting room video, we attempt to segment each individual's speech and localize the speakers in the video, based on data from a single audio and video source. The segmentation method is driven by audio and enhanced by video cues. We use the Bayesian Information Criterion (BIC) to segment the feature vector streams and graph spectral partitioning to cluster them. We compare our segmentation results with an audio-only segmentation method, and our localization technique with the commonly used mutual information approach.
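
As an illustration of the BIC-based segmentation step mentioned in the abstract, the following is a minimal sketch of the widely used Gaussian delta-BIC test for a single candidate change point. It is not taken from the paper: the function name `delta_bic`, the `penalty_weight` parameter, the full-covariance Gaussian model, and the small regularization term are all assumptions made for this example, and the abstract does not specify which features or penalty setting the authors used.

```python
import numpy as np


def delta_bic(features, split, penalty_weight=1.0):
    """Delta-BIC for splitting a feature stream at index `split`.

    Compares one full-covariance Gaussian over all frames against two
    Gaussians (frames before and after `split`). A positive value favours
    a change point at `split`. Hypothetical helper, not the paper's code.
    """
    x = np.asarray(features, dtype=float)
    n, d = x.shape
    left, right = x[:split], x[split:]

    def logdet_cov(block):
        # Maximum-likelihood covariance; reshape covers the d == 1 case.
        cov = np.cov(block, rowvar=False, bias=True).reshape(d, d)
        # Small ridge term (an assumption) to keep the matrix well conditioned.
        cov = cov + 1e-6 * np.eye(d)
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # Extra parameters of the two-model hypothesis: d means plus
    # d(d+1)/2 covariance entries, weighted by log(n).
    penalty = 0.5 * penalty_weight * (d + 0.5 * d * (d + 1)) * np.log(n)

    return (0.5 * n * logdet_cov(x)
            - 0.5 * len(left) * logdet_cov(left)
            - 0.5 * len(right) * logdet_cov(right)
            - penalty)
```

In a typical use, the test would be evaluated at each candidate frame inside a sliding window, and a boundary would be declared where the value is positive and maximal. The resulting segments could then be clustered, for example by the graph spectral partitioning the abstract refers to.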
