In this paper, we propose a novel method that exploits the correlation between the audio and visual dynamics of a video to segment and localize the objects that are the dominant source of audio. Our approach consists of a two-step spatiotemporal segmentation mechanism that relies on the velocity and acceleration of moving objects as visual features. Each frame of the video is first segmented into regions based on motion and appearance cues using the QuickShift algorithm; these regions are then clustered over time using K-means to obtain a spatiotemporal video segmentation. The video is represented by motion features computed over individual segments. The Mel-Frequency Cepstral Coefficients (MFCC) of the audio signal, and their first-order derivatives, are exploited to...
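The velocity and acceleration features mentioned above can be illustrated with simple finite differences. The following is only a minimal sketch under stated assumptions, not the method of the paper: the names `finite_difference` and `motion_features`, the use of a segment centroid track as input, and the forward-difference scheme are all hypothetical choices made for illustration.

```python
def finite_difference(series, dt=1.0):
    """Forward difference of a sequence of equal-length tuples.

    Returns a list that is one element shorter than the input.
    """
    return [
        tuple((b - a) / dt for a, b in zip(prev, curr))
        for prev, curr in zip(series, series[1:])
    ]


def motion_features(centroids, fps=25.0):
    """Velocity and acceleration of one segment from its centroid track.

    Assumption: `centroids` holds one (x, y) position per frame for a
    single spatiotemporal segment; the paper may compute motion differently.
    """
    dt = 1.0 / fps
    velocity = finite_difference(centroids, dt)      # first derivative
    acceleration = finite_difference(velocity, dt)   # second derivative
    return velocity, acceleration


# Hypothetical centroid track of a segment speeding up along x.
track = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
velocity, acceleration = motion_features(track, fps=1.0)
print(velocity)
print(acceleration)
```

The same `finite_difference` helper could be applied to a sequence of MFCC vectors to approximate their first-order derivatives, although audio toolkits commonly use a regression over several neighboring frames instead of a two-point difference.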
Abstract: Moving object detection and tracking in a video sequence is a crucial task in many compute...
This thesis presents a computational framework to jointly analyze auditory and visual information. T...
Vast amounts of digital multimedia data are being produced and distributed today, so methods for th...
In this paper, we investigate techniques to localize the sound source in video made using one microp...
Technological developments and innovations of the first forty years of the digital era have primaril...
We propose a novel method to automatically detect and extract the video modality of the sound source...
This paper presents a novel method to correlate audio and visual data generated by the same physical...
Music videos are good examples of multimedia documents in which the structures of the audio and vide...
We propose a novel method to automatically extract the audio-visual objects that are present in a sc...
Object-based video representation is an essential step towards multimedia communications. Using vide...