Previous methods based on 3D CNNs, ConvLSTM, or optical flow have achieved great success in video salient object detection (VSOD). However, they still suffer from high computational cost or low-quality saliency maps. To address these problems, we design a space-time memory (STM)-based network that extracts useful temporal information for the current frame from adjacent frames and serves as the temporal branch of VSOD. Furthermore, previous methods only considered single-frame prediction without temporal association, so the model may not attend sufficiently to temporal information. We therefore introduce inter-frame object motion prediction into VSOD for the first time. Our model follows a standard encoder-decoder architecture. I...
The object detection problem is composed of two main tasks, object localization and object classifi...
Convolutional neural networks have achieved excellent success in object recognition in still imag...
Video surveillance outputs different portrait information of scenes such as crime investigation, sec...
Semantics and motion are two essential cues for success in video salient object detection. Most...
In this paper, we propose a computationally efficient and consistently accurate spatiotemporal salie...
This paper proposes a deep learning model to efficiently detect salient regions in videos. It addres...
Spatiotemporal information is essential for video salient object detection (VSOD) due to the highly ...
Salient object detection (SOD) in panoramic video is still in the initial exploration stage. The ind...
Unsupervised video object segmentation (VOS) is a task that aims to detect the most salient object i...
As a fundamental task in computer vision, object detection aims to locate visual objects of pre-define...
Video salient object detection models trained on pixel-wise dense annotation have achieved excellent...
Despite their great predictive capability, Convolutional Neural Networks (CNNs) are computational-ex...
Salient object detection is a critical and active field that aims at the detection of objects in a v...
This paper presents the novel idea of generating object proposals by leveraging temporal information...
In this dissertation, I present my work towards exploring temporal information for better video unde...