Spatiotemporal information is essential for video salient object detection (VSOD) due to the highly attractive object motion for human's attention. Previous VSOD methods usually use Long Short-Term Memory (LSTM) or 3D ConvNet (C3D), which can only encode motion information through step-by-step propagation in the temporal domain. Recently, the non-local mechanism is proposed to capture long-range dependencies directly. However, it is not straightforward to apply the non-local mechanism into VSOD, because i) it fails to capture motion cues and tends to learn motion-independent global contexts; ii) its computation and memory costs are prohibitive for video dense prediction tasks such as VSOD. To address the above problems, we design a Constrai...
Multimedia applications like retrieval, copy detection etc. can gain from saliency detection, which ...
Semi-supervised video object segmentation is a fundamental yet Challenging task in computer vision. ...
Human vision system actively seeks salient regions and movements in video sequences to reduce the se...
Previous methods based on 3DCNN, convLSTM, or optical flow have achieved great success in video sali...
We propose Spatio-Temporal SlowFast Self-Attention network for action recognition. Conventional Conv...
In this paper, we propose a computationally efficient and consistently accurate spatiotemporal salie...
Semantics and motion are two cues of essence for the success in video salient object detection. Most...
Funding This research was funded by the EU H2020 TERPSICHORE project “Transforming Intangible Folklo...
Salient object detection is a critical and active field that aims at the detection of objects in a v...
For CNN-based visual action recognition, the accuracy may be increased if local key action regions a...
International audienceIn this paper we propose a method for automatic detection of salient objects i...
An adaptive spatiotemporal saliency algorithm for video attention detection using motion vector deci...
Human visual system actively seeks salient regions and movements in video sequences to reduce the se...
In dynamic object detection, it is challenging to construct an effective model to sufficiently char...
We present a new network architecture able to take advantage of spatio-temporal information availabl...
Multimedia applications like retrieval, copy detection etc. can gain from saliency detection, which ...
Semi-supervised video object segmentation is a fundamental yet Challenging task in computer vision. ...
Human vision system actively seeks salient regions and movements in video sequences to reduce the se...
Previous methods based on 3DCNN, convLSTM, or optical flow have achieved great success in video sali...
We propose Spatio-Temporal SlowFast Self-Attention network for action recognition. Conventional Conv...
In this paper, we propose a computationally efficient and consistently accurate spatiotemporal salie...
Semantics and motion are two cues of essence for the success in video salient object detection. Most...
Funding This research was funded by the EU H2020 TERPSICHORE project “Transforming Intangible Folklo...
Salient object detection is a critical and active field that aims at the detection of objects in a v...
For CNN-based visual action recognition, the accuracy may be increased if local key action regions a...
International audienceIn this paper we propose a method for automatic detection of salient objects i...
An adaptive spatiotemporal saliency algorithm for video attention detection using motion vector deci...
Human visual system actively seeks salient regions and movements in video sequences to reduce the se...
In dynamic object detection, it is challenging to construct an effective model to sufficiently char...
We present a new network architecture able to take advantage of spatio-temporal information availabl...
Multimedia applications like retrieval, copy detection etc. can gain from saliency detection, which ...
Semi-supervised video object segmentation is a fundamental yet Challenging task in computer vision. ...
Human vision system actively seeks salient regions and movements in video sequences to reduce the se...