Attention-based models such as transformers have shown outstanding performance on dense prediction tasks, such as semantic segmentation, owing to their ability to capture long-range dependencies in an image. However, the benefit of transformers for monocular depth prediction has seldom been explored so far. This paper benchmarks various transformer-based models for the depth estimation task on the indoor NYUV2 dataset and the outdoor KITTI dataset. We propose a novel attention-based architecture, Depthformer, for monocular depth estimation that uses multi-head self-attention to produce multiscale feature maps, which are effectively combined by our proposed decoder network. We also propose a Transbins module that divides the depth range ...
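To make the multi-head self-attention mentioned in the abstract concrete, the following is a minimal NumPy sketch of generic scaled dot-product multi-head self-attention over a sequence of image-patch tokens. It is not the Depthformer encoder, decoder, or Transbins module: the function names, weight shapes, and toy sizes are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Generic multi-head self-attention (hypothetical helper, not the paper's code).

    tokens: (n, d) array, one row per image patch.
    w_q, w_k, w_v, w_o: (d, d) projection matrices.
    Returns the (n, d) output and the (num_heads, n, n) attention weights.
    """
    n, d = tokens.shape
    assert d % num_heads == 0, "embedding size must be divisible by num_heads"
    head_dim = d // num_heads

    def split_heads(x):
        # (n, d) -> (num_heads, n, head_dim)
        return x.reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    q = split_heads(tokens @ w_q)
    k = split_heads(tokens @ w_k)
    v = split_heads(tokens @ w_v)

    # Every token attends to every other token, which is what gives
    # attention its long-range receptive field.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, n, n)
    weights = softmax(scores, axis=-1)
    context = weights @ v  # (heads, n, head_dim)

    # Concatenate the heads back into (n, d) and apply the output projection.
    merged = context.transpose(1, 0, 2).reshape(n, d)
    return merged @ w_o, weights


# Toy example: 16 patch tokens with 8 channels and 2 heads (assumed sizes).
rng = np.random.default_rng(0)
num_tokens, dim, heads = 16, 8, 2
tokens = rng.normal(size=(num_tokens, dim))
w_q, w_k, w_v, w_o = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(4))
out, attn = multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, heads)
```

Each row of `attn[h]` is a probability distribution over all patches, so a single layer can relate distant image regions. Producing multiscale feature maps would additionally require running attention at several spatial resolutions, which this sketch does not attempt.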
While convolutional neural networks have shown a tremendous impact on various computer vision tasks,...
Monocular 3D object detection has long been a challenging task in autonomous driving, which requires...
Depth estimation plays an important role in the robotic perception system. Self-supervised monocular...
In the process of environmental perception, traditional CNN is often unable to effectively capture g...
With an unprecedented increase in the number of agents and systems that aim to navigate the real wor...
Multi-frame depth estimation improves over single-frame approaches by also leveraging geometric rela...
The exploration of mutual-benefit cross-domains has shown great potential toward accurate self-super...
Monocular omnidirectional depth estimation is receiving considerable research attention due to its b...
The latest research in computer vision highlighted the effectiveness of the vision transformers (ViT...
Self-supervised monocular depth estimation has been widely studied recently. Most of the work has fo...
Most approaches for semantic segmentation use only information from color cameras to parse the scene...
Depth estimation from a single image represents a very exciting challenge in computer vision. While ...
Depth estimation from a single image is an important task that can be applied to various fields in c...