Masked autoencoding has achieved great success for self-supervised learning in the image and language domains. However, mask-based pretraining has yet to show benefits for point cloud understanding, likely because standard backbones like PointNet cannot properly handle the train-test distribution mismatch introduced by masking during training. In this paper, we bridge this gap by proposing a discriminative mask pretraining Transformer framework, MaskPoint, for point clouds. Our key idea is to represent the point cloud as discrete occupancy values (1 if part of the point cloud; 0 if not), and perform simple binary classification between masked object points and sampled noise points as the proxy task. In this way, our ap...
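The proxy task described above (label masked object points 1 and sampled noise points 0) can be illustrated with a minimal data-construction sketch. This is not the MaskPoint implementation: the function name `build_occupancy_queries`, the `mask_ratio` and `num_noise` parameters, and the choice to draw noise uniformly from the axis-aligned bounding box are all assumptions made for illustration, and the abstract does not specify how points are masked or how noise is sampled.

```python
import random


def build_occupancy_queries(points, mask_ratio=0.5, num_noise=None, seed=0):
    """Split a point cloud into visible points and labelled query points.

    Returns (visible, queries, labels), where labels[i] is 1 if queries[i]
    is a masked point taken from the cloud and 0 if it is a noise point.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    points = list(points)

    # Randomly choose which points are hidden from the encoder.
    order = list(range(len(points)))
    rng.shuffle(order)
    n_masked = int(len(points) * mask_ratio)
    masked = [points[i] for i in order[:n_masked]]
    visible = [points[i] for i in order[n_masked:]]

    # Assumption: noise is sampled uniformly inside the bounding box of the cloud.
    lo = [min(p[d] for p in points) for d in range(3)]
    hi = [max(p[d] for p in points) for d in range(3)]
    if num_noise is None:
        num_noise = n_masked
    noise = [
        tuple(rng.uniform(lo[d], hi[d]) for d in range(3))
        for _ in range(num_noise)
    ]

    # Occupancy targets: 1 for real (masked) points, 0 for noise points.
    queries = masked + noise
    labels = [1] * len(masked) + [0] * len(noise)
    return visible, queries, labels
```

A model trained on this task would encode `visible` and predict the binary label of each entry in `queries`. A uniformly sampled noise point can occasionally land very close to the true surface; how such cases are handled is left open here.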
We describe a simple pre-training approach for point clouds. It works in three steps: 1. Mask all po...
Masked Autoencoders (MAE) have shown promising performance in self-supervised learning for both 2D a...
Learning representations for point clouds is an important task in 3D computer vision, especially wit...
Transformer-based Self-supervised Representation Learning methods learn generic features from unlabe...
We present Point-BERT, a new paradigm for learning Transformers to generalize the concept of BERT to...
Masked auto-encoding is a popular and effective self-supervised learning approach to point cloud lea...
Self-supervised learning is attracting large attention in point cloud understanding. However, explor...
In autonomous driving, the 3D LiDAR (Light Detection and Ranging) point cloud data of the target are...
Transformers and masked language modeling are quickly being adopted and explored in computer vision ...
While image data starts to enjoy the simple-but-effective self-supervised learning scheme built upon...
Reducing the quantity of annotations required for supervised training is vital when labels are scarc...
For unsupervised pretraining, mask-reconstruction pretraining (MRP) approaches randomly mask input p...
Self-attention is of vital importance in semantic segmentation as it enables modeling of long-range ...