Semi-supervised learning methods such as self-training can leverage unlabeled data, which is widely available, rather than relying solely on labeled data as many successful supervised methods do. A core step of self-training is to use a trained model to produce pseudo-labels for unlabeled samples and then select a subset of those samples to add to the labeled dataset. A common selection criterion is to keep the samples on which the model's predictions have high confidence. However, many models are poorly calibrated: their confidence scores do not reliably reflect the true probability that a prediction is correct. Using confidence scores in this manner may therefore add more incorrectly labeled samples to the training dataset than expected...
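The confidence-thresholded selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name `select_pseudo_labels` and the threshold value are assumptions for the example.

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.9):
    """Select unlabeled samples whose maximum predicted class
    probability meets the confidence threshold.

    probs: (n_samples, n_classes) array of predicted class probabilities.
    Returns the indices of the selected samples and their pseudo-labels.
    """
    confidence = probs.max(axis=1)      # model's confidence per sample
    labels = probs.argmax(axis=1)       # pseudo-label = most likely class
    mask = confidence >= threshold      # keep only high-confidence samples
    return np.flatnonzero(mask), labels[mask]

# Toy predicted probabilities for 4 unlabeled samples, 2 classes.
probs = np.array([
    [0.95, 0.05],   # confident  -> selected, pseudo-label 0
    [0.60, 0.40],   # uncertain  -> skipped
    [0.10, 0.90],   # confident  -> selected, pseudo-label 1
    [0.55, 0.45],   # uncertain  -> skipped
])
idx, labels = select_pseudo_labels(probs, threshold=0.9)
print(idx.tolist(), labels.tolist())  # [0, 2] [0, 1]
```

If the model is miscalibrated, a sample like the first row may carry 0.95 confidence while its true probability of being correct is much lower, which is exactly how incorrect pseudo-labels slip past the threshold.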