Semi-supervised learning methods such as self-training can leverage unlabeled data, which is widely available, rather than relying solely on labeled data as many successful supervised methods do. A core step of self-training is to use a trained model to produce pseudo-labels for unlabeled samples and then select a subset of those samples to add to the labeled dataset. A common selection criterion is to keep the samples on which the model's predictions have high confidence. However, many models are poorly calibrated: their confidence scores do not reliably reflect the true probability that a prediction is correct. Using confidence scores in this manner may therefore add more incorrectly labeled samples to the training dataset than expected...
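The confidence-thresholded selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name `select_pseudo_labels` and the threshold value are assumptions for the example.

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.9):
    """Select unlabeled samples whose maximum predicted class
    probability meets the confidence threshold.

    probs: (n_samples, n_classes) array of predicted class probabilities.
    Returns the indices of the selected samples and their pseudo-labels.
    """
    confidence = probs.max(axis=1)      # model's confidence per sample
    labels = probs.argmax(axis=1)       # pseudo-label = most likely class
    mask = confidence >= threshold      # keep only high-confidence samples
    return np.flatnonzero(mask), labels[mask]

# Toy predicted probabilities for 4 unlabeled samples, 2 classes.
probs = np.array([
    [0.95, 0.05],   # confident  -> selected, pseudo-label 0
    [0.60, 0.40],   # uncertain  -> skipped
    [0.10, 0.90],   # confident  -> selected, pseudo-label 1
    [0.55, 0.45],   # uncertain  -> skipped
])
idx, labels = select_pseudo_labels(probs, threshold=0.9)
print(idx.tolist(), labels.tolist())  # [0, 2] [0, 1]
```

If the model is miscalibrated, a sample like the first row may carry 0.95 confidence while its true probability of being correct is much lower, which is exactly how incorrect pseudo-labels slip past the threshold.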