Text classification is one of the most fundamental tasks in Natural Language Processing. It remains challenging to use unlabeled data effectively in text classification and to apply weakly supervised learning methods that improve on what the existing labeled data alone can achieve, especially for supervision-starved tasks (those for which high-quality labeled data is hard to obtain). In this PhD thesis, we present several studies of weakly supervised learning methods in text classification. We first focus on improving accuracy and interpretability in text classification tasks using weakly supervised learning methods with the help of unlabeled data. More specifically, we propose several new methods to further improve the accuracy and interpre...
Owing to the prohibitive costs of generating large amounts of labeled data, programmatic weak superv...
This thesis focuses on how unlabeled data can improve supervised learning classifiers in all contex...
We introduce a Bayesian model, BayesANIL, that is capable of estimating uncertainties associated wi...
The cluster assumption is exploited by most semi-supervised learning (SSL) methods. However, if the...
Weakly supervised text classification methods typically train a deep neural classifier based on pseu...
In settings where Semi-Supervised Learning (SSL) is available to exploit unlabeled data, this...
As the availability of unstructured data on the web continues to increase, it is becoming increasing...
For high-resource languages like English, text classification is a well-studied task. The performanc...
document are those of the author and should not be interpreted as representing the official policies...
Machine learning is a garbage-in-garbage-out system, which relies on high-quality labeled data to tr...
In many important text classification problems, acquiring class labels for training documents is cos...
Building machine learning models for natural language understanding (NLU) tasks relies heavily on la...
Obtaining labeled data to train natural language machine learning algorithms is often expen...
Solving text classification in a weakly supervised manner is important for real-world applications w...
Weakly supervised learning aims to learn predictive models from partially supervised data, an ea...