In this paper, we evaluate the Lbl2Vec approach for unsupervised text document classification. Lbl2Vec requires only a small number of keywords describing the respective classes to create semantic label representations. For classification, Lbl2Vec uses cosine similarities between label and document representations and requires no annotation information. We show that Lbl2Vec significantly outperforms common unsupervised text classification approaches and a widely used zero-shot text classification approach. Furthermore, we show that using more precise keywords can significantly improve the classification results of similarity-based text classification approaches.
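The similarity-based classification step described above can be illustrated with a minimal sketch. This is not the Lbl2Vec implementation: the vectors below are hand-picked toy values standing in for learned label and document representations, and the function names (`cosine_similarity_matrix`, `classify`) are hypothetical. The sketch only shows how each document can be assigned to the label with the highest cosine similarity, without using any annotated training data.

```python
import numpy as np


def cosine_similarity_matrix(doc_vectors, label_vectors):
    """Return an (n_docs, n_labels) matrix of cosine similarities."""
    # Normalize each row to unit length so the dot product equals the cosine.
    docs = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    labels = label_vectors / np.linalg.norm(label_vectors, axis=1, keepdims=True)
    return docs @ labels.T


def classify(doc_vectors, label_vectors, label_names):
    """Assign each document to the label with the highest cosine similarity."""
    sims = cosine_similarity_matrix(doc_vectors, label_vectors)
    best = sims.argmax(axis=1)
    return [label_names[i] for i in best], sims


# Toy 3-dimensional embeddings (assumed values, for illustration only).
label_names = ["sports", "politics"]
label_vectors = np.array([[1.0, 0.1, 0.0],
                          [0.0, 0.2, 1.0]])
doc_vectors = np.array([[0.9, 0.2, 0.1],
                        [0.1, 0.1, 0.8],
                        [0.7, 0.0, 0.3]])

predictions, sims = classify(doc_vectors, label_vectors, label_names)
print(predictions)
```

In a real setting, the label and document vectors would come from an embedding model sharing one vector space; the argmax rule stays the same.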
We created and analyzed a text classification dataset from freely-available web documents from the U...
Manually labeling documents for training a text classifier is expensive and time-consuming. Moreover...
Automatic text classification is the task of organizing documents into pre-determined classes, gener...
Multi-label text categorization is a supervised learning task in which a document is associated with mult...
Large-scale multi-label text classification (LMTC) aims to associate a document with its relevant la...
In this paper, we systematically study the problem of dataless hierarchical text classification. Unl...
In recent years, we have seen an increasing amount of interest in low-dimensional vector representat...
Every year, around 28,100 journals publish 2.5 million research publications. Search engine...
In this paper, we propose an extension of the χ-Sim co-clustering algorithm to deal with the text ca...
Text classification is a foundational task in many NLP applications. The text classification task in...
Semantic learning is an important mechanism for document classification, but most classification...
We introduce DocSCAN, a completely unsupervised text classification approach using Semantic Clusteri...
Conventional multi-label classification algorithms treat the target labels of the classification tas...
The thesis studies the problem of multi-label text classification, and argues that it could benefit ...
The semantic comparison of short sections of text is an emerging aspect of Natural Language Processi...