DMNLP, co-located with the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD)

Text clustering and topic learning are two closely related tasks. In this paper, we show that topics can be learnt without strictly requiring an exact categorization. In particular, experiments on two real case studies with a vocabulary based on bigram features yield readable topics that cover most of the documents. Precision at 10 reaches up to 74% for a dataset of scientific abstracts with 10,000 features, which is 4% less than when using unigrams only, but the resulting topics are more interpretable.
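To make the two quantities in the abstract concrete, the following is a minimal sketch of a capped bigram vocabulary and of precision at k. It is not the authors' pipeline: the function names (`bigrams`, `build_vocabulary`, `precision_at_k`), the whitespace tokenization, and the frequency-based feature selection are all assumptions made for illustration, and the abstract does not state how the paper itself selects features or what exactly is ranked when precision at 10 is computed.

```python
from collections import Counter


def bigrams(tokens):
    """Return adjacent word pairs, each joined by a single space."""
    return [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


def build_vocabulary(documents, max_features):
    """Keep the max_features most frequent bigrams across the corpus.

    Assumption: lowercase + whitespace tokenization and raw frequency
    ranking; the paper's actual preprocessing is not given in the abstract.
    """
    counts = Counter()
    for text in documents:
        counts.update(bigrams(text.lower().split()))
    return [term for term, _ in counts.most_common(max_features)]


def precision_at_k(ranked_items, relevant_items, k=10):
    """Fraction of the first k ranked items that appear in relevant_items."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_items[:k]
    return sum(1 for item in top if item in relevant_items) / k
```

With `max_features=10_000` the vocabulary would match the feature budget quoted in the abstract; the ranking passed to `precision_at_k` could be, for example, the documents most strongly associated with a topic, though that choice is also an assumption here.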
Topics extraction has become increasingly important due to its effectiveness in many tasks, includin...
Topic modeling is an unsupervised learning task that discovers the hidden topics in a ...
Document clustering and topic modeling are two closely related tasks which can mutually benefit each...
10th International Conference on Applications of Natural Language to Information Systems, NLDB 2005,...
In this paper, we introduce a new clustering algorithm for discovering and describing the topics com...
The goal of topic detection or topic modelling is to uncover the hidden topics in a large corpus. It...
Discovery of latent semantic groupings and identific...
Recent work incorporates pre-trained word embeddings such as BERT embeddings into Neural Topic Model...
Most documents are about more than one subject, but the majority of natural language processing algo...
Abstract: "The world wide web represents vast stores of information. However, the sheer amount of su...
Topic modeling algorithms, such as LDA, find topics, hidden structures, in document corpora in an un...
Topics extraction from documents has become increasingly important due to its effectiveness in many ...
Topic indexing is the task of identifying the main topics covered by a document. These are useful fo...