Abstract
Background: The need to organize large document collections in a manner that facilitates human comprehension has become crucial with the increasing volume of information available. Two common approaches to providing a broad overview of the information space are document clustering and topic modeling. Clustering aims to group documents or terms into meaningful clusters. Topic modeling, on the other hand, focuses on finding coherent keywords that describe the topics appearing in a set of documents. In addition, there have been efforts to cluster documents and find keywords simultaneously.
Results: We present an algorithm for analyzing document collections that is based on the notion of a theme, defined as a dual representation based on...
Topic models provide a useful tool to organize and understand the structure of large corpora of text...
Objectives: Traditionally, summarization of research themes and trends within a given discipline was...
When analyzing a document collection, a key piece of information is the number of distinct topics it...
Improving the search and browsing experience in PubMed® is a key component in helping users detect...
Abstract. In this paper we introduce a novel document clustering approach that solves some major pro...
In this work, we study the problem of characterizing an unlabelled corpus of biomedical documents in...
The number of online documents has grown tremendously in recent years, which poses challenges for info...
In this paper, we introduce a new clustering algorithm for discovering and describing the topics com...
Topics extraction has become increasingly important due to its effectiveness in many tasks, includin...
Topics extraction from documents has become increasingly important due to its effectiveness in many ...
The massive growth of biomedical text makes it very challenging for researchers to review all releva...
Topics extraction has become increasingly important due to its effectiveness in many tasks, includin...
Biomedical data exists in the form of journal articles, research studies, electronic health records,...
There are many scenarios where we may want to find pairs of textually similar documents in a large c...
Document clustering is a text mining technique used to provide better document search and browsing i...