Automatic classification of scientific articles based on common characteristics is an interesting problem with many applications in digital library and information retrieval systems. Properly organized articles can be useful for automatic generation of taxonomies in scientific writing, text summarization, and efficient information retrieval. Generating article bundles from a large number of input articles, based on the features associated with those articles, is a tedious and computationally expensive task. In this report, we propose an automatic two-step approach for topic extraction and bundling of related articles from a set of scientific articles in real time. For topic extraction, we make use of Latent Dirichlet Allocation (LDA) topic mode...
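As a rough illustration of the topic-extraction step only, the following minimal sketch fits LDA with collapsed Gibbs sampling in plain Python. It is not the implementation used in the report: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy documents are all assumptions chosen for illustration.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch, assumed settings).

    docs is a list of token lists. Returns (theta, topic_word, vocab), where
    theta[d][k] is the estimated share of topic k in document d.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables used by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token
                # (unnormalized; random.choices normalizes the weights).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return theta, topic_word, vocab

# Toy corpus (hypothetical data, for illustration only).
docs = [
    "topic model latent dirichlet allocation".split(),
    "latent topic model inference".split(),
    "information retrieval digital library".split(),
    "digital library retrieval system".split(),
]
theta, topic_word, vocab = lda_gibbs(docs, num_topics=2)
```

Each row of `theta` is a distribution over topics for one article, which is the kind of per-article feature a subsequent bundling step could consume. For a real corpus, an established library implementation would normally be preferable to this hand-written sampler.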
In publication-driven domains such as the scientific community, the availability of topic information i...
We describe the methodology that we followed to automatically extract topics corresponding to known ...
Identifying the topic of an article can involve a lot of manual work. The manual processes can be ex...
Topic modeling is a type of statistical model for discovering the latent "topics" that occur in a co...
We experiment with an automated topic extraction algorithm based on a generative graphical model. La...
The article addresses the problem of document clustering. The author describes a technology for ...
Recently, a probabilistic topic modelling approach, Latent Dirichlet Allocation (LDA), has been exte...
With the vast amount of information available on the Internet today, helping users find relevant con...
Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requ...
With the rapidly growing number of scientific publications, researchers face an increasing challenge...
Unsupervised statistical analysis of unstructured data has gained wide acceptance especially in natu...
In this paper, I apply Latent Dirichlet Allocation (LDA) to cluster 100,000 health-related articles u...
From literature surveys to legal document collections, people need to organize and explore large amo...
In this paper, we introduce a new clustering algorithm for discovering and describing the topics com...
Collections of research article data harvested from the web have become common recently since they a...