Blogs are a new kind of internet phenomenon and a vast, ever-growing information resource. Mining blog files for information is a very new research direction in data mining. Blog files differ from standard web files and may need specialized mining strategies. We propose to include the title, body, and comments of blog pages when building datasets for clustering blog documents. In particular, we argue that the author/reader comments may have a stronger discriminating effect in clustering blog documents. We constructed a word-page matrix by downloading blog pages from a well-known website and experimented with a k-means clustering algorithm, assigning different weights to the title, body, and comment parts. Our experimental resul...
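The weighted word-page matrix and k-means step described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' setup: the toy pages, the weight values (title 1, body 1, comments 2), whitespace tokenization, unit-length normalization of raw weighted counts, and the deterministic farthest-point initialization. The function names `weighted_row`, `normalize`, and `kmeans` are hypothetical.

```python
import math
from collections import Counter

def weighted_row(page, weights, vocab):
    """Build one row of the word-page matrix: each token's count is
    scaled by the weight of the part (title/body/comments) it occurs in."""
    counts = Counter()
    for part, weight in weights.items():
        for token in page.get(part, "").lower().split():
            counts[token] += weight
    return [counts[term] for term in vocab]

def normalize(row):
    """Scale a row to unit length so long pages do not dominate."""
    norm = math.sqrt(sum(x * x for x in row))
    return [x / norm for x in row] if norm else row

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iters=20):
    """Plain k-means with farthest-point initialization (an assumption;
    the abstract does not say how the centroids were initialized)."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each page goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(row, centroids[c]))
                  for row in rows]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy pages: two about cooking, two about football.
pages = [
    {"title": "pasta recipe", "body": "boil pasta add sauce",
     "comments": "great recipe tasty sauce"},
    {"title": "soup recipe", "body": "simmer soup add salt",
     "comments": "tasty soup nice recipe"},
    {"title": "football match", "body": "team scored late goal",
     "comments": "great goal what a match"},
    {"title": "league results", "body": "team won the match",
     "comments": "goal of the season"},
]

# Illustrative weights only; comments are weighted more heavily here.
weights = {"title": 1.0, "body": 1.0, "comments": 2.0}

vocab = sorted({tok for p in pages for part in p.values()
                for tok in part.lower().split()})
matrix = [normalize(weighted_row(p, weights, vocab)) for p in pages]
labels = kmeans(matrix, k=2)
print(labels)  # the two cooking pages share one label, the two football pages the other
```

Changing the entries of `weights` and rerunning shows how emphasis on one part of the page shifts the rows of the matrix, which is the kind of comparison the abstract describes.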
Abstract—Web content clustering is a very important part of the topic detection and tracking issue. In our...
This paper investigates graph-based approaches to labeled topic clustering of reader comments in onl...
Abstract — The analysis of weblogs has become a popular area of natural language processing. Due to ...
Abstract. The analysis of blogs is emerging as an exciting new area in the text processing field whi...
The Web has experienced an exponential growth in the use of weblogs or blogs. Blog entries are gener...
The blogosphere is expanding at an unprecedented speed. A better understanding of the blogosphere can gr...
We investigate the identification of facets of query-biased sets of blog posts. Given a set of blog ...
There is an increasing number of people reading, writing, and commenting on blogs. According to a re...
This paper addresses clustering of blog users and posts in the blogosphere. First, we model the blogosphere ...
Blog classification is the task of classifying blogs based on pre-defined categories. This area is...
User-generated content in general, and blogs in particular, form an interesting and relatively littl...
Additional contributor: Daniel Boley (faculty mentor). According to Java et al., some of the main int...