Topical noise in blogs arises when bloggers digress from the central topical thrust of their blogs. We introduce a method to explicitly incorporate a model of topical noise into a language modeling approach to the task of blog distillation. Topical noise is integrated into the model using a coherence score, which reflects the tightness of the topical structure of a blog. Tests performed on the TRECBlog06 corpus show that a naive integration of the coherence score as a blog prior fails to achieve performance improvements. Instead, we develop a set of more sophisticated models in which the coherence score is weighted by a function of the blog retrieval score. The proposed models help improve the effectiveness of our language modeling approach to th...
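The contrast between the naive prior and the weighted variants can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the function names, the choice to apply the weight as an exponent on the coherence score, and the scaling of both scores to the interval (0, 1].

```python
from typing import Callable, Dict, List, Tuple


def naive_score(retrieval: float, coherence: float) -> float:
    """Coherence used directly as a blog prior: retrieval score times coherence."""
    return retrieval * coherence


def weighted_score(
    retrieval: float,
    coherence: float,
    weight_fn: Callable[[float], float],
) -> float:
    """Coherence raised to a power that depends on the retrieval score.

    Hypothetical combination: weight_fn(retrieval) == 0 ignores coherence,
    weight_fn(retrieval) == 1 reduces to the naive prior.
    """
    return retrieval * coherence ** weight_fn(retrieval)


def rank(
    blogs: Dict[str, Tuple[float, float]],
    weight_fn: Callable[[float], float],
) -> List[str]:
    """Return blog ids sorted by weighted score, best first."""
    scored = {
        blog_id: weighted_score(retrieval, coherence, weight_fn)
        for blog_id, (retrieval, coherence) in blogs.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Made-up (retrieval score, coherence score) pairs, for illustration only.
example_blogs = {"a": (0.9, 0.2), "b": (0.5, 0.9), "c": (0.1, 1.0)}

# One possible weight function: the retrieval score itself, so coherence
# matters more for blogs that already match the query well.
ranking = rank(example_blogs, weight_fn=lambda r: r)
```

With a constant weight of 1 the sketch behaves like the naive prior, and with a constant weight of 0 it falls back to the plain retrieval score, so the weight function controls how strongly coherence is allowed to reorder the ranking.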
Discourse coherence is an important aspect of text quality that refers to the way different textual ...
The paper is focused on blogosphere research based on the TREC blog distillation task, and aims to e...
In this paper, we propose an algorithm called coherence hidden Markov model (HMM) to extra...
User generated content in general, and blogs in particular, form an interesting and relatively littl...
Abstract. User generated content in general, and blogs in particular, form an interesting and relat...
We address the task of (blog) feed distillation: to find blogs that are principally devoted to a giv...
In this paper we examine the effects of noise when creating a real-world weblog corpus for informati...
Topical blog post retrieval is the task of ranking blog posts with respect to their relevance for a...
Topical blog post retrieval is the task of ranking blog posts with respect to their relevance for a ...
Faceted blog distillation aims at retrieving the blogs that are not only relevant to a query but als...
We describe a method for discovering irregularities in temporal mood patterns appearing in a large c...
We describe our participation in the TREC 2007 Blog track. In the opinion task we looked at the diff...
The article discusses the linear cohesion regularities of the blogger’s initiating message and the r...
Since the advent of the Internet, it has become one of the most important channels for communi...
doi:10.4156/jdcta.vol4.issue8.9. With the increasing number of blog users, the traditional blog search can ...