Dividing documents into topically coherent units and discovering the topic of each unit might have many uses. We present a system that proceeds in two steps: (1) the input text is segmented at places where a topic shift is probable, and (2) lexical chains are extracted from each segment as indicators of its topic. Two implementations, both built on public-domain resources, are presented: one based on WordNet and the other on Roget's thesaurus. An evaluation of the algorithm shows that lexical chains are acceptable as topic indicators with of precision and of recall.
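The two steps described above can be illustrated with a minimal sketch. It is not the system evaluated here: the tiny TOY_THESAURUS dictionary is a hypothetical stand-in for WordNet or Roget's thesaurus, the boundary rule (cosine similarity of adjacent sentences below a threshold) is an assumed simplification, and the function names segment, lexical_chains and topic_of are invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy thesaurus (word -> concept), standing in for WordNet / Roget.
TOY_THESAURUS = {
    "car": "vehicle", "truck": "vehicle", "engine": "vehicle",
    "bank": "finance", "loan": "finance", "interest": "finance",
}


def tokenize(sentence):
    """Lowercase the sentence and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def concepts(sentence):
    """Count thesaurus concepts in a sentence; unknown words are ignored."""
    return Counter(TOY_THESAURUS[w] for w in tokenize(sentence) if w in TOY_THESAURUS)


def cosine(a, b):
    """Cosine similarity between two concept-count vectors."""
    dot = sum(a[c] * b[c] for c in a if c in b)
    norm = (sum(v * v for v in a.values()) * sum(v * v for v in b.values())) ** 0.5
    return dot / norm if norm else 0.0


def segment(sentences, threshold=0.1):
    """Step 1: start a new segment wherever adjacent sentences share too few concepts."""
    if not sentences:
        return []
    segments = [[sentences[0]]]
    for prev, nxt in zip(sentences, sentences[1:]):
        if cosine(concepts(prev), concepts(nxt)) < threshold:
            segments.append([nxt])      # probable topic shift
        else:
            segments[-1].append(nxt)    # same topic continues
    return segments


def lexical_chains(segment_sentences):
    """Step 2: group the words of a segment into chains, one chain per concept."""
    chains = defaultdict(list)
    for sentence in segment_sentences:
        for word in tokenize(sentence):
            if word in TOY_THESAURUS:
                chains[TOY_THESAURUS[word]].append(word)
    return dict(chains)


def topic_of(segment_sentences):
    """Use the longest chain as the topic indicator (None if there is no chain)."""
    chains = lexical_chains(segment_sentences)
    return max(chains, key=lambda c: len(chains[c])) if chains else None


if __name__ == "__main__":
    text = [
        "The car has a new engine.",
        "The truck engine is loud.",
        "The bank approved the loan.",
        "Interest on the loan is high.",
    ]
    for seg in segment(text):
        print(topic_of(seg), lexical_chains(seg))
```

A real implementation would replace the dictionary lookup with thesaurus relations (synonymy, hypernymy, shared Roget categories) and would need word-sense handling, which this sketch leaves out.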
Most topic models, such as latent Dirichlet allocation, rely on the bag of words assumption. However...
In the Natural Language Understanding field, one of the important tasks is topic detection. Given th...
We investigate the problem of text segmentation by topic. Applications for this task include topic...
Most documents are about more than one subject, but the majority of natural language processing algo...
Most documents are about more than one subject, but the majority of natural language processing algor...
This paper presents several methods for topic detection on newspaper articles, using either a genera...
Topic segmentation classically relies on one of two criteria, either finding areas with coherent vo...
Topic segmentation attempts to divide a document into segments, where each segment corresponds to a ...
This paper presents several met...
This paper deals with the problem of automatic topic detection in text documents. The proposed metho...
Detecting topics by extracting keywords from written text using TF-IDF has been studied and successf...
Topic Detection and Tracking (TDT) research has produced some successful statistical tracking system...
In this paper, we review two techniques for topic discovery in collections of text documen...
Topic segmentation traditionally relies on lexical cohesion measured through w...