Clusters in document streams, such as online news articles, can be induced by their textual contents, as well as by the temporal dynamics of their arrival patterns. Can we leverage both sources of information to obtain a better clustering of the documents, and distill information that cannot be extracted from contents alone? In this paper, we propose a novel random process, referred to as the Dirichlet-Hawkes process, that takes both sources of information into account in a unified framework. A distinctive feature of the proposed model is that the preferential attachment of items to clusters according to cluster sizes, present in Dirichlet processes, is now driven by the intensities of cluster-wise self-exciting temporal point proces...
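The idea of replacing size-based preferential attachment with intensity-based attachment can be illustrated with a minimal sketch. It is not the paper's implementation: the exponential triggering kernel, the parameter names (`alpha`, `decay`, `base_rate`) and both function names are assumptions made for illustration, and the textual likelihood of each document is left out entirely. Only the temporal prior over cluster assignments is shown.

```python
import math


def cluster_intensities(t, history, alpha=1.0, decay=1.0):
    """Self-exciting intensity of each existing cluster at time t.

    history is a list of (timestamp, cluster_id) pairs. Each past event
    adds alpha * exp(-decay * (t - timestamp)) to its cluster's intensity.
    The exponential kernel is an illustrative assumption.
    """
    intensities = {}
    for t_i, k in history:
        if t_i < t:
            contribution = alpha * math.exp(-decay * (t - t_i))
            intensities[k] = intensities.get(k, 0.0) + contribution
    return intensities


def assignment_prior(t, history, base_rate=0.5, alpha=1.0, decay=1.0):
    """Prior probability of joining each cluster, or opening a new one.

    An existing cluster is chosen in proportion to its current intensity
    (rather than its size, as in a Dirichlet process); a new cluster is
    chosen in proportion to the constant base_rate (hypothetical name).
    """
    lam = cluster_intensities(t, history, alpha, decay)
    total = base_rate + sum(lam.values())
    probs = {k: v / total for k, v in lam.items()}
    probs["new"] = base_rate / total
    return probs


# Cluster 0 is larger but stale; cluster 1 is smaller but recently active.
history = [(0.0, 0), (0.1, 0), (4.9, 1)]
prior = assignment_prior(5.0, history)
```

In this toy history, a size-based rule would favour cluster 0 (two documents against one), whereas the intensity-based prior favours cluster 1 because its single document arrived just before the new one.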
A challenge created by the recent development in information technology is that people are often fac...
We address here two major challenges presented by dynamic data mining: 1) the stability challenge: w...
In the domain of data-stream clustering, e.g., dynamic text mining as our appl...
The textual content of a document and its publication date are intertwined. Fo...
We present the time-dependent topic-cluster model, a hierarchical approach for combining Latent Diri...
The publication time of a document carries relevant information about its semantic content. The Di...
People are increasingly relying on the Web and social media to find solutions to their problems in a...
Information spread on networks can be efficiently modeled by considering three features: documents' ...
In this paper we propose a probabilistic model for online document clustering. We use non-parametric...
We present and analyze the off-line star algorithm for clustering static information systems and the...
We describe a large-scale system for clustering a stream of news articles that was developed as part...