A persistent item in a stream is one that occurs regularly in the stream without necessarily contributing significantly to the volume of the stream. Persistent items are often associated with anomalies in network streams, such as botnet traffic and click fraud. While it is important to track persistent items in an online manner, it is challenging to zero in on such items in a massive distributed stream. We present the first communication-efficient distributed algorithms for tracking persistent items in a data stream whose elements are partitioned across many different sites. We consider both infinite-window and sliding-window settings, and present algorithms that can track persistent items approximately with a probabilistic guarantee on the...
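To make the notion of persistence concrete, the following is a minimal sketch, not the distributed algorithm described above. It assumes the stream is divided into fixed time slots and that an item counts as persistent when it appears in at least a fraction `alpha` of the slots. The slot-based definition, the function name `persistent_items`, and the parameters `num_slots` and `alpha` are assumptions made for illustration. The sketch is exact and centralized, so it ignores the communication cost that the distributed setting is concerned with.

```python
from collections import defaultdict


def persistent_items(stream, num_slots, alpha):
    """Return the items seen in at least alpha * num_slots distinct time slots.

    stream: iterable of (slot, item) pairs, where slot is a time-slot index.
    num_slots: total number of time slots under consideration (assumed known).
    alpha: persistence threshold as a fraction of slots (hypothetical parameter).
    """
    slots_seen = defaultdict(set)
    for slot, item in stream:
        # Repeated occurrences within one slot add nothing: persistence
        # counts distinct slots, not total volume.
        slots_seen[item].add(slot)
    threshold = alpha * num_slots
    return {item for item, slots in slots_seen.items() if len(slots) >= threshold}


# "burst" dominates the volume but appears in a single slot;
# "bot" is low volume but appears in every slot.
stream = [
    (0, "bot"), (0, "burst"), (0, "burst"), (0, "burst"), (0, "burst"),
    (1, "bot"),
    (2, "bot"),
    (3, "bot"), (3, "other"),
]
result = persistent_items(stream, num_slots=4, alpha=0.75)
```

In this toy stream, `burst` accounts for the most occurrences yet is not reported, while `bot` is, which is the distinction between frequent and persistent items that the abstract draws.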
While traditional database systems optimize for performance on one-shot query processing, emerging l...
Estimating the frequency of any piece of information in large-scale distributed data s...
In data streaming applications, data arrives at rapid rates and in high volume, thus making ...
Motivated by scenarios in network anomaly detection, we consider the problem of detecting persistent...
The past decade has witnessed many interesting algorithms for maintaining statistics over a data str...
It is natural to model and represent interaction data as graphs in a broad range of domains such as ...
In this paper we extend the study of algorithms for monitoring distributed data streams from whole d...
We investigate several basic problems in the distributed streaming model. In this model, we have...
We study the problem of finding frequent itemsets in a continuous stream of transactions. The curren...
We consider the problem of maintaining frequency counts for items occurring frequently in the union ...
Estimating the frequency of any piece of information in large-scale distribu...
In emerging pervasive scenarios, data is collected by sensing devices in streams that occur at sever...
We investigate the problem of estimating on the fly the frequency at which ite...
This article studies the recovery of static communities in a temporal network....