A growing number of domains (finance, seismology, internet-of-things, etc.) collect massive time series. When the number of series grows to the hundreds of millions or even billions, similarity queries become intractable on a single machine. Moreover, naive (quadratic) parallelization does not work well, so both efficient indexing and parallelization are needed. We propose a demonstration of Spark-parSketch, a complete solution based on sketches (random projections) that efficiently performs both the parallel indexing of large sets of time series and similarity search over them. Because our method is approximate, we explore the tradeoff between time and precision. A video showing the dynamics of the demonstration can be found by...
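To make the idea of sketches as random projections concrete, here is a minimal illustration in plain Python. It is only a sketch of the general technique under stated assumptions, not the Spark-parSketch implementation: the function names (`make_projection`, `sketch`, `euclidean`), the choice of random +1/-1 entries, and the `1/sqrt(k)` scaling are all assumptions made for this example. Each series is reduced to a short vector of scaled dot products with random vectors, and the distance between two sketches then approximates the distance between the original series.

```python
import math
import random


def make_projection(series_len, sketch_size, seed=0):
    """Build a sketch_size x series_len matrix of random +1/-1 entries (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(series_len)]
            for _ in range(sketch_size)]


def sketch(series, projection):
    """Project a series onto each random vector, scaled so distances are roughly preserved."""
    scale = 1.0 / math.sqrt(len(projection))
    return [scale * sum(r * x for r, x in zip(row, series))
            for row in projection]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    rng = random.Random(42)
    n, k = 512, 256  # series length and sketch size (illustrative values)
    projection = make_projection(n, k, seed=1)
    s1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    s2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    exact = euclidean(s1, s2)
    approx = euclidean(sketch(s1, projection), sketch(s2, projection))
    print(f"exact distance: {exact:.3f}, sketch distance: {approx:.3f}")
```

A larger sketch size gives a closer approximation at the cost of more computation and storage per series, which is one simple way to see the time/precision tradeoff the abstract mentions. How the sketches are then indexed and distributed across machines is outside the scope of this example.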
Fast indexing in time sequence databases for similarity searching has attracted a lot of research re...
Time series data are increasing at a dramatic rate, yet their analysis remains...
grantor: University of Toronto. The idea of posing queries in terms of similarity of objects...
Performing similarity queries on hundreds of millions of time series is a chal...
Indexing is crucial for many data mining tasks that rely on efficient and effe...
As sensors improve in both bandwidth and quantity over time, the need for high...
Time series arise in many application domains such as finance, agronomy, health, earth monitoring, w...
Indexing is crucial for many data mining tasks that rely on efficient and effe...
We address the problem of similarity search in large time series databases. We introduce a novel ind...
As advances in science and technology have continually increased the existence of, and capability fo...
Current research in indexing and mining time series data has produced many interesting algorithms an...
We consider the problem of querying large scale multidimensional time series data to discover events...
The mining of time series data plays an important role in modern information r...
We consider the problem of finding similar patterns in a time sequence. Typical application...
We study a set of linear transformations on the Fourier series representation of a sequence that can...