Performing similarity queries on hundreds of millions of time series is a challenge requiring both efficient indexing techniques and parallelization. We propose a sketch/random projection-based approach that scales nearly linearly in parallel environments and provides high-quality answers. We illustrate the performance of our approach, called RadiusSketch, on real and synthetic datasets of up to 1 terabyte and 500 million time series. The sketch method, as we have implemented it, is superior in both quality and response time to the state-of-the-art approach, iSAX2+. Even in the sequential case, it improves recall and precision by a factor of two while giving shorter response times. In a parallel environme...
© 2010 Mei Ma. Time series datasets are useful in a wide range of real-world applications. Re...
Due to the increasing complexity of current digital data, similarity search has become a fundame...
Modern scientific datasets present numerous data management and analysis challenges. State-of-the-ar...
Indexing is crucial for many data mining tasks that rely on efficient and effe...
A growing number of domains (finance, seismology, internet-of-things, etc.) co...
Indexing is crucial for many data mining tasks that rely on efficient and effe...
Time series arise in many application domains such as finance, agronomy, health, earth monitoring, w...
We consider the problem of querying large scale multidimensional time series data to discover events...
We address the problem of similarity search in large time series databases. We introduce a novel ind...
As advances in science and technology have continually increased the existence of, and capability fo...
Current research in indexing and mining time series data has produced many interesting algorithms an...
We consider the problem of finding similar patterns in a time sequence. Typical application...
The mining of time series data plays an important role in modern information r...
This thesis studies the scalability of the similarity search problem in large-scale multidimensional...
This paper presents parallel solutions (developed based on two state-of-the-ar...