Data-intensive scientific computations, as well as on-line analytical processing applications, are performed on very large datasets that are modeled as k-dimensional arrays. Such arrays are stored on disk by partitioning the large global array into fixed-size hyper-rectangular sub-arrays, called chunks or tiles, that form the units of data transfer between disk and memory. Typical queries retrieve sub-arrays, and answering a query requires accessing every chunk that overlaps the query result. An important metric of storage efficiency is the expected number of chunks retrieved over all such queries. The question that immediately arises is "what shapes of array chunks give the minimum expected number of chunks over a query workload?...
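The chunk-count metric described above can be illustrated with a small sketch. It is not the authors' method: the function names `chunks_touched` and `expected_chunks` are hypothetical, and the workload model (a fixed-shape query box placed uniformly at random inside the array) is an assumption made only for illustration, since the abstract does not specify one.

```python
from math import prod


def chunks_touched(lo, hi, chunk_shape):
    """Count the chunks overlapped by the query box [lo, hi].

    lo and hi are inclusive 0-based corner coordinates, one entry per dimension.
    Along each dimension the box spans chunk indices lo // c through hi // c.
    """
    return prod(h // c - l // c + 1 for l, h, c in zip(lo, hi, chunk_shape))


def expected_chunks(array_shape, query_shape, chunk_shape):
    """Average chunk count over all placements of a fixed-shape query.

    Assumption: every valid start position is equally likely. Start positions
    are then independent across dimensions, so the expected product equals
    the product of the per-dimension averages.
    """
    result = 1.0
    for n, q, c in zip(array_shape, query_shape, chunk_shape):
        starts = range(n - q + 1)  # valid start offsets along this dimension
        total = sum((s + q - 1) // c - s // c + 1 for s in starts)
        result *= total / len(starts)
    return result


if __name__ == "__main__":
    # Two chunk shapes holding the same number of cells (16) on an 8x8 array.
    for shape in [(4, 4), (2, 8)]:
        print(shape, expected_chunks((8, 8), (2, 2), shape))
```

Under this assumed workload of 2x2 queries, the square 4x4 chunk yields a lower expected chunk count than the 2x8 chunk of equal size, which is the kind of comparison the question about chunk shapes asks for.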
The ubiquity of high-dimensional data in machine learning and data mining applications makes its eff...
We review the time and storage costs of search and clustering algorithms. We exemplify these, based ...
http://www.springerlink.com/ We present a cost-based adaptive clustering method to improve average pe...
Very large multidimensional arrays are commonly used in data intensive scientific computations as we...
Data-intensive scientific computations, as well as on-line analytical processing applications, are do...
Very large multidimensional arrays are commonly used in data intensive scientific computations as we...
Scientists today are able to generate data at an unprecedented scale and rate. For example the Sloan...
Thesis (Ph.D.)--University of Washington, 2014. Scientists today are able to generate data at an unpre...
Multidimensional Analysis and On-Line Analytical Processing (OLAP) uses summary information that re...
Large scale scientific datasets are generally modeled as k-dimensional arrays, since this model is ...
The size of spatial scientific datasets is steadily increasing due to improvements in instruments an...
Databases and data warehouses contain an overwhelming volume of information that users must wade thr...
The main assumption of chunking theory is that knowledge about semantic units in a certain task doma...
Many statistical and MOLAP applications use multidimensional arrays as the basic data structure to a...
Multi-dimensional arrays (also known as raster data or gridded data) play a key role in many, if not...