This paper considers the problem of bulk-loading large data sets for the gridfile multi-attribute indexing technique. We propose a rectilinear partitioning algorithm that heuristically seeks to minimize the size of the gridfile needed to ensure no bucket overflows. Empirical studies on both synthetic data sets and on data sets drawn from computational fluid dynamics applications demonstrate that our algorithm is very efficient and is able to handle large data sets. In addition, we present an algorithm for bulk-loading data sets too large to fit in main memory. Utilizing a sort of the entire data set, it creates a gridfile without incurring any overflows. ICASE Report # 94-74. This research was supported by the National Aeronautics and ...
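As a rough illustration of what "a rectilinear partition with no bucket overflows" means, the following Python sketch places cuts along each axis of a 2-D point set and checks the largest cell load against a bucket capacity. It is not the algorithm proposed in the paper: the function names, the equal-count (quantile) choice of cuts, and the search over square k x k grids are all assumptions made purely for illustration.

```python
from bisect import bisect_right
from collections import Counter


def quantile_cuts(values, parts):
    """Return parts-1 cut points that split the sorted values into
    near-equal counts (an assumed heuristic, not the paper's)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // parts] for i in range(1, parts)]


def max_cell_load(points, x_cuts, y_cuts):
    """Count the points falling in each grid cell and return the largest count."""
    loads = Counter(
        (bisect_right(x_cuts, x), bisect_right(y_cuts, y)) for x, y in points
    )
    return max(loads.values()) if loads else 0


def smallest_square_grid(points, capacity, max_parts=64):
    """Try k x k quantile grids for growing k until no cell exceeds capacity.

    Returns (k, x_cuts, y_cuts), or None if no grid up to max_parts works.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    for k in range(1, max_parts + 1):
        x_cuts = quantile_cuts(xs, k)
        y_cuts = quantile_cuts(ys, k)
        if max_cell_load(points, x_cuts, y_cuts) <= capacity:
            return k, x_cuts, y_cuts
    return None
```

In this sketch, diagonal (highly correlated) data forces a much larger grid than uniformly spread data would need for the same capacity, which suggests why the choice of cut positions matters for gridfile size.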
Many applications require the clustering of large amounts of high-dimensional data. Most clustering ...
Recently there has been an increasing interest in supporting bulk operations on multidimensional ind...
Applications that use collections of very large, distributed datasets have become an increasingly i...
This paper considers the problem of bulk-loading large data sets for the gridfile multiattribute ind...
Efficient storage and retrieval of large multidimensional datasets is an important concern for large...
We present a new dynamic index structure for multidimensional data. The considered index structure i...
In this paper, we propose a new bulk-loading technique for high-dimensional indexes which represent ...
Abstract. In this paper, we propose a new bulk-loading technique for high-dimensional indexes which...
We propose a new similarity-based technique for declustering data. The proposed method can adapt to ...
A major part of the interface to a database is made up of the queries that can be addressed to this ...
Abstract. Declustering problems are well known in databases for parallel computing environments...
Advanced instruments in a variety of scientific domains are collecting massive amounts of data that ...
It is important to improve data reliability and data access efficiency for data-intensive applicatio...
In multimedia databases spatial or high-dimensional data manipulation is important for storage and ...