Many computing problems benefit from dynamic data partitioning, that is, dividing a large amount of data into smaller chunks with better locality. When data can be sorted, two methods are commonly used for partitioning. The first selects pivots, which enables balanced partitioning but incurs a large overhead of up to half of the sorting time. The second uses simple functions, which is fast but requires that the input data conform to a uniform distribution. In this paper, we propose a new method that partitions data using the cumulative distribution function. It partitions data of any distribution in linear time, independent of the number of sublists to be partitioned into. Experiments show 10-30% improvement in partitioning balance an...
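The idea of partitioning by the cumulative distribution function can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes numeric keys, estimates the CDF from a random sample with a fixed-size histogram, and assigns each element to sublist floor(CDF(x) * k). The names `build_cdf_table`, `bucket_of`, and `partition`, and the parameters `cells` and `sample_size`, are hypothetical and chosen only for illustration.

```python
import random


def build_cdf_table(sample, cells=1024):
    """Approximate the CDF of `sample` with a fixed-size histogram (assumed design)."""
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / cells or 1.0  # fall back to 1.0 when all keys are equal
    counts = [0] * cells
    for x in sample:
        counts[min(int((x - lo) / width), cells - 1)] += 1
    # cdf[i] is the fraction of the sample that falls below the start of cell i
    cdf, total = [], 0
    for c in counts:
        cdf.append(total / len(sample))
        total += c
    cdf.append(1.0)
    return lo, width, cdf


def bucket_of(x, table, k):
    """Map key x to a sublist index in [0, k) in constant time, independent of k."""
    lo, width, cdf = table
    cells = len(cdf) - 1
    pos = (x - lo) / width
    if pos <= 0:
        return 0
    if pos >= cells:
        return k - 1
    i = int(pos)
    # Linear interpolation inside the cell keeps the mapping monotonic in x
    p = cdf[i] + (pos - i) * (cdf[i + 1] - cdf[i])
    return min(int(p * k), k - 1)


def partition(data, k, sample_size=2000):
    """Split `data` into k ordered sublists in one linear pass."""
    sample = random.sample(data, min(len(data), sample_size))
    table = build_cdf_table(sample)
    buckets = [[] for _ in range(k)]
    for x in data:
        buckets[bucket_of(x, table, k)].append(x)
    return buckets


if __name__ == "__main__":
    random.seed(0)
    # Skewed (exponential) input, where a simple linear mapping would be unbalanced
    data = [random.expovariate(1.0) for _ in range(20000)]
    print([len(b) for b in partition(data, 8)])
```

Because the mapping is monotonic in the key, every element of sublist i is no larger than every element of sublist i+1, so the sublists can be sorted independently and concatenated. The balance obtained depends on how well the sample and the number of histogram cells capture the true distribution.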
The Parallel Disks Model (PDM) has been proposed to alleviate the I/O bottleneck that arises in the ...
Many sorting algorithms that perform well on uniformly distributed data suffer significant performan...
Many algorithms can find optimal bipartitions for various objectives including minimizing the maximu...
A sorting algorithm is adaptive if its run time, for inputs of the same size n, varies smoothly from...
Distributive Partitioned Sort (DPS) is a fast internal sorting algorithm which runs in O(n) expected...
In multiprocessor systems, data parallelism is the execution of the same task on data distributed ac...
A sorting algorithm is adaptive if its run time for inputs of the same size n varies smoothly from O...
The aim of the paper is to introduce general techniques in order to optimize t...
When a partitional structure is derived from a data set using a data mining algorithm, it is not unu...
An algorithm that remains in use at the core of many partitioning systems is the Kernighan-Lin algor...
Nowadays, high-performance parallel computation is deemed a good solution to the complicated proc...
Partition-based methods estimate data density by cutting the data space into smaller rectangles r...
In our previous work there was some indication that Partition Sort could have a more robust ave...
Data mining uses algorithms to extract knowledge from a large and complex data set. Due to the large...
One of the basic problems of Computer Science is sorting a list of items. It refers to the arrangeme...