The way developers implement their algorithms, and how these implementations behave on modern CPUs, are governed by the design and organization of those CPUs. The vectorization (SIMD) units are among the few parts of a CPU that can and must be explicitly controlled. In the HPC community, x86 CPUs and their vectorization instruction sets were the de facto standard for decades. Each new release of an instruction set usually doubled the vector length and added new operations, and each generation pushed developers to adapt and improve previous implementations. The release of the ARM scalable vector extension (SVE) changed things radically for several reasons. First, we expect ARM processors to equip many supercomputers ...
FastSort is an external sort that uses parallel processing, large main memories and parallel disc ac...
Sorting is a widely used basic algorithm. As the high performance computing devices are inc...
The sparse matrix/vector product (SpMV) is a fundamental operation in scientific computing. Having a...
Modern CPUs provide single instruction-multiple data (SIMD) instructions. SIMD instructions process ...
We have designed a radix sort algorithm for vector multiprocessors and have implemented the algorith...
Merging and Sorting algorithms are the backbone of many modern computer applications. As such, eff...
Sorting is a commonly used process with a wide breadth of applications in the high perfor...
Sample sort, a generalization of quicksort that partitions the input into many pieces, is known as t...
This paper demonstrates how modern software development methodologies can be used to give an existin...
Hardware sorters exploit inherent concurrency to improve the performance of sequential, software-bas...
Sorting is a fundamental algorithm used extensively in computer science as an intermediate...
Sorting is one of the most fundamental algorithmic kernels, used by a large fraction of computer app...
Accelerating program performance via SIMD vector units is very common in modern processors, as evide...
We present the design and implementation of a parallel out-of-core sorting algorithm, which is based...