Histogramming is a tool commonly used in data analysis. Although its serial version is simple to implement, parallelizing it in an efficient and scalable way can be challenging. This holds especially for platforms that contain one or more massively parallel devices, such as CUDA-capable GPUs, because of issues with domain decomposition, use of global memory, and similar concerns. In this paper we compare two approaches to implementing general-purpose histogramming on GPUs. The first algorithm keeps a private copy of the bin counters in shared memory for each block of threads. The second uses the Thrust library to sort the input elements and then searches for upper bounds according to the bin widths. For both algorithms we analyze ...
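To make the two approaches concrete, the following is a minimal CPU-side C++ sketch of the ideas, not the paper's GPU implementation. It assumes equal-width bins over a range [lo, hi) with out-of-range values clamped to the first or last bin; the function names `bin_of`, `histogram_private`, and `histogram_sorted` are hypothetical. A per-chunk `std::vector` stands in for the per-block shared-memory counters, and `std::sort` with `std::upper_bound` stands in for the corresponding Thrust calls.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Map a value to its bin for equal-width bins over [lo, hi).
// Assumption: out-of-range values are clamped to the first or last bin.
std::size_t bin_of(float x, float lo, float hi, std::size_t num_bins) {
    assert(num_bins > 0 && hi > lo);
    if (x <= lo) return 0;
    if (x >= hi) return num_bins - 1;
    float t = (x - lo) / (hi - lo);                      // position in [0, 1)
    std::size_t b = static_cast<std::size_t>(t * num_bins);
    return std::min(b, num_bins - 1);                    // guard against rounding
}

// Approach 1 (sketch): each chunk of the input, standing in for one block of
// threads, updates its own private counters and merges them at the end.
std::vector<unsigned> histogram_private(const std::vector<float>& data,
                                        float lo, float hi,
                                        std::size_t num_bins,
                                        std::size_t block_size) {
    assert(block_size > 0);
    std::vector<unsigned> global(num_bins, 0);
    for (std::size_t start = 0; start < data.size(); start += block_size) {
        std::size_t end = std::min(start + block_size, data.size());
        std::vector<unsigned> local(num_bins, 0);        // stand-in for shared memory
        for (std::size_t i = start; i < end; ++i)
            ++local[bin_of(data[i], lo, hi, num_bins)];
        for (std::size_t b = 0; b < num_bins; ++b)
            global[b] += local[b];                       // on a GPU: an atomic add
    }
    return global;
}

// Approach 2 (sketch): sort the input, find for every bin the position just
// past its last element, and take differences of these cumulative counts.
std::vector<unsigned> histogram_sorted(std::vector<float> data,  // copied, then sorted
                                       float lo, float hi,
                                       std::size_t num_bins) {
    std::sort(data.begin(), data.end());
    std::vector<unsigned> counts(num_bins, 0);
    std::size_t prev = 0;
    for (std::size_t b = 0; b < num_bins; ++b) {
        // First element whose bin index exceeds b; valid because bin_of is
        // non-decreasing over the sorted data.
        auto it = std::upper_bound(
            data.begin(), data.end(), b,
            [&](std::size_t bin, float x) { return bin < bin_of(x, lo, hi, num_bins); });
        std::size_t cumulative = static_cast<std::size_t>(it - data.begin());
        counts[b] = static_cast<unsigned>(cumulative - prev);
        prev = cumulative;
    }
    return counts;
}
```

Both functions should return the same counts for the same input. In the first, the merge step is where concurrent GPU blocks would need atomic updates to global memory. In the second, the cost is dominated by the sort and does not depend on how the values are distributed across bins.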
The objective of this thesis is to optimize the Seam Carving method in CUDA (Compute Unified Device ...
Abstract. We introduce a CUDA GPU library to accelerate evaluations with homomorphic schemes defined...
Abstract — GPU based on CUDA Architecture developed by NVIDIA is a high performance computing device...
Histogramming is a tool commonly used in data analysis. Although its serial version is simple to imp...
Abstract—Histogramming is a tool commonly used in data analysis. Although its serial version is simp...
Abstract. We present two efficient histogram algorithms designed for NVIDIA’s compute unified device...
Graphics Processing Units (GPUs) are suitable for highly data parallel algorithms such as image proc...
Graphics Processing Units (GPUs) are a fast evolving architecture. Over the last decade their progra...
Histogramming is a technique by which input datasets are mined to extract features and patterns. His...
The traditional sorting technique, sequential sorting, is inefficient with increasing amounts of dat...
Sorting is a very important task in computer science and becomes a critical operation for programs t...
We present four CUDA based parallel implementations of the Space-Saving algorithm for determining fr...
Using the histogram procedure, this work studies performance determining factors in computing in par...
We describe the design of high-performance parallel radix sort and merge sort routines for manycore ...