Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016), Timisoara, Romania, February 8-11, 2016.
Current GPUs (Graphics Processing Units) can obtain high computational performance in scientific applications. Nevertheless, programmers must use parallel algorithms suited to these architectures and apply optimization techniques in the implementation in order to achieve that performance. This thesis focuses on designing and implementing parallel prefix algorithms on GPU architectures with little effort. To that end, we have developed a highly optimized library called BPLG (Tuning Butterfly Processing Library for GPUs), based on a set of building blocks that make it easy to des...
The computing power of current Graphical Processing Units (GPUs) has increased rapidly over the year...
In 2006 NVIDIA introduced a new unified GPU architecture facilitating general-purpose computation on...
Writing high performance GPGPU code is often difficult and time-consuming, potentially requiring lab...
[Abstract] Current Graphics Processing Units (GPUs) are capable of obtaining high computational perf...
GPUs have been used for years in compute intensive applications. Their massive parallel processing c...
2012-05-02 Graphics Processing Units (GPUs) have evolved to devices with teraflop-level performance p...
We present a number of optimization techniques to compute prefix sums on linked lists and implement ...
Graphics Processing Units (GPUs) are a fast evolving architecture. Over the last decade their progra...
Manual tuning of applications for heterogeneous parallel systems is tedious and complex. Optimizati...
In the paper we present the parallel implementation of the alpha-beta algorithm running on the graph...
[Abstract] While developing naive code is uncomplicated, optimizing extremely parallel algorithms requi...
Graphics hardware's performance is advancing much faster than the performance of conventional microp...
The overarching objective of this thesis was to develop tools for parallelising, optimising, and im...