The advances of Graphics Processing Unit (GPU) technology and the introduction of the CUDA programming model facilitate developing new solutions for sparse and dense linear algebra solvers. Matrix transpose is an important linear algebra procedure that has a deep impact on various computational science and engineering applications. Several factors hinder the expected performance of large matrix transpose on GPU devices. The degradation in performance involves the memory access pattern, such as coalesced access in the global memory and bank conflicts in the shared memory of streaming multiprocessors within the GPU. In this paper, two matrix transpose algorithms are proposed to alleviate the aforementioned issues of ensuring coalesced access and co...
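The two pitfalls this abstract names, uncoalesced global-memory access and shared-memory bank conflicts, are conventionally addressed with shared-memory tiling. The sketch below shows that standard technique, not the paper's own two algorithms: a tile is read with coalesced row accesses, transposed through shared memory, and written back coalesced; the extra padding column shifts each row to a different bank so column reads do not conflict. The kernel name and tile sizes are illustrative choices, not taken from the paper.

```cuda
#define TILE_DIM 32
#define BLOCK_ROWS 8

// Standard tiled transpose sketch: both the read of idata and the write of
// odata are coalesced, because consecutive threads always touch consecutive
// global addresses; the transposition itself happens inside shared memory.
__global__ void transposeTiled(float *odata, const float *idata,
                               int width, int height)
{
    // TILE_DIM + 1 padding column: without it, a 32x32 tile maps every
    // element of a column to the same shared-memory bank, serializing access.
    __shared__ float tile[TILE_DIM][TILE_DIM + 1];

    int x = blockIdx.x * TILE_DIM + threadIdx.x;
    int y = blockIdx.y * TILE_DIM + threadIdx.y;

    // Coalesced load: each warp reads one contiguous row segment of idata.
    for (int j = 0; j < TILE_DIM; j += BLOCK_ROWS)
        if (x < width && y + j < height)
            tile[threadIdx.y + j][threadIdx.x] = idata[(y + j) * width + x];

    __syncthreads();

    // Swap block coordinates so the output write is also contiguous.
    x = blockIdx.y * TILE_DIM + threadIdx.x;
    y = blockIdx.x * TILE_DIM + threadIdx.y;

    // Coalesced store: read tile column-wise (conflict-free thanks to the
    // padding), write odata row-wise.
    for (int j = 0; j < TILE_DIM; j += BLOCK_ROWS)
        if (x < height && y + j < width)
            odata[(y + j) * height + x] = tile[threadIdx.x][threadIdx.y + j];
}
```

A typical launch would use `dim3 grid((width + TILE_DIM - 1) / TILE_DIM, (height + TILE_DIM - 1) / TILE_DIM)` with `dim3 block(TILE_DIM, BLOCK_ROWS)`; the 32x8 thread block processes a 32x32 tile in four strided passes, which trades a small loop for better occupancy.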
Abstract An adaptive parallel matrix transpose algorithm optimized for distributed multicore archit...
The solution of network equations is frequently encountered by power system researchers. With the incre...
In this paper we discuss our experiences in improving the performance of two key algorithms: t...
Matrix transposition is an important algorithmic building block for many numerical algorithms like m...
Modern graphics processing units (GPUs) have been at the leading edge of increasing chip-level para...
General purpose computing on graphics processing units (GPGPU) is fast becoming a common feature of ...
Modern GPUs are well suited for intensive computational tasks and massive parallel computation. ...
Multiphysics systems are used to simulate various physics phenomena given by Partial Differential Equ...
Abstract. Graphics Processing Units (GPUs) are massively parallel data processors. High performance co...
A wide class of numerical methods needs to solve a linear system, whe...
Sparse matrix multiplication is a common operation in linear algebra and an important element of oth...
In this article, we discuss the performance modeling and optimization of Sparse Matrix-Vector Multip...
Abstract In recent years, parallel processing has been widely used in the computer industry. Software...
A wide class of g...
Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited fo...