General-purpose computing on GPUs has become more accessible thanks to features such as shared virtual memory and demand paging. Unfortunately, this convenience comes at the price of performance: automatic memory management suffers from drawbacks that prevent heterogeneous systems from achieving their full potential. In this work we analyze the challenges and inefficiencies of demand paging on GPUs, in particular for collaborative computations in which data migrates multiple times between host and device. We establish that demand paging on GPUs introduces significant overheads for these kinds of computations, and identify the issues of false sharing and unnecessary data transfers that derive from the granularity at which data is migrated...
Despite the increasing investment in integrated GPUs and next-generation inter...
Current GPU computing models support a mixture of coherent and incoherent classes of memory operatio...
The massive amount of fine-grained parallelism exposed by a GPU program makes it difficult to exploi...
Heterogeneous systems are ubiquitous in the field of High-Performance Computing (HPC). Graphics pro...
GPUs are becoming increasingly popular in large scale data center installations due to their strong,...
Despite dramatic improvements in GPU and interconnect architectures, inter-GPU communication remains...
The proliferation of heterogeneous compute platforms, of which CPU/GPU is a prevalent example, neces...
High compute-density with massive thread-level parallelism of Graphics Processing Units (GPUs) is be...
2018-02-23. Graphics Processing Units (GPUs) are designed primarily to execute multimedia, and game re...
Modern GPUs are powerful high-core-count processors, which are no longer used solely for graphics ap...
The continued growth of the computational capability of throughput processors has made throughput...
Graphics processing units (GPUs) have become prevalent in modern computing systems. While their high...
Graphics processing units (GPUs) embrace many-core compute devices where massively parallel...
Graphics processing units (GPUs) are increasingly being used for general purpose parallel c...