Efficiently managing the memory subsystem of modern multi/manycore architectures is increasingly becoming a challenge as systems grow in complexity and heterogeneity. In the field of high performance computing (HPC) in particular, where massively parallel architectures are used and input sets of several terabytes are common, careful management of the memory hierarchy is crucial to exploit the full computing power of these systems. The goal of this thesis is to provide computer architects with valuable information to guide the design of future systems, and in particular of those more widely used in the field of HPC, i.e., symmetric multicore processors (SMPs) and GPUs. With that aim, we present an analysis of some of the inefficiencies and ...
Enhancing the match between software executions and hardware features is key to computing efficiency...
In a parallel computing context, peak performance is hard to reach with irregu...
A major contributor to the deployment and operational costs of a large-scale high-performance comput...
Most computing systems are heavily dependent on their main memories, as their primary storage, or as...
Multi-core processors have become the dominant processor architecture with 2, 4, and 8 cores on a ch...
The increase in the number of cores and threads per processor over the last 15 years has made it possible to maintain...
From single-core CPUs to detachable compute accelerators, supercomputers made a tremendous progress ...
Multi-GPU systems are widely used in High Performance Computing environments to accelerate scientifi...
With the massive multithreading execution feature, graphics processing units (GPUs) have b...
The continued growth of the computational capability of throughput processors has made throughput...
To help shrink the programmability-performance efficiency gap, we discuss that adaptive runtime syst...
Current GPU computing models support a mixture of coherent and incoherent classes of memory operatio...