High-level domain-specific languages for array processing on the GPU are increasingly common, but they typically only run on a single GPU. As computational power is distributed across more devices, languages must target multiple devices simultaneously. To this end, we present a compositional translation that fissions data-parallel programs in the Accelerate language, allowing subsequent compiler and runtime stages to map computations onto multiple devices for improved performance, even for programs that begin as a single data-parallel kernel.
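To make the idea of fission concrete, the following is a minimal sketch in plain Python, not Accelerate code and not the translation described here. It assumes the simplest case, an element-wise map, and shows that splitting the input into contiguous pieces, mapping each piece independently, and concatenating the results gives the same answer as the single original operation. The names `split_chunks` and `fissioned_map` are hypothetical, introduced only for illustration; in a real system each piece could be handed to a different device.

```python
from typing import Callable, List, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def split_chunks(xs: Sequence[A], parts: int) -> List[Sequence[A]]:
    """Split xs into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(xs), parts)
    chunks = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        size = base + (1 if i < extra else 0)
        chunks.append(xs[start:start + size])
        start += size
    return chunks


def fissioned_map(f: Callable[[A], B], xs: Sequence[A], parts: int = 2) -> List[B]:
    """Apply f to every element by mapping over each chunk separately.

    Each chunk is independent of the others, so the per-chunk maps
    could run on separate devices (here they simply run in sequence).
    """
    pieces = [[f(x) for x in chunk] for chunk in split_chunks(xs, parts)]
    # Concatenate the partial results in their original order.
    return [y for piece in pieces for y in piece]
```

For any function `f` without side effects, `fissioned_map(f, xs, parts)` should agree with `[f(x) for x in xs]` regardless of how many parts are chosen, which is the property that lets a later stage pick the number of pieces to match the available devices.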
The aim of this thesis is to research how the functional paradigm can be used for hardware accelerat...
Functional languages provide a solid foundation on which complex optimization passes can be designed...
We describe our experiences with a very high-level parallel composition language (called GLU) that e...
It is well acknowledged that the dominant mechanism for scaling processor performance has become to ...
As the demand increases for high performance and power efficiency in modern computer runtime systems...
Purely functional, embedded array programs are a good match for SIMD hardware, such as GPUs. However...
GPUs have been gaining popularity as general purpose parallel processors that deliver a performance ...
The need to speed-up computing has introduced the interest to explore parallelism in algorithms and ...
Graphics Processing Units (GPUs) have been successfully used to accelerate scientific applications d...
We present Singe, a Domain Specific Language (DSL) compiler for combustion chemistry that leverages ...
Heterogeneous multicores like GPGPUs are now commonplace in modern computing systems. Although heter...
The diversity of microarchitecture designs in heterogeneous computing systems allows programs to ach...
Graphics Processing Units (GPU) have been widely adopted to accelerate the execution of HPC workload...