Implementing scientific applications efficiently in parallel on multi-core CPUs with accelerators such as GPUs and Xeon Phis is challenging. It requires exploiting the data-parallel architecture of the accelerator along with the vector pipelines of modern x86 CPU architectures, balancing load, and transferring memory efficiently between devices. These requirements are relatively easy to meet for highly structured scientific applications. In contrast, a number of scientific and engineering applications are unstructured. Getting performance on accelerators for these applications is extremely challenging because many of them employ irregular algorithms, which exhibit data-dependent control flow and irregular memory acce...
Emerging computer architectures and advanced computing technologies, such as Intel’s Many Integrated...
To help shrink the programmability-performance efficiency gap, we discuss that adaptive runtime syst...
Multi-core processors are now ubiquitous and are widely seen as the most viable means of delivering...
As many-core accelerators keep integrating more processing units, it becomes increasingly more diffi...
Many-core accelerators, as represented by the XeonPhi coprocessors and GPGPUs, allow software to exp...
Modern day hardware platforms are parallel and diverse, ranging from mobiles to data centers. Mains...
The amelioration of high performance computing platforms has provided unprecedented computing power ...
Scientific applications often require massive amounts of compute time and power. With the constantly...
Heterogeneous processors with accelerators provide an opportunity to improve performance within a...
Enhancing the match between software executions and hardware features is key to computing efficiency...
Rising power costs and constraints are driving a growing focus on the energy efficiency of high perf...
This paper presents a new technique for introducing and tuning parallelism for heterogeneous shared-...
Specialized accelerators are increasingly attractive solutions to continue expected generational per...
In-memory cluster computing platforms have gained momentum in the last years, due to their ability t...
This paper discusses the implementation of particle based numerical methods on multi-core machines. ...