This whitepaper investigates the parallel performance of a sample application that implements an approximate expectation-maximization method for inferring the network structure and time-varying states of a hidden population within the framework of the kinetic Ising model. The size of networks that can yield informative results can be made arbitrarily large, and the long-running computational demand is highly localized, making the application a strong candidate for future exascale platforms. Previous investigations using OpenMP on the Intel Xeon Phi architecture have suggested that the class of accelerator unit may significantly affect attainable application performance. An OpenCL parallelization enables experiments with a variety of a...
Data analysis is a rising field of interest for computer science research due to the growing amount ...
Recent technological advances have proliferated the available computing power, memory, and speed of ...
The OpenCL standard allows targeting a large variety of CPU, GPU and accelerator architectures using...
This paper reports on the development of an MPI/OpenCL implementation of LU, an application-level be...
The decline of Moore’s law has led to a fundamental shift in the design of micro-processor architect...
Accelerator processors allow energy-efficient computation at high performance, especially for comput...
The next generation of supercomputers will feature a diverse mix of accelerator devices. The increas...
The article discusses possibilities of implementing a neural network in a parallel way. The issues o...
This work describes my solution to the performance portability problem: between CPUs and GPUs in par...
The size of data that can be fitted with a statistical model becomes restrictive when accounting for...
The proliferation of heterogeneous computing systems presents the parallel computing community with ...
This paper investigates the development of a molecular dynamics code that is highly portable between...
The architecture of high performance computing systems is becoming more and more heterogeneo...