In the multicore era it was possible to exploit the increase in on-chip parallelism by simply running multiple MPI processes per chip. Unfortunately, manycore processors' greatly increased thread- and data-level parallelism coupled with a reduced memory capacity demand an altogether different approach. In this paper we explore augmenting two NWChem modules, triples correction of the CCSD(T) and Fock matrix construction, with OpenMP in order that they might run efficiently on future manycore architectures. As the next NERSC machine will be a self-hosted Intel MIC (Xeon Phi) based supercomputer, we leverage an existing MIC testbed at NERSC to evaluate our experiments. In order to proxy the fact that future MIC machines will not have a host pr...
Many-core architectures, such as the Intel Xeon Phi, provide dozens of cores and hundreds of hardwar...
To accelerate the solution of large eigenvalue problems arising from many-body calculations in nucle...
Today's supercomputers often consist of clusters of SMP nodes. Both OpenMP and MPI are programming ...
Core (MIC) Architecture have been adopted in many high-performance computer clusters. Typical parall...
MPI is the predominant model for parallel programming in technical high performance computing. With ...
Many/multi-core supercomputers provide a natural programming paradigm for hybrid MPI/OpenMP scientif...
The Configuration Interaction (CI) method has been widely used to solve the non-relativistic many-bo...
Thesis: S.M., Massachusetts Institute of Technology, Department of Nuclear Science and Engineering, ...
In this paper, we present two approaches to improve the execution of Op...
Abstract. The Sparse Matrix-Vector Multiplication is the key operation in many iterative methods. Th...
There are many potential issues associated with deploying the Intel Xeon Phi™ (code named Knights L...
This paper applies a Hybrid MPI-OpenMP programming model with a thread-to-thread communication meth...
This paper describes the acceleration of the most computatio...
After a brief introduction to Cross Motif Search and its OpenMP and Hybrid OpenMP-MPI implementatio...