The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to build high-end computing (HEC) platforms, primarily because of their generality, scalability, and cost effectiveness. However, the growing gap between sustained and peak performance for full-scale scientific applications on such platforms has become a major concern in high performance computing. The latest generation of custom-built parallel vector systems has the potential to address this concern for numerical algorithms with sufficient regularity in their computational structure. In this work, we explore two- and three-dimensional implementations of a lattice-Boltzmann magnetohydrodynamics (MHD) physics application on some of today's most power...
Accelerators are quickly emerging as the leading technology to further boost computing performances;...
We describe the implementation and optimization of a state-of-the-art Lattice Boltzmann code for com...
The primary objective of this project is to develop an advanced algorithm for parallel supercomputer...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to bu...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to b...
Abstract. The last decade has witnessed a rapid proliferation of superscalar cache-based microproces...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to bu...
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to bui...
We are witnessing a rapid evolution of HPC node architectures and on-chip parallelism as power and c...
Abstract The last decade has witnessed a rapid proliferation of superscalar cache-based microprocess...
The equations of magnetohydrodynamics (MHD) are discussed in the framework of parallel computing. Bo...
Numerical analysts and programmers are currently facing a conceptual change in processor technology....
After a decade where high-end computing was dominated by the rapid pace of improvements to CPU frequ...
We apply auto-tuning to a hybrid MPI-pthreads lattice Boltzmann computation running on the Cray XT4 ...
This paper presents the performance analysis for both the computing performance and the energy effic...