Abstract: We report our experience implementing a lattice gauge theory code on the Cell Broadband Engine (Cell/B.E.), a new heterogeneous multi-core processor. As a typical operation, we take SU(3) matrix multiplication, which is one of the most important parts of lattice gauge theory calculations. Taking full advantage of the Cell/B.E., including its SIMD operations and many registers, which allow full use of the arithmetic units through loop unrolling, we obtain about 200 GFLOPS with 16 SPEs, which corresponds to around 80% of the theoretical peak. To our knowledge, this is the fastest speed for this operation obtained on the Cell/B.E. so far. However, when we measure the total time including the data supply, the speed drops to about 13 GFLO...
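To make the benchmarked operation concrete, the following is a minimal scalar sketch of one SU(3) matrix multiplication (a 3x3 complex matrix product) in portable C99. It is not the authors' SPE code: it uses no SIMD intrinsics, and the names `su3_matrix`, `su3_mul`, and `su3_mul_flops` are hypothetical, chosen only for illustration. The innermost sum is written out by hand, which is the simplest form of the loop unrolling the abstract refers to.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

#define NC 3  /* number of colors: SU(3) matrices are 3x3 complex */

/* Hypothetical type name for illustration; not taken from the paper. */
typedef struct {
    float complex e[NC][NC];
} su3_matrix;

/* c = a * b. The sum over the inner index is written out explicitly
   (unrolled by hand). The output c must not alias a or b. */
static void su3_mul(su3_matrix *c, const su3_matrix *a, const su3_matrix *b)
{
    assert(c != a && c != b);
    for (size_t i = 0; i < NC; ++i) {
        for (size_t j = 0; j < NC; ++j) {
            c->e[i][j] = a->e[i][0] * b->e[0][j]
                       + a->e[i][1] * b->e[1][j]
                       + a->e[i][2] * b->e[2][j];
        }
    }
}

/* Real floating-point operations in one 3x3 complex multiply:
   9 output elements, each needing 3 complex multiplies (6 flops each)
   and 2 complex additions (2 flops each). */
static int su3_mul_flops(void)
{
    return NC * NC * (NC * 6 + (NC - 1) * 2);
}
```

Under this counting, one multiplication costs 198 real floating-point operations while reading two matrices and writing one (three matrices of 18 single-precision values each). That arithmetic-to-data ratio is one way to see why the measured speed is sensitive to how fast the operands can be supplied.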
The Cell Broadband Engine processor is a powerful processor capable of over 220 GFLOPS. It is highly...
We implement a Monte Carlo algorithm for spin glass systems and optimize for the Cell-BE processor, ...
Current consumer-grade computers and game devices incorporate very powerful processors that can be ...
We evaluate IBM's Enhanced Cell Broadband Engine (BE) as a possible building block of a new generati...
Lattice Quantum Chromodynamics (QCD) models subatomic interactions based on a four-dimensional discre...
We report our experience of developing a QCD code on a CELL BE machine. First we describe what CELL ...
Computing the actions of Wilson-Dirac operators consumes most of the CPU time for the grand challeng...
This paper presents software implementation speed records for modular multiplication arithmetic on t...
Several large-scale computational scientific problems require high-end computing systems to be solve...
The Cell Broadband Engine architecture is a revolutionary processor architecture well suited for man...
A technique to speed up Montgomery multiplication targeted at the Synergistic Processor Elements (SP...
Current PC processors are equipped with vector processing units and have other advanced features tha...
Abstract: The Pezy-SC processor is a novel architecture developed by Pezy Computing K. K. that has ach...
Mainstream processor development is mostly targeted at compatibility and continuity. Thus, the proce...