In this work, we examine the potential of using the recently released STI Cell processor as a building block for future high-end scientific computing systems. Our work contains several novel contributions. First, we introduce a performance model for Cell and apply it to several key numerical kernels: dense matrix multiply, sparse matrix-vector multiply, stencil computations, and 1D/2D FFTs. Next, we validate our model by comparing results against published hardware data, as well as our own Cell blade implementations. Additionally, we compare Cell performance to benchmarks run on leading superscalar (AMD Opteron), VLIW (Intel Itanium2), and vector (Cray X1E) architectures. Our work also explores several different kernel implementations and d...
Abstract. Various processor architectures have been proposed until today, and the performance has im...
Abstract. A profile is given of current research, as it pertains to computational mathematics, on Very...
Developed for multimedia and game applications, as well as other numerically intensive workloads, th...
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing ...
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing ...
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing ...
The STI CELL processor introduces pioneering solutions in processor architecture. At the same time i...
Matrix factorization (often called decomposition) is a frequently used kernel in a large number o...
The Cell Broadband Engine processor is a powerful processor capable of over 220 GFLOPS. It is highly...
In order to implement a complete Fast Multipole Method on the Cell processor, ...
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as...
On modern architectures, the performance of 32-bit operations is often at leas...
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as...
This paper evaluates the performance of bioinformatics applications on the Cell Broadband Engine (Ce...
The Cell Broadband Engine architecture is a revolutionary processor architecture well suited for man...