Abstract. Dense linear algebra has traditionally been used to evaluate the performance and efficiency of new architectures. This trend has continued for the past half decade with the advent of multi-core processors and hardware accelerators. In this paper, we describe how several flavors of the Linpack benchmark are accelerated on the recently released Intel® Xeon Phi™ co-processor (code-named Knights Corner) in both native and hybrid configurations. Our native DGEMM implementation takes full advantage of Knights Corner’s salient architectural features and successfully utilizes close to 90% of its peak compute capability. Our native Linpack implementation running entirely on Knights Corner employs novel dynamic scheduling and achie...
This paper discusses the design of linear algebra libraries for high performance computers. Particul...
Abstract—LU factorization with partial pivoting is a canonical numerical procedure and the main comp...
If multicore is a disruptive technology, try to imagine hybrid multicore systems enhanced with accel...
In this paper we present the design and implementation of the Linpack benchmark for the IBM BladeCen...
This paper presents the design and implementation of several fundamental dense linear algebra (DLA) ...
The aim of this project was to encapsulate the needs of computational science applications. Performa...
This paper describes an implementation of Level 3 of the Basic Linear Algebra Subprogram (BLAS-3) li...
This paper describes an implementation of Level 3 of the Basic Linear Algebra Subprogram (BLAS-3) li...
Abstract. This paper presents the design and implementation of several fundamental dense linear alg...
With the advent of multi-core technology, scientific and high performance computing research is beco...
Abstract: If multicore is a disruptive technology, try to imagine hybrid multicore systems enhanced ...
This article discusses the core factorization routines included in the ScaLAPACK library. These rout...
Abstract. Intel Xeon Phi is a recently released high-performance co-processor which features 61 core...
Abstract. If multicore is a disruptive technology, try to imagine hybrid multicore systems enhanced ...
Abstract—We report Linpack benchmark results on the TSUBAME supercomputer, a large scale heterogeneo...