On modern architectures, the performance of 32-bit operations is often at least twice that of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here applies not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGAs), Graphics Processing Units (GPUs), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented.
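The abstract above describes the mixed-precision idea: perform the expensive O(n^3) factorization in fast 32-bit arithmetic and recover 64-bit accuracy with a cheap O(n^2) iterative refinement loop in double precision. The sketch below illustrates that pattern in Python using SciPy; the function name, tolerance, and iteration cap are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def mixed_precision_solve(A, b, tol=1e-12, max_iter=30):
    """Solve Ax = b by factoring in float32 and refining in float64.

    A minimal sketch of mixed-precision iterative refinement; names and
    tolerances are illustrative assumptions.
    """
    A = np.asarray(A, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)

    # O(n^3) LU factorization done once in fast 32-bit arithmetic
    lu, piv = lu_factor(A.astype(np.float32))

    # Initial 32-bit solve, promoted to 64-bit for the refinement loop
    x = lu_solve((lu, piv), b.astype(np.float32)).astype(np.float64)

    for _ in range(max_iter):
        # Residual computed in full 64-bit precision (O(n^2) work)
        r = b - A @ x
        if np.linalg.norm(r, np.inf) <= tol * np.linalg.norm(b, np.inf):
            break
        # Correction solved with the cheap 32-bit factors, then accumulated in 64-bit
        d = lu_solve((lu, piv), r.astype(np.float32)).astype(np.float64)
        x += d
    return x

# Example usage on a well-conditioned (diagonally dominant) system
n = 1000
A = np.random.rand(n, n) + n * np.eye(n)
b = np.random.rand(n)
x = mixed_precision_solve(A, b)
```

As long as the matrix is not too ill-conditioned for single precision, the refinement loop converges in a few iterations, so almost all of the floating-point work runs at 32-bit speed while the returned solution is accurate to 64-bit working precision.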
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing ...
Field Programmable Gate Arrays (FPGAs) enable powerful performance acceleration for scientific compu...
We present two designs (I and II) for IEEE 754 double precision floating point matrix multiplication...
Recent versions of microprocessors exhibit performance characteristics for 32 bit floating point ari...
Sparse linear algebra arises in a wide variety of computational disciplines, including medical imagi...
Floating-point computing with more than one TFLOP of peak performance is already a reality in recent...
Matrix computing plays a vital role in many s...
We present several algorithms to compute the solution of a linear system of equations on a graphics ...
The large capacity of field programmable gate arrays (FPGAs) has prompted researchers to...
The STI CELL processor introduces pioneering solutions in processor architecture. At the same time i...
Basic Linear Algebra Subprograms (BLAS) and Linear Algebra Package (LAPACK) form basic building bloc...