Scientific computing applications often achieve only a small fraction of peak performance [7,17]. In this paper, we discuss two causes of performance problems, insufficient memory bandwidth and a suboptimal instruction mix, in the context of a complete, parallel, unstructured mesh implicit CFD code. Our results show that the performance of our code and of similar implicit codes is limited by the memory bandwidth of RISC-based processor nodes to as little as 10% of peak performance for some critical computational kernels. Limits on the number of basic operations that can be performed in a single clock cycle also constrain the performance of the "cache-friendly" parts of the code.
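As a rough illustration of how a memory-bandwidth limit translates into a fraction of peak, the sketch below estimates an upper bound on the flop rate of a sparse matrix-vector product stored in compressed sparse row (CSR) form. This is a simplified model and not the analysis used in the paper: the traffic count (each matrix value and column index read once, the input vector read once, the output vector written once), the function names, and the machine numbers are all assumptions chosen for illustration.

```python
def csr_spmv_traffic(nrows, nnz, value_bytes=8, index_bytes=4):
    """Return (flops, bytes_moved) for one CSR sparse matrix-vector product.

    Assumed traffic model (hypothetical, optimistic about cache reuse):
      - each nonzero: one matrix value and one column index are read
      - each row: one row pointer read, one x entry read, one y entry written
    """
    flops = 2 * nnz  # one multiply and one add per nonzero
    bytes_moved = nnz * (value_bytes + index_bytes) \
        + nrows * (index_bytes + 2 * value_bytes)
    return flops, bytes_moved


def bounded_flop_rate(flops, bytes_moved, bandwidth, peak):
    """Upper bound on flop/s when the kernel can run no faster than
    either the memory system or the arithmetic units allow."""
    time_memory = bytes_moved / bandwidth   # seconds to move the data
    time_compute = flops / peak             # seconds to do the arithmetic
    return flops / max(time_memory, time_compute)


# Hypothetical machine and matrix, not taken from the paper.
PEAK = 1.0e9        # flop/s
BANDWIDTH = 1.0e9   # bytes/s
flops, bytes_moved = csr_spmv_traffic(nrows=100_000, nnz=2_000_000)
rate = bounded_flop_rate(flops, bytes_moved, BANDWIDTH, PEAK)
fraction_of_peak = rate / PEAK
print(f"bound: {rate / 1e6:.1f} Mflop/s ({100 * fraction_of_peak:.1f}% of peak)")
```

Under these assumed numbers the kernel moves 26 MB to perform 4 Mflop, so the memory system, not the floating-point units, sets the bound at roughly 15% of peak. Real kernels may do worse when the input vector is not reused from cache.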
This research focuses on evaluating and enhancing the performance of an in-house, structured, 2D CFD...
Obtaining high performance without machine-specific tuning is an important goal of scientific applic...
In this paper, the authors identify the scalability bottlenecks of an unstructured grid CFD code (PE...
Prize winning PETSc-FUN3D aerodynamics code, extending it with highly-tuned shared-memory paralleliz...
This paper highlights a three-year project by an interdisciplinary team on a legacy F77 computationa...
Generally, parallel scientific applications are executed on a fixed number of processors determined ...
We comment on the current performance of computational fluid dynamics codes on a variety of scalable...
Many state of the art CFD codes that exhibit low computational intensity (flops per RAM access) "sat...
Future architectures designed to deliver exascale performance motivate the need for novel algorithmi...