Multigrid methods are widely used to accelerate the convergence of iterative solvers for linear systems that arise in many application areas. In this paper, we explore optimization techniques for geometric multigrid on existing and emerging multicore systems, including the Opteron-based Cray XE6, Intel® Xeon® E5-2670 and X5550 processor-based Infiniband clusters, and the new Intel® Xeon Phi coprocessor (Knights Corner). Our work examines a variety of novel techniques, including communication aggregation, threaded wavefront-based DRAM communication avoidance, dynamic threading decisions, SIMDization, and fusion of operators. We quantify performance through each phase of the V-cycle for both single-node and distributed-memory e...
Modern multicore and manycore processors exhibit multiple levels of parallelism through a wide range...
The scalable implementation of multigrid methods for machines with several thousands of processors i...
Understanding the most efficient design and utilization of emerging multicore systems is one of the ...
Algebraic multigrid (AMG) is a popular solver for large-scale scientific computing and an essential ...
Abstract: Making multigrid algorithms run efficiently on large parallel computers is a challenge. Wi...
We study the potential performance of multigrid algorithms running on massively parallel computers w...
This work presents the first extensive study of single-node performance optimization, tuning, and a...
Many applications in scientific computing require solving one or more partial differential equations...
The solution of elliptic partial differential equations is a common performance bottleneck in scient...
This work presents a parallel implementation of density-based topology optimization using distribute...
We study the performance of a two-level algebraic-multigrid algorithm, with a focus on the impact of...
Abstract. Fast, robust and efficient multigrid solvers are a key numerical tool in the solution of ...