This paper describes an approach for accelerating the Hybrid Total FETI (HTFETI) domain decomposition method using Intel Xeon Phi coprocessors. The HTFETI method is a memory-bound algorithm that uses sparse linear BLAS operations with an irregular memory access pattern. The presented local Schur complement (LSC) method has a regular memory access pattern, which allows the solver to fully utilize the fast memory bandwidth of the Intel Xeon Phi. This translates to a speedup of over 10.9 for the HTFETI iterative solver when solving a heat transfer problem (3D Laplace equation) with 3 billion unknowns on almost 400 compute nodes. The comparison is between the CPU computation using sparse data structures (PARDISO sparse direct solver) and the LSC computation o...
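The contrast between the solve-based path and the LSC path can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the local dual operator has the form S = B K^{-1} B^T with a nonsingular local matrix K (in practice K may need regularization or a generalized inverse), and it uses tiny dense lists in place of real sparse structures. All names (`solve`, `apply_implicit`, `assemble_lsc`) are hypothetical.

```python
# Sketch only: tiny dense stand-ins for one subdomain's matrices.
# Assumption: S = B K^{-1} B^T with nonsingular K.

def solve(K, b):
    """Solve K x = b by Gaussian elimination with partial pivoting."""
    n = len(K)
    A = [row[:] + [b[i]] for i, row in enumerate(K)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for j in range(c, n + 1):
                A[r][j] -= f * A[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = A[i][n] - sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / A[i][i]
    return x

def matvec(M, v):
    """Dense matrix-vector product with contiguous row access."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def apply_implicit(K, B, lam):
    """y = B K^{-1} B^T lam, performing a solve with K in every call."""
    return matvec(B, solve(K, matvec(transpose(B), lam)))

def assemble_lsc(K, B):
    """Precompute the dense matrix S = B K^{-1} B^T once, column by column."""
    # Column j of S is B K^{-1} (B^T e_j), and B^T e_j is row j of B.
    cols = [matvec(B, solve(K, B[j])) for j in range(len(B))]
    return transpose(cols)

# 1D Laplacian on four interior nodes; B picks the two end nodes.
K = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
B = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
lam = [1.0, -2.0]

S = assemble_lsc(K, B)             # done once, before the iterations
y_lsc = matvec(S, lam)             # each iteration: one dense product
y_ref = apply_implicit(K, B, lam)  # each iteration: a solve with K
```

Once S is assembled, every iteration reduces to a single dense matrix-vector product over contiguous memory, which is the kind of regular access pattern the abstract attributes to the LSC method. The price is the one-time assembly cost and the memory needed to store the dense S.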
The Intel® Xeon Phi™ is the first processor based on Intel’s MIC (Many Integrated Cores) architect...
Partial Differential Equations (PDEs) are widely used to simulate many scenarios in science and engi...
Manycores are consolidating in the HPC community as a way of improving performance while keeping power e...
In the paper we provide a comparison of several runtimes which can be used for offloading computatio...
In this article, we present the ExaScale PaRallel finite element tearing and interconnecting SOlver ...
This paper describes our new hybrid parallelization of the Finite Element Tearing and Interconnectin...
The paper describes an efficient direct method to solve an equation Ax = b, where A is a sparse matr...
The bottlenecks related to the numerical solution of many engineering problems are very dependent on...
In this paper we describe different applications we have ported to Intel Xeon Phi architectures, ana...
We describe a hybrid FETI (Finite Element Tearing and Interconnecting) method based on our variant o...
This paper presents the design and implementation of several fundamental dense linear algebra (DLA) ...
Abstract. Intel Xeon Phi is a recently released high-performance co-processor which features 61 core...
In this paper, we propose a lightweight optimization methodology for the ubiquitous sparse matrix-ve...
Abstract. This paper presents the design and implementation of several fundamental dense linear alg...
Most of the computations (subdomain problems) appearing in FETI-type methods are purely local and theref...