The authors propose a general code optimization method for implementing polynomial approximation functions on clustered instruction-level parallelism (ILP) processors. The method first introduces a parallel algorithm with minimized data dependency. The data dependency graph (DDG) constructed from this parallel algorithm is then scheduled and mapped to appropriate clusters and functional units of a specific clustered ILP processor using the proposed parallel scheduling and mapping (PSAM) algorithm. The PSAM algorithm prioritizes nodes on the critical path to minimize the total schedule length and ensures that the resulting schedule satisfies the resource constraints imposed by the target clustered ILP processor. As a resu...
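The idea of prioritizing critical-path nodes under a resource limit can be illustrated with generic list scheduling. The sketch below is not the PSAM algorithm, whose details are not given here; it is a minimal illustration under stated assumptions: a single cluster, `num_units` identical and fully pipelined functional units, and hypothetical latencies of 2 cycles for a multiply and 1 cycle for an add. The example DDG (node names `m1`, `m3`, `sq`, `s0`, `s1`, `m`, `r` are invented for illustration) evaluates a cubic as (a0 + a1*x) + (a2 + a3*x) * x^2, one possible low-dependency form.

```python
def critical_path_priorities(succs, latency):
    """Longest latency-weighted path from each node to any sink of the DDG."""
    memo = {}

    def height(node):
        if node not in memo:
            below = max((height(s) for s in succs.get(node, ())), default=0)
            memo[node] = latency[node] + below
        return memo[node]

    return {node: height(node) for node in latency}


def list_schedule(succs, latency, num_units):
    """Return {node: issue_cycle}, issuing at most num_units nodes per cycle.

    Assumption: units are identical and fully pipelined, so the only
    resource constraint modeled is the issue width per cycle.
    """
    prio = critical_path_priorities(succs, latency)
    preds_left = {node: 0 for node in latency}
    for node, targets in succs.items():
        for t in targets:
            preds_left[t] += 1

    ready_at = {node: 0 for node in latency}  # cycle when operands are available
    ready = [node for node in latency if preds_left[node] == 0]
    start = {}
    cycle = 0
    while len(start) < len(latency):
        # Nodes whose operands are available now, longest remaining path first.
        candidates = sorted(
            (n for n in ready if ready_at[n] <= cycle),
            key=lambda n: (-prio[n], n),
        )
        for node in candidates[:num_units]:
            start[node] = cycle
            ready.remove(node)
            for t in succs.get(node, ()):
                ready_at[t] = max(ready_at[t], cycle + latency[node])
                preds_left[t] -= 1
                if preds_left[t] == 0:
                    ready.append(t)
        cycle += 1
    return start


# Hypothetical DDG for (a0 + a1*x) + (a2 + a3*x) * x^2.
succs = {
    "m1": ["s0"],  # m1 = a1 * x
    "m3": ["s1"],  # m3 = a3 * x
    "sq": ["m"],   # sq = x * x
    "s0": ["r"],   # s0 = a0 + m1
    "s1": ["m"],   # s1 = a2 + m3
    "m": ["r"],    # m  = s1 * sq
}
latency = {"m1": 2, "m3": 2, "sq": 2, "m": 2, "s0": 1, "s1": 1, "r": 1}

schedule = list_schedule(succs, latency, num_units=2)
schedule_length = max(schedule[n] + latency[n] for n in schedule)
```

With two units, the multiply feeding the longest dependency chain (`m3`) is issued first, and the resulting schedule length equals the critical-path length of this small graph. A clustered processor would additionally require assigning nodes to clusters and accounting for inter-cluster communication, which this sketch deliberately omits.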
We advocate using performance bounds to guide code optimizations. Accurate performance bounds establ...
Algorithms for the evaluation of polynomials on a hypothetical computer with k independent arithmeti...
There exist significant, well-established code bases in the scientific computing and research commun...
Exact computation and manipulation of polynomial equations can be performed by symbolic polynomial m...
We consider the following scheduling problem. There are m parallel machines and n independent jobs. Ea...
Three related problems, among others, are faced when trying to execute an algorithm on a parallel ma...
This paper presents a novel scheme to schedule loops for clustered microarchitectures. The scheme is...
The model of malleable task (MT) was introduced some years ago and has been proved to be an efficien...
This paper describes a number of optimizations that can be used to support the efficient execution o...
This work focuses on dynamic DAG scheduling under memory constraints. We targe...
Tech Report. This paper is a study of scheduling on a 2-processor distributed system when one processo...
Modern superscalar architectures with dynamic scheduling and register renaming capabilities have int...
This thesis studies a heuristic approach to scheduling on a 2-processor distributed system when one...
VLIW (Very Long Instruction Word) processors issue and execute multiple operations in parallel, on d...