The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified, and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The predictions of the parallel performance models were compared to measurements from applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead.
Today, large-scale parallel systems are available at low cost. Many po...
© 2018 The Author(s). Porting scientific key algorithms to HPC architectures requires a thorough und...
The shift towards multicore processing has led to a much wider population of developers being faced ...
A limitation on the parallel performance of TORT on the CRAY J90 is the amount of extra work introdu...
Today most of the multiprocessor supercomputer systems are still used within a multiprogramming envi...
This paper presents a queueing model to measure the performance of parallel processing net...
The multitasking options in the three-dimensional neutral particle transport code TORT originally im...
This thesis presents a unified approach to modeling of parallel architectures and algorithms with sp...
While the efficiency of autotasking with respect to speedup values already is proved for parallel pr...
The CPUs, memory, interconnection network, operating system, runtime system, I/O subsystem, and appl...
In this paper we show that it is feasible to characterize the overheads present in conservative para...
The performance of a computer system is important. One way of improving performance is to use multip...
Many parallel algorithms can be modelled as directed acyclic task graphs. Recently, Degree of Simult...
In this paper, we investigate the traffic characteristics of parallel and high performance computi...
A parallel program should be evaluated to determine its efficiency, accuracy and benefits...