This paper discusses the importance of memory access optimizations, which are shown to be highly effective on the MasPar architecture. The study is based on two MasPar machines: a 16K-processor MP-1 and a 4K-processor MP-2. A software pipelining technique overlaps memory accesses with computation and/or communication. Another optimization, called the register window technique, reduces the number of loads in a loop. These techniques are evaluated using three parallel matrix multiplication algorithms on both MasPar machines. The matrix multiplication study shows that, for a highly computation-intensive problem, reducing interprocessor communication can become a secondary issue compared to memory access optimization. Also, it is shown t...
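The two optimizations named in the abstract can be illustrated with a small model. The sketch below is not the paper's code: Python stands in for the machine's data-parallel language, and `CountingMemory`, `sum3_naive`, `sum3_window`, and `dot_pipelined` are hypothetical names introduced only for illustration. It assumes that "register window" means keeping recently loaded values in registers so that each loop iteration issues fewer loads, and it shows software pipelining only as a reordering (loads for the next iteration are issued before the arithmetic of the current one). Python runs sequentially, so no real overlap occurs here.

```python
class CountingMemory:
    """Hypothetical stand-in for processor memory that counts loads."""

    def __init__(self, data):
        self._data = list(data)
        self.loads = 0

    def load(self, i):
        self.loads += 1
        return self._data[i]


def sum3_naive(mem, n):
    """3-point sliding sum with three loads per iteration."""
    return [mem.load(i - 1) + mem.load(i) + mem.load(i + 1)
            for i in range(1, n - 1)]


def sum3_window(mem, n):
    """Same sum, but the last two values stay in locals (the 'window'),
    so each iteration issues only one new load."""
    if n < 3:
        return []
    left, mid = mem.load(0), mem.load(1)
    out = []
    for i in range(1, n - 1):
        right = mem.load(i + 1)          # the only load in the loop body
        out.append(left + mid + right)
        left, mid = mid, right           # slide the window
    return out


def dot_pipelined(mem_a, mem_b, n):
    """Dot product with loads for iteration i+1 issued before the
    arithmetic of iteration i (prologue / steady state / epilogue)."""
    if n == 0:
        return 0
    a_cur, b_cur = mem_a.load(0), mem_b.load(0)      # prologue
    total = 0
    for i in range(n - 1):
        a_next, b_next = mem_a.load(i + 1), mem_b.load(i + 1)
        total += a_cur * b_cur                       # uses earlier loads
        a_cur, b_cur = a_next, b_next
    return total + a_cur * b_cur                     # epilogue
```

In this model, the windowed loop over six elements issues 6 loads where the naive loop issues 12. On hardware that can overlap memory access with arithmetic, the pipelined ordering is what would let the next loads proceed while the current multiply-add executes.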
Abstract. Moore’s Law suggests that the number of processing cores on a single chip increases expone...
In the last three years, GPUs have increasingly been used for general-purpose applications instead ...
During the last half-decade, a number of research efforts have centered around developing software f...
This paper reports an experimental study on the suitability of systolic algorithms scheduling method...
Thesis (Ph.D.), School of Electrical Engineering and Computer Science, Washington State UniversityPa...
One of the critical problems facing designers of high performance processors is the disparity betwee...
The performance of memory-bound commercial applications such as databases is limited by increasing m...
This work explores the tradeoffs of the memory system of a new massively parallel multiprocessor in ...
Abstract. Traditional parallel programming methodologies for improving performance assume cache-bas...
The multicore era has initiated a move to ubiquitous parallelization of software. In the process, co...
ABSTRACT: High performance microprocessor design using Q-Dot technology addresses the key design iss...
Accessing the memory efficiently to keep up with the data processing rate is a well known problem in...
While parallel computing offers an attractive perspective for the future, developing efficient paral...
Parallel algorithms play an imperative role in the high performance computing environment. Dividing ...