A coarse-grain parallel program typically has one thread (task) per processor, whereas a fine-grain program has one thread for each independent unit of work. Although fine-grain parallelism has several advantages, conventional wisdom holds that coarse-grain parallelism is more efficient. This paper illustrates the advantages of fine-grain parallelism and presents an efficient implementation for shared-memory machines. The approach has been implemented in a portable software package called Filaments, which employs a unique combination of techniques to achieve efficiency. The performance of the fine-grain programs discussed in this paper is always within 13 percent of a hand-coded coarse-grain program and is usually within 5 percent. January 29...
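The distinction between the two decompositions can be sketched in a few lines. The following is an illustrative sketch only, not the Filaments package or its interface: it uses Python's standard thread pool, and the names `unit_of_work`, `coarse_grain`, and `fine_grain` are hypothetical, chosen for this example. It shows how the work is divided, not how either style performs.

```python
from concurrent.futures import ThreadPoolExecutor


def unit_of_work(i):
    # Hypothetical independent unit of work (an assumption for illustration).
    return i * i


def coarse_grain(n, workers):
    # Coarse grain: one task per worker; each task loops over a contiguous block.
    def block(w):
        lo = w * n // workers
        hi = (w + 1) * n // workers
        return [unit_of_work(i) for i in range(lo, hi)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(block, range(workers)))
    # Flatten the per-worker blocks back into one result list, in order.
    return [x for part in parts for x in part]


def fine_grain(n, workers):
    # Fine grain: one task per independent unit of work; the pool schedules them.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(unit_of_work, range(n)))


if __name__ == "__main__":
    print(coarse_grain(10, 4) == fine_grain(10, 4))
```

In the coarse-grain version the programmer computes the block boundaries by hand; in the fine-grain version the program simply names each unit of work and leaves the mapping onto workers to the runtime.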
The last decade has produced enormous improvements in processor speeds without a corresponding impro...
With ubiquitous multi-core architectures, a major challenge is how to effectively use these machines...
While the chip multiprocessor (CMP) has quickly become the predominant processor architecture, its c...
It has long been thought that coarse-grain parallelism is much more efficient than fine-grain para...
A fine-grain parallel program is one in which processes are typically small, ranging from a few to a...
This dissertation addresses creating portable and efficient parallel programs for scientific computi...
While parallel programming is needed to solve large-scale scientific applications, it is more diffic...
With the rise of chip-multiprocessors, the problem of parallelizing general-purpose programs has onc...
© 2017 IEEE. The overwhelming wealth of parallelism exposed by Extreme-scale computing is rekindling...
Symmetric multiprocessors (SMPs) connected with low-latency networks provide attractive building blo...
Parallel systems supporting a shared memory programming interface have been implemented both in soft...
A wide variety of computer architectures have been proposed to exploit parallelism at different gran...