While the chip multiprocessor (CMP) has quickly become the predominant processor architecture, its continuing success largely depends on the parallelizability of complex programs. In the early 1990s, considerable success was achieved in extracting parallelism from the inner loops of scientific computations. In this paper we show that significant amounts of coarse-grain parallelism exist in outer program loops, even in general-purpose programs. This coarse-grain parallelism can be exploited efficiently on CMPs without additional hardware support.

KEYWORDS: thread-level parallelism, coarse-grain parallelism, DO-ACROSS
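To make the kind of outer-loop DO-ACROSS parallelism referred to above concrete, the sketch below shows a hand-parallelized outer loop using OpenMP doacross loops (available since OpenMP 4.5): the bulk of each outer iteration runs concurrently on different cores of a CMP, while a single cross-iteration dependence is synchronized in software. This is only an illustrative sketch under assumed array names, sizes, and a stand-in work() kernel; it is not the extraction technique described in the paper.

/* Illustrative sketch only: a hand-parallelized outer-loop DO-ACROSS using
 * OpenMP 4.5 doacross loops. The arrays, sizes, and work() kernel are
 * hypothetical stand-ins, not taken from the paper. */
#include <omp.h>

#define N 1024
#define M 4096

static double a[N][M];
static double carry[N];   /* carries the cross-iteration dependence */

static double work(double x) { return x * 1.000001 + 1.0; }  /* stand-in inner kernel */

void outer_doacross(void)
{
    #pragma omp parallel for ordered(1)
    for (int i = 1; i < N; i++) {
        double local = 0.0;

        /* Coarse-grain, independent part of the iteration: executes in
         * parallel across cores without synchronization. */
        for (int j = 0; j < M; j++)
            local += work(a[i][j]);

        /* Cross-iteration dependence: wait until iteration i-1 has posted,
         * then update and post for iteration i+1. */
        #pragma omp ordered depend(sink: i - 1)
        carry[i] = carry[i - 1] + local;
        #pragma omp ordered depend(source)
    }
}

Because the independent work per iteration is large relative to the single post/wait synchronization, the loop behaves as a coarse-grain DO-ACROSS and needs no hardware support beyond ordinary shared memory.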