On recent high-performance multiprocessors, there is a potential conflict between the goals of achieving the full performance potential of the hardware and providing a parallel programming environment that makes effective use of programmer effort. On one hand, an explicit coarse-grain programming style may appear to be necessary, both to achieve good cache performance and to limit the amount of overhead due to context switching and synchronization. On the other hand, it may be more expedient to use more natural and finer-grain programming styles based on abstractions such as task heaps, light-weight threads, parallel loops, or object-oriented parallelism. Unfortunately, using these styles can cause a loss of performance due to poor locality...
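To make the tradeoff in the abstract concrete, the following C++ sketch (illustrative only, not code from this or any listed paper; the function names, the worker count, and the task-size parameter are invented for the example) contrasts a coarse-grain parallel sum, where each worker scans one contiguous block, with a fine-grain version in which workers claim small tasks from a shared atomic counter.

// Illustrative sketch only: it contrasts the two styles named in the abstract.
// sum_coarse gives each worker one contiguous block (good spatial locality,
// little synchronization); sum_fine has workers claim small tasks from a shared
// atomic counter (more natural load balancing, but every claim synchronizes and
// a worker's consecutive tasks need not touch adjacent data).
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <thread>
#include <vector>

// Coarse grain: one contiguous block per worker.
double sum_coarse(const std::vector<double>& a, unsigned workers) {
    std::vector<double> partial(workers, 0.0);
    std::vector<std::thread> pool;
    const std::size_t chunk = a.size() / workers;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t lo = w * chunk;
        const std::size_t hi = (w + 1 == workers) ? a.size() : lo + chunk;
        pool.emplace_back([&a, &partial, w, lo, hi] {
            for (std::size_t i = lo; i < hi; ++i) partial[w] += a[i];
        });
    }
    for (auto& t : pool) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0.0);
}

// Fine grain: workers repeatedly claim small tasks; `task` (the task size)
// is an invented parameter for this example.
double sum_fine(const std::vector<double>& a, unsigned workers, std::size_t task) {
    std::atomic<std::size_t> next{0};
    std::vector<double> partial(workers, 0.0);  // adjacent slots can false-share
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&a, &partial, &next, task, w] {
            for (;;) {
                const std::size_t lo = next.fetch_add(task);  // synchronized claim
                if (lo >= a.size()) break;
                const std::size_t hi = std::min(lo + task, a.size());
                for (std::size_t i = lo; i < hi; ++i) partial[w] += a[i];
            }
        });
    }
    for (auto& t : pool) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0.0);
}

int main() {
    std::vector<double> a(1 << 20, 1.0);
    const unsigned workers = std::max(1u, std::thread::hardware_concurrency());
    std::cout << sum_coarse(a, workers) << " " << sum_fine(a, workers, 256) << "\n";
}

With a small task size, the fine-grain version pays one atomic operation per task and gives up the streaming access pattern of the coarse-grain version, which is precisely the locality and overhead cost the abstract warns about.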
Lightweight threads have become a common abstraction in the field of programming languages and opera...
To scale applications on multicores up to bigger problems, software systems must be optimized both f...
Today, almost all desktop and laptop computers are shared-memory multicores, but the code they run i...
The task parallel programming model allows programmers to express concurrency at a high level of abs...
This dissertation (University of Toronto) proposes and evaluates compiler techniques...
This paper describes a method to improve the cache locality of sequential programs by scheduling fin...
Thesis (Ph.D.), University of Rochester, Dept. of Computer Science, 1993. Simultaneously published...
Modern computer architectures expose an increasing number of parallel features supported by complex ...
The emergence of multi-core systems opens new opportunities for thread-level parallelism an...
The emergence of commercial multiprocessors has prompted computer scientists to take a closer look a...
Making computer systems more energy efficient while obtaining the maximum performance possible is ke...
Single-threaded tasks are the basic unit of scheduling in modern runtimes targeting multicore hardwa...
In systems with a complex many-core cache hierarchy, exploiting data locality can significantly reduc...
Increased programmability for concurrent applications in distributed systems requires automatic supp...
A wide variety of computer architectures have been proposed to exploit parallelism at different gran...