We investigate the construction and application of parallel software caches in shared memory multiprocessors. In contrast to maintaining a private cache for each thread, a parallel cache allows the re-use of results of lengthy computations by other threads. This is especially important in irregular applications where the re-use of intermediate results by scheduling is not possible. Example applications are the computation of intersections between a scanline and a polygon in computational geometry, and the computation of intersections between rays and objects in ray tracing. A parallel software cache is based on a readers/writers lock, i.e. as long as no thread alters the cache data structure, multiple threads may read simultaneously. If a t...
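The readers/writers scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract is truncated before it says what happens when a thread modifies the cache, so the miss-handling policy here (compute outside the lock, then insert under the write lock) is an assumption. The names `ReadWriteLock`, `ParallelCache`, and `compute` are hypothetical.

```python
import threading


class ReadWriteLock:
    """Minimal readers/writer lock: many concurrent readers or one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0        # number of threads currently reading
        self._writing = False    # True while a writer holds the lock

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake a waiting writer, if any

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()  # wake waiting readers and writers


class ParallelCache:
    """Cache shared by all threads: lookups read-lock, insertions write-lock."""

    def __init__(self, compute):
        self._compute = compute  # the lengthy computation (hypothetical)
        self._data = {}
        self._lock = ReadWriteLock()

    def get(self, key):
        # Fast path: many threads may search the cache at the same time.
        self._lock.acquire_read()
        try:
            if key in self._data:
                return self._data[key]
        finally:
            self._lock.release_read()

        # Miss (assumed policy): compute without holding any lock,
        # so that other readers are not blocked in the meantime.
        value = self._compute(key)

        # Slow path: exclusive access while the data structure changes.
        self._lock.acquire_write()
        try:
            # Another thread may have inserted the key meanwhile; keep its value.
            return self._data.setdefault(key, value)
        finally:
            self._lock.release_write()
```

A thread would call, for example, `cache = ParallelCache(intersect)` once and then `cache.get(key)` from every worker, where `intersect` stands in for the expensive scanline or ray intersection. Under this policy two threads that miss on the same key at the same time both compute it, which trades some duplicated work for shorter write-lock hold times. This simple lock also lets a steady stream of readers starve a writer.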
Cache thrashing due to true data sharing can degrade the performance of parallel programs si...
Maximal utilization of cores in multicore architectures is key to realize the potential performance ...
We describe an efficient software cache consistency mechanism for shared memory multiprocessors that...
Technical report. The next generation of scalable parallel systems (e.g., machines by KSR, Convex, and...
Reordering instructions and data layout can bring significant performance improvement for memory bou...
Multi-core processors have become the dominant processor architecture with 2, 4, and 8 cores on a ch...
Improvements in the processing speed of multiprocessors are outpacing improvements in the speed of d...
A wide variety of computer architectures have been proposed to exploit parallelism at different gran...
Thesis (Ph. D.)--University of Rochester. Dept. of Computer Science, 2011. Computer architects have e...
Applications with regular patterns of memory access can experience high levels of cache conflict mis...
An adaptive cache coherence mechanism exploits semantic information about the expected or observed a...
The goal of the RAP-WAM AND-parallel Prolog abstract architecture is to provide inference speeds sig...