With the increasing gap between processor speed and memory speed, a sophisticated memory hierarchy is key to high performance. However, the operating system tends to use the memory hierarchy poorly. This thesis presents a comprehensive characterization and optimization of the performance of multiprocessor memory hierarchies for operating systems. The operating system instruction cache misses are reduced by 81% using a code reorganization scheme tailored to the operating system, guarded sequential prefetching, and stream buffers. The operating system data cache misses are reduced by 53% using a DMA-like pipelined block transfer engine, a selective update protocol, data relocation and privatization, and data prefetching in miss hot spots. The...
To design computers which reach the performance limits of the implementation technology, one must un...
Journal Article: Conventional microarchitectures choose a single memory hierarchy design point target...
We propose a simple structuring technique based on clustering for designing scalable shared memory m...
Designing an operating system for good performance is fundamentally more difficult for shared-memory...
Journal Article: Although microprocessor performance continues to increase at a rapid pace, the growin...
This dissertation examines scalability issues in the design of operating systems for large-scale, sha...
Modern processors apply sophisticated techniques, such as deep cache hierarchies and hardware prefet...
PhD Thesis: Current microprocessors improve performance by exploiting instruction-level parallelism (I...
A processor’s memory hierarchy has a major impact on the performance of running code. As memory hier...
Current microprocessors improve performance by exploiting instruction-level parallelism (ILP). ILP h...
In modern computers, memory hierarchies play a paramount role in improving the average execution tim...
Application performance on modern processors has become increasingly dictated by the use of on-chip ...
Abstract. We introduce the concept of hierarchical clustering as a way to structure shared-memory mu...
The paper presents a task allocation scheme for system-level synthesis of multirate real-time tasks ...
To reduce latency and increase bandwidth to memory, modern microprocessors are often designed with d...