Block memory operations are frequently performed by the operating system and consume an increasing fraction of kernel execution time. These operations include memory copies, page zeroing, interprocess communication, and networking. This thesis demonstrates that performance of these common OS operations is highly dependent on the cache state and future use pattern of the data. This thesis argues that prediction of both initial cache state and data reuse patterns can be used to dynamically select the optimal algorithm. It describes an innovative method for predicting the state of the cache by using a single cache-line probe. The performance of networking, which is dominated by kernel copies, is improved by the addition of dedicated hardware ...
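To make the idea of selecting an algorithm from a predicted cache state and reuse pattern concrete, here is a minimal sketch in Python. It is not the thesis's implementation: the strategy names, the `HIT_THRESHOLD_CYCLES` value, and the function names are all hypothetical, and the latency of the single cache-line probe is passed in as a number instead of being measured on real hardware.

```python
from enum import Enum


class CopyStrategy(Enum):
    """Hypothetical block-copy variants; the names are assumptions for illustration."""
    CACHED_LOOP = "cached_loop"    # plain load/store loop; source is likely in cache
    PREFETCHING = "prefetching"    # source likely in memory, destination reused soon
    CACHE_BYPASS = "cache_bypass"  # source likely in memory, destination not reused soon


# Assumed cutoff: a probe load faster than this is treated as a cache hit.
# A real value would have to be calibrated per machine.
HIT_THRESHOLD_CYCLES = 40


def predict_cached(probe_latency_cycles: int,
                   threshold: int = HIT_THRESHOLD_CYCLES) -> bool:
    """Guess whether the whole block is cached from one probed cache line."""
    return probe_latency_cycles <= threshold


def select_copy_strategy(probe_latency_cycles: int,
                         dest_reused_soon: bool) -> CopyStrategy:
    """Pick a copy variant from the predicted cache state and expected reuse."""
    if predict_cached(probe_latency_cycles):
        return CopyStrategy.CACHED_LOOP
    if dest_reused_soon:
        return CopyStrategy.PREFETCHING
    return CopyStrategy.CACHE_BYPASS
```

For example, `select_copy_strategy(250, dest_reused_soon=False)` returns `CopyStrategy.CACHE_BYPASS` under these assumptions, because the slow probe suggests the source is not cached and the destination is not expected to be touched again soon.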
Modern life demands fast computations. Even the slightest latencies can have severe consequences and...
Performance is an important aspect of computer systems since it directly affects user experience. On...
Cache injection is a viable technique to improve the performance of data-intensive parallel applicat...
Blocking is a well-known optimization technique for improving the effectiveness of memory hierarchie...
As buffer cache is used to overcome the speed gap between processor and storage devices, performance...
Architects have adopted the shared memory model that implicitly manages cache coherence and cache ca...
We introduce a novel approach to predict whether a block should be allocated in the cache or not upo...
Cache memory is a memory which is used by the central processing unit in a computer to reduce the bu...
Processor performance is directly impacted by the latency of the memory system. As processor core cy...
Cache memory is one of the most important components of a computer system. The cache allows quickly...
An ideal high performance computer includes a fast processor and a multi-million byte memory of comp...
Our thesis is that operating systems should manage the on-chip shared caches of multicore processors...
With the software applications increasing in complexity, description of hardware is becoming increas...
Memory (cache, DRAM, and disk) is in charge of providing data and instructions to a computer's pr...
This paper presents a Least Popularly Used buffer cache algorithm to exploit both temporal locality ...