In this paper, we present a new architecture for a cache-based memory copy hardware accelerator in a multicore system supporting message passing. The accelerator speeds up memory data movements, in particular memory copies. We use an analytical model based on open-queuing theory to study the utilization of our accelerator in a multicore system. To model the system correctly, we gather the necessary information with a full-system simulator. We present both the simulation results and the analytical analysis. We demonstrate the advantages of our solution on a full-system simulator using the STREAM benchmark and the receiver side of the TCP/IP stack. Our accelerator provides s...
Data or instructions that are regularly used are saved in cache so that it is very easy to retrieve ...
To increase performance, modern processors employ complex techniques such as out-of-order pipelines ...
This paper presents the design and evaluation of the M-cache, a small, fast and intelligent memory f...
This dissertation presents a hardware accelerator that is able to accelerate large (including non-pa...
Memory copies for bulk data transport incur large overheads due to CPU stalling, small register-size...
To reduce the average time needed to perform a read or a write access in a multiprocessor, a cache i...
With rapidly evolving technology, multicore and manycore processors have emerged as promising archit...
We introduce the Execution Migration Machine (EM²), a novel data-centric multicore memory system arc...
One of the key challenges in advanced micro-architecture is to provide high performance h...
Our thesis is that operating systems should manage the on-chip shared caches of multicore processors...
We describe new multi-ported cache designs suitable for use in FPGA-based processor/parall...
Single chip multicore processors are now prevalent and processors with hundreds of cores are being p...
Bulk memory copying and initialization is one of the most ubiquitous operations performed i...
In heterogeneous computer architectures, the serial part of an application is coupled with domain-sp...
Caches are known to consume a large part of total microprocessor power. Traditionally, voltage scali...