This paper describes several techniques designed to improve protocol latency and reports on their effectiveness when measured on a modern RISC machine employing the DEC Alpha processor. We found that the memory system, which has long been known to dominate network throughput, is also a key factor in protocol latency. In particular, improving instruction cache effectiveness can greatly reduce protocol processing overheads. An important metric in this context is the memory cycles per instruction (mCPI), which is the average number of cycles that an instruction stalls waiting for a memory access to complete. The techniques presented in this paper reduce the mCPI by up to a factor of 5.8. In analyzing the effectiveness of the techniques, we...
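As a rough illustration of the mCPI metric defined in the abstract above, the following minimal sketch computes it from two counters and expresses an improvement as a reduction factor. The function names and the cycle and instruction counts are hypothetical and chosen only to make the arithmetic concrete; they are not measurements from the paper, and the paper's own measurement method is not shown here.

```python
def mcpi(memory_stall_cycles: int, instructions: int) -> float:
    """Average number of cycles an instruction stalls waiting on memory."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return memory_stall_cycles / instructions


def reduction_factor(mcpi_before: float, mcpi_after: float) -> float:
    """How many times smaller the mCPI is after an optimization."""
    if mcpi_after <= 0:
        raise ValueError("mcpi_after must be positive")
    return mcpi_before / mcpi_after


# Hypothetical counter values (assumptions for illustration only).
before = mcpi(memory_stall_cycles=29_000, instructions=10_000)
after = mcpi(memory_stall_cycles=5_000, instructions=10_000)

print(f"mCPI before: {before:.2f}")
print(f"mCPI after:  {after:.2f}")
print(f"reduction factor: {reduction_factor(before, after):.1f}")
```

With these made-up counts the same instruction stream stalls less often on memory, so the mCPI drops while the instruction count stays fixed. A factor such as the 5.8 quoted in the abstract is a ratio of this kind.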
Integrated circuits have been in constant progression since the first prototype in 1958, with the se...
Networked information systems have seen explosive growth in the last few years, and are transforming...
By examining the rate at which successive generations of processor and DRAM cycle tim...
TCP/IP protocol processing latency has been an important issue in high-speed networks. In this paper...
Many techniques have been discovered to improve performance of bulk data transfer protocols which us...
Networking research and development have historically focused on increasing network throughput and p...
To meet the demand for higher performance, flexibility, and economy in today's state-...
This paper presents detailed measurements of processing overheads for the Ultrix 4.2a implementation...
Current microprocessors improve performance by exploiting instruction-level parallelism (ILP). ILP h...
In a heterogeneous computing environment, computers have to use a suitable transfer syntax to commun...
PhD Thesis. Current microprocessors improve performance by exploiting instruction-level parallelism (I...
Processor performance is directly impacted by the latency of the memory system. As processor core cy...
The increasing demand for more and more computing power causes steady advancements of High Performan...
Performance improvements in memory systems have traditionally been obtained by scaling data bus widt...
In both hardware-only and software-only directory protocols the performance is often limited by memo...