Accessing memory efficiently enough to keep up with the data processing rate is a well-known problem in modern architectures. Regular access patterns, such as constant strides, have been exploited in designing efficient memory systems. The objective of this thesis is to explore the use of hardware support to improve memory performance by harnessing knowledge of access patterns. A novel hardware support called the distTree is proposed to speed up memory accesses whose patterns are known at compile time. The distTree facilitates memory accesses for applications even when the access patterns are irregular. The distTree is implemented, and the areas where it can be adopted are explored. Two adoptable applications are highlighted in...
The explosive increase in data volume in emerging applications poses grand challenges to computing s...
One of the key kernels in scientific applications is the Sparse Matrix Vector Multiplication (SMVM)....
Memory bandwidth is rapidly becoming the performance bottleneck in the application of high performan...
Many algorithms and applications in scientific computing exhibit irregular access patterns as consec...
Gustavson’s algorithm (i.e., the row-wise product algorithm) shows its potential as the backbone...
Sparse matrix-vector multiplication (SpMV) is an important kernel in many scientific applications a...
Many data-intensive applications exhibit poor temporal and spatial locality and perform poorly on co...
The performance gap between CPUs and memory has widened significantly since the 1980's maki...
The bandwidth mismatch between processor and main memory is one major limiting problem. Although str...
Hardware Support for Dynamic Access Ordering: Performance of Some Design Options. Sally A. McKee, Depa...
The sparse matrix-vector multiplication (SpMV) is a fundamental kernel used in computational...
The work presented in this thesis investigates how existing and future computer architectures can be...
The last two decades have witnessed two opposing hardware trends where the DRAM capacity and the acces...
This dissertation presents an architecture to accelerate sparse matrix linear algebra, which is among...
Traditional parallel programming methodologies for improving performance assume cache-bas...