Modern computer systems have evolved to employ powerful parallel architectures, including multi-core processors, multi-socket chips, large memory subsystems, and fast network communication. Given such powerful hardware, developers rely on performance profiling and modeling to guide their optimization efforts. However, emerging computer systems pose new challenges to the efficiency and accuracy of performance optimization. In this dissertation, we propose approaches to address these challenges. We first study memory contention in Non-Uniform Memory Access (NUMA) architectures. We present DR-BW, a new machine-learning-based tool that identifies bandwidth contention in NUMA architectures and provides optimization guidance. DR-BW collec...
Consistently growing architectural complexity and machine scales make creating accurate performance ...
An important aspect of workload characterization is understanding memory system performance...
The many configuration options of modern applications make it difficult for users to select a perfor...
Applications may have unintended performance problems in spite of compiler optimizations, because of...
To address the need of understanding and optimizing the performance of complex applications an...
While parallel computing offers an attractive perspective for the future, developing efficient paral...
To analyze the performance of applications and architectures, both programmers and architects desire...
As the number of cores increases, Non-Uniform Memory Access (NUMA) is becoming increasingly prevalent...
The memory system is increasingly becoming a performance bottleneck. Several intelligent memory syst...
Performance is the critical feature in the design and productivity of software systems. A key to imp...
To reduce latency and increase bandwidth to memory, modern microprocessors are often designed with d...
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Compute...
Projecting performance of applications and hardware is important to several market segments—hardware...
The increasing programmability, performance, and cost-effectiveness of GPUs have led to a widespread...
I/O is one of the main performance bottlenecks for many data-intensive scientific applications. Accu...