Power consumption and fabrication limitations are increasingly playing significant roles in the design of extreme scale parallel systems. These factors are influencing system designers to support higher on-node computing capability via throughput-optimized processors instead of latency-optimized processors. However, the inter- and intra-processor communication capabilities on such systems are not increasing at the same rate as the on-node computing capability. Consequently, achieving high performance requires careful orchestration of both single- and multiprocessor parallelism. This thesis shows that compiler technology and expressive programming model constructs can help applications more effectively exploit both forms of parallelism. Co...
Distributed-memory message-passing machines deliver scalable performance but are difficult to progr...
The end of Dennard scaling also brought an end to frequency scaling as a means to improve performanc...
The current trends in high performance computing show that large machines with tens of thousands of ...
Parallel computing is regarded by most computer scientists as the most likely approach for significa...
As the demand increases for high performance and power efficiency in modern computer runtime systems...
Data-parallel languages allow programmers to use the familiar machine-independent programming style ...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/19...
Since the invention of the transistor, clock frequency increase was the primary method of improving ...
This paper describes methods to adapt existing optimizing compilers for sequential languages to prod...
The goal of this dissertation is to give programmers the ability to achieve high performance by focu...
This paper introduces the goals of the Portable, Scalable, Architecture Independent (PSI) Compiler P...
Modern computers will increasingly rely on parallelism to achieve high computation rates. Techniques...
Over the past few decades, scientific research has grown to rely increasingly on simulation and othe...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/16...