Today there is an urgent need for algorithms, programming language systems and tools, and hardware that deliver on the potential of parallelism due to the end of Dennard scaling. This work (from my PhD dissertation, supervised by Ken Kennedy) was one of the early papers to optimize for and experimentally explore the tension between data locality and parallelism on shared memory machines. A key result was that false sharing of cache lines between processors with local caches on separate chips was disastrous to the performance and scaling of applications. This retrospective includes a short personal tour through the history of parallel computing, a discussion of locality and parallelism modeling versus a polyhedral formulation of optimizi...
The evolution of parallel processing over the past several decades can be viewed as the development ...
This was a two-page overview of my NSF-funded project Supercomputing on a Cluster of Workstations v...
In this paper we present a retrospective on our paper published in ICS 1995, which to the best of our kn...
The goal of this dissertation is to give programmers the ability to achieve high performance by focu...
“A Data Locality Optimizing Algorithm” was one of the first papers published as part of the SUIF p...
University of Minnesota Ph.D. dissertation. September 2014. Major: Computer Science. Advisor: Pen-Ch...
This work was also published as a Rice University thesis/dissertation: http://hdl.handle.net/1911/16...
Thesis (Ph. D.)--University of Rochester. Department of Computer Science, 2017. On modern processors, ...
1. The need for local and parallel optimization. Although processor speeds have been increasing rapid...
To scale applications on multicores up to bigger problems, software systems must be optimized both f...
grantor: University of Toronto. This dissertation proposes and evaluates compiler techniques...
In the past decade, processor speed has become significantly faster than memory speed. Small, fast c...
This dissertation details contributions made by the author to the field of computer science while wo...