This paper describes an analysis method for detecting and minimizing memory latency using a directed data dependency graph produced by a compiler. The results are applicable to the development of methods for optimally generating instruction threads to be executed on a multi-threaded, data-driven architecture. The resulting runtime reductions are achieved by minimizing the memory access times incurred by individual processing elements. Additionally, the analysis can be used to predict measures of achievable parallelism for a given program graph that can be exploited by a reconfigurable, multi-threaded architecture.

1 Introduction

This paper describes a method for the detection and minimization of memory latencies in a data-driven...
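To make the parallelism-prediction idea concrete, the following sketch (in Python, and purely illustrative rather than the paper's own analysis) treats a directed data dependency graph as a set of nodes with per-node execution/memory costs and estimates achievable parallelism as total work divided by the cost of the critical path. The graph encoding, the cost values, and the achievable_parallelism helper are assumptions introduced here for illustration only.

from collections import defaultdict

def achievable_parallelism(costs, edges):
    """Estimate parallelism of a directed data dependency graph.

    costs: dict mapping node name -> execution/memory cost (illustrative units)
    edges: list of (producer, consumer) dependency pairs
    """
    succs = defaultdict(list)
    indeg = {n: 0 for n in costs}
    for u, v in edges:
        succs[u].append(v)
        indeg[v] += 1

    # Longest-cost path ending at each node, computed in topological order.
    finish = dict(costs)
    ready = [n for n in costs if indeg[n] == 0]
    while ready:
        u = ready.pop()
        for v in succs[u]:
            finish[v] = max(finish[v], finish[u] + costs[v])
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    work = sum(costs.values())   # total cost of all nodes
    span = max(finish.values())  # critical-path cost: a lower bound on any schedule
    return work / span

# Hypothetical example: two independent loads feeding a multiply, then an add.
costs = {"load_a": 4, "load_b": 4, "mul": 1, "add": 1}
edges = [("load_a", "mul"), ("load_b", "mul"), ("mul", "add")]
print(achievable_parallelism(costs, edges))  # work = 10, span = 6 -> ~1.67

In this small example the two loads can proceed concurrently, so the estimate exceeds one even though the multiply and add remain serialized on the critical path.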