This paper describes the design and evaluation of an auto-memoization processor. The key point of this proposal is that it detects multilevel functions and loops without requiring any additional instructions controlled by the compiler. This general-purpose processor detects functions and loops, and memoizes them automatically and dynamically. Hence, any load module or binary program can gain speedup without recompilation or rewriting. We also propose parallel execution by multiple speculative cores and one main memoizing core. While the main core executes a memoizable region, the speculative cores execute the same region simultaneously. The speculative execution uses predicted inputs. This can omit the execution of instruction regions whose inputs show ...
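The reuse idea in the abstract above can be illustrated with a minimal software sketch. This is not the processor's hardware mechanism: the class `MemoTable`, its methods `run` and `precompute`, and the example region `sum_of_squares` are all hypothetical names introduced here, and the sketch assumes a region is a pure function of its integer inputs. `run` plays the role of the main core (look up the inputs, skip execution on a hit), and `precompute` loosely models a speculative core that executes the region ahead of time with predicted inputs.

```python
from typing import Callable, Dict, Tuple


class MemoTable:
    """Hypothetical software stand-in for a reuse table (illustration only)."""

    def __init__(self) -> None:
        # Maps (region name, input values) to the recorded output.
        self.entries: Dict[Tuple[str, Tuple[int, ...]], int] = {}
        self.hits = 0
        self.misses = 0

    def run(self, region: Callable[..., int], *inputs: int) -> int:
        """Main-core path: reuse a stored result or execute and record it."""
        key = (region.__name__, inputs)
        if key in self.entries:
            self.hits += 1          # inputs match: execution is omitted
            return self.entries[key]
        self.misses += 1            # no match: execute and store the result
        result = region(*inputs)
        self.entries[key] = result
        return result

    def precompute(self, region: Callable[..., int], *predicted: int) -> None:
        """Speculative path (assumed model): fill the table from predicted inputs."""
        key = (region.__name__, predicted)
        if key not in self.entries:
            self.entries[key] = region(*predicted)


def sum_of_squares(n: int) -> int:
    """Example memoizable region: a pure function of its input."""
    return sum(i * i for i in range(n))


table = MemoTable()
first = table.run(sum_of_squares, 4)     # miss: region is executed
second = table.run(sum_of_squares, 4)    # hit: stored result is reused
table.precompute(sum_of_squares, 5)      # speculative fill with a predicted input
third = table.run(sum_of_squares, 5)     # hit if the prediction was right
```

In this sketch a correct prediction turns what would have been a miss into a hit, which is the benefit the abstract attributes to the speculative cores; a wrong prediction simply leaves an unused entry in the table.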
Memoization is the technique of saving result of executions so that future exe...
University of Minnesota Ph.D. dissertation. June 2009. Major: Computer Science. Advisors: Prof. Pen-...
An architecture that features dynamic multithreading execution of a single program is studied in thi...
Abstract—We have proposed an auto-memoization processor. This processor automatically and dynamicall...
We have proposed an auto-memoization processor based on computation reuse, and merged it with specul...
Central processing unit (CPU) and graphics processing unit (GPU) are weak (“weak” means ...
Compiler-based auto-parallelization is a much studied area, yet has still not found wide-spread appl...
Many functions perform redundant calculations. Within a single function invocation, several sub-func...
We have proposed mechanisms to implement function memoization at a software level as part of our eff...
Many sequential applications are difficult to parallelize because of unpredictable control flow, ind...
One of the main performance bottlenecks of processors today is the discrepancy between processor and...
With speculative thread-level parallelization, codes that cannot be fully compiler-analyzed are aggr...
We consider extensible processor designs in which the number of gates and the distance that a signal...