Many applications provide inherent resilience to some amount of error and can potentially trade accuracy for performance by using approximate computing. Applications running on GPUs often use local memory to minimize the number of global memory accesses and to speed up execution. Local memory can also be very useful to improve the way approximate computation is performed, e.g., by improving the quality of approximation with data reconstruction techniques. This paper introduces local memory-aware perforation techniques specifically designed for the acceleration and approximation of GPU kernels. We propose a local memory-aware kernel perforation technique that first skips the loading of parts of the input data from global memory, and later us...
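The perforation idea described above — skip part of the input loads, then reconstruct the skipped values from the data that was loaded — can be illustrated with a minimal Python sketch. The function names, the 1-D input, and linear interpolation as the reconstruction step are all assumptions made for illustration; the paper itself targets GPU kernels and local memory.

```python
# Sketch of kernel perforation with data reconstruction.
# perforate() simulates skipping global-memory loads; reconstruct()
# approximates the skipped values from the loaded neighbours.
# This is an illustrative model, not the paper's actual GPU kernels.

def perforate(data, rate=2):
    """Keep only every `rate`-th element (the loads actually performed)."""
    return data[::rate]

def reconstruct(loaded, rate, n):
    """Rebuild an n-element array from perforated loads.

    Interior gaps are filled by linear interpolation between the two
    nearest loaded elements; the trailing boundary falls back to the
    last loaded value (a source of approximation error).
    """
    out = []
    for i in range(n):
        j, r = divmod(i, rate)
        if r == 0 or j + 1 >= len(loaded):
            out.append(loaded[min(j, len(loaded) - 1)])
        else:
            frac = r / rate
            out.append((1 - frac) * loaded[j] + frac * loaded[j + 1])
    return out
```

For linearly varying data the reconstruction is exact except at the trailing boundary, which hints at why reconstruction quality (and hence where the perforation pattern falls) matters for the accuracy/performance trade-off.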
Approximate computing, where computation accuracy is traded off for better performance or higher dat...
Parallelism is everywhere, with co-processors such as Graphics Processing Units (GPUs) accelerating ...
An application that can produce a useful result despite some level of computational error is said to...
Accepted for 2019 International Conference on High Performance Computing & Simulation (HPCS). Approxim...
Faster and more efficient hardware is needed to handle the rapid growth of Big Data processing. Appl...
Due to the diversity of processor architectures and application memory access patterns, the...
Iterative memory-bound solvers commonly occur in HPC codes. Typical GPU implementations have a loop ...
Many modern computations (such as video and audio encoders, Monte Carlo simulations, and machine lea...
Variation in performance and power across manufactured parts and their operating conditions is an ac...
This paper describes the implementation of approximate memory support in Linux operating system kern...
Improving power consumption and performance of error tolerant applications is the target of the desi...
Graph algorithms have gained popularity and are utilized in high performance and mobile computing pa...
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far coul...