Many high-performance applications run well below the peak arithmetic performance of the underlying machine, with inefficiencies often attributed to a lack of memory bandwidth. In this work we examine two emerging media processors designed to address the well-known gap between processor and memory performance, in the context of scientific computing. The VIRAM architecture uses novel PIM technology to combine embedded DRAM with a vector co-processor to exploit its large bandwidth potential. The Imagine architecture, on the other hand, provides a stream-aware memory hierarchy to support the tremendous processing potential of its SIMD-controlled VLIW clusters. First we develop a scalable synthetic probe that allows us to parameterize ...
General-purpose processors and accelerators including system-on-a-chip and graphics processing units...
In this thesis, image and video processing algorithms, especially the compression algorithms, are fi...
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing...
This work presents two emerging media microprocessors, VIRAM and Imagine, and compares the implement...
Media applications are characterized by large amounts of available parallelism, little data reuse, a...
Many architectural ideas that appear to be useful from a hardware standpoint fail to achieve wide ac...
Many modern workloads, such as neural networks, databases, and graph processing, are fundamentally m...
In this dissertation, we address high-performance media processing based on a tightly coupled co-pro...
The increasing gap between processor and memory performance has led to new architectural models for...
This paper introduces a new combination of software and hardware PIM (Process-...