Efficient code generation for image processing applications continues to pose a challenge in a domain where high performance is often necessary to meet real-time constraints. The inherently complex structure found in most image-processing pipelines, the plethora of transformations that can be applied to optimize the performance of an implementation, as well as the interaction of these optimizations with locality, redundant computation and parallelism, can be identified as the key reasons behind this issue. Recent domain-specific languages (DSLs) such as the Halide DSL and compiler attempt to encourage high-level design-space exploration to facilitate the optimization process. We propose a novel optimization strategy that aims to ma...
Tensor Cores (TCUs) are specialized units first introduced by NVIDIA in the Volta microarchitecture ...
Advisor: Roberto de Alencar Lotufo. Dissertation (master's) - Universidade Estadual de Campinas, Fac...
Modern embedded systems for image processing involve increasingly complex levels of functionality un...
The Halide DSL and compiler have enabled high-performance code generation for image processing pipel...
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Com...
We present a new algorithm to automatically generate high-performance GPU implementations of complex...
Many image processing tasks are naturally expressed as a pipeline of small computational kernels kno...
We present a new algorithm to automatically schedule Halide programs for high-performance image proc...
Image processing applications typically involve data-oriented kernels with limited control divergenc...
This paper presents the design and implementation of PolyMage, a domain-specific language and compil...
Even though computer graphics applications are widely used, they remain challenging to implement and...
Specialized Digital Signal Processors (DSPs) play an important role in power-efficient, high-perform...
Effective models for fusion of loop nests continue to remain a challenge in both general-purpose and...
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Comput...