This paper considers the correctness of domain-specific compilers for tensor programming languages through the study of Halide, a popular representative. It describes a translation validation algorithm for affine Halide specifications, independently of the scheduling language. The algorithm relies on "prophetic" annotations added by the compiler to the generated array assignments. The annotations provide a refinement mapping from assignments in the generated code to the tensor definitions from the specification. Our implementation leverages an affine solver and a general SMT solver, and scales to complete Halide benchmarks.
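To make the idea of a refinement mapping concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical one-dimensional specification `spec_blur`, a hand-written tiled loop nest standing in for compiler output, and annotations recorded as a plain list. It checks the mapping on one concrete input only, whereas the abstract describes a symbolic check backed by an affine solver and an SMT solver. All names here are invented for illustration and are not part of Halide.

```python
# Hypothetical illustration only: none of these names come from Halide or the paper.

def spec_blur(inp, x):
    """Specification: the tensor definition blur(x) = inp[x] + inp[x + 1]."""
    return inp[x] + inp[x + 1]


def generated_blur(inp, n, tile=4):
    """Stand-in for generated code: a tiled loop nest over 0..n-1.

    Every array assignment is recorded with its annotation, i.e. the
    specification point that the assignment claims to compute.
    """
    out = [None] * n
    trace = []  # entries are (written index, annotated spec point, value)
    for xo in range(0, n, tile):
        for xi in range(min(tile, n - xo)):
            x = xo + xi
            out[x] = inp[x] + inp[x + 1]
            trace.append((x, x, out[x]))
    return out, trace


def validate(inp, n, trace):
    """Check the annotations against the specification on this one input.

    Each assignment must write the point it is annotated with and must
    store the value the specification defines there; together the
    assignments must cover every specification point.
    """
    covered = set()
    for idx, spec_x, value in trace:
        if idx != spec_x or value != spec_blur(inp, spec_x):
            return False
        covered.add(spec_x)
    return covered == set(range(n))
```

Calling `generated_blur(list(range(11)), 10)` and passing the resulting trace to `validate` accepts the tiled schedule; dropping any entry from the trace makes the coverage check fail. A symbolic validator would establish the same two properties for all inputs and sizes rather than for a single run.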
Many modern application domains crucially rely on tensor operations. The optim...
© 2012. Published by ELRA. This is an open access article available under a Creative Commons licenc...
Tensor compilers are used in domains such as image processing and deep learning to generate efficien...
Efficient code generation for image processing applications continues to pose a challenge in a domai...
We present a lightweight Coq framework for optimizing tensor kernels written in a pure, functional a...
I propose a lightweight Coq framework for optimizing tensor kernels written in a pure, functional ar...
Tensor Cores (TCUs) are specialized units first introduced by NVIDIA in the Volta microarchitecture ...
Translation validation consists of transforming a program and a posteriori val...
Tensors are higher-dimensional analogs of matrices, and represent a key data abstraction for many ap...
Improving data locality of tensor data structures is a crucial optimization for maximizing the perfo...
Advisor: Roberto de Alencar Lotufo. Dissertation (master's) - Universidade Estadual de Campinas, Fac...
Software pipelining is a loop optimization that overlaps the execution of seve...
This thesis studies data-parallelism in tensor assignments. Building on an existent domain specific ...