Auto-Tuning DL compilers are gaining ground as an optimizing back-end for DL frameworks. While existing work can generate deep learning models that exceed the performance of hand-Tuned libraries, they still suffer from prohibitively long auto-Tuning time due to repeated hardware measurements in large search spaces. In this paper, we take a neural-predictor inspired approach to reduce the auto-Tuning overhead and show that a performance predictor model trained prior to compilation can produce optimized tensor operation codes without repeated search and hardware measurements. To generate a sample-efficient training dataset, we extend input representation to include task-specific information and to guide data sampling methods to focus on learn...
Deep Neural Networks (DNNs) are extremely computationally demanding, which presents a large barrier ...
Iterative compiler optimization has been shown to outperform static approaches. This, however, is at...
The rapidly evolving landscape of multicore architectures makes the construction of efficient librar...
One Shot Tuner a neural-predictor inspired approach to auto-tuning Docker guide(recommended) git...
International audienceA wide range of scientific and machine learning applications depend on highly ...
Thesis (Ph.D.)--University of Washington, 2022As the scaling and performance demands for deep learni...
Tuning and optimising the operations executed in deep learning frameworks is a fundamental task in a...
The end of Moore's law is driving the search for new techniques to improve system performance as app...
Deep Neural Networks (DNNs) have revolutionized many aspects of our lives. The use of DNNs is becomi...
In recent years, machine learning (ML) and, more noticeably, deep learning (DL), have be- come incre...
The success of deep learning has shown impressive empirical breakthroughs, but many theoretical ques...
Deep Neural Networks (DNNs) are constantly evolving, enabling the power of deep learning to be appli...
Deep neural networks (DNNs) are becoming a key enabling technique for many application domains. Howe...
Deep learning (DL) has been widely adopted those last years but they are computing-intensive method....
Machine learning has been widely used in various application domains such as recommendation, compute...
Deep Neural Networks (DNNs) are extremely computationally demanding, which presents a large barrier ...
Iterative compiler optimization has been shown to outperform static approaches. This, however, is at...
The rapidly evolving landscape of multicore architectures makes the construction of efficient librar...
One Shot Tuner a neural-predictor inspired approach to auto-tuning Docker guide(recommended) git...
International audienceA wide range of scientific and machine learning applications depend on highly ...
Thesis (Ph.D.)--University of Washington, 2022As the scaling and performance demands for deep learni...
Tuning and optimising the operations executed in deep learning frameworks is a fundamental task in a...
The end of Moore's law is driving the search for new techniques to improve system performance as app...
Deep Neural Networks (DNNs) have revolutionized many aspects of our lives. The use of DNNs is becomi...
In recent years, machine learning (ML) and, more noticeably, deep learning (DL), have be- come incre...
The success of deep learning has shown impressive empirical breakthroughs, but many theoretical ques...
Deep Neural Networks (DNNs) are constantly evolving, enabling the power of deep learning to be appli...
Deep neural networks (DNNs) are becoming a key enabling technique for many application domains. Howe...
Deep learning (DL) has been widely adopted those last years but they are computing-intensive method....
Machine learning has been widely used in various application domains such as recommendation, compute...
Deep Neural Networks (DNNs) are extremely computationally demanding, which presents a large barrier ...
Iterative compiler optimization has been shown to outperform static approaches. This, however, is at...
The rapidly evolving landscape of multicore architectures makes the construction of efficient librar...