In this thesis, we evaluate the interference between multiple GPU (graphics processing unit) kernels running in parallel, using artificial random and sequential walks, the 2D Convolution benchmark provided by PolyBench, and the KCF (Kernelized Correlation Filter) tracker implemented by Vít Karafiát and Michal Sojka. To reduce the interference between the running kernels and the resulting execution jitter, we used time-triggered execution on the GPU. To enable the synchronization, we assessed two synchronization mechanisms available on the Tegra X2 platform: one based on zero-copy memory and one based on the globaltimer. We found that the NVIDIA profiler (nvprof) reconfigures the resolution of globaltimer from 1 s to...
Recent advances in Artificial Intelligence (AI) and Graphics Processing Units (GPUs) have made it po...
In modern autonomous systems such as self-driving cars, sustained safe operation requires running co...
With the emergence of highly multithreaded architectures, an effective performance monitoring system...
Multi-Processor Systems-on-Chip (MPSoC) platforms will definitely power various future autonomous mac...
We describe heterogeneous multi-CPU and multi-GPU implementations of Jacobi's iterative method for t...
Execution of GPGPU workloads consists of different stages including data I/O on the CPU, memory copy...
There is growing interest in accelerating irregular data-parallel algorithms on GPUs. These algorith...
As modern GPU workloads become larger and more complex, there is an ever-increasing demand for GPU c...
There is an increasing industrial and academic interest towards a more predictable characterization ...
Recent NVIDIA Graphics Processing Units (GPUs) can execute multiple kernels concurrently. On these ...
We study static timing analysis of programs running on GPU accelerators. Such programs follow a data...
The current trend in recently released Graphics Processing Units (GPUs) is to exploit transistor scal...
There has been tremendous growth in the use of Graphics Processing Units (GPUs) for the acceleratio...
Heterogeneous platforms play an increasingly important role in modern computer systems. They combin...
Heterogeneous processors with accelerators provide an opportunity to improve performance within a...