Recent studies on commercial hardware demonstrated that irregular GPU applications can bottleneck on virtual-to-physical address translations. In this work, we explore ways to reduce address translation overheads for such applications. We discover that the order of servicing a GPU's address translation requests (specifically, page table walks) plays a key role in determining the amount of translation overhead experienced by an application. We find that different SIMD instructions executed by an application require vastly different amounts of work to service their address translation needs, primarily depending upon the number of distinct pages they access. We show that better forward progress is achieved by prioritizing translation requests ...
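The dependence of translation work on the number of distinct pages a SIMD instruction touches can be illustrated with a small sketch. The abstract is cut off before it states which requests are prioritized, so the fewest-distinct-pages-first ordering below is only an illustrative assumption, not the authors' scheduler. The 4 KiB page size and the names `distinct_pages` and `order_walks` are likewise hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def distinct_pages(lane_addresses, page_size=PAGE_SIZE):
    """Return the set of virtual page numbers touched by one SIMD instruction.

    Each lane supplies one virtual address; lanes that fall in the same page
    share a single translation, so the work is the number of distinct pages.
    """
    return {addr // page_size for addr in lane_addresses}


def order_walks(pending, page_size=PAGE_SIZE):
    """Order page table walks for a set of pending SIMD instructions.

    pending maps an integer instruction id to its list of lane addresses.
    Assumed policy (illustrative only): service instructions with the fewest
    distinct pages first and keep each instruction's walks together, so that
    cheap instructions are not stuck behind expensive ones.
    Returns a list of (instruction_id, virtual_page_number) in service order.
    """
    work = {i: sorted(distinct_pages(a, page_size)) for i, a in pending.items()}
    order = sorted(work, key=lambda i: (len(work[i]), i))
    return [(i, vpn) for i in order for vpn in work[i]]


if __name__ == "__main__":
    pending = {
        0: [0, 4096, 8192, 12288],  # divergent: four distinct pages
        1: [100, 200, 300, 400],    # coalesced: one page
    }
    for instr, vpn in order_walks(pending):
        print(f"instruction {instr}: walk for page {vpn}")
```

In this toy input, instruction 1 needs a single walk while instruction 0 needs four, which is the kind of imbalance the abstract describes. A first-come-first-served queue would interleave or delay the cheap instruction behind the expensive one.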
Although GPGPUs are traditionally used to accelerate workloads with regular control and me...
As virtualization becomes a key technique for supporting cloud computing, much effort has been made ...
With the explosive growth in dataset sizes, application memory footprints are commonly reaching hund...
Recent studies on commercial hardware demonstrated that irregular GPU workloads could bottleneck on ...
The ever-increasing application footprint raises challenges f...
The overhead of memory virtualization remains nontrivial. The traditional shadow paging (TSP) resort...
The proliferation of heterogeneous compute platforms, of which CPU/GPU is a prevalent example, neces...
Despite the increasing investment in integrated GPUs and next-generation inter...
Operating systems employ a virtual memory mechanism to provide a large address space for programs. The ef...
With explosive growth in dataset sizes and increasing machine memory capacities, per-application mem...
Address translation is an essential part of current systems. Getting the virtual-to-physical mapping...
Single-Instruction Multiple-Thread (SIMT) micro-architectures implemented in G...
The state-of-the-art GPU virtualization framework, gVirtuS, relies on an API remoting mechanism to s...