While high-end heterogeneous systems are increasingly supporting heterogeneous uniform memory access (hUMA) as envisioned by the Heterogeneous System Architecture (HSA) foundation, their low-power counterparts targeting the embedded domain still lack basic features like virtual memory support for accelerators. As opposed to simply passing virtual address pointers, explicit data management involving copies is needed to share data between host processor and accelerators which hampers programmability and performance. In this work, we present a mixed hardware/software solution to enable lightweight virtual memory support for many-core accelerators in heterogeneous embedded systems-on-chip (SoCs). Based on an input/output translation lookaside b...