The Transformer architecture is widely used for machine translation tasks. However, its resource-intensive nature makes it challenging to implement on constrained embedded devices, particularly where available hardware resources can vary at run-time. We propose a dynamic machine translation model that scales the Transformer architecture based on the available resources at any particular time. The proposed approach, 'Dynamic-HAT', uses a HAT SuperTransformer as the backbone to search for SubTransformers with different accuracy-latency trade-offs at design time. The optimal SubTransformers are sampled from the SuperTransformer at run-time, depending on latency constraints. The Dynamic-HAT is tested on the Jetson Nano and the approach uses inh...
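To make the run-time selection step described above concrete, the following is a minimal sketch, not the actual Dynamic-HAT or HAT implementation: it assumes a design-time Pareto set of SubTransformer configurations profiled on the target device, and picks the best-BLEU configuration that fits the current latency budget. The `SubTransformerConfig` fields, the `select_subtransformer` helper, and all numbers are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SubTransformerConfig:
    """One candidate sub-network sampled from the SuperTransformer at design time
    (hypothetical fields for illustration only)."""
    encoder_layers: int
    decoder_layers: int
    embed_dim: int
    measured_latency_ms: float  # latency profiled on the target device (e.g. Jetson Nano)
    bleu: float                 # validation BLEU of this configuration

def select_subtransformer(pareto_set: List[SubTransformerConfig],
                          latency_budget_ms: float) -> Optional[SubTransformerConfig]:
    """Return the highest-BLEU configuration whose profiled latency fits the
    run-time budget, or None if no configuration fits."""
    feasible = [c for c in pareto_set if c.measured_latency_ms <= latency_budget_ms]
    return max(feasible, key=lambda c: c.bleu) if feasible else None

# Toy design-time Pareto front and a run-time budget of 180 ms (made-up values).
pareto_front = [
    SubTransformerConfig(6, 6, 640, measured_latency_ms=260.0, bleu=27.8),
    SubTransformerConfig(6, 4, 512, measured_latency_ms=175.0, bleu=27.1),
    SubTransformerConfig(4, 3, 512, measured_latency_ms=120.0, bleu=26.2),
]
chosen = select_subtransformer(pareto_front, latency_budget_ms=180.0)
print(chosen)  # -> the 6-encoder / 4-decoder configuration in this toy example
```

In this sketch the selection itself is a simple table lookup; the expensive work (training the SuperTransformer and profiling SubTransformers) happens once at design time, which is what allows the model to be rescaled quickly when the available resources change.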
Machine translation has received significant attention in the field of natural language processing n...
Retrosynthesis is the task of building a molecule from smaller precursor molecules. As shown in prev...
Dynamic binary translation is the process of translating and optimizing executable code for one mach...
This dataset supports the publication: 'Dynamic Transformer for Efficient Machine Translation on...
With the recent developments in the field of Natural Language Processing, there has been a rise in t...
This article describes our experiments in neural machine translation using the recent Tensor2Tensor ...
Pre-trained language models received extensive attention in recent years. However, it is still chall...
Given a large Transformer model, how can we obtain a small and computationally efficient model which...
The Transformer architecture is ubiquitously used as the building block of large-scale autoregressiv...
Transformer networks have emerged as the state-of-the-art approach for natural language processing t...
Machine translation is the discipline concerned with developing automated tools for translating fro...
Humans benefit from communication but suffer from language barriers. Machine translation (MT) aims t...
The topic of transformers is rapidly emerging as one of the most important key primitives in neural ...