Neural networks represent a complex computation that can be extremely resource intensive. This can limit their usability in contexts where very small amounts of hardware are deployed on low power budgets. One key way to significantly reduce the computational cost of neural networks is quantization, in which the values throughout the network are represented in fewer bits. A ternarized network is one in which every weight has been quantized to three values: +1, -1, and 0. Prior work has shown that, despite their simple weight systems, ternarized neural networks can achieve accuracy much closer to that of full floating-point networks than might be expected. In order to further extract computational efficiency fr...
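The ternarization described above can be illustrated with a minimal sketch. This is not the specific method used in any of the works listed here; it is a common threshold-based heuristic in which weights with small magnitude are zeroed and the rest are snapped to ±1, with a single per-tensor scaling factor recovering some of the lost dynamic range. The `threshold_ratio` value of 0.7 is an assumed illustrative constant, not a prescribed one.

```python
import numpy as np

def ternarize(weights, threshold_ratio=0.7):
    """Quantize a float weight tensor to {-1, 0, +1} plus a scale alpha."""
    # Threshold: a fraction of the mean absolute weight (assumed heuristic).
    delta = threshold_ratio * np.mean(np.abs(weights))

    # Snap weights to the three allowed values.
    ternary = np.zeros_like(weights)
    ternary[weights > delta] = 1.0
    ternary[weights < -delta] = -1.0

    # Scale alpha: mean magnitude of the weights that survived thresholding,
    # so alpha * ternary approximates the original tensor.
    mask = ternary != 0
    alpha = float(np.mean(np.abs(weights[mask]))) if mask.any() else 0.0
    return ternary, alpha

w = np.array([0.9, -0.05, -0.8, 0.02, 0.4])
t, a = ternarize(w)
# t -> [ 1.  0. -1.  0.  1.],  a -> 0.7
```

At inference time the multiplications against `t` reduce to additions, subtractions, and skips, which is where the computational savings come from; the single float multiply by `alpha` is applied once per output.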
Quantized neural networks (QNNs) are being actively researched as a solution for the computational c...
The implementation of larger digital neural networks has not been possible due to the real-estate ...
Hardware accelerators for neural network inference can exploit common data properties for performanc...
(130 pages) Over the past decade, machine learning (ML) with deep neural networks (DNNs) has become ext...
The timing and power of an embedded neural network application is usually dominated by the access ti...
Research has shown that deep neural networks contain significant redundancy, and thus that high clas...
Neural Network (NN) algorithms have existed for a long time now. However, they started to reemerge onl...
Efficient implementation of deep neural networks (DNNs) on CPU-based systems is critical owing to th...
Recently, there has been a push to perform deep learning (DL) computations on the edge rather than t...
DNNs have been finding a growing number of applications including image classification, speech recog...
Deep Neural Networks (DNNs) have achieved unprecedented success in various applications like autonom...
The increase in sophistication of neural network models in recent years has exponentially expanded m...
Machine learning has achieved great success in recent years, especially the deep learning algorithms...
Deep Learning is moving to edge devices, ushering in a new age of distributed Artificial Intelligenc...