Substantial experiments have validated the success of the Batch Normalization (BN) layer in improving convergence and generalization. However, BN requires extra memory and floating-point computation. Moreover, BN is inaccurate on micro-batches, as it depends on batch statistics. In this paper, we address these problems by simplifying BN regularization while keeping two fundamental impacts of BN layers, i.e., data decorrelation and adaptive learning rate. We propose a novel normalization method, named MimicNorm, to improve convergence and efficiency in network training. MimicNorm consists of only two light operations: a modified weight mean operation (subtracting mean values from the weight parameter tensor) and one BN layer before the loss ...
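The two operations named in the abstract can be sketched numerically. The snippet below is a minimal illustration, not the authors' implementation: `center_weights` shows the weight mean operation (subtracting the mean from a weight tensor; the per-filter axis convention is an assumption), and `batch_norm` is the standard BN transform that the abstract says is kept only once, before the loss.

```python
import numpy as np

def center_weights(w: np.ndarray) -> np.ndarray:
    """Weight mean operation: subtract the mean from the weight tensor.
    Hypothetical helper; centering per output filter is an assumption."""
    return w - w.mean(axis=tuple(range(1, w.ndim)), keepdims=True)

def batch_norm(x: np.ndarray, gamma=1.0, beta=0.0, eps=1e-5) -> np.ndarray:
    """Standard BN transform over the batch axis (axis 0)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```

After centering, each filter's weights sum to zero, which is what removes the need for per-layer normalization in this scheme; a single `batch_norm` call then standardizes the final pre-loss activations.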
Various normalization layers have been proposed to help the training of neural networks. Group Norma...
Batch normalization (BN) is a popular and ubiquitous method in deep learning that has been shown to ...
Deep Residual Networks (ResNets) have recently achieved state-of-the-art results on many challenging...
This study introduces a new normalization layer termed Batch Layer Normalization (BLN) to reduce the...
Batch normalization (BatchNorm) is an effective yet poorly understood technique for neural network o...
Batch normalization (BN) is comprised of a normalization component followed by an affine transformat...
Utilizing recently introduced concepts from statistics and quantitative risk management, we present ...
Batch Normalization (BN) has been a standard component in designing deep neural networks (DNNs). Alt...
Normalization as a layer within neural networks has over the years demonstrated its effectiveness in...
Batch Normalization (BatchNorm) is a technique that enables the training of deep neural networks, es...
© 2018 Curran Associates Inc. All rights reserved. Batch Normalization (BatchNorm) is a widely adopte...
Training Deep Neural Networks is complicated by the fact that the distribution of each layer’s input...
Batch normalization is a recently popularized method for accelerating the training of deep feed-forw...
Batch Normalization (BN) (Ioffe and Szegedy 2015) normalizes the features of an input image via stat...
In recent years, a variety of normalization methods have been proposed to help training neural netwo...