A Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks

Yaoyu Zhang; Tao Luo; Zheng Ma; Zhi-Qin John Xu

doi:10.1088/0256-307X/38/3/038701

Abstract

Why heavily parameterized neural networks (NNs) do not overfit the data is an important, long-standing open question. We propose a phenomenological model of NN training to explain this non-overfitting puzzle. Our linear frequency principle (LFP) model accounts for a key dynamical feature of NNs: they learn low frequencies first, irrespective of microscopic details. Theory based on our LFP model shows that low-frequency dominance of the target function is the key condition for the non-overfitting of NNs, and this is verified by experiments. Furthermore, through an ideal two-layer NN, we unravel how detailed microscopic NN training dynamics statistically give rise to an LFP model with quantitative predictive power.
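
The "low frequencies first" behavior described above can be illustrated with a toy calculation. The sketch below is not the LFP model from the article: it simply assumes that the residual of each Fourier mode k decays exponentially at a rate 1 / (1 + k^p), where the function name `residual_amplitudes` and the exponent `decay_power` are hypothetical choices made only for illustration.

```python
import math


def residual_amplitudes(freqs, t, decay_power=2.0):
    """Relative residual of each Fourier mode at training time t.

    Assumption (illustrative, not taken from the article): mode k decays as
    exp(-t / (1 + k**decay_power)), so higher frequencies converge more slowly.
    """
    return [math.exp(-t / (1.0 + k ** decay_power)) for k in freqs]


freqs = [0, 1, 2, 4, 8]
for t in (1.0, 10.0, 100.0):
    # Each row lists the remaining error per mode, from low to high frequency.
    print(t, [round(r, 3) for r in residual_amplitudes(freqs, t)])
```

At any fixed time, the printed residuals grow with frequency: under this assumed decay law the low-frequency part of the target is fitted well before the high-frequency part.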

About This Article

Cite this article:

Yaoyu Zhang, Tao Luo, Zheng Ma, Zhi-Qin John Xu. A Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks[J]. Chin. Phys. Lett., 2021, 38(3): 038701. DOI: 10.1088/0256-307X/38/3/038701
