A Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks
Yaoyu Zhang1,2, Tao Luo1, Zheng Ma1, and Zhi-Qin John Xu1*
1School of Mathematical Sciences, Institute of Natural Sciences, MOE-LSC, and Qing Yuan Research Institute, Shanghai Jiao Tong University, Shanghai 200240, China
2Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai 200031, China
Abstract: Why heavily parameterized neural networks (NNs) do not overfit the data is an important, long-standing open question. We propose a phenomenological model of NN training to explain this non-overfitting puzzle. Our linear frequency principle (LFP) model accounts for a key dynamical feature of NNs: they learn low frequencies first, irrespective of microscopic details. Theory based on our LFP model shows that low-frequency dominance of target functions is the key condition for the non-overfitting of NNs, a prediction verified by experiments. Furthermore, through an ideal two-layer NN, we unravel how detailed microscopic NN training dynamics statistically give rise to an LFP model with quantitative predictive power.
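The claim that low frequencies are learned first can be illustrated with a toy linear relaxation in the frequency domain. The sketch below is not the LFP model itself: the decay rate `gamma(k)` and its form 1/(1 + k^2) are hypothetical choices, made only so that low-frequency components of the residual relax faster than high-frequency ones, and all function names are illustrative.

```python
import math


def gamma(k: float) -> float:
    """Hypothetical decay rate (an assumption, not taken from the paper).

    It decreases with frequency k, so low frequencies relax faster.
    """
    return 1.0 / (1.0 + k * k)


def residual(k: float, t: float, r0: float = 1.0) -> float:
    """Closed-form solution of dr/dt = -gamma(k) * r with r(0) = r0.

    r stands for the amplitude of the fitting error at frequency k.
    """
    return r0 * math.exp(-gamma(k) * t)


def time_to_fit(k: float, tol: float = 0.01) -> float:
    """Time at which the residual at frequency k drops to tol times its initial value."""
    return math.log(1.0 / tol) / gamma(k)


if __name__ == "__main__":
    # Compare how far each frequency has converged at the same training time.
    for k in (1.0, 5.0, 20.0):
        print(f"k={k:5.1f}  residual(t=10)={residual(k, 10.0):.4f}  "
              f"time_to_fit={time_to_fit(k):.1f}")
```

Under this assumed decay rate, the residual at a fixed time grows with frequency and the time needed to fit a component grows roughly like k^2, which is the qualitative low-frequency-first ordering the abstract describes.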
Yaoyu Zhang, Tao Luo, Zheng Ma, and Zhi-Qin John Xu. A Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks. Chin. Phys. Lett., 2021, 38(3): 38701-.