Exploring Explicit Coarse-Grained Structure in Artificial Neural Networks
Xi-Ci Yang1, Z. Y. Xie2*, and Xiao-Tao Yang1*
1 College of Power and Energy Engineering, Harbin Engineering University, Harbin 150001, China
2 Department of Physics, Renmin University of China, Beijing 100872, China
Abstract: We propose explicitly employing a hierarchical coarse-grained structure in artificial neural networks to improve interpretability without degrading performance. The idea is applied in two settings. The first is a neural network called TaylorNet, which approximates the general mapping from input data to output directly in terms of a Taylor series, without resorting to any magic nonlinear activations. The second is a new setup for data distillation, which performs multi-level abstraction of the input dataset and generates new data that retain the relevant features of the original dataset and can be used as references for classification. In both cases, the coarse-grained structure plays an important role in simplifying the network and improving both interpretability and efficiency. The validity of the approach is demonstrated on the MNIST and CIFAR-10 datasets. Further improvements and related open questions are also discussed.
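To make the phrase "approximate the mapping in terms of Taylor series, without nonlinear activations" concrete, the following is a minimal illustrative sketch of a map truncated at second order, y_k = b_k + sum_i W1[k,i] x_i + sum_{i,j} W2[k,i,j] x_i x_j. It is not the TaylorNet architecture itself: the coarse-grained hierarchical structure is omitted, and the function name `taylor_map`, the coefficient arrays `b`, `W1`, `W2`, and the sizes used are all assumptions introduced here for illustration.

```python
import numpy as np

def taylor_map(x, b, W1, W2):
    """Polynomial map truncated at second order (hypothetical helper).

    Computes y_k = b_k + sum_i W1[k, i] x_i + sum_{i,j} W2[k, i, j] x_i x_j.
    All nonlinearity comes from products of inputs; no activation is applied.
    """
    linear = W1 @ x                                  # first-order term
    quadratic = np.einsum("kij,i,j->k", W2, x, x)    # second-order term
    return b + linear + quadratic

# Illustrative sizes and random coefficients (assumptions, not from the paper).
rng = np.random.default_rng(0)
n_in, n_out = 4, 3
x = rng.normal(size=n_in)
b = rng.normal(size=n_out)
W1 = rng.normal(size=(n_out, n_in))
W2 = rng.normal(size=(n_out, n_in, n_in))

y = taylor_map(x, b, W1, W2)
```

In this dense form the second-order coefficient tensor already has n_out * n_in^2 entries, and higher orders grow correspondingly faster, which is one way to see why a structure that simplifies the network is attractive for this kind of expansion.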