Chinese Physics Letters, 2019, Vol. 36, No. 9, Article code 097501

Machine Learning and Micromagnetic Studies of Magnetization Switching

Jing-Yue Miao (缪静月)1,2**

Affiliations: 1 Key Laboratory of Advanced Materials (MOE), School of Materials Science and Engineering, Tsinghua University, Beijing 100084; 2 Argonne National Laboratory, Chicago, USA

Received 24 May 2019, online 23 August 2019

** Corresponding author. Email: miujy15@mails.tsinghua.edu.cn

Citation Text: Miao J Y 2019 Chin. Phys. Lett. 36 097501

Abstract: Magnetization switching is one of the most fundamental topics in the field of magnetism. Machine learning (ML) models based on the random forest (RF), support vector machine (SVM) and deep neural network (DNN) methods are built and trained to classify the magnetization reversal and non-reversal cases of a single-domain particle, and their classification performance is evaluated by comparison with micromagnetic simulations. The results show that the ML models achieve high accuracy: the DNN model reaches the best area under curve (AUC) of 0.997, even with a small training dataset, while the RF and SVM models have lower AUCs of 0.964 and 0.836, respectively. This work validates the potential of ML applications in studies of magnetization switching and provides a benchmark for further ML studies of magnetization switching.
DOI: 10.1088/0256-307X/36/9/097501 PACS: 75.78.Cd, 75.60.Jk © 2019 Chinese Physics Society

Magnetization switching is one of the most fundamental phenomena in magnetism research, and it has been studied extensively due to its applications in hard disk drives (HDD), magnetoresistive random access memory (MRAM) and spintronics.[1] Micromagnetic simulations based on the Landau–Lifshitz–Gilbert (LLG) equation have been widely used to study magnetization dynamics in magnetic systems such as single-domain particles, two-dimensional films and three-dimensional bulk materials.[2] In a micromagnetic simulation, whether or not magnetization switching occurs depends on a variety of magnetic parameters, such as the anisotropy, the damping constant $\alpha$, the exchange coupling constant and the external applied field. In addition, structural parameters such as the grain size and the thicknesses of different layers have significant effects.[3] Therefore, a large number of micromagnetic simulations must be run to find the optimal parameter sets for switching. Meanwhile, the computing time scales as ${\boldsymbol O}(N\log_{2}N)$, where $N$ is the number of cells in the system, which makes the simulation far more time-consuming for complicated systems.[4] Additionally, magnetic device design must account for the effects of electric field and thermal energy, so building a comprehensive model is highly difficult and the simulation can become computationally infeasible. Machine learning (ML) approaches, by contrast, make decisions by learning from data alone, which makes it easier to include many different kinds of parameters. Once an ML model is trained, it can predict the switching outcome for a large number of cases simultaneously and can save computing time through vectorization in its algorithms.
Recently, ML approaches have been successfully applied in engineering and science, for example in hysteresis modelling,[5] classification of magnetization states[6–8] and magnetic device design.[9] In this work, we use ML methods to study magnetization switching and explore the potential of ML for describing the motion of magnetization. We start with magnetization switching in a single-domain particle model, which is an ideal starting point because this model adequately describes the physics of fine magnetic grains and has been thoroughly investigated since the study of the Stoner–Wohlfarth (SW) model.[10] In this Letter, we first use micromagnetic simulation to study the magnetization switching of the single-domain particle model and validate our simulations against the SW model and the energy surface model. We then build and train the ML models using datasets generated by micromagnetic simulation. Finally, we use the trained ML models to predict the magnetization switching results and evaluate the performance of the different ML models by comparison with the simulation results. The insets in Fig. 1 illustrate the single-domain particle switching model. As shown, the easy axis of the uniaxial crystalline anisotropy ${\boldsymbol K}$ is fixed in the $-z$ direction and the initial magnetization ${\boldsymbol m}$ lies below the $x$–$y$ plane. The external field ${\boldsymbol H}$ is applied to reverse the magnetization ${\boldsymbol m}$, and we define magnetization switching as the magnetization ${\boldsymbol m}$ changing polarity within a time interval of 0.1 ns. Notably, the rise time of the applied field is not considered in this work, and for simplicity the easy axis, the external magnetic field and the initial magnetization all lie in the same $x$–$z$ plane.
Figure 1 shows the simulated magnetization reversal phase diagrams with the damping constant $\alpha$ varying from 1.0 to 0.01; the initial magnetization directions $\theta_{\rm m0}$ are 180$^{\circ}$ and 225$^{\circ}$ in Figs. 1(a) and 1(b), respectively. The horizontal axis is the direction of the applied field $\theta_{\rm H}$ and the vertical axis is the normalized field magnitude $h=H/H_{\rm k}$, where $H_{\rm k}=2K_{\rm u}/M_{\rm s}$ is the magnitude of the anisotropy field, $K_{\rm u}$ is the uniaxial crystalline anisotropy constant, and $M_{\rm s}$ is the saturation magnetization. Each line in the phase diagrams marks a critical switching boundary: when the parameter set ($\theta_{\rm H}$, $h$) lies above the respective line (with the other parameters fixed), the particle switches under ${\boldsymbol H}$.
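Phase diagrams like those in Fig. 1 can in principle be reproduced by sweeping a macrospin LLG integrator over ($\theta_{\rm H}$, $h$). The sketch below is an illustrative reimplementation of ours, not the code used in this work: it integrates the LLG equation for a single macrospin in reduced units (fields in units of $H_{\rm k}$, time in units of $1/(\gamma H_{\rm k})$), with the easy axis along $z$, and reports whether the polarity of $m_z$ flips. The function name and the reduced time window `tau_max` are our own choices, standing in for the 0.1 ns physical window used in the paper.

```python
import numpy as np

def llg_switches(h, theta_H_deg, alpha, theta_m0_deg=180.0,
                 tau_max=200.0, dt=0.01):
    """Integrate the macrospin LLG equation in reduced units and
    report whether the magnetization reverses (m_z changes sign).

    Fields are normalized by H_k and time by 1/(gamma*H_k); the
    uniaxial easy axis is taken along z, matching the paper's setup.
    """
    th_m = np.deg2rad(theta_m0_deg)
    th_h = np.deg2rad(theta_H_deg)
    m = np.array([np.sin(th_m), 0.0, np.cos(th_m)])
    h_app = h * np.array([np.sin(th_h), 0.0, np.cos(th_h)])

    def dm_dtau(m):
        # effective field = applied field + uniaxial anisotropy field m_z*z
        h_eff = h_app + m[2] * np.array([0.0, 0.0, 1.0])
        prec = np.cross(m, h_eff)          # precession torque
        damp = np.cross(m, prec)           # Gilbert damping torque
        return -(prec + alpha * damp) / (1.0 + alpha**2)

    m_z0 = m[2]
    for _ in range(int(tau_max / dt)):
        # classical 4th-order Runge-Kutta step
        k1 = dm_dtau(m)
        k2 = dm_dtau(m + 0.5 * dt * k1)
        k3 = dm_dtau(m + 0.5 * dt * k2)
        k4 = dm_dtau(m + dt * k3)
        m = m + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        m /= np.linalg.norm(m)             # keep |m| = 1
        if m[2] * m_z0 < 0:
            return True                    # polarity changed: switching
    return False
```

As a sanity check against the SW model: for $\theta_{\rm m0}=180^{\circ}$ and $\theta_{\rm H}=45^{\circ}$ the SW critical field is $h=0.5$, so `llg_switches(0.8, 45.0, 1.0)` should report a reversal while `llg_switches(0.2, 45.0, 1.0)` should not.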
cpl-36-9-097501-fig1.png
Fig. 1. Phase diagrams of magnetization switching in single-domain particle. Initial magnetization directions $\theta_{\rm m0}$ are 180$^{\circ}$ and 225$^{\circ}$ in (a) and (b), respectively.
As we can see from Figs. 1(a) and 1(b), the critical switching field depends on the initial magnetization direction $\theta_{\rm m0}$, which shows that the magnetization dynamics is nonlinear and thus provides enough complexity for ML studies. In Fig. 1(a), the simulated critical switching field of the large-damping case ($\alpha =1$) is consistent with the SW model derivation, because the dynamic switching path follows the minimum energy variation, i.e., the applied field guarantees that the total energy decreases monotonically, which is the requirement of the SW model. However, for the small-damping case ($\alpha\ll1$), the critical field decreases because the magnetization dynamics virtually explores the entire energy surface and can cross the energy barrier to achieve reversal, which agrees with the results from both experiments and other micromagnetic simulations.[11] Notably, the result for the large-damping case ($\alpha =1$) in Fig. 1(b) differs slightly from the SW model derivation when the field is applied at a small angle. To explain this discrepancy, we use the energy surface model to examine three different cases A, B and C in Fig. 1(b), where the normalized external field magnitude $h$ is 0.75, the initial magnetization direction angle $\theta_{\rm m0}$ is 225$^{\circ}$, and the applied field angles $\theta_{\rm H}$ are 1$^{\circ}$, 4$^{\circ}$ and 7$^{\circ}$, respectively. Figure 2 shows the simulated total energy, normalized by $K_{\rm u}$, versus the magnetization direction angle $\theta_{\rm m}$ in cases A, B and C. Figure 2(b) is a zoomed-in view, where the lines are the energy curves, the marker $+$ denotes the initial energy state $E$, and the marker $\times$ denotes the energy barrier $E_{\max}$.
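This case analysis can be checked numerically from the normalized single-domain energy $E/K_{\rm u}=\sin^{2}\theta_{\rm m}-2h\cos(\theta_{\rm m}-\theta_{\rm H})$ (the Zeeman term carries the factor $2h$ because $\mu_{0}M_{\rm s}H/K_{\rm u}=2H/H_{\rm k}$). The sketch below is our own illustration, not the authors' code: it follows a discrete steepest-descent path on this energy curve from $\theta_{\rm m0}$, mimicking the strongly damped ($\alpha=1$) dynamics, and reports whether the local minimum it reaches has reversed polarity.

```python
import numpy as np

def descends_to_reversal(h, theta_H_deg, theta_m0_deg=225.0, step_deg=0.05):
    """Steepest descent on the normalized single-domain energy curve.

    E/K_u = sin^2(theta_m) - 2h*cos(theta_m - theta_H), easy axis along z.
    Returns True if the local minimum reached from theta_m0 has m_z > 0,
    i.e. the polarity has reversed (initial m_z = cos(225 deg) < 0).
    """
    th = np.deg2rad(np.arange(0.0, 360.0, step_deg))
    th_H = np.deg2rad(theta_H_deg)
    E = np.sin(th) ** 2 - 2.0 * h * np.cos(th - th_H)

    n = len(th)
    i = int(round(theta_m0_deg / step_deg)) % n
    while True:
        left, right = (i - 1) % n, (i + 1) % n   # wrap around 360 degrees
        j = left if E[left] < E[right] else right
        if E[j] >= E[i]:
            break                                # local minimum reached
        i = j
    return np.cos(th[i]) > 0                     # reversed polarity?
```

With $h=0.75$ and $\theta_{\rm m0}=225^{\circ}$, this reproduces the outcomes of cases A, B and C: reversal at $\theta_{\rm H}=1^{\circ}$ and $7^{\circ}$ (via opposite rotation directions), and no reversal at $\theta_{\rm H}=4^{\circ}$, where barriers block both paths.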
cpl-36-9-097501-fig2.png
Fig. 2. (a) Variation of the energy landscape versus the magnetization direction $\theta_{\rm m}$ for different external field angles. Here the values of $\theta_{\rm H}$ are 1$^{\circ}$, 4$^{\circ}$ and 7$^{\circ}$ in cases A, B and C, respectively. (b) A zoomed-in view of (a).
As shown in Fig. 2(b), in case A the energy decreases monotonically along one reversal path ($\theta_{\rm m}$ from 225$^{\circ}$ to 360$^{\circ}$), while there is an energy barrier along the other ($\theta_{\rm m}$ from 225$^{\circ}$ to 0), and case C is in the opposite situation. However, in case B there are energy barriers in both directions, so the magnetization reversal cannot be achieved, which is consistent with our micromagnetic simulations in Fig. 1(b). Therefore, we have validated our micromagnetic simulations against both the energy surface model and the SW model, so the simulation results can be used to evaluate the ML classification performance.

Supervised learning is the ML task of learning a function that maps an input to an output based on example input–output pairs; it infers the function from a labeled training dataset consisting of a set of training examples.[12] In this work we use the supervised random forest (RF),[13] support vector machine (SVM)[14] and deep neural network (DNN)[15] methods to classify the magnetization reversal (positive) and non-reversal (negative) examples in the single-domain particle switching model. These methods are adopted because of their successful applications in nonlinear binary classification problems. We use the open-source frameworks Scikit-learn[16] and TensorFlow[17] to implement the ML models. The first step is to prepare the training, validation and testing datasets. We choose four variables in the single-domain particle switching model as the input features of the ML models and randomly sample their values in their respective ranges to generate 300000 examples: $\theta_{\rm m0}$ ranges from 90$^{\circ}$ to 180$^{\circ}$, $\theta_{\rm H}$ from $-90^{\circ}$ to 90$^{\circ}$, $\alpha$ from 1 to 0.001, and $h$ from 0.1 to 1.
Then, we use the LLG simulation to determine the label of each example: an example is labeled positive (1) if switching occurs and negative (0) otherwise. The resulting dataset of 300000 examples is randomly split into an 80% training dataset and a 20% validation dataset. Similarly, a testing dataset of 10000 examples is generated in the same way, except that the variable values lie on a grid rather than being randomly distributed within their ranges. Notably, there is only a slight chance that the training dataset overlaps the testing dataset, so the testing dataset provides 'unseen' data for evaluating the generalization ability of the ML models. Additionally, preprocessing steps such as centering and normalization are applied to the datasets to accelerate training. We then build the RF, SVM and DNN models and train them by feeding in the training dataset. The basic idea of training is to minimize the error between the predictions and the labels; the details of the algorithms and models are described elsewhere.[13–17] The hyperparameters of the models are optimized by evaluating the classification performance on the validation dataset. Finally, we use the trained ML models to classify the examples in the testing dataset and evaluate the performance by comparing the predictions with the labels. In this work, we choose the area under curve (AUC) as the evaluation metric, which is commonly used in data science; a larger AUC indicates better performance.[18] If all of the ML predictions are consistent with the micromagnetic simulation results, the AUC will be 1. Figure 3 shows the classification performance of the different ML models versus the number of training examples, i.e., the size of the training dataset. As can be seen from the figure, the DNN classifier has the best performance (AUC=0.997), and its AUC reaches 0.95 even with a training dataset of only 500 examples.
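The dataset preparation described above can be sketched as follows. The sampling ranges and the 80/20 split follow the text; the labeling rule shown is only a stand-in (in the paper each label comes from an LLG simulation), and all variable names are ours.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300_000  # dataset size used in the paper

# Randomly sample the four input features in their stated ranges.
X = np.column_stack([
    rng.uniform(90.0, 180.0, n),    # theta_m0 (deg)
    rng.uniform(-90.0, 90.0, n),    # theta_H  (deg)
    rng.uniform(0.001, 1.0, n),     # damping constant alpha
    rng.uniform(0.1, 1.0, n),       # h = H / H_k
])

# Placeholder labels for illustration only: in the paper each example
# is labeled 1 (reversal) or 0 (non-reversal) by an LLG simulation.
y = (X[:, 3] > 0.5).astype(int)

# 80% training / 20% validation split, as in the text.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Center and scale the features (fit on the training data only).
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)
```

Fitting the scaler on the training split alone avoids leaking validation statistics into preprocessing, which matters when the validation AUC is used to pick hyperparameters.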
The RF model with 100 randomized trees has the second best AUC=0.964, and its performance improves greatly as the number of training examples increases. However, the SVM model has the worst performance (AUC=0.836), even though we used GridSearchCV to find the optimal hyperparameter set, i.e., C=1000, gamma=0.01, kernel='rbf'.[14] This is possibly because the SVM model with kernels does not work well in cases with a small number of features and a large number of training examples. Notably, the results listed here are the optimal classification performance obtained from 10–100 cycles (epochs) of training through the training dataset, rather than the ideal best training performance, which would require much more investigation to achieve.
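A hyperparameter search of the kind used for the SVM model might look like the following sketch with Scikit-learn's GridSearchCV. The toy data and labels are placeholders of ours; the grid merely includes the optimum quoted in the text (C=1000, gamma=0.01, RBF kernel), and the best parameters found on toy data need not match it.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import roc_auc_score

# Small synthetic stand-in for the switching dataset (4 features).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(400, 4))
y = (X[:, 3] + 0.2 * X[:, 1] > 0).astype(int)  # illustrative labels

# Cross-validated grid search over the SVM hyperparameters,
# scored by AUC to match the paper's evaluation metric.
param_grid = {"C": [1, 100, 1000], "gamma": [0.01, 0.1], "kernel": ["rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=3, scoring="roc_auc")
search.fit(X, y)

best = search.best_estimator_
auc = roc_auc_score(y, best.decision_function(X))
print(search.best_params_, round(auc, 3))
```

Scoring with `roc_auc` uses the SVM's `decision_function` margin directly, so `probability=True` (and its extra cross-validated calibration cost) is unnecessary.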
cpl-36-9-097501-fig3.png
Fig. 3. Comparison of the different machine learning models and the dependence of their classification performance on the number of training examples.
cpl-36-9-097501-fig4.png
Fig. 4. Illustration of the classification performance of the DNN, RF and SVM models and the corresponding AUC values are labeled in the figure. In (a), (b) and (c), $\alpha$ is 0.05 and $\theta_{\rm m0}$ is 180$^{\circ}$, and in (d), $\alpha$ is 1 and $\theta_{\rm m0}$ is 225$^{\circ}$.
The results validate the feasibility of ML applications in magnetization switching studies, since the ML models have learned the magnetization switching mechanism from data alone and achieved excellent accuracy. In addition, the DNN model proves to be the most suitable for studies of magnetization switching due to its better performance. Furthermore, we illustrate the classification performance of the SVM, RF and DNN models by showing the magnetization reversal phase diagrams of some test samples. In Fig. 4, the yellow (light) dots are the predicted positive examples and the blue (dark) dots are the predicted negative examples. The lines are the critical switching fields calculated from micromagnetic simulations. As shown in the figure, the classification performance of the DNN and RF models is excellent, and the misclassified cases all fall on the critical boundaries, where the critical fields are most complex. Figure 4(d) shows that the DNN can also resolve the subtlety discussed in Fig. 2. However, the SVM classifier appears to produce a nearly linear decision boundary, which is much worse than those of the DNN and RF models. Moreover, we investigate the effects of the architecture of the DNN model on the classification performance. A DNN model has multiple hidden layers between the input and output layers, and the number of hidden layers and the number of nodes (neurons) in each layer influence the training time and performance significantly. As the number of layers and nodes increases, training the models consumes more time.
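An architecture sweep of the kind evaluated in Fig. 5 can be sketched as below. The paper used TensorFlow; for brevity this illustration of ours uses Scikit-learn's MLPClassifier with 16 nodes per hidden layer and varying depth, on toy data, so the numbers it prints are not the paper's results.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

# Nonlinear toy stand-in for the 4-feature switching dataset.
rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(2000, 4))
y = (np.sin(3 * X[:, 0]) + X[:, 3] > 0).astype(int)

# Vary the number of hidden layers with 16 nodes each, mirroring
# the 16-nodes-per-layer, 4-hidden-layer network selected in the text.
for depth in (1, 2, 4):
    clf = MLPClassifier(hidden_layer_sizes=(16,) * depth,
                        max_iter=500, random_state=0)
    clf.fit(X, y)
    auc = roc_auc_score(y, clf.predict_proba(X)[:, 1])
    print(depth, "hidden layers -> training AUC", round(auc, 3))
```

In a real sweep each architecture would be scored on the validation split rather than on its own training data, as in the hyperparameter selection described earlier.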
cpl-36-9-097501-fig5.png
Fig. 5. Evaluation of the classification performance of the DNN models with different numbers of layers and nodes.
Figure 5 shows the best performance of the DNN models with different architectures. As can be seen, the classification performance improves with an increasing number of nodes in each layer, and the improvement almost saturates once each hidden layer has more than 16 nodes. In addition, the model with four hidden layers has the best performance, possibly because models with more layers are harder to train to their optimal performance. Therefore, for single-domain particle switching, we choose the DNN model with an architecture of 4 hidden layers with 16 nodes in each layer, and we believe this investigation could provide some insight for building DNN models for complicated magnetic systems. In conclusion, an initial machine learning study of magnetization switching has been performed, and the results show that the ML models achieve high classification accuracy, with the DNN model reaching AUC=0.997 even with a small training dataset of 10000 examples. Notably, for the single-domain particle case itself, ML methods do not yet show significant advantages over micromagnetic simulations. This work shows the potential of ML applications in studies of magnetization switching and provides a benchmark test for further studies.
References
[1] Wernsdorfer W et al 1997 Phys. Rev. Lett. 78 1791
[2] Fukushima H, Nakatani Y and Hayashi N 1998 IEEE Trans. Magn. 34 193
[3] Wei D 2012 Micromagnetics and Recording Materials (Heidelberg: Springer)
[4] Yuan S W and Bertram H N 1992 IEEE Trans. Magn. 28 2031
[5] Serpico C and Visone C 1998 IEEE Trans. Magn. 34 623
[6] Alam M, Ali A, Sultan M S et al 2018 Prog. Electromagn. Res. Symp. (Toyama, Japan, 1–4 August 2018) p 291
[7] Ch'ng K, Vazquez N and Khatami E 2018 Phys. Rev. E 97 013306
[8] Carrasquilla J and Melko R G 2017 Nat. Phys. 13 431
[9] Roy U, Pramanik T, Roy S et al 2017 75th Annual Device Research Conference (South Bend, USA, 25–28 June 2017)
[10] Tannous C and Gieraltowski J 2008 Eur. J. Phys. 29 475
[11] Gao K Z, Boerner E D and Bertram H N 2003 J. Appl. Phys. 93 6549
[12] Russell S J and Norvig P 2010 Artificial Intelligence: A Modern Approach 3rd edn (Malaysia: Pearson)
[13] Liaw A and Wiener M 2002 R News 2 18
[14] Suykens J A K and Vandewalle J 1999 Neural Process. Lett. 9 293
[15] Haykin S 1994 Neural Networks (New York: Prentice Hall)
[16] Pedregosa F, Varoquaux G, Gramfort A et al 2011 J. Mach. Learn. Res. 12 2825
[17] Abadi M, Barham P, Chen J et al 2016 12th USENIX Symposium on Operating Systems Design and Implementation (Savannah, USA, 2–4 November 2016) p 265
[18] Huang J and Ling C X 2005 IEEE Trans. Knowl. Data Eng. 17 299