# Artificial Neural Network Model for Atomistic Simulations of \({\rm {Sb/MoS}_{2}}\) van der Waals Heterostructures


## Abstract

Van der Waals (vdW) heterostructures have drawn significant attention because of their potential applications in future electronic devices as well as quantum computing. Modeling their structural properties has been challenging because of the combined effects of chemical complexity and the spatial limitations of first-principles calculations. In this work, we trained an artificial neural network (ANN) model for atomistic simulations of \({\rm {Sb/MoS}}_{2}\) vdW heterostructures. The ANN model was trained on thousands of atomistic configurations along with their energies computed from density functional theory (DFT) calculations. We demonstrated that the ANN model can predict system energies with high fidelity to DFT calculations while consuming far fewer computational resources, showing that the ANN model is a powerful tool for atomistic simulations of chemically complex systems.

## Keywords

van der Waals heterostructure, Atomistic simulation, Neural network, Machine learning

## Introduction

Since the first successful isolation of graphene in 2004, two-dimensional (2D) materials have drawn significant attention from both academia and industry [1, 2, 3, 4, 5]. Recently, van der Waals (vdW) heterostructures, a new class of materials derived from 2D materials, have emerged because the combinatorial space spanned by all 2D materials known to date offers considerable freedom, allowing materials scientists to tune these novel composite materials by changing the stacking and the twist angles to obtain desired material properties [6, 7, 8]. These vdW heterostructures have shown great potential in electronics, optoelectronics, green energy, and even superconductivity applications [8, 9, 10, 11, 12], making them promising materials for the future.

The unusual physical and electronic properties of vdW heterostructures originate from the interaction, or, in other words, the lattice mismatch, between adjacent nanosheets. Hence, characterization of the structure of vdW heterostructures is critical for a comprehensive understanding of material properties, and for the fabrication as well as the stability of heterostructure materials. Various microscopy and optical spectroscopy methods have been utilized for the characterization of vdW heterostructures. For example, high-resolution elemental mapping in electron microscopy can be used to characterize vdW heterostructures consisting of two or more types of nanosheets [13, 14], and scanning tunneling microscopy (STM) and atomic force microscopy (AFM) can be employed to visualize the periodic Moiré patterns, which serve as an indicator of the formation of vdW heterostructures [15, 16, 17, 18]. The Moiré pattern images from STM or AFM experiments must be complemented with extensive modeling to reveal the atomistic details of vdW heterostructures. Hence, atomistic-scale simulations can play a critical role in both verifying the stability and analyzing the physical/chemical properties of the heterostructures [19, 20, 21, 22].

Atomistic-scale calculation of the properties of vdW heterostructures is not a trivial task. Ab initio calculations can in principle evaluate the structural stabilities and physical/chemical properties of vdW heterostructures [19, 21]; however, simulation supercells larger than those of 2D materials comprised of a single nanosheet component (e.g., graphene) must be constructed to accommodate the lattice mismatch between nanosheets. This critically limits the application of ab initio calculations to vdW heterostructures, in particular when twist angles must be taken into account. In contrast, classical molecular simulations can handle systems with large spatial spans, which makes them seem ideal for exploring the structural properties of vdW heterostructures [22, 23, 24]. However, classical molecular simulations are limited by the availability of classical interatomic force fields (or interatomic potentials). Owing to the complex combination of chemical species and intermolecular interactions, parametrization of interatomic force fields is challenging, which limits the application of classical molecular dynamics (MD) in exploring the large-scale structural properties of vdW heterostructures.

In the present manuscript, we demonstrate that an interatomic potential for \({\rm {Sb/MoS}}_{2}\) vdW heterostructures can be constructed by harnessing the power of machine learning (ML). The universal approximation theorem states that a feed-forward neural network can approximate any continuous function on a compact domain to arbitrary accuracy [25, 26], which allows it to represent the potential energy surface (PES) of molecular systems. Hence, in the present study, an artificial neural network (ANN) model was trained on tens of thousands of configurations from DFT calculations. For identical system configurations, the trained ANN model reproduces the respective DFT energies. We carried out MD simulations of \({\rm{Sb/MoS}}_{2}\) heterostructures with system sizes on the order of 1000 atoms, a size that is difficult for current ab initio molecular dynamics (AIMD) simulations, demonstrating the potential of the ANN model as an efficient energy/force evaluator for atomistic simulations of chemically complex systems.

## Simulation Methodology

### Artificial Neural Network for Atomistic Energies

In the ANN model, the energy of the system is partitioned into atomic energies, each evaluated by an atomic neural network, and these atomic energies sum to the system energy *E*. The architecture of each atomic neural network is displayed schematically in the lower panel of Fig. 1. The atomic neural network is comprised of the input layer, the hidden layers, and the output layer. In our ANN model for atomistic simulations, the input layer is comprised of a series of atomic descriptor functions transformed from the coordinates of individual atoms as well as their chemical environments (e.g., bond lengths or angles with neighboring atoms). The output layer outputs the atomic energy of each atom. Each layer in the neural network is comprised of a finite number of nodes (neurons), and the nodes are connected (see the arrows in the lower panel of Fig. 1) via a set of weighting parameters *w*. The advantage of this energy partitioning scheme is that the energies of systems of arbitrary sizes can be predicted once the atomic neural network function is trained.

The output value \(o_{i,j}\) of the \(j\)th node in the \(i\)th layer is obtained by applying the activation function to the weighted sum of the outputs of the nodes in layer \(i-1\), where the weights *w* connect the nodes in layer *i* and layer \(i-1\). For an atomic neural network (the lower panel of Fig. 1) with *M* hidden layers, the atomic energy of atom *i* can be written as \(E_i = \mathcal {N}(\mathbf {I_i},\{\mathbf {W}\})\), where \(\mathbf {I_i}\) is the input layer of atom *i*. Note that each chemical species in the system (e.g., Sb, Mo, and S in the present study) should have its own atomic neural network function \(\mathcal {N}\) (namely, \(\mathcal {N}_{Sb}\), \(\mathcal {N}_{Mo}\), and \(\mathcal {N}_S\) in the present study). Furthermore, in the present study, we used only the hyperbolic tangent function *tanh*(*o*) as the activation function.

The essence of the ANN model is the set of weighting matrices \(\{\mathbf {W}\}\) connecting the nodes in the neural network, and these parameters need to be trained before the model can be used in subsequent atomistic simulations. Once the ANN model is trained, we can perform atomistic simulations such as classical MD simulations as well as Monte Carlo simulations to exhaustively sample the configuration space for structural/thermodynamic properties or entropy-related properties such as system free energies. The atomic forces, which are critical for atomistic MD simulations, can be obtained directly as the negative gradient of the system energy *E*, which can be derived analytically by differentiating the atomic neural network function \(\mathcal {N}(\mathbf {I_i},\{\mathbf {W}\})\) with respect to the atomic coordinates.
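The energy partitioning and layer-by-layer evaluation described above can be illustrated with a minimal pure-Python sketch. It is not the trained model: the weights are random, the layer sizes and descriptor values are made up, and the bias terms and the linear output node are assumptions not stated in the text. The function names (`make_network`, `atomic_energy`, `total_energy`) are hypothetical.

```python
import math
import random


def make_network(layer_sizes, rng):
    """Build random weights and biases for a fully connected network.

    layer_sizes such as [4, 5, 5, 1] means 4 descriptors in, two hidden
    layers of 5 nodes, and a single atomic-energy output (illustrative sizes).
    """
    network = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]  # assumption
        network.append((weights, biases))
    return network


def atomic_energy(network, descriptors):
    """Propagate one atom's descriptor vector through its species network."""
    values = list(descriptors)
    last = len(network) - 1
    for layer_index, (weights, biases) in enumerate(network):
        sums = [sum(w * v for w, v in zip(row, values)) + b
                for row, b in zip(weights, biases)]
        # tanh on hidden layers; the output node is kept linear (assumption)
        values = sums if layer_index == last else [math.tanh(s) for s in sums]
    return values[0]


def total_energy(networks, atoms):
    """Sum atomic energies; atoms is a list of (species, descriptor_vector)."""
    return sum(atomic_energy(networks[species], vec) for species, vec in atoms)


rng = random.Random(0)
# One network per chemical species, as in the text (Sb, Mo, S).
networks = {species: make_network([4, 5, 5, 1], rng)
            for species in ("Sb", "Mo", "S")}
atoms = [
    ("Sb", [0.1, 0.2, 0.3, 0.4]),
    ("Mo", [0.5, 0.1, 0.0, 0.2]),
    ("S", [0.3, 0.3, 0.1, 0.9]),
]
energy_small = total_energy(networks, atoms)
energy_doubled = total_energy(networks, atoms * 2)
```

Because the system energy is a plain sum over atoms, duplicating the atom list doubles the energy, which is the property that lets a model trained on small cells be evaluated on much larger ones.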

### Descriptor Functions

The input layer \(\mathbf {I_i}\) of atom *i* is comprised of a series of descriptor functions transforming the atomic coordinates (usually Cartesian) of atom *i* into translationally/rotationally invariant fingerprints specifying its chemical environment. In this work, the Gaussian descriptor functions suggested by Behler [28] were employed as the input layer of the atomic neural network functions \(\mathcal {N}\). In the present study, the Gaussian descriptor functions were divided into two categories, namely, the radial (two-body) descriptor \(G_{i}^{II}\) and the angular (three-body) descriptor \(G_{i}^{III}\). The radial descriptor \(G_{i}^{II}\) depends only on the distances between atom *i* and its neighboring atoms, whereas the angular descriptor \(G_{i}^{III}\) also depends on the angles formed by atom *i* and pairs of its neighbors.
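As an illustration of why such fingerprints are translationally and rotationally invariant, the sketch below evaluates a Behler-type radial descriptor for one atom before and after a rigid rotation and shift. The functional form (a Gaussian in the interatomic distance multiplied by a cosine cutoff), the parameters `eta` and `r_c`, the coordinates, and the absence of periodic boundary conditions are all assumptions made for illustration; the exact expression used in this work is not reproduced here.

```python
import math


def cosine_cutoff(r, r_c):
    """Smoothly switch off contributions beyond the cutoff radius r_c."""
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0) if r < r_c else 0.0


def radial_descriptor(positions, i, eta, r_c):
    """Sum Gaussian-weighted pair contributions around atom i (no periodicity)."""
    total = 0.0
    for j, neighbor in enumerate(positions):
        if j == i:
            continue
        r = math.dist(positions[i], neighbor)
        total += math.exp(-eta * r * r) * cosine_cutoff(r, r_c)
    return total


def rotate_z_and_shift(point, angle, shift):
    """Rotate a point about the z axis, then translate it."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])


# Made-up coordinates in angstrom, for illustration only.
positions = [(0.0, 0.0, 0.0), (2.9, 0.0, 0.0), (0.0, 2.9, 0.5), (1.5, 1.5, 1.5)]
moved = [rotate_z_and_shift(p, 0.7, (1.0, -2.0, 3.0)) for p in positions]

g_before = radial_descriptor(positions, 0, eta=0.05, r_c=6.0)
g_after = radial_descriptor(moved, 0, eta=0.05, r_c=6.0)
```

The two values agree to numerical precision because the descriptor depends on the coordinates only through interatomic distances, which a rigid motion leaves unchanged.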

### Training Sets and Training Procedures

…\({\rm {Sb/MoS}}_{2}\) vdW heterostructures.

The training sets selected for the ANN model of \({\rm {Sb/MoS}}_{2}\) heterostructures are depicted in Fig. 2. The training sets included bulk \(\beta\)-antimonene and \({\rm MoS}_{2}\) (Fig. 2a, b), and the \({\rm {Sb/MoS}}_{2}\) heterostructures (Fig. 2c) subjected to hydrostatic strains of \(\pm 5\%\), \(\pm 3\%\), and \(\pm 1\%\) (see orange arrows in Fig. 2a, b, c). Each training set contains 150 atoms in the system, which is feasible for ab initio calculations. For large molecular systems such as biomolecules, it is not possible to perform ab initio calculations of the whole molecule for training set generation. One potential solution is to partition the whole molecule into pieces that can be handled by first-principles calculations and train the ANN models accordingly. Note that in the training set of the \({\rm {Sb/MoS}}_{2}\) heterostructures, the \(\beta\)-antimonene was subjected to a misfit strain of \(-1.773\%\) to accommodate the lattice of the underlying \({\rm MoS}_{2}\). To expand the range of atomic feature vectors and thereby improve the transferability of the ANN model, the Stone-Wales (SW) defect of single-layer \(\beta\)-antimonene (the lower panel of Fig. 2a) and the sulfur vacancy of single-layer \({\rm MoS}_{2}\) (the lower panel of Fig. 2b) were also included in the training set. For each configuration displayed in Fig. 2, the atomic coordinates and respective electronic energies from 1000 steps of AIMD simulation in the canonical ensemble at T = 300 K were collected as the training data. It must be noted that the trained ANN models are likely to fail once the atomic descriptor functions fall outside the ranges in which the model was trained; hence, the training sets must be chosen to cover the region of fingerprint space (namely, the space spanned by the fingerprint functions) that is relevant for the systems of interest.
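The warning about descriptors leaving the trained range can be turned into a simple guard. The sketch below records the minimum and maximum of each fingerprint component over the training vectors and flags components of a new vector that fall outside them. The function names and the numbers are hypothetical, and a per-component box check of this kind is only a necessary condition for staying inside the sampled region, not a sufficient one.

```python
def fingerprint_ranges(training_vectors):
    """Return (min, max) for every fingerprint component seen in training."""
    columns = list(zip(*training_vectors))
    return [(min(column), max(column)) for column in columns]


def outside_training_range(vector, ranges, tolerance=0.0):
    """List the indices of components that leave the trained range."""
    return [
        index
        for index, (value, (low, high)) in enumerate(zip(vector, ranges))
        if value < low - tolerance or value > high + tolerance
    ]


# Made-up two-component fingerprints, for illustration only.
training_vectors = [[0.1, 1.0], [0.4, 2.0], [0.3, 1.5]]
ranges = fingerprint_ranges(training_vectors)

inside = outside_training_range([0.2, 1.2], ranges)
extrapolating = outside_training_range([0.5, 0.9], ranges)
```

A non-empty result for a configuration visited during a simulation suggests that the model is extrapolating there and that similar structures could be added to the training set.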

The DFT calculations were performed using the *Vienna ab initio Simulation Package* (VASP) [30, 31] with the projector augmented wave (PAW) pseudopotential [32] and the Perdew, Burke, and Ernzerhof (PBE) exchange-correlation functional [33]. The cutoff energy was 450 eV, and a \(1\times 1 \times 1\) Monkhorst-Pack k-point mesh was employed. The DFT-D2 method was employed to incorporate van der Waals interactions between atoms [34]. The time step for the AIMD simulations was set to 0.5 fs.
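For readers who want to set up a comparable run, the stated settings map onto VASP input tags roughly as in the sketch below, which only assembles the text of an INCAR and a KPOINTS file. The cutoff, functional, dispersion correction, time step, step count, temperature, and k-point mesh come from the text; the choice of thermostat (`SMASS = 0`) is an assumption, since the text states only that the canonical ensemble was used, and the `IVDW` tag name applies to recent VASP versions. The actual input files used in this work are not shown in the source.

```python
# Settings taken from the text unless marked as an assumption.
incar = {
    "ENCUT": 450,    # plane-wave cutoff energy in eV
    "GGA": "PE",     # PBE exchange-correlation functional
    "IVDW": 10,      # DFT-D2 dispersion correction (tag name is version dependent)
    "IBRION": 0,     # molecular dynamics run
    "POTIM": 0.5,    # MD time step in fs
    "NSW": 1000,     # number of MD steps collected per configuration
    "TEBEG": 300,    # initial temperature in K
    "TEEND": 300,    # final temperature in K
    "SMASS": 0,      # Nose thermostat for the canonical ensemble (assumption)
}

incar_text = "".join(f"{tag} = {value}\n" for tag, value in incar.items())

# KPOINTS layout: comment, automatic mesh, scheme, subdivisions, shift.
kpoints_text = "\n".join(
    ["1x1x1 mesh", "0", "Monkhorst-Pack", "1 1 1", "0 0 0"]
) + "\n"
```

Writing `incar_text` and `kpoints_text` to files named `INCAR` and `KPOINTS` would give a starting point that still needs the structure (`POSCAR`) and pseudopotential (`POTCAR`) files.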

## Results and Discussions

### Results of Training Processes

To validate the trained ANN model, we fed structures of bulk \(\beta\)-antimonene (Fig. 4), \({\rm MoS}_{2}\) (Fig. 5), and the \({\rm {Sb/MoS}}_{2}\) heterostructure (Fig. 6) into the ANN model, computed the energies, and compared the energies from the ANN model with those from the respective VASP calculations. Note that each structure in the validation sets was subjected to a hydrostatic strain that was not within the training sets; see the applied hydrostatic strains annotated in the upper panels of Figs. 4, 5 and 6a. The structures for validation were obtained from AIMD simulations using VASP in the canonical ensemble at T = 300 K. Figures 4, 5 and 6 show that the trained ANN model evaluates energies in good agreement with those from VASP calculations, with a maximal root mean squared error of 0.0014 eV/atom (bulk \(\beta\)-antimonene, \(-2\%\) strain, see Fig. 4b), demonstrating that for given structures of the \({\rm {Sb/MoS}}_{2}\) system, this ANN model can evaluate system energies with high fidelity to those from the respective ab initio calculations. It must be noted that the system sizes of the validation sets were smaller than 150 atoms, allowing direct comparison of energies from the trained ANN model and AIMD simulations. Since the trained ANN model has been shown to predict system energies with high accuracy relative to AIMD simulations, in the following subsection we demonstrate that it can be employed to carry out MD simulations with system sizes that are too large for AIMD simulations.
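The error metric quoted above can be computed as in the following sketch, which assumes that the total energies of each frame are divided by the number of atoms before the root mean squared error is taken. That normalization, the function name `rmse_per_atom`, and the energies below are assumptions for illustration; they are not data from this work.

```python
import math


def rmse_per_atom(ann_energies, dft_energies, n_atoms):
    """Root mean squared error between two energy series, in eV/atom."""
    if len(ann_energies) != len(dft_energies) or not ann_energies:
        raise ValueError("energy series must be non-empty and of equal length")
    squared_errors = [
        ((ann - dft) / n_atoms) ** 2
        for ann, dft in zip(ann_energies, dft_energies)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up total energies (eV) for three frames of a 150-atom cell.
ann_energies = [-600.00, -600.30, -599.70]
dft_energies = [-600.15, -600.15, -599.85]

error = rmse_per_atom(ann_energies, dft_energies, n_atoms=150)
```

Reporting the error per atom makes values from cells of different sizes comparable, which matters when the same model is later applied to systems of about 1000 atoms.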

### Molecular Dynamics Simulations of \({\rm {Sb/MoS}}_{2}\) Heterostructures

Computational costs of tri-layer antimonene (900 atoms) and monolayer antimonene (1056 atoms) using the ANN model and ab initio calculations (VASP)

Energy model | System | No. atoms | No. cores | Speed |
---|---|---|---|---|
ANN | Tri-layer antimonene | 900 | 1 | 12.5 min/step |
VASP | Tri-layer antimonene | 900 | 112 | 50.0 min/step |
ANN | Monolayer antimonene | 1056 | 1 | 13.3 min/step |
VASP | Monolayer antimonene | 1056 | 112 | 55.6 min/step |
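Taking the tabulated speeds at face value, the saving can be expressed both in wall-clock time and in core-time (minutes per step multiplied by the number of cores). The short sketch below does this arithmetic; the variable and function names are hypothetical, and the comparison ignores any difference in hardware between the two sets of runs.

```python
# (minutes per step, number of cores) copied from the table above.
timings = {
    "tri-layer": {"ANN": (12.5, 1), "VASP": (50.0, 112)},
    "monolayer": {"ANN": (13.3, 1), "VASP": (55.6, 112)},
}


def core_minutes(minutes_per_step, cores):
    """Total core-time spent on one step."""
    return minutes_per_step * cores


speedups = {}
for system, runs in timings.items():
    ann_minutes, ann_cores = runs["ANN"]
    vasp_minutes, vasp_cores = runs["VASP"]
    speedups[system] = {
        # Ratio of wall-clock time per step.
        "wall_clock": vasp_minutes / ann_minutes,
        # Ratio of core-time per step.
        "core_time": core_minutes(vasp_minutes, vasp_cores)
        / core_minutes(ann_minutes, ann_cores),
    }
```

For the tri-layer system this gives a factor of 4 in wall-clock time and a factor of 448 in core-time, because the ANN run uses a single core while the VASP run uses 112.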

## Conclusion

In conclusion, the present study constructed an ANN-based model for efficient energy/force evaluation of \({\rm {Sb/MoS}}_{2}\) van der Waals heterostructures with high fidelity to ab initio calculations. The ANN model was trained on a large number of structures of \(\beta\)-antimonene, \({\rm {MoS}}_{2}\), and \({\rm {Sb/MoS}}_{2}\) heterostructures subjected to hydrostatic strains of up to \(\pm 5\%\), along with their respective energies from DFT calculations. The ANN model evaluates the system energies of structures in the validation set, i.e., structures not included in the training sets, in good agreement with the respective DFT energies. We performed classical MD simulations of \({\rm {Sb/MoS}}_{2}\) heterostructures using the trained ANN model and demonstrated that the heterostructures remain stable in finite-temperature MD simulations, at a much lower computational expense than DFT calculations. Hence, the present study demonstrates that the ANN model can be utilized as an efficient energy/force evaluator for atomistic simulations, allowing researchers to explore structures of vdW heterostructures with system sizes beyond the reach of conventional ab initio calculations.

## Notes

### Acknowledgements

We thank the Academia Sinica Career Development Award, Grant no. 2317-1050100, and Ministry of Science and Technology, Taiwan, Grant no. MOST 105-2112-M-001-009-MY3 for financial support, and the National Center for High-performance Computing for computational support.

## References

- 1. K.S. Novoselov, D. Jiang, F. Schedin, T.J. Booth, V.V. Khotkevich, S.V. Morozov, A.K. Geim, Two-dimensional atomic crystals. Proc. Natl. Acad. Sci. USA **102**(30), 10451–3 (2005)
- 2. A.K. Geim, K.S. Novoselov, The rise of graphene. Nat. Mater. **6**(3), 183–191 (2007)
- 3. K.F. Mak, C. Lee, J. Hone, J. Shan, T.F. Heinz, Atomically thin MoS2: a new direct-gap semiconductor. Phys. Rev. Lett. **105**(13), 136805 (2010)
- 4. S.Z. Butler, S.M. Hollen, L. Cao, Y. Cui, J.A. Gupta, H.R. Gutiérrez, T.F. Heinz, S.S. Hong, J. Huang, A.F. Ismach, E. Johnston-Halperin, M. Kuno, V.V. Plashnitsa, R.D. Robinson, R.S. Ruoff, S. Salahuddin, J. Shan, L. Shi, M.G. Spencer, M. Terrones, W. Windl, J.E. Goldberger, Progress, challenges, and opportunities in two-dimensional materials beyond graphene. ACS Nano **7**(4), 2898–2926 (2013)
- 5. K. Zhang, Y. Feng, F. Wang, Z. Yang, J. Wang, Two dimensional hexagonal boron nitride (2D-hBN): synthesis, properties and applications. J. Mater. Chem. C **5**(46), 11992–12022 (2017)
- 6. A.K. Geim, I.V. Grigorieva, Van der Waals heterostructures. Nature **499**(7459), 419–425 (2013)
- 7. Y.-C. Lin, L. Ning, N. Perea-Lopez, J. Li, Z. Lin, X. Peng, C.H. Lee, C. Sun, L. Calderin, P.N. Browning, M.S. Bresnehan, M.J. Kim, T.S. Mayer, M. Terrones, J.A. Robinson, Direct synthesis of van der Waals solids. ACS Nano **8**(4), 3715–3723 (2014)
- 8. K.S. Novoselov, A. Mishchenko, A. Carvalho, A.H. Castro Neto, 2D materials and van der Waals heterostructures. Science **353**(6298), aac9439 (2016)
- 9. A. Dankert, S.P. Dash, Electrical gate control of spin current in van der Waals heterostructures at room temperature. Nat. Commun. **8**, 16093 (2017)
- 10. C.-I. Lu, C.J. Butler, J.-K. Huang, Y.-H. Chu, H.-H. Yang, C.-M. Wei, L.-J. Li, M.-T. Lin, Moiré-related in-gap states in a twisted MoS2/graphite heterojunction. npj 2D Mater. Appl. **1**(1), 24 (2017)
- 11. Y. Cao, V. Fatemi, S. Fang, K. Watanabe, T. Taniguchi, E. Kaxiras, P. Jarillo-Herrero, Unconventional superconductivity in magic-angle graphene superlattices. Nature **556**(7699), 43–50 (2018)
- 12. C. Tang, L. Zhong, B. Zhang, H.-F. Wang, Q. Zhang, 3D Mesoporous van der Waals heterostructures for trifunctional energy electrocatalysis. Adv. Mater. **30**(5), 1705110 (2018)
- 13. R. Decker, Y. Wang, V.W. Brar, W. Regan, H.-Z. Tsai, Q. Wu, W. Gannett, A. Zettl, M.F. Crommie, Local electronic properties of graphene on a BN substrate via scanning tunneling microscopy. Nano Lett. **11**(6), 2291–2295 (2011)
- 14. J. Xue, J. Sanchez-Yamagishi, D. Bulmash, P. Jacquod, A. Deshpande, K. Watanabe, T. Taniguchi, P. Jarillo-Herrero, B.J. LeRoy, Scanning tunnelling microscopy and spectroscopy of ultra-flat graphene on hexagonal boron nitride. Nat. Mater. **10**(4), 282–285 (2011)
- 15. M. Kuwabara, D.R. Clarke, D.A. Smith, Anomalous superperiodicity in scanning tunneling microscope images of graphite. Appl. Phys. Lett. **56**(24), 2396–2398 (1990)
- 16. E.N. Voloshina, Y.S. Dedkov, S. Torbrügge, A. Thissen, M. Fonin, Graphene on Rh(111): scanning tunneling and atomic force microscopies studies. Appl. Phys. Lett. **100**(24), 241606 (2012)
- 17. S. Tang, Y. Haomin Wang, A.L. Zhang, H. Xie, X. Liu, L. Liu, T. Li, F. Huang, X. Xie, M. Jiang, Precisely aligned graphene grown on hexagonal boron nitride by catalyst free chemical vapor deposition. Sci. Rep. **3**(1), 2666 (2013)
- 18. M.M. van Wijk, A. Schuring, M.I. Katsnelson, A. Fasolino, Moiré patterns as a probe of interplanar interactions for graphene on h-BN. Phys. Rev. Lett. **113**(13), 135504 (2014)
- 19. Z.Y. Zhang, M.S. Si, S.L. Peng, F. Zhang, Y.H. Wang, D.S. Xue, Bandgap engineering in van der Waals heterostructures of blue phosphorene and MoS2: a first principles calculation. J. Solid State Chem. **231**, 64–69 (2015)
- 20. J.H. Kim, K. Kim, Z. Lee, The Hide-and-Seek of Grain boundaries from Moiré pattern Fringe of two-dimensional graphene. Sci. Rep. **5**(1), 12508 (2015)
- 21. Y. Hongyi, G.-B. Liu, J. Tang, X. Xiaodong, W. Yao, Moiré excitons: from programmable quantum emitter arrays to spin-orbit-coupled artificial lattices. Sci. Adv. **3**(11), e1701696 (2017)
- 22. J. Wang, R. Namburu, M. Dubey, A.M. Dongare, Origins of Moiré patterns in CVD-grown MoS2 bilayer structures at the atomic scales. Sci. Rep. **8**(1), 9439 (2018)
- 23. H. Kumar, L. Dong, V.B. Shenoy, Limits of coherency and strain transfer in flexible 2D van der Waals heterostructures: formation of strain solitons and interlayer debonding. Sci. Rep. **6**(1), 21516 (2016)
- 24. P. Nicolini, R. Capozza, P. Restuccia, T. Polcar, Structural ordering of molybdenum disulfide studied via reactive molecular dynamics simulations. ACS Appl. Mater. Interfaces **10**(10), 8937–8946 (2018)
- 25. G. Cybenko, Approximation by superpositions of a sigmoidal function. Math. Control Signal. Syst. **2**(4), 303–314 (1989)
- 26. K. Hornik, Approximation capabilities of multilayer feedforward networks. Neural Netw. **4**(2), 251–257 (1991)
- 27. J. Behler, M. Parrinello, Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. **98**, 146401 (2007)
- 28. J. Behler, Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. **134**(7), 074106 (2011)
- 29. A. Khorshidi, A.A. Peterson, Amp: A modular approach to machine learning in atomistic simulations. Comput. Phys. Commun. **207**, 310–324 (2016)
- 30. G. Kresse, J. Hafner, *Ab initio* molecular dynamics for liquid metals. Phys. Rev. B **47**(1), 558–561 (1993)
- 31. G. Kresse, J. Furthmüller, Efficient iterative schemes for *ab initio* total-energy calculations using a plane-wave basis set. Phys. Rev. B **54**(16), 11169–11186 (1996)
- 32. P.E. Blöchl, Projector augmented-wave method. Phys. Rev. B **50**(24), 17953–17979 (1994)
- 33. J.P. Perdew, K. Burke, M. Ernzerhof, Generalized gradient approximation made simple. Phys. Rev. Lett. **77**(18), 3865–3868 (1996)
- 34. S. Grimme, Semiempirical GGA-type density functional constructed with a long-range dispersion correction. J. Comput. Chem. **27**(15), 1787–1799 (2006)