Data-Driven Materials Modeling with XGBoost Algorithm and Statistical Inference Analysis for Prediction of Fatigue Strength of Steels

  • Deok-Kee Choi
Regular Paper


With the rapid development of industry, the demand for new materials is increasing, yet developing new materials is time-consuming and costly. In this study, we propose a data-driven workflow for building materials models that accurately reflect the properties of materials. Six numerical models for predicting the fatigue strength of steels were constructed from an empirical dataset extracted from a certified database (NIMS MatNavi material database). Because it is very difficult to understand the structure and patterns of a large dataset and to develop a good predictive model in a single step, we sought reliable models through statistical inference analysis, which had not been done in previous studies. Applying the recent XGBoost algorithm, we then selected the best-performing model, which achieved a coefficient of determination of R² = 0.9850. With further study, we believe this workflow can be used to develop predictive models for various material properties.
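The accuracy quoted above is a coefficient of determination, and the symbol list also names the mean absolute error and root mean square error. The following is a minimal sketch of how these three scores can be computed for a regression model. The function names and the fatigue-strength values are hypothetical and chosen only for illustration; they are not taken from the NIMS dataset or from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical fatigue strengths in MPa (illustrative only, not real data).
y_true = [400.0, 500.0, 600.0, 700.0]
y_pred = [410.0, 490.0, 610.0, 690.0]

mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
r2_value = r2(y_true, y_pred)
print(mae_value, rmse_value, r2_value)
```

In a model comparison such as the one described in the abstract, scores like these would normally be computed on held-out data rather than on the training set, so that the ranking of candidate models reflects predictive rather than fitting performance.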


Keywords: Fatigue strength · Machine learning · Data mining · Materials informatics · Materials database · Data-driven model

List of Symbols

Mean absolute error
Root mean square error
Coefficient of determination
Coefficient of linear regression
Explanatory variable
Response variable
Approximate to response variable
Risk function
Base learner
Weight of tree model
Linear regression
Decision tree
Random forest
AdaBoost
Gradient boosting
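Several of the entries above (risk function, base learner, gradient boosting) belong to the boosting family of models. As a rough illustration of how they fit together, the sketch below implements plain gradient boosting for a squared-error risk with one-split regression trees (stumps) as base learners. It is not XGBoost itself, which additionally uses second-order gradient information and regularisation, and it is not the author's implementation. All function names, hyperparameters, and data values are assumptions made for this example.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression tree (stump) to residuals on a 1-D feature."""
    best = None
    xs = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, left_value, right_value)
    _, thr, left_value, right_value = best
    return lambda xi: left_value if xi <= thr else right_value


def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    """Additive model F(x) = f0 + learning_rate * sum of stumps."""
    f0 = sum(y) / len(y)  # constant initial model: the mean response
    stumps = []
    pred = [f0] * len(y)
    for _ in range(n_rounds):
        # For the risk 0.5 * (y - F)^2 the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for pi, xi in zip(pred, x)]

    def predict(xi):
        return f0 + learning_rate * sum(s(xi) for s in stumps)

    return predict


def mse(y_true, y_pred):
    """Mean squared error, used here as the empirical risk."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: one explanatory variable and a fatigue-strength-like
# response in MPa (illustrative values only, not from any real dataset).
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
y = [300.0, 340.0, 390.0, 450.0, 480.0, 540.0, 600.0, 640.0]

model = gradient_boost(x, y, n_rounds=50, learning_rate=0.1)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
boosted_mse = mse(y, [model(xi) for xi in x])
print(baseline_mse, boosted_mse)
```

Each round fits a new base learner to what the current model still gets wrong, and the learning rate shrinks its contribution. The errors printed here are training errors on a toy sample; judging a real model would call for cross-validation on held-out data.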






Copyright information

© Korean Society for Precision Engineering 2019

Authors and Affiliations

  1. Department of Mechanical Engineering, Dankook University, Yongin-si, Korea
