Machine Learning for VLSI Chip Testing and Semiconductor Manufacturing Process Monitoring and Improvement

  • Jinjun Xiong
  • Yada Zhu
  • Jingrui He


Machine learning and big data analytics are in the spotlight, drawing attention that ranges from media coverage to booming start-up companies to eye-catching mergers and acquisitions. In contrast, the $336 billion semiconductor industry has been seen as an “old-fashioned” business, with fading interest from the best and brightest among young graduates and engineers. This chapter argues that it does not have to be this way, because many research problems and solutions studied in the semiconductor industry are in fact closely related to machine learning and big data problems. To illustrate this point, we discuss a number of practical but challenging problems arising from the semiconductor manufacturing process. We first show how machine learning techniques, especially regression-related ones, often under the “disguise” of optimization problems, have frequently been used (often with nontrivial modeling skill and mathematical sophistication) to solve semiconductor problems; examples include process variation modeling and VLSI chip testing. For other types of semiconductor problems, such as manufacturing process monitoring and improvement, we show that some existing machine learning algorithms are not necessarily well positioned to solve them, and that novel machine learning techniques involving temporal, structural, and hierarchical properties need to be further developed. In either scenario, we convey the message that machine learning and existing semiconductor industry research are closely related, and that researchers in each field often contribute to and benefit from the other.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. IBM Thomas J. Watson Research Center, Yorktown Heights, USA
  2. Arizona State University, Tempe, USA
