On Approaches for Monitoring Categorical Event Series

Control Charts and Machine Learning for Anomaly Detection in Manufacturing

Part of the book series: Springer Series in Reliability Engineering (RELIABILITY)

Abstract

In many manufacturing applications, the monitoring of categorical event series is required, i.e., of processes where the quality characteristics are measured on a qualitative scale. We survey three groups of approaches for this task. First, the categorical event series might be transformed into a count process (e.g., event counts, discrete waiting times). Once an appropriate model for this count process has been identified, diverse control charts are available for monitoring the generated counts. Second, control charts might be applied directly to the categorical event series, using different charts for nominal data than for ordinal data. This distinction is also crucial for the respective possibilities of analyzing and modeling these data. Finally, rule-based procedures from machine learning might also be used for monitoring categorical event series, where the generated rules are used to predict the occurrence of critical events. Our comprehensive survey of methods and models for categorical event series is complemented by two real-data examples from the manufacturing industry, on nominal types of defects and ordinal levels of quality.
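The first two groups of approaches can be illustrated by a minimal Python sketch. It is not taken from the chapter: the function names, the toy defect series, the segment length n = 10, the in-control probabilities, and the assumption of independent observations are all hypothetical choices made for illustration. The sketch first turns a categorical series into event counts per segment and checks them against k-sigma limits of an np chart, and then computes a Pearson chi-square statistic for one segment directly from the categorical data.

```python
from collections import Counter
from math import sqrt


def segment_counts(series, critical, n):
    """Count occurrences of `critical` in consecutive, non-overlapping segments of length n."""
    full = len(series) - len(series) % n  # drop an incomplete trailing segment
    return [sum(1 for x in series[i:i + n] if x == critical)
            for i in range(0, full, n)]


def np_chart_limits(p0, n, k=3.0):
    """k-sigma limits for binomial counts, assuming i.i.d. data with in-control probability p0."""
    center = n * p0
    sd = sqrt(n * p0 * (1 - p0))
    return max(0.0, center - k * sd), center, center + k * sd


def pearson_statistic(segment, p0_dist):
    """Pearson chi-square distance between observed category counts and the in-control distribution."""
    n = len(segment)
    freq = Counter(segment)
    return sum((freq.get(c, 0) - n * p) ** 2 / (n * p) for c, p in p0_dist.items())


# Hypothetical toy data: three segments of ten inspected items each.
series = (["ok"] * 9 + ["scratch"]
          + ["ok"] * 8 + ["dent", "ok"]
          + ["scratch"] * 6 + ["ok"] * 4)

# Approach 1: transform to counts of the critical event, then apply a count chart.
counts = segment_counts(series, critical="scratch", n=10)
lcl, center, ucl = np_chart_limits(p0=0.1, n=10)
signals = [i for i, c in enumerate(counts) if c > ucl or c < lcl]
print("counts:", counts, "signals at segments:", signals)

# Approach 2: monitor the nominal data directly via a chi-square type statistic.
p0_dist = {"ok": 0.8, "scratch": 0.1, "dent": 0.1}  # assumed in-control distribution
print("Pearson statistic, last segment:", pearson_statistic(series[20:30], p0_dist))
```

In practice, the in-control parameters would be estimated from in-control data, and serial dependence in the series would call for one of the model-based charts covered by the survey rather than these i.i.d. limits.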

Notes

  1. Although the presented approaches for episode mining do not explicitly use stochastic assumptions, several connections to models from Sect. 3.1 have been established in the literature, namely to Markov models by Gwadera et al. [16], to Hidden-Markov models by Laxman et al. [27], and to variable-length Markov models by Weiß [55].

References

  1. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the 20th international conference on very large databases, pp 487–499

  2. Agresti A (2010) Analysis of ordinal categorical data, 2nd edn. John Wiley & Sons Inc., Hoboken

  3. Bai K, Li J (2021) Location-scale monitoring of ordinal categorical processes. Naval Research Logistics, forthcoming

  4. Bashkansky E, Gadrich T (2011) Statistical quality control for ternary ordinal quality data. Appl Stoch Models Bus Ind 27(6):586–599

  5. Baum LE, Petrie T (1966) Statistical inference for probabilistic functions of finite state Markov chains. Ann Math Stat 37(6):1554–1563

  6. Bersimis S, Sachlas A, Castagliola P (2017) Controlling bivariate categorical processes using scan rules. Methodol Comput Appl Probab 19(4):1135–1149

  7. Blatterman DK, Champ CW (1992) A Shewhart control chart under 100% inspection for Markov dependent attribute data. In: Proceedings of the 23rd annual modeling and simulation conference, pp 1769–1774

  8. Bourke PD (1991) Detecting a shift in fraction nonconforming using run-length control charts with 100% inspection. J Qual Technol 23(3):225–238

  9. Brook D, Evans DA (1972) An approach to the probability distribution of CUSUM run length. Biometrika 59(3):539–549

  10. Bühlmann P, Wyner AJ (1999) Variable length Markov chains. Ann Stat 27(2):480–513

  11. Duncan AJ (1950) A chi-square chart for controlling a set of percentages. Ind Qual Control 7:11–15

  12. Duran RI, Albin SL (2009) Monitoring and accurately interpreting service processes with transactions that are classified in multiple categories. IIE Trans 42(2):136–145

  13. Ferland R, Latour A, Oraichi D (2006) Integer-valued GARCH processes. J Time Ser Anal 27(6):923–942

  14. Gan FF (1990) Monitoring observations generated from a binomial distribution using modified exponentially weighted moving average control chart. J Stat Comput Simul 37(1–2):45–60

  15. Göb R (2006) Data mining and statistical control – a review and some links. In: Lenz HJ, Wilrich PT (eds) Frontiers in statistical quality control 8, pp 285–308. Physica-Verlag, Heidelberg

  16. Gwadera R, Atallah MJ, Szpankowski W (2005) Markov models for identification of significant episodes. In: Proceedings of the 2005 SIAM international conference on data mining, pp 404–414

  17. Höhle M (2010) Online change-point detection in categorical time series. In: Kneib T, Tutz G (eds) Statistical modelling and regression structures. Festschrift in Honour of Ludwig Fahrmeir, pp 377–397. Physica-Verlag, Heidelberg

  18. Höhle M, Paul M (2008) Count data regression charts for the monitoring of surveillance time series. Comput Stat Data Anal 52(9):4357–4368

  19. Jacobs PA, Lewis PAW (1983) Stationary discrete autoregressive-moving average time series generated by mixtures. J Time Ser Anal 4(1):19–36

  20. Klein I, Doll M (2021) Tests on asymmetry for ordered categorical variables. J Appl Stat 48(7):1180–1198

  21. Klemettinen M, Mannila H, Toivonen H (1999) Rule discovery in telecommunication alarm data. J Netw Syst Manag 7(4):395–423

  22. Koutras MV, Bersimis S, Antzoulakos DL (2006) Improving the performance of the chi-square control chart via runs rules. Methodol Comput Appl Probab 8(3):409–426

  23. Koutras MV, Maravelakis PE, Bersimis S (2008) Techniques for controlling bivariate grouped observations. J Multivar Anal 99(7):1474–1488

  24. Kvålseth TO (1995) Coefficients of variation for nominal and ordinal categorical data. Percept Mot Skills 80(3):843–847

  25. Lambert D, Liu C (2006) Adaptive thresholds: monitoring streams of network counts. J Am Stat Assoc 101(473):78–88

  26. Laxman S, Sastry PS (2006) A survey of temporal data mining. Sādhanā 31(2):173–198

  27. Laxman S, Sastry PS, Unnikrishnan KP (2005) Discovering frequent episodes and learning hidden Markov models: a formal connection. IEEE Trans Knowl Data Eng 17(11):1505–1517

  28. Li J, Tsung F, Zou C (2014) A simple categorical chart for detecting location shifts with ordinal information. Int J Prod Res 52(2):550–562

  29. Li J, Xu J, Zhou Q (2018) Monitoring serially dependent categorical processes with ordinal information. IISE Trans 50(7):596–605

  30. Mannila H, Toivonen H, Verkamo AI (1997) Discovery of frequent episodes in event sequences. Data Min Knowl Disc 1(3):259–289

  31. Marcucci M (1985) Monitoring multinomial processes. J Qual Technol 17(2):86–91

  32. McKenzie E (1985) Some simple models for discrete variate time series. Water Resour Bull 21(4):645–650

  33. Montgomery DC (2009) Introduction to statistical quality control, 6th edn. John Wiley & Sons Inc., New York

  34. Morais MC, Knoth S, Weiß CH (2018) An ARL-unbiased thinning-based EWMA chart to monitor counts. Seq Anal 37(4):487–510

  35. Mousavi S, Reynolds MR Jr (2009) A CUSUM chart for monitoring a proportion with autocorrelated binary observations. J Qual Technol 41(4):401–414

  36. Mukhopadhyay AR (2008) Multivariate attribute control chart using Mahalanobis \(D^2\) statistic. J Appl Stat 35(4):421–429

  37. Nembhard DA, Nembhard HB (2000) A demerits control chart for autocorrelated data. Qual Eng 13(2):179–190

  38. Page E (1954) Continuous inspection schemes. Biometrika 41(1):100–115

  39. Perry MB (2020) An EWMA control chart for categorical processes with applications to social network monitoring. J Qual Technol 52(2):182–197

  40. Raftery AE (1985) A model for high-order Markov chains. J Roy Stat Soc B 47(3):528–539

  41. Rakitzis AC, Weiß CH, Castagliola P (2017) Control charts for monitoring correlated counts with a finite range. Appl Stoch Models Bus Ind 33(6):733–749

  42. Reynolds MR Jr, Stoumbos ZG (1999) A CUSUM chart for monitoring a proportion when inspecting continuously. J Qual Technol 31(1):87–108

  43. Roberts SW (1959) Control chart tests based on geometric moving averages. Technometrics 1(3):239–250

  44. Ryan AG, Wells LJ, Woodall WH (2011) Methods for monitoring multiple proportions when inspecting continuously. J Qual Technol 43(3):237–248

  45. Spanos CJ, Chen RL (1997) Using qualitative observations for process tuning and control. IEEE Trans Semicond Manuf 10(2):307–316

  46. Steiner SH (1998) Grouped data exponentially weighted moving average control charts. J Roy Stat Soc C 47(2):203–216

  47. Steiner SH, Geyer PL, Wesolowsky GO (1996) Grouped data-sequential probability ratio tests and cumulative sum control charts. Technometrics 38(3):230–237

  48. Szarka JL III, Woodall WH (2011) A review and perspective on surveillance of Bernoulli processes. Qual Reliab Eng Int 27(6):735–752

  49. Tucker GR, Woodall WH, Tsui K-L (2002) A control chart method for ordinal data. Am J Math Manag Sci 22(1–2):31–48

  50. Vasquez Capacho JW, Subias A, Travé-Massuyès L, Jimenez F (2017) Alarm management via temporal pattern learning. Eng Appl Artif Intell 65:506–516

  51. Wang J, Li J, Su Q (2017) Multivariate ordinal categorical process control based on log-linear modeling. J Qual Technol 49(2):108–122

  52. Wang J, Su Q, Xie M (2018) A univariate procedure for monitoring location and dispersion with ordered categorical data. Commun Stat Simul Comput 47(1):115–128

  53. Weiss GM (1999) Timeweaver: a genetic algorithm for identifying predictive patterns in sequences of events. In: Banzhaf et al (eds) Proceedings of the genetic and evolutionary computation conference, pp 718–725. Morgan Kaufmann, San Francisco

  54. Weiß CH (2009) Group inspection of dependent binary processes. Qual Reliab Eng Int 25(2):151–165

  55. Weiß CH (2011) Rule generation for categorical time series with Markov assumptions. Stat Comput 21(1):1–16

  56. Weiß CH (2012) Continuously monitoring categorical processes. Qual Technol Quant Manag 9(2):171–188

  57. Weiß CH (2015) SPC methods for time-dependent processes of counts – a literature review. Cogent Math 2(1):1111116

  58. Weiß CH (2017) Association rule mining. In: Balakrishnan et al (eds) Wiley StatsRef: statistics reference online. John Wiley & Sons Ltd., Hoboken

  59. Weiß CH (2018a) An introduction to discrete-valued time series. John Wiley & Sons Inc., Chichester

  60. Weiß CH (2018b) Control charts for time-dependent categorical processes. In: Knoth S, Schmid W (eds) Frontiers in statistical quality control 12. Physica-Verlag, Heidelberg, pp 211–231

  61. Weiß CH (2020) Distance-based analysis of ordinal data and ordinal time series. J Am Stat Assoc 115(531):1189–1200

  62. Weiß CH (2021) Stationary count time series models. WIREs Comput Stat 13(1):e1502

  63. Weiß CH, Zhu F, Hoshiyar A (2022) Softplus INGARCH models. Statistica Sinica 32(3), forthcoming

  64. Ye N (2003) Mining computer and network security data. In: Ye N (ed) The handbook of data mining. Lawrence Erlbaum Associates Inc., New Jersey, pp 617–636

  65. Ye N, Borror C, Zhang Y (2002a) EWMA techniques for computer intrusion detection through anomalous changes in event intensity. Qual Reliab Eng Int 18(6):443–451

  66. Ye N, Masum S, Chen Q, Vilbert S (2002b) Multivariate statistical analysis of audit trails for host-based intrusion detection. IEEE Trans Comput 51(7):810–820

  67. Yuan Y, Zhou S, Sievenpiper C, Mannar K, Zheng Y (2011) Event log modeling and analysis for system failure prediction. IIE Trans 43(9):647–660

  68. Zimmermann A (2014) Understanding episode mining techniques: benchmarking on diverse, realistic, artificial data. Intell Data Anal 18(5):761–791

Acknowledgments

The author thanks the referee for useful comments on an earlier draft of this article. The author is grateful to Professor Jian Li (Xi’an Jiaotong University, China) for providing the flash data discussed in Sect. 5.2.

Author information

Corresponding author

Correspondence to Christian H. Weiß.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Weiß, C.H. (2022). On Approaches for Monitoring Categorical Event Series. In: Tran, K.P. (eds) Control Charts and Machine Learning for Anomaly Detection in Manufacturing. Springer Series in Reliability Engineering. Springer, Cham. https://doi.org/10.1007/978-3-030-83819-5_5

  • DOI: https://doi.org/10.1007/978-3-030-83819-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-83818-8

  • Online ISBN: 978-3-030-83819-5

  • eBook Packages: Engineering, Engineering (R0)
