
On the Relevance of Preprocessing in Predictive Maintenance for Dynamic Systems

Chapter in: Predictive Maintenance in Dynamic Systems

Abstract

Real-time, data-driven monitoring of dynamic systems for predictive maintenance is usually a highly complex process. To a certain extent, any data-driven approach is sensitive to data preprocessing, understood as any treatment applied to the data prior to the monitoring model, which is sometimes crucial for the final performance of the employed monitoring technique. The aim of this work is to exhaustively quantify the sensitivity of data-driven predictive maintenance models in dynamic systems.

We consider two predictive maintenance scenarios, each defined by publicly available data. For each scenario, we examine its properties and apply several techniques at each of the successive preprocessing steps, e.g., data cleaning, missing value treatment, outlier detection, feature selection, or imbalance compensation. The pretreatment configurations, i.e., sequential combinations of techniques from the different preprocessing steps, are considered together with different monitoring approaches, in order to determine the relevance of data preprocessing for predictive maintenance in dynamic systems.
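The enumeration of pretreatment configurations described above, i.e., one technique chosen per preprocessing step and applied in sequence, can be sketched as follows. This is an illustrative Python example, not the chapter's actual experimental setup; the step names and candidate techniques are hypothetical placeholders:

```python
from itertools import product

# Hypothetical candidate techniques for each successive preprocessing step.
# The "none" entries allow a step to be skipped in a given configuration.
steps = {
    "missing_values": ["mean_imputation", "knn_imputation"],
    "outlier_detection": ["none", "mahalanobis", "robust_pca"],
    "feature_selection": ["none", "mutual_information", "lasso"],
    "imbalance": ["none", "smote", "undersampling"],
}

# A pretreatment configuration is one technique per step, applied in order;
# the full grid is the Cartesian product over the per-step candidates.
configurations = [dict(zip(steps, combo)) for combo in product(*steps.values())]

print(len(configurations))  # 2 * 3 * 3 * 3 = 54 configurations
```

Each resulting configuration would then be paired with every candidate monitoring model, so the comparison grid grows multiplicatively with the number of techniques per step.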


Notes

  1. The point at which a treatment is applied is not irrelevant: several steps between data cleansing and feature engineering can be very sensitive to redundancies, or heavily affected by features that ultimately turn out to be irrelevant.


Acknowledgements

This research is supported by the Basque Government through the BERC 2018–2021 and ELKARTEK programs and through project KK-2018/00071; and by Spanish Ministry of Economy and Competitiveness MINECO through BCAM Severo Ochoa excellence accreditation SEV-2017-0718, and through project TIN2017-82626-R.

Author information


Correspondence to Carlos Cernuda.



Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Cernuda, C. (2019). On the Relevance of Preprocessing in Predictive Maintenance for Dynamic Systems. In: Lughofer, E., Sayed-Mouchaweh, M. (eds) Predictive Maintenance in Dynamic Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-05645-2_3


  • DOI: https://doi.org/10.1007/978-3-030-05645-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05644-5

  • Online ISBN: 978-3-030-05645-2

  • eBook Packages: Engineering (R0)
