
On the Relevance of Preprocessing in Predictive Maintenance for Dynamic Systems

Chapter in: Predictive Maintenance in Dynamic Systems

Abstract

Real-time, data-driven monitoring of dynamic systems for predictive maintenance is usually a highly complex process. To a certain extent, any data-driven approach is sensitive to data preprocessing, understood as any treatment applied to the data prior to the monitoring model, which is sometimes crucial for the final performance of the employed monitoring technique. The aim of this work is to exhaustively quantify the sensitivity of data-driven predictive maintenance models in dynamic systems.

We consider two predictive maintenance scenarios, each defined by publicly available data. For each scenario, we examine its properties and apply several techniques at each of the successive preprocessing steps, e.g., data cleaning, missing value treatment, outlier detection, feature selection, or imbalance compensation. The pretreatment configurations, i.e., sequential combinations of techniques from the different preprocessing steps, are considered together with different monitoring approaches, in order to determine the relevance of data preprocessing for predictive maintenance in dynamic systems.
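The enumeration of pretreatment configurations described above, i.e., one technique chosen per preprocessing step and applied in sequence, can be sketched as follows. This is an illustrative Python example, not the chapter's actual experimental setup; the step names and candidate techniques are hypothetical placeholders:

```python
from itertools import product

# Hypothetical candidate techniques for each successive preprocessing step.
# The "none" entries allow a step to be skipped in a given configuration.
steps = {
    "missing_values": ["mean_imputation", "knn_imputation"],
    "outlier_detection": ["none", "mahalanobis", "robust_pca"],
    "feature_selection": ["none", "mutual_information", "lasso"],
    "imbalance": ["none", "smote", "undersampling"],
}

# A pretreatment configuration is one technique per step, applied in order;
# the full grid is the Cartesian product over the per-step candidates.
configurations = [dict(zip(steps, combo)) for combo in product(*steps.values())]

print(len(configurations))  # 2 * 3 * 3 * 3 = 54 configurations
```

Each resulting configuration would then be paired with every candidate monitoring model, so the comparison grid grows multiplicatively with the number of techniques per step.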


Notes

  1. The point at which a treatment is applied is not irrelevant: several steps between data cleansing and feature engineering can be very sensitive to redundancies, or heavily affected by features that ultimately turn out to be irrelevant.


Acknowledgements

This research is supported by the Basque Government through the BERC 2018–2021 and ELKARTEK programs and through project KK-2018/00071; and by Spanish Ministry of Economy and Competitiveness MINECO through BCAM Severo Ochoa excellence accreditation SEV-2017-0718, and through project TIN2017-82626-R.

Author information


Correspondence to Carlos Cernuda.



Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Cernuda, C. (2019). On the Relevance of Preprocessing in Predictive Maintenance for Dynamic Systems. In: Lughofer, E., Sayed-Mouchaweh, M. (eds) Predictive Maintenance in Dynamic Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-05645-2_3


  • DOI: https://doi.org/10.1007/978-3-030-05645-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05644-5

  • Online ISBN: 978-3-030-05645-2

  • eBook Packages: Engineering (R0)
