Abstract
A major goal in industry is to minimise machine downtime while maximising availability, and maintenance is a key means of achieving it. Condition-based maintenance and prognostics and health management, which rely on the concepts of diagnostics and prognostics, are policies that have been gaining ground over several years. Their successful implementation depends heavily on the quality of the data used, which can be undermined when data are missing. Missing data may compromise the information contained in a data set and thus significantly affect the conclusions that can be drawn, so it is important to find suitable techniques to address the problem. To date, a number of methods for recovering such data, called imputation techniques, have been proposed. This paper reviews the most widely used methodologies and presents a case study using actual industrial centrifugal compressor data in order to identify the most suitable technique.
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Loukopoulos, P. et al. (2018). Addressing Missing Data for Diagnostic and Prognostic Purposes. In: Zuo, M., Ma, L., Mathew, J., Huang, HZ. (eds) Engineering Asset Management 2016. Lecture Notes in Mechanical Engineering. Springer, Cham. https://doi.org/10.1007/978-3-319-62274-3_17
DOI: https://doi.org/10.1007/978-3-319-62274-3_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62273-6
Online ISBN: 978-3-319-62274-3
eBook Packages: Engineering, Engineering (R0)