Combining Boosted Trees with Metafeature Engineering for Predictive Maintenance

  • Vítor Cerqueira
  • Fábio Pinto
  • Claudio Sá
  • Carlos Soares
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9897)


We describe a data mining workflow for predictive maintenance of the Air Pressure System in heavy trucks. Our approach consists of four steps: (i) a filter that excludes a subset of features and examples based on the number of missing values; (ii) a metafeature engineering procedure that creates a meta-level feature set, with the goal of enriching the information available in the original data; (iii) a biased sampling method to deal with the class imbalance problem; and (iv) boosted trees to learn the target concept. Results show that the metafeature engineering and the biased sampling method are critical for improving the performance of the classifier.
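The first three steps of the workflow can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the function names, the 0.5 missing-value thresholds, the two row-level metafeatures (count of missing values and mean of observed values), and the use of plain random oversampling as the biased sampling step are hypothetical stand-ins, not the procedures used in the paper.

```python
import random
from statistics import mean


def filter_missing(rows, labels, max_col_missing=0.5, max_row_missing=0.5):
    """Step (i): drop features, then examples, with too many missing values.

    Missing values are encoded as None. Both thresholds are illustrative.
    """
    n_rows, n_cols = len(rows), len(rows[0])
    keep_cols = [
        j for j in range(n_cols)
        if sum(r[j] is None for r in rows) / n_rows <= max_col_missing
    ]
    if not keep_cols:
        return [], [], []
    reduced = [[r[j] for j in keep_cols] for r in rows]
    keep_rows = [
        i for i, r in enumerate(reduced)
        if sum(v is None for v in r) / len(r) <= max_row_missing
    ]
    return ([reduced[i] for i in keep_rows],
            [labels[i] for i in keep_rows],
            keep_cols)


def add_metafeatures(rows):
    """Step (ii): append hypothetical row-level metafeatures to each example.

    Appends the number of missing values and the mean of the observed values.
    """
    out = []
    for r in rows:
        observed = [v for v in r if v is not None]
        n_missing = len(r) - len(observed)
        out.append(r + [n_missing, mean(observed) if observed else 0.0])
    return out


def oversample_minority(rows, labels, minority=1, seed=0):
    """Step (iii): random oversampling of the minority class until balanced.

    A simple stand-in for a biased sampling method.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    n_majority = len(labels) - len(minority_idx)
    n_extra = max(0, n_majority - len(minority_idx)) if minority_idx else 0
    extra = [rng.choice(minority_idx) for _ in range(n_extra)]
    idx = list(range(len(labels))) + extra
    return [rows[i] for i in idx], [labels[i] for i in idx]
```

The resulting balanced, metafeature-augmented examples would then be passed to a boosted-tree learner of choice for step (iv); that step is omitted here to keep the sketch free of third-party dependencies.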


Predictive maintenance, Anomaly detection, Boosting, Metalearning



This research has received funding from the ECSEL Joint Undertaking, the framework programme for research and innovation Horizon 2020 (2014–2020), under grant agreement No. 662189-MANTIS-2014-1. It was also financed by the ERDF (European Regional Development Fund) through the Operational Programme for Competitiveness and Internationalisation (COMPETE 2020 Programme) within project POCI-01-0145-FEDER-006961, and by National Funds through the FCT, Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology), as part of project UID/EEA/50014/2013.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Vítor Cerqueira (1, email author)
  • Fábio Pinto (1)
  • Claudio Sá (1)
  • Carlos Soares (1)
  1. INESC TEC, Universidade do Porto, Porto, Portugal
