Advanced Neural Networks Systems for Unbalanced Industrial Datasets

Chapter in: Multidisciplinary Approaches to Neural Computing

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 69)

Abstract

Many industrial tasks involve the classification of unbalanced datasets, in which rare patterns of interest for the particular application have to be detected among a much larger number of patterns. Since data unbalance strongly affects the performance of standard classifiers, several ad hoc methods have been developed. This work outlines the main techniques for handling class unbalance, then describes three neural-network-based methods developed by the authors and tests them on industrial case studies.
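The effect of data unbalance on a standard classifier can be illustrated with a minimal sketch. It is not taken from the chapter: the class proportions (990 frequent patterns, 10 rare ones) and the trivial "always predict the frequent class" model are assumptions chosen only to show why overall accuracy can look high while no rare pattern is detected.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all patterns that are labelled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of patterns of the `positive` class that are detected."""
    hits = [p == positive for t, p in zip(y_true, y_pred) if t == positive]
    return sum(hits) / len(hits) if hits else 0.0


# Hypothetical unbalanced dataset: 0 = frequent pattern, 1 = rare pattern.
y_true = [0] * 990 + [1] * 10

# Trivial baseline that always predicts the most frequent class.
majority_label = Counter(y_true).most_common(1)[0][0]
y_pred = [majority_label] * len(y_true)

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy = {acc:.3f}, rare-class recall = {rec:.3f}")
```

Under these assumptions the baseline reaches 99% accuracy yet detects none of the rare patterns, which is why metrics focused on the rare class are usually reported alongside accuracy for this kind of problem.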

Corresponding author

Correspondence to Marco Vannucci.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Vannucci, M., Colla, V. (2018). Advanced Neural Networks Systems for Unbalanced Industrial Datasets. In: Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Multidisciplinary Approaches to Neural Computing. Smart Innovation, Systems and Technologies, vol 69. Springer, Cham. https://doi.org/10.1007/978-3-319-56904-8_18

  • DOI: https://doi.org/10.1007/978-3-319-56904-8_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56903-1

  • Online ISBN: 978-3-319-56904-8

  • eBook Packages: Engineering, Engineering (R0)
