Identifying the Machine Learning Family from Black-Box Models

Part of the Lecture Notes in Computer Science book series (LNAI, volume 11160)


We address the novel question of determining which kind of machine learning model lies behind the predictions of a black-box model we interact with. Answering it may allow us to identify families of techniques whose models exhibit similar vulnerabilities and strengths. In our method, we first consider how an adversary can systematically query a given black-box model (the oracle) to label an artificially generated dataset. This labelled dataset is then used to train different surrogate models, each one trying to imitate the oracle’s behaviour. We explore two approaches. In the first, we assume that the family of the surrogate model that achieves the maximum Kappa metric against the oracle labels corresponds to the family of the oracle model. The second, based on machine learning, consists of learning a meta-model that is able to predict the model family of a new black-box model. We compare these two approaches experimentally, which gives us insight into how explanatory and predictable our concept of family is.
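The first approach can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the toy oracle (a nearest-prototype classifier), the two candidate families (`fit_stump`, `fit_1nn`), the uniform query distribution, and the use of a separate evaluation query set are all assumptions made for illustration. The sketch labels synthetic queries with the oracle, trains one surrogate per family, and returns the family whose surrogate has the highest Cohen's Kappa against the oracle labels.

```python
import random
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa: agreement between two label lists, corrected for chance."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n           # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in ca) / (n * n)        # chance agreement
    return 1.0 if p_e == 1.0 else (p_o - p_e) / (1.0 - p_e)

def sq_dist(p, q):
    return sum((u - v) ** 2 for u, v in zip(p, q))

def oracle(x):
    # Hypothetical black box: nearest-prototype model with a diagonal boundary.
    protos = [((0.2, 0.2), 0), ((0.8, 0.8), 1)]
    return min(protos, key=lambda pl: sq_dist(x, pl[0]))[1]

def fit_1nn(X, y):
    # Surrogate family "1nn": 1-nearest-neighbour over the labelled queries.
    data = list(zip(X, y))
    return lambda x: min(data, key=lambda pl: sq_dist(x, pl[0]))[1]

def fit_stump(X, y):
    # Surrogate family "stump": best single axis-aligned split (binary labels).
    best = None  # (training errors, feature, threshold, label above threshold)
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for hi in (0, 1):
                errs = sum((hi if x[f] > t else 1 - hi) != lab
                           for x, lab in zip(X, y))
                if best is None or errs < best[0]:
                    best = (errs, f, t, hi)
    _, f, t, hi = best
    return lambda x: hi if x[f] > t else 1 - hi

def identify_family(oracle, learners, n_queries=200, seed=0):
    rng = random.Random(seed)

    def sample():
        # Artificial queries drawn uniformly from the unit square (assumption).
        return [(rng.random(), rng.random()) for _ in range(n_queries)]

    X_train, X_eval = sample(), sample()
    y_train = [oracle(x) for x in X_train]   # query the oracle for labels
    y_eval = [oracle(x) for x in X_eval]
    scores = {}
    for family, fit in learners.items():
        surrogate = fit(X_train, y_train)
        scores[family] = cohen_kappa(y_eval, [surrogate(x) for x in X_eval])
    # The family of the best-agreeing surrogate is the guess for the oracle.
    return max(scores, key=scores.get), scores

best, scores = identify_family(oracle, {"stump": fit_stump, "1nn": fit_1nn})
print(best, scores)
```

Kappa is measured here on queries the surrogates were not trained on, because a 1-nearest-neighbour surrogate agrees perfectly with its own training labels. Whether the original experiments evaluate agreement this way is not stated in the abstract.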


  • Machine learning families
  • Black-box model
  • Dissimilarity measures
  • Adversarial machine learning


  1. For reproducibility and replicability purposes, all the experiments, code, data and plots can be found at



This material is based upon work supported by the Air Force Office of Scientific Research under award number FA9550-17-1-0287, the EU (FEDER), the Spanish MINECO under grant TIN 2015-69175-C4-1-R, and the Generalitat Valenciana (PROMETEOII/2015/013). F. Martínez-Plumed was also supported by INCIBE under grant INCIBEI-2015-27345 (Ayudas para la excelencia de los equipos de investigación avanzada en ciberseguridad). J. Hernández-Orallo also received a Salvador de Madariaga grant (PRX17/00467) from the Spanish MECD for a research stay at the CFI, Cambridge, and a BEST grant (BEST/2017/045) from the GVA for another research stay at the CFI.


Corresponding author

Correspondence to Raül Fabra-Boluda.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Fabra-Boluda, R., Ferri, C., Hernández-Orallo, J., Martínez-Plumed, F., Ramírez-Quintana, M.J. (2018). Identifying the Machine Learning Family from Black-Box Models. In: Herrera, F., et al. Advances in Artificial Intelligence. CAEPIA 2018. Lecture Notes in Computer Science(), vol 11160. Springer, Cham.

  • DOI:

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00373-9

  • Online ISBN: 978-3-030-00374-6

  • eBook Packages: Computer Science, Computer Science (R0)