A Statistical Learning Ontology for Managing Analytics Knowledge

  • Ali Behnaz
  • Madhushi Bandara
  • Fethi A. Rabhi
  • Maurice Peat
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 345)


This paper focuses on the use of knowledge management techniques to help organisations tap into the power of statistical learning when conducting analytics. Its main contribution is the use of an ontology development process to derive the essential concepts an ontology needs to represent variables of interest and their interrelationships with each other and with statistical datasets. The ontology is developed with the help of two case studies, in the areas of digital marketing and commodity pricing. A number of competency questions have been designed to map to user requirements in both case studies. A prototype system has been developed using a semantic modelling tool and a semantic data repository to demonstrate that the proposed ontology can support the competency questions via semantic queries.
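To make the idea of answering a competency question via a semantic query more concrete, the following is a minimal sketch in plain Python rather than the prototype described above. It assumes a toy in-memory triple store; every name in it (`AdSpend`, `SalesVolume`, `influences`, `measuredIn`, the dataset names, and the helper functions) is hypothetical and chosen only for illustration, not taken from the proposed ontology or the case studies.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """A subject-predicate-object statement, as in RDF-style models."""
    subject: str
    predicate: str
    obj: str


# Hypothetical facts: two variables, the datasets that record them,
# and one relationship between the variables. Illustrative only.
TRIPLES: List[Triple] = [
    Triple("AdSpend", "isA", "Variable"),
    Triple("SalesVolume", "isA", "Variable"),
    Triple("CampaignDataset", "isA", "Dataset"),
    Triple("SalesDataset", "isA", "Dataset"),
    Triple("AdSpend", "influences", "SalesVolume"),
    Triple("AdSpend", "measuredIn", "CampaignDataset"),
    Triple("SalesVolume", "measuredIn", "SalesDataset"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def datasets_for_influencers(triples: List[Triple], target: str) -> List[str]:
    """Hypothetical competency question: which datasets record
    variables that influence the target variable?"""
    influencers = {
        t.subject for t in match(triples, predicate="influences", obj=target)
    }
    datasets = {
        t.obj
        for variable in influencers
        for t in match(triples, subject=variable, predicate="measuredIn")
    }
    return sorted(datasets)


if __name__ == "__main__":
    print(datasets_for_influencers(TRIPLES, "SalesVolume"))
```

In a semantic data repository the same question would normally be expressed as a query over the stored ontology instead of hand-written Python; the sketch only shows how a competency question reduces to pattern matching over variables, relationships and datasets.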


Statistical learning, Data science, Computational social scientist, Ontology, Semantic technology



We are grateful to Capsifi and Ignition Wealth, especially Terry Roach, Mark Fordree and Mike Giles, for sponsoring the research which led to this paper. We are also grateful to Adnene Guabtni and Chedia Dhaoui for helping with the digital marketing case study. We thank Gino Conte for the visualization development of the prototype application.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Ali Behnaz (1, corresponding author)
  • Madhushi Bandara (1)
  • Fethi A. Rabhi (1)
  • Maurice Peat (2)
  1. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
  2. The University of Sydney Business School, Sydney, Australia
