Attribute Interactions in Medical Data Analysis

  • Aleks Jakulin
  • Ivan Bratko
  • Dragica Smrke
  • Janez Demšar
  • Blaž Zupan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2780)


There is much empirical evidence for the success of naive Bayesian classification (NBC) in medical applications of attribute-based machine learning. NBC assumes that attributes are conditionally independent given the class. When classifying, such classifiers sum up the pieces of class-related evidence from individual attributes, each independently of the other attributes. Performance, however, deteriorates significantly when the “interactions” between attributes become critical. We propose an approach to handling attribute interactions within the framework of “voting” classifiers, such as NBC: an operational test for detecting interactions in learning data, and a procedure that takes the detected interactions into account while learning. This approach induces a structuring of the domain of attributes. It may improve the classifier’s performance and may provide useful novel information for the domain expert when interpreting the results of learning. We report on its application in data analysis and model construction for the prediction of clinical outcome in hip arthroplasty.
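
The abstract does not spell out the interaction test. As a minimal sketch, assume that the interaction between two attributes A and B with respect to the class C is measured by an information-theoretic quantity often called interaction gain, I(A,B;C) - I(A;C) - I(B;C). A positive value suggests the attributes are informative only jointly, and a negative value suggests they carry redundant evidence that a voting classifier would count twice. The function names and the toy data below are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of a sequence."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def interaction_gain(a, b, c):
    """I(A,B;C) - I(A;C) - I(B;C): assumed interaction measure (illustrative)."""
    joint = list(zip(a, b))  # treat the attribute pair as one compound attribute
    return (mutual_information(joint, c)
            - mutual_information(a, c)
            - mutual_information(b, c))


# Toy data (hypothetical): two binary attributes over four examples.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]

# Case 1: the class is the XOR of the attributes. Neither attribute is
# informative alone, but together they determine the class.
xor_class = [x ^ y for x, y in zip(a, b)]
xor_gain = interaction_gain(a, b, xor_class)

# Case 2: the same attribute is used twice and the class copies it.
# The second copy adds no new evidence, so the gain is negative.
dup_class = [0, 0, 1, 1]
dup_gain = interaction_gain(a, a, dup_class)

print(f"XOR attributes:       {xor_gain:+.3f} bits")
print(f"Duplicated attribute: {dup_gain:+.3f} bits")
```

Under this assumed measure, the XOR case is the kind of situation where summing per-attribute evidence cannot work, while the duplicated-attribute case is one where the same evidence is summed twice.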


Keywords (added by machine, not by the authors): Domain Expert; Information Gain; Negative Interaction; Feature Subset; Attribute Interaction





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Aleks Jakulin (1)
  • Ivan Bratko (1, 2)
  • Dragica Smrke (3)
  • Janez Demšar (1)
  • Blaž Zupan (1, 2, 4)

  1. Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
  2. J. Stefan Institute, Ljubljana, Slovenia
  3. Dept. of Traumatology, University Clinical Center, Ljubljana, Slovenia
  4. Dept. of Human and Mol. Genetics, Baylor College of Medicine, Houston, USA
