Machine Intelligence Mixture of Experts and Bayesian Networks

  • Walker H. Land Jr.
  • J. David Schaffer


Abstract

In Chap. 6, we introduced Bayesian networks and some of the traditional methods for designing them from data, and pointed out some of the challenges with these approaches. In this chapter, we introduce a much simpler approach that escapes some of these challenges. This approach, we admit, will clearly not be an adequate substitute for all BN applications, but may well be adequate for many classification tasks. It uses a simple three-tier network, with a central node (or nodes) for the classification, an input layer of variables that influence the probability of membership in each class, and an output layer of variables whose probabilities are influenced by the class. We did not invent this approach, but we show how the MI methods of previous chapters, particularly feature subset selection, can greatly simplify the task of specifying the BN topology by reducing the available features to a small and fairly uncorrelated set. This does two things. By reducing the number of features, we reduce the data requirements, perhaps to manageable levels. By finding uncorrelated features, the assumptions of the simple three-tier design are less likely to be violated.
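The three-tier idea can be sketched as a small Bayesian network computation: the input-layer variables set the prior over the class, and the output-layer variables are treated as conditionally independent given the class, so their likelihoods multiply into the posterior. The following sketch illustrates this; all variable names and probability values are invented for the example and are not taken from the chapter's data sets.

```python
# Sketch of a three-tier Bayesian network classifier (binary class).
# Tier 1 (input) sets the class prior; tier 3 (outputs) are assumed
# conditionally independent given the class, so their likelihoods
# multiply. All numbers below are made up for illustration only.

# P(class=1 | input), indexed by the input feature's value
prior_given_input = {"low": 0.2, "high": 0.7}

# P(output=1 | class) for two hypothetical output variables
p_out_given_class = {
    "symptom_a": {0: 0.1, 1: 0.8},
    "symptom_b": {0: 0.3, 1: 0.6},
}

def posterior(input_value, outputs):
    """Return P(class=1 | input, outputs) for binary output evidence."""
    scores = {}
    for c in (0, 1):
        p1 = prior_given_input[input_value]
        prior = p1 if c == 1 else 1.0 - p1
        likelihood = 1.0
        for name, val in outputs.items():
            p = p_out_given_class[name][c]
            likelihood *= p if val == 1 else 1.0 - p
        scores[c] = prior * likelihood
    return scores[1] / (scores[0] + scores[1])  # normalize

p = posterior("high", {"symptom_a": 1, "symptom_b": 1})
```

With the invented tables above, consistent evidence on both tiers pushes the posterior strongly toward one class, which is the behavior the three-tier design relies on.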

We illustrate the approach applied to four tasks. Two use the full three-tier design, and two others use reduced forms. The BN for each is compared to other MI methods previously used on these four data sets. We show how the conditional probabilities needed for the BN are estimated from the data. While none of the examples provided show the BN outperforming other MI methods, we offer it in the hope that the approach may demonstrate good performance in some future applications.
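Conditional probabilities of this kind are commonly estimated from data by relative-frequency counting, often with additive (Laplace) smoothing so that unseen class/value combinations do not produce zero probabilities. The chapter's exact estimation scheme may differ; the sketch below shows the standard smoothed-count approach on made-up data.

```python
from collections import Counter

def estimate_cpt(samples, smoothing=1.0):
    """Estimate P(output | class) from (class, output) pairs by
    relative frequency with additive (Laplace) smoothing."""
    classes = sorted({c for c, _ in samples})
    values = sorted({v for _, v in samples})
    counts = Counter(samples)                     # joint (class, value) counts
    class_totals = Counter(c for c, _ in samples)  # per-class counts
    cpt = {}
    for c in classes:
        denom = class_totals[c] + smoothing * len(values)
        cpt[c] = {v: (counts[(c, v)] + smoothing) / denom for v in values}
    return cpt

# Toy data: (class label, observed output value)
data = [(1, "pos"), (1, "pos"), (1, "neg"),
        (0, "neg"), (0, "neg"), (0, "pos")]
cpt = estimate_cpt(data)
```

Each row of the resulting table sums to one by construction, so the estimates can be plugged directly into the network's conditional probability tables.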


Keywords: Bayesian network · Single network template · Alzheimer’s speech data · Wolberg breast cancer data · Generalized acquisition of recurrent links (GNRL) · Colon cancer risk data · Decision trees · Duke University breast cancer data · Bank of SVMs · Estimating conditional probabilities



Abbreviations

  • Alzheimer’s disease
  • Artificial neural network
  • Area under the ROC curve
  • Breast imaging reporting and data system
  • Bayesian network
  • Decision tree
  • Years of education
  • Fine needle aspirate
  • False negative
  • False positive
  • Future risk (not to be confused with fitness ratio used in Chap. 1)
  • Genetic Algorithm-Support Vector Machine-Oracle hybrid
  • GeNeralized Acquisition of Recurrent Links
  • Machine intelligence
  • Measure of performance
  • Negative predictive value
  • Principal component regression
  • Positive predictive value
  • Receiver operating characteristic
  • Recurrent neural network
  • Support vector machine
  • True positive
  • True positive rate
  • True negative
  • True negative rate



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Walker H. Land Jr. (1)
  • J. David Schaffer (2)
  1. Binghamton University, Bowie, USA
  2. Binghamton University, Binghamton, USA
