
Machine Intelligence Mixture of Experts and Bayesian Networks

  • Walker H. Land Jr.
  • J. David Schaffer
Chapter

Abstract

In Chap. 6, we introduced Bayesian networks and some of the traditional methods for designing them from data, and pointed out some of the challenges with these approaches. In this chapter, we introduce a much simpler approach that escapes some of these challenges. This approach, we admit, will clearly not be an adequate substitute for all BN applications, but may well be adequate for many classification tasks. It uses a simple three-tier network, with the central node (or nodes) for the classification, an input layer for variables that influence the probability of membership in each class, and an output layer for variables whose probability is influenced by the class. We did not invent this approach, but we show how the MI methods in previous chapters, particularly feature subset selection, can greatly simplify the task of specifying the BN topology by reducing the available features to a small and fairly uncorrelated set. This does two things. By reducing the number of features, we reduce the data requirements—maybe to manageable levels. By finding uncorrelated features, the assumptions of the simple three-tier design are less likely to be violated.
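The inference step implied by this three-tier topology can be sketched briefly. With the input-layer variables as parents of the class node and the output-layer variables as its children, the posterior over classes is proportional to P(class | inputs) times the product of P(output | class) over the observed outputs. The sketch below illustrates this; all variable names and probability values are invented for illustration and are not from the chapter's data sets.

```python
def posterior(class_values, p_class_given_inputs, p_output_given_class,
              observed_outputs):
    """Return P(class | inputs, outputs) for the three-tier design.

    p_class_given_inputs: dict class -> P(class | observed input configuration)
    p_output_given_class: dict (output_var, value, class) -> probability
    observed_outputs: dict output_var -> observed value
    """
    scores = {}
    for c in class_values:
        # Start from the input tier's contribution...
        s = p_class_given_inputs[c]
        # ...then multiply in each observed output-tier variable.
        for var, val in observed_outputs.items():
            s *= p_output_given_class[(var, val, c)]
        scores[c] = s
    # Normalize so the posterior sums to one.
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

# Toy two-class example (hypothetical numbers)
p_c = {"benign": 0.7, "malignant": 0.3}        # from the input tier
p_o = {
    ("marker", "high", "benign"): 0.2,
    ("marker", "high", "malignant"): 0.8,
}
post = posterior(["benign", "malignant"], p_c, p_o, {"marker": "high"})
# 0.3 * 0.8 = 0.24 vs. 0.7 * 0.2 = 0.14, so "malignant" wins after normalizing
```

Note that this is exactly the computation a general BN engine would perform for this restricted topology; the simplification is that no general-purpose inference machinery is needed.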

We illustrate the approach applied to four tasks. Two use the full three-tier design, and two others use reduced forms. The BN for each is compared to other MI methods previously used on these four data sets. We show how the conditional probabilities needed for the BN are estimated from the data. While none of the examples provided show the BN outperforming other MI methods, we offer it in the hope that the approach may demonstrate good performance in some future applications.
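The conditional probability tables themselves can be estimated from data by simple relative-frequency counting; a common refinement is Laplace (add-alpha) smoothing so that combinations unseen in a small data set do not get zero probability. The sketch below shows one way to do this for a single variable with one parent; the data and names are invented, not taken from the chapter's four data sets.

```python
from collections import Counter

def estimate_cpt(pairs, alpha=1.0):
    """Estimate P(value | parent) from (value, parent_value) observations
    by relative frequency with Laplace (add-alpha) smoothing."""
    pair_counts = Counter(pairs)
    parent_counts = Counter(p for _, p in pairs)
    values = sorted({v for v, _ in pairs})
    return {
        (v, p): (pair_counts[(v, p)] + alpha)
                / (parent_counts[p] + alpha * len(values))
        for p in parent_counts
        for v in values
    }

# Toy data: a test result conditioned on diagnosis (hypothetical)
pairs = [("pos", "malignant"), ("pos", "malignant"), ("neg", "malignant"),
         ("neg", "benign"), ("neg", "benign")]
cpt = estimate_cpt(pairs)
# P(pos | malignant) = (2 + 1) / (3 + 2) = 0.6 with alpha = 1
```

With alpha set to zero this reduces to the plain maximum-likelihood estimate; the smoothing matters most when, as in the chapter, feature selection has already shrunk the data requirements but some cells of the table remain sparsely populated.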

Keywords

Bayesian network · Single network template · Alzheimer's speech data · Wolberg breast cancer data · GeNeralized Acquisition of Recurrent Links (GNARL) · Colon cancer risk data · Decision trees · Duke University breast cancer data · Bank of SVMs · Estimating conditional probabilities

Abbreviations

AD: Alzheimer's disease
ANN: Artificial neural network
AUC: Area under the (ROC) curve
BI-RADS: Breast Imaging Reporting and Data System
BN: Bayesian network
DT: Decision tree
Dx: Diagnosis
edu: Years of education
FNA: Fine needle aspirate
FN: False negative
FP: False positive
FR: Future risk (not to be confused with the fitness ratio used in Chap. 1)
GA-SVM-Oracle: Genetic Algorithm-Support Vector Machine-Oracle hybrid
GNARL: GeNeralized Acquisition of Recurrent Links
MI: Machine intelligence
MOP: Measure of performance
NPV: Negative predictive value
PCR: Principal component regression
PPV: Positive predictive value
ROC: Receiver operating characteristic
RNN: Recurrent neural network
SVM: Support vector machine
TP: True positive
TPR: True positive rate
TN: True negative
TNR: True negative rate


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Walker H. Land Jr. (1)
  • J. David Schaffer (2)
  1. Binghamton University, Bowie, USA
  2. Binghamton University, Binghamton, USA
