Apply Ensemble of Lazy Learners to Biomedical Data Mining

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 134))

Abstract

Disease state can be predicted from biomedical data by machine learning. However, many available biomedical data sets are high-dimensional or imbalanced, which often leads to high false positive or false negative rates in prediction. Constructing a suitable machine learning model is therefore key to prediction performance.

This article describes a novel approach that applies PSO (particle swarm optimization) boosted ensembles of lazy learners to mining biomedical data. The learned model is evaluated on three published data sets using 10-fold cross-validation. The experimental results indicate that the proposed model can cope with data interference and performs better than other available rule learning methods.
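
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the lazy learners are k-nearest-neighbour classifiers on random feature subsets, that PSO tunes the members' voting weights on an inner validation split, and that the data are synthetic; every function name, parameter value, and the data generator are hypothetical.

```python
import random

def knn_vote(train, x, features, k=3):
    """Lazy learner: no model is fit; x is labelled by its k nearest training rows.
    Returns +1 for class 1 and -1 for class 0."""
    nearest = sorted(train, key=lambda row: sum((row[0][f] - x[f]) ** 2 for f in features))[:k]
    ones = sum(label for _, label in nearest)
    return 1 if 2 * ones > k else -1

def member_votes(train, samples, subsets):
    """One row of +1/-1 votes per sample, one column per ensemble member."""
    return [[knn_vote(train, x, s) for s in subsets] for x, _ in samples]

def weighted_accuracy(votes, samples, weights):
    """Accuracy of the weighted vote; a non-positive score is read as class 0."""
    hits = 0
    for row, (_, label) in zip(votes, samples):
        score = sum(w * v for w, v in zip(weights, row))
        hits += int((1 if score > 0 else 0) == label)
    return hits / len(samples)

def pso_weights(fitness, dim, rng, particles=8, iters=15):
    """Plain global-best PSO maximising fitness over weights in [0, 1] (assumed setup)."""
    pos = [[rng.random() for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p) for p in pos]
    g = max(range(particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i][:], fit
                if fit > g_fit:
                    g_pos, g_fit = pos[i][:], fit
    return g_pos

def cross_validate(data, subsets, rng, folds=10):
    """Mean test accuracy over `folds` folds; weights are tuned inside each training fold."""
    data = data[:]
    rng.shuffle(data)
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        fit, val = train[::2], train[1::2]  # inner split used only to tune the weights
        val_votes = member_votes(fit, val, subsets)
        weights = pso_weights(lambda w: weighted_accuracy(val_votes, val, w), len(subsets), rng)
        test_votes = member_votes(train, test, subsets)
        scores.append(weighted_accuracy(test_votes, test, weights))
    return sum(scores) / len(scores)

def make_data(rng, n=60, n_features=6):
    """Synthetic two-class data: features 0 and 1 are informative, the rest are noise."""
    data = []
    for i in range(n):
        label = i % 2
        x = [rng.gauss(3.0 * label if f < 2 else 0.0, 1.0) for f in range(n_features)]
        data.append((x, label))
    return data

def run(seed=0):
    rng = random.Random(seed)
    data = make_data(rng)
    subsets = [rng.sample(range(6), 3) for _ in range(5)]  # one feature subset per member
    return cross_validate(data, subsets, rng)

mean_acc = run()
print(f"mean 10-fold accuracy: {mean_acc:.2f}")
```

Tuning the weights on an inner split of each training fold keeps the held-out fold unseen during optimization, which is one common way to avoid an optimistic cross-validation estimate. Whether the paper does the same is not stated in the abstract.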

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pengfei, L., Wulei, T. (2011). Apply Ensemble of Lazy Learners to Biomedical Data Mining. In: Chen, R. (eds) Intelligent Computing and Information Science. ICICIS 2011. Communications in Computer and Information Science, vol 134. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18129-0_24

  • DOI: https://doi.org/10.1007/978-3-642-18129-0_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-18128-3

  • Online ISBN: 978-3-642-18129-0

  • eBook Packages: Computer Science, Computer Science (R0)