Observer-invariant histopathology using genetics-based machine learning

Natural Computing

Abstract

Prostate cancer accounts for one-third of noncutaneous cancers diagnosed in US men and is a leading cause of cancer-related death. Advances in Fourier transform infrared spectroscopic imaging now provide very large data sets describing both the structural and local chemical properties of cells within prostate tissue. Uniting spectroscopic imaging data and computer-aided diagnosis (CADx), our long-term goal is to provide a new approach to pathology by automating the recognition of cancer in complex tissue. The first step toward creating such CADx tools is a mechanism for automatically learning to classify tissue types, a key step in the diagnosis process. Here we demonstrate that genetics-based machine learning (GBML) can be used to approach such a problem. Doing so, however, requires efficient and scalable GBML implementations that can process very large data sets. In this paper, we propose and validate such a GBML technique, \({\tt NAX}\), based on an incremental genetics-based rule learner. \({\tt NAX}\) exploits massive parallelism via the message passing interface (MPI) and efficient rule matching using hardware-implemented operations. Results demonstrate that \({\tt NAX}\) is capable of performing prostate tissue classification efficiently, making a compelling case for using GBML implementations as efficient and powerful tools for biomedical image processing.

Acknowledgments

We would like to thank David E. Goldberg for his continual support and encouragement, allowing us to have access to the IlliGAL resources. Thanks also to Kumara Sastry for hallway discussions and to the Automated Learning Group and the Data-Intensive Technologies and Applications at the National Center for Supercomputing Applications for hosting this joint collaboration.

This work was sponsored by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant FA9550-06-1-0370, the National Science Foundation under grant IIS-02-09199, and the National Institutes of Health. The US Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon.

The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Office of Scientific Research, the National Science Foundation, or the US Government.

Rohit Bhargava would like to acknowledge collaborators over the years, especially Dr. Stephen M. Hewitt and Dr. Ira W. Levin of the National Institutes of Health, for numerous useful discussions and guidance. Funding for this work was provided in part by the University of Illinois Research Board and by the Department of Defense Prostate Cancer Research Program. This work was also funded in part by the National Center for Supercomputing Applications and the University of Illinois, under the auspices of the NCSA/UIUC faculty fellows program.

Author information

Corresponding author

Correspondence to Xavier Llorà.

About this article

Cite this article

Llorà, X., Priya, A. & Bhargava, R. Observer-invariant histopathology using genetics-based machine learning. Nat Comput 8, 101–120 (2009). https://doi.org/10.1007/s11047-007-9056-6

  • DOI: https://doi.org/10.1007/s11047-007-9056-6
