Crop Disease Protection Using Parallel Machine Learning Approaches

Classification in BioApps

Part of the book series: Lecture Notes in Computational Vision and Biomechanics ((LNCVB,volume 26))

Abstract

Crop diseases have for many years been the most important biological hazard to the sustainable development of agricultural production. Every year, 42% of the global agricultural yield is destroyed by disease. Bioinformatics techniques provide efficient methods for analyzing and interpreting raw biological data, which helps in studying the effect of a pathogen on a crop. Microarray gene expression data represent the expression levels of the genes of a cell (or organism) maintained in a particular environment. Hence, significant genes can be predicted and pathogen–host interactions can be studied using gene expression data. Different machine learning techniques can be applied to extract the useful information carried by the candidate genes. The approach proposed in this chapter consists of preprocessing the gene expression data, gene selection or feature extraction using a parallel approach, and classification. Five methods are analyzed for the extraction of candidate genes with biological significance for rice-related diseases: a support vector machine with recursive feature elimination (SVM-RFE), minimum redundancy maximum relevance (mRMR), principal component analysis (PCA), successive feature selection (SFS) and independent component analysis (ICA). To deal with the computational complexity and the large volume of data, a combination of general-purpose graphics processing unit (GPGPU) computing and MapReduce programming on the Apache Hadoop framework is proposed. The experimental results show improved time efficiency in feature extraction and classification.
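The idea of selecting genes with a MapReduce-style computation can be illustrated with a small sketch. It is not the chapter's implementation: it runs in a single Python process rather than on Hadoop or a GPU, and it swaps in a simple signal-to-noise filter score in place of SVM-RFE or mRMR. The function names (`map_partition`, `reduce_partials`, `signal_to_noise`, `rank_genes`), the two-class labels 0 and 1, and the toy data are all assumptions made for illustration. The map step turns each partition of samples into per-gene partial sums, and the reduce step merges those sums so that the final score needs only one pass over the data.

```python
from collections import defaultdict
from functools import reduce
from math import sqrt


def map_partition(rows):
    """Map step: summarise one partition of samples.

    rows is a list of (label, expression_values) pairs, label in {0, 1}.
    Returns {(gene_index, label): (count, sum, sum_of_squares)}.
    """
    partial = defaultdict(lambda: (0, 0.0, 0.0))
    for label, values in rows:
        for gene, x in enumerate(values):
            n, s, ss = partial[(gene, label)]
            partial[(gene, label)] = (n + 1, s + x, ss + x * x)
    return dict(partial)


def reduce_partials(a, b):
    """Reduce step: merge two partial-sum dictionaries key by key."""
    merged = dict(a)
    for key, (n, s, ss) in b.items():
        n0, s0, ss0 = merged.get(key, (0, 0.0, 0.0))
        merged[key] = (n0 + n, s0 + s, ss0 + ss)
    return merged


def signal_to_noise(stats, n_genes):
    """Score each gene as |mean1 - mean0| / (std0 + std1).

    Assumes every gene has at least one sample in each class.
    """
    scores = []
    for gene in range(n_genes):
        mean_std = []
        for label in (0, 1):
            n, s, ss = stats[(gene, label)]
            mean = s / n
            var = max(ss / n - mean * mean, 0.0)  # population variance
            mean_std.append((mean, sqrt(var)))
        (m0, sd0), (m1, sd1) = mean_std
        # Small constant avoids division by zero for constant genes.
        scores.append(abs(m1 - m0) / (sd0 + sd1 + 1e-9))
    return scores


def rank_genes(partitions, n_genes, top_k):
    """Run map over partitions, reduce the results, return top_k gene indices."""
    stats = reduce(reduce_partials, map(map_partition, partitions))
    scores = signal_to_noise(stats, n_genes)
    order = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return order[:top_k]


# Toy data (hypothetical): two partitions, three genes, labels 0 = healthy, 1 = diseased.
partitions = [
    [(0, [1.0, 5.0, 2.0]), (1, [1.1, 9.0, 2.1])],
    [(0, [0.9, 5.2, 1.9]), (1, [1.0, 9.1, 2.0])],
]
print(rank_genes(partitions, n_genes=3, top_k=1))  # → [1]
```

Because the partial sums are merged by plain addition, the partitions can be processed in any order or on separate workers, which is the property a MapReduce job relies on. In this toy data only gene 1 separates the two classes clearly, so it is ranked first.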

Acknowledgements

This research is an outcome of a University Grants Commission project. The work was carried out at the PSG-Nokia Centre for Big Data Analytics. The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.

Author information

Corresponding author

Correspondence to G. Sudha Sadasivam.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Sadasivam, G.S., Madhesu, S., Mumthas, O.Y., Dharani, K. (2018). Crop Disease Protection Using Parallel Machine Learning Approaches. In: Dey, N., Ashour, A., Borra, S. (eds) Classification in BioApps. Lecture Notes in Computational Vision and Biomechanics, vol 26. Springer, Cham. https://doi.org/10.1007/978-3-319-65981-7_9

  • DOI: https://doi.org/10.1007/978-3-319-65981-7_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65980-0

  • Online ISBN: 978-3-319-65981-7

  • eBook Packages: Engineering, Engineering (R0)
