Genomic signal processing of microarrays for cancer gene expression and identification using cluster-fuzzy adaptive networking

Abstract

Genomic signal processing (GSP) is an active research area that applies established digital signal processing techniques to extract information from genomic sequences. Its fundamental objectives are the recognition and identification of biological signals and the analysis of sequences. Microarray data are typically used in GSP: microarray studies identify the genes that cause a specific disease and help in predicting, diagnosing and characterizing diseases. Microarray technology is powerful because it handles information for a very large number of genes at once. Recent research shows that microarray processing is helpful for the classification of cancer genes, and various machine learning and artificial intelligence techniques are also used to distinguish tumours and cancer cells. In this study, genomic signal processing is carried out using cluster-fuzzy adaptive networking techniques. The main purpose of this work is to evaluate microarray data sets for recognizing cancer genes. The microarray data set is generated from leukaemia, colon, prostate, breast cancer and lymphoma data. First, noise in the microarray data is filtered and smoothed with a Kalman filter; an optimal clustering technique, grid density-based clustering, is then applied to cluster the microarray data sets. The clustered microarray data are classified by an adaptive neuro-fuzzy inference system (ANFIS) for the gene sequencing process of cancer identification. The adaptive network systems are developed on autonomous networking concepts to change the static system into a dynamic one. Clustering efficiency is evaluated with the cluster validity indexes partition entropy, partition coefficient, and Xie and Beni. The presented ANFIS is assessed in terms of precision, accuracy, recall, sensitivity, F-score and specificity.
The proposed methodology is mathematically designed and implemented on the MATLAB platform and executed over several test runs. The clustering and classification performance of the proposed techniques is compared with existing strategies, namely fuzzy c-means with ANN and density-based clustering with ANN, respectively. The results demonstrate that the proposed method classifies and identifies microarray cancer genes through genomic signal processing more effectively than the conventional methods.
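
The evaluation measures named in the abstract have standard textbook definitions, which can be sketched as follows. This is an illustrative Python sketch, not the authors' MATLAB implementation: the function names, the membership-matrix layout `u[i][k]` (cluster i, sample k) and the fuzzifier default `m = 2` are assumptions made for this example. Note that recall and sensitivity are the same quantity.

```python
import math


def partition_coefficient(u):
    """Mean squared membership; 1.0 for a crisp partition, 1/c for a fully fuzzy one."""
    n = len(u[0])  # number of samples (columns of u)
    return sum(x * x for row in u for x in row) / n


def partition_entropy(u):
    """Mean membership entropy; 0 for a crisp partition, log(c) for a fully fuzzy one."""
    n = len(u[0])
    return -sum(x * math.log(x) for row in u for x in row if x > 0) / n


def xie_beni(u, data, centers, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by n times the minimum
    squared distance between cluster centers (lower is better).
    The fuzzifier m = 2 is an assumed default."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    n, c = len(data), len(centers)
    compactness = sum(
        (u[i][k] ** m) * sqdist(data[k], centers[i])
        for i in range(c)
        for k in range(n)
    )
    separation = min(
        sqdist(centers[i], centers[j])
        for i in range(c)
        for j in range(i + 1, c)
    )
    return compactness / (n * separation)


def classification_scores(tp, fp, tn, fn):
    """Binary classification measures from confusion-matrix counts.
    Assumes every denominator is non-zero."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # identical to sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score,
    }
```

For a crisp two-cluster partition such as `u = [[1, 0], [0, 1]]`, the partition coefficient is 1 and the partition entropy is 0; memberships spread evenly over both clusters give 0.5 and log 2 instead, which is why a higher partition coefficient and a lower partition entropy indicate a sharper clustering.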

Funding

No funding was provided for the preparation of this manuscript.

Author information

Corresponding author

Correspondence to Purnendu Mishra.

Ethics declarations

Conflict of interest

Authors Purnendu Mishra and Dr. Nilamani Bhoi declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by V. Loia.

About this article

Cite this article

Mishra, P., Bhoi, N. Genomic signal processing of microarrays for cancer gene expression and identification using cluster-fuzzy adaptive networking. Soft Comput (2020). https://doi.org/10.1007/s00500-020-05068-3

Keywords

  • Genomic signal processing
  • Microarray data
  • Kalman filter
  • Grid density-based clustering
  • Fuzzy inference system and partition coefficient