Tools and Methods in Analysis of Complex Sequences

  • Noor Ahmad Shaik
  • Babajan Banaganapalli
  • Ramu Elango
  • Jumana Y. Al-Aama


Genome sequencing is an important molecular technique used in diverse types of biological investigations. Since Sanger’s sequencing of a bacteriophage genome was completed, there has been a continuous search for new sequencing methods that offer faster, more reliable, and more extensive coverage of genomic sequences. One such recent innovation is the NGS (next-generation sequencing) platforms, which use different molecular techniques to sequence the entire genome of any organism. NGS technology is customized for different applications, including WGS (whole genome sequencing), WES (whole exome sequencing), targeted resequencing, de novo assembly sequencing, and transcriptome sequencing, at the DNA or RNA level. NGS technologies are widely used to analyze quantitative differences in gene expression, to discover transcribed miRNA sequences, and to screen for mutations.

Illumina SBS (sequencing by synthesis) is one of the most advanced gene sequencing methodologies. In this method, denatured DNA hybridizes with the oligonucleotides in the flow cell lanes, and the complementary strand is then synthesized base by base. Various software tools are available to assess the quality of the sequences produced. In WES, the whole exome is sequenced, and the data are taken for primary analysis, which involves three steps: sequence alignment, postprocessing, and variant analysis. Secondary analysis is then performed, in which the generated variants are annotated and prioritized.
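As an illustration of the kind of quality assessment mentioned above, the following minimal sketch decodes the per-base quality string of a FASTQ read into Phred scores, where a score Q corresponds to an error probability of 10^(-Q/10). It assumes the common Phred+33 ASCII encoding; the function names and the mean-quality cutoff of 20 are illustrative choices, not taken from the chapter or from any particular tool.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def mean_error_probability(quality_string, offset=33):
    """Average per-base error probability, using P = 10 ** (-Q / 10)."""
    scores = phred_scores(quality_string, offset)
    if not scores:
        raise ValueError("empty quality string")
    return sum(10 ** (-q / 10) for q in scores) / len(scores)


def passes_quality(quality_string, min_mean_q=20, offset=33):
    """Return True if the mean Phred score reaches the (assumed) cutoff."""
    scores = phred_scores(quality_string, offset)
    if not scores:
        return False  # an empty read cannot pass
    return sum(scores) / len(scores) >= min_mean_q


# Example: 'I' encodes Q40 and '!' encodes Q0 under Phred+33.
high_quality_read = "IIIIIIII"
low_quality_read = "!!!!!!!!"
print(passes_quality(high_quality_read), passes_quality(low_quality_read))
```

Dedicated quality-control software reports many more metrics than this; the sketch only shows how a quality string maps to error probabilities.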

Finally, these variants are filtered to identify any causal mutation in the human genome that is associated with disease or lethality.
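The filtering step can be pictured with a small sketch. The example below assumes that each annotated variant carries a gene name, a predicted consequence, a population allele frequency, and a call quality. The `Variant` record, the set of consequence labels, and the thresholds (allele frequency at most 0.01, quality at least 30) are hypothetical values chosen for illustration, not the chapter's own criteria.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    consequence: str    # predicted effect, e.g. "missense" (hypothetical labels)
    allele_freq: float  # population allele frequency from an annotation source
    qual: float         # variant call quality


# Consequence classes treated as potentially damaging (an assumption).
DAMAGING = {"missense", "nonsense", "frameshift", "splice_site"}


def filter_candidates(variants, max_af=0.01, min_qual=30.0):
    """Keep well-supported, rare variants with a potentially damaging effect."""
    return [
        v for v in variants
        if v.qual >= min_qual
        and v.allele_freq <= max_af
        and v.consequence in DAMAGING
    ]


variants = [
    Variant("GENE_A", "missense", 0.0005, 80.0),   # rare, damaging, good call
    Variant("GENE_B", "synonymous", 0.0002, 90.0), # rare but not damaging
    Variant("GENE_C", "frameshift", 0.2500, 70.0), # damaging but common
    Variant("GENE_D", "nonsense", 0.0010, 12.0),   # low call quality
]
candidates = filter_candidates(variants)
print([v.gene for v in candidates])
```

In practice the thresholds depend on the disease model and the reference population, so they are exposed as parameters rather than fixed.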


Keywords: Sequencing, WGS, WES, NGS, Variants



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Noor Ahmad Shaik (1), corresponding author
  • Babajan Banaganapalli (2)
  • Ramu Elango (2)
  • Jumana Y. Al-Aama (1)

  1. Department of Genetic Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
  2. Princess Al-Jawhara Center of Excellence in Research of Hereditary Disorders, Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
