Data Mining to Detect Common, Unique, and Polymorphic Simple Sequence Repeats


Abstract

Computational data mining of biological data is essential for discovering patterns in the large datasets generated by sequencing and other efforts. The extracted information can yield new insights into the organism under study. Simple sequence repeats (SSRs) are tandem repeats of motifs 1–6 nucleotides long; they can be characterized in the wet laboratory or mined computationally. These repeats are useful in genetic mapping, breeding experiments, and phylogenetic studies, and can be used to develop molecular markers. Because of this usefulness, several specialized SSR databases have been developed. This chapter presents a case study that uses in silico mined nucleotide sequence data to detect putative polymorphic, common, and unique SSRs in chloroplast genomes of the genus Triticum. SSRs have previously been detected in many organisms; however, in silico detection of unique, common, and putative polymorphic SSRs is a recent development with several applications, including species identification.
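
The idea of mining SSRs and then sorting them into common, unique, and putative polymorphic classes can be illustrated with a minimal Python sketch. It is not the chapter's pipeline: the abstract does not give the tool, the thresholds, or the classification criteria, so everything below is an assumption made for illustration. In particular, the minimum repeat counts in MIN_REPEATS, the function names find_ssrs and classify_ssrs, and the toy sequences are hypothetical, and SSRs are compared by motif only (a fuller analysis would likely also compare flanking sequences to confirm that two repeats are the same locus).

```python
import re
from collections import defaultdict

# Assumed minimum repeat counts per motif length (1-6 nt).
# Illustrative thresholds only; they are not taken from the chapter.
MIN_REPEATS = {1: 8, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return (motif, repeat_count, start) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit
            # (e.g. "ATAT" is really the dinucleotide "AT").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((motif, len(m.group(0)) // k, m.start()))
    return hits


def classify_ssrs(genomes):
    """Classify motifs across genomes (dict: name -> sequence).

    Assumed definitions, for illustration only:
      common      - motif found in every genome
      unique      - motif found in exactly one genome
      polymorphic - motif found in two or more genomes whose sets of
                    repeat counts are not all identical
    """
    # motif -> genome name -> set of repeat counts observed
    table = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        for motif, count, _start in find_ssrs(seq):
            table[motif][name].add(count)

    result = {"common": set(), "unique": set(), "polymorphic": set()}
    for motif, per_genome in table.items():
        if len(per_genome) == 1:
            result["unique"].add(motif)
            continue
        if len(per_genome) == len(genomes):
            result["common"].add(motif)
        if len({frozenset(c) for c in per_genome.values()}) > 1:
            result["polymorphic"].add(motif)
    return result


# Toy sequences (hypothetical, not real Triticum chloroplast data).
example = {
    "genome_a": "GGC" + "AT" * 6 + "CCG" + "A" * 9 + "TTG",
    "genome_b": "GGC" + "AT" * 8 + "CCG" + "AGC" * 4 + "T",
}
classes = classify_ssrs(example)
```

In this toy input the dinucleotide repeat (AT)n occurs in both sequences with different repeat counts, so it falls into both the common and the putative polymorphic classes, while the mononucleotide and trinucleotide repeats each occur in only one sequence and are reported as unique.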




© 2018 Springer Nature Singapore Pte Ltd.

Cite this chapter

Kapil, A., Jha, C.K., Shanker, A. (2018). Data Mining to Detect Common, Unique, and Polymorphic Simple Sequence Repeats. In: Shanker, A. (eds) Bioinformatics: Sequences, Structures, Phylogeny. Springer, Singapore. https://doi.org/10.1007/978-981-13-1562-6_7
