Data Mining to Detect Common, Unique, and Polymorphic Simple Sequence Repeats

Kapil, Aditi; Jha, C. K.; Shanker, Asheesh

doi:10.1007/978-981-13-1562-6_7

Chapter
First Online: 14 October 2018

Abstract

Computational mining of biological data is now essential for discovering patterns in the large datasets generated by sequencing and other efforts. The extracted information can yield new insights into the organism under study. Simple sequence repeats (SSRs) are tandem repeats of 1–6 nucleotide motifs; they can be characterized in the wet laboratory or mined computationally. These repeats are useful in genetic mapping, breeding experiments, and phylogenetic studies, and can also be used to develop molecular markers. In view of their usefulness, various specialized biological databases of SSRs have been developed. This chapter presents a case study that used in silico mined nucleotide sequence data to detect putative polymorphic, common, and unique SSRs in chloroplast genomes of the genus Triticum. SSRs have previously been detected in several organisms; however, in silico detection of unique, common, and putative polymorphic SSRs is a recent development with several applications, including species identification.
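The general idea of classifying SSRs across genomes can be illustrated with a minimal Python sketch. It is not the pipeline used in the chapter. The repeat thresholds in MIN_REPEATS, the 5-nucleotide flank length, and the function names find_ssrs and classify are all assumptions made for illustration, and the sketch handles only perfect mono-, di-, and trinucleotide repeats. A locus is identified here by its motif and exact flanking sequences, which is a simplification; it is called unique if found in one genome, polymorphic if its repeat count differs between genomes, and common if it occurs in every genome with the same repeat count.

```python
import re
from collections import defaultdict

# Assumed minimum repeat counts per motif length (illustrative, not from the chapter).
MIN_REPEATS = {1: 8, 2: 4, 3: 3}


def find_ssrs(seq, flank=5):
    """Return perfect SSRs as (motif, repeat_count, left_flank, right_flank) tuples."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" is really "A").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            count = len(m.group(0)) // size
            left = seq[max(0, m.start() - flank):m.start()]
            right = seq[m.end():m.end() + flank]
            hits.append((motif, count, left, right))
    return hits


def classify(genomes, flank=5):
    """Group SSR loci into common, unique, and putative polymorphic sets.

    `genomes` maps a genome name to its nucleotide sequence. A locus is keyed
    by (motif, left_flank, right_flank); this exact-match key is an assumption.
    """
    loci = defaultdict(dict)  # locus key -> {genome name: repeat count}
    for name, seq in genomes.items():
        for motif, count, left, right in find_ssrs(seq, flank):
            loci[(motif, left, right)][name] = count

    result = {"common": [], "unique": [], "polymorphic": []}
    for locus, counts in loci.items():
        if len(counts) == 1:
            result["unique"].append(locus)        # seen in a single genome
        elif len(set(counts.values())) > 1:
            result["polymorphic"].append(locus)   # same locus, different lengths
        elif len(counts) == len(genomes):
            result["common"].append(locus)        # identical in every genome
    return result
```

Calling classify with a dictionary of genome names and sequences returns three lists of locus keys. Real analyses would typically also handle longer motifs, imperfect repeats, and inexact flank matches, which this sketch deliberately leaves out.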

Author information

Authors and Affiliations

Department of Bioscience and Biotechnology, Banasthali Vidyapith, Rajasthan, India
Aditi Kapil & Asheesh Shanker
Department of Computer Science, Banasthali Vidyapith, Rajasthan, India
C. K. Jha
Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar, India
Asheesh Shanker

Editor information

Editors and Affiliations

Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar, India
Asheesh Shanker

About this chapter

Cite this chapter

Kapil, A., Jha, C.K., Shanker, A. (2018). Data Mining to Detect Common, Unique, and Polymorphic Simple Sequence Repeats. In: Shanker, A. (eds) Bioinformatics: Sequences, Structures, Phylogeny. Springer, Singapore. https://doi.org/10.1007/978-981-13-1562-6_7

DOI: https://doi.org/10.1007/978-981-13-1562-6_7
Published: 14 October 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1561-9
Online ISBN: 978-981-13-1562-6
eBook Packages: Biomedical and Life Sciences (R0)
