Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 222))

Abstract

We investigate Chargaff’s second parity rule and its extensions in the human genome, and evaluate their statistical significance. This phenomenon has previously been investigated in the reference human genome, but that sequence is not a proper sample of the human population. The 1000 Genomes Project provides next-generation sequencing data from different human individuals, constituting a sample of 1092 individuals. We explore and analyze this new type of data to evaluate the symmetry phenomenon both globally and for individual pairs of symmetric words.
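To make the notion of a symmetric word pair concrete, the following minimal sketch pairs each word of length k with its reverse complement and counts both on a single strand. It assumes that a symmetric pair means a word and its reverse complement, which is the usual reading of the extended second parity rule; the function names are hypothetical and this is not the authors' implementation.

```python
from collections import Counter

# Translation table for the DNA complement (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Return the reverse complement of a DNA word, e.g. AACG -> CGTT."""
    return word.translate(_COMPLEMENT)[::-1]


def count_words(sequence: str, k: int) -> Counter:
    """Count all overlapping words of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def symmetric_pair_counts(sequence: str, k: int) -> dict:
    """Map each symmetric pair to (count of word, count of its reverse complement).

    Each pair is keyed by the lexicographically smaller of its two words.
    A word that equals its own reverse complement is paired with itself.
    """
    counts = count_words(sequence, k)
    pairs = {}
    for word in counts:
        key = min(word, reverse_complement(word))
        pairs[key] = (counts[key], counts[reverse_complement(key)])
    return pairs
```

Under the second parity rule, the two counts in each pair are expected to be close when the sequence is long; on short sequences large deviations are unremarkable.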

Our methodology is based on measurements, traditional statistical tests, and statistical equivalence tests applied to different parameters (e.g., mean, correlation coefficient).
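As an illustration of the two kinds of parameters mentioned, the sketch below computes a Pearson correlation coefficient between paired counts and runs a simple equivalence check on the mean of paired differences. The equivalence check is a two one-sided tests (TOST) procedure with a normal approximation, chosen here as an assumption; the abstract does not specify which equivalence tests, margins, or significance levels the authors used, and the function names and the `margin` and `alpha` parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples.

    Raises ZeroDivisionError if either sample has zero variance.
    """
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mean_equivalent(diffs, margin, alpha=0.05):
    """TOST-style equivalence check with a normal approximation (an assumption).

    Declares equivalence when the (1 - 2*alpha) confidence interval for the
    mean of `diffs` lies entirely inside (-margin, +margin).
    """
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    z = NormalDist().inv_cdf(1 - alpha)
    centre = mean(diffs)
    lower = centre - z * standard_error
    upper = centre + z * standard_error
    return -margin < lower and upper < margin
```

Unlike a traditional test, whose null hypothesis is "no difference", the equivalence check treats "difference larger than the margin" as the null, so a very large sample does not by itself lead to rejecting symmetry for negligible deviations.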

We find that the global symmetry phenomenon is significant for word lengths smaller than 8. However, even when the global symmetry is significant, some symmetric word pairs show only a small or non-positive correlation rather than a significant positive one.

Author information

Corresponding author

Correspondence to Vera Afreixo.

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Afreixo, V., Rodrigues, J.M.O.S., Garcia, S.P. (2013). Analysis of Word Symmetries in Human Genomes Using Next-Generation Sequencing Data. In: Mohamad, M., Nanni, L., Rocha, M., Fdez-Riverola, F. (eds) 7th International Conference on Practical Applications of Computational Biology & Bioinformatics. Advances in Intelligent Systems and Computing, vol 222. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00578-2_2

  • DOI: https://doi.org/10.1007/978-3-319-00578-2_2

  • Publisher Name: Springer, Heidelberg

  • Print ISBN: 978-3-319-00577-5

  • Online ISBN: 978-3-319-00578-2

  • eBook Packages: Engineering, Engineering (R0)
