
Genetic Variance Study in Human on the Basis of Skin/Eye/Hair Pigmentation Using Apache Spark

  • Conference paper
International Conference on Innovative Computing and Communications

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1087)

Abstract

Heredity and variation are the basis of genetics. Humans vary visibly in skin, hair, and eye color, and this phenotypic diversity originates from variation at the genetic level. Specific populations across the globe have been observed to share similar shades of color, and the pigment melanin has been reported to be responsible for skin/eye/hair color. Six major genes have been identified as responsible for variation in coloration: HERC2, OCA2, TYR, MC1R, SLC45A2, and SLC24A2. In this paper, Apache Spark and Python, running on an Ubuntu virtual machine, are used to analyze variation in the genomic regions associated with these genes. The study covers populations categorized into three groups. The first group, the ‘sample population’, includes five subpopulations: Mexican, Han Chinese, Yoruba, British, and Japanese. People from these populations can be easily distinguished on the basis of skin/eye/hair color. The second group includes five super populations from different continents, viz. African, American, European, East Asian, and South Asian, which provides an intercontinental analysis. The third group, the ‘South-Asian population’, includes five subpopulations from the South Asian subcontinent, viz. Punjabi, Gujarati, Tamil, Telugu, and Bengali, for the study of geographically closer populations. These populations are expected to show some degree of variation in the genomic regions of these six genes. Our results indicate that the three population groups show variation in different genes. The first group showed the maximum diversity in the TYR gene, followed by SLC45A2. SLC45A2 was the most diverse gene in the continental (super population) group, whereas the third group showed similar diversity across all six genes. This implies that specific populations show diversity in specific genes, and it also demonstrates that Apache Spark has great potential for assessing nucleotide diversity.
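To make the measure named in the abstract concrete, the following is a minimal Python sketch of nucleotide diversity computed from VCF-style genotype strings. It is not the authors' Spark pipeline, which the abstract does not describe in detail. The function names `site_pi` and `nucleotide_diversity`, the input layout (one list of genotype strings per variant site), and the choice to treat every non-reference allele as a single alternate allele are assumptions made for illustration only.

```python
from typing import Iterable, List, Sequence


def site_pi(genotypes: Iterable[str]) -> float:
    """Pairwise diversity at one site (hypothetical helper).

    Each genotype is a VCF-style string such as "0|1" or "0/0".
    Assumption: every non-"0" allele is lumped together as "alternate".
    """
    alleles: List[str] = []
    for gt in genotypes:
        # Accept both phased ("|") and unphased ("/") separators.
        for allele in gt.replace("/", "|").split("|"):
            if allele != ".":  # "." marks a missing call in VCF
                alleles.append(allele)
    n = len(alleles)
    if n < 2:
        return 0.0  # no pair of haplotypes to compare
    k = sum(1 for allele in alleles if allele != "0")
    # k * (n - k) differing pairs out of n * (n - 1) / 2 possible pairs.
    return 2.0 * k * (n - k) / (n * (n - 1))


def nucleotide_diversity(records: Iterable[Sequence[str]], region_length: int) -> float:
    """Sum per-site diversity over variant sites, divided by region length in bases."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return sum(site_pi(record) for record in records) / region_length


# Two variant sites, two diploid samples, in a hypothetical 10-base region.
example_records = [["0|1", "0|0"], ["1|1", "0|0"]]
example_pi = nucleotide_diversity(example_records, region_length=10)
```

Because each site contributes independently, the per-site step could plausibly be applied to every record of a distributed dataset and the results summed, which is the kind of workload Spark is suited to; how the study itself partitions the data is not stated in the abstract.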



Author information

Corresponding author

Correspondence to Shivani Chandra.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Saxena, A., Chandra, S., Grover, A., Anand, L., Jauhari, S. (2020). Genetic Variance Study in Human on the Basis of Skin/Eye/Hair Pigmentation Using Apache Spark. In: Khanna, A., Gupta, D., Bhattacharyya, S., Snasel, V., Platos, J., Hassanien, A. (eds) International Conference on Innovative Computing and Communications. Advances in Intelligent Systems and Computing, vol 1087. Springer, Singapore. https://doi.org/10.1007/978-981-15-1286-5_31
