Data Scientist: A Systematic Review of the Literature

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 895)

Abstract

Commercial service and production activities have accumulated large volumes of data over the years, which creates today's need for a professional who can interpret data and generate information that leads to valuable results and conclusions. This article presents a systematic review of the literature whose main goal was to identify the work and career profile of the so-called Data Scientist. The review finds that, because this is a new field of work, no concretely defined profiles exist yet. However, the relevant knowledge areas are defined, as are the characteristics a data scientist is expected to have, along with some technologies that can support the work of these new professionals in the IT (Information Technology) area.

Author information

Corresponding author

Correspondence to Marcos Antonio Espinoza Mina.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Espinoza Mina, M.A., Gallegos Barzola, D. (2019). Data Scientist: A Systematic Review of the Literature. In: Botto-Tobar, M., Pizarro, G., Zúñiga-Prieto, M., D’Armas, M., Zúñiga Sánchez, M. (eds) Technology Trends. CITT 2018. Communications in Computer and Information Science, vol 895. Springer, Cham. https://doi.org/10.1007/978-3-030-05532-5_35

  • DOI: https://doi.org/10.1007/978-3-030-05532-5_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05531-8

  • Online ISBN: 978-3-030-05532-5

  • eBook Packages: Computer Science, Computer Science (R0)
