Abstract
Commercial activity in services and production has accumulated large amounts of data over the years, creating today's need for a professional who can interpret data and generate information that yields valuable results and conclusions. This article presents a systematic review of the literature whose main goal was to identify the work and career profile of the so-called Data Scientist. The review finds that, because this is a new field of work, there are no concretely defined profiles; the relevant knowledge areas are defined, however, as are the characteristics a Data Scientist needs to have, along with some technologies that can support the work these new professionals do in the Information Technology (IT) area.
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Espinoza Mina, M.A., Gallegos Barzola, D. (2019). Data Scientist: A Systematic Review of the Literature. In: Botto-Tobar, M., Pizarro, G., Zúñiga-Prieto, M., D’Armas, M., Zúñiga Sánchez, M. (eds) Technology Trends. CITT 2018. Communications in Computer and Information Science, vol 895. Springer, Cham. https://doi.org/10.1007/978-3-030-05532-5_35
DOI: https://doi.org/10.1007/978-3-030-05532-5_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05531-8
Online ISBN: 978-3-030-05532-5
eBook Packages: Computer Science (R0)