Introduction to Applied Data Science

  • Thilo Stadelmann
  • Martin Braschler
  • Kurt Stockinger
Chapter

Abstract

What is data science? Attempts to define it can be made in one (prolonged) sentence, while it may take a whole book to demonstrate the meaning of this definition. This book introduces data science in an applied setting, by first giving a coherent overview of the background in Part I, and then presenting the nuts and bolts of the discipline by means of diverse use cases in Part II; finally, specific and insightful lessons learned are distilled in Part III. This chapter introduces the book and provides an answer to the following questions: What is data science? Where does it come from? What are its connections to big data and other mega trends? We claim that multidisciplinary roots and a focus on creating value lead to a discipline in the making that is inherently an interdisciplinary, applied science.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Thilo Stadelmann (1)
  • Martin Braschler (1)
  • Kurt Stockinger (1)

  1. ZHAW Zurich University of Applied Sciences, Winterthur, Switzerland
