Data Driven, pp 73-86

Principles of Data Science: Primer

  • Jeremy David Curuksu
Chapter
Part of the Management for Professionals book series (MANAGPROF)

Abstract

Let us face it. Statistics and mathematics deter almost everyone except those who choose to specialize in them. If you kept reading and reached this far in the book, you are probably now considering skipping the chapters on Data Science and moving on to the next one, on Strategy, because, well, it sounds more exciting. Thus, let us start this chapter on statistics with a simple example that illustrates why it is worth reading and why consultants may increasingly use mathematics.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Jeremy David Curuksu (1)
  1. Amazon Web Services, Inc, New York, USA
