Abstract
What are the greatest challenges of big data and data science? The question is itself problematic, because data science is at a very early stage and has been built on existing disciplines. This chapter explores this important issue.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Cao, L. (2018). Data Science Challenges. In: Data Science Thinking. Data Analytics. Springer, Cham. https://doi.org/10.1007/978-3-319-95092-1_4
DOI: https://doi.org/10.1007/978-3-319-95092-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95091-4
Online ISBN: 978-3-319-95092-1
eBook Packages: Computer Science, Computer Science (R0)