Big Data

Part of the book series: Advanced Information and Knowledge Processing (BRIEFSAIKP)

Abstract

A major business trend for most organizations is big data and business analytics, along with mobile, cloud, and social media technologies. Big data may be characterized by its volume, velocity, and variety. Most data are heterogeneous and unstructured, containing mixed and often indeterminate amounts of different kinds of information, such as text, images, dates, and numbers, in various formats. Data analysts and scientists spend most of their time preparing, cleaning, and wrangling their data. Data analytics may be divided into descriptive analytics, predictive analytics, and prescriptive analytics. The continuing growth of data means that large-scale analytics is becoming critical both for business competitiveness and for internal decision-making based on the organization's own data. Big data requires complex and advanced visualization techniques in order to fully understand the information contained in the data. Machine learning and deep learning methods are being integrated into data analytics processes. Machine learning uses statistical techniques to give computer systems the ability to “learn” (i.e., progressively improve performance on a specific task) from data. Current issues and challenges with big data and its analysis are reviewed.

Author information

Correspondence to John Dill.

Copyright information

© 2019 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Dill, J. (2019). Big Data. In: Data Science and Visual Computing. Advanced Information and Knowledge Processing. Springer, Cham. https://doi.org/10.1007/978-3-030-24367-8_2

  • DOI: https://doi.org/10.1007/978-3-030-24367-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-24366-1

  • Online ISBN: 978-3-030-24367-8

  • eBook Packages: Computer Science (R0)
