Review of Data Analysis Framework for Variety of Big Data

  • Conference paper
Emerging Trends in Expert Applications and Security

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 841)

Abstract

Big Data is a now-ubiquitous term for data sets too large to be handled by traditional analysis methods. Dealing with “Variety”, one of the five characteristics of Big Data, is a great challenge. Variety refers to the range of data formats, such as structured tables, semi-structured log files, and unstructured text, audio, and video data. Each format has its own frameworks for analysis. In this paper, we present a detailed study of frameworks for analyzing structured, semi-structured, and unstructured data individually. In addition, we describe some frameworks that handle all three formats together.

Author information

Corresponding author

Correspondence to Yojna Arora.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Arora, Y., Goyal, D. (2019). Review of Data Analysis Framework for Variety of Big Data. In: Rathore, V., Worring, M., Mishra, D., Joshi, A., Maheshwari, S. (eds) Emerging Trends in Expert Applications and Security. Advances in Intelligent Systems and Computing, vol 841. Springer, Singapore. https://doi.org/10.1007/978-981-13-2285-3_7
