Big Data Mining or Turning Data Mining into Predictive Analytics from Large-Scale 3Vs Data: The Future Challenge for Knowledge Discovery

  • Conference paper
Model and Data Engineering (MEDI 2014)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8748)

Abstract

Mining Big Data is one of the most attractive research areas of recent years. Essentially, it emphasizes how classical Data Mining algorithms can be extended to deal with the novel features of Big Data, such as volume, variety, and velocity. This challenge opens the door to a large number of research problems that will generate both academic and industrial spin-offs in future years. Following this trend, in this paper we provide a brief discussion of the most relevant open problems and future directions on the fundamental issue of mining Big Data.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Cuzzocrea, A. (2014). Big Data Mining or Turning Data Mining into Predictive Analytics from Large-Scale 3Vs Data: The Future Challenge for Knowledge Discovery. In: Ait Ameur, Y., Bellatreche, L., Papadopoulos, G.A. (eds) Model and Data Engineering. MEDI 2014. Lecture Notes in Computer Science, vol 8748. Springer, Cham. https://doi.org/10.1007/978-3-319-11587-0_2

  • DOI: https://doi.org/10.1007/978-3-319-11587-0_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11586-3

  • Online ISBN: 978-3-319-11587-0

  • eBook Packages: Computer Science, Computer Science (R0)
