Abstract
Mining Big Data is one of the most attractive research areas of recent years. Essentially, it focuses on how classical Data Mining algorithms can be extended to deal with the novel features of Big Data, such as volume, variety and velocity. This challenge opens the door to a wide range of research problems that will generate both academic and industrial spin-offs in future years. Following this trend, in this paper we provide a brief discussion of the most relevant open problems and future directions on the fundamental issue of mining Big Data.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Cuzzocrea, A. (2014). Big Data Mining or Turning Data Mining into Predictive Analytics from Large-Scale 3Vs Data: The Future Challenge for Knowledge Discovery. In: Ait Ameur, Y., Bellatreche, L., Papadopoulos, G.A. (eds) Model and Data Engineering. MEDI 2014. Lecture Notes in Computer Science, vol 8748. Springer, Cham. https://doi.org/10.1007/978-3-319-11587-0_2
DOI: https://doi.org/10.1007/978-3-319-11587-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11586-3
Online ISBN: 978-3-319-11587-0
eBook Packages: Computer Science, Computer Science (R0)