Abstract
In an increasingly competitive market, making well-informed decisions requires the analysis of a wide range of heterogeneous, large, and complex data. This paper focuses on the emerging field of graph warehousing. Graphs are widespread structures that offer great expressive power. They are used for modeling highly complex and interconnected domains and for efficiently solving emerging big data applications. This paper presents the current status and open challenges of graph BI and analytics, and motivates the need for new warehousing frameworks that are aware of the topological nature of graphs. We survey the topics of graph modeling, management, processing, and analysis in graph warehouses. We conclude by discussing future research directions and positioning them within a unified architecture of a graph BI and analytics framework.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Ghrab, A., Romero, O., Jouili, S., Skhiri, S. (2018). Graph BI & Analytics: Current State and Future Challenges. In: Ordonez, C., Bellatreche, L. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2018. Lecture Notes in Computer Science, vol 11031. Springer, Cham. https://doi.org/10.1007/978-3-319-98539-8_1
DOI: https://doi.org/10.1007/978-3-319-98539-8_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98538-1
Online ISBN: 978-3-319-98539-8
eBook Packages: Computer Science, Computer Science (R0)