Abstract
A Subgraph Census (determining the frequency of smaller subgraphs in a network) is an important computational task at the heart of several graph mining algorithms. Here we focus on g-tries, an efficient state-of-the-art data structure for this task. Its algorithm makes extensive use of the graph primitive that checks whether a given edge exists. The original implementation, like most past approaches, used adjacency matrices to make this operation as fast as possible. This representation is very expensive in memory (quadratic in the number of nodes), which limits its applicability. In this paper we study a number of alternative representations whose memory usage scales linearly with the number of edges. We carry out an extensive empirical study of these alternatives in order to find an efficient hybrid approach that combines the best representations. On average, the resulting approach is less than 50% slower than the adjacency matrix (and almost 3 times more efficient than a naive binary search implementation), while being memory efficient and tunable for different memory restrictions.
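To make the trade-off concrete, here is a minimal Python sketch of one way an edge-existence check could combine the two kinds of representation mentioned above. It is not the implementation evaluated in the paper: the class name `HybridGraph`, the `dense_threshold` parameter, and the rule "give a bitmap row to every node whose degree reaches the threshold" are all assumptions made for illustration. Every node keeps a sorted neighbour list (memory linear in the number of edges) that is queried by binary search, and only the selected high-degree nodes additionally get a constant-time bitmap row, as an adjacency matrix would provide.

```python
from bisect import bisect_left


class HybridGraph:
    """Undirected graph with a degree-based hybrid edge lookup (illustrative only)."""

    def __init__(self, num_nodes, edges, dense_threshold):
        # Sorted, duplicate-free neighbour lists: memory linear in the edges.
        neighbours = [set() for _ in range(num_nodes)]
        for u, v in edges:
            neighbours[u].add(v)
            neighbours[v].add(u)
        self.adj = [sorted(s) for s in neighbours]

        # Hypothetical tuning knob: nodes with degree >= dense_threshold also
        # get a full bitmap row, trading memory for constant-time lookups.
        self.rows = {}
        for u, neigh in enumerate(self.adj):
            if len(neigh) >= dense_threshold:
                row = bytearray((num_nodes + 7) // 8)
                for v in neigh:
                    row[v >> 3] |= 1 << (v & 7)
                self.rows[u] = row

    def has_edge(self, u, v):
        # Use a bitmap row if either endpoint has one (the graph is undirected).
        if u not in self.rows and v in self.rows:
            u, v = v, u
        row = self.rows.get(u)
        if row is not None:
            return bool(row[v >> 3] & (1 << (v & 7)))
        # Otherwise binary-search the shorter of the two sorted lists.
        if len(self.adj[v]) < len(self.adj[u]):
            u, v = v, u
        neigh = self.adj[u]
        i = bisect_left(neigh, v)
        return i < len(neigh) and neigh[i] == v


g = HybridGraph(5, [(0, 1), (0, 2), (0, 3), (1, 2)], dense_threshold=3)
print(g.has_edge(3, 0), g.has_edge(1, 3))
```

Lowering `dense_threshold` moves more nodes onto bitmap rows (faster lookups, more memory), while raising it falls back toward pure binary search. This is one simple way to obtain the kind of memory tunability the abstract describes, under the assumptions stated above.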
Acknowledgements
This work is partially funded by FCT, within project UID/EEA/50014/2013.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Paredes, P., Ribeiro, P. (2016). Large Scale Graph Representations for Subgraph Census. In: Wierzbicki, A., Brandes, U., Schweitzer, F., Pedreschi, D. (eds) Advances in Network Science. NetSci-X 2016. Lecture Notes in Computer Science, vol 9564. Springer, Cham. https://doi.org/10.1007/978-3-319-28361-6_16
DOI: https://doi.org/10.1007/978-3-319-28361-6_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28360-9
Online ISBN: 978-3-319-28361-6
eBook Packages: Computer Science (R0)