Abstract
We study the problem of identifying and ranking the members of a community in a very large network using link analysis only, given a set of representatives of the community. We define the concept of a community, justified by a formal analysis of a simple model of the evolution of a directed graph. We show that the problem of deciding whether a nontrivial community exists is NP-complete. Nevertheless, experiments show that a very simple greedy approach can identify members of a community in the Danish part of the web graph, with a time complexity that depends only on the size of the found community and its immediate surroundings. The members are ranked with a “local” variant of the PageRank algorithm. We report results from successful experiments on identifying and ranking Danish Computer Science sites and Danish Chess pages using only a few representatives.
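The abstract does not spell out how the "local" PageRank variant is defined. As a rough illustration only, the sketch below assumes one plausible reading: ordinary PageRank computed by power iteration on the subgraph induced by the community members, so the cost depends on the community rather than on the whole network. The function name `local_pagerank`, the damping value 0.85, and the adjacency-dictionary graph format are assumptions made for this example, not details taken from the paper.

```python
def local_pagerank(graph, community, damping=0.85, iterations=100):
    """Rank community members using only links inside the community.

    graph:     dict mapping each node to an iterable of nodes it links to
    community: set of nodes regarded as members (assumed already identified)

    This is an assumed interpretation of a "local" PageRank, not the
    paper's definition.
    """
    members = list(community)
    n = len(members)
    if n == 0:
        return {}

    # Keep only the links whose source and target are both members.
    out_links = {
        u: [v for v in graph.get(u, ()) if v in community] for u in members
    }

    # Start from the uniform distribution over the members.
    rank = {u: 1.0 / n for u in members}

    for _ in range(iterations):
        # Rank held by members with no internal out-links is spread uniformly.
        dangling = sum(rank[u] for u in members if not out_links[u])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in members}

        # Each member passes its damped rank evenly along its internal links.
        for u in members:
            targets = out_links[u]
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank

    return rank
```

Calling `local_pagerank(graph, {"a", "b", "c"})` on a small graph returns a score for each of the three members only; nodes outside the set are never visited, and the scores sum to 1.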
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Olsen, M. (2008). Communities in Large Networks: Identification and Ranking. In: Aiello, W., Broder, A., Janssen, J., Milios, E. (eds) Algorithms and Models for the Web-Graph. WAW 2006. Lecture Notes in Computer Science, vol 4936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78808-9_8
DOI: https://doi.org/10.1007/978-3-540-78808-9_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78807-2
Online ISBN: 978-3-540-78808-9
eBook Packages: Computer Science (R0)