Abstract
The Star algorithm is an effective and efficient algorithm for graph clustering. We propose a series of novel yet simple metrics for selecting Star centers in the Star algorithm and its variants. We empirically study the performance of the off-line (standard and extended) and on-line versions of the Star algorithm adapted to each metric, and show that one of the proposed metrics outperforms all others in both effectiveness and efficiency of clustering. We also empirically study the sensitivity of the metrics to the threshold value of the algorithm and show improvement in this respect as well.
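As a rough illustration of where a center-selection metric and the threshold enter the picture, the following is a minimal sketch of an off-line Star-style clustering pass. It is not the authors' implementation, and the abstract does not say which metrics are proposed. The names `threshold_graph`, `degree_metric`, and `star_clusters` are hypothetical, and the default metric is plain vertex degree, the choice usually associated with the standard Star algorithm. Any other scoring function with the same signature could be passed in its place.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def threshold_graph(nodes: Iterable[str],
                    sims: Dict[Tuple[str, str], float],
                    sigma: float) -> Graph:
    """Build an undirected graph with an edge wherever similarity >= sigma."""
    graph: Graph = {n: set() for n in nodes}
    for (u, v), s in sims.items():
        if u != v and s >= sigma:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def degree_metric(graph: Graph, v: str) -> float:
    """Assumed default metric: the number of neighbours of v."""
    return float(len(graph[v]))


def star_clusters(graph: Graph,
                  metric: Callable[[Graph, str], float] = degree_metric
                  ) -> Dict[str, Set[str]]:
    """Greedy cover: the best-scoring unmarked vertex becomes a center."""
    # Sort by descending metric; ties are broken by name for determinism.
    order = sorted(graph, key=lambda v: (-metric(graph, v), v))
    marked: Set[str] = set()
    clusters: Dict[str, Set[str]] = {}
    for v in order:
        if v in marked:
            continue  # already a center or a satellite of a center
        clusters[v] = {v} | graph[v]  # the center plus all its satellites
        marked |= clusters[v]
    return clusters


if __name__ == "__main__":
    # Hypothetical pairwise similarities, for illustration only.
    sims = {("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.3,
            ("c", "d"): 0.7, ("d", "e"): 0.2}
    g = threshold_graph("abcde", sims, sigma=0.5)
    for center, members in star_clusters(g).items():
        print(center, sorted(members))
```

In this sketch, changing `sigma` changes which edges survive, so both the scores and the resulting clusters depend on the threshold. This is the sensitivity that the abstract refers to.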
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wijaya, D.T., Bressan, S. (2007). Journey to the Centre of the Star: Various Ways of Finding Star Centers in Star Clustering. In: Wagner, R., Revell, N., Pernul, G. (eds) Database and Expert Systems Applications. DEXA 2007. Lecture Notes in Computer Science, vol 4653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74469-6_64
DOI: https://doi.org/10.1007/978-3-540-74469-6_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74467-2
Online ISBN: 978-3-540-74469-6
eBook Packages: Computer Science, Computer Science (R0)