Evaluation of Web Session Cluster Quality Based on Access-Time Dissimilarity and Evolutionary Algorithms

Dixit, Veer Sain; Bhatia, Shveta Kundra; Singh, V. B.

doi:10.1007/978-3-319-09156-3_22

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8583)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

Refining web session clusters to improve their quality is an active research problem. The motivation for refinement with evolutionary algorithms is that any clustering algorithm leaves some data items in inappropriate clusters, so the resulting clusters are not as well separated or cohesive as they could be; refinement techniques are used to improve their quality. Initial clusters are formed with the K-Means clustering algorithm, which suffers from the local minima problem. Refinement is then performed with five approaches: the Modified Knockout Refinement Algorithm (MKRA), a distance-based dissimilarity measure built on access and time features; a Genetic Algorithm (GA); Particle Swarm Optimization (PSO); MKRA combined with GA; and MKRA combined with PSO. Results are evaluated on five synthetic datasets and three real datasets. The experiments show that combining MKRA with evolutionary techniques produces better-quality clusters.
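
The general idea of refining K-Means output with an evolutionary search can be illustrated with a minimal sketch. This is not the authors' MKRA, GA, or PSO configuration, none of which is specified in the abstract; it is a generic mutation-only search, given here under stated assumptions. Sessions are assumed to be hypothetical 2-D feature vectors (for example, pages accessed and time spent), and the names `sse`, `kmeans`, and `refine` are illustrative only.

```python
import random

def dist2(p, c):
    """Squared Euclidean distance between a point and a centroid."""
    return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(dist2(p, c) for c in centroids) for p in points)

def kmeans(points, centroids, iters=50):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def refine(points, centroids, generations=100, sigma=1.0, seed=0):
    """Perturb the centroids, re-run K-Means, and keep a candidate only if SSE drops."""
    rng = random.Random(seed)
    best, best_cost = list(centroids), sse(points, centroids)
    for _ in range(generations):
        candidate = [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in best]
        candidate = kmeans(points, candidate)
        cost = sse(points, candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best

# Hypothetical data: three groups of 20 sessions each.
rng = random.Random(42)
centers = [(0, 0), (5, 5), (10, 0)]
sessions = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
            for cx, cy in centers for _ in range(20)]

# Deliberately poor start: all three seeds come from the first group.
start = kmeans(sessions, sessions[:3])
refined = refine(sessions, start)
print("SSE after K-Means:   ", sse(sessions, start))
print("SSE after refinement:", sse(sessions, refined))
```

Because `refine` only accepts a candidate when it lowers the sum of squared errors, the refined clustering is never worse than the K-Means result it starts from. Whether it actually escapes a local minimum depends on the data, the perturbation size `sigma`, and the number of generations.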

Author information

Authors and Affiliations

Computer Science Department, Atma Ram Sanatan Dharma College, University of Delhi, New Delhi, India
Veer Sain Dixit
Computer Science Department, Research Scholar, University of Delhi, New Delhi, India
Shveta Kundra Bhatia
Computer Science Department, Delhi College of Arts and Commerce, University of Delhi, New Delhi, India
V. B. Singh

Editor information

Editors and Affiliations

School of Engineering, University of Basilicata, 85100, Potenza, Italy
Beniamino Murgante
Department of Computer and Information Sciences, Covenant University, Ota, Nigeria
Sanjay Misra
Department of Production and Systems, University of Minho, 4710-057, Braga, Portugal
Ana Maria A. C. Rocha
DICAR, Politecnico di Bari, 70125, Bari, Italy
Carmelo Torre
University of Minho, Braga, Portugal
Jorge Gustavo Rocha & Maria Irene Falcão
Monash University, 3800, Clayton, VIC, Australia
David Taniar
Department of Intelligent Informatics, Kyushu Sangyo University, 2-3-1 Matsukadai, 813-8503, Higashi-ku, Fukuoka, Japan
Bernady O. Apduhan
Department of Mathematics and Computer Science, University of Perugia, Via Vanvitelli, 1, 06123, Perugia, Italy
Osvaldo Gervasi

About this paper

Cite this paper

Dixit, V.S., Bhatia, S.K., Singh, V.B. (2014). Evaluation of Web Session Cluster Quality Based on Access-Time Dissimilarity and Evolutionary Algorithms. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2014. ICCSA 2014. Lecture Notes in Computer Science, vol 8583. Springer, Cham. https://doi.org/10.1007/978-3-319-09156-3_22

DOI: https://doi.org/10.1007/978-3-319-09156-3_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09155-6
Online ISBN: 978-3-319-09156-3
eBook Packages: Computer Science, Computer Science (R0)
