
Understanding Users’ Subject Interests in the Web Site Based on Their Usage of Its Content: A Novel Two-Phase Clustering Framework

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5559)

Abstract

To understand the behavior of website users, a deep analysis of content and usage data can reveal which subjects visitors are truly interested in. Because web page content is highly unstructured, it must be preprocessed and clustered carefully for the results to be useful. This paper describes a novel two-phase self-organizing feature map (SOFM) clustering framework that segments web users by their subject interests in the diverse content of a university website. It also addresses noise and dimensionality reduction of the sample website content through a comprehensive ten-step preprocessing procedure, which gave very promising experimental results when applied to the input web pages in the first phase of the framework.
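
The two-phase idea can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not specify the map topology, the features, or the ten preprocessing steps, so none of those are reproduced here. The sketch assumes that phase one clusters pages by content into subject clusters and that phase two clusters users by how their visits spread across those subjects. The function names (`train_som`, `best_unit`, `user_profile`), the 1-D map, the vocabulary, and all data are hypothetical.

```python
import math
import random


def best_unit(units, x):
    """Return the index of the unit whose weights are closest to x."""
    return min(
        range(len(units)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(units[j], x)),
    )


def train_som(vectors, n_units, epochs=50, lr0=0.5, seed=0):
    """Train a small 1-D self-organizing map and return its unit weights."""
    rng = random.Random(seed)
    # Initialise each unit as a copy of a randomly chosen input vector.
    units = [list(rng.choice(vectors)) for _ in range(n_units)]
    radius0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay                      # learning rate shrinks over time
        radius = max(radius0 * decay, 0.5)    # so does the neighbourhood
        order = list(vectors)
        rng.shuffle(order)
        for x in order:
            bmu = best_unit(units, x)
            for j, w in enumerate(units):
                # Gaussian neighbourhood around the best-matching unit.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return units


def user_profile(session, page_cluster, n_clusters):
    """Share of a user's page views that falls in each subject cluster."""
    counts = [0] * n_clusters
    for page in session:
        counts[page_cluster[page]] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical toy data: term counts over the vocabulary
# ["course", "exam", "research", "grant"]. Not taken from the paper.
pages = {
    "/courses/a": [3, 2, 0, 0],
    "/courses/b": [2, 3, 0, 0],
    "/research/x": [0, 0, 3, 2],
    "/research/y": [0, 0, 2, 3],
}
sessions = {
    "u1": ["/courses/a", "/courses/b"],
    "u2": ["/research/x", "/research/y", "/research/x"],
    "u3": ["/courses/a", "/research/y"],
}

# Phase 1 (assumed): cluster pages by content into subject clusters.
content_som = train_som(list(pages.values()), n_units=2)
page_cluster = {url: best_unit(content_som, vec) for url, vec in pages.items()}

# Phase 2 (assumed): cluster users by how their visits spread over subjects.
profiles = {u: user_profile(s, page_cluster, 2) for u, s in sessions.items()}
usage_som = train_som(list(profiles.values()), n_units=2, seed=1)
segments = {u: best_unit(usage_som, p) for u, p in profiles.items()}

print(page_cluster)
print(segments)
```

In practice the page vectors would come out of a preprocessing pipeline rather than hand-written counts, and a 2-D map would be the more usual choice; the 1-D map here only keeps the sketch short.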

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ammari, A., Zharkova, V. (2009). Understanding Users’ Subject Interests in the Web Site Based on Their Usage of Its Content: A Novel Two-Phase Clustering Framework. In: Håkansson, A., Nguyen, N.T., Hartung, R.L., Howlett, R.J., Jain, L.C. (eds) Agent and Multi-Agent Systems: Technologies and Applications. KES-AMSTA 2009. Lecture Notes in Computer Science, vol 5559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01665-3_39

  • DOI: https://doi.org/10.1007/978-3-642-01665-3_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01664-6

  • Online ISBN: 978-3-642-01665-3

  • eBook Packages: Computer Science, Computer Science (R0)
