Mining Evolving Web Sessions and Clustering Dynamic Web Documents for Similarity-Aware Web Content Management

  • Conference paper
Advanced Data Mining and Applications (ADMA 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5139)

Abstract

Similarity discovery has become one of the most important research streams in the web usage mining community in recent years. The knowledge obtained can be used in many applications, such as predicting users' preferences, optimizing web cache organization, and improving the quality of web document pre-fetching. This paper presents an approach that mines evolving web sessions to cluster web users and establish similarities among web documents. The results are applied to a Similarity-aware Web Content Management system, facilitating offline building of similarity-aware web caches and online updating of sub-caches and cache content similarity profiles. An agent-based web document pre-fetching mechanism is also developed to support similarity-aware caching, further reducing bandwidth consumption and network traffic latency and thereby improving web access performance.
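The abstract does not state which similarity measure or clustering algorithm the paper uses. As a rough illustration of the general idea of clustering web users by session similarity, the sketch below assumes that each user's session is reduced to the set of URLs visited, that similarity is the Jaccard coefficient, and that users are grouped by single-link thresholding. The names `jaccard`, `cluster_sessions`, the sample sessions and the threshold are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sessions(sessions: dict, threshold: float) -> list:
    """Group users whose sessions are linked by similarity >= threshold.

    Single-link grouping via union-find: an assumption made for this
    sketch, not the paper's algorithm.
    """
    parent = {user: user for user in sessions}

    def find(user):
        # Follow parent links to the root, compressing the path as we go.
        while parent[user] != user:
            parent[user] = parent[parent[user]]
            user = parent[user]
        return user

    for u, v in combinations(sessions, 2):
        if jaccard(sessions[u], sessions[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for user in sessions:
        groups.setdefault(find(user), set()).add(user)
    return list(groups.values())


# Hypothetical sessions: user -> set of visited URLs.
sessions = {
    "alice": {"/news", "/sports", "/weather"},
    "bob": {"/news", "/sports"},
    "carol": {"/shop", "/cart"},
}
clusters = cluster_sessions(sessions, threshold=0.5)
```

With these sample sessions, "alice" and "bob" share two of three distinct pages and fall into one cluster, while "carol" remains alone. A similarity-aware cache could, under the same assumptions, use such clusters to decide which documents to keep together in a sub-cache.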

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xiao, J. (2008). Mining Evolving Web Sessions and Clustering Dynamic Web Documents for Similarity-Aware Web Content Management. In: Tang, C., Ling, C.X., Zhou, X., Cercone, N.J., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2008. Lecture Notes in Computer Science, vol 5139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88192-6_11

  • DOI: https://doi.org/10.1007/978-3-540-88192-6_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88191-9

  • Online ISBN: 978-3-540-88192-6

  • eBook Packages: Computer Science, Computer Science (R0)
