Automatic Web User Profiling and Personalization Using Robust Fuzzy Relational Clustering

  • Olfa Nasraoui
  • Raghu Krishnapuram
  • Anupam Joshi
  • Tapan Kamdar
Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 105)


The proliferation of information on the World Wide Web has made the personalization of this information space a necessity. Personalization of the content returned from a Web site is a desirable feature that can enhance server performance, improve system design, and lead to wise marketing decisions in electronic commerce. Mining typical user profiles from the vast amount of historical data stored in access logs is an important component of Web personalization. In the absence of a priori knowledge, unsupervised or clustering methods seem to be ideally suited to categorizing the usage behavior of Web surfers. In this chapter, we present a framework for mining typical user profiles from server access logs based on robust fuzzy relational clustering. As a by-product of the clustering process that generates robust profiles, associations between different URL addresses on a given site can easily be inferred. In general, the URLs that are present in the same profile tend to be visited together in the same session or to form a large itemset. Finally, we present a personalization system that uses previously mined profiles to automatically generate a Web page containing URLs the user might be interested in. Our personalization approach is based on profiles computed from the prior traversal patterns of users on the website, and it does not require users to provide any declarative private information or to log in.
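The pipeline outlined in the abstract can be illustrated with a small sketch. It is not the chapter's algorithm: the robust fuzzy relational clustering step is replaced by a hand-specified grouping of sessions, and the cosine-style dissimilarity, the function names, the example URLs, and the 0.5 weight threshold are all assumptions made for illustration. The sketch only shows the kind of data involved: sessions as sets of URLs, a pairwise dissimilarity (relation) matrix, profiles as per-URL weights, and recommendations drawn from the closest profile.

```python
from math import sqrt

def session_dissimilarity(a, b):
    """1 - cosine similarity between two sessions viewed as binary URL sets (assumed measure)."""
    if not a or not b:
        return 1.0
    return 1.0 - len(a & b) / sqrt(len(a) * len(b))

def relation_matrix(sessions):
    """Pairwise dissimilarities, the only input a relational clustering method needs."""
    return [[session_dissimilarity(s, t) for t in sessions] for s in sessions]

def profile(sessions, members):
    """Weight of each URL = fraction of the member sessions that visited it."""
    counts = {}
    for i in members:
        for url in sessions[i]:
            counts[url] = counts.get(url, 0) + 1
    return {url: n / len(members) for url, n in counts.items()}

def recommend(current, profiles, min_weight=0.5):
    """Choose the profile closest to the current session and return its
    sufficiently weighted URLs that the user has not visited yet."""
    best = min(profiles, key=lambda p: session_dissimilarity(current, set(p)))
    return sorted(u for u, w in best.items() if w >= min_weight and u not in current)

# Hypothetical sessions extracted from an access log.
sessions = [
    {"/courses", "/courses/cs101", "/faculty"},
    {"/courses", "/courses/cs101"},
    {"/research", "/research/labs", "/faculty"},
    {"/research", "/research/labs"},
]
R = relation_matrix(sessions)

# Stand-in for the clustering output: two hand-specified groups of sessions.
profiles = [profile(sessions, [0, 1]), profile(sessions, [2, 3])]

print(recommend({"/research"}, profiles))
```

URLs that carry a high weight in the same profile are the ones that tend to co-occur within sessions, which is the sense in which associations between URLs fall out of the profiles. The recommendation step uses only the pages visited in the current session, so no login or personal data is involved.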


Keywords: Fuzzy Cluster, User Session, Relational Cluster, Intercluster Distance, Robust Weight
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Olfa Nasraoui (1)
  • Raghu Krishnapuram (2)
  • Anupam Joshi (3)
  • Tapan Kamdar (3)
  1. Department of Electrical and Computer Engineering, The University of Memphis, Memphis, USA
  2. IBM India Research Lab, Block 1, Indian Institute of Technology, Hauz Khas, New Delhi, India
  3. Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, USA
