This chapter addresses problems that arise when an immune-based algorithm is applied to the extraction and visualization of cluster structure. In particular, a hierarchical, topic-sensitive approach is proposed; it appears to be a robust solution to the scalability problem of the document map generation process, in terms of both time and space complexity. The approach relies on extracting a hierarchy of concepts, i.e. almost homogeneous groups of documents described by unique sets of terms. To represent the content of each context, a modified version of the aiNet [9] algorithm is employed; it was chosen for its natural ability to represent internal patterns existing in a training set. A careful evaluation of the effectiveness of the novel text clustering procedure is presented in the section reporting experiments.

References

  1. A. Baraldi and P. Blonda. A survey of fuzzy clustering algorithms for pattern recognition. IEEE Trans. on Systems, Man and Cybernetics, 29B:786–801, 1999.

  2. A. Becks. Visual knowledge management with adaptable document maps. GMD research series, 15, 2001.

  3. M.W. Berry, Z. Drmač, and E.R. Jessup. Matrices, vector spaces and information retrieval. SIAM Review, 41(2):335–362, 1999.

  4. J.C. Bezdek and S.K. Pal. Fuzzy Models for Pattern Recognition: Methods that Search for Structures in Data. IEEE, New York, 1992.

  5. G.B.P. Bezerra, T.V. Barra, M.F. Hamilton, and F.J. von Zuben. A hierarchical immune-inspired approach for text clustering. In Proc. Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU’2006), volume 1, pages 2530–2537, 2006.

  6. C. Boulis and M. Ostendorf. Combining multiple clustering systems. In Proc. of 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-2004), pages 63–74. Springer-Verlag, LNAI 3202, 2004.

  7. K. Ciesielski, M. Dramiński, M. Kłopotek, D. Czerski, and S.T. Wierzchoń. Adaptive document maps. In Proceedings of the Intelligent Advances in Soft Computing 5, pages 109–120. Springer-Verlag, 2006.

  8. K. Ciesielski and M. Kłopotek. Text data clustering by contextual graphs. In L. Todorovski, N. Lavrac, and K.P. Jantke, editors, Discovery Science, pages 65–76. Springer-Verlag, LNAI 4265, 2006.

  9. L.N. de Castro and J. Timmis. Artificial Immune Systems: A New Computational Intelligence Approach. Springer, 2002.

  10. L.N. de Castro and F.J. von Zuben. An evolutionary immune network for data clustering. In SBRN’2000, pages 84–89. IEEE Computer Society Press, 2000.

  11. B. Fritzke. Some competitive learning methods, 1997. http://www.neuroinformatik.ruhr-uni-bochum.de/ini/VDM/research/gsn/JavaPaper.

  12. M. Gilchrist. Taxonomies for business: Description of a research project. In 11 Nordic Conference on Information and Documentation, Reykjavik, Iceland, May 30 – June 1 2001. http://www.bokis.is/iod2001/papers/Gilchrist_paper.doc.

  13. C. Hung and S. Wermter. A constructive and hierarchical self-organising model in a non-stationary environment. In Int. Joint Conference in Neural Networks, pages 2948–2953, 2005.

  14. S.Y. Jung and K. Taek-Soo. An incremental similarity computation method in agglomerative hierarchical clustering. Journal of Fuzzy Logic and Intelligent System, December 2001.

  15. M. Kłopotek. A new bayesian tree learning method with reduced time and space complexity. Fundamenta Informaticae, 49(4):349–367, 2002.

  16. M. Kłopotek, M. Dramiński, K. Ciesielski, M. Kujawiak, and S.T. Wierzchoń. Mining document maps. In M. Gori, M. Ceci, and M. Nanni, editors, Proc. of Statistical Approaches to Web Mining Workshop (SAWM) at PKDD’04, pages 87–98, Pisa, Italy, 2004.

  17. M. Kłopotek, S. Wierzchoń, K. Ciesielski, M. Dramiński, and D. Czerski. E-Service Intelligence – Methodologies, Technologies and Applications. Part II: Methodologies, Technologies and Systems, volume 37 of Studies in Computational Intelligence, chapter Techniques and technologies behind maps of Internet and Intranet document collections, pages 169–190. Springer, 2007.

  18. M. Kłopotek, S. Wierzchoń, K. Ciesielski, M. Dramiński, and D. Czerski. Conceptual Maps of Document Collections in Internet and Intranet. Coping with the Technological Challenge. IPI PAN Publishing House, Warszawa 2007.

  19. T. Kohonen. Self-Organizing Maps, volume 30 of Springer Series in Information Sciences. Springer, Berlin, Heidelberg, New York, 2001.

  20. K. Lagus, S. Kaski, and T. Kohonen. Mining massive document collections by the WEBSOM method. Information Sciences, 163/1-3:135–156, 2004.

  21. G. Salton. The SMART Retrieval System – Experiments in Automatic Document Processing. Prentice-Hall, Upper Saddle River, NJ, USA, 1971.

  22. N. Tang and V.R. Vemuri. An artificial immune system approach to document clustering. In Proceedings of the 2005 ACM symposium on Applied Computing Santa Fe, New Mexico, pages 918–922, 2005.

  23. J. Timmis. aiVIS: Artificial immune network visualization. In Proceedings of EuroGraphics UK 2001 Conference, pages 61–69. University College, London, 2001.

  24. C.J. van Rijsbergen. Information Retrieval. Butterworths, London, 1979. http://www.dcs.gla.ac.uk/Keith/Preface.html.

  25. S.T. Wierzchoń. Artificial immune systems. Theory and applications (in Polish). Akademicka Oficyna Wydawnicza EXIT Publishing, Warszawa, 2001.

  26. D.R. Wilson and T.R. Martinez. Reduction techniques for instance-based learning algorithms. Machine Learning, 38:257–286, 2000.

  27. T. Zhang, R. Ramakrishnan, and M. Livny. Birch: Efficient data clustering method for large databases. In Proc. ACM SIGMOD Int. Conf. on Data Management, pages 103–114, 1997.

  28. Y. Zhao and G. Karypis. Criterion functions for document clustering: Experiments and analysis. http://www-users.cs.umn.edu/~karypis/publications/ir.html.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Wierzchoń, S.T., Ciesielski, K., Kłopotek, M.A. (2008). Scalability and Evaluation of Contextual Immune Model for Web Mining. In: Hassanien, AE., Abraham, A., Kacprzyk, J. (eds) Computational Intelligence in Multimedia Processing: Recent Advances. Studies in Computational Intelligence, vol 96. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76827-2_15

  • DOI: https://doi.org/10.1007/978-3-540-76827-2_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-76826-5

  • Online ISBN: 978-3-540-76827-2

  • eBook Packages: Engineering, Engineering (R0)
