In this chapter we focus on problems concerning the application of an immune-based algorithm to the extraction and visualization of cluster structure. In particular, a hierarchical, topic-sensitive approach is proposed; it appears to be a robust solution to the scalability problem of the document map generation process (in terms of both time and space complexity). This approach relies upon the extraction of a hierarchy of concepts, i.e. almost homogeneous groups of documents described by unique sets of terms. To represent the content of each context, a modified version of the aiNet [9] algorithm is employed; it was chosen because of its natural ability to represent internal patterns existing in a training set. A careful evaluation of the effectiveness of the novel text clustering procedure is presented in the section reporting experiments.
References
A. Baraldi and P. Blonda. A survey of fuzzy clustering algorithms for pattern recognition. IEEE Trans. on Systems, Man and Cybernetics, 29B:786–801, 1999.
A. Becks. Visual knowledge management with adaptable document maps. GMD research series, 15, 2001.
M.W. Berry, Z. Drmač, and E.R. Jessup. Matrices, vector spaces and information retrieval. SIAM Review, 41(2):335–362, 1999.
J.C. Bezdek and S.K. Pal. Fuzzy Models for Pattern Recognition: Methods that Search for Structures in Data. IEEE, New York, 1992.
G.B.P. Bezerra, T.V. Barra, M.F. Hamilton, and F.J. von Zuben. A hierarchical immune-inspired approach for text clustering. In Proc. Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU’2006), volume 1, pages 2530–2537, 2006.
C. Boulis and M. Ostendorf. Combining multiple clustering systems. In Proc. of 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-2004), pages 63–74. Springer-Verlag, LNAI 3202, 2004.
K. Ciesielski, M. Dramiński, M. Kłopotek, D. Czerski, and S.T. Wierzchoń. Adaptive document maps. In Proceedings of the Intelligent Advances in Soft Computing 5, pages 109–120. Springer-Verlag, 2006.
K. Ciesielski and M. Kłopotek. Text data clustering by contextual graphs. In L. Todorovski, N. Lavrac, and K.P. Jantke, editors, Discovery Science, pages 65–76. Springer-Verlag, LNAI 4265, 2006.
L.N. de Castro and J. Timmis. Artificial Immune Systems: A New Computational Intelligence Approach. Springer, 2002.
L.N. de Castro and F.J. von Zuben. An evolutionary immune network for data clustering. In SBRN’2000, pages 84–89. IEEE Computer Society Press, 2000.
B. Fritzke. Some competitive learning methods, 1997. http://www.neuroinformatik.ruhr-uni-bochum.de/ini/VDM/research/gsn/JavaPaper.
M. Gilchrist. Taxonomies for business: Description of a research project. In 11 Nordic Conference on Information and Documentation, Reykjavik, Iceland, May 30 – June 1 2001. http://www.bokis.is/iod2001/papers/Gilchrist_paper.doc.
C. Hung and S. Wermter. A constructive and hierarchical self-organising model in a non-stationary environment. In Int. Joint Conference in Neural Networks, pages 2948–2953, 2005.
S.Y. Jung and K. Taek-Soo. An incremental similarity computation method in agglomerative hierarchical clustering. Journal of Fuzzy Logic and Intelligent System, December 2001.
M. Kłopotek. A new Bayesian tree learning method with reduced time and space complexity. Fundamenta Informaticae, 49(4):349–367, 2002.
M. Kłopotek, M. Dramiński, K. Ciesielski, M. Kujawiak, and S.T. Wierzchoń. Mining document maps. In M. Gori, M. Ceci, and M. Nanni, editors, Proc. of Statistical Approaches to Web Mining Workshop (SAWM) at PKDD’04, pages 87–98, Pisa, Italy, 2004.
M. Kłopotek, S. Wierzchoń, K. Ciesielski, M. Dramiński, and D. Czerski. E-Service Intelligence – Methodologies, Technologies and Applications. Part II: Methodologies, Technologies and Systems, volume 37 of Studies in Computational Intelligence, chapter Techniques and technologies behind maps of Internet and Intranet document collections, pages 169–190. Springer, 2007.
M. Kłopotek, S. Wierzchoń, K. Ciesielski, M. Dramiński, and D. Czerski. Conceptual Maps of Document Collections in Internet and Intranet. Coping with the Technological Challenge. IPI PAN Publishing House, Warszawa 2007.
T. Kohonen. Self-Organizing Maps, volume 30 of Springer Series in Information Sciences. Springer, Berlin, Heidelberg, New York, 2001.
K. Lagus, S. Kaski, and T. Kohonen. Mining massive document collections by the WEBSOM method. Information Sciences, 163/1-3:135–156, 2004.
G. Salton. The SMART Retrieval System – Experiments in Automatic Document Processing. Prentice-Hall, Upper Saddle River, NJ, USA, 1971.
N. Tang and V.R. Vemuri. An artificial immune system approach to document clustering. In Proceedings of the 2005 ACM symposium on Applied Computing Santa Fe, New Mexico, pages 918–922, 2005.
J. Timmis. aiVIS: Artificial immune network visualization. In Proceedings of EuroGraphics UK 2001 Conference, pages 61–69. University College, London, 2001.
C.J. van Rijsbergen. Information Retrieval. Butterworths, London, 1979. http://www.dcs.gla.ac.uk/Keith/Preface.html.
S.T. Wierzchoń. Artificial immune systems. Theory and applications (in Polish). Akademicka Oficyna Wydawnicza EXIT Publishing, Warszawa, 2001.
D.R. Wilson and T.R. Martinez. Reduction techniques for instance-based learning algorithms. Machine Learning, 38:257–286, 2000.
T. Zhang, R. Ramakrishnan, and M. Livny. BIRCH: Efficient data clustering method for large databases. In Proc. ACM SIGMOD Int. Conf. on Data Management, pages 103–114, 1997.
Y. Zhao and G. Karypis. Criterion functions for document clustering: Experiments and analysis. http://www-users.cs.umn.edu/~karypis/publications/ir.html.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Wierzchoń, S.T., Ciesielski, K., Kłopotek, M.A. (2008). Scalability and Evaluation of Contextual Immune Model for Web Mining. In: Hassanien, AE., Abraham, A., Kacprzyk, J. (eds) Computational Intelligence in Multimedia Processing: Recent Advances. Studies in Computational Intelligence, vol 96. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76827-2_15
DOI: https://doi.org/10.1007/978-3-540-76827-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76826-5
Online ISBN: 978-3-540-76827-2
eBook Packages: Engineering, Engineering (R0)