Fuzzy Clustering for Documents Based on Optimization of Classifier Using the Genetic Algorithm

Youn, Ju-In; Eun, He-Jue; Kim, Yong-Sung

doi:10.1007/11424826_2

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3481)

Abstract

Established document categorization methods reflect semantic relations inaccurately in the feature representation of documents. To address this problem, we propose combining a genetic algorithm with the fuzzy c-means clustering algorithm to choose an appropriate set of fuzzy clusters for document classification problems. The aim of the proposed method is to find a minimum set of fuzzy clusters that can correctly classify all training documents. The number of fuzzy pseudo-partitions and the shapes of the fuzzy membership functions used as classification criteria are determined by the genetic algorithm. The classifier then assigns documents using the fuzzy c-means clustering algorithm. A solution obtained by the genetic algorithm is a set of fuzzy clusters, and its fitness is determined by the fuzzy membership function.
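
To make the fuzzy c-means step mentioned in the abstract concrete, here is a minimal sketch of the textbook algorithm in plain Python. It is not taken from the paper: the function names `memberships` and `fcm`, the fuzzifier `m = 2`, the deterministic choice of initial centers, and the toy term-weight vectors in `docs` are all assumptions made for illustration. The genetic search over the number of clusters and the membership-function shapes is not shown.

```python
import math

def memberships(points, centers, m=2.0, eps=1e-9):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Distances to each center, floored at eps to avoid division by zero.
        d = [max(math.dist(p, v), eps) for v in centers]
        rows.append([1.0 / sum((d[i] / d[j]) ** power for j in range(len(d)))
                     for i in range(len(d))])
    return rows

def fcm(points, c, m=2.0, iters=50):
    """Textbook fuzzy c-means; returns (centers, membership matrix)."""
    n, dim = len(points), len(points[0])
    # Assumption: deterministic start, using c points spread across the input.
    centers = [list(points[(i * (n - 1)) // max(c - 1, 1)]) for i in range(c)]
    for _ in range(iters):
        u = memberships(points, centers, m)
        for i in range(c):
            # Each center is the mean of all points weighted by u^m.
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            centers[i] = [sum(w[k] * points[k][a] for k in range(n)) / total
                          for a in range(dim)]
    return centers, memberships(points, centers, m)

# Hypothetical documents as two-term weight vectors: three lean toward
# the first term, three toward the second.
docs = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
        [0.1, 0.9], [0.0, 1.0], [0.2, 0.8]]
centers, u = fcm(docs, c=2)
```

Each row of `u` gives one document's degree of membership in each cluster. In a setup like the one the abstract describes, a genetic algorithm could call a routine of this kind for each candidate number of clusters and score the result with a fitness function built from the memberships.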

Author information

Authors and Affiliations

Division of Electronics and Information Engineering, Chonbuk National University, 664-14 1 ga, Duckjin-Dong Duckjin-Gu, Jeonju, Republic of Korea
Ju-In Youn, He-Jue Eun & Yong-Sung Kim

Editor information

Editors and Affiliations

Department of Mathematics and Computer Science, University of Perugia, via Vanvitelli, 1, I-06123, Perugia, Italy
Osvaldo Gervasi
Department of Computer Science, University of Calgary, 2500 University Drive N.W., T2N 1N4, Calgary, AB, Canada
Marina L. Gavrilova
William Norris Professor, Head of the Computer Science and Engineering Department, University of Minnesota, USA
Vipin Kumar
Department of Chemistry, University of Perugia, Via Elce di Sotto, 8, P.O. Box, I-06123, Perugia, Italy
Antonio Laganà
Institute of High Performance Computing, IHCP, 1 Science Park Road, 01-01 The Capricorn, Singapore Science Park II, 117528, Singapore
Heow Pueh Lee
School of Computing, Soongsil University, Seoul, Korea
Youngsong Mun
Clayton School of IT, Monash University, 3800, Clayton, Australia
David Taniar
OptimaNumerics Ltd, P.O. Box, Belfast, United Kingdom
Chih Jeng Kenneth Tan

Cite this paper

Youn, JI., Eun, HJ., Kim, YS. (2005). Fuzzy Clustering for Documents Based on Optimization of Classifier Using the Genetic Algorithm. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2005. ICCSA 2005. Lecture Notes in Computer Science, vol 3481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424826_2

DOI: https://doi.org/10.1007/11424826_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25861-2
Online ISBN: 978-3-540-32044-9
eBook Packages: Computer Science, Computer Science (R0)
