Optimization of Association Word Knowledge Base through Genetic Algorithm

Ko, Su-Jeong; Lee, Jung-Hyun

doi:10.1007/3-540-46145-0_21

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2454)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Query expansion in a knowledge-based information retrieval system requires a knowledge base that takes semantic relations between words into account. Because the Apriori algorithm extracts association words without considering user preference, recall improves but accuracy declines. This paper shows how to build an optimized association word knowledge base with improved accuracy by including only the association words that users prefer, chosen from among the association words that reflect semantic relations between words. To this end, computer-related web documents are classified into eight classes, and nouns are extracted from the web documents of each class. Association words are extracted from the nouns with the Apriori algorithm, and association words that users do not favor are excluded from the knowledge base with a genetic algorithm.
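
The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the chromosome encoding, fitness function, or thresholds, so everything below is an assumption made for illustration. Each document is assumed to be already reduced to a set of nouns, `min_support` is an arbitrary threshold, `preferred` is a hypothetical stand-in for collected user feedback, and the fitness simply rewards kept association words that users prefer and penalizes the rest.

```python
import random
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: count} for every noun set meeting min_support."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    # Level 1 candidates: every distinct noun on its own.
    candidates = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for t in sets if c <= t) for c in candidates}
        level = {c: cnt for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-sets into k-sets, then prune any candidate
        # that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(level), 2):
            u = a | b
            if len(u) == k and all(
                frozenset(s) in level for s in combinations(u, k - 1)
            ):
                candidates.add(u)
    return frequent


def association_words(frequent):
    """Keep only sets of two or more nouns, as sorted tuples in sorted order."""
    return sorted(tuple(sorted(s)) for s in frequent if len(s) >= 2)


def ga_filter(words, preferred, generations=30, pop_size=20,
              mutation=0.05, seed=0):
    """Pick a subset of words with a simple elitist genetic algorithm.

    A chromosome is a bit mask over `words` (1 = keep in the knowledge base).
    Assumed fitness: +1 per kept preferred word, -1 per kept other word.
    """
    rng = random.Random(seed)
    n = len(words)

    def fitness(mask):
        return sum((1 if w in preferred else -1)
                   for w, bit in zip(words, mask) if bit)

    # Start from the unfiltered Apriori result plus random masks.
    population = [[1] * n] + [
        [rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = [population[0][:]]           # elitism: carry the best over
        parents = population[: pop_size // 2]   # truncation selection
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(0, n)             # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation else bit
                     for bit in child]          # bit-flip mutation
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return [w for w, bit in zip(words, best) if bit]
```

Because the unfiltered mask is in the initial population and the best chromosome is always carried forward, the filtered knowledge base never scores lower than the raw Apriori output under this assumed fitness. A real system would replace the toy fitness with a measure derived from actual user preference data.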

Author information

Authors and Affiliations

Department of Computer Science & Engineering, Inha University, Yong_hyen dong, Namgu, Inchon, Korea
Su-Jeong Ko & Jung-Hyun Lee

Authors

Su-Jeong Ko
Jung-Hyun Lee

Editor information

Editors and Affiliations

Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, 606-8501, Kyoto, Japan
Yahiko Kambayashi
Institute for Computer Science and Business Informatics, University of Vienna, Liebiggasse 4, 1010, Vienna, Austria
Werner Winiwarter
Center for Spatial Information Science (CSIS), University of Tokyo, 4-6-1, Komaba, Meguro-ku, 153-8904, Tokyo, Japan
Masatoshi Arikawa

About this paper

Cite this paper

Ko, SJ., Lee, JH. (2002). Optimization of Association Word Knowledge Base through Genetic Algorithm. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2002. Lecture Notes in Computer Science, vol 2454. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46145-0_21

DOI: https://doi.org/10.1007/3-540-46145-0_21
Published: 02 September 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44123-6
Online ISBN: 978-3-540-46145-6
eBook Packages: Springer Book Archive