A Genetic-CBR Approach for Cross-Document Relationship Identification

Kumar, Yogan Jaya; Salim, Naomie; Abuobieda, Albaraa

doi:10.1007/978-3-642-35326-0_19

Part of the book series: Communications in Computer and Information Science (CCIS, volume 322)

Included in the following conference series:

International Conference on Advanced Machine Learning Technologies and Applications

Abstract

Applications that operate over multiple documents have emerged recently. Information across topically related documents can often be linked. Cross-document Structure Theory (CST) analyzes the relationships that exist between sentences across related documents. However, most existing work relies on human experts to identify CST relationships. In this work, we aim to automatically identify some of the CST relations using a supervised learning method. We propose a Genetic-CBR approach, which incorporates a genetic algorithm (GA) to improve case-based reasoning (CBR) classification. The GA is used to scale the weights of the data features used by the CBR classifier. We perform experiments using datasets obtained from the CSTBank corpus. Comparison with other learning methods shows that the proposed method yields better results.
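
The idea of using a GA to scale feature weights for a CBR classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the similarity measure, the GA operators, or the fitness function, so everything below is an assumption made for illustration. CBR retrieval is approximated by a weighted nearest-neighbor lookup, fitness is leave-one-out accuracy on the case base, and the two features and the toy cases are hypothetical.

```python
import random

def weighted_distance(a, b, weights):
    """Weighted squared Euclidean distance between two feature vectors."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def cbr_classify(query, case_base, weights):
    """Retrieve the most similar stored case and reuse its label (1-NN)."""
    nearest = min(case_base,
                  key=lambda case: weighted_distance(query, case[0], weights))
    return nearest[1]

def fitness(weights, case_base):
    """Leave-one-out accuracy of the weighted classifier on the case base."""
    correct = 0
    for i, (features, label) in enumerate(case_base):
        rest = case_base[:i] + case_base[i + 1:]
        correct += cbr_classify(features, rest, weights) == label
    return correct / len(case_base)

def evolve_weights(case_base, n_features, pop_size=20, generations=30,
                   mutation_rate=0.2, rng=None):
    """Evolve a feature-weight vector in [0, 1]^n (needs n_features >= 2)."""
    rng = rng or random.Random(0)
    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda w: fitness(w, case_base),
                        reverse=True)
        parents = ranked[:pop_size // 2]  # truncation selection; parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Gaussian mutation, clipped back into [0, 1]
            child = [min(1.0, max(0.0, w + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: fitness(w, case_base))

# Hypothetical toy case base: the first feature separates the two labels,
# the second is uninformative. Values and feature meanings are made up.
TOY_CASES = [
    ([0.90, 0.30], "Subsumption"), ([0.85, 0.70], "Subsumption"),
    ([0.80, 0.10], "Subsumption"), ([0.95, 0.55], "Subsumption"),
    ([0.20, 0.35], "Overlap"), ([0.15, 0.65], "Overlap"),
    ([0.25, 0.15], "Overlap"), ([0.10, 0.50], "Overlap"),
]

if __name__ == "__main__":
    best = evolve_weights(TOY_CASES, n_features=2)
    print("evolved weights:", best)
    print("leave-one-out accuracy:", fitness(best, TOY_CASES))
```

On this toy data, a weight vector that ignores the informative feature misclassifies every case, while one that emphasizes it classifies all of them correctly. That gap is what the search over weights exploits. A real evaluation would score fitness on held-out data rather than on the case base itself.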

Author information

Authors and Affiliations

Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, 76100, Melaka, Malaysia
Yogan Jaya Kumar
Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia, 81310, Skudai, Johor, Malaysia
Yogan Jaya Kumar, Naomie Salim & Albaraa Abuobieda

Editor information

Editors and Affiliations

Cairo University, Egypt
Aboul Ella Hassanien & Rabie Ramadan
Ain Shams University, Cairo, Egypt
Abdel-Badeeh M. Salem
University of Tasmania, TAS, Australia
Tai-hoon Kim

About this paper

Cite this paper

Kumar, Y.J., Salim, N., Abuobieda, A. (2012). A Genetic-CBR Approach for Cross-Document Relationship Identification. In: Hassanien, A.E., Salem, AB.M., Ramadan, R., Kim, Th. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2012. Communications in Computer and Information Science, vol 322. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35326-0_19

DOI: https://doi.org/10.1007/978-3-642-35326-0_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35325-3
Online ISBN: 978-3-642-35326-0
eBook Packages: Computer Science, Computer Science (R0)
