A Genetic-CBR Approach for Cross-Document Relationship Identification

  • Conference paper
Advanced Machine Learning Technologies and Applications (AMLTA 2012)

Abstract

Various applications that operate over multiple documents have emerged recently. Information across topically related documents can often be linked. Cross-document Structure Theory (CST) analyzes the relationships that exist between sentences across related documents. However, most existing work relies on human experts to identify the CST relationships. In this work, we aim to automatically identify some of the CST relations using a supervised learning method. We propose a Genetic-CBR approach that incorporates a genetic algorithm (GA) to improve case-based reasoning (CBR) classification. The GA is used to scale the weights of the data features used by the CBR classifier. We perform the experiments using datasets obtained from the CSTBank corpus. Comparison with other learning methods shows that the proposed method yields better results.
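To make the idea of GA-scaled feature weights concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the similarity measure, the GA operators, or the fitness function, so everything here is an assumption: CBR retrieval is approximated by a weighted nearest-neighbour lookup, fitness is validation accuracy, and the GA uses truncation selection, uniform crossover, Gaussian mutation, and elitism. All function names and the toy data are hypothetical.

```python
import random

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5

def cbr_classify(case_base, query, weights):
    """Retrieve the most similar stored case and reuse its label (assumed 1-NN retrieval)."""
    _, label = min(case_base, key=lambda case: weighted_distance(case[0], query, weights))
    return label

def fitness(weights, case_base, validation):
    """Fraction of validation cases classified correctly under these weights."""
    hits = sum(cbr_classify(case_base, x, weights) == y for x, y in validation)
    return hits / len(validation)

def evolve_weights(case_base, validation, n_features,
                   pop_size=20, generations=15, mutation_rate=0.2, seed=0):
    """Search for feature weights in [0, 1] with a simple genetic algorithm."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    population[0] = [1.0] * n_features  # include the unweighted baseline as a candidate
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda w: fitness(w, case_base, validation),
                        reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0][:]]           # elitism: keep the best candidate unchanged
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))  # clamped Gaussian mutation
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    return max(population, key=lambda w: fitness(w, case_base, validation))

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
case_base = [((0.1, 0.9), "A"), ((0.2, 0.1), "A"),
             ((0.8, 0.8), "B"), ((0.9, 0.2), "B")]
validation = [((0.15, 0.5), "A"), ((0.45, 0.8), "A"),
              ((0.85, 0.5), "B"), ((0.7, 0.05), "B")]

uniform = [1.0, 1.0]
best = evolve_weights(case_base, validation, n_features=2)
baseline_accuracy = fitness(uniform, case_base, validation)
evolved_accuracy = fitness(best, case_base, validation)
```

Because the unweighted baseline is seeded into the initial population and the best candidate is carried over unchanged each generation, the evolved weights can never score below the baseline on the validation cases. In practice the fitness would be measured on held-out data to avoid overfitting the weights.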

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kumar, Y.J., Salim, N., Abuobieda, A. (2012). A Genetic-CBR Approach for Cross-Document Relationship Identification. In: Hassanien, A.E., Salem, AB.M., Ramadan, R., Kim, Th. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2012. Communications in Computer and Information Science, vol 322. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35326-0_19

  • DOI: https://doi.org/10.1007/978-3-642-35326-0_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35325-3

  • Online ISBN: 978-3-642-35326-0

  • eBook Packages: Computer Science, Computer Science (R0)
