Lexical Contextual Relations for the Unsupervised Discovery of Texts Features

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 453)

Abstract

This chapter studies the role of lexical contextual relations in systematically extracting the most relevant features that characterize narrative texts. Narrative texts have an inherent structure dictated by the language usage that generates them. We suggest that the relative distance between terms within a text gives sufficient information about its topic structure and its relevant content. We describe a model we developed for identifying the major features of texts collected on given topics. Such text features can be used, for example, to discover important events in psychiatric medical reports and to direct the physician to recurrent and interesting aspects of certain cases. The model exploits the inherent structure of narrative discourse by extracting context-dependent lexical relations. We propose and discuss possible measurements, based on experiments that apply the model to a database of psychiatric evaluation reports. We qualitatively demonstrate that useful text structure and content can be systematically extracted by collocational lexical analysis without the need to encode any supplemental sources of knowledge.
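
To make the idea of distance-based lexical relations concrete, the following is a minimal sketch of windowed co-occurrence counting, one common form of collocational analysis. It is not the authors' model: the function name `windowed_cooccurrence`, the `window` parameter, and the sample sentence are all assumptions chosen for illustration.

```python
from collections import Counter


def windowed_cooccurrence(tokens, window=3):
    """Count unordered pairs of distinct terms that occur within
    `window` positions of each other (hypothetical helper)."""
    counts = Counter()
    for i, left in enumerate(tokens):
        # Look only at the next `window` tokens to the right of position i.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            right = tokens[j]
            if left != right:
                # Sort the pair so (a, b) and (b, a) share one key.
                counts[tuple(sorted((left, right)))] += 1
    return counts


# Invented sample text, tokenized by whitespace for simplicity.
text = "patient reports anxiety and patient reports insomnia"
pairs = windowed_cooccurrence(text.split(), window=2)
```

Pairs that recur within a short distance, such as ("patient", "reports") here, receive higher counts than pairs that appear together only once. A real system would presumably add tokenization, stop-word handling, and a statistical association measure on top of raw counts.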

© 1998 Springer Science+Business Media New York

About this chapter

Cite this chapter

Perrin, P., Petry, F. (1998). Lexical Contextual Relations for the Unsupervised Discovery of Texts Features. In: Liu, H., Motoda, H. (eds) Feature Extraction, Construction and Selection. The Springer International Series in Engineering and Computer Science, vol 453. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5725-8_10

  • DOI: https://doi.org/10.1007/978-1-4615-5725-8_10

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-7622-4

  • Online ISBN: 978-1-4615-5725-8

  • eBook Packages: Springer Book Archive
