Lexical Contextual Relations for the Unsupervised Discovery of Texts Features

Perrin, Patrick; Petry, Fred

doi:10.1007/978-1-4615-5725-8_10



Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 453)

Abstract

This chapter studies the role of lexical contextual relations in systematically extracting the most relevant features that characterize texts in narrative discourse form. Narrative texts have an inherent structure dictated by the language usage that generates them. We suggest that the relative distance between terms within a text gives sufficient information about its topic structure and its relevant content. We describe a model we developed for identifying major features in texts collected about given topics. Text features can be used, for example, to discover important events in psychiatric medical reports and so direct the physician to recurrent and interesting aspects of certain cases. The model exploits the inherent structure of narrative discourse by extracting context-dependent lexical relations. We propose and discuss possible measurements, based on experiments in which the model was applied to a database of psychiatric evaluation reports. We qualitatively demonstrate that useful text structure and content can be systematically extracted by collocational lexical analysis without the need to encode any supplemental sources of knowledge.
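
The idea that the relative distance between terms carries information can be illustrated with a minimal sketch. This is not the authors' model, which is not specified on this page. It is a generic windowed co-occurrence count, and the tokenizer, the `window` parameter, the 1/d weighting, and the function name `collocation_scores` are all assumptions made for illustration. The sample sentence is invented and does not come from the psychiatric report database.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def collocation_scores(text, window=3):
    """Score unordered term pairs by how close they occur to each other.

    Hypothetical weighting: each co-occurrence of two distinct terms at
    distance d (1 <= d <= window) adds 1/d to the pair's score, so that
    nearer terms contribute more than distant ones.
    """
    tokens = tokenize(text)
    scores = defaultdict(float)
    for i, left in enumerate(tokens):
        for d in range(1, window + 1):
            j = i + d
            if j >= len(tokens):
                break  # ran past the end of the text
            right = tokens[j]
            if left != right:
                pair = tuple(sorted((left, right)))
                scores[pair] += 1.0 / d
    return dict(scores)


if __name__ == "__main__":
    # Invented example text, for illustration only.
    sample = "patient reports anxiety. patient denies anxiety at night."
    ranked = sorted(collocation_scores(sample, window=2).items(),
                    key=lambda kv: -kv[1])
    for pair, score in ranked[:3]:
        print(pair, round(score, 2))
```

Pairs that recur at short distances rise to the top of the ranking, which gives a rough sense of how collocational analysis could surface features without any external knowledge source.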



Author information

Authors and Affiliations

Department of Computer Science, Tulane University, New Orleans, LA, 70118, USA
Patrick Perrin & Fred Petry


Editor information

Editors and Affiliations

National University of Singapore, Singapore
Huan Liu
Osaka University, Osaka, Japan
Hiroshi Motoda


About this chapter

Cite this chapter

Perrin, P., Petry, F. (1998). Lexical Contextual Relations for the Unsupervised Discovery of Texts Features. In: Liu, H., Motoda, H. (eds) Feature Extraction, Construction and Selection. The Springer International Series in Engineering and Computer Science, vol 453. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5725-8_10


DOI: https://doi.org/10.1007/978-1-4615-5725-8_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7622-4
Online ISBN: 978-1-4615-5725-8
eBook Packages: Springer Book Archive
