Abstract
This chapter studies the role of lexical contextual relations in systematically extracting the most relevant features that characterize narrative texts. Narrative texts have an inherent structure dictated by the language usage that generates them. We suggest that the relative distance between terms within a text gives sufficient information about its topic structure and its relevant content. We describe a model for identifying the major features of texts collected on given topics. Text features can be used, for example, to discover important events in psychiatric medical reports and thereby direct the physician to recurrent and interesting aspects of particular cases. The model exploits the inherent structure of narrative discourse by extracting context-dependent lexical relations. We propose and discuss possible measurements based on experiments that apply the model to a database of psychiatric evaluation reports. We demonstrate qualitatively that useful text structure and content can be extracted systematically by collocational lexical analysis, without encoding any supplemental sources of knowledge.
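The idea that the relative distance between terms carries information can be illustrated with a minimal sketch. It is not the model described in the chapter: the function name `distance_weighted_pairs`, the `window` parameter, the inverse-distance weighting, and the sample sentence are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict


def distance_weighted_pairs(text, window=4):
    """Score unordered term pairs by summing 1/distance over co-occurrences.

    Hypothetical scoring scheme: two terms that appear d tokens apart
    (d <= window) contribute 1/d to the score of that pair. Sentence
    boundaries are ignored in this sketch.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = defaultdict(float)
    for i, left in enumerate(tokens):
        # Look ahead at most `window` tokens from position i.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            right = tokens[j]
            if left != right:
                pair = tuple(sorted((left, right)))
                scores[pair] += 1.0 / (j - i)
    return dict(scores)


# Invented sample text, not taken from the chapter's data.
sample = "patient reports anxiety. patient reports insomnia and anxiety."
scores = distance_weighted_pairs(sample, window=2)
ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
for pair, score in ranked[:3]:
    print(pair, score)
```

Pairs that recur at short distances rise to the top of the ranking, which is the kind of signal a collocational analysis could use without any external knowledge source. A practical version would at least filter function words and respect sentence boundaries.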
Copyright information
© 1998 Springer Science+Business Media New York
About this chapter
Cite this chapter
Perrin, P., Petry, F. (1998). Lexical Contextual Relations for the Unsupervised Discovery of Texts Features. In: Liu, H., Motoda, H. (eds) Feature Extraction, Construction and Selection. The Springer International Series in Engineering and Computer Science, vol 453. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5725-8_10
DOI: https://doi.org/10.1007/978-1-4615-5725-8_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7622-4
Online ISBN: 978-1-4615-5725-8
eBook Packages: Springer Book Archive