Extracting Information from Short Messages
Much of the information transmitted today takes the form of e-mails or SMS text messages, so extracting information from such short messages is increasingly important. The words in a message can be partitioned into three aspects: the syntactic structure, terms from the domain of discourse, and the data being transmitted. This paper describes a lightweight Information Extraction component that uses pattern matching to separate these three aspects: the structure is supplied as a template; domain terms are the metadata of a data source (or their synonyms); and data is extracted as those words that match placeholders in the templates.
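The three-way split described above can be illustrated with a minimal sketch. It is not the component described in the paper: the template syntax, the placeholder names (`term`, `key`, `value`), the `DOMAIN_TERMS` table and the function names are all assumptions made up for illustration. Literal words in a template stand for the sentence structure, the `{term}` placeholder must match a metadata name or one of its synonyms, and the remaining placeholders capture the data.

```python
import re

# Hypothetical metadata of a data source: canonical names and their synonyms.
DOMAIN_TERMS = {
    "price": {"price", "cost"},
    "location": {"location", "address"},
}

# Hypothetical templates. Literal words give the sentence structure,
# {term} marks a domain term, {key} and {value} mark transmitted data.
TEMPLATES = [
    "the {term} of {key} is {value}",
    "{key} has {term} {value}",
]


def compile_template(template):
    """Turn a template string into a regular expression with named groups."""
    parts = []
    for token in template.split():
        if token.startswith("{") and token.endswith("}"):
            name = token[1:-1]
            # {value} may span several words; other placeholders match one word.
            body = ".+" if name == "value" else r"\S+"
            parts.append(f"(?P<{name}>{body})")
        else:
            parts.append(re.escape(token))
    return re.compile(r"\s+".join(parts))


def canonical_term(word):
    """Map a word to the metadata name it denotes, or None if it is not a domain term."""
    for name, synonyms in DOMAIN_TERMS.items():
        if word == name or word in synonyms:
            return name
    return None


def extract(message):
    """Return the domain term and data found by the first matching template, or None."""
    text = message.strip().rstrip(".!?").lower()
    for template in TEMPLATES:
        match = compile_template(template).fullmatch(text)
        if match is None:
            continue
        term = canonical_term(match.group("term"))
        if term is None:
            continue  # structure matched, but the word is not a known domain term
        return {"term": term, "key": match.group("key"), "value": match.group("value")}
    return None


print(extract("The cost of widget is 12 pounds."))
```

In this sketch the message "The cost of widget is 12 pounds." matches the first template, "cost" resolves to the metadata name "price" through the synonym table, and "widget" and "12 pounds" are returned as the data. A message whose structure matches but whose `{term}` word is unknown is rejected, which keeps the structure, domain terms and data as separate checks.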
Keywords: Pattern Match, Entity Type, Sentence Structure, Short Message, Natural Language Semantic