Abstract
Today’s information overload, particularly on the World Wide Web, makes it difficult for users to reach the right information. The situation is made worse by the fact that much of this information is written in different languages. It is therefore important to apply an information process that extracts from this volume of information only the facts that match users’ interests, and that gives users access to facts written in other languages. Information Extraction (IE) technology can meet these requirements because, unlike information retrieval and filtering technology, IE focuses on specific facts extracted from the documents rather than on the documents themselves. Some documents may contain the requested keywords yet be irrelevant to the users’ interests. Working with specific facts instead of documents gives users information that is more relevant to their domain of interest. In most cases, the IE systems developed so far extract fixed information from documents in a fixed language. However, for IE technology to be truly applicable in real-life applications and meet the above requirements, IE systems need to be easily adaptable (customisable) to new domains and user interests, as well as to multiple languages. During the last decade, substantial progress has been made in developing reliable IE technology. IE technology is currently exploited in real applications, such as the extraction of information on company acquisitions [1],[2],[3], stock exchanges [4], company profits and losses [5], joint ventures and management succession events [6],[7],[8], as well as the understanding of military messages [9] and police reports [10],[11],[12].
References
Cowie J., Wakao T., Jin W., Pustejovsky J. and Waterman S., The diderot information extraction system. In Proceedings of the First Conference of the Pacific Association for Computational Linguistics (PACLING 93). Vancouver, Canada, 1993.
Jacobs P.S. and Rau L.F., Scisor: Extracting information from on-line news. Communications of the ACM, 33(11):88–97, 1990.
Wilks Y. Diderot: a text extraction system. In DARPA Speech and Natural Language Workshop. Morgan Kaufmann, San Mateo, CA, 1991.
Vichot F., Wolinski F., Tomeh J., Guennou S., Dillet B., Aydjian S., High Precision Hypertext Navigation Based on NLP Automatic Extractions, Hypertext, Information Retrieval, Multimedia (HTM′97), Dortmund, Germany, (30): 161–174. October, 1997.
Andersen P.M., Hayes P.J., Huettner A.K., Nirenburg I.B., Schmandt L.M. and Weinstein S.P., Automatic extraction of facts from press releases to generate news stories. In Proceedings of the Third Conference on Applied Natural Language Processing, pages 170–177. ACL, 1992.
ECRAN: Extraction of Content: Research at Near Market, http://www2.echo.lu/langeneg/en/le1/ecran/ecran.html
MUC5, 1993. Proceedings of the Fifth Message Understanding Conference, San Francisco, Calif.: Morgan Kaufmann.
MUC6, 1995. Proceedings of the Sixth Message Understanding Conference. San Francisco, Calif.: Morgan Kaufmann.
DARPA Speech and Natural Language Workshop, Harriman, NY, 1992.
AVENTINUS: Advanced Information System for Multinational Drug Enforcement. http://www2.echo.lu/langeneg/en/lel/aventinus/aventinus.html
Evans R. and Hartley A.F., The traffic information collator. Expert Systems: The International Journal of Knowledge Engineering, 7(4):209–214, 1990.
Gaizauskas R., Evans R., Cahill L.J., Richardson I. and Walker J., Poetic: A system for gathering and disseminating traffic information. In S.G. Ritchie and G.T. Hendrickson, editors, Conference Preprints of the International Conference on Artificial Intelligence Applications in Transportation Engineering, pages 79–98, San Buenaventura, California, 1992.
Gaizauskas, R., Wilks, Y. «Information Extraction beyond Document Retrieval», University of Sheffield, Dept. of Computer Science, CS-97-10, 1997.
Cunningham, H., Wilks, Y., Gaizauskas, R., GATE — a General Architecture for Text Engineering, 16th Conference on Computational Linguistics (COLING′96), 274–279, 1996.
Gazdar G. and Mellish C., Natural Language Processing in Prolog. Addison-Wesley, 1989.
Paliouras G., Karkaletsis V. and Spyropoulos C.D., “Machine Learning for Domain-Adaptive Word Sense Disambiguation”. Proceedings of the LREC Workshop on “Adapting Lexical and Corpus Resources to Sublanguages and Applications”, Granada, Spain, May 26, 1998.
Copyright information
© 1999 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Karkaletsis, V., Spyropoulos, C.D., Petasis, G. (1999). Named Entity Recognition from Greek Texts: The GIE Project. In: Tzafestas, S.G. (ed.) Advances in Intelligent Systems. International Series on Microprocessor-Based and Intelligent Systems Engineering, vol 21. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-4840-5_12
DOI: https://doi.org/10.1007/978-94-011-4840-5_12
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-0393-6
Online ISBN: 978-94-011-4840-5
eBook Packages: Springer Book Archive