SAUText — A System for Analysis of Unstructured Textual Data

Protaziuk, Grzegorz; Lewandowski, Jacek; Bembenik, Robert

doi:10.1007/978-3-319-08326-1_43

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8502)

Abstract

Semantic lexical resources such as ontologies are becoming increasingly important in many systems, in particular those providing access to structured textual data. Such resources are typically built from existing repositories and by analyzing available texts. In practice, however, building new resources of this type, or enriching existing ones, cannot be accomplished without an appropriate tool. In this paper we present SAUText, a new system that provides infrastructure for research involving the use of semantic resources and the analysis of unstructured textual data. The system uses a dedicated repository for storing various kinds of text data and takes advantage of parallelization to speed up the analysis.

This work is supported by the National Centre for Research and Development (NCBiR) under Grant No. SP/I/1/77065/10 by the Strategic scientific research and experimental development program: Interdisciplinary System for Interactive Scientific and Scientific-Technical Information.

Author information

Authors and Affiliations

Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665, Warsaw, Poland
Grzegorz Protaziuk, Jacek Lewandowski & Robert Bembenik

Editor information

Editors and Affiliations

Research Group PLIS: Programming, Logic and Intelligent Systems, Dept. of Communication, Business and Information Technologies, Roskilde University, Denmark
Troels Andreasen & Henning Christiansen
Department of Computer Science and Artificial Intelligence, CITIC, University of Granada, 18071, Granada, Spain
Juan-Carlos Cubero
University of North Carolina, 9201 University City Blvd, Charlotte, NC 28223, USA, and Warsaw University of Technology, ul. Nowowiejska 15/19, 00-665 Warsaw, Poland
Zbigniew W. Raś

About this paper

Cite this paper

Protaziuk, G., Lewandowski, J., Bembenik, R. (2014). SAUText — A System for Analysis of Unstructured Textual Data. In: Andreasen, T., Christiansen, H., Cubero, JC., Raś, Z.W. (eds) Foundations of Intelligent Systems. ISMIS 2014. Lecture Notes in Computer Science (LNAI), vol 8502. Springer, Cham. https://doi.org/10.1007/978-3-319-08326-1_43

DOI: https://doi.org/10.1007/978-3-319-08326-1_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08325-4
Online ISBN: 978-3-319-08326-1
eBook Packages: Computer Science, Computer Science (R0)
