SAUText — A System for Analysis of Unstructured Textual Data

  • Grzegorz Protaziuk
  • Jacek Lewandowski
  • Robert Bembenik
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8502)


Semantic lexical resources such as ontologies are becoming increasingly important in many systems, in particular those providing access to structured textual data. Typically, such resources are built from existing repositories and by analyzing available texts. In practice, however, building new resources of this type, or enriching existing ones, cannot be accomplished without an appropriate tool. In this paper we present SAUText, a new system that provides infrastructure for research involving semantic resources and the analysis of unstructured textual data. The system uses a dedicated repository for storing various kinds of text data and takes advantage of parallelization to speed up the analysis.


Text mining, text analysis system, ontology enrichment





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Grzegorz Protaziuk (1)
  • Jacek Lewandowski (1)
  • Robert Bembenik (1)
  1. Institute of Computer Science, Warsaw University of Technology, Warsaw, Poland
