Abstract
Modeling text data for analysis means giving this kind of unstructured data a dedicated representation. That representation provides a formal description of the text, so that the information it contains can be used effectively. To support analysis of this unstructured data type, we propose the Multidimensional Semantic Model of Text Objects (MSMTO). The model is based on the object paradigm. It introduces a new concept, the Semantic Content Object, which represents and organizes the semantics of text data hierarchically to enable semantic analysis at different levels of granularity. Our modeling approach treats the internal composition of a text document as a structural hierarchy, which lets the user perform analysis at different hierarchical levels. The model is also flexible: the semantic content of text data can be treated as a measure, a fact, or even a dimension.
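The abstract does not spell out how a Semantic Content Object is structured, so the following is only a minimal sketch of the general idea: semantic content attached to the levels of a document's structural hierarchy and rolled up for analysis at a coarser granularity. The class name `TextNode`, its fields, and the use of topic counts as the semantic content are all assumptions made for illustration, not the authors' definitions.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextNode:
    """One level of a document's structural hierarchy (hypothetical class)."""
    label: str  # e.g. "document", "section", "paragraph"
    # Semantic content attached directly at this level; topic counts are
    # an assumed stand-in for whatever the model actually stores.
    topics: Counter = field(default_factory=Counter)
    children: List["TextNode"] = field(default_factory=list)

    def semantic_content(self) -> Counter:
        """Roll up topic counts from this node and all of its descendants."""
        total = Counter(self.topics)
        for child in self.children:
            total.update(child.semantic_content())  # adds counts per topic
        return total


# Two paragraphs inside one section inside one document.
p1 = TextNode("paragraph", Counter({"olap": 2, "text": 1}))
p2 = TextNode("paragraph", Counter({"text": 3}))
sec = TextNode("section", children=[p1, p2])
doc = TextNode("document", Counter({"warehouse": 1}), [sec])

# The same question can be asked at paragraph, section, or document level.
paragraph_view = p1.semantic_content()
section_view = sec.semantic_content()
document_view = doc.semantic_content()
```

In this sketch the rolled-up counts play the role of a measure. Grouping nodes by their dominant topic instead would use the same semantic content as a dimension, which is one way to read the flexibility the abstract describes.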
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Attaf, S., Benblidia, N., Boussaid, O. (2014). The Multidimensional Semantic Model of Text Objects (MSMTO): A Framework for Text Data Analysis. In: Ait Ameur, Y., Bellatreche, L., Papadopoulos, G.A. (eds) Model and Data Engineering. MEDI 2014. Lecture Notes in Computer Science, vol 8748. Springer, Cham. https://doi.org/10.1007/978-3-319-11587-0_12
DOI: https://doi.org/10.1007/978-3-319-11587-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11586-3
Online ISBN: 978-3-319-11587-0
eBook Packages: Computer Science (R0)