Abstract
The standard approach to OLAP requires the measures and dimensions of a cube to be known at the design stage. In addition, dimensions are required to be non-volatile, balanced, and normalized. These constraints are too rigid for many data sets, especially semi-structured ones such as user-generated content in social networks and other web applications. We enrich the multidimensional analysis of such data through content-driven discovery of dimensions and classification hierarchies. The discovered elements are dynamic by nature and evolve along with the underlying data set.
We demonstrate the benefits of our approach by building a data warehouse for the public stream of the popular social network and microblogging service Twitter. Our approach makes it possible to classify users by their activity, popularity, and behavior, and to organize messages by topic, impact, origin, method of generation, etc. Capturing these dynamic characteristics of the data adds more intelligence to the analysis and extends the limits of OLAP.
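To make the idea of a content-driven, dynamic classification concrete, the following is a minimal sketch and not the method used in the paper. It assumes a hypothetical mapping of user names to follower counts and derives a three-level popularity category from the quantiles of the data itself, so the level boundaries shift whenever the underlying data changes. The names `discover_levels`, `followers`, and the labels are illustrative assumptions.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def discover_levels(values, labels=("low", "medium", "high")):
    """Derive category boundaries from the distribution of the data.

    Returns the cut points and a function mapping a value to its label.
    Assumption: equal-frequency (quantile) binning stands in for whatever
    discovery technique a real system would use.
    """
    # With n = len(labels), quantiles() returns len(labels) - 1 cut points.
    cuts = quantiles(values, n=len(labels), method="inclusive")

    def classify(value):
        # bisect_right counts how many cut points are <= value,
        # which is exactly the index of the matching label.
        return labels[bisect_right(cuts, value)]

    return cuts, classify


# Hypothetical sample data: user -> follower count.
followers = {"a": 10, "b": 200, "c": 5000, "d": 40, "e": 120000, "f": 800}

cuts, classify = discover_levels(list(followers.values()))

# The discovered level acts as a roll-up level above the individual user.
popularity = {user: classify(count) for user, count in followers.items()}

# Roll-up: number of users per discovered popularity level.
users_per_level = Counter(popularity.values())
```

Calling `discover_levels` again after new users arrive yields new cut points, which is the sense in which such a hierarchy level evolves with the data instead of being fixed at design time.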
This work was partially supported by DFG Research Training Group GK-1042 “Explorative Analysis and Visualization of Large Information Spaces”, University of Konstanz.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rehman, N.U., Mansmann, S., Weiler, A., Scholl, M.H. (2012). Discovering Dynamic Classification Hierarchies in OLAP Dimensions. In: Chen, L., Felfernig, A., Liu, J., Raś, Z.W. (eds) Foundations of Intelligent Systems. ISMIS 2012. Lecture Notes in Computer Science, vol 7661. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34624-8_48
DOI: https://doi.org/10.1007/978-3-642-34624-8_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34623-1
Online ISBN: 978-3-642-34624-8
eBook Packages: Computer Science, Computer Science (R0)