Towards Information Warehousing: A Case Study for Tweets
In this paper, we introduce the paradigm of information warehousing and propose a generic information warehouse infrastructure for social media that supports the storage and analysis of the massive volumes of information users generate daily on these platforms. We illustrate the implementation of the proposed framework for Twitter with a multidimensional model consisting of fact and dimension tables. The extracted Twitter stream is then clustered with the BSO-CLARA algorithm in order to discover topics. The results are promising, and the information warehouse is expected to be applicable to other types of information, such as scientific articles.
Keywords: Information warehouse; Multidimensional model; Twitter; Tweets clustering
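To make the idea of a multidimensional model with fact and dimension tables more concrete, the following is a minimal sketch of a star schema for tweets using Python's built-in sqlite3 module. It is not the schema defined in the paper: every table name, column name, and sample row (dim_user, dim_time, fact_tweet, and so on) is a hypothetical assumption chosen for illustration only.

```python
import sqlite3

# Hypothetical star schema: all table and column names below are
# illustrative assumptions, not the model defined in the paper.
SCHEMA = """
CREATE TABLE dim_user (user_id INTEGER PRIMARY KEY, screen_name TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, day TEXT);
CREATE TABLE fact_tweet (
    tweet_id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES dim_user(user_id),
    time_id INTEGER REFERENCES dim_time(time_id),
    text TEXT,
    retweet_count INTEGER
);
"""


def build_warehouse():
    """Create an in-memory warehouse and load a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dim_user VALUES (?, ?)",
        [(1, "alice"), (2, "bob")],
    )
    conn.executemany(
        "INSERT INTO dim_time VALUES (?, ?)",
        [(1, "2017-01-01"), (2, "2017-01-02")],
    )
    conn.executemany(
        "INSERT INTO fact_tweet VALUES (?, ?, ?, ?, ?)",
        [
            (10, 1, 1, "hello", 3),
            (11, 1, 2, "data warehouse", 5),
            (12, 2, 2, "tweets", 1),
        ],
    )
    conn.commit()
    return conn


def tweets_per_day(conn):
    """Aggregate the fact table along the time dimension."""
    return conn.execute(
        "SELECT t.day, COUNT(*), SUM(f.retweet_count) "
        "FROM fact_tweet f JOIN dim_time t ON f.time_id = t.time_id "
        "GROUP BY t.day ORDER BY t.day"
    ).fetchall()
```

Calling tweets_per_day(build_warehouse()) joins the fact table to the time dimension and returns one row per day with the tweet count and total retweets. The same pattern extends to other dimensions (for example, grouping by user), which is the kind of analysis a fact-and-dimension layout is meant to support.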