Extracting Dimensions for OLAP on Multidimensional Text Databases

  • Conference paper
Web Information Systems and Mining (WISM 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6988)

Abstract

As the volume of textual information in business systems and on the Internet grows rapidly, there is increasing demand for analyzing structured data and unstructured text together. Online Analytical Processing (OLAP) is effective for analyzing and mining structured data, but it cannot handle unstructured text directly. Having encountered this limitation while working on several information integration and data analysis applications, we address it in this paper. We propose a semi-supervised algorithm that extracts dimensions and their members from textual information, so that large collections of text can be analyzed with OLAP. We use straightforward measures to express the analysis results. Experimental results show that the extraction algorithm is valid and that our approach is highly scalable and flexible.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, C., Wang, X., Peng, Z. (2011). Extracting Dimensions for OLAP on Multidimensional Text Databases. In: Gong, Z., Luo, X., Chen, J., Lei, J., Wang, F.L. (eds) Web Information Systems and Mining. WISM 2011. Lecture Notes in Computer Science, vol 6988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23982-3_34

  • DOI: https://doi.org/10.1007/978-3-642-23982-3_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23981-6

  • Online ISBN: 978-3-642-23982-3

  • eBook Packages: Computer Science, Computer Science (R0)
