Dynamic Table: A Layered and Configurable Storage Structure in the Cloud

  • Conference paper
Web-Age Information Management (WAIM 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7419)

Abstract

Big data brings not only constantly growing data volumes, dynamic and elastic storage demands, and diversified data structures, but also different data features. Alongside traditional dense data, more and more “sparse” data has emerged and now accounts for the majority of massive data sets. Adapting to the characteristics of sparse data without losing sight of the traits of dense data is a challenge. To meet these differentiated storage demands and to express the semantics of absent values properly, we propose a three-layered storage structure named “Dynamic Table” to represent incomplete data. Our approach takes into account the distributed storage requirements of the cloud and aims to support a hybrid row and column layout, which allows users to mix and match the two physical storage formats on demand. In addition, the original semantics of absent values is divided into two parts that receive distinct treatment; specifically, a four-valued logic is introduced. Experiments on synthetic and real-world data sets demonstrate that our approach combines the advantages of columnar storage with the merits of row-oriented storage. Distinguishing the semantics of absent values is necessary to describe the missing values in sparse data sets.
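
The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: column groups stored either row-wise or column-wise on demand, and absent values split into two distinct kinds. The class `HybridTable`, the enum `Absent`, the labels `UNKNOWN` and `INAPPLICABLE`, and the group configuration format are all assumptions made for illustration. They are not the paper's three-layer structure, and the sketch does not implement the four-valued logic.

```python
from enum import Enum


class Absent(Enum):
    """Two hypothetical kinds of absent value (names are assumptions)."""
    UNKNOWN = "unknown"            # attribute applies, but no value was recorded
    INAPPLICABLE = "inapplicable"  # attribute does not apply to this record


class HybridTable:
    """Toy table whose column groups are stored row-wise or column-wise."""

    def __init__(self, groups):
        # groups: {group_name: (layout, [column names])}, layout "row" or "column"
        self.groups = groups
        self.store = {}
        for name, (layout, cols) in groups.items():
            if layout == "row":
                self.store[name] = []                     # one tuple per record
            elif layout == "column":
                self.store[name] = {c: [] for c in cols}  # one list per column
            else:
                raise ValueError(f"unknown layout: {layout}")

    def insert(self, record):
        """Append a record; columns missing from it default to Absent.UNKNOWN."""
        for name, (layout, cols) in self.groups.items():
            values = [record.get(c, Absent.UNKNOWN) for c in cols]
            if layout == "row":
                self.store[name].append(tuple(values))
            else:
                for c, v in zip(cols, values):
                    self.store[name][c].append(v)

    def get(self, row_id, column):
        """Read one cell, whichever layout its column group uses."""
        for name, (layout, cols) in self.groups.items():
            if column in cols:
                if layout == "row":
                    return self.store[name][row_id][cols.index(column)]
                return self.store[name][column][row_id]
        raise KeyError(column)


table = HybridTable({
    "profile": ("row", ["id", "name"]),                  # usually read together
    "optional": ("column", ["height", "maiden_name"]),   # sparse, scanned per column
})
table.insert({"id": 1, "name": "Ann", "height": 170,
              "maiden_name": Absent.INAPPLICABLE})
table.insert({"id": 2, "name": "Bob"})
```

In this sketch a reader of `table.get(1, "height")` can tell that the value is merely unrecorded, whereas `table.get(0, "maiden_name")` reports that the attribute does not apply. A single null marker could not express that difference. A real sparse store would also avoid materialising the absent markers, which this toy version keeps for simplicity.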

This work is supported by the National Science and Technology Major Program for Core Electronic Devices, High-end Generic Chips and Basic Software Project of China under Grant Nos. 2010ZX01042-001-003-05 and 2010ZX01042-002-002-02, and by the Natural Science Foundation of China (NSFC) under Grant Nos. 60973002 and 61170003.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cheng, X. et al. (2012). Dynamic Table: A Layered and Configurable Storage Structure in the Cloud. In: Bao, Z., et al. Web-Age Information Management. WAIM 2012. Lecture Notes in Computer Science, vol 7419. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33050-6_21

  • DOI: https://doi.org/10.1007/978-3-642-33050-6_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33049-0

  • Online ISBN: 978-3-642-33050-6

  • eBook Packages: Computer Science, Computer Science (R0)
