Dynamic Table: A Layered and Configurable Storage Structure in the Cloud

Cheng, Xu; Meng, Biping; Chen, Yuxin; Zhao, Peng; Li, Hongyan; Wang, Tengjiao; Yang, Dongqing

doi:10.1007/978-3-642-33050-6_21

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7419)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

Big data brings not only constantly growing data volumes, dynamic and elastic storage demands, and diversified data structures, but also different data features. Apart from traditional dense data, more and more “sparse” data have emerged and now account for the majority of massive data. Adapting to the characteristics of sparse data without losing sight of the traits of dense data is a challenge. To meet these differentiated storage demands and to give a proper way to express the semantics of absent values, we propose a three-layered storage structure named “Dynamic Table” to represent incomplete data. Our approach takes into account the distributed storage requirements of the cloud and aims to support a hybrid row and column layout, which allows users to mix and match the two physical storage formats on demand. In addition, the original semantics of absent values is divided into two parts with distinct treatments; specifically, a four-valued logic is introduced. Experiments on synthetic and real-world data sets demonstrate that our approach combines the advantages of columnar storage with the merits of row-oriented storage. Distinguishing the semantics of absent values is necessary to describe the missing values in sparse data sets.
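
To make the idea of a four-valued logic over absent values more concrete, the following is a minimal Python sketch. It is not the logic defined in the paper, whose truth tables are not given in the abstract. It assumes that the two kinds of absent value are "unknown" (a value exists but was not recorded) and "inapplicable" (the attribute does not apply to the row), in the spirit of Codd's applicable/inapplicable distinction. The ordering of the truth values, the min/max connectives, and all names (`Truth`, `Absent`, `equals`, `select`) and sample rows are hypothetical choices made for illustration only.

```python
from enum import IntEnum


class Truth(IntEnum):
    # Assumed ordering, chosen so that AND is min and OR is max.
    FALSE = 0
    INAPPLICABLE = 1
    UNKNOWN = 2
    TRUE = 3


class Absent:
    """Marker for an absent cell; `kind` records why the value is absent."""

    def __init__(self, kind):
        self.kind = kind

    def __repr__(self):
        return f"Absent({self.kind.name})"


# Hypothetical markers for the two kinds of absent value.
UNKNOWN_VALUE = Absent(Truth.UNKNOWN)            # value exists but is missing
INAPPLICABLE_VALUE = Absent(Truth.INAPPLICABLE)  # attribute does not apply


def equals(cell, constant):
    """Compare a stored cell with a constant, yielding one of four truth values."""
    if isinstance(cell, Absent):
        return cell.kind
    return Truth.TRUE if cell == constant else Truth.FALSE


def t_and(a, b):
    return Truth(min(a, b))


def t_or(a, b):
    return Truth(max(a, b))


def t_not(a):
    # Negation swaps TRUE and FALSE and leaves both absent outcomes unchanged.
    if a is Truth.TRUE:
        return Truth.FALSE
    if a is Truth.FALSE:
        return Truth.TRUE
    return a


def select(rows, predicate):
    """Return the ids of rows for which the predicate is definitely TRUE."""
    return [row["id"] for row in rows if predicate(row) is Truth.TRUE]


# Made-up sample rows: a book with a known code, a book whose code was
# never recorded, and a shirt for which a book code makes no sense.
rows = [
    {"id": 1, "type": "book", "code": "B-111"},
    {"id": 2, "type": "book", "code": UNKNOWN_VALUE},
    {"id": 3, "type": "shirt", "code": INAPPLICABLE_VALUE},
]

matches = select(rows, lambda row: equals(row["code"], "B-111"))
```

In this sketch a selection keeps only rows whose predicate evaluates to `TRUE`, so rows 2 and 3 are both excluded, but for different, recorded reasons that a single NULL marker could not express.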

This work is supported by the National Science and Technology Major Program for Core Electronic Devices, High-end Generic Chips and Basic Software Project of China under Grant Nos. 2010ZX01042-001-003-05 and 2010ZX01042-002-002-02, and by the Natural Science Foundation of China (NSFC) under Grant Nos. 60973002 and 61170003.

Author information

Authors and Affiliations

Key Laboratory of High Confidence Software Technologies, Peking University, Ministry of Education, China
Xu Cheng, Biping Meng, Yuxin Chen, Peng Zhao, Tengjiao Wang & Dongqing Yang
Key Laboratory of Machine Perception, Peking University, Ministry of Education, China
Hongyan Li
School of Electronics Engineering and Computer Science, Peking University, Beijing, 100871, China
Xu Cheng, Biping Meng, Yuxin Chen, Peng Zhao, Hongyan Li, Tengjiao Wang & Dongqing Yang

Editor information

Editors and Affiliations

School of Computing, National University of Singapore, Singapore
Zhifeng Bao
College of Computer Science and Technology, Zhejiang University, 38 ZheDa Road, 310027, Hangzhou, China
Yunjun Gao
Northeastern University, Shenyang, China
Yu Gu
Heilongjiang University, 150080, Harbin, China
Longjiang Guo
Department of Computer Science, Georgia State University, 34 Peachtree Street, Suite 1413, 30303, Atlanta, GA, USA
Yingshu Li
Renmin University of China, Beijing, China
Jiaheng Lu
School of Computer Science, Hangzhou Dianzi University, Hangzhou, China
Zujie Ren
School of Software, Tsinghua University, Beijing, China
Chaokun Wang
School of Information, Renmin University of China, 100872, Beijing, China
Xiao Zhang

About this paper

Cite this paper

Cheng, X. et al. (2012). Dynamic Table: A Layered and Configurable Storage Structure in the Cloud. In: Bao, Z., et al. Web-Age Information Management. WAIM 2012. Lecture Notes in Computer Science, vol 7419. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33050-6_21

DOI: https://doi.org/10.1007/978-3-642-33050-6_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33049-0
Online ISBN: 978-3-642-33050-6
eBook Packages: Computer Science, Computer Science (R0)
