Efficient Aggregation Query Processing for Large-Scale Multidimensional Data by Combining RDB and KVS

  • Yuya Watari
  • Atsushi Keyaki
  • Jun Miyazaki
  • Masahide Nakamura
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11029)

Abstract

This paper presents a highly efficient aggregation query processing method for large-scale multidimensional data. Recent developments in network technologies have led to the generation of large amounts of multidimensional data, such as sensor data, and aggregation queries play an important role in analyzing such data. Relational databases (RDBs) support efficient aggregation queries through indexes, but they may become a bottleneck as data size increases. A distributed key-value store (D-KVS), on the other hand, scales out well for data insertion throughput, but its limited index support means that querying multidimensional data sometimes requires a full data scan. The proposed method combines an RDB and a D-KVS so that their advantages complement each other. In addition, a novel technique is presented in which data are divided into several subsets called grids, and the aggregated values for each grid are precomputed. This technique improves query processing performance by reducing the amount of scanned data. We evaluated the efficiency of the proposed method by comparing it with current state-of-the-art methods and showed that it outperforms them in terms of both query and insertion performance.
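
The grid idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is an in-memory, two-dimensional toy that assumes a fixed cell size and a SUM aggregate, and the names `GridAggregator`, `insert`, and `range_sum` are hypothetical. Cells that lie entirely inside the query box are answered from the precomputed per-grid sum, and only partially covered cells have their raw rows scanned.

```python
from collections import defaultdict


class GridAggregator:
    """Toy 2-D grid pre-aggregation (hypothetical; no RDB or D-KVS involved)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.rows = defaultdict(list)  # grid id -> raw (x, y, value) rows
        self.sums = defaultdict(int)   # grid id -> precomputed SUM(value)

    def _cell(self, x, y):
        # Map a point to the id of the grid cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, x, y, value):
        cell = self._cell(x, y)
        self.rows[cell].append((x, y, value))
        self.sums[cell] += value  # keep the per-grid aggregate up to date

    def range_sum(self, x_lo, x_hi, y_lo, y_hi):
        """Return (SUM(value), rows scanned) for [x_lo, x_hi) x [y_lo, y_hi)."""
        s = self.cell_size
        total, scanned = 0, 0
        for (cx, cy), rows in self.rows.items():
            cx_lo, cx_hi = cx * s, (cx + 1) * s
            cy_lo, cy_hi = cy * s, (cy + 1) * s
            # Skip cells that do not overlap the query box at all.
            if cx_lo >= x_hi or cx_hi <= x_lo or cy_lo >= y_hi or cy_hi <= y_lo:
                continue
            # Fully covered cell: use the precomputed aggregate, scan nothing.
            if x_lo <= cx_lo and cx_hi <= x_hi and y_lo <= cy_lo and cy_hi <= y_hi:
                total += self.sums[(cx, cy)]
                continue
            # Partially covered cell: fall back to scanning its raw rows.
            for x, y, value in rows:
                scanned += 1
                if x_lo <= x < x_hi and y_lo <= y < y_hi:
                    total += value
        return total, scanned


agg = GridAggregator(cell_size=10)
for x, y, v in [(1, 1, 1), (5, 5, 2), (12, 3, 4), (25, 25, 8), (15, 15, 16)]:
    agg.insert(x, y, v)

aligned = agg.range_sum(0, 20, 0, 20)  # box aligned with grid boundaries
partial = agg.range_sum(0, 13, 0, 10)  # box cuts through one cell
```

In this sketch, a query whose box is aligned with grid boundaries scans no raw rows at all, while a misaligned box scans only the rows in the cells it cuts through. How the paper actually splits storage between the RDB and the D-KVS is not described in the abstract and is not modeled here.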

Keywords

Multidimensional data; Aggregation query; RDB; Distributed KVS

Notes

Acknowledgements

This work was partly supported by JSPS KAKENHI Grant Numbers 15H02701, 16H02908, 17K12684, 18H03242, 18H03342, and ACT-I, JST.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Yuya Watari (1)
  • Atsushi Keyaki (1)
  • Jun Miyazaki (1)
  • Masahide Nakamura (2)

  1. Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo, Japan
  2. Kobe University, Kobe, Japan