Abstract
To reduce the heavy network overhead and computational cost of cross-matching astronomical catalogs in a database cluster, we propose a novel cross-match method based on Roaring Bitmap. First, we store the astronomical catalog data in compressed column-oriented storage to reduce the I/O overhead of accessing fields in the parallel database system. Second, we create a spatial index that maps 2D coordinates to integers. We then use Roaring Bitmap to convert the spatial index into a bitmap index. Finally, each incoming spatial range search of the cross-match is translated into bitmap operations so that it can be processed in batches. Experiments on real large-scale astronomical data show that the proposed method is 4 to 10 times faster than the traditional method while consuming less than 10% of the memory resources.
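The pipeline summarized above (coordinates to integers, integers to a bitmap, range search as bitmap operations) can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper. The 1-degree grid, the helper names `cell_id`, `build_bitmap` and `set_bits`, and the sample coordinates are all assumptions made for illustration, and a plain Python integer is used as an uncompressed bitmap in place of Roaring Bitmap.

```python
# Hypothetical grid resolution: 1-degree cells (an assumption, not from the paper).
N_RA = 360   # cells along right ascension
N_DEC = 180  # cells along declination


def cell_id(ra_deg, dec_deg):
    """Map a 2D sky position (degrees) to one integer cell number."""
    col = min(int(ra_deg % 360.0), N_RA - 1)
    row = min(int(dec_deg + 90.0), N_DEC - 1)
    return row * N_RA + col


def build_bitmap(points):
    """Set bit k when at least one source falls in cell k.

    A Python int stands in for a compressed Roaring Bitmap here.
    """
    bits = 0
    for ra, dec in points:
        bits |= 1 << cell_id(ra, dec)
    return bits


def set_bits(bits):
    """Return the positions of the set bits in increasing order."""
    out = []
    while bits:
        low = bits & -bits              # isolate the lowest set bit
        out.append(low.bit_length() - 1)
        bits ^= low                     # clear it
    return out


# Made-up sample positions (ra, dec) in degrees.
catalog_a = [(10.2, 20.5), (10.7, 20.1), (200.3, -45.6)]
catalog_b = [(10.4, 20.9), (120.0, 5.0)]

# One bitwise AND finds every cell occupied in both catalogs at once.
candidate_cells = set_bits(build_bitmap(catalog_a) & build_bitmap(catalog_b))
print(candidate_cells)
```

The intersection only yields candidate cells. A complete cross-match would also have to examine neighbouring cells and verify the angular separation of each candidate pair, which this sketch omits.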
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61462012, 61562010, U1531246), the Guizhou University Graduate Innovation Fund (Grant No. 2017081), the Innovation Team of the Data Analysis and Cloud Service of Guizhou Province (Grant No. [2015]53), and the Science and Technology Project of the Department of Science and Technology in Guizhou Province (Grant No. LH [2016]7427).
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Zhang, J., Li, H., Chen, M., Dai, Z., Zhu, M. (2019). Accelerating Massive Astronomical Cross-Match Based on Roaring Bitmap over Parallel Database System. In: Silhavy, R. (eds) Software Engineering and Algorithms in Intelligent Systems. CSOC2018 2018. Advances in Intelligent Systems and Computing, vol 763. Springer, Cham. https://doi.org/10.1007/978-3-319-91186-1_39
DOI: https://doi.org/10.1007/978-3-319-91186-1_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91185-4
Online ISBN: 978-3-319-91186-1
eBook Packages: Intelligent Technologies and Robotics (R0)