Abstract
Data exploration and visual analytics systems are of great importance in Open Science scenarios, where less tech-savvy researchers wish to access and visually explore big raw data files (e.g., JSON, CSV) generated by scientific experiments, using commodity hardware and without being overwhelmed by the tedious processes of data loading, indexing, and query optimization. In this work, we present our work on enabling efficient query processing on raw data files for interactive visual exploration scenarios. We introduce a framework, named RawVis, built on top of a lightweight in-memory tile-based index, VALINOR, that is constructed on-the-fly given the first user query over a raw file and adapted based on the user interaction. We evaluate the performance of a prototype implementation against three alternatives and show that our method outperforms them in terms of response time, disk accesses, and memory consumption.
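To make the idea of a tile-based index over a raw file concrete, the following is a minimal sketch and not the VALINOR implementation. It assumes a CSV file with a header row, two numeric axis columns, no quoted fields containing line breaks, and a fixed uniform grid; the class name `TileIndex`, its parameters, and the method `window_query` are hypothetical names chosen for illustration. The index is built in a single pass and keeps only the axis values and the byte offset of each row, so a window query touches the file only for rows that fall inside the requested window. The adaptation of tiles to user interaction mentioned in the abstract is not modeled here.

```python
import csv
from collections import defaultdict


class TileIndex:
    """Hypothetical uniform-grid index: tile -> (x, y, byte offset) of raw CSV rows."""

    def __init__(self, path, x_col, y_col, bounds, tiles_per_axis=4):
        self.path = path
        self.x_min, self.y_min, self.x_max, self.y_max = bounds
        self.n = tiles_per_axis
        self.tiles = defaultdict(list)
        # Single pass over the raw file; binary mode keeps tell()/seek() byte-exact.
        with open(path, "rb") as f:
            header = next(csv.reader([f.readline().decode("utf-8")]))
            xi, yi = header.index(x_col), header.index(y_col)
            while True:
                offset = f.tell()
                line = f.readline()
                if not line:
                    break
                if not line.strip():
                    continue  # skip blank lines
                row = next(csv.reader([line.decode("utf-8")]))
                x, y = float(row[xi]), float(row[yi])
                self.tiles[self._tile_of(x, y)].append((x, y, offset))

    def _axis_slot(self, v, lo, hi):
        # Map a value to a grid slot, clamping out-of-bounds values to edge tiles.
        slot = int((v - lo) / (hi - lo) * self.n)
        return max(0, min(slot, self.n - 1))

    def _tile_of(self, x, y):
        return (self._axis_slot(x, self.x_min, self.x_max),
                self._axis_slot(y, self.y_min, self.y_max))

    def window_query(self, x1, y1, x2, y2):
        """Return the full rows whose (x, y) lies inside the window, bounds inclusive."""
        ix1, iy1 = self._tile_of(x1, y1)
        ix2, iy2 = self._tile_of(x2, y2)
        rows = []
        with open(self.path, "rb") as f:
            for ix in range(ix1, ix2 + 1):
                for iy in range(iy1, iy2 + 1):
                    for x, y, offset in self.tiles.get((ix, iy), []):
                        # Filter on the in-memory axis values before touching the file.
                        if x1 <= x <= x2 and y1 <= y <= y2:
                            f.seek(offset)
                            line = f.readline().decode("utf-8")
                            rows.append(next(csv.reader([line])))
        return rows
```

In this sketch, only tiles overlapping the query window are inspected, and the non-axis attributes of a row are read from disk on demand through its stored offset.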
Notes
- 1. Available at: https://research.yahoo.com.
- 4. The source code is available at https://github.com/Ploigia/RawVis.
Acknowledgments
This research is implemented through the Operational Program “Human Resources Development, Education and Lifelong Learning” and is co-financed by the European Union (European Social Fund) and Greek national funds.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Bikakis, N., Maroulis, S., Papastefanatos, G., Vassiliadis, P. (2018). RawVis: Visual Exploration over Raw Data. In: Benczúr, A., Thalheim, B., Horváth, T. (eds) Advances in Databases and Information Systems. ADBIS 2018. Lecture Notes in Computer Science, vol 11019. Springer, Cham. https://doi.org/10.1007/978-3-319-98398-1_4
DOI: https://doi.org/10.1007/978-3-319-98398-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98397-4
Online ISBN: 978-3-319-98398-1
eBook Packages: Computer Science, Computer Science (R0)