Abstract
As data volumes grow, traditional data processing methods have become inefficient in both time and power. To address this, we propose a new accelerated architecture for querying big databases. The architecture combines the advantages of HDFS for managing huge amounts of data with the fast query processing of Spark SQL. It also benefits from the processing efficiency of FPGA hardware acceleration and from a semantic caching architecture that processes recently used data stored in the cache.
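The semantic caching idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not model the FPGA or Spark SQL: it assumes queries are simple half-open range predicates on one integer key, and the names `SemanticCache`, `fetch`, and `TABLE` are hypothetical. A query is split into a probe part, answered from cached regions, and a remainder part, sent to the backing store (here a plain Python list standing in for Spark SQL over HDFS).

```python
from typing import Callable, List, Tuple

Row = Tuple[int, str]        # (key, payload); hypothetical row layout
Interval = Tuple[int, int]   # half-open key range [lo, hi)


class SemanticCache:
    """Toy semantic cache for range queries on a single integer key."""

    def __init__(self, fetch: Callable[[int, int], List[Row]]) -> None:
        self.fetch = fetch  # assumed backend: returns rows with lo <= key < hi
        # Cached regions are disjoint because only uncovered ranges are added.
        self.regions: List[Tuple[Interval, List[Row]]] = []

    def split(self, lo: int, hi: int) -> Tuple[List[Interval], List[Interval]]:
        """Split [lo, hi) into cached (probe) and uncached (remainder) ranges."""
        probe: List[Interval] = []
        remainder: List[Interval] = []
        cursor = lo
        for (r_lo, r_hi), _ in sorted(self.regions, key=lambda reg: reg[0]):
            if r_hi <= cursor or r_lo >= hi:
                continue  # region does not overlap the unanswered part
            if r_lo > cursor:
                remainder.append((cursor, r_lo))  # gap before this region
            probe.append((max(r_lo, cursor), min(r_hi, hi)))
            cursor = min(r_hi, hi)
        if cursor < hi:
            remainder.append((cursor, hi))  # tail not covered by any region
        return probe, remainder

    def query(self, lo: int, hi: int) -> List[Row]:
        probe, remainder = self.split(lo, hi)
        rows: List[Row] = []
        # Probe query: answer the covered ranges from the cache.
        for p_lo, p_hi in probe:
            for _, cached in self.regions:
                rows.extend(r for r in cached if p_lo <= r[0] < p_hi)
        # Remainder query: fetch only what the cache cannot answer, then cache it.
        for r_lo, r_hi in remainder:
            fetched = self.fetch(r_lo, r_hi)
            self.regions.append(((r_lo, r_hi), fetched))
            rows.extend(fetched)
        return sorted(rows)


# Stand-in for the remote data source; records which ranges were requested.
TABLE: List[Row] = [(k, f"row{k}") for k in range(100)]
calls: List[Interval] = []


def fetch(lo: int, hi: int) -> List[Row]:
    calls.append((lo, hi))
    return [r for r in TABLE if lo <= r[0] < hi]


cache = SemanticCache(fetch)
first = cache.query(10, 20)    # cold cache: the whole range goes to the backend
second = cache.query(15, 30)   # keys 15..19 come from the cache, 20..29 are fetched
```

In this sketch the second query sends only the uncovered range to the backend, which is the saving a semantic cache aims for. A real system would also need a replacement policy and support for richer predicates, neither of which is shown here.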
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Maghzaoui, M., d’Orazio, L., Lallet, J. (2019). Toward FPGA-Based Semantic Caching for Accelerating Data Analysis with Spark and HDFS. In: Kotzinos, D., Laurent, D., Spyratos, N., Tanaka, Y., Taniguchi, Ri. (eds) Information Search, Integration, and Personalization. ISIP 2018. Communications in Computer and Information Science, vol 1040. Springer, Cham. https://doi.org/10.1007/978-3-030-30284-9_7
DOI: https://doi.org/10.1007/978-3-030-30284-9_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30283-2
Online ISBN: 978-3-030-30284-9
eBook Packages: Computer Science, Computer Science (R0)