Toward FPGA-Based Semantic Caching for Accelerating Data Analysis with Spark and HDFS

Maghzaoui, Marouan; d’Orazio, Laurent; Lallet, Julien

doi:10.1007/978-3-030-30284-9_7

Marouan Maghzaoui¹²,
Laurent d’Orazio¹³ &
Julien Lallet¹²

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1040)

Included in the following conference series:

International Workshop on Information Search, Integration, and Personalization

Abstract

As data volumes grow, traditional data processing methods have become inefficient in both time and power. As an enhancement, we propose a new accelerated architecture for querying big databases. This architecture combines the advantages of HDFS for managing huge amounts of data with the fast query processing of Spark SQL. It also benefits from the processing efficiency of FPGA hardware acceleration and from a semantic caching architecture that processes recently used data stored in the cache.
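
The abstract does not describe how the cache decides which part of a query it can answer. As a rough illustration of semantic caching in general, and not of the authors' design, the sketch below splits a range query into a probe part (answerable from cached regions) and a remainder part (to be fetched from the underlying store). The `Range` class, the `split_query` function, the restriction to a single integer attribute, and the assumption that cached regions do not overlap are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Range:
    """Half-open interval [lo, hi) over one integer attribute (hypothetical model)."""
    lo: int
    hi: int

    def intersect(self, other: "Range") -> Optional["Range"]:
        # Overlap of the two intervals, or None if they are disjoint.
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Range(lo, hi) if lo < hi else None

    def subtract(self, other: "Range") -> List["Range"]:
        # Parts of self that are not covered by other (zero, one, or two pieces).
        overlap = self.intersect(other)
        if overlap is None:
            return [self]
        parts = []
        if self.lo < overlap.lo:
            parts.append(Range(self.lo, overlap.lo))
        if overlap.hi < self.hi:
            parts.append(Range(overlap.hi, self.hi))
        return parts


def split_query(query: Range, cached: List[Range]) -> Tuple[List[Range], List[Range]]:
    """Split a range query into (probe, remainder).

    probe:     sub-ranges already held in the cache
    remainder: sub-ranges that still have to be read from the base data
    Assumes the cached regions are pairwise disjoint.
    """
    probe: List[Range] = []
    remainder: List[Range] = [query]
    for region in cached:
        hit = query.intersect(region)
        if hit is not None:
            probe.append(hit)
        # Remove the cached region from every piece still outstanding.
        remainder = [piece for r in remainder for piece in r.subtract(region)]
    return probe, remainder


if __name__ == "__main__":
    cache = [Range(0, 20), Range(30, 40)]
    probe, remainder = split_query(Range(10, 50), cache)
    print("probe:", probe)
    print("remainder:", remainder)
```

In an architecture like the one outlined in the abstract, the probe part would presumably be served from the cache and the remainder by Spark SQL over HDFS; that mapping is an assumption here, since the abstract gives no such detail.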

Author information

Authors and Affiliations

Nokia Bell Labs, Nozay, France
Marouan Maghzaoui & Julien Lallet
Univ Rennes, CNRS, IRISA, Lannion, France
Laurent d’Orazio

Corresponding author

Correspondence to Laurent d’Orazio.

Editor information

Editors and Affiliations

Lab. ETIS UMR 8051, University of Paris-Seine, University of Cergy-Pontoise, ENSEA, CNRS, Cergy-Pontoise, France
Dimitris Kotzinos
Lab. ETIS UMR 8051, University of Paris-Seine, University of Cergy-Pontoise, ENSEA, CNRS, Cergy-Pontoise, France
Dominique Laurent
LRI, University of Paris-Sud, Orsay, France
Nicolas Spyratos
Hokkaido University, Sapporo, Japan
Yuzuru Tanaka
Kyushu University, Fukuoka, Japan
Rin-ichiro Taniguchi

About this paper

Cite this paper

Maghzaoui, M., d’Orazio, L., Lallet, J. (2019). Toward FPGA-Based Semantic Caching for Accelerating Data Analysis with Spark and HDFS. In: Kotzinos, D., Laurent, D., Spyratos, N., Tanaka, Y., Taniguchi, Ri. (eds) Information Search, Integration, and Personalization. ISIP 2018. Communications in Computer and Information Science, vol 1040. Springer, Cham. https://doi.org/10.1007/978-3-030-30284-9_7

DOI: https://doi.org/10.1007/978-3-030-30284-9_7
Published: 24 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30283-2
Online ISBN: 978-3-030-30284-9
eBook Packages: Computer Science, Computer Science (R0)
