Abstract
The massive and ever-growing amounts of data are one of the main reasons, if not the main reason, that today's AI systems exhibit human-grade performance on many tasks. Thanks to enormous quantities of image data, machines can be trained to recognize scenes and steer cars. Large volumes of medical imagery enable machine-provided diagnostics, and sensor data allows us to detect natural disasters before they occur and to prepare for them. These are exciting times, as researchers find new applications for AI at an astonishing pace. There is, however, a concern: how will we handle the ever-growing amounts of data? The consensus is that storage is cheap, yet at this scale it becomes expensive and unsustainable, and the volume of live-streamed data is increasing as well. In other words, we are well advised to reconsider data compression. In this chapter, we introduce traditional compression terminology and techniques before surveying novel approaches proposed by industry and academia. It may sound contradictory, but AI itself may well help us address this problem.
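As a taste of the traditional entropy-coding techniques the chapter surveys, the sketch below builds a Huffman code table in Python. This is a minimal illustration of the general idea, not code from the chapter; the function name and example string are our own.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table for the symbols in `text`."""
    freq = Counter(text)
    # Each heap entry is (frequency, tie-breaker, {symbol: code}); the
    # integer tie-breaker prevents Python from comparing the dicts.
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Repeatedly merge the two least frequent subtrees,
        # prefixing their codes with 0 and 1 respectively.
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

codes = huffman_codes("abracadabra")
encoded = "".join(codes[s] for s in "abracadabra")
print(len(encoded))  # 23 bits, versus 88 bits at 8 bits per character
```

Frequent symbols (here, `a`) receive short codes and rare ones longer codes, which is why the output shrinks from 88 to 23 bits; arithmetic coding and asymmetric numeral systems push the same idea closer to the Shannon entropy limit.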
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this chapter
Falk, E. (2020). AI to Solve the Data Deluge: AI-Based Data Compression. In: Glauner, P., Plugmann, P. (eds) Innovative Technologies for Market Leadership. Future of Business and Finance. Springer, Cham. https://doi.org/10.1007/978-3-030-41309-5_18
Print ISBN: 978-3-030-41308-8
Online ISBN: 978-3-030-41309-5