HDCat: Effectively Identifying Hot Data in Large-Scale I/O Streams with Enhanced Temporal Locality

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9529)

Abstract

Identifying hot data is important for optimizing modern computer systems. For example, identified hot data can be used to extend the lifespan of flash memory. However, it is challenging to identify hot data effectively with low memory consumption and low runtime overhead. This paper proposes a Hot Data Catcher (HDCat), which identifies hot data in large-scale I/O streams by leveraging enhanced temporal locality. HDCat maintains only a hot data queue and a candidate hot data queue, recording the data access pattern by tracking a limited data set, which reduces memory consumption. Furthermore, HDCat adopts a D-bit counter and a recency bit to exploit both the frequency and the recency information contained in the data stream. Additionally, HDCat can significantly reduce the number of conversions between hot data and cold data. Real traces are used to evaluate the proposed approach. Experimental results demonstrate that HDCat significantly outperforms the state-of-the-art Multi-hash algorithm and the two-level LRU algorithm.
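The abstract names the building blocks (a hot data queue, a candidate hot data queue, a D-bit counter, and a recency bit) but does not specify how they interact. The following Python sketch shows one plausible way such pieces could fit together. It is not the authors' algorithm: the class name `HotDataSketch`, the promotion `threshold`, the queue sizes, the second-chance eviction policy, and the counter reset on demotion are all assumptions made for illustration.

```python
from collections import OrderedDict


class HotDataSketch:
    """Illustrative two-queue tracker; every policy detail here is an assumption."""

    def __init__(self, hot_size=2, cand_size=4, d_bits=2, threshold=2):
        self.hot = OrderedDict()    # address -> [counter, recency_bit]
        self.cand = OrderedDict()   # address -> [counter, recency_bit]
        self.hot_size = hot_size
        self.cand_size = cand_size
        self.max_count = (1 << d_bits) - 1  # a D-bit counter saturates here
        self.threshold = threshold          # assumed promotion threshold

    def _touch(self, entry):
        # Frequency: saturating increment. Recency: set the bit.
        entry[0] = min(entry[0] + 1, self.max_count)
        entry[1] = 1

    def _evict(self, queue):
        # Assumed second-chance policy: an entry at the front with its
        # recency bit set has the bit cleared and moves to the back;
        # the first entry found with a cleared bit is removed.
        while True:
            addr, entry = next(iter(queue.items()))
            if entry[1]:
                entry[1] = 0
                queue.move_to_end(addr)
            else:
                del queue[addr]
                return addr

    def _insert_candidate(self, addr, entry):
        if len(self.cand) >= self.cand_size:
            self._evict(self.cand)
        self.cand[addr] = entry

    def access(self, addr):
        """Record one access; return True if addr is classified hot afterwards."""
        if addr in self.hot:
            self._touch(self.hot[addr])
            return True
        if addr in self.cand:
            entry = self.cand[addr]
            self._touch(entry)
            if entry[0] < self.threshold:
                return False
            # Promote the candidate to the hot queue.
            del self.cand[addr]
            if len(self.hot) >= self.hot_size:
                demoted = self._evict(self.hot)
                # Assumed: a demoted address restarts as a cold candidate.
                self._insert_candidate(demoted, [0, 0])
            self.hot[addr] = entry
            return True
        # First sighting: track the address as a candidate only.
        self._insert_candidate(addr, [1, 1])
        return False


if __name__ == "__main__":
    tracker = HotDataSketch()
    for address in ["A", "B", "A", "C", "A", "B"]:
        print(address, tracker.access(address))
```

In this sketch, memory use is bounded by `hot_size + cand_size` entries regardless of the trace length, and an address must be seen `threshold` times while tracked before it is treated as hot, which is one way the two queues could limit churn between the hot and cold classes.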

Acknowledgments

This work is supported by the National Natural Science Foundation (NSF) of China under Grants No. 61572232 and No. 61272073, the key program of the Natural Science Foundation of Guangdong Province (No. S2013020012865), the Open Research Fund of the Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences (CARCH201401), the Fundamental Research Funds for the Central Universities, and the Science and Technology Planning Project of Guangdong Province (No. 2013B090200021).

Author information

Corresponding author

Correspondence to Yuhui Deng.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Chen, J., Deng, Y., Huang, Z. (2015). HDCat: Effectively Identifying Hot Data in Large-Scale I/O Streams with Enhanced Temporal Locality. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9529. Springer, Cham. https://doi.org/10.1007/978-3-319-27122-4_9

  • DOI: https://doi.org/10.1007/978-3-319-27122-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27121-7

  • Online ISBN: 978-3-319-27122-4

  • eBook Packages: Computer Science, Computer Science (R0)
