Abstract
Matrix multiplication is a compute-intensive, memory-demanding, and cache-intensive algorithm. It performs O(N^3) operations, stores O(N^2) elements, and accesses each element O(N) times, where N is the matrix size. Implementations of cache-intensive algorithms can achieve speedups from cache memory behavior if the algorithms frequently reuse data. When the requirements exceed the cache size, already stored blocks are replaced. Cache misses occur when data from a replaced block is needed again. Several cache replacement policies have been proposed to speed up the execution of different programs.
In this paper we analyze and compare the two most commonly implemented cache replacement policies, First-In-First-Out (FIFO) and Least-Recently-Used (LRU). The experimental results show the optimal solutions for the sequential and parallel dense matrix multiplication algorithms. Because the number of operations does not depend on the cache replacement policy, we define and determine the average memory cycles per instruction that the algorithm performs, since this metric most strongly affects performance.
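The difference between the two policies can be illustrated with a minimal sketch. It is not the experimental setup of the paper: it assumes a fully associative cache that holds one matrix element per line, a naive i-j-k loop order, and it counts misses only rather than memory cycles. The names matmul_trace, count_misses_fifo, and count_misses_lru are hypothetical helpers introduced for this illustration.

```python
from collections import OrderedDict, deque


def matmul_trace(n):
    """Yield the elements touched by a naive i-j-k product C = A * B.

    Assumption: each matrix element is its own cache line.
    """
    for i in range(n):
        for j in range(n):
            for k in range(n):
                yield ("A", i, k)
                yield ("B", k, j)
            yield ("C", i, j)


def count_misses_fifo(trace, capacity):
    """Count misses in a fully associative FIFO cache (capacity >= 1)."""
    resident = set()
    order = deque()  # insertion order; hits never change it
    misses = 0
    for addr in trace:
        if addr in resident:
            continue
        misses += 1
        if len(resident) == capacity:
            # Evict the element that entered the cache first.
            resident.discard(order.popleft())
        resident.add(addr)
        order.append(addr)
    return misses


def count_misses_lru(trace, capacity):
    """Count misses in a fully associative LRU cache (capacity >= 1)."""
    cache = OrderedDict()  # least recently used element is first
    misses = 0
    for addr in trace:
        if addr in cache:
            # A hit refreshes the element's recency.
            cache.move_to_end(addr)
            continue
        misses += 1
        if len(cache) == capacity:
            # Evict the least recently used element.
            cache.popitem(last=False)
        cache[addr] = True
    return misses
```

Calling both counters on matmul_trace(n) with the same capacity gives miss counts that can be compared directly. Real caches are set associative and transfer multi-element lines, so these counts only indicate how the two policies treat reuse; they do not reproduce measured performance.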
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Anchev, N., Gusev, M., Ristov, S., Atanasovski, B. (2013). Optimal Cache Replacement Policy for Matrix Multiplication. In: Markovski, S., Gusev, M. (eds) ICT Innovations 2012. ICT Innovations 2012. Advances in Intelligent Systems and Computing, vol 207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37169-1_7
DOI: https://doi.org/10.1007/978-3-642-37169-1_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37168-4
Online ISBN: 978-3-642-37169-1
eBook Packages: Engineering (R0)