A Study of Cache Design in Stream Processor

  • Chiyuan Ma
  • Zhenyu Zhao
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 337)


Stream architecture is a recently developed high-performance processor architecture oriented toward multimedia processing. FT64 is a 64-bit programmable stream processor that aims to exploit the parallelism and locality of applications. In this paper, we first examine the memory access characteristics of FT64 with and without a cache. Second, we propose an improved cache design method. By exploiting the stream data type used by FT64, the improved method avoids loading data from memory when a stream store instruction that fully modifies a cache block misses. Experiments show that performance improves by 20.7% with a normal cache and by 25.8% with the improved cache. Finally, we study the influence of cache capacity and associativity on performance. The results show that better performance is achieved with a small cache and an associativity of 2 or 4.
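
The idea of skipping the memory fetch on a store miss that overwrites a whole cache block can be illustrated with a toy model. The sketch below is not the FT64 design: the class `SetAssocCache`, the flag `skip_full_store_fetch`, the helper `count_fetches`, and all sizes are hypothetical names and values chosen for illustration. It assumes a write-allocate, set-associative cache with LRU replacement, counts only block fetches from memory, and does not model write-backs or timing.

```python
from collections import OrderedDict


class SetAssocCache:
    """Toy write-allocate, set-associative cache with LRU replacement.

    Only block fetches from memory are counted. With
    skip_full_store_fetch=True, a store miss on a block that the store
    overwrites entirely allocates the block without fetching it.
    """

    def __init__(self, num_sets, ways, block_words, skip_full_store_fetch=False):
        self.num_sets = num_sets
        self.ways = ways
        self.block_words = block_words
        self.skip_full_store_fetch = skip_full_store_fetch
        # One OrderedDict per set; keys are tags, ordered from LRU to MRU.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.fetches = 0

    def _access(self, block_addr, needs_fetch):
        cache_set = self.sets[block_addr % self.num_sets]
        tag = block_addr // self.num_sets
        if tag in cache_set:              # hit: refresh LRU position
            cache_set.move_to_end(tag)
            return
        if needs_fetch:                   # miss that must read memory
            self.fetches += 1
        if len(cache_set) >= self.ways:   # evict the least recently used block
            cache_set.popitem(last=False)
        cache_set[tag] = True

    def load(self, word_addr):
        self._access(word_addr // self.block_words, needs_fetch=True)

    def store_stream(self, start_word, length):
        """Store `length` consecutive words starting at `start_word`."""
        end = start_word + length
        first_block = start_word // self.block_words
        last_block = (end - 1) // self.block_words
        for block in range(first_block, last_block + 1):
            block_start = block * self.block_words
            block_end = block_start + self.block_words
            fully_modified = start_word <= block_start and block_end <= end
            needs_fetch = not (self.skip_full_store_fetch and fully_modified)
            self._access(block, needs_fetch)


def count_fetches(skip_full_store_fetch):
    """Fetches for one 64-word stream store starting at word 2 (cold cache)."""
    cache = SetAssocCache(num_sets=4, ways=2, block_words=8,
                          skip_full_store_fetch=skip_full_store_fetch)
    cache.store_stream(start_word=2, length=64)
    return cache.fetches


print(count_fetches(False), count_fetches(True))
```

In this example the store touches nine blocks, of which only the first and last are partially written. The conventional policy fetches all nine, while the variant that skips fully modified blocks fetches only those two. Changing `num_sets` and `ways` gives a rough way to explore capacity and associativity in the same model, though any such numbers say nothing about the measured FT64 results.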


stream processor, FT64, cache, memory access, fully modify





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Chiyuan Ma (1)
  • Zhenyu Zhao (1)
  1. School of Computer, National University of Defense Technology, Changsha, China
