
Exploiting Speculative Thread-Level Parallelism in Data Compression Applications

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2006)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 4382)

Abstract

Although hardware support for Thread-Level Speculation (TLS) can ease the compiler’s task of creating parallel programs by allowing it to create potentially dependent parallel threads, advanced compiler optimization techniques must still be developed and judiciously applied to achieve the desired performance. In this paper, we closely examine two data compression benchmarks, gzip and bzip2, and propose, implement, and evaluate new compiler optimization techniques that eliminate performance bottlenecks in their parallel execution. The proposed techniques (i) remove the critical forwarding path created by synchronizing memory-resident values; (ii) identify and categorize reduction-like variables, whose intermediate results are used within loops, and transform the code to remove the inter-thread data dependences these variables cause; and (iii) transform the program to eliminate stalls caused by variations in thread size. While no previous work has reported significant performance improvement from parallelizing these two benchmarks, we achieve improvements of up to 36% for gzip and 37% for bzip2.




Editor information

George Almási, Călin Caşcaval, Peng Wu


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Wang, S., Zhai, A., Yew, PC. (2007). Exploiting Speculative Thread-Level Parallelism in Data Compression Applications. In: Almási, G., Caşcaval, C., Wu, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2006. Lecture Notes in Computer Science, vol 4382. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72521-3_10


  • Print ISBN: 978-3-540-72520-6

  • Online ISBN: 978-3-540-72521-3
