Abstract
Hardware transactional memory (HTM) is a promising technology to improve the productivity of parallel programming. However, a general agreement has not been reached on the consumability of HTM. User experiences indicate that HTM interface is not straightforward to be adopted by programmers to parallelize existing commercial applications, because of the internal limitation of HTM and the difficulties to identify shared variables hidden in the code. In this paper we demonstrate that, with well-designed encapsulation, HTM can deliver good consumability. Based on the study of a typical commercial application in supply chain simulations - GBSE, we develop a general scheduling engine that encapsulates the HTM interface. With the engine, we can convert the sequential program to multi-threaded model without changing any source code for the simulation logic. The time spent on parallelization is reduced from two months to one week, and the performance is close to the manually tuned counterpart with fine-grained locks.
Chapter PDF
Similar content being viewed by others
References
McDonald, A., Chung, J., Carlstrom, B.D., Minh, C.C., Chafi, H., Kozyrakis, C., Olukotun, K.: Architectural semantics for practical transactional memory. In: Proc. of the 33rd International Symposium on Computer Architecture, pp. 53–65. IEEE, Los Alamitos (2006)
Wang, H., Hou, R., Wang, K.: Hardware transactional memory system for parallel programming. In: Proc. of the 13th Asia-Pacific Computer System Architecture Conference, pp. 1–7. IEEE, Los Alamitos (2008)
Baugh, L., Neelakantam, N., Zilles, C.: Using hardware memory protection to build a high-performance, strongly-atomic hybrid transactional memory. In: Proc. of the 35th International Symposium on Computer Architecture, pp. 115–126. IEEE, Los Alamitos (2008)
Chung, J., Baek, W., Kozyrakis, C.: Fast memory snapshot for concurrent programming without synchronization. In: Proc. of the 23rd International Conference on Supercomputing, pp. 117–125. ACM, New York (2009)
Perumalla, K.: Parallel and distributed simulation: traditional techniques and recent advances. In: Proc. of the 2006 Winter Simulation Conference, pp. 84–95 (2006)
Wang, W., Dong, J., Ding, H., Ren, C., Qiu, M., Lee, Y., Cheng, F.: An introduction on ibm general business simulation environment. In: Proc. of the 2008 Winter Simulation Conference, pp. 2700–2707 (2008)
Chung, J., Chafi, H., Minh, C., McDonald, A., Carlstrom, B., Kozyrakis, C., Olukotun, K.: The common case transactional behavior of multithreaded programs. In: Proc. of the 12th International Symposium on High-Performance Computer Architecture, pp. 166–177. IEEE, Los Alamitos (2006)
Poplawski, A., Nicol, D.: Nops: a conservative parallel simulation engine for ted. In: Proc. of the 12th Workshop on Parallel and Distributed Simulation, pp. 180–187 (1998)
Bohrer, P., Peterson, J., Elnozahy, M., Rajamony, R., Gheith, A., Rochhold, R.: Mambo: a full system simulator for the powerpc architecture. ACM SIGMETRICS Performance Evaluation Review 31(4), 8–12 (2004)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, H., Ge, Y., Wang, Y., Zou, Y. (2010). Productivity and Performance: Improving Consumability of Hardware Transactional Memory through a Real-World Case Study. In: D’Ambra, P., Guarracino, M., Talia, D. (eds) Euro-Par 2010 - Parallel Processing. Euro-Par 2010. Lecture Notes in Computer Science, vol 6272. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15291-7_17
Download citation
DOI: https://doi.org/10.1007/978-3-642-15291-7_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15290-0
Online ISBN: 978-3-642-15291-7
eBook Packages: Computer ScienceComputer Science (R0)