Abstract
Sequential Consistency (SC) is the most intuitive memory model for parallel programs. However, modern architectures aggressively reorder and overlap memory accesses, causing SC violations (SCVs). An SCV is practically always a bug. This paper proposes Dissector, a hardware software combined approach to detect SCVs in a conventional TSO machine. Dissector hardware works by piggybacking information about pending stores with cache coherence messages. Later, it detects if any of those pending stores can cause an SCV cycle. Dissector keeps hardware modifications minimal and simpler by sacrificing some degree of detection accuracy. Dissector recovers the loss in detection accuracy by using a postprocessing software which filters out false positives and extracts detail debugging information. Dissector hardware is lightweight, keeps the cache coherence protocol clean, does not generate any extra messages, and is unaffected by branch mispredictions. Moreover, due to the postprocessing phase, Dissector does not suffer from false positives. This paper presents a detailed design and implementation of Dissector in a conventional TSO machine. Our experiments with different concurrent algorithms, bug kernels, Splash2 and Parsec applications show that Dissector has a better SCV detection ability than a state-of-the-art hardware based approach with much less hardware. Dissector hardware induces a negligible execution overhead of 0.02%. Moreover, with more processors, the overhead remains virtually the same.
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Intel Cilk Plus. http://cilkplus.org/
Adve, S.V., Gharachorloo, K.: Shared memory consistency models: a tutorial. Western Reseach Laboratory-Compaq. Research report 95/7, September 1995
Blundell, C., Martin, M.M., Wenisch, T.F.: Invisifence: performance-transparent memory ordering in conventional multiprocessors. In: ISCA (2009)
Bonomi, F., et al.: An improved construction for counting Bloom filters. In: Annual European Symposium on Algorithms, September 2006
Burckhardt, S., Alur, R., Martin, M.M.K.: Checkfence: checking consistency of concurrent data types on relaxed memory models. In: PLDI (2007)
Burnim, J., Sen, K., Stergiou, C.: Sound and complete monitoring of sequential consistency for relaxed memory models. In: Abdulla, P.A., Leino, K.R.M. (eds.) TACAS 2011. LNCS, vol. 6605, pp. 11–25. Springer, Heidelberg (2011). doi:10.1007/978-3-642-19835-9_3
Burnim, J., Sen, K., Stergiou, C.: Testing concurrent programs on relaxed memory models. In: ISSTA (2011)
Duan, Y., Feng, X., Wang, L., Zhang, C., Yew, P.C.: Detecting and eliminating potential violations of sequential consistency for concurrent C/C++ programs. In: CGO (2009)
Gharachorloo, K., Gibbons, P.B.: Detecting violations of sequential consistency. In: SPAA (1991)
Gharachorloo, K., Gupta, A., Hennessy, J.: Two techniques to enhance the performance of memory consistency models. In: ICPP (1991)
Gniady, C., Falsafi, B., Vijaykumar, T.N.: Is sc + ilp = rc? In: ISCA (1999)
Gopalakrishnan, G., Yang, Y., Sivaraj, H.: QB or Not QB: an efficient execution verification tool for memory orderings. In: Alur, R., Peled, D.A. (eds.) CAV 2004. LNCS, vol. 3114, pp. 401–413. Springer, Heidelberg (2004). doi:10.1007/978-3-540-27813-9_31
Intel: Intel parallel studio. https://software.intel.com/en-us/intel-parallel-studio-xe
Islam, M., Muzahid, A.: Characterizing real world bugs causing sequential consistency violations. In: HotPar, June 2013
Lamport, L.: How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Trans. Comput. 28(9), 690–691 (1979)
Lin, C., Nagarajan, V., Gupta, R., Rajaram, B.: Efficient sequential consistency via conflict ordering. In: ASPLOS (2012)
Lucia, B., Ceze, L., Strauss, K., Qadeer, S., Boehm, H.J.: Conflict exceptions: simplifying concurrent language semantics with precise hardware exceptions for data-races. In: ISCA (2010)
Luk, C.K., Cohn, R., Muth, R., Patil, H., Klauser, A., Lowney, G., Wallace, S., Reddi, V.J., Hazelwood, K.: Pin: building customized program analysis tools with dynamic instrumentation. In: PLDI (2005)
Marino, D., Singh, A., Millstein, T., Musuvathi, M., Narayanasamy, S.: DRFx: a simple and efficient memory model for concurrent programming languages. In: PLDI (2010)
Marino, D., Singh, A., Millstein, T., Musuvathi, M., Narayanasamy, S.: A case for an sc-preserving compiler. In: Proceedings of the 32th ACM SIGPLAN Conference on Programming Language and Implementation, PLDI 2011 (2011)
Musuvathi, M., Qadeer, S., Ball, T., Basler, G., Nainar, P.A., Neamtiu, I.: Finding and reproducing heisenbugs in concurrent programs. In: OSDI (2008)
Muzahid, A., Qi, S., Torrellas, J.: Vulcan: Hardware support for detecting sequential consistency violations dynamically. In: MICRO, December 2012
Muzahid, A., Suárez, D., Qi, S., Torrellas, J.: Sigrace: signature-based data race detection. In: ISCA (2009)
Qian, X., Sahelices, B., Torrellas, J., Qian, D.: Volition: precise and scalable sequential consistency violation detection. In: ASPLOS, March 2013
Renau, J., Fraguela, B., Tuck, J., Liu, W., Prvulovic, M., Ceze, L., Sarangi, S., Sack, P., Strauss, K., Montesinos, P.: SESC Simulator, January 2005. http://sesc.sourceforge.net
Sen, K.: Race directed random testing of concurrent programs. In: PLDI (2008)
Sewell, P., Sarkar, S., Owens, S., Nardelli, F.Z., Myreen, M.O.: X86-tso: A rigorous and usable programmer’s model for x86 multiprocessors. Commun. ACM 53(7), 89–97 (2010)
Shasha, D., Snir, M.: Efficient and correct execution of parallel programs that share memory. ACM Trans. Program. Lang. Syst. 10(2), 282–312 (1988)
Singh, A., Narayanasamy, S., Marino, D., Millstein, T., Musuvathi, M.: End-to-end sequential consistency. In: ISCA (2012)
Weaver, D., Germond, T.: The SPARC Architecture Manual Version 9. Prentice Hall, Englewood Cliffs (1994)
Wenisch, T.F., Ailamaki, A., Falsafi, B., Moshovos, A.: Mechanisms for store-wait-free multiprocessors. In: ISCA (2007)
Yang, Y., Gopalakrishnan, G., Lindstrom, G., Slind, K.: Nemos: a framework for axiomatic and executable specifications of memory consistency models. In: IPDPS (2003)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Islam, M.M., Akram, R., Muzahid, A. (2016). Hardware-Based Sequential Consistency Violation Detection Made Simpler. In: Carretero, J., Garcia-Blas, J., Ko, R., Mueller, P., Nakano, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2016. Lecture Notes in Computer Science(), vol 10048. Springer, Cham. https://doi.org/10.1007/978-3-319-49583-5_2
Download citation
DOI: https://doi.org/10.1007/978-3-319-49583-5_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49582-8
Online ISBN: 978-3-319-49583-5
eBook Packages: Computer ScienceComputer Science (R0)