10 Years of research on debugging concurrent and multicore software: a systematic mapping study

Abstract

Debugging (the process of identifying, localizing and fixing bugs) is a key activity in software development. Due to issues such as non-determinism and the difficulty of reproducing failures, debugging concurrent software is significantly more challenging than debugging sequential software. A number of methods, models and tools for debugging concurrent and multicore software have been proposed, but the body of work partially lacks a common terminology and a more recent view of the problems to solve. This suggests the need for a classification and an up-to-date, comprehensive overview of the area. This paper presents the results of a systematic mapping study in the field of debugging of concurrent and multicore software in the last decade (2005–2014). The study is guided by two objectives: (1) to summarize the recent publication trends and (2) to clarify current research gaps in the field. Through a multi-stage selection process, we identified 145 relevant papers. Based on these, we summarize the publication trend in the field by showing the distribution of publications with respect to year, publication venues, representation of academia and industry, and active research institutes. We also identify research gaps in the field based on attributes such as types of concurrency bugs, types of debugging processes, types of research and research contributions.
The main observations from the study are that during the years 2005–2014: (1) there is no focal conference or venue for publishing papers in this area; hence, a large variety of conference and journal venues (90) are used to publish relevant papers; (2) in terms of publication contribution, academia was more active in this area than industry; (3) most publications in the field address the data race bug; (4) bug identification is the most common stage of debugging addressed by articles in the period; (5) there are six types of research approaches found, with solution proposals being the most common one; and (6) the published papers essentially focus on four different types of contributions, with “methods” being the most common type. We can further conclude that a number of aspects are still not sufficiently covered in the field, most notably (1) bug correction and fixing, in terms of the debugging process; (2) order violation, suspension and starvation, in terms of concurrency bugs; (3) validation and evaluation research, in terms of research type; and (4) metrics, in terms of research contribution. It is clear that the concurrent, parallel and multicore software community needs broader studies in debugging. This systematic mapping study can help direct such efforts.


Notes

  1. http://endnote.com.

  2. http://www.springer.com/computer/theoretical+computer+science/journal/10766.

  3. http://paris.utdallas.edu/iwpd14/index.html.

  4. http://faculty.uoit.ca/bradbury/padtad2012/.

  5. http://eventos.fct.unl.pt/musepat2013/.

  6. http://faculty.uoit.ca/bradbury/sac-musepat2015/.

References

  1. Abbaspour A. S., Hansson, H., Sundmark, D., & Eldh, S. (2015). Towards classification of concurrency bugs based on Observable properties. In Proceedings of the 1st international workshop on complex faults and failures in large software systems. Italy. http://www.es.mdh.se/pdf_publications/3920.pdf.

  2. Adalid, D., Salmern, A., del Mar Gallardo, M., & Merino, P. (2014). Using SPIN for automated debugging of infinite executions of Java programs. Journal of Systems and Software, 90, 61–75.

    Article  Google Scholar 

  3. Agarwal, R., & Stoller, S. D. (2006). Run-time detection of potential deadlocks for programs with locks, semaphores, and condition variables. In Proceedings of the 2006 workshop on parallel and distributed systems: Testing and debugging, pp. 51–60. ACM.

  4. Al-Shabibi, A., Gerlach, S., Hersch, R. D., & Schaeli, B. (2007). A debugger for flow graph based parallel applications. In Proceedings of the 2007 ACM workshop on parallel and distributed systems: Testing and debugging, pp. 14–20. ACM.

  5. Altekar, G., & Stoica, I. (2009). ODR: Output-deterministic replay for multicore debugging. In Proceedings of the ACM SIGOPS 22nd symposium on operating systems principles, pp. 193–206. ACM.

  6. Andersen, D. (2007). Implementing a new application debugging framework for the multi-core age. Scientific Computing, 24(8), 34–35.

    Google Scholar 

  7. Anvik, J., Hiew, L., & Murphy, G. C. (2006). Who should fix this bug? In Proceedings of the 28th international conference on software engineering, pp. 361–370. ACM.

  8. Arulraj, J., Chang, P.-C., Jin, G., & Lu, S. (2013). Production-run software failure diagnosis via hardware performance counters. In ACM SIGARCH Computer Architecture News, Vol. 41, pp. 101–112. ACM.

  9. Ball, T., Burckhardt, S., de Halleux, J., Musuvathi, M., & Qadeer, S. (2009). Deconstructing concurrency heisenbugs. In 31st international conference on software engineering-companion volume, 2009. ICSE-Companion 2009, pp. 403–404. IEEE.

  10. Berger, E. D., Yang, T., Liu, T., & Novark, G. (2009). Grace: Safe multithreaded programming for C/C++. In ACM sigplan notices, Vol. 44, pp. 81–96. ACM.

  11. Bond, M. D., Kulkarni, M., Cao, M., Zhang, M., Fathi Salmi, M., Biswas, S., Sengupta, A., & Huang, J. (2013). Octet: Capturing and controlling cross-thread dependences efficiently. In ACM SIGPLAN notices, Vol. 48, pp. 693–712. ACM.

  12. Brito, M., Felizardo, K. R., Souza, P., & Souza, S. (2010). Concurrent software testing: A systematic review. On testing software and systems: Short papers.

  13. Buttigieg, V., & Briffa, J. A. (2011). Codebook and marker sequence design for synchronization-correcting codes. In 2011 IEEE international symposium on information theory proceedings (ISIT), pp. 1579–1583. IEEE.

  14. Chen, F., Serbanuta, T.-F., & Rosu, G. (2008). jPredictor. In ACM/IEEE 30th international conference on software engineering, 2008. ICSE’08, pp. 221–230. IEEE.

  15. Chen, H. Y. (2005). Analysis of potential deadlock in Java multithreaded object-oriented programs. In 2005 IEEE international conference on systems, man and cybernetics, Vol. 1, pp. 146–150. IEEE.

  16. Chen, J., & MacDonald, S. (2007). Testing concurrent programs using value schedules. In Proceedings of the twenty-second IEEE/ACM international conference on automated software engineering, pp. 313–322. ACM.

  17. Chen, Q., & Wang, L. (2009). An integrated framework for checking concurrency-related programming errors. In Computer software and applications conference, 2009. COMPSAC’09. 33rd annual IEEE international, Vol. 1, pp. 676–679. IEEE.

  18. Chen, S.-Y., Neng, C., Yang, G.-H., Jone, W.-B., & Chen, T.-F. (2012). IMITATOR: A deterministic multicore replay system with refining techniques. In 2012 international symposium on VLSI design, automation, and test (VLSI-DAT), pp. 1–4. IEEE.

  19. Chen, Y., & Chen, H. (2013). Scalable deterministic replay in a parallel full-system emulator. In ACM SIGPLAN notices, Vol. 48, pp. 207–218. ACM.

  20. Chiu, Y.-C., Shieh, C.-K., Huang, T.-C., Liang, T.-Y., & Chu, K.-C. (2011). Data race avoidance and replay scheme for developing and debugging parallel programs on distributed shared memory systems. Parallel Computing, 37(1), 11–25.

    Article  MATH  Google Scholar 

  21. Copty, S., & Ur, S. (2007). Toward automatic concurrent debugging via minimal program mutant generation with aspectJ. Electronic Notes in Theoretical Computer Science, 174(9), 151–165.

    Article  Google Scholar 

  22. Dantas, A., Brasileiro, F., & Cirne, W. (2008). Improving automated testing of multi-threaded software. In 2008 1st international conference on software testing, verification, and validation, pp. 521–524. IEEE.

  23. del Mar Gallardo, M., Martnez, J., Merino, P., & Pimentel, E. (2006). On the evolution of reliability methods for critical software. Journal of Integrated Design and Process Science, 10(4), 55–67.

    Google Scholar 

  24. Desouza, J., Kuhn, B., De Supinski, B. R., Samofalov, V., Zheltov, S., & Bratanov, S. (2005). Automated, scalable debugging of MPI programs with Intel Message Checker. In Proceedings of the second international workshop on software engineering for high performance computing system applications, pp. 78–82. ACM.

  25. Devietti, J., Lucia, B., Ceze, L., & Oskin, M. (2010). DMP: Deterministic shared-memory multiprocessing. IEEE Micro, 30(1), 40–49.

    Article  Google Scholar 

  26. Dinh, M. N., Abramson, D., & Jin, C. (2014). Statistical assertion: A more powerful method for debugging scientific applications. Journal of Computational Science, 5(2), 126–134.

    Article  Google Scholar 

  27. Eichinger, F., Pankratius, V., & Bohm, K. (2014). Data mining for defects in multicore applications: An entropy-based call-graph technique. Concurrency and Computation: Practice and Experience, 26(1), 1–20.

    Article  Google Scholar 

  28. Eichinger, F., Pankratius, V., Gro\(\backslash \)s se, P. W. L., & Bhm, K. (2010). Localizing defects in multithreaded programs by mining dynamic call graphs. In Testing practice and research techniques, pp. 56–71. Springer.

  29. Elmas, T., Sezgin, A., Tasiran, S., & Qadeer, S. (2009). An annotation assistant for interactive debugging of programs with common synchronization idioms. In Proceedings of the 7th workshop on parallel and distributed systems: Testing, analysis, and debugging, p. 10. ACM.

  30. Engström, E., & Runeson, P. (2011). Software product line testing—A systematic mapping study. Information Software Technology, 53(1), 2–13. doi:10.1016/j.infsof.2010.05.011.

    Article  Google Scholar 

  31. Flanagan, C., & Freund, S. N. (2010). Adversarial memory for detecting destructive races. In ACM sigplan notices, Vol. 45, pp. 244–254. ACM.

  32. Flanagan, C., Freund, S. N., Lifshin, M., & Qadeer, S. (2008). Types for atomicity: Static checking and inference for Java. ACM Transactions on Programming Languages and Systems (TOPLAS), 30(4), 20.

    Article  Google Scholar 

  33. Fonseca, P., Li, C., & Rodrigues, R. (2011). Finding complex concurrency bugs in large multi-threaded applications. In Proceedings of the sixth conference on computer systems, pp. 215–228. ACM.

  34. Fonseca, P., Li, C., Singhal, V., & Rodrigues, R. (2010). A study of the internal and external effects of concurrency bugs. In 2010 IEEE/IFIP international conference on dependable systems and networks (DSN), pp. 221–230. IEEE.

  35. Francesca, G., Santone, A., Vaglini, G., & Villani, M. L. (2011). Ant colony optimization for deadlock detection in concurrent systems. In Computer software and applications conference (COMPSAC), 2011 IEEE 35th Annual, pp. 108–117. IEEE.

  36. Gao, Q., Zhang, W., Chen, Z., Zheng, M., & Qin, F. (2011). 2ndstrike: Toward manifesting hidden concurrency typestate bugs. ACM SIGARCH Computer Architecture News, 39(1), 239–250.

    Article  Google Scholar 

  37. Gesbert, L, Hu, Z, Loulergue, F, Matsuzaki, K, & Tesson, J (2010) Systematic development of correct bulk synchronous parallel programs. In 2010 international conference on parallel and distributed computing, applications and technologies (PDCAT), pp. 334–340. IEEE.

  38. Godefroid, P. (1997). Model checking for programming languages using verisoft. In Proceedings of the 24th acm sigplan-sigact symposium on principles of programming languages (popl’97). New York, NY, USA: ACM.

  39. Gottbrath, C. (2006). Eliminating parallel application memory bugs with totalview. In Proceedings of the 2006 ACM/IEEE conference on supercomputing, p. 210. ACM.

  40. Gottschlich, J. E., Pokam, G. A., Pereira, C. L., & Wu, Y. (2013). Concurrent predicates: A debugging technique for every parallel programmer. In Proceedings of the 22nd international conference on parallel architectures and compilation techniques, pp. 331–340. IEEE Press.

  41. Gupta, S., Sultan, F., Cadambi, S., Ivancic, F., & Rotteler, M. (2009). Using hardware transactional memory for data race detection. In IEEE international symposium on parallel & distributed processing, 2009. IPDPS 2009. pp. 1–11. IEEE.

  42. Ha, O.-K., Kuh, I.-B., Tchamgoue, G. M., & Jun, Y.-K. (2012). On-the-fly detection of data races in OpenMP programs. In Proceedings of the 2012 workshop on parallel and distributed systems: Testing, analysis, and debugging, pp. 1–10. ACM.

  43. Hilbrich, T., Protze, J., Schulz, M., de Supinski, B. R., & Mller, M. S. (2012). MPI runtime error detection with MUST: Advances in deadlock detection. In Proceedings of the international conference on high performance computing, networking, storage and analysis, 30. IEEE Computer Society Press.

  44. Hong, S., & Kim, M. (2013). Effective pattern-driven concurrency bug detection for operating systems. Journal of Systems and Software, 86(2), 377–388.

    Article  Google Scholar 

  45. Hower, D. R., & Hill, M. D. (2008). Rerun: Exploiting episodes for lightweight memory race recording. In ACM SIGARCH computer architecture news, Vol. 36, pp. 265–276. IEEE computer society.

  46. Huang, J., Meredith, P. O., & Rosu, G. (2014). Maximal sound predictive race detection with control flow abstraction. In Proceedings of the 35th ACM SIGPLAN conference on programming language design and implementation, p. 36. ACM.

  47. Huang, J., & Bond, M. D. (2013). Efficient context sensitivity for dynamic analyses via calling context uptrees and customized memory management. In Proceedings of the 2013 ACM SIGPLAN international conference on object oriented programming systems languages & #38; applications. OOPSLA ’13, pp. 53–72. New York, NY, USA: ACM. ISBN 978-1-4503-2374-1.

  48. Huang, R., Halberg, E., & Suh, G. E. (2013). Non-race concurrency bug detection through order-sensitive critical sections. In ACM SIGARCH computer architecture news, Vol. 41, pp. 655–666. ACM.

  49. Jalali, S., & Wohlin, C. (2012). Systematic literature studies: Database searches versus backward snowballing. In 2012 ACM-IEEE International symposium on empirical software engineering and measurement (ESEM), pp. 29–38, doi:10.1145/2372251.2372257.

  50. Jannesari, A., & Tichy, W. F. (2008). On-the-fly race detection in multi-threaded programs. In Proceedings of the 6th workshop on parallel and distributed systems: Testing, analysis, and debugging, p. 6. ACM.

  51. Jannesari, A., & Tichy, W. F. (2014). Library-independent data race detection. Parallel and Distributed Systems, IEEE Transactions on, 25(10), 2606–2616.

    Article  Google Scholar 

  52. Jin, G., Song, L., Zhang, W., Shan, L., & Liblit, B. (2011). Automated atomicity-violation fixing. ACM SIGPLAN Notices, 46(6), 389–400.

    Article  Google Scholar 

  53. Joshi, P., & Sen, K. (2008). Predictive typestate checking of multithreaded java programs. In Proceedings of the 2008 23rd IEEE/ACM international conference on automated software engineering, pp. 288–296. IEEE computer society.

  54. Jyoti, A., & Arora, V. (2014). Debugging and visualization techniques for multithreaded programs: A survey. In Recent advances and innovations in engineering (ICRAIE), 2014, pp. 1–6. IEEE.

  55. Kahlon, V. (2012). Automatic lock insertion in concurrent programs. In Formal methods in computer-aided design (FMCAD), 2012, pp. 16–23. IEEE.

  56. Kahlon, V., Sankaranarayanan, S., & Gupta, A. (2013). Static analysis for concurrent programs with applications to data race detection. International Journal on Software Tools for Technology Transfer, 15(4), 321–336.

    Article  Google Scholar 

  57. Kahlon, V., Yang, Y., Sankaranarayanan, S., & Gupta, A. (2007). Fast and accurate static data-race detection for concurrent programs. In Computer aided verification, pp. 226–239. Springer.

  58. Kang, M.-S., Ha, O.-K., & Jun, Y.-K. (2014). Visualization tool for debugging data races in structured fork-join parallel programs. International Journal of Software Engineering and Its Applications, 8(4), 157–168.

    Google Scholar 

  59. Kasikci, B., Zamfir, C., & Candea, G. (2013). RaceMob: Crowdsourced data race detection. In Proceedings of the twenty-fourth ACM symposium on operating systems principles, pp. 406–422. ACM.

  60. Keele, S. (2007). Guidelines for performing systematic literature reviews in software engineering, Technical report, Technical report, EBSE Technical Report EBSE-2007-01.

  61. Kelly, T., Wang, Y., Lafortune, S., & Mahlke, S. (2009). Eliminating concurrency bugs with control engineering. IEEE Computer, 42(11), 52–60.

    Article  Google Scholar 

  62. Khoshavi, N., Zarandi, H.R., & Maghsoudloo, M. (2012). Two control-flow error recovery methods for multithreaded programs running on multi-core processors. In Microelectronics (MIEL), 2012 28th international conference on, pp. 371–374. IEEE.

  63. Kiefer, K. E., & Moser, L. E. (2013). Replay debugging of non-deterministic executions in the Kernel-based virtual machine. Software: Practice and Experience, 43(11), 1261–1281.

    Google Scholar 

  64. Kim, B.-C., & Jun, Y.-K. (2010). Program visualization for debugging deadlocks in multithreaded programs. In Advances in software engineering, pp. 228–236. Springer.

  65. Kim, Y.-J., Lim, J.-S., & Jun, Y.-K. (2007a). Scalable thread visualization for debugging data races in OpenMP programs. In Advances in grid and pervasive computing, pp. 310–321. Springer.

  66. Kim, Y.-J., Kang, M.-H., Ha, O.-K., & Jun, Y.-K. (2007b). Efficient race verification for debugging programs with openMP directives. In Parallel computing technologies, pp. 230–239. Springer.

  67. Kistler, Michael, & Brokenshire, Daniel. (2011). Detecting race conditions in asynchronous DMA operations with full system simulation. In Performance analysis of systems and software (ISPASS), 2011 IEEE international symposium on, pp. 207–215. IEEE.

  68. Li, H., Luo, J., & Li, W. (2014). A formal semantics for debugging synchronous message passing-based concurrent programs. Science China Information Sciences, 57(12), 1–18.

    MATH  Google Scholar 

  69. Liu, P., & Zhang, C. (2012). Axis: Automatically fixing atomicity violations through solving control constraints. In Proceedings of the 34th international conference on software engineering, pp. 299–309. IEEE Press.

  70. Lonnberg, J., Ben-Ari, Mordechai, & Malmi, Lauri. (2011). Visualising concurrent programs with dynamic dependence graphs. In Visualizing software for understanding and analysis (VISSOFT), 2011 6th IEEE international workshop on, pp. 1–4. IEEE.

  71. Lu, K., Zhou, X., Wang, X., Zhang, W., & Li, G. (2013). RaceFree: An efficient multi-threading model for determinism. In ACM SIGPLAN notices, Vol. 48, pp. 297–298. ACM.

  72. Lu, L., Ji, W., & Scott, M. L. (2014). Dynamic enforcement of determinism in a parallel scripting language. In Proceedings of the 35th ACM SIGPLAN conference on programming language design and implementation, p. 53. ACM.

  73. Lu, S., Jiang, W., & Zhou, Y. (2007). A study of interleaving coverage criteria. In The 6th joint meeting on European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering: companion papers, pp. 533–536. ACM.

  74. Lu, S., Park, S., Seo, E., & Zhou, Y. (2008). Learning from mistakes: a comprehensive study on real world concurrency bug characteristics. In ACM Sigplan Notices, Vol. 43, pp. 329–339. ACM.

  75. Lucia, B., & Ceze, L. (2009). Finding concurrency bugs with context-aware communication graphs. In Proceedings of the 42nd annual IEEE/ACM international symposium on microarchitecture, pp. 553–563. ACM.

  76. Lucia, B., Ceze, L., & Strauss, K. (2010). ColorSafe: Architectural support for debugging and dynamically avoiding multi-variable atomicity violations. In ACM SIGARCH computer architecture news, Vol. 38, pp. 222–233. ACM.

  77. Ma, H., Chen, Q., Wang, L., Liao, C., & Quinlan, D. (2012). An OpenMP analyzer for detecting concurrency errors. In Parallel processing workshops (ICPPW), 2012 41st international conference on, pp. 590–591. IEEE.

  78. Machado, N., Romano, P., & Rodrigues, L. (2012). Lightweight cooperative logging for fault replication in concurrent programs. In 2012 42nd annual IEEE/IFIP international conference on dependable systems and networks (DSN), pp. 1–12. IEEE.

  79. Maiya, P., Kanade, A., & Majumdar, R. (2014). Race detection for Android applications. In Proceedings of the 35th ACM SIGPLAN conference on programming language design and implementation, p. 34. ACM.

  80. Makela, J.-M., Leppanen, V., & Forsell, M. (2013). Towards a parallel debugging framework for the massively multi-threaded, step-synchronous REPLICA architecture. In Proceedings of the 14th international conference on computer systems and technologies, pp. 153–160. ACM.

  81. Martin, J.-P., Hicks, M., Costa, M., Akritidis, P., & Castro, M. (2010). Dynamically checking ownership policies in concurrent C/C++ programs. In ACM Sigplan Notices, Vol. 45, pp. 457–470. ACM.

  82. Moiseev, M., Glukhikh, M., Zakharov, A., & Richter, H. (2013). A static analysis approach to data race detection in systemic designs. In 2013 IEEE 16th international symposium on design and diagnostics of electronic circuits & systems (DDECS), pp. 54–59. IEEE.

  83. Montesinos, P., Ceze, L., & Torrellas, J. (2008). Delorean: Recording and deterministically replaying shared-memory multiprocessor execution ef? ciently. In 35th International symposium on computer architecture, 2008. ISCA’08, pp. 289–300. IEEE.

  84. Mozaffari-Kermani, M., Azarderakhsh, R., Lee, C.-Y., & Bayat-Sarmadi, S. (2014). Reliable concurrent error detection architectures for extended euclidean-based division over gf(m2). Very large scale integration (VLSI) systems. IEEE Transactions on, 22(5), 995–1003.

    Google Scholar 

  85. Nanz, S., Torshizi, F., Pedroni, M., & Meyer, B. (2013). Design of an empirical study for comparing the usability of concurrent programming languages. Information and Software Technology, 55(7), 1304–1315.

    Article  Google Scholar 

  86. Negishi, Y., Murata, H., Cong, G., Wen, H.-F., Chung, I. et al. (2012). A static analysis tool using a three-step approach for data races in HPC programs. In Proceedings of the 2012 workshop on parallel and distributed systems: Testing, analysis, and debugging, pp. 11–17. ACM.

  87. Oßner, C., & Böhm, K. (2013). Graphs for mining-based defect localization in multithreaded programs. International Journal of Parallel Programming, 41(4), 570–593.

    Article  Google Scholar 

  88. Park, C.-S., & Sen, K. (2008). Randomized active atomicity violation detection in concurrent programs. In Proceedings of the 16th ACM SIGSOFT international symposium on foundations of software engineering, pp. 135–145. ACM.

  89. Park, C.-S., & Sen, K. (2012). Concurrent breakpoints. In ACM SIGPLAN notices, Vol. 47, pp. 331–332. ACM.

  90. Park, H.-D., & Jun, Y.-K. (2012). Detecting first races in shared-memory parallel programs with random synchronization. In Computer applications for graphics, grid computing, and industrial environment, pp. 165–169. Springer.

  91. Park, M.-Y., & Chung, S.-H. (2008). Detection of first races for debugging message-passing programs. In Computer and information technology, 2008. CIT 2008. 8th IEEE international conference on, pp. 261–266. IEEE.

  92. Park, M.-Y., Kim, S. Y., & Park, H.-R. (2007). Visualization of affect-relations of message races for debugging MPI programs. In IEEE international conference on granular computing, 2007. GRC 2007. pp. 745–745. IEEE.

  93. Park, S. (2013). Debugging non-deadlock concurrency bugs. In Proceedings of the 2013 international symposium on software testing and analysis, pp. 358–361. ACM.

  94. Park, S., Vuduc, R., & Harrold, M. J. (2012). A unified approach for localizing non-deadlock concurrency bugs. In 2012 IEEE Fifth International Conference on software testing, verification and validation (ICST), pp. 51–60. IEEE.

  95. Petersen, K., Feldt, R., Mujtaba, S., & Mattsson, M. (2008). Systematic mapping studies in software engineering. In 12th International conference on evaluation and assessment in software engineering, Vol. 17.

  96. Prvulovic, M. (2006). CORD: Cost-effective (and nearly overhead-free) order-recording and data race detection. In The twelfth international symposium on high-performance computer architecture, 2006, pp. 232–243. IEEE.

  97. Pun, K. I., Steffen, M., & Stolz, V. (2014). Deadlock checking by data race detection. Journal of Logical and Algebraic Methods in Programming, 83(5), 400–426.

    MathSciNet  Article  MATH  Google Scholar 

  98. Qi, S., Otsuki, N., Nogueira, L. O., Muzahid, A., & Torrellas, J. (2012). Pacman: Tolerating asymmetric data races with unintrusive hardware. In 2012 IEEE 18th International symposium on high performance computer architecture, pp. 1–12. IEEE.

  99. Qi, Y., Das, R., Luo, Z. D., & Trotter, M. (2009). Multicoresdk: A practical and efficient data race detector for real-world applications. In Proceedings of the 7th workshop on parallel and distributed systems: Testing, analysis, and debugging, p. 5. ACM.

  100. Raychev, V., Vechev, M., & Sridharan, M. (2013). Effective race detection for event-driven programs. In ACM SIGPLAN notices, Vol. 48, pp. 151–166. ACM.

  101. Rister, B. D., Campbell, J., Pillai, P., & Mowry, T. C. (2007). Integrated debugging of large modular robot ensembles. In 2007 IEEE international conference on robotics and automation, pp. 2227–2234. IEEE.

  102. Rossi, D., Omaa, M., Garrammone, G., Metra, C., Jas, A., & Galivanche, R. (2013). Low cost concurrent error detection strategy for the control logic of high performance microprocessors and its application to the instruction decoder. Journal of Electronic Testing, 29(3), 401–413.

    Article  Google Scholar 

  103. Sack, P., Bliss, B. E., Ma, Z., Petersen, P., & Torrellas, J. (2006). Accurate and efficient filtering for the intel thread checker race detector. In Proceedings of the 1st workshop on Architectural and system support for improving software dependability, pp. 34–41. ACM.

  104. Sadowski, C., & Yi, J. (2009). Tiddle: A trace description language for generating concurrent benchmarks to test dynamic analyses. In Proceedings of the seventh international workshop on dynamic analysis, pp. 15–21. ACM.

  105. Said, M., Wang, C., Yang, Z., & Sakallah, K. (2011). Generating data race witnesses by an SMT-based analysis. In NASA formal methods, pp. 313–327. Springer.

  106. Schaeli, B., & Hersch, R. D. (2008). Dynamic testing of flow graph based parallel applications. In Proceedings of the 6th workshop on parallel and distributed systems: Testing, analysis, and debugging, p. 2. ACM.

  107. Schneider, J. (2014). Tracking down root causes of defects in simulink models. In Proceedings of the 29th ACM/IEEE international conference on Automated software engineering. ACM.

  108. Schuppan, V., Baur, M., & Biere, A. (2005). JVM independent replay in Java. Electronic Notes in Theoretical Computer Science, 113, 85–104. doi:10.1016/j.entcs.2004.01.032.

    Article  Google Scholar 

  109. Sen, K. (2008). Race directed random testing of concurrent programs. In ACM SIGPLAN notices, Vol. 43, pp. 11–21. ACM.

  110. Serebryany, K., & Iskhodzhanov, T. (2009). ThreadSanitizer: Data race detection in practice. In Proceedings of the workshop on binary instrumentation and applications, pp. 62–71. ACM.

  111. Shimomura, T., & Ikeda, K. (2013). Waiting blocked-tree type deadlock detection. In Science and information conference (SAI), 2013, pp. 45–50. IEEE.

  112. Shousha, M., Briand, L., & Labiche, Y. (2012). A uml/marte model analysis method for uncovering scenarios leading to starvation and deadlocks in concurrent systems. Software Engineering, IEEE Transactions on, 38(2), 354–374.

    Article  Google Scholar 

  113. Shousha, M., B., Lionel C., & Labiche, Y. (2009). A uml/marte model analysis method for detection of data races in concurrent systems. In Model driven engineering languages and systems, pp. 47–61. Springer.

  114. Song, Y. W., & Lee, Y.-H. (2014). Efficient data race detection for C/C++ programs using dynamic granularity. In Parallel and distributed processing symposium, 2014 IEEE 28th international, pp. 679–688. IEEE.

  115. Tallam, S., Tian, C., & Gupta, R. (2008). Dynamic slicing of multithreaded programs for race detection. In IEEE International conference on software maintenance, 2008. ICSM 2008, pp. 97–106. IEEE.

  116. Tan, L., Feng, M., & Gupta, R. (2013). Lightweight fault detection in parallelized programs. In 2013 IEEE/ACM International symposium on code generation and optimization (CGO), pp. 1–11. IEEE.

  117. Tchamgoue, G.M., Gan, L., Ha, O.-K., Yang, S.-W., & Jun, Y.-K. (2012). Visualizing concurrency faults in ARINC-653 real-time applications. In Digital avionics systems conference (DASC), 2012 IEEE/AIAA 31st, pp. 9–61. IEEE.

  118. Tchamgoue, G. M., Kuh, I.-B., Ha, O.-K., Kim, K.-H., & Jun, Y.-K. (2010). A race healing framework in simulated ARINC-653. In Communication and networking, pp. 238–246. Springer.

  119. Teixeira, B., Loureno, J., Farchi, E., Dias, R., & Sousa, D. (2010). Detection of transactional memory anomalies using static analysis. In Proceedings of the 8th workshop on parallel and distributed systems: Testing, analysis, and debugging, pp. 26–36. ACM.

  120. Tian, C., Nagarajan, V., Gupta, R., & Tallam, S. (2009). Automated dynamic detection of busywait synchronizations. Software: Practice and Experience, 39(11), 947–972.

    Google Scholar 

  121. Torrellas, J., Ceze, L., Tuck, J., Cascaval, C., Montesinos, P., Ahn, W., et al. (2009). The Bulk Multicore architecture for improved programmability. Communications of the ACM, 52(12), 58–65.

    Article  Google Scholar 

  122. Trainin, E., Nir-Buchbinder, Y., Tzoref-Brill, R., Zlotnick, A., Ur, S., & Farchi, E. (2009). Forcing small models of conditions on program interleaving for detection of concurrent bugs. In Proceedings of the 7th workshop on parallel and distributed systems: Testing, analysis, and debugging, p. 7. ACM.

  123. Tzoref, R., Ur, S., & Yom-Tov, E., (2007). Instrumenting where it hurts: An automatic concurrent debugging technique. In Proceedings of the 2007 international symposium on software testing and analysis, pp. 27–38. ACM.

  124. Uhrig, S. (2011). Tracing static fields of embedded parallel Java applications. In Computer software and applications conference workshops (COMPSACW), 2011 IEEE 35th annual, pp. 516–519. IEEE.

  125. Vasudevan, N., Edwards, S. A., & Singh, S., (2008). A deterministic multi-way rendezvous library for Haskell. In IPDPS 2008. IEEE international symposium on parallel and distributed processing, 2008, pp. 1–12. IEEE.

  126. Veeraraghavan, K., Chen, P. M., Flinn, J., & Narayanasamy, S. (2011). Detecting and surviving data races using complementary schedules. In Proceedings of the twenty-third ACM symposium on operating systems principles, pp. 369–384. ACM.

  127. Viennot, N., Nair, S., & Nieh, J. (2013). Transparent mutable replay for multicore debugging and patch validation. In ACM SIGARCH computer architecture news, Vol. 41, pp. 127–138. ACM.

  128. Vo, A., & Gopalakrishnan, G. (2010). Scalable verification of MPI programs. In2010 IEEE international symposium on parallel & distributed processing, workshops and Phd forum (IPDPSW), pp. 1–4. IEEE.

  129. Wang, J.-Y., Shue, Y.-S., & Bagchi, S. (2007). Pesticide: Using SMT to improve performance of pointer-bug detection. In International conference on computer design, 2006. ICCD 2006, pp. 514–521. IEEE.

  130. Wang, L., & Stoller, S. D. (2006a). Accurate and efficient runtime detection of atomicity errors in concurrent programs. In Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, pp. 137–146. ACM.

  131. Wang, L., & Stoller, S. D. (2006b). Runtime analysis of atomicity for multithreaded programs. Software Engineering, IEEE Transactions on, 32(2), 93–110.

  132. Wang, N., Han, J., & Fang, J. (2012a). A transparent control-flow based approach to record-replay non-deterministic bugs. In 2012 IEEE 7th international conference on networking, architecture and storage (NAS), pp. 189–198. IEEE.

  133. Wang, P., Zhang, X., Hao, P., & Zhang, Y. (2012b). Towards the multithreaded deterministic replay in program debugging. In 2012 8th international conference on information science and digital content technology (ICIDT), Vol. 1, pp. 139–144. IEEE.

  134. Wang, T., Shen, L., & Ma, C. (2014). A process algebra-based detection model for multithreaded programs in communication system. KSII Transactions on Internet and Information Systems (TIIS), 8(3), 965–983.

  135. Wang, W., & Fang, B. (2005). Replaying message-passing programs with an efficient logical clock. WSEAS Transactions on Computers, 4(7), 750–757.

  136. Wang, W., Wang, Z., Wu, C., Yew, P.-C., Shen, X., Yuan, X., Li, J., Feng, X., & Guan, Y. (2014). Localization of concurrency bugs using shared memory access pairs. In Proceedings of the 29th ACM/IEEE international conference on Automated software engineering, pp. 611–622. ACM.

  137. Wang, Y., Liu, P., Kelly, T., Lafortune, S., Reveliotis, S. A., & Zhang, C. (2012). On atomicity enforcement in concurrent software via discrete event systems theory. In CDC, pp. 7230–7237.

  138. Watson, G. R., Rasmussen, C. E., & Tibbitts, B. R. (2009). An integrated approach to improving the parallel application development process. In IEEE International Symposium on parallel & distributed processing, 2009. IPDPS 2009, pp. 1–8. IEEE.

  139. Weeratunge, D., Zhang, X., & Jagannathan, S. (2010a). Analyzing multicore dumps to facilitate concurrency bug reproduction. ACM SIGPLAN Notices, 45(3), 155–166.

  140. Weeratunge, D., Zhang, X., Sumner, W. N., & Jagannathan, S. (2010b). Analyzing concurrency bugs using dual slicing. In Proceedings of the 19th international symposium on Software testing and analysis, pp. 253–264. ACM.

  141. Wen, C.-N., Chou, S.-H., & Chen, T.-F. (2009). dIP: A non-intrusive debugging IP for dynamic data race detection in many-core. In 2009 10th International symposium on pervasive systems, algorithms, and networks (ISPAN), pp. 86–91. IEEE.

  142. Wen, C.-N., Chou, S.-H., Chen, C.-C., & Chen, T.-F. (2012). NUDA: A non-uniform debugging architecture and nonintrusive race detection for many-core systems. Computers, IEEE Transactions on, 61(2), 199–212.

  143. Wen, Y., Zhao, J., Huang, M., & Chen, H. (2011). Towards detecting thread deadlock in Java programs with JVM introspection. In Trust, security and privacy in computing and communications (TrustCom), 2011 IEEE 10th International conference on, pp. 1600–1607. IEEE.

  144. Wester, B., Devecsery, D., Chen, P. M., Flinn, J., & Narayanasamy, S. (2013). Parallelizing data race detection. In ACM SIGARCH computer architecture news, Vol. 41, pp. 27–38. ACM.

  145. Wieringa, R., Maiden, N., Mead, N., & Rolland, C. (2005). Requirements engineering paper classification and evaluation criteria: A proposal and a discussion. Requirements Engineering, 11(1), 102–107.

  146. Wu, X., Wei, J., & Wang, X. (2012). Debug concurrent programs with visualization and inference of event structure. In Software engineering conference (APSEC), 2012 19th Asia-Pacific, Vol. 1, pp. 683–692. IEEE.

  147. Wu, X., Wen, Y., Chen, L., Dong, W., & Wang, J. (2013). Data race detection for interrupt-driven programs via bounded model checking. In Software security and reliability-companion (SERE-C), 2013 IEEE 7th international conference on, pp. 204–210. IEEE.

  148. Yoshiura, N., & Wei, W. (2014). Static data race detection for Java programs with dynamic class loading. In Internet and distributed computing systems, pp. 161–173. Springer.

  149. Yu, J., Ci, Y., Zhou, P., Wu, Y., & Zhao, C. (2012). Deterministic replay of multithread applications using virtual machine. In Advanced information networking and applications workshops (WAINA), 2012 26th International conference on, pp. 429–434. IEEE.

  150. Yuan, D., Mai, H., Xiong, W., Tan, L., Zhou, Y., & Pasupathy, S. (2010). SherLog: Error diagnosis by connecting clues from run-time logs. In ACM SIGARCH computer architecture news, Vol. 38, pp. 143–154. ACM.

  151. Zaineb, G., & Manarvi, I. A. (2011). Identification and analysis of causes for software bug rejection with their impact over testing efficiency. International Journal of Software Engineering & Applications (IJSEA), 2(4). http://airccse.org/journal/ijsea/papers/1011ijsea07.pdf.

  152. Zeller, A. (2009). Why programs fail: A guide to systematic debugging. Amsterdam: Elsevier.

  153. Zhang, W., De Kruijf, M., Li, A., Shan, L., & Sankaralingam, K. (2013). ConAir: Featherweight concurrency bug recovery via single-threaded idempotent execution. ACM SIGARCH Computer Architecture News, 41(1), 113–126.

  154. Zhou, P., Teodorescu, R., & Zhou, Y. (2007). HARD: Hardware-assisted lockset-based race detection. In IEEE 13th International symposium on high performance computer architecture, 2007. HPCA 2007, pp. 121–132. IEEE.

  155. Zyulkyarov, F., Harris, T., Unsal, O. S., Cristal, A., & Valero, M. (2010). Debugging programs that use atomic blocks and transactional memory. In ACM SIGPLAN notices, Vol. 45, pp. 57–66. ACM.

Acknowledgments

This research is supported by the Swedish Foundation for Strategic Research (SSF) via the SYNOPSIS Project.

Author information

Corresponding author

Correspondence to Sara Abbaspour Asadollah.

About this article

Cite this article

Abbaspour Asadollah, S., Sundmark, D., Eldh, S. et al. 10 Years of research on debugging concurrent and multicore software: a systematic mapping study. Software Qual J 25, 49–82 (2017). https://doi.org/10.1007/s11219-015-9301-7

Keywords

  • Concurrent
  • Parallel
  • Multicore
  • Debugging process
  • Bugs
  • Fault
  • Failure
  • Systematic mapping study