Abstract
Previous deterministic replay systems reduce the runtime overhead by either relying on hardware support or by relaxing the determinism requirements for replay. We propose LightPlay that fulfills stricter determinism requirements with low overhead without requiring hardware or OS support. LightPlay guarantees that the memory state after each instruction instance in a replay run is the same as in original run. It reduces logging overhead using a lightweight thread local technique that avoids synchronization between threads during the recording run. GPUs are used to efficiently identify the memory ordering constraints that produce the same memory states before the replay run. LightPlay incurs low space overhead for logging as it only stores the part of log where data races occur. During the logging run LightPlay is 20x–100x faster than logging the total order and requires only 1 % space overhead.
This work is supported by NSF grants CNS-1157377 and CCF-0905509 to UCR.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Altekar, G., Stoica, I.: Odr: output-deterministic replay for multicore debugging. In: SOSP, pp: 193–206 (2009)
Bhansali, S., Chen, W.-K., de Jong, S., Edwards, A., Murray, R., Drinić, M., Mihočka, D., Chau, J.: Framework for instruction-level tracing and analysis of program executions. In: VEE, pp. 154–163 (2006)
Bienia, C., Kumar, S., Singh, J.P., Li, K.: The parsec benchmark suite: characterization and architectural implications. In: PACT (2008)
Bressoud, T.C., Schneider, F.B.: Hypervisor-based fault tolerance. ACM Trans. Comput. Syst. 14(1), 80–107 (1996)
Dunlap, G,W., Lucchetti, D.G., Fetterman, M.A., Chen P.M.: Execution replay of multiprocessor virtual machines. In: VEE (2008)
Hower, D.R., Hill, M.D.: Rerun: exploiting episodes for lightweight memory race recording. In: ISCA, pp. 265–276 (2008)
Huang, J., Liu, P., Zhang, C.: Leap: lightweight deterministic multi-processor replay of concurrent java programs. In: FSE, pp. 207–216 (2010)
King, S.T., Dunlap, G.W., Chen, P.M.: Debugging operating systems with time-traveling virtual machines. In: USENIX (2005)
LeBlanc, T.J., Mellor-Crummey, J.M.: Debugging parallel programs with instant replay. IEEE Trans. Comput. 36(4), 471–482 (1987)
Lee, D., Said, M., Narayanasamy, S., Yang, Z.: Offline symbolic analysis to infer total store order. In: HPCA. IEEE (2011)
Lee, D., Said, M., Narayanasamy, S., Yang, Z.: Pereira. Offline symbolic analysis for multi-processor execution replay. In: MICRO, pp. 564–575 (2009)
Lee, D., Wester, B., Veeraraghavan, K., Narayanasamy, S., Chen, P.M., Flinn, J.: Respec: efficient online multiprocessor replayvia speculation and external determinism. In: ASPLOS, pp. 77–90 (2010)
Montesinos, P., Ceze, L., Torrellas, J.: Delorean: recording and deterministically replaying shared-memory multiprocessor execution efficiently. In: ISCA, pp. 289–300 (2008)
Nagarajan, V., Gupta, R.: Ecmon: exposing cache events for monitoring. In: ISCA, pp. 34–360 (2009)
Narayanasamy, S., Pereira, C., Calder, B.: Recording shared memory dependencies using strata. In: ASPLOS, pp. 229–240 (2006)
Park, S., Zhou, Y., Xiong, W., Yin, Z., Kaushik, R., Lee, K.H., Lu, S.: Pres: probabilistic replay with execution sketching on multiprocessors. In: SOSP, pp. 177–192 (2009)
Srinivasan, S.M., Kandula, S., Andrews, C.R., Zhou, Y.: Flashback: a lightweight extension for rollback and deterministic replay for software debugging. In: USENIX (2004)
Tucek, J., Lu, S., Huang, C., Xanthos, S., Zhou, Y.: Triage: diagnosing production run failures at the user’s site. In: SOSP (2007)
Veeraraghavan, K., Lee, D., Wester, B., Ouyang, J., Chen, P.M., Flinn, J., Narayanasamy, S.: Doubleplay: parallelizing sequential logging and replay. In: ASPLOS, pp. 15–26 (2011)
Vlachos, E., Goodstein, M.L., Kozuch, M.A., Chen, S., Falsafi, B., Gibbons, P.B., Mowry, T.C.: Paralog: enabling and accelerating online parallel monitoring of multithreaded applications. In: ASPLOS, pp. 271–284 (2010)
Weeratunge, D., Zhang, X., Jagannathan, S.: Analyzing multicore dumps to facilitate concurrency bug reproduction. In: ASPLOS (2010)
Xu, M., Bodik, R., Hill, M.D.: A “flight data recorder" for enabling full-system multiprocessor deterministic replay. In: ISCA, pp. 122–135 (2003)
Zamfir, C., Candea, G.: Execution synthesis: a technique for automated software debugging. In: EuroSys, pp. 321–334 (2010)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Feng, M., Khorasani, F., Gupta, R., Bhuyan, L.N. (2015). LightPlay: Efficient Replay with GPUs. In: Brodman, J., Tu, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2014. Lecture Notes in Computer Science(), vol 8967. Springer, Cham. https://doi.org/10.1007/978-3-319-17473-0_22
Download citation
DOI: https://doi.org/10.1007/978-3-319-17473-0_22
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17472-3
Online ISBN: 978-3-319-17473-0
eBook Packages: Computer ScienceComputer Science (R0)