Abstract
Corfu is a distributed shared log that is designed to be scalable and reliable in the presence of failures and asynchrony. Internally, Corfu is fully replicated for fault tolerance, without sharding data or sacrificing strong consistency. In this case study, we present the modeling approaches we followed to test and verify, using Concuerror, the correctness of repair methods for the Chain Replication protocol suitable for Corfu. In the first two methods we tried, Concuerror located bugs quite fast. In contrast, the tool did not manage to find bugs in the third method, but the time this took also motivated an improvement in the tool that reduces the number of traces explored. Besides more details about all the above, we present experiences and lessons learned from applying stateless model checking for verifying complex protocols suitable for distributed programming.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
@slfritchie: “I was all ready to have a celebratory “New algorithm works!” tweet. Then the DPOR model execution w/Concuerror found an invalid case. Ouch.” (https://twitter.com/slfritchie/status/745863131407220737).
- 2.
More information about Concuerror is at http://parapluu.github.io/Concuerror.
References
Abdulla, P., Aronis, S., Jonsson, B., Sagonas, K.: Optimal dynamic partial order reduction. In: Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2014, pp. 373–384. ACM, New York (2014). doi:10.1145/2535838.2535845
Armstrong, J.: Erlang. Commun. ACM 53(9), 68–75 (2010)
Burckhardt, S., Kothari, P., Musuvathi, M., Nagarakatte, S.: A randomized scheduler with probabilistic guarantees of finding bugs. In: Proceedings of ASPLOS, ASPLOS XV, pp. 167–178. ACM, New York (2010). doi:10.1145/1736020.1736040
Christakis, M., Gotovos, A., Sagonas, K.: Systematic testing for detecting concurrency errors in Erlang programs. In: Sixth IEEE International Conference on Software Testing, Verification and Validation (ICST 2013), pp. 154–163. IEEE Computer Society (2013)
Deligiannis, P., Donaldson, A.F., Ketema, J., Lal, A., Thomson, P.: Asynchronous programming, analysis and testing with state machines. In: Proceedings of the 36th PLDI, PLDI 2015, pp. 154–164 (2015). doi:10.1145/2737924.2737996
Emmi, M., Qadeer, S., Rakamarić, Z.: Delay-bounded scheduling. In: Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2011, pp. 411–422. ACM, New York (2011)
Fritchie, S.L.: Chain replication in theory and in practice. In: Proceedings of the 9th ACM SIGPLAN Workshop on Erlang, Erlang 2010, pp. 33–44. ACM, New York (2010). doi:10.1145/1863509.1863515
Geambasu, R., Birrell, A., MacCormick, J.: Experiences with formal specification of fault-tolerant file systems. In: IEEE International Conference on Dependable Systems and Networks With FTCS and DCC, DSN 2008, pp. 96–101. IEEE (2008)
Godefroid, P.: Partial-Order Methods for the Verification of Concurrent Systems: An Approach to the State-Explosion Problem. Springer-Verlag New York Inc., Secaucus (1996)
Godefroid, P.: Model checking for programming languages using VeriSoft. In: Proceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 1997, pp. 174–186. ACM, New York (1997). doi:10.1145/263699.263717
Godefroid, P.: Software model checking: the VeriSoft approach. Form. Methods Syst. Des. 26(2), 77–101 (2005). doi:10.1007/s10703-005-1489-x
Gotovos, A., Christakis, M., Sagonas, K.: Test-driven development of concurrent programs using Concuerror. In: Proceedings of the 10th ACM SIGPLAN Workshop on Erlang, Erlang 2011, pp. 51–61. ACM, New York (2011). doi:10.1145/2034654.2034664
Malkhi, D., Balakrishnan, M., Davis, J.D., Prabhakaran, V., Wobber, T.: From Paxos to CORFU: a flash-speed shared log. SIGOPS Oper. Syst. Rev. 46(1), 47–51 (2012). doi:10.1145/2146382.2146391
Musuvathi, M., Qadeer, S., Ball, T., Basler, G., Nainar, P.A., Neamtiu, I.: Finding and reproducing heisenbugs in concurrent programs. In: Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI 2008, pp. 267–280. USENIX Association, Berkeley (2008)
Qadeer, S., Rehof, J.: Context-bounded model checking of concurrent software. In: Halbwachs, N., Zuck, L.D. (eds.) TACAS 2005. LNCS, vol. 3440, pp. 93–107. Springer, Heidelberg (2005). doi:10.1007/978-3-540-31980-1_7
van Renesse, R., Schneider, F.B.: Chain replication for supporting high throughput and availability. In: Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation, OSDI 2004, pp. 91–104. USENIX, Berkeley (2004)
Schiper, N., Rahli, V., van Renesse, R., Bickford, M., Constable, R.L.: Developing correctly replicated databases using formal tools. In: 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 395–406. IEEE (2014)
Thomson, P., Donaldson, A.F., Betts, A.: Concurrency testing using controlled schedulers: an empirical study. ACM Trans. Parallel Comput. 2(4), 23:1–23:37 (2016). doi:10.1145/2858651
Acknowledgments
This work was supported in part by the Swedish Research Council and carried out within the Linnaeus centre of excellence UPMARC, Uppsala Programming for Multicore Architectures Research Center.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Aronis, S., Fritchie, S.L., Sagonas, K. (2017). Testing and Verifying Chain Repair Methods for Corfu Using Stateless Model Checking. In: Polikarpova, N., Schneider, S. (eds) Integrated Formal Methods. IFM 2017. Lecture Notes in Computer Science(), vol 10510. Springer, Cham. https://doi.org/10.1007/978-3-319-66845-1_15
Download citation
DOI: https://doi.org/10.1007/978-3-319-66845-1_15
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66844-4
Online ISBN: 978-3-319-66845-1
eBook Packages: Computer ScienceComputer Science (R0)