Abstract
With the rise of Big Data technologies, distributed stream processing systems (SPS) have gained popularity in recent years. Among them, Spark Streaming stands out as a particularly attractive option, with growing adoption in industry. In this work we explore the combination of temporal logic and property-based testing for testing Spark Streaming programs, by adding temporal logic operators to ScalaCheck generators and properties. This allows us to deal with the time component that complicates the testing of Spark Streaming programs and of SPS in general. In particular, we propose a discrete-time linear temporal logic for finite words that allows a timeout to be associated with each temporal operator, in order to increase the expressiveness of generators and properties. Finally, we present our prototype with some examples.
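To make the idea of timed temporal operators over finite words more concrete, the following is a minimal sketch in plain Scala. It is not the formal semantics of the logic proposed here, nor the API of the prototype: the names `TimedLtlSketch`, `Formula`, `Now`, `Always`, `Eventually`, `Verdict` and `eval` are hypothetical and chosen only for illustration. The sketch assumes that a word is a finite list of letters, that a timeout counts discrete instants starting at the current one, and that a formula is inconclusive when the word ends before its timeout can be checked.

```scala
// Illustrative sketch only: all names are hypothetical, and the semantics is a
// simplified assumption, not the definition given in the paper.
object TimedLtlSketch {
  // Three-valued verdict: a finite word may be too short to decide a formula.
  sealed trait Verdict
  case object Holds extends Verdict
  case object Fails extends Verdict
  case object Inconclusive extends Verdict

  // Formulas over letters of type A; each temporal operator carries a timeout.
  sealed trait Formula[A]
  final case class Now[A](p: A => Boolean) extends Formula[A]
  final case class Always[A](phi: Formula[A], timeout: Int) extends Formula[A]
  final case class Eventually[A](phi: Formula[A], timeout: Int) extends Formula[A]

  // Evaluate a formula at the first instant of a finite word.
  def eval[A](f: Formula[A], word: List[A]): Verdict = f match {
    case Now(p) =>
      word match {
        case Nil       => Inconclusive // no letter left to inspect
        case head :: _ => if (p(head)) Holds else Fails
      }
    case Always(phi, timeout) =>
      // phi must hold at each of the next `timeout` instants.
      val verdicts = (0 until timeout).map(i => eval(phi, word.drop(i)))
      if (verdicts.contains(Fails)) Fails
      else if (verdicts.contains(Inconclusive)) Inconclusive
      else Holds
    case Eventually(phi, timeout) =>
      // phi must hold at some instant among the next `timeout` instants.
      val verdicts = (0 until timeout).map(i => eval(phi, word.drop(i)))
      if (verdicts.contains(Holds)) Holds
      else if (verdicts.contains(Inconclusive)) Inconclusive
      else Fails
  }
}
```

Under these assumptions, on the word `List(1, 2, 3)` the formula `Always(Now[Int](_ > 0), 3)` evaluates to `Holds`, while the same formula with timeout 5 evaluates to `Inconclusive`, because the word ends before the timeout is exhausted.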
This research has been partially supported by MINECO Spanish projects StrongSoft (TIN2012-39391-C04-04), CAVI-ART (TIN2013-44742-C4-3-R), and TRACES (TIN2015-67522-C3-3-R), and by the Comunidad de Madrid project N-Greens Software-CM (S2013/ICE-2731).
Notes
- 1. See https://spark.apache.org/docs/latest/programming-guide.html for more details.
- 2. Here we use the integration of ScalaCheck with the Specs2 [21] testing library.
- 3. Due to space limitations, the results for release are available in [19].
- 4. Note that the value ? is only reached when the word is consumed and this simplification cannot be applied.
- 5.
References
Akidau, T., Balikov, A., Bekiroğlu, K., Chernyak, S., Haberman, J., Lax, R., McVeety, S., Mills, D., Nordstrom, P., Whittle, S.: MillWheel: fault-tolerant stream processing at internet scale. Proc. VLDB Endowment 6(11), 1033–1044 (2013)
Barringer, H., Havelund, K.: TraceContract: A Scala DSL for trace analysis. In: Butler, M., Schulte, W. (eds.) FM 2011. LNCS, vol. 6664, pp. 57–72. Springer, Heidelberg (2011)
Bauer, A., Leucker, M., Schallhart, C.: Monitoring of real-time properties. In: Arun-Kumar, S., Garg, N. (eds.) FSTTCS 2006. LNCS, vol. 4337, pp. 260–272. Springer, Heidelberg (2006)
Bauer, A., Leucker, M., Schallhart, C.: The good, the bad, and the ugly, but how ugly is ugly? In: Sokolsky, O., Taşıran, S. (eds.) RV 2007. LNCS, vol. 4839, pp. 126–138. Springer, Heidelberg (2007)
Beck, K.: Test Driven Development: By Example. Addison-Wesley Professional, Boston (2003)
Blackburn, P., van Benthem, J., Wolter, F. (eds.): Handbook of Modal Logic. Elsevier, Philadelphia (2006)
Claessen, K., Hughes, J.: QuickCheck: A lightweight tool for random testing of Haskell programs. ACM Sigplan Not. 46(4), 53–64 (2011)
Cormode, G., Muthukrishnan, S.: An improved data stream summary: The count-min sketch and its applications. J. Algorithms 55(1), 58–75 (2005)
D’Angelo, B., Sankaranarayanan, S., Sánchez, C., Robinson, W., Finkbeiner, B., Sipma, H.B., Mehrotra, S., Manna, Z.: LOLA: runtime monitoring of synchronous systems. In: Proceedings of the 12th International Symposium on Temporal Representation and Reasoning, TIME, pp. 166–174. IEEE Computer Society (2005)
Halbwachs, N.: Synchronous programming of reactive systems. Springer International Series in Engineering and Computer Science, vol. 215. Kluwer Academic Publishers, Dordrecht (1992)
Karau, H.: Spark-testing-base (2015). http://blog.cloudera.com/blog/2015/09/making-apache-spark-testing-easy-with-spark-testing-base/
Kuhn, R., Allen, J.: Reactive Design Patterns. Manning Publications, Greenwich (2014)
Leucker, M., Schallhart, C.: A brief account of runtime verification. J. Logic Algebraic Program. 78(5), 293–303 (2009)
Marz, N., Warren, J.: Big Data: Principles and Best Practices of Scalable Realtime Data Systems. Manning Publications Co., Stamford (2015)
Morales, G.D.F., Bifet, A.: SAMOA: Scalable advanced massive online analysis. J. Mach. Learn. Res. 16, 149–153 (2015)
Nilsson, R.: ScalaCheck: The Definitive Guide. IT Pro, Artima Incorporated, Upper Saddle River (2014)
Pnueli, A.: Applications of temporal logic to the specification and verification of reactive systems: a survey of current trends. In: de Bakker, J.W., de Roever, W.-P., Rozenberg, G. (eds.) Current Trends in Concurrency. LNCS, vol. 224, pp. 510–584. Springer, Heidelberg (1986)
Raymond, P., Roux, Y., Jahier, E.: Lutin: a language for specifying and executing reactive scenarios. EURASIP J. Emb. Syst. 2008, 1–11 (2008). Article ID: 753821
Riesco, A., Rodríguez-Hortalá, J.: A lightweight tool for random testing of stream processing systems (extended version). Technical Report SIC 02/15, Departamento de Sistemas Informáticos y Computación de la Universidad Complutense de Madrid, September 2015. http://maude.sip.ucm.es/~adrian/pubs.html
Schelter, S., Ewen, S., Tzoumas, K., Markl, V.: All roads lead to Rome: optimistic recovery for distributed iterative data processing. In: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, pp. 1919–1928. ACM (2013)
Shamshiri, S., Rojas, J.M., Fraser, G., McMinn, P.: Random or genetic algorithm search for object-oriented test suite generation? In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1367–1374. ACM (2015)
Venners, B.: Re: Prop.exists and scalatest matchers (2015). https://groups.google.com/forum/#!msg/scalacheck/Ped7joQLhnY/gNH0SSWkKUgJ
Wolper, P.: Temporal logic can be more expressive. Inf. Control 56(1/2), 72–99 (1983)
Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, p. 2. USENIX Assoc (2012)
Zaharia, M., Das, T., Li, H., Hunter, T., Shenker, S., Stoica, I.: Discretized streams: fault-tolerant streaming computation at scale. In: Proceedings of the 24th ACM Symposium on Operating Systems Principles, pp. 423–438. ACM (2013)
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Riesco, A., Rodríguez-Hortalá, J. (2016). Temporal Random Testing for Spark Streaming. In: Ábrahám, E., Huisman, M. (eds) Integrated Formal Methods. IFM 2016. Lecture Notes in Computer Science, vol 9681. Springer, Cham. https://doi.org/10.1007/978-3-319-33693-0_25
DOI: https://doi.org/10.1007/978-3-319-33693-0_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-33692-3
Online ISBN: 978-3-319-33693-0
eBook Packages: Computer Science, Computer Science (R0)