On the Timed Analysis of Big-Data Applications

Marconi, Francesco; Quattrocchi, Giovanni; Baresi, Luciano; Bersani, Marcello M.; Rossi, Matteo

doi:10.1007/978-3-319-77935-5_22

Francesco Marconi ORCID: orcid.org/0000-0002-9397-3543¹⁶,
Giovanni Quattrocchi¹⁶,
Luciano Baresi¹⁶,
Marcello M. Bersani¹⁶ &
…
Matteo Rossi ORCID: orcid.org/0000-0002-9193-9560¹⁶

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 10811))

Included in the following conference series:

NASA Formal Methods Symposium

1502 Accesses
4 Citations

Abstract

Apache Spark is one of the best-known frameworks for executing big-data batch applications over a cluster of (virtual) machines. Defining the cluster (i.e., the number of machines and CPUs) to attain guarantees on the execution times (deadlines) of the application is indeed a trade-off between the cost of the infrastructure and the time needed to execute the application. Sizing the computational resources, in order to prevent cost overruns, can benefit from the use of formal models as a means to capture the execution time of applications.

Our model of Spark applications, based on the CLTLoc logic, is defined by considering the directed acyclic graph around which Spark programs are organized, the number of available CPUs, the number of tasks elaborated by the application, and the average execution times of tasks. If the outcome of the analysis is positive, then the execution is feasible—that is, it can be completed within a given time span. The analysis tool has been implemented on top of the Zot formal verification tool. A preliminary evaluation shows that our model is sufficiently accurate: the formal analysis identifies execution times that are close (the error is less than 10%) to those obtained by actually running the applications.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

References

Abdeddaim, Y., Asarin, E., Maler, O.: Scheduling with timed automata. Theoret. Comput. Sci. 354(2), 272–300 (2006)
Article MathSciNet MATH Google Scholar
Alur, R., Dill, D.L.: A theory of timed automata. Theoret. Comput. Sci. 126(2), 183–235 (1994)
Article MathSciNet MATH Google Scholar
Baresi, L., Pourhashem Kallehbasti, M.M., Rossi, M.: How bit-vector logic can help improve the verification of LTL specifications over infinite domains. In: Proceedings of 31st Annual ACM Symposium on Applied Computing, pp. 1666–1673 (2016)
Google Scholar
Behrmann, G., Larsen, K.G., Rasmussen, J.I.: Optimal scheduling using priced timed automata. SIGMETRICS Perform. Eval. Rev. 32(4), 34–40 (2005)
Article Google Scholar
Bersani, M., Erascu, M., Marconi, F., Rossi, M.: DICE verification tool - final version. Technical report, DICE Consortium (2017). www.dice-h2020.eu
Bersani, M.M., Rossi, M., San Pietro, P.: A tool for deciding the satisfiability of continuous-time metric temporal logic. Acta Informatica 53(2), 171–206 (2016)
Article MathSciNet MATH Google Scholar
Bradley, S., Henderson, W., Kendall, D.: Using timed automata for response time analysis of distributed real-time systems. In: 24th IFAC/IFIP Workshop on Real-Time Programming, pp. 143–148 (1999)
Google Scholar
Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. In: Proceedings of International World-Wide Web Conference (WWW), pp. 107–117 (1998)
Google Scholar
Brito, A., Ardagna, D., Blanquer, I., Evangelinou, A., Barbierato, E., Gribaudo, M., Almeida, J., Couto, A.P., Braga, T.: D3.4 EUBra-BIGSEA QoS infrastructure services intermediate version. Technical report, 3 July 2017
Google Scholar
Corbett, J.C.: Timing analysis of ada tasking programs. IEEE Trans. Softw. Eng. 22(7), 461–483 (1996)
Article Google Scholar
Demri, S., D’Souza, D.: An automata-theoretic approach to constraint LTL. Inf. Comput. 205(3), 380–415 (2007)
Article MathSciNet MATH Google Scholar
Henzinger, T.A.: The theory of hybrid automata. In: Inan, M.K., Kurshan, R.P. (eds.) Verification of Digital and Hybrid Systems. NATO ASI Series (Series F: Computer and Systems Sciences), vol. 170, pp. 265–292. Springer, Berlin (2000). https://doi.org/10.1007/978-3-642-59615-5_13
Chapter Google Scholar
Krakora, J., Waszniowski, L., Pisa, P., Hanzalek, Z.: Timed automata approach to real time distributed system verification. In: Proceedings of IEEE International Workshop on Factory Communication Systems, pp. 407–410, September 2004
Google Scholar
MacQueen, J., et al.: Some methods for classification and analysis of multivariate observations. In: Proceedings of Berkeley symposium on mathematical statistics and probability, vol. 1, pp. 281–297 (1967)
Google Scholar
Marconi, F., Bersani, M.M., Erascu, M., Rossi, M.: Towards the formal verification of data-intensive applications through metric temporal logic. In: Ogata, K., Lawford, M., Liu, S. (eds.) ICFEM 2016. LNCS, vol. 10009, pp. 193–209. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-47846-3_13
Chapter Google Scholar
Palencia, J.C., Harbour, M.G.: Schedulability analysis for tasks with static and dynamic offsets. In: Proceedings of IEEE Real-Time Systems Symposium, pp. 26–37, December 1998
Google Scholar
Perez, D., Bernardi, S., Merseguer, J.Z., Requeno, J.I., Casale, G., Zhu, L.: DICE simulation tools - final version. Deliverable. http://www.dice-h2020.eu/resources/

Download references

Acknowledgment

This work has been partially supported by the DICE project (Horizon 2020 project no. 644869) and by the GAUSS national research project (MIUR, PRIN 2015, Contract 2015KWREMX).

Author information

Authors and Affiliations

DEIB, Politecnico di Milano, Milan, Italy
Francesco Marconi, Giovanni Quattrocchi, Luciano Baresi, Marcello M. Bersani & Matteo Rossi

Authors

Francesco Marconi
View author publications
You can also search for this author in PubMed Google Scholar
Giovanni Quattrocchi
View author publications
You can also search for this author in PubMed Google Scholar
Luciano Baresi
View author publications
You can also search for this author in PubMed Google Scholar
Marcello M. Bersani
View author publications
You can also search for this author in PubMed Google Scholar
Matteo Rossi
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Francesco Marconi .

Editor information

Editors and Affiliations

NASA Langley Research Center, Hampton, Virginia, USA
Aaron Dutle
NASA Langley Research Center, Hampton, Virginia, USA
César Muñoz
NASA Langley Research Center, Hampton, Virginia, USA
Anthony Narkawicz

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Marconi, F., Quattrocchi, G., Baresi, L., Bersani, M.M., Rossi, M. (2018). On the Timed Analysis of Big-Data Applications. In: Dutle, A., Muñoz, C., Narkawicz, A. (eds) NASA Formal Methods. NFM 2018. Lecture Notes in Computer Science(), vol 10811. Springer, Cham. https://doi.org/10.1007/978-3-319-77935-5_22

Download citation

DOI: https://doi.org/10.1007/978-3-319-77935-5_22
Published: 11 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77934-8
Online ISBN: 978-3-319-77935-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics