Abstract
Although most previous work in cache analysis for WCET estimation assumes the LRU replacement policy, in practice more processors use simpler non-LRU policies for lower cost, power consumption, and thermal output. This chapter focuses on the analysis of FIFO, one of the most widely used cache replacement policies. Previous analysis techniques for FIFO caches are based on the same framework as for LRU caches, using qualitative always-hit/always-miss classifications. Although this approach works well for LRU caches, it is not suitable for analyzing FIFO caches and usually leads to poor WCET estimation quality.
Notes
1. This can be shown by a reduction from the well-known 3-SAT problem; the details are omitted due to the space limit.
2. The idea is to group as many nodes as possible with the same γ-set characterization into “blocks,” to reduce the number of variables used in the MILP encoding.
References
R. Wilhelm, J. Engblom, A. Ermedahl, N. Holsti, S. Thesing, D. Whalley, G. Bernat, C. Ferdinand, R. Heckmann, T. Mitra, F. Mueller, I. Puaut, P. Puschner, J. Staschulat, P. Stenström, The worst-case execution-time problem overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst. 7(3), 36:1–36:53 (2008)
H. Theiling, C. Ferdinand, R. Wilhelm, Fast and precise wcet prediction by separated cache and path analyses. Real-Time Syst. 18(2/3), 157–179 (2000). doi:10.1023/A:1008141130870. http://dx.doi.org/10.1023/A:1008141130870
D. Grund, J. Reineke, Abstract interpretation of fifo replacement, in SAS (Springer, Berlin/Heidelberg, 2009), pp. 120–136
D. Grund, J. Reineke, Toward precise plru cache analysis, in WCET, 2010, pp. 23–35
J. Reineke, D. Grund, C. Berg, R. Wilhelm, Timing predictability of cache replacement policies. Real-Time Syst. 37(2), 99–122 (2007). doi:10.1007/s11241-007-9032-3. http://dx.doi.org/10.1007/s11241-007-9032-3
R. Wilhelm, D. Grund, J. Reineke, M. Schlickling, M. Pister, C. Ferdinand, Memory hierarchies, pipelines, and buses for future architectures in time-critical embedded systems. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 28(7), 966–978 (2009)
D. Grund, J. Reineke, Precise and efficient fifo-replacement analysis based on static phase detection, in ECRTS (IEEE Computer Society, Washington, DC, 2010), pp. 155–164
H. Al-Zoubi, A. Milenkovic, M. Milenkovic, Performance evaluation of cache replacement policies for the spec cpu2000 benchmark suite, in ACM-SE 42 (ACM, New York, NY, 2004), pp. 267–272
R. Heckmann, M. Langenbach, S. Thesing, R. Wilhelm, The influence of processor architecture on the design and the results of wcet tools, in Proceedings of the IEEE (2003)
A. Malamy, R.N. Patel, N.M. Hayes, Methods and apparatus for implementing a pseudo-lru cache memory replacement scheme with a locking feature. US Patent 5029072, 1994
J. Reineke, D. Grund, Relative competitive analysis of cache replacement policies, in Proceedings of the 2008 ACM SIGPLAN-SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems, LCTES ’08 (ACM, New York, NY, 2008), pp. 51–60
C. Ballabriga, H. Casse, Improving the first-miss computation in set-associative instruction caches, in ECRTS (IEEE Computer Society, Washington, DC, 2008), pp. 341–350
C. Cullmann, Cache persistence analysis: a novel approach theory and practice, in LCTES (ACM, New York, NY, 2011), pp. 121–130
B.K. Huynh, L. Ju, A. Roychoudhury, Scope-aware data cache analysis for wcet estimation, in RTAS (IEEE Computer Society, Washington, DC, 2011), pp. 203–212
J. Gustafsson, A. Betts, A. Ermedahl, B. Lisper, The mälardalen wcet benchmarks: past, present and future, in 10th International Workshop on Worst-Case Execution-Time Analysis, WCET’10, 2010
T. Austin, E. Larson, D. Ernst, Simplescalar: an infrastructure for computer system modeling. Computer 35(2), 59–67 (2002)
R. Wilhelm, Why AI + ILP is good for WCET, but MC is not, nor ILP alone, in VMCAI, 2004
E.W. Dijkstra, Chapter I: notes on structured programming, in Structured Programming (Academic, London, 1972), pp. 1–82
M. Berkelaar, lp_solve: a mixed integer linear program solver. Relatorio Tecnico, Eindhoven University of Technology, 1999
Appendix: The Complete IPET Formulation
We first introduce the loop structures adopted in our ILP encoding. As is common in structured programming [127], we assume that each loop contains a single head basic block, and that the program can jump into the loop only by reaching the head basic block via some entry edge. The loop bound restricts the maximum number of times the program iterates each time it enters the loop. The head basic block tests whether the loop condition is satisfied (e.g., the loop bound has not been reached). If the loop condition is satisfied, the program continues to execute the body basic blocks, which are the basic blocks in the loop excluding the head basic block; otherwise the program exits the loop. Formally, we define a loop as follows:
Definition 3.6 (Loop Structure).
A loop in the CFG is a tuple \(\mathfrak{L}_{\ell} = (\mathsf{entr}_{\ell},\mathsf{head}_{\ell},\mathsf{body}_{\ell},\mathsf{lpb}_{\ell})\) with:
- \(\mathsf{entr}_{\ell}\): the set of entry edges of the loop;
- \(\mathsf{head}_{\ell}\): the head basic block of the loop;
- \(\mathsf{body}_{\ell}\): the set of all body basic blocks of the loop;
- \(\mathsf{lpb}_{\ell}\): the loop bound.
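As an illustration only, the tuple in Definition 3.6 could be held in a small record type. The names below (`Loop`, `Edge`) and the use of plain strings as block identifiers are assumptions made for this sketch and do not come from the chapter; the checks simply restate the single-head restriction described above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Assumption: an edge is a (source block, target block) pair of block names.
Edge = Tuple[str, str]


@dataclass(frozen=True)
class Loop:
    """Hypothetical record mirroring the tuple (entr, head, body, lpb)."""
    entr: FrozenSet[Edge]   # entry edges, each ending at the head block
    head: str               # the single head basic block
    body: FrozenSet[str]    # body basic blocks (head excluded)
    lpb: int                # loop bound: max iterations per loop entry

    def __post_init__(self):
        # Every entry edge must reach the head block.
        if any(dst != self.head for _, dst in self.entr):
            raise ValueError("entry edges must end at the head block")
        # The body is defined as the loop's blocks excluding the head.
        if self.head in self.body:
            raise ValueError("head block must not be listed in body")
        if self.lpb < 0:
            raise ValueError("loop bound must be non-negative")


# A toy loop entered from b_st, with one body block and a bound of 10.
loop = Loop(entr=frozenset({("b_st", "b_h")}), head="b_h",
            body=frozenset({"b_1"}), lpb=10)
```

Making the record immutable keeps a loop description usable as a dictionary key, which may be convenient when constraints are generated per loop.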
The ILP formulation uses the following constants:
- \(C^{h}\): the execution delay of each node upon a cache hit,
- \(C^{m}\): the execution delay of each node upon a cache miss,
and the following non-negative variables:
- \(c_{a}\): for each basic block \(b_{a}\), \(c_{a}\) is \(b_{a}\)’s total execution cost,
- \(x_{a}\): for each basic block \(b_{a}\), \(x_{a}\) is the execution count of \(b_{a}\),
- \(y_{j}\): for each edge \(e_{j}\) in the entry edge set \(\mathsf{entr}_{\ell}\) of some loop \(\mathfrak{L}_{\ell}\), \(y_{j}\) counts how many times this edge is taken during the whole execution,
- \(z_{i}\): for each node \(n_{i}\) included in some k-Miss node set regarding some loop, \(z_{i}\) counts how many times \(n_{i}\) executes with a cache miss.
To obtain the WCET, the following maximization problem is solved:
$$\displaystyle{\text{Maximize}\ \sum _{\forall b_{a}}c_{a}}$$
The following constraints are respected to bound the total cost:
- Cost Constraint: As discussed in Sect. 2.5.5, the total cost of each basic block is
$$\displaystyle{\forall b_{a}: c_{a} = (\pi_{\mathsf{AH}} \times C^{h} + \pi_{\mathsf{NC}} \times C^{m}) \times x_{a} + \sum_{n_{i}\in b_{a}^{\ast}}\left(C^{m} \times z_{i} + C^{h} \times (x_{a} - z_{i})\right)}$$
where \(\pi_{\mathsf{AH}}\) and \(\pi_{\mathsf{NC}}\) are the numbers of AH and NC nodes in \(b_{a}\), respectively, and \(b_{a}^{\ast}\) is the set of nodes in \(b_{a}\) that are contained in some k-Miss node set (regarding some loop). Additionally, each \(n_{i} \in b_{a}^{\ast}\) should satisfy \(z_{i} \leq x_{a}\).
- k-Miss Constraint: As discussed in Sect. 2.5.5, the following constraints bound the number of misses incurred by a k-Miss node set:
$$\displaystyle{\forall (S,\mathfrak{L}_{\ell})\text{ s.t. }S\text{ is }k\text{-}\mathsf{Miss}\text{ regarding }\mathfrak{L}_{\ell}: \sum_{n_{i}\in S}z_{i} \leq k \times \sum_{e_{j}\in \mathsf{entr}_{\ell}}y_{j}}$$
- Structure Constraint: Each basic block should have balanced input and output:
$$\displaystyle{\forall b_{a}:\ \ x_{a} = \sum_{e_{j}\in \mathsf{input}(b_{a})}y_{j} = \sum_{e_{j}\in \mathsf{output}(b_{a})}y_{j}}$$
The start basic block \(b_{st}\) of the program is executed only once:
$$\displaystyle{x_{st} = 1}$$
Each time the program enters the loop, each body basic block executes at most \(\mathsf{lpb}_{\ell}\) times, so we have
$$\displaystyle{\forall \mathfrak{L}_{\ell},\forall b_{a} \in \mathsf{body}_{\ell}: x_{a} \leq \mathsf{lpb}_{\ell} \times \sum_{e_{j}\in \mathsf{entr}_{\ell}}y_{j}}$$
The head basic block may execute one more time to detect that the loop condition is no longer satisfied, at which point the program exits the loop, so we have:
$$\displaystyle{\forall \mathfrak{L}_{\ell},b_{a} = \mathsf{head}_{\ell}: x_{a} \leq (\mathsf{lpb}_{\ell} + 1) \times \sum_{e_{j}\in \mathsf{entr}_{\ell}}y_{j}}$$
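To make the constraints above concrete, here is a minimal sketch that evaluates them on a toy CFG. It is not the chapter's implementation: instead of an ILP solver it uses exhaustive search over edge counts, which is only viable for very small graphs. The CFG, the node classification counts, the delays `C_H` and `C_M`, the loop bound, and the 2-Miss set are all made-up values, and the treatment of the start and end blocks (only their existing side of the flow equation is constrained) is an assumption of the sketch.

```python
from itertools import product

C_H, C_M = 1, 10  # assumed hit / miss delays per node

# Hypothetical CFG: b_st -> b_h -> b_1 -> b_h (back edge), b_h -> b_end
edges = {"e0": ("b_st", "b_h"), "e1": ("b_h", "b_1"),
         "e2": ("b_1", "b_h"), "e3": ("b_h", "b_end")}
# Per block: (number of AH nodes, number of NC nodes, nodes in k-Miss sets)
blocks = {"b_st": (2, 0, []), "b_h": (1, 1, []),
          "b_1": (1, 0, ["n1"]), "b_end": (1, 0, [])}
loop = {"entr": ["e0"], "head": "b_h", "body": ["b_1"], "lpb": 3}
kmiss_sets = [(["n1"], 2)]  # S = {n1} is assumed 2-Miss regarding the loop


def feasible_x(y):
    """Return block counts x if edge counts y satisfy the structure constraints."""
    x = {}
    for b in blocks:
        inp = [y[e] for e, (_, dst) in edges.items() if dst == b]
        out = [y[e] for e, (src, _) in edges.items() if src == b]
        # Assumption: the start block has no input edges and the end block
        # has no output edges, so only the existing side is constrained.
        if inp and out and sum(inp) != sum(out):
            return None
        x[b] = sum(inp) if inp else sum(out)
    entries = sum(y[e] for e in loop["entr"])
    if x["b_st"] != 1:                                    # x_st = 1
        return None
    if any(x[b] > loop["lpb"] * entries for b in loop["body"]):
        return None
    if x[loop["head"]] > (loop["lpb"] + 1) * entries:
        return None
    return x


def total_cost(x, z):
    """Sum of c_a over all blocks, following the cost constraint."""
    cost = 0
    for b, (n_ah, n_nc, nodes) in blocks.items():
        cost += (n_ah * C_H + n_nc * C_M) * x[b]
        cost += sum(C_M * z[n] + C_H * (x[b] - z[n]) for n in nodes)
    return cost


def solve():
    limit = loop["lpb"] + 1  # no count exceeds this in the toy CFG
    best = None
    for ys in product(range(limit + 1), repeat=len(edges)):
        y = dict(zip(edges, ys))
        x = feasible_x(y)
        if x is None:
            continue
        entries = sum(y[e] for e in loop["entr"])
        for z1 in range(x["b_1"] + 1):                    # enforces z_i <= x_a
            z = {"n1": z1}
            if any(sum(z[n] for n in s) > k * entries for s, k in kmiss_sets):
                continue                                  # k-Miss constraint
            cost = total_cost(x, z)
            if best is None or cost > best:
                best = cost
    return best


wcet = solve()
```

With these assumed numbers the search returns 71: the body block runs 3 times, the head 4 times, and the k-Miss constraint caps the misses of `n1` at 2. Without that cap, `n1` could miss on all 3 executions and the bound would rise to 80, which shows how the k-Miss constraint tightens the estimate.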
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Guan, N. (2016). FIFO Cache Analysis for WCET Estimation. In: Techniques for Building Timing-Predictable Embedded Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-27198-9_3
DOI: https://doi.org/10.1007/978-3-319-27198-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27196-5
Online ISBN: 978-3-319-27198-9
eBook Packages: Engineering, Engineering (R0)