FIFO Cache Analysis for WCET Estimation

Guan, Nan

doi:10.1007/978-3-319-27198-9_3


Chapter
First Online: 04 February 2016


Abstract

Although most previous work on cache analysis for WCET estimation assumes the LRU replacement policy, in practice more processors use simpler non-LRU policies for lower cost, power consumption, and thermal output. This chapter focuses on the analysis of FIFO, one of the most widely used cache replacement policies. Previous analysis techniques for FIFO caches are based on the same framework as for LRU caches, using qualitative always-hit/always-miss classifications. Although this approach works well for LRU caches, it is not suitable for analyzing FIFO caches and usually leads to poor WCET estimation quality.


Notes

1. This can be shown by a reduction from the well-known 3-SAT problem; the details are omitted due to space limits.
2. The idea is to group as many nodes as possible with the same γ-set characterization into “blocks,” to reduce the number of variables used in the MILP encoding.

References

R. Wilhelm, J. Engblom, A. Ermedahl, N. Holsti, S. Thesing, D. Whalley, G. Bernat, C. Ferdinand, R. Heckmann, T. Mitra, F. Mueller, I. Puaut, P. Puschner, J. Staschulat, P. Stenström, The worst-case execution-time problem: overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst. 7(3), 36:1–36:53 (2008)
H. Theiling, C. Ferdinand, R. Wilhelm, Fast and precise WCET prediction by separated cache and path analyses. Real-Time Syst. 18(2/3), 157–179 (2000). doi:10.1023/A:1008141130870. http://dx.doi.org/10.1023/A:1008141130870
D. Grund, J. Reineke, Abstract interpretation of FIFO replacement, in SAS (Springer, Berlin/Heidelberg, 2009), pp. 120–136
D. Grund, J. Reineke, Toward precise PLRU cache analysis, in WCET, 2010, pp. 23–35
J. Reineke, D. Grund, C. Berg, R. Wilhelm, Timing predictability of cache replacement policies. Real-Time Syst. 37(2), 99–122 (2007). doi:10.1007/s11241-007-9032-3. http://dx.doi.org/10.1007/s11241-007-9032-3
R. Wilhelm, D. Grund, J. Reineke, M. Schlickling, M. Pister, C. Ferdinand, Memory hierarchies, pipelines, and buses for future architectures in time-critical embedded systems. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 28(7), 966–978 (2009)
D. Grund, J. Reineke, Precise and efficient FIFO-replacement analysis based on static phase detection, in ECRTS (IEEE Computer Society, Washington, DC, 2010), pp. 155–164
H. Al-Zoubi, A. Milenkovic, M. Milenkovic, Performance evaluation of cache replacement policies for the SPEC CPU2000 benchmark suite, in ACM-SE 42 (ACM, New York, NY, 2004), pp. 267–272
R. Heckmann, M. Langenbach, S. Thesing, R. Wilhelm, The influence of processor architecture on the design and the results of WCET tools, in Proceedings of the IEEE (2003)
A. Malamy, R.N. Patel, N.M. Hayes, Methods and apparatus for implementing a pseudo-LRU cache memory replacement scheme with a locking feature. US Patent 5029072, 1994
J. Reineke, D. Grund, Relative competitive analysis of cache replacement policies, in Proceedings of the 2008 ACM SIGPLAN-SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems, LCTES ’08 (ACM, New York, NY, 2008), pp. 51–60
C. Ballabriga, H. Casse, Improving the first-miss computation in set-associative instruction caches, in ECRTS (IEEE Computer Society, Washington, DC, 2008), pp. 341–350
C. Cullmann, Cache persistence analysis: a novel approach theory and practice, in LCTES (ACM, New York, NY, 2011), pp. 121–130
B.K. Huynh, L. Ju, A. Roychoudhury, Scope-aware data cache analysis for WCET estimation, in RTAS (IEEE Computer Society, Washington, DC, 2011), pp. 203–212
J. Gustafsson, A. Betts, A. Ermedahl, B. Lisper, The Mälardalen WCET benchmarks: past, present and future, in 10th International Workshop on Worst-Case Execution-Time Analysis, WCET’10, 2010
T. Austin, E. Larson, D. Ernst, SimpleScalar: an infrastructure for computer system modeling. Computer 35(2), 59–67 (2002)
R. Wilhelm, Why AI + ILP is good for WCET, but MC is not, nor ILP alone, in VMCAI, 2004
E.W. Dijkstra, Chapter I: notes on structured programming, in Structured Programming (Academic, London, 1972), pp. 1–82
M. Berkelaar, lp_solve: a mixed integer linear program solver. Relatorio Tecnico, Eindhoven University of Technology, 1999


Author information

Authors and Affiliations

Department of Computing, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Nan Guan (Assistant Professor)


Appendix: The Complete IPET Formulation

We first introduce the loop structures adopted in our ILP encoding. As is common in structured programming [127], we assume that each loop contains a single head basic block and that the program enters the loop by reaching the head basic block via some entry edge. The loop bound limits the maximum number of iterations each time the program enters the loop. The head basic block tests whether the loop condition is satisfied (e.g., the loop bound has not been reached). If the loop condition is satisfied, the program continues to execute the body basic blocks, which are the basic blocks in the loop excluding the head basic block; otherwise the program exits the loop. Formally, we define a loop as:

Definition 3.6 (Loop Structure).

A loop in the CFG is a tuple $\mathfrak{L}_{\ell} = (\mathsf{entr}_{\ell}, \mathsf{head}_{\ell}, \mathsf{body}_{\ell}, \mathsf{lpb}_{\ell})$ with:

$\mathsf{entr}_{\ell}$: the set of entry edges of the loop;
$\mathsf{head}_{\ell}$: the head basic block of the loop;
$\mathsf{body}_{\ell}$: the set of all body basic blocks of the loop;
$\mathsf{lpb}_{\ell}$: the loop bound.
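
As a rough illustration, the loop tuple can be held in a small record type. The sketch below is not from the chapter: the class name `Loop`, the `(source, target)` encoding of edges, and the `is_well_formed` check are hypothetical, and the check assumes that entry edges start outside the loop and end at the head.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Hypothetical edge encoding: (source basic block, target basic block).
Edge = Tuple[str, str]


@dataclass(frozen=True)
class Loop:
    entr: FrozenSet[Edge]  # entry edges of the loop
    head: str              # the single head basic block
    body: FrozenSet[str]   # body basic blocks (head excluded)
    lpb: int               # loop bound per entry into the loop

    def is_well_formed(self) -> bool:
        """Check the structural assumptions stated for this sketch."""
        inside = self.body | {self.head}
        return (
            self.lpb >= 0
            and self.head not in self.body
            # Assumed: every entry edge targets the head and starts outside the loop.
            and all(dst == self.head and src not in inside for src, dst in self.entr)
        )
```

For example, `Loop(frozenset({("b0", "b1")}), "b1", frozenset({"b2"}), 10)` describes a loop entered from `b0`, with head `b1`, body `{b2}`, and a bound of 10 iterations per entry.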

The ILP formulation uses the following constants:

$C^{h}$: the execution delay of each node upon a cache hit,
$C^{m}$: the execution delay of each node upon a cache miss,

and the following non-negative variables:

$c_{a}$: for each basic block $b_{a}$, $c_{a}$ is $b_{a}$’s total execution cost,
$x_{a}$: for each basic block $b_{a}$, $x_{a}$ is the execution count of $b_{a}$,
$y_{j}$: for each edge $e_{j}$ in the entry edge set $\mathsf{entr}_{\ell}$ of some loop $\mathfrak{L}_{\ell}$, $y_{j}$ counts how many times this edge is taken during the whole execution,
$z_{i}$: for each node $n_{i}$ included in some $k$-Miss node set regarding some loop, $z_{i}$ counts how many times $n_{i}$ executes with a cache miss.

To obtain the WCET, the following maximization problem is solved:

$$\displaystyle{ \text{Maximize}\quad \sum _{\forall b_{a}}c_{a} }$$

The following constraints are respected to bound the total cost:

Cost Constraint: As discussed in Sect. 2.5.5, the total cost of each basic block is
$$\displaystyle{\forall b_{a}: c_{a} = (\pi _{\mathsf{AH}} \times C^{h} +\pi _{\mathsf{ NC}} \times C^{m}) \times x_{ a} +\sum _{n_{i}\in b_{a}^{{\ast}}}\left (C^{m} \times z_{ i} + C^{h} \times (x_{ a} - z_{i})\right )}$$
where $\pi_{\mathsf{AH}}$ and $\pi_{\mathsf{NC}}$ are the numbers of AH and NC nodes in $b_{a}$, respectively, and $b_{a}^{\ast}$ is the set of nodes in $b_{a}$ that are contained in some $k$-Miss node set (regarding some loop). Additionally, each $n_{i} \in b_{a}^{\ast}$ must satisfy $z_{i} \leq x_{a}$.
$k$-Miss Constraint: As discussed in Sect. 2.5.5, the following constraint bounds the number of misses incurred by a $k$-Miss node set:
$$\displaystyle{\forall (S,\mathfrak{L}_{\ell})\text{ s.t. }S\text{ is }k\mathrm{-\mathsf{Miss}\ regarding\ }\mathfrak{L}_{\ell}:\sum _{n_{i}\in S}z_{i} \leq k \times \sum _{e_{j}\in \mathsf{entr}_{\ell}}y_{j}}$$
Structure Constraint: Each basic block should have balanced input and output:
$$\displaystyle{\forall b_{a}:\ \ x_{a} =\sum _{e_{j}\in \mathsf{input}(b_{a})}y_{j} =\sum _{e_{j}\in \mathsf{output}(b_{a})}y_{j}}$$
The start basic block $b_{st}$ of the program is executed only once:
$$\displaystyle{x_{st} = 1}$$
Each time the program enters the loop, each body basic block executes at most $\mathsf{lpb}_{\ell}$ times, so we have
$$\displaystyle{\forall \mathfrak{L}_{\ell},\forall b_{a} \in \mathsf{body}_{\ell}: x_{a} \leq \mathsf{lpb}_{\ell} \times \sum _{e_{j}\in \mathsf{entr}_{\ell}}y_{j}}$$
The head basic block may execute one more time to detect that the loop condition is no longer satisfied, at which point the program exits the loop, so we have:
$$\displaystyle{\forall \mathfrak{L}_{\ell},b_{a} = \mathsf{head}_{\ell}: x_{a} \leq (\mathsf{lpb}_{\ell} + 1) \times \sum _{e_{j}\in \mathsf{entr}_{\ell}}y_{j}}$$
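
To make the interplay of these constraints concrete, the following sketch evaluates the formulation on a toy CFG by brute-force enumeration rather than with an ILP solver. Everything specific in it is an assumption made for illustration: the four blocks (`st`, `head`, `body`, `exit`), the edge set, the delays $C^{h}=1$ and $C^{m}=10$, the loop bound 3, and a single 2-Miss node in the body block. A real analysis would pass the same constraints to a solver such as lp_solve.

```python
from itertools import product

# All numbers below are illustrative assumptions, not values from the chapter.
C_HIT, C_MISS = 1, 10   # C^h and C^m
LPB, K = 3, 2           # loop bound lpb and the k of a k-Miss set S = {n1}

# Per block: (number of AH nodes, number of NC nodes, contains the k-Miss node n1?)
BLOCKS = {
    "st":   (0, 1, False),
    "head": (1, 0, False),
    "body": (2, 1, True),
    "exit": (0, 1, False),
}


def block_cost(name, x, z=0):
    """Cost constraint for one block; z is the miss count of its k-Miss node."""
    pi_ah, pi_nc, has_kmiss = BLOCKS[name]
    cost = (pi_ah * C_HIT + pi_nc * C_MISS) * x
    if has_kmiss:
        cost += C_MISS * z + C_HIT * (x - z)
    return cost


def solve(max_count=5):
    """Enumerate small integer assignments and keep the costliest feasible one."""
    best = None
    counts = range(max_count + 1)
    # Edges: y0 = st->head (entry), y1 = head->body, y2 = body->head, y3 = head->exit
    for y0, y1, y2, y3, z in product(counts, repeat=5):
        x = {"st": y0, "head": y0 + y2, "body": y1, "exit": y3}
        feasible = (
            x["st"] == 1                      # start block executes once
            and x["head"] == y1 + y3          # inputs of head equal its outputs
            and x["body"] == y2               # inputs of body equal its outputs
            and x["body"] <= LPB * y0         # body bound per loop entry
            and x["head"] <= (LPB + 1) * y0   # head may run one extra time
            and z <= x["body"]                # z_i <= x_a
            and z <= K * y0                   # k-Miss constraint
        )
        if not feasible:
            continue
        total = sum(block_cost(b, x[b], z if b == "body" else 0) for b in BLOCKS)
        if best is None or total > best[0]:
            best = (total, dict(x, z=z))
    return best
```

On this toy instance `solve()` returns a bound of 81, reached with three body executions, four head executions, and two misses of the 2-Miss node. Without the $k$-Miss constraint the node would have to be treated as missing on every execution, which would raise the bound.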


About this chapter

Cite this chapter

Guan, N. (2016). FIFO Cache Analysis for WCET Estimation. In: Techniques for Building Timing-Predictable Embedded Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-27198-9_3


DOI: https://doi.org/10.1007/978-3-319-27198-9_3
Published: 04 February 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27196-5
Online ISBN: 978-3-319-27198-9