Tight WorstCase Bounds for Polynomial Loop Programs
Abstract
In 2008, BenAmram, Jones and Kristiansen showed that for a simple programming language—representing nondeterministic imperative programs with bounded loops, and arithmetics limited to addition and multiplication—it is possible to decide precisely whether a program has certain growthrate properties, in particular whether a computed value, or the program’s running time, has a polynomial growth rate.
A natural and intriguing problem was to improve the precision of the information obtained. This paper shows how to obtain asymptoticallytight multivariate polynomial bounds for this class of programs. This is a complete solution: whenever a polynomial bound exists it will be found.
1 Introduction
One of the most important properties we would like to know about programs is their resource usage, i.e., the amount of resources (such as time, memory and energy) required for their execution. This information is useful during development, when performance bugs and security vulnerabilities exploiting performance issues can be avoided. It is also particularly relevant for mobile applications, where resources are limited, and for cloud services, where resource usage is a major cost factor.
In the literature, a lot of different “cost analysis” problems (also called “resource bound analysis,” etc.) have been studied (e.g. [1, 11, 13, 18, 19, 24, 26, 27]); several of them may be grouped under the following general definition. The countable resource problem asks about the maximum usage of a “resource” that accumulates during execution, and which one can explicitly count, by instrumenting the program with an accumulator variable and instructions to increment it where necessary. For example, we can estimate the execution time of a program by counting certain “basic steps”. Another example is counting the number of visits to designated program locations. Realistic problems of this type include bounding the number of calls to specific functions, perhaps to system services; the number of I/O operations; number of accesses to memory, etc. The consumption of resources such as energy suits our problem formulation as long as such explicit bookkeeping is possible (we have to assume that the increments, if not constant, are given by a monotone polynomial expression).
In this paper we solve the bound analysis problem for a particular class of programs, defined in [7]. The bound analysis problem is to find symbolic bounds on the maximal possible value of an integer variable at the end of the program, in terms of some integervalued variables that appear in the initial state of a computation. Thus, a solution to this problem might be used for any of the resourcebound analyses above. In this work we focus on values that grow polynomially (in the sense of being bounded by a polynomial), and our goal is to find polynomial bounds that are tight, in the sense of being precise up to a constant factor.
1.1 The Core Language
Data. It is convenient to assume (without loss of generality) that the only type of data is nonnegative integers. Note that a realistic (not “core”) program may include many statements that manipulate noninteger data that are not relevant to loop control—so in a complexity analysis, we may be able to abstract these parts away and still analyze the variables of interest. In other cases, it is possible to preprocess a program to replace complex data values with their size (or “norm”), which is the quantity of importance for loop control. Methods for this process have been widely studied in conjunction with termination and cost analysis.
Command Semantics. The core language is inherently nondeterministic. The choose command represents a nondeterministic choice, and can be used to abstract any concrete conditional command by simply ignoring the condition; this is necessary to ensure that our analysis problem is decidable. Note that what we ignore is branches within a loop body and not branches that implement the loop control, which we represent by a dedicated loop command. The command Open image in new window repeats C a (nondeterministic) number of times bounded by the value of \(\mathtt{E}\), which is evaluated just before the loop is entered. Thus, as a conservative abstraction, it may be used to model different forms of loops (forloops, whileloops) as long as a bound on the number of iterations, as a function of the program state on loop initiation, can be determined and expressed in the language. There is an ample body of research on analysing programs to find such bounds where they are not explicitly given by the programmer; in particular, bounds can be obtained from a ranking function for the loop [2, 3, 5, 6, 23]. Note that the arithmetic in our language is too restricted to allow for the maintenance of counters and the creation of while loops, as there is no subtraction, no explicit constants and no tests. Thus, for realistic “concrete” programs which use such devices, loopbound analysis is supposed to be performed on the concrete program as part of the process of abstracting it to the core language. This process is illustrated in [9, Sect. 2].
1.2 The Algorithm
Consider the program in Fig. 2. Suppose that it is started with the values of the variables \(\texttt {X}_1,\texttt {X}_2,\dots \) being \(x_1,x_2,\dots \). Our purpose is to bound the values of all variables at the conclusion of the program in terms of those initial values. Indeed, they are all polynomially bounded, and our algorithm provides tight bounds. For instance, it establishes that the final value of \(\texttt {X}_3\) is tightly bounded (up to a constant factor) by \(\max (x_4(x_4 + x_1^2),x_4(x_2 + x_3 + x_1^2) )\).
In fact, it produces information in a more precise form, as a disjunction of simultaneous bounds. This means that it generates vectors, called multipolynomials, that give simultaneous bounds on all variables; for example, with the program in Fig. 2, one such multipolynomial is \({\langle x_1, x_2, x_3, x_4\rangle }\) (this is the result of all loops taking a very early exit). This form is important in the context of a compositional analysis. To see why, suppose that we provide, for a command with variables \(\texttt {X},\texttt {Y}\), the bounds \({\langle x,y\rangle }\) and \({\langle y,x\rangle }\). Then we know that the sum of their values is always bounded by \(x+y\), a result that would have not been deduced had we given the bound \(\max (x,y)\) on each of the variables. The difference may be critical for the success of analyzing an enclosing or subsequent command.
Multivariate bounds are often of interest, and perhaps require no justification, but let us point out that multivariate polynomials are necessary even if we’re ultimately interested in a univariate bound, in terms of some single initial value, say n. This is, again, due to the analysis being compositional. When we analyze an internal command that uses variables \(\texttt {X},\texttt {Y},\dots \) we do not know in what possible contexts the command will be executed and how the values of these variables will be related to n.
Some highlights of our solution are as follows.

We reduce the problem of analyzing any corelanguage program to the problem of analyzing a single loop, whose body is already processed, and therefore presented as a collection of multipolynomials. This is typical of algorithms that analyze a structured imperative language and do so compositionally.

Since we are computing bounds only up to a constant factor, we work with abstract polynomials, that have no numeric coefficients.

We further introduce \(\tau \)polynomials, to describe the evolution of values in a loop. These have an additional parameter \(\tau \) (for “time”; more precisely, number of iterations). Introducing \(\tau \)polynomials was a key step in the solution.

The analysis of a loop is simply a closure computation under two operations: ordinary composition, and generalization which is the operation that predicts the evolution of values by judiciously adding \(\tau \)’s to idempotent abstract multipolynomials.
The remainder of this paper is structured as follows. In Sect. 2 we give some definitions and state our main result. In Sects. 3, 4 and 5 we present our algorithm. In Sect. 6, we outline the correctness proofs. Section 7 considers related work, and Sect. 8 concludes and discusses ideas for further work.
2 Preliminaries
In this section, we give some basic definitions, complete the presentation of our programming language and precisely state the main result.
2.1 Some Notation and Terminology
The Language. We remark that in our language syntax there is no special form for a “program unit”; in the text we sometimes use “program” for the subject of our analysis, yet syntactically it’s just a command.
Polynomials and Multipolynomials. We work throughout this article with multivariate polynomials in \(x_1,\dots ,x_n\) that have nonnegative integer coefficients and no variables other than \(x_1,\dots ,x_n\); when we speak of a polynomial we always mean one of this kind. Note that over the nonnegative integers, such polynomials are monotonically (weakly) increasing in all variables.
The postfix substitution operator [a/b] may be applied to any sort of expression containing a variable b, to substitute a instead; e.g., \((x^2+yx+y)[2z/y] = x^2+2zx+2z\).
When discussing a command, statetransition, or program trace, with a variable \(\texttt {X}_{i}\), \(x_i\) will denote, as a rule, the initial value of this variable, and \(x'_i\) its final value. Thus we distinguish the syntactic entity by the typewriter font. We write the polynomials manipulated by our algorithms using the variable names \(x_i\). We presume that an implementation of the algorithm represents polynomials concretely so that ordinary operations such as composition can be applied, but otherwise we do not concern ourselves much with representation.
The parameter n always refers to the number of variables in the subject program. The set [n] is \(\{1,\dots ,n\}\). For a set S an ntuple over S is a mapping from [n] to S. The set of these tuples is denoted by \(S^n\). Throughout the paper, various natural liftings of operators to collections of objects is tacitly assumed, e.g., if S is a set of integers then \(S+1\) is the set \(\{s+1 \mid s\in S\}\) and \(S+S\) is \(\{s+t \mid s,t\in S\}\). We use such lifting with sets as well as with tuples. If S is ordered, we extend the ordering to \(S^n\) by comparing tuples elementwise (this leads to a partial order, in general, e.g., with natural numbers, \({\langle 1,3\rangle }\) and \({\langle 2,2\rangle }\) are incomparable).
Definition 1
A polynomial transition (PT) represents a mapping of an “input” state \(\mathbf x = {\langle x_1,\dots ,x_n\rangle }\) to a “result” state \({\mathbf x}' = {\langle x'_1,\dots ,x'_n\rangle } = {\mathbf p}({\varvec{x}})\) where \(\mathbf p = {\langle {\mathbf p}[1], \dots , {\mathbf p}[n] \rangle }\) is an ntuple of polynomials. Such a \(\mathbf p\) is called a a multipolynomial (MP); we denote by \({\texttt {MPol}}\) the set of multipolynomials, where the number of variables n is fixed by context.
Multipolynomials are used in this work to represent the effect of a command. Various operations will be applied to MPs, mostly obvious—in particular, composition (which corresponds to sequential application of the transitions). Note that composition of multipolynomials, \({\mathbf q} \circ {\mathbf p}\), is naturally defined since \(\mathbf p\) supplies n values for the n variables of \(\mathbf q\) (in other words, they are composed as functions in \(\mathbb {N}^n \rightarrow \mathbb {N}^n\)). We define \( Id \) to be the identity transformation, \({\mathbf x}' = {\mathbf x}\) (in MP notation: \({\mathbf p}[i] = x_i\) for \(i=1,\dots ,n\)).
2.2 Formal Semantics of the Core Language
The semantics associates with every command Open image in new window over variables \(\texttt {X}_1,\dots ,\texttt {X}_n\) a relation Open image in new window . In the expression Open image in new window , vector \(\varvec{x}\) (respectively \(\varvec{y}\)) is the store before (after) the execution of Open image in new window .
Remarks. The following two changes may enhance the applicability of the core language for simulating certain concrete programs; we include them as “options” because they do not affect the validity of our proofs.
 1.
The semantics of an assignment operation may be nondeterministic: Open image in new window assigns to X some nonnegative value bounded by E. This is useful to abstract expressions which are not in the core language, and also to use the results of size analysis of subprograms. Such an analysis may determine invariants such as “the value of Open image in new window is at most the sum of X and Y.”
 2.
The domain of the integer variables may be extended to \(\mathbb {Z}\). In this case the bounds that we seek are on the absolute value of the output in terms of absolute values of the inputs. This change does not affect our conclusions because of the facts \(xy = x\cdot y\) and \(x+y \le x+y\). The semantics of the loop command may be defined either as doing nothing if the loop bound is not positive, or using the absolute value as a bound.
2.3 Detailed Statement of the Main Result
The polynomialbound analysis problem is to find, for any given command, which output variables are bounded by a polynomial in the input values (which are simply the values of all variables upon commencement of the program), and to bound these output values tightly (up to constant factors). The problem of identifying the polynomiallybounded variables is completely solved by [7]. We rely on that algorithm, which is polynomialtime, to do this for us (as further explained below).
Our main result is thus stated as follows.
Theorem 1
There is an algorithm which, for a command Open image in new window , over variables \(\texttt {X}_1\) through \(\texttt {X}_n\), outputs a set \({\mathcal B}\) of multipolynomials, such that the following hold, where \({\texttt {PB}}\) is the set of indices i of variables \(\texttt {X}_i\) which are polynomially bounded under Open image in new window .
 1.(Bounding) There is a constant \(c_{\mathbf p}\) associated with each \({\mathbf p}\in {\mathcal B}\), such that
 2.(Tightness) For every \({\mathbf p}\in {\mathcal B}\) there are constants \(d_{\mathbf p}>0\), \({\varvec{x}}_0\) such that for all \({\varvec{x}}\ge {\varvec{x}}_0\) there is a \(\varvec{y}\) such that
3 Analysis Algorithm: First Concepts
The following sections describe our analysis algorithm. Naturally, the most intricate part of the analysis concerns loops. In fact we break the description into stages: first we reduce the problem of analyzing any program to that of analyzing simple disjunctive loops, defined next. Then, we approach the analysis of such loops, which is the main effort in this work.
Definition 2
A simple disjunctive loop (SDL) is a finite set of PTs.
The loop is “disjunctive” because its meaning is that in every iteration, any of the given transitions may be applied. The semantics is formalized by traces (Definition 4). A SDL does not specify the number of iterations; our analysis generates polynomials which depend on the number of iterations as well as the initial state. For this purpose, we now introduce \(\tau \)polynomials where \(\tau \) represents the number of iterations.
Definition 3
\(\tau \)polynomials are polynomials in \(x_1,\dots ,x_n\) and \(\tau \).
\(\tau \) has a special status and does not have a separate component in the polynomial giving its value. If p is a \(\tau \)polynomial, then \(p(v_1,\dots ,v_n)\) is the result of substituting each \(v_i\) for the respective \(x_i\); and we also write \(p(v_1,\dots ,v_n,t)\) for the result of substituting t for \(\tau \) as well. The set of \(\tau \)polynomials in n variables (n known from context) is denoted Open image in new window .
Multipolynomials and polynomial transitions are formed from \(\tau \)polynomials just as previously defined and are used to represent the effect of a variable number of iterations. For example, the \(\tau \)polynomial transition \({\langle x'_1,x'_2\rangle } = {\langle x_1,\ x_2+\tau x_1\rangle }\) represents the effect of repeating (\(\tau \) times) the assignment Open image in new window . The effect of iterating the composite command: Open image in new window has an effect described by \({\mathbf x}' = {\langle x_1,\ x_2+\tau x_1, \ x_3 + \tau x_2 + \tau ^2 x_1\rangle }\) (here we already have an upper bound which is not reached precisely, but is correct up to a constant factor). We denote the set of \(\tau \)polynomial transitions by Open image in new window . We should note that composition \({\mathbf q}\circ {\mathbf p}\) over Open image in new window is performed by substituting \({\mathbf p}[i]\) for each occurrence of \(x_i\) in \(\mathbf q\). Occurrences of \(\tau \) are unaffected (since \(\tau \) is not part of the state). We make a couple of preliminary definitions before reaching our goal which is the definition of the simple disjunctive loop problem (Definition 6).
Definition 4
Let \({\mathcal S}\) be a set of polynomial transitions. An (abstract) trace over \({\mathcal S}\) is a finite sequence \({\mathbf p}_1;\dots ; {\mathbf p}_{\sigma }\) of elements of \({\mathcal S}\). Thus \(\sigma \) denotes the length of the trace. The set of all traces is denoted \({\mathcal S}^*\). We write Open image in new window for the composed relation \( {\mathbf p}_{\sigma } \circ \dots \circ {\mathbf p}_1\) (for the empty trace, \({\varepsilon }\), we have Open image in new window ).
Definition 5
Let \(p(\mathbf x)\) be a (concrete or abstract) \(\tau \)polynomial. We write \(\dot{p}\) for the sum of linear monomials of p, namely any one of the form \(ax_i\) with constant coefficient a. We write \(\ddot{p}\) for the rest. Thus \(p = \dot{p}+\ddot{p}\).
Definition 6
 1.(Bounding) There is a constant \(c_{\mathbf p} > 0\) associated with each \({\mathbf p}\in {\mathcal B}\), such that
 2.(Tightness) For every \({\mathbf p}\in {\mathcal B}\) there are constants \(d_{\mathbf p}>0\), \({\varvec{x}}_0\) such that for all \({\varvec{x}}\ge {\varvec{x}}_0\) there are a trace \(\sigma \) and a state vector \(\varvec{y}\) such that
Note that in the lowerbound clause (2), the linear monomials of p are not multiplied, in the lefthand side, by the coefficient \(d_{\mathbf p}\); this sets, in a sense, a stricter requirement for them: if the trace maps x to \(x^2\) then the bound \(2x^2\) is acceptable, but if it maps x to x, the bound 2x is not accepted. The reader may understand this technicality by considering the effect of iteration: it is important to distinguish the transition \(x'_1 = x_1\), which can be iterated ad libitum, from the transition \(x'_1 = 2x_1\), which produces exponential growth on iteration. Distinguishing \(x'_1 = x_1^2\) from \(x'_1 = 2x_1^2\) is not as important. The result set \({\mathcal B}\) above is sometimes called a loop summary. We remark that Definition 6 implies that the max of all these polynomials provides a “big Theta” bound for the worstcase (namely biggest) results of the loop’s computation. We prefer, however, to work with sets of polynomials. Another technical remark is that \(c_{\mathbf p}, d_{\mathbf p}\) range over real numbers. However, our data and the coefficients of polynomials remain integers, it is only such comparisons that are performed with real numbers (specifically, to allow \(c_{\mathbf p}\) to be smaller than one).
4 Reduction to Simple Disjunctive Loops
We show how to reduce the problem of analysing corelanguage programs to the analysis of polynomiallybounded simple disjunctive loops.
4.1 Symbolic Evaluation of StraightLine Code
Straightline code consists of atomic commands—namely assignments (or skip, equivalent to \(\texttt {X}_1\texttt {:=}\,\texttt {X}_1\)), composed sequentially. It is obvious that symbolic evaluation of such code leads to polynomial transitions.
Example 1
Open image in new window is precisely represented by the transition \({\langle x_1,x_2,x_3\rangle }' = {\langle x_1x_3,\ x_1,\ x_3,\ x_1+x_3\rangle }\).
4.2 Evaluation of Nondeterministic Choice
Example 2
Open image in new window is represented by the set \(\{ {\langle x_1, x_1, x_3, x_1+x_3\rangle }, \ {\langle x_1 x_3, x_1, x_3, x_4 \rangle } \}\).
4.3 Handling Loops
The above shows that any loopfree command in our language can be precisely represented by a finite set of PTs. Consequently, the problem of analyzing any command is reduced to the analysis of simple disjunctive loops.
Thus, the whole solution is constructed as an ordinary abstract interpretation, following the semantics of the language, except for procedure Solve, described below.
Example 3
The next section describes procedure Solve, and operates under the assumption that all variables are polynomially bounded in the loop. However, a loop can generate exponential growth. To cover this eventuality, we first apply the algorithm of [7] which identifies which variables are polynomially bounded. If some \(\texttt {X}_i\) is not polynomially bounded we replace the ith component of all the loop transitions with \(x_n\) (where we assume \(x_n\) to be a dedicated, unmodified variable). Clearly, after this change, all variables are polynomially bounded; moreover, variables which are genuinely polynomial are unaffected, because they cannot depend on a superexponential quantity (given the restricted arithmetics in our language). In reporting the results of the algorithm, we should display “superpolynomial” instead of all bounds that depend on \(x_n\).
5 Simple Disjunctive Loop Analysis Algorithm
Observation 2 For a SDL that does not use addition, the sequence \(Q_i\) as in (1) reaches a fixed point, and the fixed point provides tight bounds for all the polynomiallybounded variables.
When we have addition, we find that knowing that all variables are polynomially bounded does not imply convergence of the sequence (1). An example is: Open image in new window yielding the infinite sequence of MPs \({\langle x_1,x_2,x_3\rangle }\), \({\langle x_1+x_2,x_2,x_3\rangle }\), \({\langle x_1+2x_2,x_2,x_3\rangle }\), ... Our solution employs two means. One is the introduction of \(\tau \)polynomials, already presented. The other is a kind of abstraction—intuitively, ignoring the concrete values of (nonzero) coefficients. Let us first define this abstraction:
Definition 7
\({\texttt {APol}}\), the set of abstract polynomials, consists of formal sums of distinct monomials over \(x_1,\dots ,x_n\), where the coefficient of every monomial included is 1. We extend the definition to an abstraction of \(\tau \)polynomials, denoted Open image in new window .
 1.
The abstraction of a polynomial p, \(\alpha (p)\), is obtained by modifying all (nonzero) coefficients to 1.
 2.
Addition and multiplication in Open image in new window is defined in a natural way so that \(\alpha (p)+\alpha (q) = \alpha (p+q)\) and \(\alpha (p)\cdot \alpha (q) = \alpha (p\cdot q)\) (to carry these operations out, you just go through the motions of adding or multiplying ordinary polynomials, ignoring the coefficient values).
 3.
The canonical concretization of an abstract polynomial, \(\gamma ({\mathbf p})\) is obtained by simply regarding it as an ordinary polynomial.
 4.
These definitions extend naturally to tuples of (abstract) polynomials.
 5.
The set of abstract multipolynomials \({\texttt {AMPol}}\) and their extension with \(\tau \) ( Open image in new window ) are defined as ntuples over \({\texttt {APol}}\) (respectively, Open image in new window ). We use AMP as an abbreviation for abstract multipolynomial.
 6.
Composition \({\mathbf p}\mathop {\bullet }{\mathbf q}\), for \({\mathbf p}, {\mathbf q} \in {\texttt {AMPol}}\) (or Open image in new window ) is defined as \(\alpha (\gamma ({\mathbf p})\circ \gamma ({\mathbf q}))\); it is easy to see that one can perform the calculation without the detour through polynomials with coefficients. The different operator symbol (“\(\mathop {\bullet }\)” versus “\(\circ \)”) helps in disambiguating expressions.
Definition 8
Note that \({\mathbf t}^{\mathop {\bullet }(n)} = \alpha (\gamma ({\mathbf t})^{(n)})\), where \({\mathbf p}^{(n)}\) is defined using ordinary composition.
Definition 9
In the correctness proof, we argue that when all variables are polynomially bounded in a loop \({\mathcal S}\), the closure of \(\alpha ({\mathcal S})\) can be computed in finite time; equivalently, it equals \(\bigcup _{i=0}^{k} (\alpha ({{\mathcal S}}))^{\mathop {\bullet }(i)}\) for some k. The argument is essentially the same as in the multiplicative case.
The second operation is called generalization and its role is to capture the behaviour of accumulator variables, meaning variables that grow by accumulating increments in the loop, and make explicit the dependence on the number of iterations. The identification of which additive terms in a MP should be considered as increments that accumulate is at the heart of our problem, and is greatly simplified by concentrating on idempotent AMPs.
Definition 10
Open image in new window is called idempotent if \(\mathbf p \mathop {\bullet }\mathbf p = \mathbf p\).
Note that this is composition in the abstract domain. So, for instance, \({\langle x_1, x_2\rangle }\) is idempotent, and so is \({\langle x_1 + x_2, x_2\rangle }\), while \({\langle x_1x_2, x_2\rangle }\) and \({\langle x_1 + x_2, x_1\rangle }\) are not.
Definition 11
For \(\mathbf p\) an (abstract) multipolynomial, we say that \(x_i\) is selfdependent in \(\mathbf p\) if \(\mathbf p[i]\) depends on \(x_i\). We call a monomial selfdependent if all the variables appearing in it are.
Definition 12
Example 4
Let \(\mathbf p = {\langle x_1 + \tau x_2 + \tau x_3 + x_3 x_4,\ x_3,\ x_3 ,\ x_4 \rangle }\). The selfdependent variables are all but \(x_2\). Since \(x_1\) is selfdependent, we will apply the above definition to \({\mathbf p}[1]\), so that \({\mathbf p}[1]' = x_3\), \({\mathbf p}[1]'' = x_3 x_4\) and \({\mathbf p}[1]''' = \tau x_2\). Note that a factor of \(\tau \) is stripped in \({\mathbf p}[1]'\). Had the monomial been \(\tau ^2 x_3\), we would have \({\mathbf p}[1]' = \tau x_3\).
Definition 13
Note that the arithmetic here is abstract (see examples below). Note also that in the term \(\tau {\mathbf p}[i]'\) the \(\tau \) is already present in \(\mathbf p\), while in \(\tau {\mathbf p}[i]''\) it is added to existing monomials. In this definition, the monomials of \({\mathbf p}[i]'''\) are treated like those of \(\tau {\mathbf p}[i]'\); however, in certain steps of the proofs we treat them differently, which is why the notation separates them.
Example 5
Let \(\mathbf p = \langle x_1 + x_3,\ x_2 + x_3 + x_4,\ x_3,\ x_3 \rangle \).
Note that \({\mathbf p}\mathop {\bullet }{\mathbf p} = {\mathbf p}\). We have \({\mathbf p}^\tau = \langle x_1 + \tau x_3,\ x_2 + \tau x_3+x_4,\ x_3,\ x_3 \rangle \).
Example 6
Let \(\mathbf p = {\langle x_1 + \tau x_2 + \tau x_3 + \tau x_3 x_4,\ x_3,\ x_3 ,\ x_4 \rangle }\).
Note that \({\mathbf p}\mathop {\bullet }{\mathbf p} = {\mathbf p}\). The selfdependent variables are all but \(x_2\).
We have \({\mathbf p}^\tau = {\langle x_1 + \tau x_2 + \tau x_3 + \tau x_3 x_4,\ x_3,\ x_3,\ x_4 \rangle } = {\mathbf p}\).
 1.
Set \(T = \alpha ({\mathcal S})\).
 2.Repeat the following steps until T remains fixed:
 (a)
Closure: Set T to \({Cl(T)}\).
 (b)
Generalization: For all \({\mathbf p}\in T\) such that \({\mathbf p}\mathop {\bullet }{\mathbf p} = {\mathbf p}\), add \({\mathbf p}^\tau \) to T.
 (a)
Example 7
6 Correctness

A key notion in our proofs is that of realizability. Intuitively, when we come up with a bound, we want to show that there are traces that achieve (realize) this bound for arbitrarily large input values.

In the lowerbound proof, we describe a “behavior” by a pattern. A pattern is constructed like a regular expression with concatenation and Kleenestar. However, they allow no nested iteration constructs, and the starred subexpressions have to be repeated the same number of times; for example, the pattern \({\mathbf p}^*{\mathbf q}^*\) generates the traces \(\{ {\mathbf p}^t{\mathbf q}^t,\ t\ge 0\}\). The proof constructs a pattern for every multipolynomial computed, showing it is realizable. It is interesting that such simple patterns suffice to establish tight lower bounds for all our programs.

In the upperbound proof, we describe all “behaviors” by a finite set of welltyped regular expressions [10]. This elegant tool channels the power of the Factorization Forest Theorem [25]; this brings out the role of idempotent elements, which is key in our algorithm.

Interestingly, the lowerbound proof not only justifies the tightness of our upper bounds, it also justifies the termination of the algorithm and the application of the Factorization Forest Theorem in the upperbound proof, because it shows that our abstract multipolynomials generate a finite monoid.
7 Related Work
Bound analysis, in the sense of finding symbolic bounds for data values, iteration bounds and related quantities, is a classic field of program analysis [18, 24, 27]. It is also an area of active research, with tools being currently (or recently) developed including COSTA [1], AProVE [13], CiaoPP [19], \(C^4B\) [11], Loopus [26]—all for imperative programs. There is also work on functional and logic programs, term rewriting systems, recurrence relations, etc. which we cannot attempt to survey here. In the rest of this section we survey work which is more directly related to ours, and has even inspired it.
The LOOP language is due to Meyer and Ritchie [20], who note that it computes only primitive recursive functions, but complexity can rise very fast, even for programs with nestingdepth 2. Subsequent work [15, 16, 17, 22] concerning similar languages attempted to analyze such programs more precisely; most of them proposed syntactic criteria, or analysis algorithms, that are sufficient for ensuring that the program lies in a desired class (often, polynomialtime programs), but are not both necessary and sufficient: thus, they do not prove decidability (the exception is [17] which has a decidability result for a weak “core” language). The core language we use in this paper is from BenAmram et al. [7], who observed that by introducing weak bounded loops instead of concrete loop commands and nondeterministic branching instead of “if”, we have weakened the semantics just enough to obtain decidability of polynomial growthrate. Justifying the necessity of these relaxations, [8] showed undecidability for a language that can only do addition and definite loops (that cannot exit early).
In the vast literature on bound analysis in various forms, there are a few other works that give a complete solution for a weak language. Sizechange programs are considered by [12, 28]. Sizechange programs abstract away nearly everything in the program, leaving a controlflow graph annotated with assertions about variables which decrease (or do not increase) in a transition. Thus, it does not assume structured and explicit loops, and it cannot express information about values which increase. Both works yield tight bounds on the number of transitions until termination.
Dealing with a somewhat different problem, [14, 21] both check, or find, invariants in the form of polynomial equations. We find it remarkable that they give complete solutions for weak languages, where the weakness lies in the nondeterministic controlflow, as in our language. If one could give a complete solution for polynomial inequalities, this would have implied a solution to our problem as well.
8 Conclusion and Further Work
We have solved an open problem in the area of analyzing programs in a simple language with bounded loops. For our language, it has been previously shown that it is possible to decide whether a variable’s value, number of steps in the program, etc. are polynomially bounded or not. Now, we have an algorithm that computes tight polynomial bounds on the final values of variables in terms of initial values. The bounds are tight up to constant factors (suitable constants are also computable). This result improves our understanding of what is computable by, and about, programs of this form. An interesting corollary of our algorithm is that as long as variables are polynomially bounded, their worstcase bounds are described tightly by (multivariate) polynomials. This is, of course, not true for common Turingcomplete languages. Another interesting corollary of the proofs is the definition of a simple class of patterns that suffice to realize the worstcase behaviors. This will appear in a planned extended version of this paper.
There are a number of possible directions for further work. We would like to look for decidability results for richer (yet, obviously, subrecursive) languages. Some possible language extensions include deterministic loops, variable resets (cf. [4]), explicit constants, and procedures. The inclusion of explicit constants is a particularly challenging open problem.
Rather than extending the language, we could extend the range of bounds that we can compute. In light of the results in [17], it seems plausible that the approach can be extended to classify the Grzegorczykdegree of the growth rate of variables when they are superpolynomial. There may also be room for progress regarding precise bounds of the form \(2^{poly}\).
In terms of time complexity, our algorithm is polynomial in the size of the program times \({n^{nd}}\), where d is the highest degree of any MP computed. Such exponential behavior is to be expected, since a program can be easily written to compute a multivariate polynomial that is exponentially long to write. But there is still room for finer investigation of this issue.
References
 1.Albert, E., Arenas, P., Genaim, S., Puebla, G., Zanardini, D.: Cost analysis of objectoriented bytecode programs. Theor. Comput. Sci. 413(1), 142–159 (2012). https://doi.org/10.1016/j.tcs.2011.07.009MathSciNetCrossRefzbMATHGoogle Scholar
 2.Alias, C., Darte, A., Feautrier, P., Gonnord, L.: Multidimensional rankings, program termination, and complexity bounds of flowchart programs. In: Cousot, R., Martel, M. (eds.) SAS 2010. LNCS, vol. 6337, pp. 117–133. Springer, Heidelberg (2010). https://doi.org/10.1007/9783642157691_8CrossRefGoogle Scholar
 3.Bagnara, R., Hill, P.M., Zaffanella, E.: The parma polyhedra library: toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems. Sci. Comput. Program. 72(1–2), 3–21 (2008)MathSciNetCrossRefGoogle Scholar
 4.BenAmram, A.M.: On decidable growthrate properties of imperative programs. In: Baillot, P. (ed.) International Workshop on Developments in Implicit Computational complExity (DICE 2010). EPTCS, vol. 23, pp. 1–14 (2010). https://doi.org/10.4204/EPTCS.23.1
 5.BenAmram, A.M., Genaim, S.: Ranking functions for linearconstraint loops. J. ACM 61(4), 26:1–26:55 (2014). https://doi.org/10.1145/2629488MathSciNetCrossRefzbMATHGoogle Scholar
 6.BenAmram, A.M., Genaim, S.: On multiphaselinear ranking functions. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10427, pp. 601–620. Springer, Cham (2017). https://doi.org/10.1007/9783319633909_32CrossRefGoogle Scholar
 7.BenAmram, A.M., Jones, N.D., Kristiansen, L.: Linear, polynomial or exponential? Complexity inference in polynomial time. In: Beckmann, A., Dimitracopoulos, C., Löwe, B. (eds.) CiE 2008. LNCS, vol. 5028, pp. 67–76. Springer, Heidelberg (2008). https://doi.org/10.1007/9783540694076_7CrossRefGoogle Scholar
 8.BenAmram, A.M., Kristiansen, L.: On the edge of decidability in complexity analysis of loop programs. Int. J. Found. Comput. Sci. 23(7), 1451–1464 (2012). https://doi.org/10.1142/S0129054112400588MathSciNetCrossRefzbMATHGoogle Scholar
 9.BenAmram, A.M., Pineles, A.: Flowchart programs, regular expressions, and decidability of polynomial growthrate. In: Hamilton, G., Lisitsa, A., Nemytykh, A.P. (eds.) Proceedings of the Fourth International Workshop on Verification and Program Transformation (VPT). EPTCS, vol. 216, pp. 24–49 (2016). https://doi.org/10.4204/EPTCS.216.2
 10.Bojańczyk, M.: Factorization forests. In: Diekert, V., Nowotka, D. (eds.) DLT 2009. LNCS, vol. 5583, pp. 1–17. Springer, Heidelberg (2009). https://doi.org/10.1007/9783642027376_1CrossRefGoogle Scholar
 11.Carbonneaux, Q., Hoffmann, J., Shao, Z.: Compositional certified resource bounds. In: Proceedings of the ACM SIGPLAN 2015 Conference on Programming Language Design and Implementation (PLDI). ACM (2015)Google Scholar
 12.Colcombet, T., Daviaud, L., Zuleger, F.: Sizechange abstraction and maxplus automata. In: CsuhajVarjú, E., Dietzfelbinger, M., Ésik, Z. (eds.) MFCS 2014. LNCS, vol. 8634, pp. 208–219. Springer, Heidelberg (2014). https://doi.org/10.1007/9783662445228_18CrossRefGoogle Scholar
 13.Giesl, J., et al.: Analyzing program termination and complexity automatically with AProVE. J. Autom. Reasoning 58(1), 3–31 (2017). https://doi.org/10.1007/s108170169388yMathSciNetCrossRefzbMATHGoogle Scholar
 14.Hrushovski, E., Ouaknine, J., Pouly, A., Worrell, J.: Polynomial invariants for affine programs. In: Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2018, pp. 530–539. ACM, New York (2018). https://doi.org/10.1145/3209108.3209142
 15.Jones, N.D., Kristiansen, L.: A flow calculus of mwpbounds for complexity analysis. ACM Trans. Comput. Logic 10(4), 1–41 (2009). https://doi.org/10.1145/1555746.1555752MathSciNetCrossRefzbMATHGoogle Scholar
 16.Kasai, T., Adachi, A.: A characterization of time complexity by simple loop programs. J. Comput. Syst. Sci. 20(1), 1–17 (1980). https://doi.org/10.1016/00220000(80)90001XMathSciNetCrossRefzbMATHGoogle Scholar
 17.Kristiansen, L., Niggl, K.H.: On the computational complexity of imperative programming languages. Theor. Comput. Sci. 318(1–2), 139–161 (2004). https://doi.org/10.1016/j.tcs.2003.10.016MathSciNetCrossRefzbMATHGoogle Scholar
 18.Le Métayer, D.: ACE: an automatic complexity evaluator. ACM Trans. Program. Lang. Syst. 10(2), 248–266 (1988). https://doi.org/10.1145/42190.42347CrossRefGoogle Scholar
 19.LópezGarcía, P., Darmawan, L., Klemen, M., Liqat, U., Bueno, F., Hermengildo, M.V.: Intervalbased resource usage verification by translation into Horn clauses and an application to energy consumption. Theory Pract. Logic Program. 18(2), 167–223 (2018)MathSciNetCrossRefGoogle Scholar
 20.Meyer, A.R., Ritchie, D.M.: The complexity of loop programs. In: Proceedings of the 22nd ACM National Conference, Washington, DC, pp. 465–469 (1967)Google Scholar
 21.MüllerOlm, M., Seidl, H.: Computing polynomial program invariants. Inf. Process. Lett. 91(5), 233–244 (2004). https://doi.org/10.1016/j.ipl.2004.05.004MathSciNetCrossRefzbMATHGoogle Scholar
 22.Niggl, K.H., Wunderlich, H.: Certifying polynomial time and linear/polynomial space for imperative programs. SIAM J. Comput. 35(5), 1122–1147 (2006). https://doi.org/10.1137/S0097539704445597MathSciNetCrossRefzbMATHGoogle Scholar
 23.Podelski, A., Rybalchenko, A.: A complete method for the synthesis of linear ranking functions. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 239–251. Springer, Heidelberg (2004). https://doi.org/10.1007/9783540246220_20CrossRefGoogle Scholar
 24.Rosendahl, M.: Automatic complexity analysis. In: Proceedings of the Conference on Functional Programming Languages and Computer Architecture, FPCA 1989, pp. 144–156. ACM (1989). https://doi.org/10.1145/99370.99381
 25.Simon, I.: Factorization forests of finite height. Theor. Comput. Sci. 72(1), 65–94 (1990). https://doi.org/10.1016/03043975(90)90047LMathSciNetCrossRefzbMATHGoogle Scholar
 26.Sinn, M., Zuleger, F., Veith, H.: Complexity and resource bound analysis of imperative programs using difference constraints. J. Autom. Reasoning 59(1), 3–45 (2017). https://doi.org/10.1007/s1081701694024MathSciNetCrossRefzbMATHGoogle Scholar
 27.Wegbreit, B.: Mechanical program analysis. Commun. ACM 18(9), 528–539 (1975). https://doi.org/10.1145/361002.361016MathSciNetCrossRefzbMATHGoogle Scholar
 28.Zuleger, F.: Asymptotically precise ranking functions for deterministic sizechange systems. In: Beklemishev, L.D., Musatov, D.V. (eds.) CSR 2015. LNCS, vol. 9139, pp. 426–442. Springer, Cham (2015). https://doi.org/10.1007/9783319202976_27CrossRefGoogle Scholar
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.