## Specification with Ghost Functions

In my previous post I showed that the C expression sublanguage extended with quantified expressions (**forall **and **exists**) is insufficient to allow some specifications to be expressed. I presented this function (annotated with an incomplete specification) to average an array of data:

int16_t average(constint16_t *arrayreadings, size_t numReadings)pre(readings.lwb == 0; readings.lim == numReadings)pre(numReadings != 0)post(result== ?/* sum of elements of readings *//numReadings) {intsum = 0; size_t i;for(i = 0; i < numReadings; ++i)keep(i <= numReadings)keep(sum == ?/* sum of first i elements of readings */)decrease(numReadings - i) { sum += readings[i]; }return(int16_t)(sum/numReadings); }

The two question marks need to be replaced by expressions for sums over the appropriate elements. Last time I showed how we can replace them by **over**-expressions. In this post, I’ll describe an alternative solution that has more general applications.

The problem we are faced with is that there is no C expression (even if we include **forall **and **exists **expressions) that can express the sum of some of array elements, where the number of elements is not statically known. We can calculate the sum using a loop; but a loop is not an expression and so cannot be used in specifications. However, specification expressions can contain function calls, provided that the functions have no side-effects. We can resolve our problem by defining a *ghost function* that calculates the sum. We’ll specify the function recursively so that we can handle the variable number of elements.

In the postcondition of *average*, we need to calculate the sum of all the elements of the *readings *array. In the loop variant, we need the sum of just the first *i* elements. So let’s declare a function that returns the sum of the first *n *elements of an array *a*, where *a* and *n *are parameters. We define the sum to be zero if *n *is zero, otherwise it’s the sum of the first *n-1* elements plus the *n*th element. Here’s our ghost function specification:

#ifdef__ARC__ghost integersumOf(constint16_t *arraya, size_t n)pre(a.lwb == 0; n <= a.lim)decrease(n)returns(n == 0 ? 0 : a[n - 1] + sumOf(a, n - 1)); #endif

Declaring the function **ghost **tells ArC that it is for use in specifications only. A ghost function must have a specification but no body. It can also do a few things that wouldn’t be allowed for a normal function. In particular, it can use types not normally allowed in C, such as **integer**, which is ArC’s name for the type of unbounded integers. I’ve declared the function as returning **integer **so that neither we nor ArC need worry about arithmetic overflow in the specification.

The precondition defines the conditions for the function to be well-behaved, as usual. I’ve specified the value that the function returns using a postcondition of the form **returns**(*expression*). This has the same meaning as **post**(**result **== *expression*), but it’s shorter. Also, we want to define the function recursively, and ArC doesn’t currently allow a recursive call to the function being specified inside **post**(…) but it does inside **returns**(…). I’ve enclosed the whole function declaration in **#ifdef __ARC__** … **#endif** so that the C compiler doesn’t see it.

Recursion would normally be forbidden in critical software (MISRA rule 16.2), however recursive calls in specifications are safe because they don’t carry any risk of run-time stack overflow. To ensure that the specification makes sense, ArC will need to prove that the recursion is bounded. That’s what the **decrease **clause is for, and it works exactly as if it were a loop variant – that is, it must decrease on each recursive call and it must not go negative.

Having defined *sumOf*,, we can complete the postcondition and loop invariant:

int16_t average(constint16_t *arrayreadings, size_t numReadings)pre(readings.lwb == 0; readings.lim == numReadings)pre(numReadings != 0)returns(sumOf(readings, numReadings)/numReadings) {intsum = 0; size_t i;for(i = 0; i < numReadings; ++i)keep(i <= numReadings)keep(sum == sumOf(readings, i))decrease(numReadings - i) { sum += readings[i]; }return(int16_t)(sum/numReadings); }

I’ve taken the liberty of replacing the original **post**(…) postcondition of average with the **returns**(…) form.

Just like the version of *average *I gave in my previous post, this one verifies completely except for possible overflow during integer arithmetic. I’ll show how we can deal with that next time.