Verifying pointer arithmetic

July 16, 2010 davidcrocker 3 comments

Today I’ll look at whether code that uses pointer arithmetic is any harder to verify than equivalent code that does not use pointer arithmetic.

Consider this function for copying an array (or part of an array) into another array (or part of another array):

void arrayCopy(const int* src, int* dst, size_t num) {
  size_t i;
  for (i = 0; i < num; ++i) {
    dst[i] = src[i];
  }
}

To verify it, we first need to specify what it is supposed to achieve in a postcondition and add any necessary preconditions. Then we need to add a loop invariant so that ArC can verify that the loop achieves the desired state when it terminates, and a loop variant so that ArC can prove that the loop terminates:

void arrayCopy(const int* array src, int* array dst, size_t num)
writes(dst.all)
pre(src.lim >= num; dst.lim >= num)
pre(disjoint(src.all, dst.all))
post(forall i in 0..(num - 1) :- dst[i] == src[i])
{ size_t i;
  for (i = 0; i < num; ++i)
  keep(i <= num)
  keep(forall j in 0..(i - 1) :- dst[j] == src[j])
  decrease(num - i)
  { dst[i] = src[i];
  }
}

This is fairly straightforward. As usual, the main loop invariant (the keep(forall…) clause) is a generalisation of the desired postcondition, and we need another invariant (the keep(i <= num) clause) to ensure that the main invariant and the body don’t violate the preconditions of the indexing operator. These invariants and the loop variant (the decrease(…) clause) are sufficient for ArC to verify this function.

Many C programmers would write the body as *dst++ = *src++; instead of explicitly indexing dst and src. If we make this change, then because we are changing src and dst within the loop, we need to describe their values in the invariant. Here is how we can do that:

void arrayCopy(const int* array src, int* array dst, size_t num)
writes(dst.all)
pre(src.lim >= num; dst.lim >= num)
pre(disjoint(src.all, dst.all))
post(forall i in 0..(num - 1) :- dst[i] == src[i])
{ size_t i;
  for (i = 0; i < num; ++i)
  keep(i <= num)
  keep(forall j in 0..(i - 1) :- (old dst)[j] == (old src)[j])
  keep(src == (old src) + i; dst == (old dst) + i)
  decrease(num - i)
  { *dst++ = *src++;
  }
}

Applying the keyword old to an expression in a loop invariant or loop variant yields the value of that expression just before the first iteration of the loop. I’ve had to replace src in the second loop invariant by old src and similarly for dst, because in that invariant I am referring to array elements computed relative to the original values of src and dst. I’ve described the modification to src and dst by adding a third invariant, that says that at the start and end of any iteration, src has its original value plus i, and similarly for dst. The post-increment operators applied to src and dst in the body ensure that these relationships are maintained. Once again, the function is verifiable by ArC.

What if we go further and get rid of the loop counter i as well? We can do that if we precompute the end pointer src + num at the start, and then iterate until src reaches this point. We still need to describe the values of src and dst in the invariant, but we must do so without the benefit of i. Here’s one way of doing it:

void arrayCopy(const int* array src, int* array dst, size_t num)
writes(dst.all)
pre(src.lim >= num; dst.lim >= num)
pre(disjoint(src.all, dst.all))
post(forall i in 0..(num - 1) :- dst[i] == src[i])
{ const int* array const srcEnd = src + num;
  while(src != srcEnd)
  keep(src.base == old(src.base))
  keep(0 <= src - (old src))
  keep(src - (old src) <= n)
  keep(forall j in 0..((src - (old src)) - 1) :- (old dst)[j] == (old src)[j])
  keep(dst == (old dst) + (src - (old src)))
  decrease(srcEnd - src)
  { *dst++ = *src++;
  }
}

Here’s how I arrived at the new loop annotations:

  • I’ve taken the three keep-clauses from the previous version and substituted (src – (old src)) for the loop counter i. That’s how I got the 3rd, 4th and 5th keep-clauses. For the last of the three, I also removed the resulting tautology src == (old src) + (src – (old src)), leaving just the component that talks about dst.
  • The original first invariant keep(i <= n) placed an upper bound on i, thereby ensuring that src[i] and dst[i] in the original code or *src and *dst in the modified code were in bounds. I didn’t need to put a lower bound on i because its type was size_t which is an unsigned type, so its lower bound was zero implicitly. But I’ve now replaced i by (src – (old src)), which has no such constraint. So I need to put a zero lower bound on this expression, which is what the second keep-clause is for.
  • Unfortunately, this isn’t quite enough. The expression src – old src uses the pointer-difference operator, which has the precondition that the operands are pointers into the same array. In general, when we re-assign an array pointer such as src, there is no requirement that it points into the same array as before. In this case, all we ever do to src is increment it; so it obviously continues to point into the same array as long as it is not over-incremented (which ArC verifies). However, ArC isn’t yet clever enough to spot this and generate an implicit loop invariant. So I’ve added the first invariant src.base == old(src.base) right at the beginning of the loop. In the expression src.base, base is a ghost field of src, just like lim and lwb. It subtracts src.lwb from src, yielding a pointer to the very start of the array that src points into. So to state that two array pointers point into the same array, I just need to say that their bases are equal.
  • Rather than just substitute for i in the loop variant, I’ve changed it to say that I expect the difference between srcEnd and src to decrease on each iteration.

I hope this example demonstrates to you that verification of C code that uses pointer arithmetic can be done, although it is likely to require more loop invariants than equivalent code that does not use pointer arithmetic, and getting them right may be a little harder. The number of verification conditions to be proved may be higher too – there were 24 from the original code, 32 from the second version, and 35 from the last version. So indexing should generally be preferred to explicit pointer arithmetic – as recommended by MISRA-C 2004 rule 17.4 – when writing verifiable code.

Finally, we could implement this function using memcpy:

void arrayCopy(const int* array src, int* array dst, size_t num)
writes(dst.all)
pre(src.lim >= num; dst.lim >= num)
pre(disjoint(src.all, dst.all))
pre(num * sizeof(int) <= maxof(size_t))
post(forall i in 0..(num - 1) :- dst[i] == src[i])
{ memcpy(dst, src, num * sizeof(int));
}

I’ve had to add another precondition to ensure that the computation of the number of bytes num * sizeof(int) doesn’t overflow a size_t. If Arc were to assume that the number of bytes in any array cannot exceed maxof(size_t) – as is the case for many implementations – then it could infer than dst.lim <= maxof(size_t)/sizeof(int) and we would not need the precondition. Unfortunately, the C standard makes no such promise.

The call to memcpy involves implicit conversions of its first two parameters from int* to void* and from const int* to const void*. In general, pointer conversions should be avoided when writing for ArC, because such conversions break ArC’s strong type model. However, ArC knows enough about the semantics of memcpy to prove the 10 verification conditions generated by this version too.

Run-time checks: Are they worth it?

July 7, 2010 davidcrocker 4 comments

One of the criticisms levelled against the use of C in safety-critical software is that the C language does not provide run-time checks automatically. For example, when indexing into an array, there is no check that the index is in bounds. Likewise, when doing integer arithmetic in C, there is no check for arithmetic overflow.

Many other programming languages do provide run-time checks. For example, Java and C# both raise exceptions if the program attempts to access an out-of-bounds array element. C# also provides an option to raise exceptions on arithmetic overflow. Ada provides all these run-time checks by default, although most compilers have an option to inhibit their generation. If you are programming in C++ and follow my recommendation to use a C++ array class, you can choose whether or not to perform index-in-bound checks.

Some developers of critical software consider it axiomatic that you should leave run-time checks enabled, if the programming language provides them. The practice of enabling run-time checks in debug builds but disabling them in release builds had been likened to carrying a fire extinguisher in your car at all times, except when you actually use it! Does this mean that we should insist that run-time checks are always enabled in release builds of critical software – and therefore avoid using programming languages that don’t support them, such as plain C?

In my opinion, shipping software containing run-time checks is generally a good thing if the software can do something useful when a run-time check fails, and that “something useful” is tested. This will probably require the deliberate introduction of temporary bugs, since we expect the software to be free from run-time errors in normal use. For an example of where this wasn’t done properly, see the report on the Ariane 5 launch. In summary, the launcher was lost when a run-time error occurred in a floating-point to fixed-point conversion, in a software module that was performing no useful function at the time. This led to the whole software subsystem shutting down. In doing so, it put an error code on the data bus, where it was interpreted as flight data. This caused the rocket nozzles to move to full deflection, leading to break-up of the rocket and triggering of the self-destruction mechanism. Had the overflowing data conversion simply yielded wrong information instead of raising an exception, the mission might not have been lost. This may be a rare example, but I think it illustrates that run-time checks are not invariably a good thing.

What should we do when a run-time check fails? The Ariane software developers assumed that any run-time check failures would be caused by random hardware errors, so their approach was to log the error (which they got right) and hand over to the backup unit (which didn’t help, because it had already experienced exactly the same run-time check failure).

In an embedded system, the options for handling failed run-time checks may be limited – particularly in life-preserving software, such as fly-by-wire control systems. Handing over to a backup unit won’t help if the run-time failure is caused by a systemic failure, and restarting the software is usually not a viable option. In such cases, the best solution is surely to show that run-time checks can never fail unless the hardware is at fault. You can attempt to show this by thorough testing, but you need to be very careful. The Ariane 5 software had been thoroughly tested and “proven in use” in Ariane 4; however, the software was then re-used in Ariane 5 without repeating the tests using the higher horizontal velocity inputs that occur in an Ariane 5 launch.

You can also attempt to show that run-time checks will never fail by formal verification. This will expose any hidden preconditions – such as the maximum horizontal velocity that could be handled by the Ariane software. Your re-use strategy must ensure that when you use previously verified software in a new environment, the new environment continues to respect the preconditions.

If run-time checks can be a mixed blessing in release software, are they ever always a good thing? Yes they are, when you are testing the software! No worries about what to do when a run-time check fails: just log it and terminate the test! Whether you are doing unit testing or integration testing, a failed run-time check gives an early indication of something being wrong, often making diagnosis of the bug much easier. Run-time checks can also catch “benign” bugs that don’t lead to incorrect results in tests, but which may bite you later. For example, reading an array beyond its bounds may cause the program to read a value that depends on the previous test. It might read a “benign” value during testing, but a “harmful” value during some particular use in the field.

If you’re performing formal verification before testing, you may argue that run-time checks are a waste of testing time. After all, they are never going to fail, right? Well, even with full formal verification, errors might occur. The compiler you are using might be generating the wrong code; or the linker might introduce an error; or the hardware itself may be faulty. Even formal verification systems have been known to contain errors. When we test formally verified software, any test failure is symptomatic of a fault in the development process, tool chain, or hardware. If we test throughly and find no errors, this gives us confidence that the process and tool chain are sound. Testing with run-time checks enabled (as well as without, if we intend to ship without run-time checks) and experiencing no run-time check failures adds to that confidence.

We adopt this philosophy when developing ArC and Perfect Developer. Both are written in Perfect, from which we generate code. For releases, we generate C++ because of the increased speed it offers – after all, theorem proving is computationally expensive. But all our development prior to final regression testing is done using C# code generation, so that we have the added benefit of run-time checks.

Aliasing and how to control it

Today I’ll start by writing a simple function that determines the maximum and minimum of two integers. We want to return two values, and C doesn’t make that easy unless we declare a struct to hold them. So I’ll pass two pointers to where I want the results stored instead. Here goes:

#include "arc.h"

void minMax(int a, int b, out int *min, out int *max)
post(*min <= a; *min <= b; *min == a || *min == b)
post(*max >= a; *max >= b; *max == a || *max == b)
{ *min = a < b ? a : b;
  *max = a > b ? a : b;
}

I’ve highlighted the ArC annotations in green. Notice how, in the postcondition, I’ve defined the minimum as a value that is less than or equal to both inputs and equal to one of them. Similarly for the maximum. Also notice that the two parameters used to pass the results back are flagged with the out keyword. This tells ArC (and users of this function) that the initial values of *min and *max are not used by the function, and the obligation to initialize them rests with the function, instead of with the caller of the function.

Does this function work? Always? Let’s see what ArC makes of it:

c:\escher\arctest\arctest20.c (9,25): Warning! Unable to prove: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (5,27)) (see C:\Escher\ArcTest\arctest20_unproven.htm#2), did not prove: *min <= b.
c:\escher\arctest\arctest20.c (9,25): Warning! Unable to prove: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (5,16)) (see C:\Escher\ArcTest\arctest20_unproven.htm#1), did not prove: *min <= a.

c:\escher\arctest\arctest20.c (9,25): Information! Confirmed: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (5,43)) (see C:\Escher\ArcTest\arctest20_proof.htm#3), proved: (a == *min) || (b == *min).
c:\escher\arctest\arctest20.c (9,25): Information! Confirmed: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (6,16)) (see C:\Escher\ArcTest\arctest20_proof.htm#4), proved: a <= *max.
c:\escher\arctest\arctest20.c (9,25): Information! Confirmed: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (6,27)) (see C:\Escher\ArcTest\arctest20_proof.htm#5), proved: b <= *max.
c:\escher\arctest\arctest20.c (9,25): Information! Confirmed: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (6,43)) (see C:\Escher\ArcTest\arctest20_proof.htm#6), proved: (a == *max) || (b == *max).

So ArC proves that the maximum is returned correctly, but not the minimum. What’s going on here? If we follow the link to the “unproven” file, we find a clue:

Could not prove: !(min == max)

Sure enough, it’s clear that if we make the following call:

int temp;
minMax(42, 24, &temp, &temp);

then minMax fails to meet its specification, because after storing the minimum in temp, it overwrites that value with the maximum.

The fix is to add a precondition to make that sort of call illegal:

void minMax(int a, int b, out int *min, out int *max)
pre(min != max)
post(*min <= a; *min <= b; *min == a || *min == b)
post(*max >= a; *max >= b; *max == a || *max == b)
{ *min = a < b ? a : b;
  *max = a > b ? a : b;
}

ArC is then able to verify the function.

This is a simple example of classic pointer aliasing - the situation in which two pointers refer to the same memory location. Any assignments we make through one of the pointers affects the value we read through the other. Another form of aliasing is pointer-variable aliasing. In this variant, a function is manipulating the value referred to by a pointer, and also a variable – typically a static variable declared outside the function. If the pointer points to the variable, then assignments through the pointer affect the value read from the variable, and vice versa.

ArC adopts a strict aliasing model that disallows casts between different pointer types. This means that in an ArC-compatible C program, pointer-pointer aliasing can only occur if the pointers have the same type, or if one of the types pointed to contains a field or element of the other. Similarly, aliasing between a pointer and a variable can only happen if the type pointed to is either the same as the type of the variable, or the same as the type a field or element contained in the variable. This serves to reduce the potential for aliasing.

Nevertheless, there are still many situations in which aliasing is possible and will break the program if it occurs. For example, consider the following function for copying elements from one array to another:

#include "arc.h"

void arrayCopy(const int* array src, int* array dst, size_t num)
pre(src.lim >= num; dst.lim >= num)
post(forall i in 0..(num - 1) :- dst[i] == src[i])
{ size_t j;
  int i;
  for (j = 0; j < num; ++j)
  keep(j <= num)
  keep(forall i in 0..(j - 1) :- dst[i] == src[i])
  decrease(size - i)
  { dst[j] = src[j];
  }
}

The precondition ensure that we don’t get out-of-bounds array accesses, and the postcondition describes what we want to achieve. For an explanation of the loop invariants (keep clauses) and loop variants (decrease clauses), see Verifying loops in C and C++, Verifying loops – part 2 and Verifying loops: proving termination.

Does this function meet its specification? Not if the elements of dst we are writing overlap with the elements of src that we are reading! So we need an anti-aliasing precondition again. Here’s one possibility:

void arrayCopy(const int* array src, int* array dst, size_t num)
pre(src.lim >= num; dst.lim >= num)
pre(forall i in src.indices; j in dst.indices; &src[i] != &dst[j])
post(forall i in 0..(num - 1) :- dst[i] == src[i])
{ ...
}

Stating that arrays don’t overlap in this way is rather cumbersome. But it can get a lot worse! Suppose we have, for example, an int* and a pointer to an array of structs, where each struct has several int fields, some int[] fields, and some struct fields that may themselves contains ints. If we want to write a precondition stating that the int* doesn’t alias any of the ints contained in the array of structs, we would need one precondition term for each possible way in which they could be aliased. If we add another int* parameter that is also forbidden to alias the other pointers, we would need to do much the same all over again.

To make it easier to express absence of aliasing, ArC provides the disjoint-expression. Here’s our example modified to use it:

void arrayCopy(const int* array src, int* array dst, size_t num)
pre(src.lim >= num; dst.lim >= num)
pre(disjoint(src.all, dst.all))
post(forall i in 0..(num - 1) :- dst[i] == src[i])
{ ...
}

The disjoint-expression takes two or more operands and yields true if and only if no pair of operands refer to overlapping storage. Internally, ArC determines all the ways in which pairs of operands could be aliased, and expands the expression into a conjunction of equivalent predicates for the prover to use.

Although ArC can reason about C programs even in the presence of aliasing, it’s best to write programs that avoid the possibility of aliasing as far as possible. To help you, ArC extends the strict aliasing model when you declare a new type using a constrained typedef. ArC doesn’t allow you to cast a pointer to a constrained type to or from a pointer to any other type. For example, suppose I declare the following:

typedef int invariant(value >= 0) count_t;
typedef int invariant(value >= -273) temperature_t;

void foo(count_t * a, temperature_t *b) { ... }

Within foo, ArC knows that *a and *b cannot be aliases for the same variable, because the pointer types are not compatible. If you should want to declare a new sort of integral type without actually constraining it, you can use invariant(true).

Verifying absence of integer overflow

June 7, 2010 davidcrocker 2 comments

One class of errors we need to guard against when writing critical software is arithmetic overflow. Before I go into detail, I invite you to consider the following program and decide what it prints:

#include <stdio.h>
int main(int argc, char *argv[]) {
unsigned int x = 42;
long y = -10;
printf("%s\n", (x > y ? "Hello, normal world!" : "Hello, strange world!"));
return 0;
}

I’ll return to this example later. For now, consider the function for averaging a set of readings stored in an array that I examined at in my previous two posts (ArC-specific bits are shown in green):

int16_t average(const int16_t * array readings, size_t numReadings)
pre(readings.lwb == 0; readings.upb == numReadings)
pre(numReadings != 0)
returns((+ over readings.all)/numReadings)

{
int sum = 0;
size_t i;
for (i = 0; i < numReadings; ++i)
keep(i <= numReadings)
keep(sum == + over readings.all.take(i))
decrease(numReadings - i)
{
sum += readings[i];
}
return (int16_t)(sum/numReadings);
}

I reported in the earlier post that ArC is able to verify this, apart from proving absence of integer overflow. Here are the verification warnings that ArC reports:

c:\escher\arctest\arctest14.c (17,13): Warning! Exceeded time limit proving: Arithmetic result of operator ‘+’ is within limit of type ‘int’, suggestion available (see C:\Escher\ArcTest\arctest14_unproven.htm#9), did not prove: minof(int) <= (sum$loopstart_5398,5$ + readings[i$loopstart_5398,5$ - readings.lwb]).

c:\escher\arctest\arctest14.c (17,13): Warning! Exceeded time limit proving: Arithmetic result of operator ‘+’ is within limit of type ‘int’, suggestion available (see C:\Escher\ArcTest\arctest14_unproven.htm#10), did not prove: (sum$loopstart_5398,5$ + readings[i$loopstart_5398,5$ - readings.lwb]) <= maxof(int).

c:\escher\arctest\arctest14.c (19,22): Warning! Unable to prove: Type constraint satisfied in implicit conversion from ‘int’ to ‘unsigned int’, suggestion available (see C:\Escher\ArcTest\arctest14_unproven.htm#18), did not prove: minof(unsigned int) <= sum$5398,5$.

Any time we do an explicit or implicit conversion from one type to another type that is not wider than the original or is defined by a constrained typedef, or we do certain arithmetic operations, we need to guard against integer overflow. This example contains several such operations.

The first implicit type conversion is when we initialize the loop counter i to the constant 0. We declared i to be of type size_t, which is an unsigned type, whereas the constant 0 has type (signed) int because we didn’t use the U suffix. So there is an implicit conversion from int to size_t here, and ArC will generate verification conditions that the constant 0 is in the range of size_t. Of course, the proof is trivial.

The other implicit type conversion is in the subexpression sum/numReadings in the return statement. Since numReadings has type size_t and sum has type int, the “usual arithmetic conversions” will be applied. In practice, size_t is never smaller than int, so there will be an implicit conversion of sum to size_t. ArC therefore generates the verification condition that the current value of sum is representable in size_t. As there is nothing to prevent sum from being negative, the proof fails – hence it emits the third warning message listed above.

In fact, ArC – like most ordinary static analysers and some compilers – warns about the signed/unsigned mismatch even before starting the proofs. Avoiding signed/unsigned mismatches in the first place is probably a more efficient strategy than dealing with the verification failures that often occur when you don’t. Had I followed my own advice to avoid using unsigned types, the signed/unsigned mismatch would not have arisen.

To fix the error, we can either change the type of the parameter numReadings to int, or cast it to type int before using it as the divisor. If we do the latter, we’ll also need to add a function precondition that numReadings is not too large to fit in an int.

For the integer operators + – and * it can be the case that applying the “usual arithmetic conversions” to an operator expression does not involve an unsafe implicit type conversion, yet the arithmetic result is not representable in the result type. The C standard defines the result in this case as undefined if the promoted operands are signed, or the arithmetic result modulo some N if they are unsigned. ArC treats it as an error in either case, because unsigned types are frequently used where modulo arithmetic is not the user’s intention; so it always generates verification conditions that the result is in type. This is the reason for the first two verification failures that ArC reports for our original example. The expression sum += readings[i] could easily overflow, because the sum could be as high as maxof(int16_t) * maxof(size_t) or as low as minof(int16_t) * maxof(size_t), which are larger than maxof(int) and minof(int) respectively. To avoid overflow, we will need to constrain either the minimum and maximum values of each reading, or the number of readings. Here’s our example with the precondition changed to limit the number of readings:

int16_t average(const int16_t * array readings, size_t numReadings)
pre(readings.lwb == 0; readings.upb == numReadings)
pre(numReadings != 0; numReadings <= maxof(int)/(maxof(int16_t) + 1))
returns((+ over readings.all)/numReadings)

{
int sum = 0;
size_t i;
for (i = 0; i < numReadings; ++i)
keep(i <= numReadings)
keep(sum == + over readings.all.take(i))
decrease(numReadings - i)
{
sum += readings[i];
}
return (int16_t)(sum/(int)numReadings);
}

I’ve referred to the maximum value supported by type int as maxof(int). ArC provides type operators maxof(T) and minof(T) for your convenience, to save you from having to #include <limits.h> or similar just for specification purposes. I’ve also cast numReadings to int prior to using it in the division.

With the compiler parameters set to assume int is 32 bits and short int is 16 bits, ArC verifies this example completely. You may be wondering why the expression (+ over readings.all)/numReadings in the returns specification doesn’t also generate a signed/unsigned warning or verification failure. That’s because integer arithmetic in specifications is always carried out after promoting operands to type integer, the ArC integral types that has no bounds.

Finally, let’s return to the short program I invited you to consider at the start of this post. Is the world normal or strange? On my computer, compiling with Microsoft Visual C++ 2008, it’s strange. That’s because for this compiler, both (unsigned) int and long are 32 bits. So both operands of less-than get converted to unsigned long … and -10 won’t fit! For the world to be normal, you’ll need to use a compiler for which long is larger than int, so that both operands get converted to (signed) long instead.

Specification with Ghost Functions

In my previous post I showed that the C expression sublanguage extended with quantified expressions (forall and exists) is insufficient to allow some specifications to be expressed. I presented this function (annotated with an incomplete specification) to average an array of data:

int16_t average(const int16_t * array readings, size_t numReadings)
pre(readings.lwb == 0; readings.lim == numReadings)
pre(numReadings != 0)
post(result == ? /* sum of elements of readings */ /numReadings)

{
int sum = 0;
size_t i;
for (i = 0; i < numReadings; ++i)
keep(i <= numReadings)
keep(sum == ? /* sum of first i elements of readings */ )
decrease(numReadings - i)
{
sum += readings[i];
}
return (int16_t)(sum/numReadings);
}

The two question marks need to be replaced by expressions for sums over the appropriate elements. Last time I showed how we can replace them by over-expressions. In this post, I’ll describe an alternative solution that has more general applications.

The problem we are faced with is that there is no C expression (even if we include forall and exists expressions) that can express the sum of some of array elements, where the number of elements is not statically known. We can calculate the sum using a loop; but a loop is not an expression and so cannot be used in specifications. However, specification expressions can contain function calls, provided that the functions have no side-effects. We can resolve our problem by defining a ghost function that calculates the sum. We’ll specify the function recursively so that we can handle the variable number of elements.

In the postcondition of average, we need to calculate the sum of all the elements of the readings array. In the loop variant, we need the sum of just the first i elements. So let’s declare a function that returns the sum of the first n elements of an array a, where a and n are parameters. We define the sum to be zero if n is zero, otherwise it’s the sum of the first n-1 elements plus the nth element. Here’s our ghost function specification:

#ifdef __ARC__
ghost integer sumOf(const int16_t * array a, size_t n)
pre(a.lwb == 0; n <= a.lim)
decrease(n)
returns(n == 0 ? 0 : a[n - 1] + sumOf(a, n - 1));
#endif

Declaring the function ghost tells ArC that it is for use in specifications only. A ghost function must have a specification but no body. It can also do a few things that wouldn’t be allowed for a normal function. In particular, it can use types not normally allowed in C, such as integer, which is ArC’s name for the type of unbounded integers. I’ve declared the function as returning integer so that neither we nor ArC need worry about arithmetic overflow in the specification.

The precondition defines the conditions for the function to be well-behaved, as usual. I’ve specified the value that the function returns using a postcondition of the form returns(expression). This has the same meaning as post(result == expression), but it’s shorter. Also, we want to define the function recursively, and ArC doesn’t currently allow a recursive call to the function being specified inside post(…) but it does inside returns(…). I’ve enclosed the whole function declaration in #ifdef __ARC__#endif so that the C compiler doesn’t see it.

Recursion would normally be forbidden in critical software (MISRA rule 16.2), however recursive calls in specifications are safe because they don’t carry any risk of run-time stack overflow. To ensure that the specification makes sense, ArC will need to prove that the recursion is bounded. That’s what the decrease clause is for, and it works exactly as if it were a loop variant – that is, it must decrease on each recursive call and it must not go negative.

Having defined sumOf,, we can complete the postcondition and loop invariant:

int16_t average(const int16_t * array readings, size_t numReadings)
pre(readings.lwb == 0; readings.lim == numReadings)
pre(numReadings != 0)
returns(sumOf(readings, numReadings)/numReadings)

{
int sum = 0;
size_t i;
for (i = 0; i < numReadings; ++i)
keep(i <= numReadings)
keep(sum == sumOf(readings, i))
decrease(numReadings - i)
{
sum += readings[i];
}
return (int16_t)(sum/numReadings);
}

I’ve taken the liberty of replacing the original post(…) postcondition of average with the returns(…) form.

Just like the version of average I gave in my previous post, this one verifies completely except for possible overflow during integer arithmetic. I’ll show how we can deal with that next time.

Expressing the Inexpressible

May 24, 2010 davidcrocker 3 comments

When writing preconditions, postconditions and other specifications for C programs, sometimes we need to write expressions that can’t be expressed in plain C. That’s why formal verification systems based on annotated programming languages almost always augment the expression sublanguage with forall and exists expressions. In previous posts, I’ve introduced ArC’s implementations of these. For example, the following expression yields true if all elements of the array arr are between 0 and 100 inclusive:

forall ind in arr.indices :- arr[ind] >= 0 && arr[ind] <= 100

Here, ind is declared as a bound variable that ranges over the values in the expression that follows the keyword in, which in this case is all the indices into arr.

Similarly, this expression:

exists ind in 0..9 :- arr[ind] >= 0 && arr[ind] <= 100

yields true if at least one of the first ten elements is in the range 0 to 100. I’ve used one of ArC’s special operators here: the range operator “..”, which yields a sequence of values from its first operand up to its second operand. In fact, in the first example, arr.indices is defined as arr.lwb .. arr.upb, so I was implicitly using the range operator there too.

Providing forall and exists to quantify over finite sets and sequences lets us express many sorts of specification, but isn’t always enough. Very occasionally, we need to use forall or exists to quantify over a potentially infinite type, which ArC also allows. For example, the following expression yields true if p(x) is true for all values of x belonging to type T:

forall T x :- p(x)

However, there are many cases in which a specification still cannot be expressed. For example, consider the following function for averaging a set of readings stored in an array:

int16_t average(const int16_t * array readings, size_t numReadings)
pre(readings.lwb == 0; readings.lim == numReadings)
pre(numReadings != 0)
post(result == ? /* sum of elements of readings */
/numReadings)
{
int sum = 0;
size_t i;
for (i = 0; i < numReadings; ++i)
keep(i <= numReadings)
keep(sum == ? /* sum of first i elements of readings */ )
decrease(numReadings - i)
{
sum += readings[i];
}
return (int16_t)(sum/numReadings);
}

I’ve included some ArC annotations (highlighted in green text) in this example, to specify that the valid indices of readings are 0..(numReadings – 1), and that numReadings isn’t zero so that the final division operation will be valid. I’ve also provided a loop invariant and loop variant (see my earlier posts on verifying loops if you aren’t familiar with these). However, in the postcondition I’ve put a question mark where I need to express “sum of all elements of readings“, and in the loop invariant I’ve put another question-mark where I want “sum of the first i elements of readings“. How can we express these quantities?

In this case, there is an easy way, and another way that is less easy but more general. Let’s start with the easy way. ArC is derived from Perfect Developer, and when writing ArC specifications you can use most of the library types and expression types provided by PD. In particular, type seq of T from PD is treated as equivalent to T[] in C. So we can use Perfect sequence operations on C arrays. A list of member functions of seq of T can be found in the Perfect Developer Library Reference. In PD, most of these functions are available for use anywhere, since code can be generated for them; but when used in ArC, they are of course all “ghost” functions – that is, functions that can be used in specifications only.

The particular Perfect expression type we need here is the over expression, which expresses repeated application of an operator over the elements of a collection. Those with a background in functional programming may recognise it as left-fold. Our postcondition can be written:

post(result == (+ over readings.all)/numReadings)

We’ve had to use readings.all rather than just readings as the operand of +over, because readings is an array pointer, and we need to provide a genuine array to +over. Hence ArC provides the array pointer type with ghost member all. You can think of readings.all as yielding the sequence readings[readings.lwb], readings[readings.lwb + 1] and so on up to readings[readings.upb]. We’d have liked to use *readings to mean the array that readings points to, but of course in C that just yields the value of the first element.

For the loop invariant expression, we can use +over again, but we need to apply it to the first i elements of readings rather than all elements. The seq of T class provides member take for this purpose, allowing us to use:

keep(sum == + over readings.all.take(i))

That’s enough to verify our function, apart from dealing with a potential integer overflow when summing the elements, which I’ll return to in a later post. Next time I’ll demonstrate how we can use a ghost function to define the notion of summation without recourse to  over-expressions.

Verifying a binary search, part 2

May 6, 2010 davidcrocker 2 comments

In my last entry I showed how to use a correct-by-construction approach to develop a binary search function. We got as far as specifying the function and the loop, but we left the loop body undefined. The function declaration looked like this:

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key)
pre(table.lwb == 0; table.lim == nElems)
pre(forall a in table.indices; b in table.indices
:- b > a => table[b].raw > table[a].raw)
post(result <= nElems)
post(forall i in 0..(result - 1) :- key >= table[i])
post(forall i in result..(nElems - 1) :- key <= table[i]);

and the function definition like this:

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key) {
size_t low = 0, high = nElems;
while (high != low)
keep(high >= low)
keep(high <= nElems)
keep(forall i in 0..(low - 1) :- table[i].rawValue <= key)
keep(forall i in high..(nElems - 1) :- table[i].rawValue >= key)
decrease(high - low) {
low = ?; high = ?;
}
return low;
}

ArC verified the program, apart from the incomplete loop body. So we just need to write loop body code that preserves the four loop invariants (the expressions in the keep clauses), decreases the variant (the expression in the decrease clause), and leaves the variant non-negative unless the loop is about to terminate.

The classic way to do a binary search is to pick an index midway between low and high, compare the table item at that point with the key, and adjust either low or high to that index depending on the result. So here’s a first attempt:

const size_t mid = (low + high)/2;
if (key >= table[mid].raw) {
low = mid;
} else {
high = mid;
}

If we try to verify this using ArC, we get a single failed verification condition:

c:\escher\arctest\ex5.c (23,9): Warning! Exceeded time limit proving: Loop body establishes end condition or decreases variant (defined at c:\escher\arctest\ex5.c (21,5)) (see C:\Escher\ArcTest\ex5_unproven.htm#6), did not prove: (low$loopstart_7603,5$ + high$loopend$) < (low$loopend$ + high$loopstart_7603,5$).

The prover has timed out trying to prove that the body either decreases the variant or leads to termination of the loop. There are four possibilities:

  1. There is a genuine problem – the code may loop indefinitely.
  2. The code won’t loop indefinitely, but we need a different loop variant in order to make this provable.
  3. The verification condition was simply too hard for the prover (this is always a possibility when the proof timed out). We could try increasing the prover timeout.
  4. There is a bug in ArC.

In this case, if we follow the link to the HTML proof report file, we see this:

Could not prove any of:
!((1 + lowloopstart_7603,5) == highloopstart_7603,5)

So the prover thinks that the loop may not make any progress if high == 1 + low at the start of the loop. Sure enough, there is; because in this situation, mid == low. So we may end up doing the assignment low = low, leading to an indefinite loop.

One way to fix this is to avoid the situation high = 1 + low at the start of the loop. For example, we could change the while-condition to high > low + 1. Returning low immediately after the end of the loop will no longer be correct, so we’ll need to do something different.

However, the loop invariant we have to maintain is that all elements up to and including low – 1 hold raw values less than or equal to key. Therefore, having established that the element at mid is less than or equal to key, we can set low to mid + 1 instead of mid, avoiding the problem. So the code now looks like this:

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key) {
size_t low = 0, high = nElems;
while (high != low)
keep(high >= low)
keep(high <= nElems)
keep(forall i in 0..(low - 1) :- table[i].rawValue <= key)
keep(forall i in high..(nElems - 1) :- table[i].rawValue >= key)
decrease(high - low) {
const size_t mid = (low + high)/2;
if (key >= table[mid].raw) {
low = mid + 1;
} else {
high = mid;
}
}
return low;
}

This is sufficient to make the function fully verifiable.

I hope you will agree that this has demonstrated a correct-by-construction approach to writing loops, avoiding the dangers of non-termination and off-by-one errors. It’s all too easy to introduce off-by-one errors in C, because – unlike Ada – the C language doesn’t provide a  form of loop that iterates over exactly the index range of an array, so you need to design the termination condition yourself.

Those of you who are familiar with SPARK Ada may be wondering why I used multiple preconditions, postcondition and loop invariants, rather than “and”-ing them all together like this:

keep(high >= low
&& high <= nElems
&& (forall i in 0..(low - 1) :- table[i].rawValue <= key)
&& (forall i in high..(nElems - 1) :- table[i].rawValue >= key))

You can do that in ArC, but there are two reasons why it is better not to. Firstly, when ArC is generating verification conditions – in this case, that the initialization establishes the loop invariant that the loop body preserves it – ArC will generate a separate verification condition for each invariant expression. If you “and” the invariants together, you have a single expression, and therefore a single verification condition. If ArC fails to prove it, you can’t easily tell which part of the invariant failed without looking at the prover output. Writing the invariants separately (or, alternatively, putting them all in one keep-clause as separate expressions separated by semicolons) causes ArC to generate a separate verification condition for each one, so that you can see immediately which one(s) failed.

The second reason for using multiple specification clauses is also to do with error reporting. Remember that ArC keywords such as pre, post and keep are implemented as macros, so that the specifications can be made invisible to standard C compilers. When a C preprocessor expands the macros, it generally ignores the layout of the actual macro arguments, and generates the expanded version using the layout of the macro definition. So a macro argument that you wrote across several lines in your source code typically gets expanded to a single line. This means that if an error is reported, the line number will be reported as the line on which the ArC keyword occurred, even if the error is in a macro argument a few lines later. So it’s best to keep the arguments to ArC specification macros on the same line as the keyword. Therefore, when using long expressions, you’ll want to use a separate instance of the ArC keyword with each expression.

Verifying a binary search

In the last post, I covered some different levels of formal verification that you may be interested in, and showed how to add minimum annotation to the linearization example to allow ArC to prove predictable execution. The example provided a prototype for the binary search function it called, to which we added a minimal postcondition, so that it looked like this:

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key)
post(result <= nElems);

Let’s develop a verified implementation of this binary search function. We could write the code and then try to get ArC to verify it – either for predictable execution only, or for full functional correctness. However, it’s quite hard to get a binary search function right. So for this example, it’s more productive to use formal specification to develop an implementation that is correct by construction.

First, we need to specify exactly what bSearch is supposed to return, and what its preconditions are. Let’s start with the preconditions. We’re passing the number of elements of table in nElems, so we need to specify this:

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key)
pre(table.lwb == 0; table.lim == nElems)
post(result <= nElems);

The precondition says that the lower bound (the lowest legal index) into table is zero – in other words, table is a pointer to the first element of an array – and that the limit of table (one plus the highest legal index) is nElems. However, this isn’t enough: the raw values need to be in increasing order too. A standard way of specifying “in increasing order” is to say that for any two indices a and b, if b > a then the element at b is greater than the element at a. Here’s that expressed as a precondition:

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key)
pre(table.lwb == 0; table.lim == nElems)
pre(forall a in table.indices; b in table.indices :- b > a => table[b].raw > table[a].raw)
post(result <= nElems);

The ghost member indices of an array pointer is defined by ArC as the sequence of all valid indices into the array. So table.indices means the same as table.lwb .. table.upb. The => symbol means “implies”, so b > a => table[b].raw > table[a].raw means the same as !(b > a) || table[b].raw > table[a].raw but is clearer.

Now we need to specify what the function returns. In developing linearize in the previous post, I said I was assuming a return value of 0 means the raw value is off the bottom of the table, nElems means it is off the top, otherwise the table entry indexed by the return value and the previous table entry bracket the raw value. Actually, we can simplify this. If we say that any elements with indices below the returned value have raw values less than or equal to the key, and any elements with indices at or above the returned value have raw values greater than or equal to the key, then this covers the cases of returning 0 or nElems as well.  So let’s express this single constraint in a postcondition:

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key)
pre(table.lwb == 0; table.lim == nElems)
pre(forall a in table.indices; b in table.indices :- b > a => table[b].raw > table[a].raw)
post(result <= nElems)
post(forall i in 0..(result - 1) :- key >= table[i])
post(forall i in result..(nElems - 1) :- key <= table[i])
;

This specification isn’t precise about the value we return when there is a table entry whose raw value exactly matches the key – it allows us to return the index of either that entry or the next entry. My original informal specification wasn’t precise either – it said that the two entries “bracket” the key. Given that our precondition forbids duplicate entries, we could be more precise if we want, e.g. by changing <= in the final postcondition to <.

Now we can work on the implementation. Note that when you have a function prototype declaration separate from the implementation (as here), ArC expects you to put the function specification in the prototype only, so we don’t need to repeat the preconditions and postconditions in the implementation. Here’s a rough sketch of what we want to do:

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key) {
size_t low = 0, high = nElems;
while (high != low) {
/* increase low or decrease high, such that high remains >= low,
all elements below low have raw values <= the key,
and all elements at or above high have raw values >= the key */

}
return low;
}

Let’s express the text in the comment as a loop variant and a loop invariant (see my earlier post on verifying loops if you’re not familiar with these):

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key) {
size_t low = 0, high = nElems;
while (high != low)
keep(high >= low)
keep(forall i in 0..(low - 1) :- table[i].rawValue <= key)
keep(forall i in high..(nElems - 1) :- table[i].rawValue >= key)
decrease(high - low) {
/* do something */
}
return low;
}

Before we go any further, let’s check that this makes sense. ArC will need to prove that the loop invariants (the keep clauses) are true at the start of the loop. That’s easy – the initialization of low and high ensures that high >= low (which satisfies the first invariant), and that there are no table indices below low and no table indices at or above high (which satisfies the second and third invariants, because forall over an empty range is always true). Also, because we are returning low when the loop terminates, ArC will need to prove that the value of low at the end of the loop meets the postcondition on the return value. We know that when the loop terminates, the loop invariants will be true and the while-clause will be false. Therefore, we can substitute result for low (because we are returning low) and result for high (because high == low is the inverse of the while-clause) in the keep-clauses, to find out what is known about result.  When we do this, the last two keep clauses magically turn into the last two postconditions – which is exactly what we want. However, this doesn’t help with the first postcondition, which requires result <= nElems. Adding an extra loop invariant low <= nElems or high <= nElems will do the trick, since substituting result for both low and high will then give us the required term. I’ll choose high <= nElems, because it is stronger and I don’t intend that high should ever exceed nElems.

We can use ArC to check that the design is OK so far, even before we code the loop body. As well as adding the new loop invariant, we’ll need to tell ArC that we intend to assign to low and high low within the loop body, which we can do like this:

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key) {
size_t low = 0, high = nElems;
while (high != low)
keep(high >= low)
keep(high <= nElems)
keep(forall i in 0..(low - 1) :- table[i].rawValue <= key)
keep(forall i in high..(nElems - 1) :- table[i].rawValue >= key)
decrease(high - low) {
low = ?; high = ?;
}
return low;
}

ArC allows you to use ? in place of an expression to mean you haven’t decided what goes there yet. Naturally, ArC won’t be able to prove that the loop preserves its invariants or decreases its variant. Sure enough, if we run ArC on this example, we get 6 unproven verification conditions that refer to the loop body: one for each of the loop invariants and two for the variant. However, ArC reports that everything else is OK, including that the loop invariants are satisfied by the initialization and that the return value meets all three postconditions.

So all we need to do now is to write loop body code that assigns low and/or high such that the four loop invariants are preserved and the loop variant high – low decreases. But this post is already quite long, so I’ll do that next time.

What are you trying to prove?

April 27, 2010 davidcrocker 9 comments

If you’re thinking of using formal verification to increase the quality and reliability of your software, one of the first decisions you need to make is what you want to prove. Roughly speaking, you have three levels to choose from:

1. Predictable execution – that is, freedom from undefined and implementation-defined behaviour. The proofs cover such things as: all array indices are in bounds, variables are initialized before use, conversions of values to narrower types do not result in overflow, arithmetic does not overflow or underflow, null pointers are not dereferenced, and division by zero does not occur. It includes all the usual causes of what the MISRA standards call “run-time errors”, so it covers MISRA C rule 21.1.

2. Partial correctness – all the above, and the program meets its specification, if it terminates. The specification you provide may be a full functional specification, or a partial specification covering the most important parts, such as safety properties.

3. Full correctness – all the above, and the program terminates.

Most formal program verification systems supports one or more of these levels. For example, SPARK Ada supports levels 1 and 2. In SPARK-speak, level 1 is called “exception freedom”, because Ada programs normally perform run-time checking, so run-time errors manifest themselves as exceptions (which are predictable). On the other hand, Perfect Developer is focussed on full correctness, because it is a top-down tool that starts from specifications – leaving you no choice but to specify what the program is intended to achieve. ArC gives you the choice of all three levels.

You don’t have to choose a single verification level for the entire system. So you might choose full correctness for the most critical parts of the system, but be content with predictable execution elsewhere.

What difference does the choice of verification level make to you as software developer? The main effect is that if you wish to prove partial or full correctness rather than just predictable execution, you will need to write function postconditions that state what the functions are intended to achieve. If you choose only to prove predictable execution, you will sometimes still need to write function postconditions, but they can be more vague – stating something about what the function achieves rather than everything it is supposed to achieve.

Here’s an example. Suppose we have read a value from a sensor that has a non-linear response. We wish to linearize the value. To this end, we have a constant table of pairs of (raw value, linearized value), ordered such that both values are monotonically increasing. We will use a binary search to find the two adjacent entries whose raw values bracket the value read from the sensor, then interpolate between the corresponding linearized values. So the function looks something like this:

#include "arc.h"
#include "stddef.h"

typedef unsigned short uint16_t;
typedef struct { uint16_t raw; uint16_t corrected; } LinEntry;

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key);

uint16_t linearize(const LinEntry* array table, size_t nElems, uint16_t rawValue) {
  size_t index = bSearch(table, nElems, rawValue);
  if (index == 0) return table[0].corrected;
  else if (index == nElems) return table[nElems - 1].corrected;
  else {
    LinEntry low = table[index - 1], high = table[index];
    return (((high.corrected - low.corrected)/(high.raw - low.raw))
      * (rawValue - low.raw)) + low.corrected;
  }
}

static const LinEntry linTable[] = {
{0, 0}, {10, 15}, {20, 29}, ...
};

... linearize(linTable, sizeof(linTable)/sizeof(linTable[0]), someData) ...

The bSearch function takes a pointer to an array of LinEntry records (see my earlier post on taming arrays if you haven’t seen the array keyword before), the number of elements in the array, and the raw data value. I’ve assumed it returns a value in the range 0..nElems, such that 0 means the raw value is off the bottom of the table, nElems means it is off the top, otherwise the table entry it indexes and the previous table entry bracket the raw value. If we want to prove partial or full correctness, we need to specify all of this. But we can be less precise if we’re just proving predictable execution. Let’s see what ArC makes of it with no annotation other than the array keywords:

c:\arc\ex1.c (11,33): Warning! Unable to prove: Precondition of ‘[]‘ satisfied
(see C:\Arc\ex1_unproven.htm#9), cannot prove: 0 < table.lim.
c:\arc\ex1.c (12,43): Warning! Exceeded boredom threshold proving: Precondition of ‘[]‘ satisfied
(see C:\Arc\ex1_unproven.htm#7), cannot prove: nElems < (1 + table.lim).
c:\arc\ex1.c (14,29): Warning! Exceeded boredom threshold proving: Precondition of ‘[]‘ satisfied
(see C:\Arc\ex1_unproven.htm#2), cannot prove: index < (1 + table.lim).
c:\arc\ex1.c (14,54): Warning! Exceeded boredom threshold proving: Precondition of ‘[]‘ satisfied
(see C:\Arc\ex1_unproven.htm#4), cannot prove: index < table.lim.
c:\arc\ex1.c (15,17): Warning! Unable to prove: Precondition of ‘/’ satisfied
(see C:\Arc\ex1_unproven.htm#5), cannot prove: low.raw < high.raw.

There are a few problems here, all in function linearize:

  • The first warning says that table[0] might be out-of-bounds, because table might be an empty array.
  • The second says that table[nElems - 1] might be out-of-bounds, because we haven’t told ArC that nElems is the number of elements in table.
  • The third and fourth say that table[index] and table[index - 1] might be out-of-bounds, because ArC doesn’t know that bSearch returns a value between 0 and nElems, and that table points to the start of an array with nElems elements.
  • The final warning says that in the interpolation operation, the divisor (table[index].raw – table[index - 1].raw) might not be positive. Division by zero would typically cause an exception; while division by a negative number yields an implementation-defined result in C’90 and is therefore not allowed by ArC. The division will be safe if the raw values in table are monotonically increasing.

So let’s give ArC the information it needs, by adding some partial specifications:

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key)
post(result <= nElems);

uint16_t linearize(const LinEntry* array table, size_t nElems, uint16_t rawValue)
pre(table.lwb == 0; table.lim == nElems)
pre(nElems  >= 1)
pre(forall i in 1..(nElems - 1) :- table[i].raw > table[i - 1].raw) {
  size_t index = bSearch(table, nElems, rawValue);
  if (index == 0) return table[0].corrected;
  else if (index == nElems) return table[nElems - 1].corrected;
  else {
    LinEntry low = table[index - 1], high = table[index];
    return (((high.corrected - low.corrected)/(high.raw - low.raw))
      * (rawValue - low.raw)) + low.corrected;
  }
}

In the postcondition of bSearch, we’ve specified that the return value is no greater than nElems (since it is of type size_t, it can’t be less than zero anyway). In the precondition of linearize we’ve said that table is a regular array with lower bound zero and limit nElems. These are enough to ensure that the array accesses table[index - i] and table[index] in the else-part of the if-statement are in bounds, because the previous parts of the if-statement handle the cases index == 0 and index == nElems. We’ve also specified that nElems is at least one to take care of the access to table[0].

The final precondition we’ve added to linearize says the raw values in table are monotonically increasing, by stating that for all values i from 1 up to nElems - 1 (i.e. the highest index of table), table[i].raw is greater than table[i - 1].raw.

These partial specifications are sufficient for ArC to prove that function linearize executes predictably, assuming that it is called in a state satisfying its preconditions, and that bSearch executes predictably, has no side-effects, and returns a value satisfying its postcondition. The preconditions of linearize will be verified at every place where it is called. We’ll look at verifying bSearch next time.

Danger – unsigned types used here!

April 7, 2010 davidcrocker 20 comments

By way of a change from the last two posts on formal verification, this time I’m going to talk about using unsigned types in C/C++. Modern programming languages such as C# and Java don’t provide unsigned types, with good reason (actually, C# does have unsigned types, but only for the purpose of interfacing to COM objects written in other languages, and they are not used in the .NET Framework API).

To illustrate the dangers of using unsigned types, I invite you to consider this example (in which uint16_t and int32_t are typedef’d to be unsigned 16-bit and signed 32-bit types respectively) and decide whether it makes safe use of unsigned values:

void foo(const uint16_t* reading, const size_t size) {
  size_t i;
  for (i = size - 1; i >= 0; --i) {
    doSomething(reading[i]);
    if (i > 0) {
      int32_t diff = reading[i] - reading[i - 1];
      doSomethingElse(diff);
    }
  }
}

Why are unsigned types dangerous in C/C++? Here are some reasons:

1. C/C++ provide implicit type conversions between signed and unsigned values. Unlike Ada, there is no a runtime check to make sure the value is convertible to the new type. For example, you can readily “convert” a negative signed value to an unsigned value.

2. If you mix signed and unsigned operands in an arithmetic operation, the signed operand is always converted to unsigned. This may not be what you wanted.

3. In arithmetic expressions, operands whose type is smaller than (unsigned) int get promoted to int or unsigned int. This complicates the situation – especially if the result is implicitly converted again, because the implicit type conversions may not be happening where you think they are.

4. It is very easy to underflow the minimum value of an unsigned int (i.e. zero). It is much more difficult to underflow or overflow the minimum or maximum value of a signed int, especially on a 32-bit (or greater) platform.

How many problems did you spot in the example? There are at least three. The first one is the use of size – 1 as the initial value of the loop counter i. Since size is of type size_t which is an unsigned type, size – 1 will underflow if size is zero. In this case, the loop will count down from the maximum value of a size_t rather than not executing at all. The second problem is the use of i >= 0 in the for-loop header, with i again of type size_t. The loop will never terminate, because i cannot go negative. The third problem concerns the assignment of reading[i] – reading[i - 1] to diff. Suppose reading[i -i] is greater than reading[i], for example by one. Will diff end up as -1 ? Unfortunately, that depends on your compiler and target platform. If uint16_t maps to unsigned short and int32_t maps to int, and assuming 2′s complement hardware, then yes. Both readings will be promoted to unsigned int prior to subtraction, yielding 0xFFFFFFFF. This is then converted implicitly to int for the assignment to diff, yielding -1. But if uint16_t maps to unsigned int and int32_t maps to long, we get a different result. The subtraction yields 0xFFFF this time, which is converted to 0x0000FFFF for assignment to diff.

If you regularly use static analysis on your C/C++ programs (as I hope you do), you might like to check whether your static analyzer reports all three problems for this example.

What’s the best way to manage the dangers inherent in using unsigned types? One strategy is Just Say No. Don’t use unsigned types, except in very special situations. Is this really feasible? When using a 32-bit (or greater) platform, I think it is. You’ll need to define a signed size-type (i.e. an integral type with the same size as size_t) to use as the natural type for representing array indices, string lengths, and sizes. On most platforms, ptrdiff_t is a suitable type to start from. Whenever you call a library function that returns a size_t, or use a sizeof expression, you should immediately cast the result to your signed size type. You may have difficulties if you use an array or string that takes up more than half the address space of the processor, but are you ever going to do that? The other thing you’ll need to do is convert unsigned data read from the hardware to signed data as soon as it is read in, after any necessary shifting and masking. If you read an unsigned value from a 16-bit A-to-D converter, you’ll need to store each value as 32-bits. Alternatively, if you need to store lots of them, you can exempt that data from Just Say No and store it as 16-bit unsigned values, doing the conversion to 32-bit signed int whenever you pick a value out from the store.

Here’s my original example using Just Say No, assuming that the readings came from a 15-bit (or less) A-to-D converter, with sz_t standing for our signed size type:

void foo(const int16_t* reading, const sz_t size) {
  sz_t i;
  for (i = size - 1; i >= 0; --i) {
    doSomething(reading[i]);
    if (i > 0) {
      int32_t diff = reading[i] - reading[i - 1];
      doSomethingElse(diff);
    }
  }
}

The only changes I made are to replace unsigned types by signed types. You’ll still need to use unsigned values where you do bitwise operations, such as shifting and masking. But you probably only need to do these operations on raw incoming data, before storing the result as signed data. Also, if you are using bit fields, any fields that store data of Boolean or enumeration type should be declared unsigned.

If Just Say No is too radical for you, then the alternative is to embrace unsigned values, but be very careful using them. You must use a good static analyzer to detect possible problems. MISRA C 2004 rule 10.1 prohibits implicit signed/unsigned conversions (amongst others), so a static analyzer that enforces MISRA compliance should catch them. Even better, use a formal tool such as ArC. Here’s a safer version of our example that still uses unsigned types:

void foo(const uint16_t* reading, const size_t size) {
  size_t i;
  for (i = size; i != 0; ) {
    --i;
    doSomething(reading[i]);
    if (i > 0) {
      int32_t diff = (int32_t)(reading[i]) - (int32_t)(reading[i - 1]);
      doSomethingElse(diff);
    }
  }
}

I’ve changed the loop to count down from size instead of size – 1, I’ve changed the termination condition, and I’ve moved the decrement of i to the start of the loop body. I’ve also cast the unsigned readings to signed before subtracting them.

Have you been caught out by the behavour of unsigned types? Does your organization have a policy on using them? As always, I welcome your feedback – just click on the Leave a comment link near the top of this page.