
Verifying absence of integer overflow

June 7, 2010

One class of errors we need to guard against when writing critical software is arithmetic overflow. Before I go into detail, I invite you to consider the following program and decide what it prints:

#include <stdio.h>

int main(int argc, char *argv[]) {
    unsigned int x = 42;
    long y = -10;
    printf("%s\n", (x > y ? "Hello, normal world!" : "Hello, strange world!"));
    return 0;
}

I’ll return to this example later. For now, consider the function for averaging a set of readings stored in an array that I examined in my previous two posts (ArC-specific bits are shown in green):

int16_t average(const int16_t * array readings, size_t numReadings)
pre(readings.lwb == 0; readings.upb == numReadings)
pre(numReadings != 0)
returns((+ over readings.all)/numReadings)
{
    int sum = 0;
    size_t i;
    for (i = 0; i < numReadings; ++i)
        keep(i <= numReadings)
        keep(sum == + over readings.all.take(i))
        decrease(numReadings - i)
    {
        sum += readings[i];
    }
    return (int16_t)(sum/numReadings);
}

I reported in the earlier post that ArC is able to verify this, apart from proving absence of integer overflow. Here are the verification warnings that ArC reports:

c:\escher\arctest\arctest14.c (17,13): Warning! Exceeded time limit proving: Arithmetic result of operator ‘+’ is within limit of type ‘int’, suggestion available (see C:\Escher\ArcTest\arctest14_unproven.htm#9), did not prove: minof(int) <= (sum$loopstart_5398,5$ + readings[i$loopstart_5398,5$ - readings.lwb]).

c:\escher\arctest\arctest14.c (17,13): Warning! Exceeded time limit proving: Arithmetic result of operator ‘+’ is within limit of type ‘int’, suggestion available (see C:\Escher\ArcTest\arctest14_unproven.htm#10), did not prove: (sum$loopstart_5398,5$ + readings[i$loopstart_5398,5$ - readings.lwb]) <= maxof(int).

c:\escher\arctest\arctest14.c (19,22): Warning! Unable to prove: Type constraint satisfied in implicit conversion from ‘int’ to ‘unsigned int’, suggestion available (see C:\Escher\ArcTest\arctest14_unproven.htm#18), did not prove: minof(unsigned int) <= sum$5398,5$.

Whenever we convert a value, explicitly or implicitly, to a type that is not wider than the original (or to a type defined by a constrained typedef), or we perform certain arithmetic operations, we need to guard against integer overflow. This example contains several such operations.

The first implicit type conversion is when we initialize the loop counter i to the constant 0. We declared i to be of type size_t, which is an unsigned type, whereas the constant 0 has type (signed) int because we didn’t use the U suffix. So there is an implicit conversion from int to size_t here, and ArC will generate verification conditions that the constant 0 is in the range of size_t. Of course, the proof is trivial.

The other implicit type conversion is in the subexpression sum/numReadings in the return statement. Since numReadings has type size_t and sum has type int, the “usual arithmetic conversions” will be applied. In practice, size_t is never smaller than int, so there will be an implicit conversion of sum to size_t. ArC therefore generates the verification condition that the current value of sum is representable in size_t. As there is nothing to prevent sum from being negative, the proof fails – hence it emits the third warning message listed above.

In fact, ArC – like most ordinary static analysers and some compilers – warns about the signed/unsigned mismatch even before starting the proofs. Avoiding signed/unsigned mismatches in the first place is probably a more efficient strategy than dealing with the verification failures that often occur when you don’t. Had I followed my own advice to avoid using unsigned types, the signed/unsigned mismatch would not have arisen.

To fix the error, we can either change the type of the parameter numReadings to int, or cast it to type int before using it as the divisor. If we do the latter, we’ll also need to add a function precondition that numReadings is not too large to fit in an int.

For the integer operators +, - and *, it can happen that applying the “usual arithmetic conversions” to an operator expression involves no unsafe implicit type conversion, yet the arithmetic result is not representable in the result type. The C standard leaves the result undefined in this case if the promoted operands are signed, and defines it as the arithmetic result modulo 2^N (where N is the width of the type) if they are unsigned. ArC treats it as an error in either case, because unsigned types are frequently used where modulo arithmetic is not the user’s intention; so it always generates verification conditions that the result is in type. This is the reason for the first two verification failures that ArC reports for our original example. The expression sum += readings[i] could easily overflow, because the sum could be as high as maxof(int16_t) * maxof(size_t) or as low as minof(int16_t) * maxof(size_t), which exceed maxof(int) and fall below minof(int) respectively. To avoid overflow, we will need to constrain either the minimum and maximum values of each reading, or the number of readings. Here’s our example with the precondition changed to limit the number of readings:

int16_t average(const int16_t * array readings, size_t numReadings)
pre(readings.lwb == 0; readings.upb == numReadings)
pre(numReadings != 0; numReadings <= maxof(int)/(maxof(int16_t) + 1))
returns((+ over readings.all)/numReadings)
{
    int sum = 0;
    size_t i;
    for (i = 0; i < numReadings; ++i)
        keep(i <= numReadings)
        keep(sum == + over readings.all.take(i))
        decrease(numReadings - i)
    {
        sum += readings[i];
    }
    return (int16_t)(sum/(int)numReadings);
}

I’ve referred to the maximum value supported by type int as maxof(int). ArC provides type operators maxof(T) and minof(T) for your convenience, to save you from having to #include <limits.h> or similar just for specification purposes. I’ve also cast numReadings to int prior to using it in the division.

With the compiler parameters set to assume int is 32 bits and short int is 16 bits, ArC verifies this example completely. You may be wondering why the expression (+ over readings.all)/numReadings in the returns specification doesn’t also generate a signed/unsigned warning or verification failure. That’s because integer arithmetic in specifications is always carried out after promoting the operands to type integer, ArC’s integral type that has no bounds.

Finally, let’s return to the short program I invited you to consider at the start of this post. Is the world normal or strange? On my computer, compiling with Microsoft Visual C++ 2008, it’s strange. That’s because for this compiler, both (unsigned) int and long are 32 bits. So both operands of the greater-than operator get converted to unsigned long … and -10 won’t fit! For the world to be normal, you’ll need to use a compiler for which long is larger than int, so that both operands get converted to (signed) long instead.

  1. Dave Banham
    June 10, 2010 at 13:33

    David,
    With regard to your tease about C’s integer promotion rules, does ArC understand these rules, and is it able to show when expectations are broken, just as your example illustrates? If so, is this for one particular integer-size setup, or can ArC help with producing fully portable code? Or is that an unrealistic dream for a C program!

    Best Regards
    Dave Banham

    • June 10, 2010 at 15:02

      Dave,

      Producing portable code is unfortunately very difficult in C. In developing the specification for C semantics in ArC, I found that there were more sources of non-portability than I had previously realised. Defining types such as ‘int16_t’ and ‘int32_t’ as MISRA suggests solves the problem of knowing that variables are big enough to hold the data you want to store, but unfortunately the semantics of integer operations on these types depends on whether they map to ‘short int’ and ‘int’, or to ‘int’ and ‘long int’, or to something else. This is compounded by the fact that the type of a constant such as 32768 depends on how many bits are in an ‘int’. So ArC requires you to configure the parameters of the compiler you are targeting, including the sizes of each integral type, so that it can determine the semantics on your target platform and warn you about broken expectations. If you intend to compile the same code for two platforms having different integral sizes, you need to run ArC verification separately for each of them.

      If you want to produce maximally portable C code, you could take the approach we use in our C++ code generator in Perfect Developer:

      1. Decide how many bits you need to represent any integral value you ever want to store and any intermediate result of a computation you will ever produce. Then pick a signed integral type on your target platform that is at least this big. For most embedded applications, I think a signed type with 32 bits should suffice.
      2. Use a typedef to give this type a convenient name. In your program, declare all integral variables, parameters etc. to be of this type. Don’t use any other integral types. If you find yourself doing a sizeof() or otherwise getting handed a size_t or any other integral type, cast it immediately to your standard type before using it.
      3. Declare a macro that you will use to surround every integer constant in your program in order to make sure it will have your chosen integral type. So instead of writing e.g. the constant 1 in your program, you will write e.g. INT(1), where INT(x) expands to x if your chosen type is equivalent to ‘int’, and xL if it is equivalent to ‘long int’.

      If you do this, I think your program will be portable in respect of integer semantics, subject to defining your typedef and the INT() macro appropriately for each target platform.
