Making sure variables are initialized
One source of program bugs is use of variables before they have been initialized. In C/C++ all static variables get zero-initialized if they have no specified initialization, so it is only local variables we need to worry about. Bugs caused by use of uninitialized local variables can be particularly nasty, because the value of such a variable depends on whatever previously occupied the same stack location. Read more…
Using and Abusing Unions
The C union type is one of those features that is generally frowned on by those who set programming standards for critical systems, yet is quite often used. MISRA C 2004 rule 18.4 bans them (“unions shall not be used”) on the grounds that there is a risk that the data may be misinterpreted. However, it goes on to say that deviations are acceptable for packing and unpacking of data, and for implementing variant records provided that the variants are differentiated by a common field. Read more…
Safer arrays: using a C++ array class
In a previous post, I remarked that arrays in C leave much to be desired, and that in C++ it is better to avoid using naked arrays. You can avoid naked arrays in C++ programming by wrapping them up in a suitable array class instead. The Joint Strike Fighter C++ Coding Standards document takes a similar view; rule 97 in that standard states: Read more…
How (un)safe is pointer arithmetic?
I recognize that this is a controversial topic – if you’re a safety-critical professional using C or C++, I’d be glad to hear your views.
Using explicit pointer arithmetic in critical software is generally frowned upon. MISRA 2004 rules 17.1 to 17.3 prohibit some particular cases of explicit pointer arithmetic that do not give rise to well-defined results. Read more…
Using Unicode in embedded software
Unicode provides a single character set that can represent nearly all of the world’s written languages. Mainstream software development has largely moved to Unicode already, helped by the fact that in modern languages such as Java and C#, type char is defined to be a Unicode character. However, in C a char is invariably 8 bits on modern architectures, and the associated character set is ASCII. Does this matter, for embedded software? Read more…
Using constrained types in C
When writing critical software, one of the advantages cited for using Ada rather than C is that Ada lets you define constrained types, like this:
subtype Percentage is Integer range 0 .. 100;
Reasoning about null-terminated strings in C/C++
In my last post I described how ArC supports reasoning about array access, by allowing you to refer to the bounds of an array in a specification. If the code itself needs to know the size of an array, then the size is provided by other means, for example by passing it as an extra parameter. However, when using arrays of characters, standard practice in C is not to pass the number of elements, but to use a null character to indicate the end. Read more…
The Taming of the Pointer – Part 2
In last Wednesday’s post I mentioned three ways in which pointers are troublesome in C/C++, and I introduced the null ArC keyword to mitigate one of them. Now I’ll turn to the second issue: the fact that given (say) a variable or parameter of type int*, the type does not allow us to determine whether it refers to a single int, or to an array of ints – nor, if it refers to an array, can we find how many elements the array contains. Read more…
Using strongly-typed Booleans in C and C++
One of the glaring omissions from the original C language was provision of a Boolean type. Booleans in C are represented by integers, with false being represented as zero and true being represented as 1. When an integer value is used as a condition, all nonzero values are interpreted as true.
Strong typing is a valuable asset when writing code – whether critical or not – because type checks can and do uncover errors. So how can we use strongly-typed Booleans in C and C++? Read more…
Taming Pointers in C/C++
When doing verification or deep static analysis of C and C++, pointers are troublesome in several ways:
- Zero (i.e. NULL) is an allowed value of every pointer type in the language. Occasionally we want to allow null pointers, for example in the link field of the last element of a linked list. More usually, we don’t want to allow null. Verification requires that anywhere we use the * or [ ] operator on a pointer, we can be sure that it is not null.
- C and C++ do not distinguish between pointers to single variables and pointers to arrays. So, where we have a parameter or variable of type T*, we can’t tell whether it is supposed to point to a variable or an array. If it points to a single variable, then we mustn’t do pointer arithmetic or indexing on it. The verifier must be able to check this.
- Array parameters in C/C++ are passed as pointers. Aside from the problem that we can’t distinguish array pointers from pointers to single variables, we also have the problem that there is no size information contained in an array pointer.
- Anywhere we use pointers to mutable data, there is the possibility of aliasing. In other words, there may be more than one pointer to the same data. The verifier needs to take account of the fact that changes to data made through one pointer may affect the values subsequently read through another pointer.
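The last two points are easy to reproduce in plain C. The following is a minimal sketch, not taken from ArC; the function names add_twice, demo_distinct, demo_aliased and size_seen_by_callee are hypothetical and exist only for illustration. The assert is just a run-time stand-in for a check that a verifier would make statically.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds *delta to *total twice.
   Nothing in the parameter types says whether the two pointers may alias. */
void add_twice(int *total, const int *delta)
{
    assert(total != NULL && delta != NULL); /* run-time stand-in for a non-null check */
    *total += *delta;
    *total += *delta; /* re-reads *delta, which may have just been changed */
}

/* Distinct objects: a becomes 1 + 5 + 5. */
int demo_distinct(void)
{
    int a = 1;
    int b = 5;
    add_twice(&a, &b);
    return a;
}

/* Aliased: both pointers refer to c, so the second addition reads the
   value written by the first (5 + 5 = 10, then 10 + 10 = 20). */
int demo_aliased(void)
{
    int c = 5;
    add_twice(&c, &c);
    return c;
}

/* The callee receives only a pointer, so sizeof yields the size of the
   pointer, not the size of the caller's array. */
size_t size_seen_by_callee(const int *arr)
{
    return sizeof arr;
}
```

Calling demo_distinct() gives 11 and demo_aliased() gives 20, even though add_twice is the same code in both cases. Likewise, passing an int[8] to size_seen_by_callee returns sizeof(int *), which is why the element count has to travel separately.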
Although pointers are less troublesome in Ada, the aliasing problem still exists. The SPARK verifiable subset of Ada handles this by banning pointers altogether. Unfortunately, this isn’t an option in a C/C++ subset for critical systems, because pointers are the only mechanism for passing parameters by reference.
I’ll deal with the issue of unwanted nullability first. One solution is to add invariants stating that particular variables of pointer type cannot be NULL. Similarly, where a function takes parameters of pointer type, we can write preconditions that these parameters cannot be NULL. Here are a couple of examples:
struct Status1 {
    const char* message;
    ...
    invariant(message != NULL)
};
void sum(int *arr, int size)
pre(arr != NULL)
{ ... }
The problem with this approach is that you need a lot of these invariants and preconditions because, more often than not, it makes no sense to allow NULL. So in ArC we take the opposite approach. We assume that pointers are not allowed to be NULL except where you say otherwise. In the above examples, you can leave out the precondition and invariant if you don’t want to allow either pointer to be NULL.
To tell ArC that a pointer is allowed to be NULL, you flag it with the null attribute, like this:
struct Status2 {
    const char* null message;
    ...
};
void setMessage(const char * null msg) { ... }
This greatly reduces the amount of annotation needed, because the null annotation is more concise than a precondition or invariant, and it is needed less often. As you might expect, null is another macro defined in arc.h that expands to nothing when you compile the code. Syntactically, it behaves like a type qualifier such as const or volatile.
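To see why the annotations cost nothing at compile time, here is a minimal sketch of what an ordinary compiler might see. The three macro definitions are a hypothetical stand-in for arc.h, whose real contents are not shown in this post. The object current_status and the function sum_of are also assumptions, added only so that the sketch compiles on its own.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the arc.h definitions: each annotation
   expands to nothing when the code is compiled normally. */
#define null
#define pre(expr)
#define invariant(expr)

struct Status2 {
    const char * null message; /* compiles as: const char * message; */
};

/* Hypothetical file-scope object, so that setMessage has something to update. */
static struct Status2 current_status = { NULL };

/* The null annotation marks msg as allowed to be NULL. */
void setMessage(const char * null msg)
{
    current_status.message = msg;
}

/* Hypothetical variant of sum that returns its result. The precondition
   disappears when compiled, so it adds no run-time check. */
int sum_of(const int *arr, int size)
pre(arr != NULL)
{
    int total = 0;
    for (int i = 0; i < size; ++i) {
        total += arr[i];
    }
    return total;
}
```

Because the annotations vanish, the compiled program is identical to the unannotated one. Any checking happens in the verifier, not at run time.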
That’s all for today – I’ll discuss how we handle the other problems with pointers later.