<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>David Crocker&#039;s Verification Blog</title>
	<atom:link href="http://critical.eschertech.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://critical.eschertech.com</link>
	<description>Formal verification of C/C++ code for critical systems</description>
	<lastBuildDate>Mon, 06 Sep 2010 16:17:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='critical.eschertech.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/2d9ea17580af79aa98fdb5a69e5d2f3b?s=96&#038;d=http://s2.wp.com/i/buttonw-com.png</url>
		<title>David Crocker&#039;s Verification Blog</title>
		<link>http://critical.eschertech.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://critical.eschertech.com/osd.xml" title="David Crocker&#039;s Verification Blog" />
	<atom:link rel='hub' href='http://critical.eschertech.com/?pushpress=hub'/>
		<item>
		<title>ArC is now eCv!</title>
		<link>http://critical.eschertech.com/2010/09/06/arc-is-now-ecv/</link>
		<comments>http://critical.eschertech.com/2010/09/06/arc-is-now-ecv/#comments</comments>
		<pubDate>Mon, 06 Sep 2010 16:17:51 +0000</pubDate>
		<dc:creator>davidcrocker</dc:creator>
				<category><![CDATA[C and C++ in critical systems]]></category>

		<guid isPermaLink="false">http://critical.eschertech.com/?p=516</guid>
		<description><![CDATA[We&#8217;re now nearing the beta release of our C verification product. We&#8217;ve decided to call the product Escher C Verifier, or eCv for short. While we liked the name ArC (Automated Reasoning about C), there are just too many pieces of software already out there with ARC in the name. The market that we&#8217;re trying [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;re now nearing the beta release of our C verification product. We&#8217;ve decided to call the product Escher C Verifier, or <em>eCv </em>for short. While we liked the name ArC (Automated Reasoning about C), there are just too many pieces of software already out there with ARC in the name.</p>
<p>The market that we&#8217;re trying to address with <em>eCv </em>is critical embedded systems &#8211; especially SIL3 and higher &#8211; with code written to the MISRA-C 2004 standard or similar. Restricting ourselves to a C subset in this way allows us to reduce the amount of non-trivial annotation required, aiding developer productivity. If you want to verify programs that don&#8217;t fall into the subset we handle, then there are alternatives such as Microsoft Research&#8217;s Vcc &#8211; but don&#8217;t be surprised if the annotation needed is more complex.</p>
<p>We&#8217;ll be presenting a case study on applying <em>eCv </em>to critical software at the Safety Critical Systems Club meeting on 15 September in Manchester; and we&#8217;ll be discussing the relevance of <em>eCv </em>to ensuring compliance with &#8220;hard&#8221; MISRA C rules at the MISRA-C meeting on 25 November in London. You can find details of these events at the <a href="http://www.safety-club.org.uk/" target="_blank">SCSC website</a>. If you&#8217;ve been following my blog and you&#8217;ll be attending one of these meetings, please introduce yourself to me in one of the breaks.</p>
<p>Information about <em>eCv </em>itself is available at the <a href="http://www.eschertech.com/products/ecv.php" target="_blank">Escher Technologies site</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://critical.eschertech.com/2010/09/06/arc-is-now-ecv/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9efed2bd9429eac89f62a336b6d05174?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">davidcrocker</media:title>
		</media:content>
	</item>
		<item>
		<title>Dynamic Memory Allocation in Critical Embedded Systems</title>
		<link>http://critical.eschertech.com/2010/07/30/dynamic-memory-allocation-in-critical-embedded-systems/</link>
		<comments>http://critical.eschertech.com/2010/07/30/dynamic-memory-allocation-in-critical-embedded-systems/#comments</comments>
		<pubDate>Fri, 30 Jul 2010 13:49:55 +0000</pubDate>
		<dc:creator>davidcrocker</dc:creator>
				<category><![CDATA[C and C++ in critical systems]]></category>

		<guid isPermaLink="false">http://critical.eschertech.com/?p=509</guid>
		<description><![CDATA[Today I&#8217;m going to talk about why dynamic memory allocation is rarely used in critical embedded systems, and whether using only static allocation is a necessary restriction. I&#8217;m going to assume that maintaining system availability is critical, that there are hard real-time deadlines to be met, and that the system is long-running. Issues we have [...]]]></description>
			<content:encoded><![CDATA[<p>Today I&#8217;m going to talk about why dynamic memory allocation is rarely used in critical embedded systems, and whether using only static allocation is a necessary restriction. I&#8217;m going to assume that maintaining system availability is critical, that there are hard real-time deadlines to be met, and that the system is long-running.</p>
<p>Issues we have to face when using dynamic memory allocation in C/C++ include the following:</p>
<ul>
<li><strong>Sufficiency</strong>: how can we be sure that we have provided sufficient memory, so that a critical memory allocation will never be refused?</li>
<li><strong>Garbage management</strong>: how can we be sure that memory that is no longer required is released to the memory manager at exactly the right time? If we release memory too early, we will have dangling pointers or double-free errors. If we release memory too late, we will have memory leaks. These kinds of problem plague development of large C/C++ programs.</li>
<li><strong>Fragmentation</strong>: how can we avoid the situation in which we want to allocate N bytes of memory, but all the available memory is in fragments smaller than N, even though the total available is much larger than N? Memory fragmentation is a plague of long-running C/C++ systems that use dynamic memory extensively.</li>
<li><strong>Timeliness</strong>: when we need to allocate memory, what is the upper bound on the time that the memory manager may take to service the request? Memory managers for C/C++ typically search freelists or more complex structures for fragments of sufficient size, so calls to <strong>malloc </strong>or <strong>new </strong>typically exhibit a variable and sometimes long latency.</li>
</ul>
<p>Many modern languages such as C# and Java provide garbage collection, in which the system automatically identifies memory that is no longer accessible by the program and releases it back to the memory manager. Garbage collectors solve the garbage management issue and generally the fragmentation issue too, since a garbage collection cycle usually includes compacting the heap to move all the free space to one end. Unfortunately, garbage collectors create additional timeliness issues. There has been some work on concurrent and &#8220;real-time&#8221; garbage collectors, although the ones I am aware of still need to &#8220;stop the world&#8221; for a short while at the start of a garbage collection cycle.</p>
<p>As we&#8217;re talking about C and C++, we have to make do without automatic garbage collection. There is in fact a <a href="http://www.hpl.hp.com/personal/Hans_Boehm/gc/" target="_blank">garbage collector for C++</a>, but it is of the conservative non-copying variety, so it doesn&#8217;t fully solve the problems I have listed. I also believe it would be very difficult to make a good safety case for using any sort of conservative collector.</p>
<p>Although the four issues of sufficiency, garbage management, fragmentation and timeliness are serious obstacles to using dynamic memory in C/C++ critical embedded systems, there are a few strategies that can mitigate them. Here are the ones I am aware of:</p>
<p><strong>1. Use dynamic memory allocation during the initialization phase only</strong></p>
<p>The amount of memory dynamically allocated may be constant, or may depend on static inputs such as configuration jumpers. It should be relatively easy to show that there is sufficient memory, by analysing the system to identify candidate worst-case static inputs and testing with those inputs. Even if the system finds it has insufficient memory during actual service, the error occurs in the initialization phase, so it can be handled in a similar way to a power-on self-test failure.</p>
<p><strong>2. Allocate memory, but never release it</strong></p>
<p>This addresses garbage management if objects never need to be freed once allocated. Alternatively, if there are only a few types of object that ever need to be freed, each one can have its own freelist. Objects can be allocated from the corresponding freelist if it is not empty, and always returned to the freelist. This is easy to implement in C++ by defining class-specific <strong>operator new </strong>and <strong>operator delete </strong>functions for the classes concerned.</p>
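<p>As a minimal sketch of the per-type freelist just described, assuming a hypothetical <em>Message</em> class (the class, its members and the fallback to the global heap are illustrative assumptions, not code from any particular system), class-specific <strong>operator new </strong>and <strong>operator delete </strong>functions might look like this:</p>
<pre>#include &lt;cstddef&gt;
#include &lt;new&gt;

// Hypothetical object type; the name and its single field are illustrative only.
class Message final {
public:
    explicit Message(int id) : id_(id) {}
    int id() const { return id_; }

    // Take a block from the freelist if it is not empty; otherwise get a
    // fresh block from the global heap. Fresh blocks are never handed back.
    static void* operator new(std::size_t size) {
        if (freeList_ != nullptr) {
            FreeNode* node = freeList_;
            freeList_ = node-&gt;next;
            return node;
        }
        ++blocksFromHeap_;
        // A block must be big enough to hold a freelist link once it is released.
        return ::operator new(size &lt; sizeof(FreeNode) ? sizeof(FreeNode) : size);
    }

    // Always return the block to this class's freelist, never to the heap.
    static void operator delete(void* p) noexcept {
        if (p != nullptr) {
            freeList_ = ::new (p) FreeNode{freeList_};
        }
    }

    // Number of blocks ever requested from the global heap (for inspection).
    static std::size_t blocksFromHeap() { return blocksFromHeap_; }

private:
    struct FreeNode { FreeNode* next; };
    static FreeNode* freeList_;
    static std::size_t blocksFromHeap_;
    int id_;
};

Message::FreeNode* Message::freeList_ = nullptr;
std::size_t Message::blocksFromHeap_ = 0;</pre>
<p>With this in place, creating a <em>Message</em>, deleting it and creating another should reuse the same block, so <em>blocksFromHeap()</em> stays at 1. The sketch is not thread-safe; a real system would also need to bound the number of live objects.</p>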
<p>This strategy also addresses the sufficiency issue, provided that an upper limit can be placed on the numbers of each object that may be allocated. The total heap memory requirement is then just the sum of the total requirement for each object type.</p>
<p>Fragmentation will not occur because memory is never released.</p>
<p>Timeliness of memory allocations is easily addressed, since we can use a simple allocator that works in constant time. It just needs to increase the heap pointer by the allocation size and return the old heap pointer.</p>
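<p>Here is a rough sketch of such a constant-time allocator. The class name, the 4096-byte capacity and the choice to round every request up to the strictest alignment are assumptions made for illustration:</p>
<pre>#include &lt;cstddef&gt;

// Hypothetical fixed-size heap that only ever grows; nothing is released.
class BumpAllocator {
public:
    // Returns a block of at least 'size' bytes, or a null pointer if the
    // heap is exhausted. No searching: the cost does not depend on how
    // many allocations have already been made.
    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        if (size &gt; kCapacity) {
            return nullptr;               // also guards the rounding below
        }
        const std::size_t rounded = (size + align - 1) / align * align;
        if (rounded &gt; kCapacity - next_) {
            return nullptr;
        }
        void* block = &amp;heap_[next_];      // the old heap pointer
        next_ += rounded;                 // advance by the allocation size
        return block;
    }

    std::size_t used() const { return next_; }

private:
    static constexpr std::size_t kCapacity = 4096;
    alignas(std::max_align_t) unsigned char heap_[kCapacity];
    std::size_t next_ = 0;
};</pre>
<p>Because the capacity is fixed, the sufficiency argument reduces to showing that the sum of all requests, after rounding, never exceeds it.</p>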
<p><strong>3. Use reference-counting garbage collection</strong></p>
<p>Reference counting works by having each object keep track of how many pointers there are that point to it. The reference count is typically kept up to date by using C++ &#8220;smart pointer&#8221; classes instead of simple pointers or references.</p>
<p>This isn&#8217;t as easy as it sounds. The overheads of using smart pointers everywhere are high, so in practice it is desirable to use plain pointers in many situations, such as parameter passing. Then you need to make sure that an object&#8217;s reference count cannot reach zero while there is a plain pointer referring to it. This is not easy to get right by hand. However, if the code is produced by an automatic code generator (such as <a href="http://www.eschertech.com/products/perfect_developer.php" target="_blank"><em>Perfect Developer</em></a>), then the code generator can be written to ensure that plain pointers are only used when it is safe to do so.</p>
<p>Beware of using reference counting on objects shared by multiple processor cores &#8211; reference counts need to be updated atomically, which is typically slow in a multiprocessor system. Also, you need to avoid creating circular chains of pointers, since such chains are not reclaimed by reference-counting garbage collection. This is because the reference counts can never drop to zero unless the chain is broken.</p>
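<p>To make the mechanism concrete, here is a minimal single-threaded sketch of an object that keeps its own count, together with a smart pointer that maintains it. All the names (<em>RefCounted</em>, <em>Ref</em>, <em>Sensor</em>, <em>readSensor</em>) are hypothetical, and the count is deliberately not atomic:</p>
<pre>#include &lt;cstddef&gt;

// Base class that stores the number of Ref objects pointing at it.
class RefCounted {
public:
    RefCounted() : count_(0) {}
    virtual ~RefCounted() {}
    RefCounted(const RefCounted&amp;) = delete;
    RefCounted&amp; operator=(const RefCounted&amp;) = delete;

    void addRef() { ++count_; }
    void release() { if (--count_ == 0) delete this; }
    std::size_t refCount() const { return count_; }

private:
    std::size_t count_;   // not atomic: single-threaded use only
};

// Smart pointer that keeps the pointee's count up to date.
template &lt;class T&gt;
class Ref {
public:
    explicit Ref(T* p = nullptr) : p_(p) { if (p_) p_-&gt;addRef(); }
    Ref(const Ref&amp; other) : p_(other.p_) { if (p_) p_-&gt;addRef(); }
    ~Ref() { if (p_) p_-&gt;release(); }

    Ref&amp; operator=(const Ref&amp; other) {
        if (other.p_) other.p_-&gt;addRef();   // add first, so self-assignment is safe
        if (p_) p_-&gt;release();
        p_ = other.p_;
        return *this;
    }

    T* get() const { return p_; }
    T* operator-&gt;() const { return p_; }

private:
    T* p_;
};

// Example object type that counts how many instances are alive.
struct Sensor : RefCounted {
    static int live;
    int reading;
    Sensor() : reading(0) { ++live; }
    ~Sensor() { --live; }
};
int Sensor::live = 0;

// Plain-pointer parameter: no count updates. This is safe only while the
// caller holds a Ref to the same object for the duration of the call.
inline int readSensor(const Sensor* s) { return s-&gt;reading; }</pre>
<p>Copying a <em>Ref</em> raises the count and destroying one lowers it; the object is deleted when the last <em>Ref</em> goes away. If two such objects held a <em>Ref</em> to each other, neither count could reach zero, which is the circular-chain problem described above.</p>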
]]></content:encoded>
			<wfw:commentRss>http://critical.eschertech.com/2010/07/30/dynamic-memory-allocation-in-critical-embedded-systems/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9efed2bd9429eac89f62a336b6d05174?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">davidcrocker</media:title>
		</media:content>
	</item>
		<item>
		<title>Verifying pointer arithmetic</title>
		<link>http://critical.eschertech.com/2010/07/16/verifying-pointer-arithmetic/</link>
		<comments>http://critical.eschertech.com/2010/07/16/verifying-pointer-arithmetic/#comments</comments>
		<pubDate>Fri, 16 Jul 2010 17:31:13 +0000</pubDate>
		<dc:creator>davidcrocker</dc:creator>
				<category><![CDATA[C and C++ in critical systems]]></category>
		<category><![CDATA[Formal verification of C programs]]></category>
		<category><![CDATA[formal specification]]></category>
		<category><![CDATA[formal verification]]></category>
		<category><![CDATA[pointer arithmetic]]></category>

		<guid isPermaLink="false">http://critical.eschertech.com/?p=455</guid>
		<description><![CDATA[Today I&#8217;ll look at whether code that uses pointer arithmetic is any harder to verify than equivalent code that does not use pointer arithmetic. Consider this function for copying an array (or part of an array) into another array (or part of another array): void arrayCopy(const int* src, int* dst, size_t num) { size_t i; [...]]]></description>
			<content:encoded><![CDATA[<p>Today I&#8217;ll look at whether code that uses pointer arithmetic is any harder to verify than equivalent code that does not use pointer arithmetic.</p>
<p>Consider this function for copying an array (or part of an array) into another array (or part of another array):</p>
<pre><strong>void </strong>arrayCopy(<strong>const int</strong>* src, <strong>int</strong>* dst, size_t num) {
  size_t i;
  <strong>for </strong>(i = 0; i &lt; num; ++i) {
    dst[i] = src[i];
  }
}
</pre>
<p>To verify it, we first need to specify what it is supposed to achieve in a postcondition and add any necessary preconditions. Then we need to add a <a href="http://critical.eschertech.com/2010/03/29/verifying-loops-part-2/" target="_blank">loop invariant</a> so that ArC can verify that the loop achieves the desired state when it terminates, and a <a href="http://critical.eschertech.com/2010/03/31/verifying-loops-proving-termination/" target="_blank">loop variant</a> so that ArC can prove that the loop terminates:</p>
<pre><strong>void </strong>arrayCopy(<strong>const int</strong>* <strong><span style="color:#008000;">array</span> </strong>src, <strong>int</strong>* <strong><span style="color:#008000;">array</span> </strong>dst, size_t num)
<span style="color:#008000;"><strong>writes</strong>(dst.all)
<strong>pre</strong>(src.lim &gt;= num; dst.lim &gt;= num)
<strong>pre</strong>(<strong>disjoint</strong>(src.all, dst.all))
<strong>post</strong>(<strong>forall </strong>i <strong>in </strong>0..(num - 1) :- dst[i] == src[i])</span>
{ size_t i;
  <strong>for </strong>(i = 0; i &lt; num; ++i)
  <span style="color:#008000;"><strong>keep</strong>(i &lt;= num)
  <strong>keep</strong>(<strong>forall </strong>j <strong>in </strong>0..(i - 1) :- dst[j] == src[j])</span>
  <span style="color:#008000;"><strong>decrease</strong>(num - i)</span>
  { dst[i] = src[i];
  }
}</pre>
<p>This is fairly straightforward. As usual, the main loop invariant (the <span style="color:#008000;"><strong>keep</strong>(<strong>forall</strong>&#8230;)</span> clause) is a generalisation of the desired postcondition, and we need another invariant (the <span style="color:#008000;"><strong>keep</strong>(i &lt;= num)</span> clause) to ensure that the main invariant and the body don&#8217;t violate the preconditions of the indexing operator. These invariants and the loop variant (the <span style="color:#008000;"><strong>decrease</strong>(&#8230;)</span> clause) are sufficient for ArC to verify this function.</p>
<p>Many C programmers would write the body as <span style="color:#993300;"><em>*dst++ = *src++;</em></span> instead of explicitly indexing <span style="color:#993300;"><em>dst </em></span>and <span style="color:#993300;"><em>src</em></span>. If we make this change, then because we are changing <span style="color:#993300;"><em>src </em></span>and <span style="color:#993300;"><em>dst </em></span>within the loop, we need to describe their values in the invariant. Here is how we can do that:</p>
<pre><strong>void </strong>arrayCopy(<strong>const int</strong>* <span style="color:#008000;"><strong>array </strong></span>src, <strong>int</strong>* <span style="color:#008000;"><strong>array </strong></span>dst, size_t num)
<span style="color:#008000;"><strong>writes</strong>(dst.all)
<strong>pre</strong>(src.lim &gt;= num; dst.lim &gt;= num)
<strong>pre</strong>(<strong>disjoint</strong>(src.all, dst.all))
<strong>post</strong>(<strong>forall </strong>i <strong>in </strong>0..(num - 1) :- dst[i] == src[i])</span>
{ size_t i;
  <strong>for </strong>(i = 0; i &lt; num; ++i)
  <span style="color:#008000;"><strong>keep</strong>(i &lt;= num)
  <strong>keep</strong>(<strong>forall </strong>j <strong>in </strong>0..(i - 1) :- (<span style="color:#ff0000;"><strong>old </strong></span>dst)[j] == (<span style="color:#ff0000;"><strong>old </strong></span>src)[j])
  <span style="color:#ff0000;"><strong>keep</strong>(</span></span><span style="color:#ff0000;">src == (<strong>old </strong>src) + i; </span><span style="color:#008000;"><span style="color:#ff0000;">dst == (<strong>old </strong>dst) + i)</span>
  <strong>decrease</strong>(num - i)</span>
  { <span style="color:#ff0000;">*dst++ = *src++</span>;
  }
}</pre>
<p>Applying the keyword <strong>old </strong>to an expression in a loop invariant or loop variant yields the value of that expression just before the first iteration of the loop. I&#8217;ve had to replace <span style="color:#993300;"><em>src </em></span>in the second loop invariant by <span style="color:#993300;"><em><strong>old</strong> src</em></span> and similarly for <em><span style="color:#993300;">dst</span></em>, because in that invariant I am referring to array elements computed relative to the original values of <span style="color:#993300;"><em>src </em></span>and <span style="color:#993300;"><em>dst</em></span>. I&#8217;ve described the modification to <span style="color:#993300;"><em>src </em></span>and <span style="color:#993300;"><em>dst </em></span>by adding a third invariant, that says that at the start and end of any iteration, <span style="color:#993300;"><em>src </em></span>has its original value plus <span style="color:#993300;"><em>i</em></span>, and similarly for <span style="color:#993300;"><em>dst</em></span>. The post-increment operators applied to <span style="color:#993300;"><em>src </em></span>and <span style="color:#993300;"><em>dst </em></span>in the body ensure that these relationships are maintained. Once again, the function is verifiable by ArC.</p>
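<p>For readers without ArC to hand, the same three invariants can be mirrored as run-time assertions in plain C++. This is only a sketch: <em>oldSrc </em>and <em>oldDst </em>are illustrative names standing in for <strong>old </strong>src and <strong>old </strong>dst, and assertions check the invariants only for the inputs actually run, whereas ArC proves them for all inputs that satisfy the preconditions.</p>
<pre>#include &lt;cassert&gt;
#include &lt;cstddef&gt;

// Second variant of arrayCopy, with the loop invariants checked at the
// start of every iteration instead of being proved statically.
void arrayCopyChecked(const int* src, int* dst, std::size_t num) {
    const int* const oldSrc = src;   // value of src before the first iteration
    int* const oldDst = dst;         // value of dst before the first iteration
    for (std::size_t i = 0; i &lt; num; ++i) {
        assert(i &lt;= num);                                   // keep(i &lt;= num)
        for (std::size_t j = 0; j &lt; i; ++j) {
            assert(oldDst[j] == oldSrc[j]);                 // keep(forall j ...)
        }
        assert(src == oldSrc + i &amp;&amp; dst == oldDst + i);     // third invariant
        *dst++ = *src++;
    }
}</pre>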
<p>What if we go further and get rid of the loop counter <span style="color:#993300;"><em>i </em></span>as well? We can do that if we precompute the end pointer <span style="color:#993300;"><em>src + num</em></span> at the start, and then iterate until <span style="color:#993300;"><em>src </em></span>reaches this point. We still need to describe the values of <span style="color:#993300;"><em>src </em></span>and <span style="color:#993300;"><em>dst </em></span>in the invariant, but we must do so without the benefit of <span style="color:#993300;"><em>i</em></span>. Here&#8217;s one way of doing it:</p>
<pre><strong>void </strong>arrayCopy(<strong>const int</strong>* <span style="color:#008000;"><strong>array </strong></span>src, <strong>int</strong>* <span style="color:#008000;"><strong>array </strong></span>dst, size_t num)
<span style="color:#008000;"><strong>writes</strong>(dst.all)
<strong>pre</strong>(src.lim &gt;= num; dst.lim &gt;= num)
<strong>pre</strong>(<strong>disjoint</strong>(src.all, dst.all))
<strong>post</strong>(<strong>forall </strong>i <strong>in </strong>0..(num - 1) :- dst[i] == src[i])</span>
{ <span style="color:#ff0000;"><strong>const int</strong>* <strong>array const</strong> srcEnd = src + num;</span>
  <span style="color:#ff0000;"><strong>while</strong>(src != srcEnd)</span>
  <span style="color:#ff0000;"><strong>keep</strong>(src.base == <strong>old</strong>(src.base))
  <strong>keep</strong>(0 &lt;= src - (<strong>old </strong>src))</span>
  <span style="color:#008000;"><strong>keep</strong>(<span style="color:#ff0000;">src - (<strong>old </strong>src)</span> &lt;= num)
  <strong>keep</strong>(<strong>forall </strong>j <strong>in </strong>0..(<span style="color:#ff0000;">(src - (<strong>old </strong>src))</span> - 1) :- (<strong>old </strong>dst)[j] == (<strong>old </strong>src)[j])
  <strong>keep</strong>(dst == (<strong>old </strong>dst) + <span style="color:#ff0000;">(src - (<strong>old </strong>src))</span>)
  <strong>decrease</strong>(<span style="color:#ff0000;">srcEnd - src</span>)</span>
  { *dst++ = *src++;
  }
}</pre>
<p>Here&#8217;s how I arrived at the new loop annotations:</p>
<ul>
<li>I&#8217;ve taken the three <strong>keep</strong>-clauses from the previous version and substituted <span style="color:#993300;"><em>(src - (<strong>old </strong>src))</em></span> for the loop counter <span style="color:#993300;"><em>i</em></span>. That&#8217;s how I got the 3rd, 4th and 5th <strong>keep</strong>-clauses. For the last of the three, I also removed the resulting tautology <span style="color:#993300;"><em>src == (<strong>old </strong>src) + (src - (<strong>old </strong>src))</em></span>, leaving just the component that talks about <span style="color:#993300;"><em>dst</em></span>.</li>
<li>The original first invariant <em><span style="color:#993300;"><strong>keep</strong>(i &lt;= num)</span></em> placed an upper bound on <span style="color:#993300;"><em>i</em></span>, thereby ensuring that <span style="color:#993300;"><em>src[i]</em></span> and <span style="color:#993300;"><em>dst[i]</em></span> in the original code or <span style="color:#993300;"><em>*src</em></span> and <em><span style="color:#993300;">*dst</span></em> in the modified code were in bounds. I didn&#8217;t need to put a lower bound on <span style="color:#993300;"><em>i </em></span>because its type was <span style="color:#993300;"><em>size_t</em></span> which is an unsigned type, so its lower bound was zero implicitly. But I&#8217;ve now replaced <span style="color:#993300;"><em>i</em></span> by <span style="color:#993300;"><em>(src - (<strong>old </strong>src))</em></span>, which has no such constraint. So I need to put a zero lower bound on this expression, which is what the second <strong>keep</strong>-clause is for.</li>
<li>Unfortunately, this isn&#8217;t quite enough. The expression <span style="color:#993300;"><em>src - <strong>old </strong>src</em></span> uses the pointer-difference operator, which has the precondition that the operands are pointers into the same array. In general, when we re-assign an array pointer such as <span style="color:#993300;"><em>src</em></span>, there is no requirement that it points into the same array as before. In this case, all we ever do to <span style="color:#993300;"><em>src </em></span>is increment it; so it obviously continues to point into the same array as long as it is not over-incremented (which ArC verifies). However, ArC isn&#8217;t yet clever enough to spot this and generate an implicit loop invariant. So I&#8217;ve added the first invariant <span style="color:#993300;"><em>src.base == <strong>old</strong>(src.base)</em></span> right at the beginning of the loop. In the expression <span style="color:#993300;"><em>src.base</em></span>, <em><span style="color:#993300;">base </span></em>is a ghost field of <span style="color:#993300;"><em>src</em></span>, just like <span style="color:#993300;"><em>lim </em></span>and <span style="color:#993300;"><em>lwb</em></span>. It subtracts <span style="color:#993300;"><em>src.lwb</em></span> from <em><span style="color:#993300;">src</span></em>, yielding a pointer to the very start of the array that <span style="color:#993300;"><em>src </em></span>points into. So to state that two array pointers point into the same array, I just need to say that their bases are equal.</li>
<li>Rather than just substitute for <em><span style="color:#993300;">i </span></em>in the loop variant, I&#8217;ve changed it to say that I expect the difference between <span style="color:#993300;"><em>srcEnd </em></span>and <span style="color:#993300;"><em>src </em></span>to decrease on each iteration.</li>
</ul>
<p>I hope this example demonstrates to you that verification of C code that uses pointer arithmetic can be done, although it is likely to require more loop invariants than equivalent code that does not use pointer arithmetic, and getting them right may be a little harder. The number of verification conditions to be proved may be higher too &#8211; there were 24 from the original code, 32 from the second version, and 35 from the last version. So indexing should generally be preferred to explicit pointer arithmetic &#8211; as recommended by MISRA-C 2004 rule 17.4 &#8211; when writing verifiable code.</p>
<p>Finally, we could implement this function using <span style="color:#993300;"><em>memcpy</em></span>:</p>
<pre><strong>void </strong>arrayCopy(<strong>const int</strong>* <span style="color:#008000;"><strong>array </strong></span>src, <strong>int</strong>* <span style="color:#008000;"><strong>array </strong></span>dst, size_t num)
<span style="color:#008000;"><strong>writes</strong>(dst.all)
<strong>pre</strong>(src.lim &gt;= num; dst.lim &gt;= num)
<strong>pre</strong>(<strong>disjoint</strong>(src.all, dst.all))
<span style="color:#ff0000;"><strong>pre</strong>(num * <strong>sizeof</strong>(int) &lt;= maxof(size_t))</span>
<strong>post</strong>(<strong>forall </strong>i <strong>in </strong>0..(num - 1) :- dst[i] == src[i])</span>
{ <span style="color:#ff0000;">memcpy(dst, src, num * <strong>sizeof</strong>(<strong>int</strong>));</span>
}</pre>
<p>I&#8217;ve had to add another precondition to ensure that the computation of the number of bytes <span style="color:#993300;"><em>num * <strong>sizeof</strong>(int)</em></span> doesn&#8217;t overflow a <span style="color:#993300;"><em>size_t</em></span>. If ArC were to assume that the number of bytes in any array cannot exceed <em><span style="color:#993300;"><strong>maxof</strong>(size_t)</span></em> &#8211; as is the case for many implementations &#8211; then it could infer that <span style="color:#993300;"><em>dst.lim &lt;= <strong>maxof</strong>(size_t)/<strong>sizeof</strong>(int)</em></span> and we would not need the precondition. Unfortunately, the C standard makes no such promise.</p>
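<p>In code that is not statically verified, the same condition can be tested at run time before the multiplication is performed. The following is a sketch only; the function name and the choice to report failure through a <strong>bool </strong>result are assumptions for illustration:</p>
<pre>#include &lt;cstddef&gt;
#include &lt;cstring&gt;
#include &lt;limits&gt;

// Copies num ints from src to dst. Returns false, copying nothing, if
// num * sizeof(int) would not fit in a std::size_t.
bool arrayCopyBytes(const int* src, int* dst, std::size_t num) {
    // Divide instead of multiplying, so the test itself cannot overflow.
    if (num &gt; std::numeric_limits&lt;std::size_t&gt;::max() / sizeof(int)) {
        return false;
    }
    if (num != 0) {
        std::memcpy(dst, src, num * sizeof(int));
    }
    return true;
}</pre>
<p>The caller remains responsible for the other preconditions: both arrays must hold at least <em>num </em>elements and must not overlap.</p>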
<p>The call to <span style="color:#993300;"><em>memcpy </em></span>involves implicit conversions of its first two parameters from <span style="color:#993300;"><em><strong>int</strong>*</em></span> to <span style="color:#993300;"><em><strong>void</strong>*</em></span> and from <span style="color:#993300;"><em><strong>const int</strong>*</em></span> to <span style="color:#993300;"><em><strong>const void</strong>*</em></span>. In general, pointer conversions should be avoided when writing for ArC, because such conversions break ArC&#8217;s strong type model. However, ArC knows enough about the semantics of <span style="color:#993300;"><em>memcpy </em></span>to prove the 10 verification conditions generated by this version too.</p>
]]></content:encoded>
			<wfw:commentRss>http://critical.eschertech.com/2010/07/16/verifying-pointer-arithmetic/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9efed2bd9429eac89f62a336b6d05174?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">davidcrocker</media:title>
		</media:content>
	</item>
		<item>
		<title>Run-time checks: Are they worth it?</title>
		<link>http://critical.eschertech.com/2010/07/07/run-time-checks-are-they-worth-it/</link>
		<comments>http://critical.eschertech.com/2010/07/07/run-time-checks-are-they-worth-it/#comments</comments>
		<pubDate>Wed, 07 Jul 2010 16:28:41 +0000</pubDate>
		<dc:creator>davidcrocker</dc:creator>
				<category><![CDATA[C and C++ in critical systems]]></category>

		<guid isPermaLink="false">http://critical.eschertech.com/?p=436</guid>
		<description><![CDATA[One of the criticisms levelled against the use of C in safety-critical software is that the C language does not provide run-time checks automatically. For example, when indexing into an array, there is no check that the index is in bounds. Likewise, when doing integer arithmetic in C, there is no check for arithmetic overflow. [...]]]></description>
			<content:encoded><![CDATA[<p>One of the criticisms levelled against the use of C in safety-critical software is that the C language does not provide run-time checks automatically. For example, when indexing into an array, there is no check that the index is in bounds. Likewise, when doing integer arithmetic in C, there is no check for arithmetic overflow.</p>
<p>Many other programming languages <em>do </em>provide run-time checks. For example, Java and C# both raise exceptions if the program attempts to access an out-of-bounds array element. C# also provides an option to raise exceptions on arithmetic overflow. Ada provides all these run-time checks by default, although most compilers have an option to inhibit their generation. If you are programming in C++ and follow my recommendation to use a <a href="http://critical.eschertech.com/2010/03/09/safer-arrays-using-a-c-array-class/" target="_blank">C++ array class</a>, you can choose whether or not to perform index-in-bound checks.</p>
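<p>For plain C, here is a minimal sketch of what an opt-in index check might look like. The helper name <em>checked_at </em>and the <em>NO_BOUNDS_CHECKS </em>macro are hypothetical names chosen for illustration; they are not part of the C standard library or of ArC.</p>
<pre style="padding-left:30px;">#include &lt;stddef.h&gt;
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;

/* Hypothetical helper: returns a pointer to element idx of an int array
   of length len. Unless NO_BOUNDS_CHECKS is defined, an out-of-bounds
   index is reported on stderr and the program is aborted. */
int *checked_at(int *arr, size_t len, size_t idx)
{
#ifndef NO_BOUNDS_CHECKS
    if (idx &gt;= len) {
        fprintf(stderr, "index %lu out of bounds (length %lu)\n",
                (unsigned long)idx, (unsigned long)len);
        abort();
    }
#else
    (void)len;  /* the length is unused when checks are compiled out */
#endif
    return &amp;arr[idx];
}</pre>
<p>With this in place, <em>*checked_at(a, 3, 1)</em> reads the same element as <em>a[1]</em>, and compiling with <em>NO_BOUNDS_CHECKS </em>defined removes the check. Aborting is only one possible response to a failed check; whether it is an acceptable one is exactly the question discussed below.</p>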
<p>Some developers of critical software consider it axiomatic that you should leave run-time checks enabled if the programming language provides them. The practice of enabling run-time checks in debug builds but disabling them in release builds has been likened to carrying a fire extinguisher in your car at all times, except when you actually use it! Does this mean that we should insist that run-time checks are always enabled in release builds of critical software &#8211; and therefore avoid using programming languages that don&#8217;t support them, such as plain C?</p>
<p>In my opinion, shipping software containing run-time checks is generally a good thing <strong>if </strong>the software can do something useful when a run-time check fails, <strong>and </strong>that &#8220;something useful&#8221; is tested. This will probably require the deliberate introduction of temporary bugs, since we expect the software to be free from run-time errors in normal use. For an example of where this wasn&#8217;t done properly, see the <a href="http://esamultimedia.esa.int/docs/esa-x-1819eng.pdf" target="_blank">report on the Ariane 5 launch</a>. In summary, the launcher was lost when a run-time error occurred in a floating-point to fixed-point conversion, in a software module that was performing no useful function at the time. This led to the whole software subsystem shutting down. In doing so, it put an error code on the data bus, where it was interpreted as flight data. This caused the rocket nozzles to move to full deflection, leading to break-up of the rocket and triggering of the self-destruction mechanism. Had the overflowing data conversion simply yielded wrong information instead of raising an exception, the mission might not have been lost. This may be a rare example, but I think it illustrates that run-time checks are not invariably a good thing.</p>
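<p>To make &#8220;something useful&#8221; concrete, here is a minimal sketch in C of a checked conversion that reports failure to its caller instead of raising an error. It is not a reconstruction of the Ariane software; the function name <em>to_int16_checked </em>and the choice to saturate are assumptions made purely for illustration.</p>
<pre style="padding-left:30px;">#include &lt;stdbool.h&gt;
#include &lt;stdint.h&gt;

/* Hypothetical helper: converts x to int16_t.
   Returns true and stores the truncated value if x lies within
   INT16_MIN..INT16_MAX. Otherwise returns false and stores a fallback:
   0 for NaN, or the nearest representable limit for out-of-range input. */
bool to_int16_checked(double x, int16_t *out)
{
    if (x != x) {                      /* only NaN compares unequal to itself */
        *out = 0;
        return false;
    }
    if (x &gt; (double)INT16_MAX) {
        *out = INT16_MAX;
        return false;
    }
    if (x &lt; (double)INT16_MIN) {
        *out = INT16_MIN;
        return false;
    }
    *out = (int16_t)x;                 /* in range: truncates toward zero */
    return true;
}</pre>
<p>The caller still has to decide what to do when the function returns <strong>false</strong>, and that path needs testing just as much as the normal one.</p>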
<p>What should we do when a run-time check fails? The Ariane software developers assumed that any run-time check failures would be caused by random hardware errors, so their approach was to log the error (which they got right) and hand over to the backup unit (which didn&#8217;t help, because it had already experienced exactly the same run-time check failure).</p>
<p>In an embedded system, the options for handling failed run-time checks may be  limited &#8211; particularly in life-preserving software, such as fly-by-wire control systems. Handing over to a backup unit won&#8217;t help if the run-time failure is caused by a systemic failure, and restarting the software is usually not a viable option. In such cases, the best solution is surely to show that run-time checks can never fail unless the hardware is at fault. You can attempt to show this by thorough testing, but you need to be very careful. The Ariane 5 software had been thoroughly tested and &#8220;proven in use&#8221; in Ariane 4; however, the software was then re-used in Ariane 5 without repeating the tests using the higher horizontal velocity inputs that occur in an Ariane 5 launch.</p>
<p>You can also attempt to show that run-time checks will never fail by formal verification. This will expose any hidden preconditions &#8211; such as the maximum horizontal velocity that could be handled by the Ariane software. Your re-use strategy must ensure that when you use previously verified software in a new environment, the new environment continues to respect the preconditions.</p>
<p>If run-time checks can be a mixed blessing in release software, are they ever always a good thing? Yes they are, when you are testing the software! No worries about what to do when a run-time check fails: just log it and terminate the test! Whether you are doing unit testing or integration testing, a failed run-time check gives an early indication of something being wrong, often making diagnosis of the bug much easier. Run-time checks can also catch &#8220;benign&#8221; bugs that don&#8217;t lead to incorrect results in tests, but which may bite you later. For example, reading an array beyond its bounds may cause the program to read a value that depends on the previous test. It might read a &#8220;benign&#8221; value during testing, but a &#8220;harmful&#8221; value during some particular use in the field.</p>
<p>If you&#8217;re performing formal verification before testing, you may argue that run-time checks are a waste of testing time. After all, they are never going to fail, right? Well, even with full formal verification, errors might occur. The compiler you are using might be generating the wrong code; or the linker might introduce an error; or the hardware itself may be faulty. Even formal verification systems have been known to contain errors. When we test formally verified software, any test failure is symptomatic of a fault in the development process, tool chain, or hardware. If we test thoroughly and find no errors, this gives us confidence that the process and tool chain are sound. Testing with run-time checks enabled (as well as without, if we intend to ship without run-time checks) and experiencing no run-time check failures adds to that confidence.</p>
<p>We adopt this philosophy when developing ArC and <em>Perfect Developer</em>. Both are written in <em>Perfect</em>, from which we generate code. For releases, we generate C++ because of the increased speed it offers &#8211; after all, theorem proving is computationally expensive. But all our development prior to final regression testing is done using C# code generation, so that we have the added benefit of run-time checks.</p>
]]></content:encoded>
			<wfw:commentRss>http://critical.eschertech.com/2010/07/07/run-time-checks-are-they-worth-it/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9efed2bd9429eac89f62a336b6d05174?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">davidcrocker</media:title>
		</media:content>
	</item>
		<item>
		<title>Aliasing and how to control it</title>
		<link>http://critical.eschertech.com/2010/06/22/aliasing-and-how-to-control-it/</link>
		<comments>http://critical.eschertech.com/2010/06/22/aliasing-and-how-to-control-it/#comments</comments>
		<pubDate>Tue, 22 Jun 2010 11:22:53 +0000</pubDate>
		<dc:creator>davidcrocker</dc:creator>
				<category><![CDATA[C and C++ in critical systems]]></category>
		<category><![CDATA[Formal verification of C programs]]></category>
		<category><![CDATA[aliasing]]></category>
		<category><![CDATA[formal specification]]></category>
		<category><![CDATA[formal verification]]></category>

		<guid isPermaLink="false">http://critical.eschertech.com/?p=420</guid>
		<description><![CDATA[Today I&#8217;ll start by writing a simple function that determines the maximum and minimum of two integers. We want to return two values, and C doesn&#8217;t make that easy unless we declare a struct to hold them. So I&#8217;ll pass two pointers to where I want the results stored instead. Here goes: #include "arc.h" void [...]]]></description>
			<content:encoded><![CDATA[<p>Today I&#8217;ll start by writing a simple function that determines the maximum and minimum of two integers. We want to return two values, and C doesn&#8217;t make that easy unless we declare a <strong>struct </strong>to hold them. So I&#8217;ll pass two pointers to where I want the results stored instead. Here goes:</p>
<pre style="padding-left:30px;"><span style="color:#008000;">#<strong>include </strong>"arc.h"</span>
<strong>
void </strong>minMax(<strong>int </strong>a, <strong>int </strong>b, <strong><span style="color:#008000;">out</span> int </strong>*min, <strong><span style="color:#008000;">out</span> int </strong>*max)
<span style="color:#008000;"><strong>post</strong>(*min &lt;= a; *min &lt;= b; *min == a || *min == b)
<strong>post</strong>(*max &gt;= a; *max &gt;= b; *max == a || *max == b)</span>
{ *min = a &lt; b ? a : b;
  *max = a &gt; b ? a : b;
}</pre>
<p>I&#8217;ve highlighted the ArC annotations in <span style="color:#008000;">green</span>. Notice how, in the postcondition, I&#8217;ve defined the minimum as a value that is less than or equal to both inputs and equal to one of them. Similarly for the maximum. Also notice that the two parameters used to pass the results back are flagged with the <span style="color:#008000;"><strong>out</strong></span> keyword. This tells ArC (and users of this function) that the initial values of *<em>min </em>and *<em>max </em>are not used by the function, and the obligation to initialize them rests with the function, instead of with the caller of the function.</p>
<p>Does this function work? Always? Let&#8217;s see what ArC makes of it:</p>
<p style="font-size:smaller;"><span style="color:#ff0000;">c:\escher\arctest\arctest20.c (9,25): Warning! Unable to prove: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (5,27)) (see C:\Escher\ArcTest\arctest20_unproven.htm#2), did not prove: *min &lt;= b.<br />
c:\escher\arctest\arctest20.c (9,25): Warning! Unable to prove: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (5,16)) (see C:\Escher\ArcTest\arctest20_unproven.htm#1), did not prove: *min &lt;= a.</span><br />
<span style="color:#008000;">c:\escher\arctest\arctest20.c (9,25): Information! Confirmed: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (5,43)) (see C:\Escher\ArcTest\arctest20_proof.htm#3), proved: (a == *min) || (b == *min).<br />
c:\escher\arctest\arctest20.c (9,25): Information! Confirmed: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (6,16)) (see C:\Escher\ArcTest\arctest20_proof.htm#4), proved: a &lt;= *max.<br />
c:\escher\arctest\arctest20.c (9,25): Information! Confirmed: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (6,27)) (see C:\Escher\ArcTest\arctest20_proof.htm#5), proved: b &lt;= *max.<br />
c:\escher\arctest\arctest20.c (9,25): Information! Confirmed: Postcondition satisfied when function returns (defined at c:\escher\arctest\arctest20.c (6,43)) (see C:\Escher\ArcTest\arctest20_proof.htm#6), proved: (a == *max) || (b == *max).</span></p>
<p>So ArC proves that the maximum is returned correctly, but not the minimum. What&#8217;s going on here? If we follow the link to the &#8220;unproven&#8221; file, we find a clue:</p>
<p style="font-size:small;padding-left:30px;"><span style="color:#008000;"><strong>Could not prove:</strong> !(min == max)</span></p>
<p>Sure enough, it&#8217;s clear that if we make the following call:</p>
<pre style="padding-left:30px;"><strong>int </strong>temp;
minMax(42, 24, &amp;temp, &amp;temp);
</pre>
<p>then <em>minMax </em>fails to meet its specification, because after storing the minimum in <em>temp</em>, it overwrites that value with the maximum.</p>
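<p>The effect is easy to reproduce with an ordinary C compiler. The following sketch repeats <em>minMax </em>with the ArC annotations removed; the wrapper <em>aliasedCall </em>is a hypothetical name added only for this demonstration.</p>
<pre style="padding-left:30px;">/* minMax as above, minus the ArC annotations, so any C compiler accepts it */
void minMax(int a, int b, int *min, int *max)
{
    *min = a &lt; b ? a : b;
    *max = a &gt; b ? a : b;
}

/* Hypothetical demo: returns the value left in temp after the aliased call */
int aliasedCall(void)
{
    int temp;
    minMax(42, 24, &amp;temp, &amp;temp);
    return temp;   /* 42: the minimum (24) was overwritten by the maximum */
}</pre>
<p>Here <em>aliasedCall </em>returns 42, so the postcondition <em>*min &lt;= b</em> is violated: 42 is not less than or equal to 24.</p>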
<p>The fix is to add a precondition to make that sort of call illegal:</p>
<pre style="padding-left:30px;"><strong>void </strong>minMax(<strong>int </strong>a, <strong>int </strong>b, <strong><span style="color:#008000;">out</span> int </strong>*min, <strong><span style="color:#008000;">out</span> int </strong>*max)
<span style="color:#008000;"><span style="color:#ff0000;"><strong>pre</strong>(min != max)</span>
<strong>post</strong>(*min &lt;= a; *min &lt;= b; *min == a || *min == b)
<strong>post</strong>(*max &gt;= a; *max &gt;= b; *max == a || *max == b)</span>
{ *min = a &lt; b ? a : b;
  *max = a &gt; b ? a : b;
}</pre>
<p>ArC is then able to verify the function.</p>
<p>This is a simple example of classic <strong>pointer aliasing </strong>&#8211; the situation in which two pointers refer to the same memory location. Any assignment we make through one of the pointers affects the value we read through the other. Another form of aliasing is <strong>pointer-variable</strong> aliasing. In this variant, a function is manipulating the value referred to by a pointer, and also a variable &#8211; typically a static variable declared outside the function. If the pointer points to the variable, then assignments through the pointer affect the value read from the variable, and vice versa.</p>
<p>ArC adopts a <strong>strict aliasing</strong> model that disallows casts between different pointer types. This means that in an ArC-compatible C program, pointer-pointer aliasing can only occur if the pointers have the same type, or if one of the types pointed to contains a field or element of the other. Similarly, aliasing between a pointer and a variable can only happen if the type pointed to is either the same as the type of the variable, or the same as the type of a field or element contained in the variable. This serves to reduce the potential for aliasing.</p>
<p>Nevertheless, there are still many situations in which aliasing is possible and will break the program if it occurs. For example, consider the following function for copying elements from one array to another:</p>
<pre style="padding-left:30px;"><span style="color:#008000;"><strong>#include</strong> "arc.h"</span><strong>

void arrayCopy</strong>(<strong>const int</strong>* <span style="color:#008000;"><strong>array</strong></span> src, <strong>int</strong>* <span style="color:#008000;"><strong>array </strong></span>dst, size_t num)
<span style="color:#008000;"><strong>pre</strong>(src.lim &gt;= num; dst.lim &gt;= num)
<strong>post</strong>(<strong>forall </strong>i in 0..(num - 1) :- dst[i] == src[i])</span>
{ size_t j;
  <strong>int </strong>i;
  <strong>for </strong>(j = 0; j &lt; num; ++j)
  <span style="color:#ff0000;"><span style="color:#008000;"><strong>keep</strong>(j &lt;= num)
  <span style="color:#008000;"><strong>keep</strong>(</span></span></span><span style="color:#008000;"><strong>forall </strong>i in 0..(j - 1) :- dst[i] == src[i]</span><span style="color:#ff0000;"><span style="color:#008000;">)</span>
  <span style="color:#008000;"><strong>decrease</strong>(num - j)</span></span>
  { dst[j] = src[j];
  }
}</pre>
<p>The preconditions ensure that we don&#8217;t get out-of-bounds array accesses, and the postcondition describes what we want to achieve. For an explanation of the loop invariants (<strong>keep </strong>clauses) and loop variants (<strong>decrease </strong>clauses), see <a href="http://critical.eschertech.com/2010/03/22/verifying-loops-in-c/" target="_blank">Verifying loops in C and C++</a>, <a href="http://critical.eschertech.com/2010/03/29/verifying-loops-part-2/" target="_blank">Verifying loops &#8211; part 2</a> and <a href="http://critical.eschertech.com/2010/03/31/verifying-loops-proving-termination/" target="_blank">Verifying loops: proving termination</a>.</p>
<p>Does this function meet its specification? Not if the elements of <em>dst </em>we are writing overlap with the elements of <em>src </em>that we are reading! So we need an anti-aliasing precondition again. Here&#8217;s one possibility:</p>
<pre style="padding-left:30px;"><strong></strong><strong>void arrayCopy</strong>(<strong>const int</strong>* <span style="color:#008000;"><strong>array</strong> </span>src, <strong>int</strong>* <span style="color:#008000;"><strong>array </strong></span>dst, size_t num)
<span style="color:#008000;"><strong>pre</strong>(src.lim &gt;= num; dst.lim &gt;= num)</span><strong>
<span style="color:#ff0000;">pre</span></strong><span style="color:#ff0000;">(<strong>forall </strong>i in src.indices; j in dst.indices :- &amp;src[i] != &amp;dst[j])</span><span style="color:#008000;">
<strong>post</strong>(<strong>forall </strong>i in 0..(num - 1) :- dst[i] == src[i])</span>
{ ...<strong></strong>
}</pre>
<p>Stating that arrays don&#8217;t overlap in this way is rather cumbersome. But it can get a lot worse! Suppose we have, for example, an <strong>int</strong>* and a pointer to an array of structs, where each struct has several <strong>int </strong>fields, some <strong>int</strong>[] fields, and some struct fields that may themselves contain <strong>int</strong>s. If we want to write a precondition stating that the <strong>int</strong>* doesn&#8217;t alias any of the <strong>int</strong>s contained in the array of structs, we would need one precondition term for each possible way in which they could be aliased. If we add another <strong>int</strong>* parameter that is also forbidden to alias the other pointers, we would need to do much the same all over again.</p>
<p>To make it easier to express absence of aliasing, ArC provides the <strong>disjoint</strong>-expression. Here&#8217;s our example modified to use it:</p>
<pre style="padding-left:30px;"><strong></strong><strong>void arrayCopy</strong>(<strong>const int</strong>* <span style="color:#008000;"><strong>array</strong> </span>src, <strong>int</strong>* <span style="color:#008000;"><strong>array </strong></span>dst, size_t num)
<span style="color:#008000;"><strong>pre</strong>(src.lim &gt;= num; dst.lim &gt;= num)</span><strong>
<span style="color:#ff0000;">pre</span></strong><span style="color:#ff0000;">(<strong>disjoint</strong>(src.all, dst.all))</span><span style="color:#008000;">
<strong>post</strong>(<strong>forall </strong>i in 0..(num - 1) :- dst[i] == src[i])</span>
{ ...<strong></strong>
}</pre>
<p>The <strong>disjoint</strong>-expression takes two or more operands and yields <strong>true </strong>if and only if no pair of operands refer to overlapping storage. Internally, ArC determines all the ways in which pairs of operands could be aliased, and expands the expression into a conjunction of equivalent predicates for the prover to use.</p>
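<p>ArC discharges the disjointness precondition statically, but it may help to see what goes wrong at run time when it is violated. The sketch below is plain C: <em>arrayCopy </em>is the function from above with the annotations removed, and <em>rangesDisjoint </em>is a hypothetical helper that mirrors the element-by-element <strong>forall </strong>precondition. As a simplifying assumption it compares only the first <em>num </em>elements of each array rather than the whole arrays.</p>
<pre style="padding-left:30px;">#include &lt;stdbool.h&gt;
#include &lt;stddef.h&gt;

/* arrayCopy from above, minus the ArC annotations */
void arrayCopy(const int *src, int *dst, size_t num)
{
    size_t j;
    for (j = 0; j &lt; num; ++j) {
        dst[j] = src[j];
    }
}

/* Hypothetical run-time analogue of the forall precondition: returns true
   if no element of src[0..num-1] shares an address with any element of
   dst[0..num-1]. Quadratic in num, so suitable only for illustration. */
bool rangesDisjoint(const int *src, const int *dst, size_t num)
{
    size_t i, j;
    for (i = 0; i &lt; num; ++i) {
        for (j = 0; j &lt; num; ++j) {
            if (&amp;src[i] == &amp;dst[j]) {
                return false;
            }
        }
    }
    return true;
}</pre>
<p>Given <em>int a[4] = {1, 2, 3, 4}</em>, the call <em>rangesDisjoint(a, a + 1, 3)</em> yields <strong>false</strong>, and <em>arrayCopy(a, a + 1, 3)</em> leaves <em>a </em>holding 1, 1, 1, 1. The destination does not end up with a copy of the original source values 1, 2, 3, because each write clobbers the element that the next iteration reads.</p>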
<p>Although ArC can reason about C programs even in the presence of aliasing, it&#8217;s best to write programs that avoid the possibility of aliasing as far as possible. To help you, ArC extends the strict aliasing model when you declare a new type  using a <a href="http://critical.eschertech.com/2010/02/26/using-constrained-types-in-c/" target="_blank">constrained typedef</a>. ArC doesn&#8217;t allow you to cast a  pointer to a constrained type to or from a pointer to any other type. For  example, suppose I declare the following:</p>
<pre style="padding-left:30px;"><strong>typedef int <span style="color:#008000;">invariant</span></strong><span style="color:#008000;">(<strong>value </strong>&gt;= 0)</span> count_t;
<strong>typedef int <span style="color:#008000;">invariant</span></strong><span style="color:#008000;">(<strong>value </strong>&gt;= -273)</span> temperature_t;

<strong>void </strong>foo(count_t * a, temperature_t *b) { ... }
</pre>
<p>Within <em>foo</em>, ArC knows that *a and *b cannot be aliases for the  same variable, because the pointer types are not compatible. If you  should want to declare a new sort of integral type without actually  constraining it, you can use <span style="color:#008000;"><strong>invariant</strong>(<strong>true</strong>)</span>.</p>
]]></content:encoded>
			<wfw:commentRss>http://critical.eschertech.com/2010/06/22/aliasing-and-how-to-control-it/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9efed2bd9429eac89f62a336b6d05174?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">davidcrocker</media:title>
		</media:content>
	</item>
		<item>
		<title>Verifying absence of integer overflow</title>
		<link>http://critical.eschertech.com/2010/06/07/verifying-absence-of-integer-overflow/</link>
		<comments>http://critical.eschertech.com/2010/06/07/verifying-absence-of-integer-overflow/#comments</comments>
		<pubDate>Mon, 07 Jun 2010 16:44:39 +0000</pubDate>
		<dc:creator>davidcrocker</dc:creator>
				<category><![CDATA[C and C++ in critical systems]]></category>
		<category><![CDATA[Formal verification of C programs]]></category>
		<category><![CDATA[formal verification]]></category>

		<guid isPermaLink="false">http://critical.eschertech.com/?p=364</guid>
		<description><![CDATA[One class of errors we need to guard against when writing critical software is arithmetic overflow. Before I go into detail, I invite you to consider the following program and decide what it prints: #include &#60;stdio.h&#62; int main(int argc, char *argv[]) { unsigned int x = 42; long y = -10; printf("%s\n", (x &#62; y [...]]]></description>
			<content:encoded><![CDATA[<p>One class of errors we need to guard against when writing critical software is arithmetic overflow. Before I go into detail, I invite you to consider the following program and decide what it prints:</p>
<p style="padding-left:30px;"><code><strong>#include</strong> &lt;stdio.h&gt;<br />
<strong>int </strong>main(<strong>int </strong>argc, <strong>char </strong>*argv[]) {<br />
<span style="padding-left:15px;"><strong>unsigned int </strong>x = 42;</span><br />
<span style="padding-left:15px;"><strong>long </strong>y = -10;</span><br />
<span style="padding-left:15px;">printf("%s\n", (x &gt; y ? "Hello, normal world!" : "Hello, strange world!"));</span><br />
<span style="padding-left:15px;"><strong>return </strong>0;</span><br />
}</code></p>
<p>I&#8217;ll return to this example later. For now, consider the function for averaging a set of readings stored in an array that I examined in my previous two posts (ArC-specific bits are shown in <span style="color:#008000;">green</span>):</p>
<p style="padding-left:30px;"><code>int16_t average(<strong>const </strong>int16_t * <span style="color:#008000;"><strong>array </strong></span>readings, size_t numReadings)<br />
<span style="color:#008000;"><strong>pre</strong>(readings.lwb == 0; readings.upb == numReadings)<br />
<strong>pre</strong>(numReadings != 0)<br />
<strong>returns</strong>((+ <strong>over </strong>readings.all)/numReadings)</span><br />
{<br />
<span style="padding-left:15px;"><strong>int </strong>sum = 0;</span><br />
<span style="padding-left:15px;">size_t i;</span><br />
<span style="padding-left:15px;"><strong>for </strong>(i = 0; i &lt; numReadings; ++i)</span><br />
<span style="padding-left:15px;color:#008000;"><strong>keep</strong>(i &lt;= numReadings)</span><br />
<span style="padding-left:15px;"><span style="color:#008000;"><strong>keep</strong>(sum == + <strong>over </strong>readings.all.take(i))</span></span><br />
<span style="padding-left:15px;color:#008000;"><strong>decrease</strong>(numReadings - i)</span><br />
<span style="padding-left:15px;">{</span><br />
<span style="padding-left:30px;">sum += readings[i];</span><br />
<span style="padding-left:15px;">}</span><br />
<span style="padding-left:15px;"><strong>return </strong>(int16_t)(sum/numReadings);</span><br />
}</code></p>
<p>I reported in the earlier post that ArC is able to verify this, apart from proving absence of integer overflow. Here are the verification warnings that ArC reports:</p>
<p style="font-size:smaller;color:#ff0000;">c:\escher\arctest\arctest14.c (17,13): Warning! Exceeded time limit proving: Arithmetic result of operator &#8216;+&#8217; is within limit of type &#8216;int&#8217;, suggestion available (see C:\Escher\ArcTest\arctest14_unproven.htm#9), did not prove: minof(int) &lt;= (sum$loopstart_5398,5$ + readings[i$loopstart_5398,5$ - readings.lwb]).</p>
<p style="font-size:smaller;color:#ff0000;">c:\escher\arctest\arctest14.c (17,13): Warning! Exceeded time limit proving: Arithmetic result of operator &#8216;+&#8217; is within limit of type &#8216;int&#8217;, suggestion available (see C:\Escher\ArcTest\arctest14_unproven.htm#10), did not prove: (sum$loopstart_5398,5$ + readings[i$loopstart_5398,5$ - readings.lwb]) &lt;= maxof(int).</p>
<p style="font-size:smaller;color:#ff0000;">c:\escher\arctest\arctest14.c (19,22): Warning! Unable to prove: Type constraint satisfied in implicit conversion from &#8216;int&#8217; to &#8216;unsigned int&#8217;, suggestion available (see C:\Escher\ArcTest\arctest14_unproven.htm#18), did not prove: minof(unsigned int) &lt;= sum$5398,5$.</p>
<p>Any time we do an explicit or implicit conversion from one type to another type that is not wider than the original or is defined by a <a href="http://critical.eschertech.com/2010/02/26/using-constrained-types-in-c/" target="_blank">constrained typedef</a>, or we do certain arithmetic operations, we need to guard against integer overflow. This example contains several such operations.</p>
<p>The first implicit type conversion is when we initialize the loop counter <em>i</em> to the constant 0. We declared <em>i</em> to be of type <strong>size_t</strong>, which is an unsigned type, whereas the constant 0 has type (signed) <strong>int </strong>because we didn&#8217;t use the U suffix. So there is an implicit conversion from <strong>int </strong>to<strong> size_t</strong> here, and ArC will generate verification conditions that the constant 0 is in the range of <strong>size_t</strong>. Of course, the proof is trivial.</p>
<p>The other implicit type conversion is in the subexpression <em>sum/numReadings</em> in the <strong>return </strong>statement. Since <em>numReadings </em>has type <strong>size_t</strong> and <em>sum </em>has type <strong>int</strong>, the &#8220;usual arithmetic conversions&#8221; will be applied. In practice, <strong>size_t</strong> is never smaller than <strong>int</strong>, so there will be an implicit conversion of <em>sum </em>to <strong>size_t</strong>. ArC therefore generates the verification condition that the current value of <em>sum </em>is representable in<strong> size_t</strong>. As there is nothing to prevent <em>sum </em>from being negative, the proof fails &#8211; hence ArC emits the third warning message listed above.</p>
<p>In fact, ArC &#8211; like most ordinary static analysers and some compilers &#8211; warns about the signed/unsigned mismatch even before starting the proofs. Avoiding signed/unsigned mismatches in the first place is probably a more efficient strategy than dealing with the verification failures that often occur when you don&#8217;t. Had I followed my own advice to <a href="http://critical.eschertech.com/2010/04/07/danger-unsigned-types-used-here/" target="_blank">avoid using unsigned types</a>, the signed/unsigned mismatch would  not have arisen.</p>
<p>To fix the error, we can either change the type of the parameter <em>numReadings </em>to <strong>int</strong>, or cast it to type <strong>int </strong>before using it as the divisor. If we do the latter, we&#8217;ll also need to add a function precondition that <em>numReadings </em>is not too large to fit in an <strong>int</strong>.</p>
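<p>As a rough illustration in plain C (not ArC output), the sketch below contrasts the mixed signed/unsigned division with the cast version. The function names <em>divide_mixed</em> and <em>divide_signed</em> are hypothetical, and the sketch assumes a platform where <strong>size_t</strong> is at least as wide as <strong>int</strong>.</p>

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Mixed division: sum is implicitly converted to size_t before dividing
   (assuming size_t is at least as wide as int), so a negative sum
   becomes a very large unsigned value. */
size_t divide_mixed(int sum, size_t numReadings)
{
    assert(numReadings != 0);
    return sum / numReadings;
}

/* Signed division: the divisor is cast to int, so the caller must
   guarantee that it fits in an int (the extra precondition). */
int divide_signed(int sum, size_t numReadings)
{
    assert(numReadings != 0);
    assert(numReadings <= (size_t)INT_MAX);
    return sum / (int)numReadings;
}
```

<p>With a negative sum, the mixed version returns a huge positive quotient, whereas the signed version returns the quotient one would expect.</p>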
<p>For the integer operators +, - and *, it can be the case that applying the &#8220;usual arithmetic conversions&#8221; to an operator expression does not involve an unsafe implicit type conversion, yet the arithmetic result is not representable in the result type. The C standard leaves the behaviour undefined in this case if the promoted operands are signed, and defines the result as the arithmetic result modulo some <em>N </em>if they are unsigned. ArC treats it as an error in either case, because unsigned types are frequently used where modulo arithmetic is not the user&#8217;s intention; so it always generates verification conditions that the result is representable in the result type. This is the reason for the first two verification failures that ArC reports for our original example. The expression <span style="color:#ff0000;"><code>sum  += readings[i]</code></span> could easily overflow, because the sum could be as high as <em>maxof(int16_t) * maxof(size_t)</em> or as low as <em>minof(int16_t) * maxof(size_t)</em>, which lie above <em>maxof(int)</em> and below <em>minof(int)</em> respectively. To avoid overflow, we will need to constrain either the minimum and maximum values of each reading, or the number of readings. Here&#8217;s our example with the precondition changed to limit the number of readings:</p>
<p style="padding-left:30px;"><code>int16_t average(<strong>const </strong>int16_t * <span style="color:#008000;"><strong>array </strong></span>readings, size_t numReadings)<br />
<span style="color:#008000;"><strong>pre</strong>(readings.lwb == 0; readings.lim == numReadings)<br />
<strong>pre</strong>(numReadings != 0; <span style="color:#ff0000;">numReadings &lt;= maxof(<strong>int</strong>)/(maxof(int16_t) + 1)</span>)<br />
<strong>returns</strong>((+ <strong>over </strong>readings.all)/numReadings)</span><br />
{<br />
<span style="padding-left:15px;"><strong>int </strong>sum = 0;</span><br />
<span style="padding-left:15px;">size_t i;</span><br />
<span style="padding-left:15px;"><strong>for </strong>(i = 0; i &lt; numReadings; ++i)</span><br />
<span style="padding-left:15px;color:#008000;"><strong>keep</strong>(i &lt;= numReadings)</span><br />
<span style="padding-left:15px;"><span style="color:#008000;"><strong>keep</strong>(sum == + <strong>over </strong>readings.all.take(i))</span></span><br />
<span style="padding-left:15px;color:#008000;"><strong>decrease</strong>(numReadings - i)</span><br />
<span style="padding-left:15px;">{</span><br />
<span style="padding-left:30px;">sum += readings[i];</span><br />
<span style="padding-left:15px;">}</span><br />
<span style="padding-left:15px;"><strong>return </strong>(int16_t)(sum/<span style="color:#ff0000;">(<strong>int</strong>)</span>numReadings);</span><br />
}</code></p>
<p>I&#8217;ve referred to the maximum value supported by type <strong>int </strong>as <em>maxof(<strong>int</strong>)</em>.  ArC provides type operators <em>maxof(T)</em> and <em>minof(T)</em> for  your convenience, to save you from having to #include &lt;limits.h&gt;  or similar just for specification purposes. I&#8217;ve also cast <em>numReadings </em>to <strong>int </strong>prior to using it in the division.</p>
<p>With the compiler parameters set to assume <strong>int </strong>is 32 bits and <strong>short int</strong> is 16 bits, ArC verifies this example completely. You may be wondering why the expression <code>(+ <strong>over </strong>readings.all)/numReadings</code> in the <strong>returns </strong>specification doesn&#8217;t also generate a signed/unsigned warning or verification failure. That&#8217;s because integer arithmetic in specifications is always carried out after promoting operands to type <strong>integer</strong>, the ArC integral type that has no bounds.</p>
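<p>The numeric bound in the precondition can also be written as an ordinary run-time check in plain C. The following is only a sketch: <em>max_safe_readings</em> and <em>average_checked</em> are hypothetical names, <em>assert</em> stands in for the ArC preconditions, and the limits come from &lt;limits.h&gt; and &lt;stdint.h&gt; rather than <em>maxof</em>.</p>

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Largest numReadings for which the running sum cannot overflow int.
   Computed in long so that INT16_MAX + 1 itself cannot overflow. */
size_t max_safe_readings(void)
{
    return (size_t)(INT_MAX / ((long)INT16_MAX + 1));
}

int16_t average_checked(const int16_t *readings, size_t numReadings)
{
    /* Run-time stand-ins for the two pre(...) clauses. */
    assert(readings != NULL);
    assert(numReadings != 0 && numReadings <= max_safe_readings());

    int sum = 0;
    for (size_t i = 0; i < numReadings; ++i) {
        sum += readings[i];   /* cannot overflow, given the bound above */
    }
    /* The cast to int is safe because numReadings <= INT_MAX here. */
    return (int16_t)(sum / (int)numReadings);
}
```

<p>Unlike the ArC precondition, which is discharged by proof at every call site, these assertions only catch a violation when the offending call actually happens.</p>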
<p>Finally, let&#8217;s return to the short program I invited you to consider at the start of this post. Is the world normal or strange? On my computer, compiling with Microsoft Visual C++ 2008, it&#8217;s strange. That&#8217;s because for this compiler, both (unsigned) <strong>int </strong>and <strong>long </strong>are 32 bits. So both operands of less-than get converted to <strong>unsigned long</strong> &#8230; and -10 won&#8217;t fit! For the world to be normal, you&#8217;ll need to use a compiler for which <strong>long </strong>is larger than <strong>int</strong>, so that both operands get converted to (signed) <strong>long </strong>instead.</p>
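<p>The effect can be reproduced with a comparison between a <strong>long</strong> and an <strong>unsigned int</strong>. The sketch below is a reconstruction for illustration, not the program from the start of the post; the names <em>world_is_normal</em>, <em>long_holds_all_unsigned</em> and <em>check_world</em> are hypothetical.</p>

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if a < b gives the mathematically expected answer path:
   the usual arithmetic conversions decide which type is used. */
int world_is_normal(long a, unsigned int b)
{
    /* If long can represent every unsigned int value, b is converted
       to long. Otherwise both operands are converted to unsigned long,
       and a negative a becomes a very large value. */
    return a < b;
}

/* Predicts the outcome for negative a from the type ranges alone. */
int long_holds_all_unsigned(void)
{
#if LONG_MAX >= UINT_MAX
    return 1;
#else
    return 0;
#endif
}

/* The world is normal for -10 < 5 exactly when long is wide enough. */
void check_world(void)
{
    assert(world_is_normal(-10, 5u) == long_holds_all_unsigned());
}
```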
]]></content:encoded>
			<wfw:commentRss>http://critical.eschertech.com/2010/06/07/verifying-absence-of-integer-overflow/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9efed2bd9429eac89f62a336b6d05174?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">davidcrocker</media:title>
		</media:content>
	</item>
		<item>
		<title>Specification with Ghost Functions</title>
		<link>http://critical.eschertech.com/2010/05/26/specification-with-ghost-functions/</link>
		<comments>http://critical.eschertech.com/2010/05/26/specification-with-ghost-functions/#comments</comments>
		<pubDate>Wed, 26 May 2010 17:58:12 +0000</pubDate>
		<dc:creator>davidcrocker</dc:creator>
				<category><![CDATA[C and C++ in critical systems]]></category>
		<category><![CDATA[Formal verification of C programs]]></category>
		<category><![CDATA[formal specification]]></category>

		<guid isPermaLink="false">http://critical.eschertech.com/?p=381</guid>
		<description><![CDATA[In my previous post I showed that the C expression sublanguage extended with quantified expressions (forall and exists) is insufficient to allow some specifications to be expressed. I presented this function (annotated with an incomplete specification) to average an array of data: int16_t average(const int16_t * array readings, size_t numReadings) pre(readings.lwb == 0; readings.lim == [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=critical.eschertech.com&amp;blog=11762912&amp;post=381&amp;subd=davidcrocker&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In my previous post I showed that the C expression sublanguage extended with quantified expressions (<strong>forall </strong>and <strong>exists</strong>) is insufficient to allow some specifications to be expressed. I presented this function (annotated with an incomplete specification) to average an array of data:</p>
<p style="padding-left:30px;"><code>int16_t average(<strong>const </strong>int16_t * <span style="color:#008000;"><strong>array </strong></span>readings, size_t numReadings)<br />
<span style="color:#008000;"><strong>pre</strong>(readings.lwb == 0; readings.lim == numReadings)<br />
<strong>pre</strong>(numReadings != 0)<br />
<strong>post</strong>(<strong>result </strong>== ? <span style="color:#993300;"><em>/* sum of elements of readings */</em></span> /numReadings)</span><br />
{<br />
<span style="padding-left:15px;"><strong>int </strong>sum = 0;</span><br />
<span style="padding-left:15px;">size_t i;</span><br />
<span style="padding-left:15px;"><strong>for </strong>(i = 0; i &lt; numReadings; ++i)</span><br />
<span style="padding-left:15px;color:#008000;"><strong>keep</strong>(i &lt;= numReadings)</span><br />
<span style="padding-left:15px;color:#008000;"><strong>keep</strong>(sum == ? <span style="color:#993300;"><em>/* sum of first i elements of readings */</em></span> )</span><br />
<span style="padding-left:15px;color:#008000;"><strong>decrease</strong>(numReadings - i)</span><br />
<span style="padding-left:15px;">{</span><br />
<span style="padding-left:30px;">sum += readings[i];</span><br />
<span style="padding-left:15px;">}</span><br />
<span style="padding-left:15px;"><strong>return </strong>(int16_t)(sum/numReadings);</span><br />
}</code></p>
<p>The two question marks need to be replaced by expressions for sums over the appropriate elements. Last time I showed how we can replace them by <strong>over</strong>-expressions. In this post, I&#8217;ll describe an alternative solution that has more general applications.</p>
<p>The problem we are faced with is that there is no C expression (even if we include <strong>forall </strong>and <strong>exists </strong>expressions) that can express the sum of some of the elements of an array, where the number of elements is not statically known. We can calculate the sum using a loop; but a loop is not an expression and so cannot be used in specifications. However, specification expressions can contain function calls, provided that the functions have no side-effects. We can resolve our problem by defining a <em>ghost function</em> that calculates the sum. We&#8217;ll specify the function recursively so that we can handle the variable number of elements.</p>
<p>In the postcondition of <em>average</em>, we need to calculate the sum of all the elements of the <em>readings </em>array. In the loop invariant, we need the sum of just the first <em>i</em> elements. So let&#8217;s declare a function that returns the sum of the first <em>n </em>elements of an array <em>a</em>, where <em>a</em> and <em>n </em>are parameters. We define the sum to be zero if <em>n </em>is zero, otherwise it&#8217;s the sum of the first <em>n-1</em> elements plus the <em>n</em>th element. Here&#8217;s our ghost function specification:</p>
<p style="padding-left:30px;"><code>#<strong>ifdef </strong>__ARC__<br />
<span style="padding-left:15px;"><strong>ghost integer </strong>sumOf(<strong>const </strong>int16_t * <strong>array </strong>a, size_t n)</span><br />
<span style="padding-left:15px;"><strong>pre</strong>(a.lwb == 0; n &lt;= a.lim)</span><br />
<span style="padding-left:15px;"><strong>decrease</strong>(n)</span><br />
<span style="padding-left:15px;"><strong>returns</strong>(n == 0 ? 0 : a[n - 1] + sumOf(a, n - 1));</span><br />
#<strong>endif</strong></code></p>
<p>Declaring the function <strong>ghost </strong>tells ArC that it is for use in specifications only. A ghost function must have a specification but no body. It can also do a few things that wouldn&#8217;t be allowed for a normal function. In particular, it can use types not normally allowed in C, such as <strong>integer</strong>, which is ArC&#8217;s name for the type of unbounded integers. I&#8217;ve declared the function as returning <strong>integer </strong>so that neither we nor ArC need worry about arithmetic overflow in the specification.</p>
<p>The precondition defines the conditions for the function to be  well-behaved, as usual. I&#8217;ve specified the value that the function returns using a postcondition of the form <strong>returns</strong>(<em>expression</em>). This has the same meaning as <strong>post</strong>(<strong>result </strong>== <em>expression</em>), but it&#8217;s shorter. Also, we want to define the function recursively, and ArC doesn&#8217;t currently allow a recursive call to the function being specified inside <strong>post</strong>(&#8230;) but it does inside <strong>returns</strong>(&#8230;). I&#8217;ve enclosed the whole function declaration in <strong>#ifdef __ARC__</strong> &#8230; <strong>#endif</strong> so that the C  compiler doesn&#8217;t see it.</p>
<p>Recursion would normally be forbidden in critical software (MISRA rule 16.2). However, recursive calls in specifications are safe because they don&#8217;t carry any risk of run-time stack overflow. To ensure that the specification makes sense, ArC will need to prove that the recursion is bounded. That&#8217;s what the <strong>decrease </strong>clause is for, and it works exactly as if it were a loop variant &#8211; that is, it must decrease on each recursive call and it must not go negative.</p>
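<p>A ghost function has no body, but the same recursive definition can be written as ordinary C for experimenting or testing. This is only a sketch under assumptions: <em>sum_of</em> is a hypothetical name, and <strong>int64_t</strong> stands in for ArC&#8217;s unbounded <strong>integer</strong> type, which C does not have.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Executable stand-in for the ghost function: the sum of the first n
   elements of a. int64_t replaces the unbounded integer type. */
int64_t sum_of(const int16_t *a, size_t n)
{
    assert(a != NULL || n == 0);
    /* n plays the role of decrease(n): it shrinks by one on each
       recursive call and the recursion stops when it reaches zero. */
    return n == 0 ? 0 : a[n - 1] + sum_of(a, n - 1);
}
```

<p>Because this version really does recurse at run time, it is suitable only for test code on a host machine, not for the critical software itself.</p>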
<p>Having defined <em>sumOf</em>, we can complete the postcondition and loop invariant:</p>
<p style="padding-left:30px;"><code>int16_t average(<strong>const </strong>int16_t * <span style="color:#008000;"><strong>array </strong></span>readings, size_t numReadings)<br />
<span style="color:#008000;"><strong>pre</strong>(readings.lwb == 0; readings.lim == numReadings)<br />
<strong>pre</strong>(numReadings != 0)<br />
<span style="color:#ff0000;"><strong>returns</strong>(sumOf(readings, numReadings)/numReadings)</span> </span><br />
{<br />
<span style="padding-left:15px;"><strong>int </strong>sum = 0;</span><br />
<span style="padding-left:15px;">size_t i;</span><br />
<span style="padding-left:15px;"><strong>for </strong>(i = 0; i &lt; numReadings; ++i)</span><br />
<span style="padding-left:15px;color:#008000;"><strong>keep</strong>(i &lt;= numReadings)</span><br />
<span style="padding-left:15px;color:#ff0000;"><strong>keep</strong>(sum == sumOf(readings, i))</span><br />
<span style="padding-left:15px;color:#008000;"><strong>decrease</strong>(numReadings - i)</span><br />
<span style="padding-left:15px;">{</span><br />
<span style="padding-left:30px;">sum += readings[i];</span><br />
<span style="padding-left:15px;">}</span><br />
<span style="padding-left:15px;"><strong>return </strong>(int16_t)(sum/numReadings);</span><br />
}</code></p>
<p>I&#8217;ve taken the liberty of replacing the original <strong>post</strong>(&#8230;) postcondition of <em>average</em> with the <strong>returns</strong>(&#8230;) form.</p>
<p>Just like the version of <em>average </em>I gave in my previous post, this one verifies completely except for possible overflow during integer arithmetic. I&#8217;ll show how we can deal with that next time.</p>
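<p>For intuition, the loop annotations can be mimicked in plain C as run-time assertions. This is a testing sketch, not what ArC does (ArC proves the conditions statically). The names <em>spec_sum</em> and <em>average_with_checks</em> are hypothetical, the divisor is cast to <strong>int</strong> to keep the division signed, and the sketch assumes inputs small enough that <em>sum</em> does not overflow.</p>

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Reference sum of the first n elements, standing in for sumOf. */
static int64_t spec_sum(const int16_t *a, size_t n)
{
    int64_t total = 0;
    for (size_t k = 0; k < n; ++k) {
        total += a[k];
    }
    return total;
}

int16_t average_with_checks(const int16_t *readings, size_t numReadings)
{
    assert(numReadings != 0 && numReadings <= (size_t)INT_MAX);
    int sum = 0;
    size_t i;
    for (i = 0; i < numReadings; ++i) {
        assert(i <= numReadings);                 /* keep(i <= numReadings) */
        assert(sum == spec_sum(readings, i));     /* keep(sum == sumOf(readings, i)) */
        size_t variant_before = numReadings - i;  /* decrease(numReadings - i) */
        sum += readings[i];
        /* The variant must be smaller after this iteration. */
        assert(numReadings - (i + 1) < variant_before);
    }
    /* On exit i == numReadings, so the invariant gives the full sum. */
    assert(sum == spec_sum(readings, numReadings));
    return (int16_t)(sum / (int)numReadings);
}
```

<p>Re-computing the reference sum on every iteration makes this quadratic in the number of readings, which is acceptable in a test harness but not in production code.</p>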
]]></content:encoded>
			<wfw:commentRss>http://critical.eschertech.com/2010/05/26/specification-with-ghost-functions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9efed2bd9429eac89f62a336b6d05174?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">davidcrocker</media:title>
		</media:content>
	</item>
		<item>
		<title>Expressing the Inexpressible</title>
		<link>http://critical.eschertech.com/2010/05/24/expressing-the-inexpressible/</link>
		<comments>http://critical.eschertech.com/2010/05/24/expressing-the-inexpressible/#comments</comments>
		<pubDate>Mon, 24 May 2010 16:48:20 +0000</pubDate>
		<dc:creator>davidcrocker</dc:creator>
				<category><![CDATA[C and C++ in critical systems]]></category>
		<category><![CDATA[Formal verification of C programs]]></category>
		<category><![CDATA[formal specification]]></category>

		<guid isPermaLink="false">http://critical.eschertech.com/?p=366</guid>
		<description><![CDATA[When writing preconditions, postconditions and other specifications for C programs, sometimes we need to write expressions that can&#8217;t be expressed in plain C. That&#8217;s why formal verification systems based on annotated programming languages almost always augment the expression sublanguage with forall and exists expressions. In previous posts, I&#8217;ve introduced ArC&#8217;s implementations of these. For example, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=critical.eschertech.com&amp;blog=11762912&amp;post=366&amp;subd=davidcrocker&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>When writing preconditions, postconditions and other specifications for C programs, sometimes we need to write expressions that can&#8217;t be expressed in plain C. That&#8217;s why formal verification systems based on annotated programming languages almost always augment the expression sublanguage with <strong>forall </strong>and <strong>exists </strong>expressions. In previous posts, I&#8217;ve introduced ArC&#8217;s implementations of these. For example, the following expression yields <strong>true </strong>if all elements of the array <em>arr </em>are between 0 and 100 inclusive:</p>
<p style="padding-left:30px;"><code><strong>forall </strong>ind <strong>in </strong>arr.indices :- arr[ind] &gt;= 0 &amp;&amp; arr[ind] &lt;= 100</code></p>
<p>Here, <em>ind </em>is declared as a bound variable that ranges over the values in the expression that follows the keyword <strong>in</strong>, which in this case is all the indices into <em>arr</em>.</p>
<p>Similarly, this expression:</p>
<p style="padding-left:30px;"><code><strong>exists </strong>ind <strong>in </strong>0..9 :- arr[ind] &gt;= 0 &amp;&amp; arr[ind]  &lt;= 100</code></p>
<p>yields <strong>true </strong>if at least one of the first ten elements is in the range 0 to 100. I&#8217;ve used one of ArC&#8217;s special operators here: the range operator &#8220;..&#8221;, which yields a sequence of values from its first operand up to its second operand. In fact, in the first example,<em> arr.indices</em> is defined as <em>arr.lwb .. arr.upb</em>, so I was implicitly using the range operator there too.</p>
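<p>In executable C, bounded quantifiers like these correspond to simple loops. The sketch below is an analogy rather than ArC syntax; <em>all_in_range</em> and <em>any_in_range</em> are hypothetical helpers that take the element count explicitly, since a C pointer does not carry its bounds.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* forall: true if every one of the first n elements lies in lo..hi.
   Vacuously true when n is zero. */
bool all_in_range(const int *arr, size_t n, int lo, int hi)
{
    assert(arr != NULL || n == 0);
    for (size_t ind = 0; ind < n; ++ind) {
        if (arr[ind] < lo || arr[ind] > hi) {
            return false;   /* found a counterexample */
        }
    }
    return true;
}

/* exists: true if at least one of the first n elements lies in lo..hi.
   False when n is zero. */
bool any_in_range(const int *arr, size_t n, int lo, int hi)
{
    assert(arr != NULL || n == 0);
    for (size_t ind = 0; ind < n; ++ind) {
        if (arr[ind] >= lo && arr[ind] <= hi) {
            return true;    /* found a witness */
        }
    }
    return false;
}
```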
<p>Providing <strong>forall </strong>and <strong>exists </strong>to quantify over finite sets and sequences lets us express many sorts of specification, but isn&#8217;t always enough. Very occasionally, we need to use <strong>forall </strong>or <strong>exists </strong>to quantify over a potentially infinite type, which ArC also allows. For example, the following expression yields <strong>true </strong>if <em>p(x)</em> is true for all values of <em>x</em> belonging to type <em>T</em>:</p>
<p style="padding-left:30px;"><code><strong>forall </strong>T x :- p(x)</code></p>
<p>However, there are many cases in which a specification still cannot be expressed. For example, consider the following function for averaging a set of readings stored in an array:</p>
<p style="padding-left:30px;"><code>int16_t average(<strong>const </strong>int16_t * <span style="color:#008000;"><strong>array </strong></span>readings, size_t numReadings)<br />
<span style="color:#008000;"><strong>pre</strong>(readings.lwb == 0; readings.lim == numReadings)<br />
<strong>pre</strong>(numReadings != 0)<br />
<strong>post</strong>(<strong>result </strong>== ? <span style="color:#993300;"><em>/* sum of elements of readings */</em></span> </span></code><code><span style="color:#008000;">/numReadings</span></code><code><span style="color:#008000;">) </span><br />
{<br />
<span style="padding-left:15px;"><strong>int </strong>sum = 0;</span><br />
<span style="padding-left:15px;">size_t i;</span><br />
<span style="padding-left:15px;"><strong>for </strong>(i = 0; i &lt; numReadings; ++i)</span><br />
<span style="padding-left:15px;color:#008000;"><strong>keep</strong>(i &lt;= numReadings)</span><br />
<span style="padding-left:15px;color:#008000;"><strong>keep</strong>(sum == ? <span style="color:#993300;"><em>/* sum of first i elements of readings */</em></span> )</span><br />
<span style="padding-left:15px;color:#008000;"><strong>decrease</strong>(numReadings - i)</span><br />
<span style="padding-left:15px;">{</span><br />
<span style="padding-left:30px;">sum += readings[i];</span><br />
<span style="padding-left:15px;">}</span><br />
<span style="padding-left:15px;"><strong>return </strong>(int16_t)(sum/numReadings);</span><br />
}</code></p>
<p>I&#8217;ve included some ArC annotations (highlighted in <span style="color:#008000;">green text</span>) in this example, to specify that the valid indices of <em>readings </em>are<em> 0..(numReadings &#8211; 1)</em>, and that <em>numReadings </em>isn&#8217;t zero so that the final division operation will be valid. I&#8217;ve also provided a loop invariant and loop variant (see my earlier posts on verifying loops if you aren&#8217;t familiar with these). However, in the postcondition I&#8217;ve put a question mark where I need to express &#8220;sum of all elements of <em>readings</em>&#8221;, and in the loop invariant I&#8217;ve put another question mark where I want &#8220;sum of the first <em>i</em> elements of <em>readings</em>&#8221;. How can we express these quantities?</p>
<p>In this case, there is an easy way, and another way that is less easy but more general. Let&#8217;s start with the easy way. ArC is derived from <a title="Perfect Developer" href="http://www.eschertech.com/products/perfect_developer.php" target="_blank">Perfect Developer</a>, and when writing ArC specifications you can use most of the library types and expression types provided by PD. In particular, type <em>seq of T</em> from PD is treated as equivalent to <em>T[]</em> in C. So we can use <em>Perfect </em>sequence operations on C arrays. A list of member functions of <em>seq of T</em> can be found in the <a title="Perfect Developer Library Reference" href="http://www.eschertech.com/product_documentation/Language%20Reference/LibraryReference.htm#seq" target="_blank">Perfect Developer Library Reference</a>. In PD, most of these functions are available for use anywhere, since code can be generated for them; but when used in ArC, they are of course all &#8220;ghost&#8221; functions &#8211; that is, functions that can be used in specifications only.</p>
<p>The particular <em>Perfect </em>expression type we need here is the <strong>over </strong>expression, which expresses repeated application of an operator over the elements of a collection. Those with a background in functional programming may recognise it as <em>left-fold</em>. Our postcondition can be written:</p>
<p style="padding-left:30px;"><span style="color:#008000;"><code><strong>post</strong>(<strong>result </strong>== (+ <strong>over </strong>readings.all)/numReadings)</code></span></p>
<p>We&#8217;ve had to use <em>readings.all</em> rather than just <em>readings </em>as the operand of +<strong>over</strong>, because <em>readings </em>is an array pointer, and we need to provide a genuine array to +<strong>over</strong>. Hence ArC provides the array pointer type with ghost member <em>all</em>. You can think of<em> readings.all</em> as yielding the sequence <em>readings[readings.lwb]</em>, <em>readings[readings.lwb + 1]</em> and so on up to <em>readings[readings.upb]</em>. We&#8217;d have liked to use *<em>readings </em>to mean the array that <em>readings </em>points to, but of course in C that just yields the value of the first element.</p>
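<p>For readers more at home in C than in functional languages, a left fold over an array can be sketched as follows. This is an analogy only: <em>fold_left</em>, <em>add</em> and <em>sum_over</em> are hypothetical names, the fold is assumed to start from 0 for addition, and <strong>int64_t</strong> stands in for an unbounded integer.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t (*binary_op)(int64_t, int64_t);

static int64_t add(int64_t x, int64_t y)
{
    return x + y;
}

/* Left fold: combines the first n elements of a, from left to right,
   starting from the given initial value. */
int64_t fold_left(binary_op op, int64_t initial, const int16_t *a, size_t n)
{
    assert(op != NULL);
    assert(a != NULL || n == 0);
    int64_t acc = initial;
    for (size_t k = 0; k < n; ++k) {
        acc = op(acc, a[k]);
    }
    return acc;
}

/* Analogue of a "+ over" expression applied to n elements. */
int64_t sum_over(const int16_t *a, size_t n)
{
    return fold_left(add, 0, a, n);
}
```

<p>Passing the whole element count gives the sum of the entire array, and passing a smaller count gives the sum of a prefix.</p>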
<p>For the loop invariant expression, we can use +<strong>over </strong>again, but we need to apply it to the first <em>i</em> elements of <em>readings </em>rather than all elements. The <em>seq of T</em> class provides member <em>take </em>for this purpose, allowing us to use:</p>
<p style="padding-left:30px;"><span style="color:#008000;"><code><strong>keep</strong>(sum == + <strong>over </strong>readings.all.take(i))</code></span></p>
<p>That&#8217;s enough to verify our function, apart from dealing with a potential integer overflow when summing the elements, which I&#8217;ll return to in a later post. Next time I&#8217;ll demonstrate how we can use a <em>ghost function</em> to define the notion of summation without recourse to  <strong>over</strong>-expressions.</p>
]]></content:encoded>
			<wfw:commentRss>http://critical.eschertech.com/2010/05/24/expressing-the-inexpressible/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9efed2bd9429eac89f62a336b6d05174?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">davidcrocker</media:title>
		</media:content>
	</item>
		<item>
		<title>Verifying a binary search, part 2</title>
		<link>http://critical.eschertech.com/2010/05/06/verifying-a-binary-search-part-2/</link>
		<comments>http://critical.eschertech.com/2010/05/06/verifying-a-binary-search-part-2/#comments</comments>
		<pubDate>Thu, 06 May 2010 19:31:38 +0000</pubDate>
		<dc:creator>davidcrocker</dc:creator>
				<category><![CDATA[C and C++ in critical systems]]></category>
		<category><![CDATA[Formal verification of C programs]]></category>
		<category><![CDATA[formal verification]]></category>

		<guid isPermaLink="false">http://critical.eschertech.com/?p=351</guid>
		<description><![CDATA[In my last entry I showed how to use a correct-by-construction approach to develop a binary search function. We got as far as specifying the function and the loop, but we left the loop body undefined. The function declaration looked like this: size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key) pre(table.lwb == 0; table.lim [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=critical.eschertech.com&amp;blog=11762912&amp;post=351&amp;subd=davidcrocker&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In my last entry I showed how to use a correct-by-construction approach to develop a binary search function. We got as far as specifying the function and the loop, but we left the loop body undefined. The function declaration looked like this:</p>
<p style="padding-left:30px;"><code>size_t    bSearch(<strong>const </strong>LinEntry*  <strong>array </strong>table,  size_t nElems,    uint16_t key)<br />
<strong>pre</strong>(table.lwb   == 0; table.lim == nElems)<br />
<strong>pre</strong>(<strong>forall </strong>a <strong>in </strong>table.indices; b <strong>in </strong>table.indices<br />
<span style="padding-left:45px;">:- b &gt; a =&gt;  table[b].raw &gt; table[a].raw)<br />
<strong>post</strong>(<strong>result </strong>&lt;= nElems)<br />
<strong>post</strong>(<strong>forall </strong>i <strong>in </strong>0..(<strong>result </strong>- 1) :- key &gt;= table[i].raw)<br />
<strong>post</strong>(<strong>forall </strong>i <strong>in result</strong>..(nElems - 1) :- key &lt;= table[i].raw);</span></code></p>
<p>and the function definition like this:</p>
<p style="padding-left:30px;"><code>size_t   bSearch(<strong>const </strong>LinEntry*  <strong>array </strong>table,   size_t nElems,   uint16_t key) {<br />
<span style="padding-left:15px;">size_t low = 0, high = nElems;</span><br />
<span style="padding-left:15px;"><strong>while </strong>(high != low)</span><br />
<span style="padding-left:15px;"><strong>keep</strong>(high &gt;= low)</span><br />
<span style="padding-left:15px;"><strong>keep</strong>(high &lt;= nElems)</span><br />
<span style="padding-left:15px;"><strong>keep</strong>(<strong>forall </strong>i <strong>in </strong>0..(low - 1) :- table[i].raw &lt;= key)</span><br />
<span style="padding-left:15px;"><strong>keep</strong>(<strong>forall </strong>i <strong>in </strong>high..(nElems - 1) :- table[i].raw &gt;= key)</span><br />
<span style="padding-left:15px;"><strong>decrease</strong>(high - low) {</span><br />
<span style="padding-left:30px;">low = ?; high = ?;</span><br />
<span style="padding-left:15px;">}</span><br />
<span style="padding-left:15px;"><strong>return </strong>low;</span><br />
}<br />
</code></p>
<p>ArC verified the program, apart from the incomplete loop body. So we just need to write loop body code that preserves the four loop invariants (the expressions in the <strong>keep </strong>clauses), decreases the variant (the expression in the <strong>decrease </strong>clause), and leaves the variant non-negative unless the loop is about to terminate.</p>
<p>The classic way to do a binary search is to pick an index midway between <em>low </em>and <em>high</em>, compare the table item at that point with the key, and adjust either <em>low </em>or <em>high </em>to that index depending on the result. So here&#8217;s a first attempt:</p>
<p style="padding-left:30px;"><code><strong>const </strong>size_t mid = (low + high)/2;<br />
<strong>if </strong>(key &gt;= table[mid].raw) {<br />
<span style="padding-left:15px;">low = mid;</span><br />
} <strong>else </strong>{<br />
<span style="padding-left:15px;">high = mid;</span><br />
}</code></p>
<p>If we try to verify this using ArC, we get a single failed verification condition:</p>
<p><span style="color:#ff0000;font-size:smaller;">c:\escher\arctest\ex5.c (23,9): Warning! Exceeded time limit proving: Loop body establishes end condition or decreases variant (defined at c:\escher\arctest\ex5.c (21,5)) (see C:\Escher\ArcTest\ex5_unproven.htm#6), did not prove: (low$loopstart_7603,5$ + high$loopend$) &lt; (low$loopend$ + high$loopstart_7603,5$).<br />
</span></p>
<p>The prover has timed out trying to prove that the body either decreases the variant or leads to termination of the loop. There are four possibilities:</p>
<ol>
<li>There is a genuine problem &#8211; the code may loop indefinitely.</li>
<li>The code won&#8217;t loop indefinitely, but we need a different loop variant in order to make this provable.</li>
<li>The verification condition was simply too hard for the prover (this is always a possibility when the proof timed out). We could try increasing the prover timeout.</li>
<li>There is a bug in ArC.</li>
</ol>
<p>In this case, if we follow the link to the HTML proof report file, we see this:</p>
<p style="padding-left:30px;"><span style="color:#008080;"><strong>Could not prove any of:</strong><br />
!((1 + low<sub>loopstart_7603,5</sub>) == high<sub>loopstart_7603,5</sub>)<br />
&#8230;</span></p>
<p>So the prover thinks that the loop may not make any progress if <em>high == 1 + low</em> at the start of the loop. Sure enough, there is a genuine problem: in this situation, <em>mid == low</em>, so we may end up doing the assignment <em>low = low</em>, leading to an indefinite loop.</p>
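<p>The stall can be reproduced outside the verifier. The following is a minimal sketch in plain C of one iteration of the first-attempt loop body. The helper name <em>firstAttemptStep</em> and the one-field layout of <em>LinEntry</em> are assumptions made for illustration only; the function reports whether the variant <em>high - low</em> actually decreased.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: only the field used by the search is shown. */
typedef struct { uint16_t raw; } LinEntry;

/* Hypothetical helper: runs one iteration of the first-attempt loop body.
   Returns true if the variant (high - low) strictly decreased. */
bool firstAttemptStep(const LinEntry *table, uint16_t key,
                      size_t *low, size_t *high) {
    assert(*low < *high);                    /* the loop is only entered when high != low */
    const size_t before = *high - *low;
    const size_t mid = (*low + *high) / 2;   /* equals *low when *high == *low + 1 */
    if (key >= table[mid].raw) {
        *low = mid;                          /* no progress if mid == *low */
    } else {
        *high = mid;
    }
    return (*high - *low) < before;
}
```

<p>Calling this with <em>low == 0</em>, <em>high == 1</em> and a key that is not less than <em>table[0].raw</em> leaves both variables unchanged, which is exactly the case the prover flagged.</p>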
<p>One way to fix this is to avoid the situation <em>high = 1 + low</em> at the start of the loop. For example, we could change the while-condition to <em>high &gt; low + 1</em>. Returning <em>low </em>immediately after the end of the loop will no longer be correct, so we&#8217;ll need to do something different.</p>
<p>However, the loop invariant we have to maintain is that all elements up to and including <em>low &#8211; 1</em> hold raw values less than or equal to <em>key</em>. Therefore, having established that the element at <em>mid </em>is less than or equal to <em>key</em>, we can set <em>low </em>to <em>mid + 1</em> instead of <em>mid</em>, avoiding the problem. So the code now looks like this:</p>
<p style="padding-left:30px;"><code>size_t   bSearch(<strong>const </strong>LinEntry*  <strong>array </strong>table,   size_t nElems,   uint16_t key) {<br />
<span style="padding-left:15px;">size_t low = 0, high = nElems;</span><br />
<span style="padding-left:15px;"><strong>while </strong>(high != low)</span><br />
<span style="padding-left:15px;"><strong>keep</strong>(high &gt;= low)</span><br />
<span style="padding-left:15px;"><strong>keep</strong>(high &lt;= nElems)</span><br />
<span style="padding-left:15px;"><strong>keep</strong>(<strong>forall </strong>i <strong>in </strong>0..(low - 1) :- table[i].raw &lt;= key)</span><br />
<span style="padding-left:15px;"><strong>keep</strong>(<strong>forall </strong>i <strong>in </strong>high..(nElems - 1) :- table[i].raw &gt;= key)</span><br />
<span style="padding-left:15px;"><strong>decrease</strong>(high - low) {</span><br />
<span style="padding-left:30px;"><strong>const </strong>size_t mid = (low + high)/2;</span><br />
<span style="padding-left:30px;"><strong>if </strong>(key &gt;= table[mid].raw) {</span><br />
<span style="padding-left:45px;">low = mid<span style="color:#ff0000;"> + 1</span>;</span><br />
<span style="padding-left:30px;">} <strong>else </strong>{</span><br />
<span style="padding-left:45px;">high = mid;</span><br />
<span style="padding-left:30px;">}</span><br />
<span style="padding-left:15px;">}</span><br />
<span style="padding-left:15px;"><strong>return </strong>low;</span><br />
}<br />
</code></p>
<p>This is sufficient to make the function fully verifiable.</p>
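<p>For readers without the verifier to hand, here is a rough plain-C rendering of the same function in which each <strong>keep </strong>clause becomes a runtime <em>assert</em> and the <strong>decrease </strong>clause becomes a check that <em>high - low</em> shrank. The name <em>bSearchChecked</em> and the one-field <em>LinEntry</em> layout are assumptions. Runtime assertions only exercise the inputs you happen to test, so this is a sketch of what the invariants say, not a substitute for the proof.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: only the field used by the search is shown. */
typedef struct { uint16_t raw; } LinEntry;

/* Plain-C rendering of the verified function. The invariants are checked
   at the top of every iteration; the variant is checked at the bottom. */
size_t bSearchChecked(const LinEntry *table, size_t nElems, uint16_t key) {
    size_t low = 0, high = nElems;
    while (high != low) {
        assert(high >= low);                     /* keep(high >= low) */
        assert(high <= nElems);                  /* keep(high <= nElems) */
        for (size_t i = 0; i < low; ++i) {       /* keep(forall i in 0..(low - 1) ...) */
            assert(table[i].raw <= key);
        }
        for (size_t i = high; i < nElems; ++i) { /* keep(forall i in high..(nElems - 1) ...) */
            assert(table[i].raw >= key);
        }

        const size_t variantBefore = high - low;
        const size_t mid = (low + high) / 2;     /* low <= mid < high here */
        if (key >= table[mid].raw) {
            low = mid + 1;                       /* mid + 1 <= high, so high >= low still holds */
        } else {
            high = mid;
        }
        assert(high - low < variantBefore);      /* decrease(high - low) */
    }
    return low;
}
```

<p>With a table whose raw values are 10, 20, 30 and 40, a key of 25 returns 2 and a key of 45 returns 4, and none of the assertions fire along the way.</p>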
<p>I hope you will agree that this has demonstrated a correct-by-construction approach to writing loops, avoiding the dangers of non-termination and off-by-one errors. It&#8217;s all too easy to introduce off-by-one errors in C, because &#8211; unlike Ada &#8211; the C language doesn&#8217;t provide a  form of loop that iterates over exactly the index range of an array, so you need to design the termination condition yourself.</p>
<p>Those of you who are familiar with SPARK Ada may be wondering why I used multiple preconditions, postconditions and loop invariants, rather than &#8220;and&#8221;-ing them all together like this:</p>
<p style="padding-left:30px;"><code><strong>keep</strong>(high &gt;= low<br />
<span style="padding-left:15px;">&amp;&amp; high &lt;= nElems</span><br />
<span style="padding-left:15px;">&amp;&amp; (<strong>forall </strong>i <strong>in </strong>0..(low - 1) :- table[i].raw &lt;= key)</span><br />
<span style="padding-left:15px;">&amp;&amp; (<strong>forall </strong>i <strong>in </strong>high..(nElems - 1) :- table[i].raw &gt;= key))</span><br />
</code></p>
<p>You can do that in ArC, but there are two reasons why it is better not to. Firstly, when ArC is generating verification conditions &#8211; in this case, that the initialization establishes the loop invariant and that the loop body preserves it &#8211; ArC will generate a separate verification condition for each invariant expression. If you &#8220;and&#8221; the invariants together, you have a single expression, and therefore a single verification condition. If ArC fails to prove it, you can&#8217;t easily tell which part of the invariant failed without looking at the prover output. Writing the invariants separately (or, alternatively, putting them all in one keep-clause as separate expressions separated by semicolons) causes ArC to generate a separate verification condition for each one, so that you can see immediately which one(s) failed.</p>
<p>The second reason for using multiple specification clauses is also to do with error reporting. Remember that ArC keywords such as <strong>pre</strong>, <strong>post </strong>and <strong>keep </strong>are implemented as macros, so that the specifications can be made invisible to standard C compilers. When a C preprocessor expands the macros, it generally ignores the layout of the actual macro arguments, and generates the expanded version using the layout of the macro definition. So a macro argument that you wrote across several lines in your source code typically gets expanded to a single line. This means that if an error is reported, the line number will be reported as the line on which the ArC keyword occurred, even if the error is in a macro argument a few lines later. So it&#8217;s best to keep the arguments to ArC specification macros on the same line as the keyword. Therefore, when using long expressions, you&#8217;ll want to use a separate instance of the ArC keyword with each expression.</p>
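<p>To make the macro mechanism concrete, here is a sketch of how such keywords could be hidden from a standard C compiler. The macro definitions, the guard symbol <em>VERIFIER_RUN</em> and the treatment of <strong>array </strong>as an empty macro are all assumptions for illustration; the real header supplied with the tool is not shown in this post and may differ. The one-field <em>LinEntry</em> layout is likewise assumed.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the verifier's header: under a standard C
   compiler every specification macro expands to nothing, so the
   annotations vanish during preprocessing. */
#ifndef VERIFIER_RUN            /* assumed symbol, defined only by the verifier */
#define pre(...)
#define post(...)
#define keep(...)
#define decrease(...)
#define array
#endif

typedef struct { uint16_t raw; } LinEntry;   /* assumed layout */

/* Prototype carrying the specification, one expression per keyword. */
size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key)
pre(table.lwb == 0; table.lim == nElems)
pre(forall a in table.indices; b in table.indices :- b > a => table[b].raw > table[a].raw)
post(result <= nElems)
post(forall i in 0..(result - 1) :- key >= table[i].raw)
post(forall i in result..(nElems - 1) :- key <= table[i].raw);

size_t bSearch(const LinEntry* array table, size_t nElems, uint16_t key) {
    size_t low = 0, high = nElems;
    while (high != low)
    keep(high >= low)
    keep(high <= nElems)
    keep(forall i in 0..(low - 1) :- table[i].raw <= key)
    keep(forall i in high..(nElems - 1) :- table[i].raw >= key)
    decrease(high - low) {
        const size_t mid = (low + high)/2;
        if (key >= table[mid].raw) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }
    assert(low <= nElems);      /* runtime echo of post(result <= nElems) */
    return low;
}
```

<p>Because each keyword sits on the same line as its whole argument, a diagnostic that cites the keyword's line also points at the offending expression.</p>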
]]></content:encoded>
			<wfw:commentRss>http://critical.eschertech.com/2010/05/06/verifying-a-binary-search-part-2/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9efed2bd9429eac89f62a336b6d05174?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">davidcrocker</media:title>
		</media:content>
	</item>
		<item>
		<title>Verifying a binary search</title>
		<link>http://critical.eschertech.com/2010/05/05/verifying-a-binary-search/</link>
		<comments>http://critical.eschertech.com/2010/05/05/verifying-a-binary-search/#comments</comments>
		<pubDate>Wed, 05 May 2010 07:23:09 +0000</pubDate>
		<dc:creator>davidcrocker</dc:creator>
				<category><![CDATA[C and C++ in critical systems]]></category>
		<category><![CDATA[Formal verification of C programs]]></category>
		<category><![CDATA[formal verification]]></category>

		<guid isPermaLink="false">http://critical.eschertech.com/?p=337</guid>
		<description><![CDATA[In the last post, I covered some different levels of formal verification that you may be interested in, and showed how to add minimum annotation to the linearization example to allow ArC to prove predictable execution. The example provided a prototype for the binary search function it called, to which we added a minimal postcondition, [...]]]></description>
			<content:encoded><![CDATA[<p>In the last post, I covered some different levels of formal verification that you may be interested in, and showed how to add minimum annotation to the linearization example to allow ArC to prove predictable execution. The example provided a prototype for the binary search function it called, to which we added a minimal postcondition, so that it looked like this:</p>
<p style="padding-left:30px;"><code>size_t bSearch(<strong>const </strong>LinEntry* <strong>array </strong>table,  size_t nElems, uint16_t key)<br />
<strong>post</strong>(<strong>result </strong>&lt;= nElems);</code></p>
<p>Let&#8217;s develop a verified implementation of this binary search function. We could write the code and then try to get ArC to verify it &#8211; either for predictable execution only, or for full functional correctness. However, it&#8217;s quite hard to get a binary search function right. So for this example, it&#8217;s more productive to use formal specification to develop an implementation that is correct by construction.</p>
<p>First, we need to specify exactly what <em>bSearch </em>is supposed to return, and what its preconditions are. Let&#8217;s start with the preconditions. We&#8217;re passing the number of elements of <em>table </em>in <em>nElems</em>, so we need to specify this:</p>
<p style="padding-left:30px;"><code>size_t   bSearch(<strong>const </strong>LinEntry* <strong>array </strong>table,  size_t nElems,   uint16_t key)<br />
<span style="color:#ff0000;"><strong>pre</strong>(table.lwb  == 0; table.lim == nElems)</span><br />
<strong>post</strong>(<strong>result </strong>&lt;= nElems);</code></p>
<p>The precondition says that the lower bound (the lowest legal index) into <em>table </em>is zero &#8211; in other words, <em>table </em>is a pointer to the first  element of an array &#8211; and that the limit of <em>table </em>(one plus the  highest legal index) is <em>nElems</em>. However, this isn&#8217;t enough: the raw values need to be in increasing order too. A standard way of specifying &#8220;in increasing order&#8221; is to say that for any two indices <em>a</em> and <em>b</em>, if <em>b</em> &gt; <em>a</em> then the element at <em>b</em> is greater than the element at <em>a</em>. Here&#8217;s that expressed as a precondition:</p>
<p style="padding-left:30px;"><code>size_t   bSearch(<strong>const </strong>LinEntry*  <strong>array </strong>table,  size_t nElems,   uint16_t key)<br />
<strong>pre</strong>(table.lwb   == 0; table.lim == nElems)<br />
<span style="color:#ff0000;"><strong>pre</strong>(<strong>forall </strong>a <strong>in </strong>table.indices; b <strong>in </strong>table.indices :- b &gt; a =&gt; table[b].raw &gt; table[a].raw)</span><br />
<strong>post</strong>(<strong>result </strong>&lt;= nElems);</code></p>
<p>The ghost member <span style="color:#008080;"><em>indices </em></span>of an array pointer is defined by ArC as the sequence of all valid indices into the array. So<em> <span style="color:#008080;">table.indices</span></em> means the same as<em> <span style="color:#008080;">table.lwb .. table.upb</span></em>. The <span style="color:#008080;">=&gt;</span> symbol means &#8220;implies&#8221;, so<em> <span style="color:#008080;">b  &gt; a =&gt; table[b].raw &gt; table[a].raw</span></em> means the same as<em> <span style="color:#008080;">!(b  &gt; a) || table[b].raw &gt; table[a].raw</span></em> but is clearer.</p>
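<p>If it helps to see the quantified precondition as ordinary code, here is a small sketch that evaluates the same predicate at run time, with the implication spelled out as <em>!p || q</em>. The function name <em>isStrictlyIncreasing</em> and the one-field <em>LinEntry</em> layout are assumptions for illustration, and the loops assume the indices run from 0 to <em>nElems - 1</em>, as the first precondition states.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint16_t raw; } LinEntry;   /* assumed layout */

/* Runtime rendering of:
     forall a in table.indices; b in table.indices
       :- b > a => table[b].raw > table[a].raw
   The implication p => q is written as !p || q. */
bool isStrictlyIncreasing(const LinEntry *table, size_t nElems) {
    assert(table != NULL || nElems == 0);
    for (size_t a = 0; a < nElems; ++a) {
        for (size_t b = 0; b < nElems; ++b) {
            const bool holds = !(b > a) || table[b].raw > table[a].raw;
            if (!holds) {
                return false;    /* found a pair that violates the ordering */
            }
        }
    }
    return true;                 /* vacuously true for an empty table */
}
```

<p>Note that a table containing two equal raw values fails this check, because the comparison is strict.</p>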
<p>Now we need to specify what the function returns. In developing <em>linearize </em>in the previous post, I said I was assuming a return value of <em>0</em> means the raw value is off the bottom of the table, <em>nElems </em>means it is off the top, otherwise the table entry indexed by the return value and  the previous table entry bracket the raw value. Actually, we can simplify this. If we say that any elements with indices below the returned value have raw values less than or equal to the key, and any elements with indices at or above the returned value have raw values greater than or equal to the key, then this covers the cases of returning <em>0</em> or <em>nElems </em>as well.  So let&#8217;s express this single constraint in a postcondition:</p>
<p style="padding-left:30px;"><code>size_t    bSearch(<strong>const </strong>LinEntry*  <strong>array </strong>table,  size_t nElems,    uint16_t key)<br />
<strong>pre</strong>(table.lwb   == 0; table.lim == nElems)<br />
<strong>pre</strong>(<strong>forall </strong>a <strong>in </strong>table.indices; b <strong>in </strong>table.indices :- b &gt; a =&gt;  table[b].raw &gt; table[a].raw)<br />
<strong>post</strong>(<strong>result </strong>&lt;= nElems)<br />
<span style="color:#ff0000;"><strong>post</strong>(<strong>forall </strong>i <strong>in </strong>0..(<strong>result </strong>- 1) :- key &gt;= table[i].raw)<br />
<strong>post</strong>(<strong>forall </strong>i <strong>in result</strong>..(nElems - 1) :- key &lt;= table[i].raw)</span>;</code></p>
<p>This specification isn&#8217;t precise about the value we return when there is a table entry whose raw value exactly matches the key &#8211; it allows us to return the index of either that entry or the next entry. My original informal specification wasn&#8217;t precise either &#8211; it said that the two entries &#8220;bracket&#8221; the key. Given that our precondition forbids duplicate entries, we could be more precise if we want, e.g. by changing <span style="color:#008080;">&lt;=</span> in the final postcondition to <span style="color:#008080;">&lt;</span>.</p>
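<p>The looseness of the specification on an exact match can be checked with a small sketch that tests a candidate return value against the three postconditions. The function name <em>meetsPostconditions</em> and the one-field <em>LinEntry</em> layout are assumptions for illustration.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint16_t raw; } LinEntry;   /* assumed layout */

/* True when 'result' satisfies all three postconditions for 'key'. */
bool meetsPostconditions(const LinEntry *table, size_t nElems,
                         uint16_t key, size_t result) {
    assert(table != NULL || nElems == 0);
    if (result > nElems) {                       /* post(result <= nElems) */
        return false;
    }
    for (size_t i = 0; i < result; ++i) {        /* forall i in 0..(result - 1) */
        if (!(key >= table[i].raw)) {
            return false;
        }
    }
    for (size_t i = result; i < nElems; ++i) {   /* forall i in result..(nElems - 1) */
        if (!(key <= table[i].raw)) {
            return false;
        }
    }
    return true;
}
```

<p>For a table with raw values 10, 20 and 30 and a key of 20, both 1 and 2 are accepted as return values, whereas a key of 25 admits only 2.</p>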
<p>Now we can work on the implementation. Note that when you have a function prototype declaration separate from the implementation (as here), ArC expects you to put the function specification in the prototype only, so we don&#8217;t need to repeat the preconditions and postconditions in the implementation. Here&#8217;s a rough sketch of what we want to do:</p>
<p style="padding-left:30px;"><code>size_t   bSearch(<strong>const </strong>LinEntry*  <strong>array </strong>table,   size_t nElems,   uint16_t key) {<br />
<span style="padding-left:15px;">size_t low = 0, high = nElems;</span><br />
<span style="padding-left:15px;"><strong>while </strong>(high != low) {</span><br />
<em><span style="padding-left:30px;">/* increase low or decrease high, such that high remains &gt;= low,</span><br />
<span style="padding-left:52px;">all elements below low have raw values &lt;= the key,</span><br />
<span style="padding-left:52px;">and all elements at or above high have raw values &gt;= the key */</span></em><br />
<span style="padding-left:15px;">}</span><br />
<span style="padding-left:15px;"><strong>return </strong>low;</span><br />
}<br />
</code></p>
<p>Let&#8217;s express the text in the comment as a loop variant and a loop invariant (see my earlier post on verifying loops if you&#8217;re not familiar with these):</p>
<p style="padding-left:30px;"><code>size_t   bSearch(<strong>const </strong>LinEntry*  <strong>array </strong>table,   size_t nElems,   uint16_t key) {<br />
<span style="padding-left:15px;">size_t low = 0, high = nElems;</span><br />
<span style="padding-left:15px;"><strong>while </strong>(high != low)</span><br />
<span style="color:#ff0000;"><span style="padding-left:15px;"><strong>keep</strong>(high &gt;= low)</span></span><br />
<span style="color:#ff0000;"><span style="padding-left:15px;"><strong>keep</strong>(<strong>forall </strong>i <strong>in </strong>0..(low - 1) :- table[i].raw &lt;= key)</span></span><br />
<span style="color:#ff0000;"><span style="padding-left:15px;"><strong>keep</strong>(<strong>forall </strong>i <strong>in </strong>high..(nElems - 1) :- table[i].raw &gt;= key)</span></span><br />
<span style="padding-left:15px;"><span style="color:#ff0000;"><strong>decrease</strong>(high - low)</span> {</span><br />
<em><span style="padding-left:30px;">/* do something */</span></em><br />
<span style="padding-left:15px;">}</span><br />
<span style="padding-left:15px;"><strong>return </strong>low;</span><br />
}<br />
</code></p>
<p>Before we go any further, let&#8217;s check that this makes sense. ArC will need to prove that the loop invariants (the <strong>keep </strong>clauses) are true at the start of the loop. That&#8217;s easy &#8211; the initialization of <em>low </em>and <em>high </em>ensures that <em>high &gt;= low</em> (which satisfies the first invariant), and that there are no table indices below <em>low </em>and no table indices at or above <em>high</em> (which satisfies the second and third invariants, because <strong>forall </strong>over an empty range is always true). Also, because we are returning <em>low </em>when the loop terminates, ArC will need to prove that the value of <em>low </em>at the end of the loop meets the postcondition on the return value. We know that when the loop terminates, the loop invariants will be true and the while-clause will be false. Therefore, we can substitute <em>result </em>for <em>low </em>(because we are returning <em>low</em>) and <em>result </em>for <em>high </em>(because <em>high == low</em> is the negation of the while-clause) in the keep-clauses, to find out what is known about <em>result</em>. When we do this, the last two <strong>keep </strong>clauses magically turn into the last two postconditions &#8211; which is exactly what we want. However, this doesn&#8217;t help with the first postcondition, which requires <em>result &lt;= nElems</em>. Adding an extra loop invariant <em>low &lt;= nElems</em> or <em>high &lt;= nElems</em> will do the trick, since substituting <em>result </em>for both <em>low </em>and <em>high </em>will then give us the required term. I&#8217;ll choose <em>high &lt;= nElems</em>, because it is stronger and I don&#8217;t intend that <em>high </em>should ever exceed <em>nElems</em>.</p>
<p>We can use ArC to check that the design is OK so far, even before we code the loop body. As well as adding the new loop invariant, we&#8217;ll need to tell ArC that we intend to assign to <em>low </em>and <em>high </em>within the loop body, which we can do like this:</p>
<p style="padding-left:30px;"><code>size_t   bSearch(<strong>const </strong>LinEntry*  <strong>array </strong>table,   size_t nElems,   uint16_t key) {<br />
<span style="padding-left:15px;">size_t low = 0, high = nElems;</span><br />
<span style="padding-left:15px;"><strong>while </strong>(high != low)</span><br />
<span style="padding-left:15px;"><strong>keep</strong>(high &gt;= low)</span><br />
<span style="color:#ff0000;"><span style="padding-left:15px;"><strong>keep</strong>(high &lt;= nElems)</span></span><br />
<span style="padding-left:15px;"><strong>keep</strong>(<strong>forall </strong>i <strong>in </strong>0..(low - 1) :- table[i].raw &lt;= key)</span><br />
<span style="padding-left:15px;"><strong>keep</strong>(<strong>forall </strong>i <strong>in </strong>high..(nElems - 1) :- table[i].raw &gt;= key)</span><br />
<span style="padding-left:15px;"><span style="color:#000000;"><strong>decrease</strong>(high - low)</span> {</span><br />
<span style="padding-left:30px;"><span style="color:#ff0000;">low = ?; high = ?;</span></span><br />
<span style="padding-left:15px;">}</span><br />
<span style="padding-left:15px;"><strong>return </strong>low;</span><br />
}<br />
</code></p>
<p>ArC allows you to use <span style="color:#ff0000;">?</span> in place of an expression to mean you haven&#8217;t decided what goes there yet. Naturally, ArC won&#8217;t be able to prove that the loop preserves its invariants or decreases its variant. Sure enough, if we run ArC on this example, we get 6 unproven verification conditions that refer to the loop body: one for each of the loop invariants and two for the variant. However, ArC reports that everything else is OK, including that the loop invariants are satisfied by the initialization and that the return value meets all three postconditions.</p>
<p>So all we need to do now is to write loop body code that assigns <em>low </em>and/or <em>high </em>such that the four loop invariants are preserved and the loop variant <em>high &#8211; low</em> decreases. But this post is already quite long, so I&#8217;ll do that next time.</p>
]]></content:encoded>
			<wfw:commentRss>http://critical.eschertech.com/2010/05/05/verifying-a-binary-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/9efed2bd9429eac89f62a336b6d05174?s=96&#38;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&#38;r=G" medium="image">
			<media:title type="html">davidcrocker</media:title>
		</media:content>
	</item>
	</channel>
</rss>