Safer arrays: using a C++ array class
In a previous post, I remarked that arrays in C leave much to be desired, and that in C++ it is better to avoid using naked arrays. You can avoid naked arrays in C++ programming by wrapping them up in a suitable array class instead. The Joint Strike Fighter C++ Coding Standards document takes a similar view; rule 97 in that standard states:
Arrays shall not be used in interfaces. Instead, the Array class should be used.
Rationale: Arrays degenerate to pointers when passed as parameters. This “array decay” problem has long been known to be a source of errors.
Unfortunately, the Array.doc file mentioned in the standard was not made available to the public. However, we can easily assemble our own set of array classes, starting with the fixed-length array class included in C++ Technical Report 1. That class is called std::tr1::array<T, N>, where T is the element type and N is the number of elements. Implementation of this class does not require the use of dynamic memory.
To illustrate the use of the TR1 array class, here is a small example using C arrays:
#define BUFLEN (100)
static int buf[BUFLEN];
for (size_t i = 0; i < BUFLEN; ++i) {
    ... buf[i] ...
}
Here is the same example using C++ and the array class:
using std::tr1::array;
const size_t buflen = 100;
typedef array<int, buflen> buf_t;
static buf_t buf;
for (buf_t::iterator it = buf.begin(); it != buf.end(); ++it) {
    ... *it ...
}
Since I want to process every element of buf, I’ve used an array iterator in the for-loop header. If I want to process the elements in reverse order, I can use a reverse_iterator instead:
for (buf_t::reverse_iterator rit = buf.rbegin(); rit != buf.rend(); ++rit) {
    ... *rit ...
}
Using iterators avoids the risk of off-by-one errors.
If I want to pass one of these buffers to a function, the simplest way is to pass it by reference or by const-reference:
int sumBuffer(const buf_t& src) {
    int sum = 0;
    for (buf_t::const_iterator it = src.begin(); it != src.end(); ++it) {
        sum += *it;
    }
    return sum;
}
One disadvantage of passing an array in this manner is that the size is part of the type. Therefore, I can’t write a version of sumBuffer that works with buffers of varying sizes, unless I write it as a template – which a compiler would typically instantiate separately for each buffer size.
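For comparison, here is a minimal sketch of the template version described above. It assumes a C++11 compiler and uses std::array, the standardized successor to std::tr1::array with the same interface for this purpose. The function works for any element count, but each distinct N used in the program is a separate instantiation.

```cpp
#include <array>    // std::array (C++11); assumed here in place of std::tr1::array
#include <cstddef>  // std::size_t

// N is deduced from the argument, so one function template covers every
// buffer size. The cost is one instantiation per distinct N.
template <std::size_t N>
int sumBuffer(const std::array<int, N>& src) {
    int sum = 0;
    for (typename std::array<int, N>::const_iterator it = src.begin();
         it != src.end(); ++it) {
        sum += *it;
    }
    return sum;
}
```

Calling sumBuffer on a 20-element buffer and on a 100-element buffer gives two instantiations of the same loop, which is the duplication at issue.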
We can avoid this disadvantage by defining two more class templates to represent references to an array. Here are their outlines:
template<class T> class array_ref {
    T* ptr;
    size_t sz;
public:
    template<size_t L> array_ref(array<T, L>& arg) : ptr(arg.data()), sz(L) {}
    size_t size() const { return sz; }
    T& operator[](size_t index) const { return ptr[index]; }
};

template<class T> class const_array_ref {
    const T* ptr;
    size_t sz;
public:
    template<size_t L> const_array_ref(const array<T, L>& arg) : ptr(arg.data()), sz(L) {}
    size_t size() const { return sz; }
    const T& operator[](size_t index) const { return ptr[index]; }
};
Class array_ref holds a reference to the underlying data held in an array, along with a note of the number of elements. Class const_array_ref does the same but provides read-only access. We can go on to define iterators for these classes, so that we can safely traverse arrays passed by reference. This allows us to write the following:
static array<int, 100> bigBuf;
static array<int, 20> smallBuf;

int sumBuffer(const_array_ref<int> src) {
    int sum = 0;
    for (const_array_ref<int>::const_iterator it = src.begin(); it != src.end(); ++it) {
        sum += *it;
    }
    return sum;
}
...
int total = sumBuffer(smallBuf) + sumBuffer(bigBuf);
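The class outlines do not show the iterator support that this example relies on. One minimal way to supply it, assuming raw pointers are acceptable as iterators and using C++11 std::array in place of std::tr1::array, might look like this. The begin(), end() and const_iterator members are my additions for illustration, not part of the outline given earlier.

```cpp
#include <array>    // std::array (C++11); assumed here in place of std::tr1::array
#include <cstddef>  // std::size_t

template <class T>
class const_array_ref {
    const T* ptr;
    std::size_t sz;
public:
    // Assumption: a plain pointer serves as the iterator type.
    typedef const T* const_iterator;

    // Converting constructor: accepts an array of any length L.
    template <std::size_t L>
    const_array_ref(const std::array<T, L>& arg) : ptr(arg.data()), sz(L) {}

    std::size_t size() const { return sz; }
    const T& operator[](std::size_t index) const { return ptr[index]; }

    const_iterator begin() const { return ptr; }
    const_iterator end() const { return ptr + sz; }
};

// A single non-template function now serves buffers of every size.
int sumBuffer(const_array_ref<int> src) {
    int sum = 0;
    for (const_array_ref<int>::const_iterator it = src.begin();
         it != src.end(); ++it) {
        sum += *it;
    }
    return sum;
}
```

Because const_array_ref only stores a pointer and a length, it must not outlive the array it refers to.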
As standard, class std::tr1::array provides bounds-checking for the at() function but not for the indexing operator. However, many implementations provide for the insertion of bounds checks in the indexing operator too, typically enabled by an implementation-specific debug macro. Similarly, bounds checking can easily be provided in array_ref and const_array_ref if required.
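As a rough sketch of how such checking could be added to a reference class, the following variant provides a checked at() alongside the unchecked indexing operator. The names checked_array_ref and atThrows are hypothetical, and the sketch assumes exceptions are enabled and that C++11 std::array stands in for std::tr1::array. On a system built without exceptions, an assertion or error handler would have to take the place of the throw.

```cpp
#include <array>      // std::array (C++11); assumed here in place of std::tr1::array
#include <cstddef>    // std::size_t
#include <stdexcept>  // std::out_of_range

// Hypothetical bounds-checked variant of array_ref.
template <class T>
class checked_array_ref {
    T* ptr;
    std::size_t sz;
public:
    template <std::size_t L>
    checked_array_ref(std::array<T, L>& arg) : ptr(arg.data()), sz(L) {}

    std::size_t size() const { return sz; }

    // Unchecked access, like operator[] on the array itself.
    T& operator[](std::size_t index) const { return ptr[index]; }

    // Checked access, mirroring the behaviour of array::at().
    T& at(std::size_t index) const {
        if (index >= sz) {
            throw std::out_of_range("checked_array_ref::at");
        }
        return ptr[index];
    }
};

// Hypothetical helper for demonstration: reports whether at(index) throws.
template <class T>
bool atThrows(checked_array_ref<T> ref, std::size_t index) {
    try {
        ref.at(index);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

For a three-element array, atThrows reports false for index 2 and true for index 3.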
In summary, C++ allows us to avoid the problems of naked array pointers, by providing us with alternative array representations that carry size information, support iterators so that we can more safely iterate through arrays, and optionally include run-time bounds checking.
Just curious: why/how is this better than std::vector?
Dynamic memory allocation is prohibited in critical embedded systems, because it causes execution time to be unpredictable, and memory fragmentation can cause long-running systems to run out of memory. So unfortunately we can’t use std::vector or most of the other STL collection classes, because they rely on dynamic memory allocation. It would be possible to allocate fixed-size arrays using std::vector during the initialization phase, but only if we could be sure that the program never subsequently did any operations on a vector that allocated or released memory.
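To make the caveat concrete, here is a small sketch with hypothetical helper names. It contrasts an operation that is safe on a vector sized at start-up (indexed writes) with one that is not (appending once the vector is full).

```cpp
#include <cstddef>  // std::size_t
#include <vector>

// Writing through operator[] never allocates, so the capacity is unchanged.
bool indexedWritesKeepCapacity(std::vector<int>& buf) {
    const std::size_t before = buf.capacity();
    for (std::size_t i = 0; i < buf.size(); ++i) {
        buf[i] = static_cast<int>(i);
    }
    return buf.capacity() == before;
}

// Appending to a vector that is already at capacity forces a reallocation,
// which is the kind of operation that must never happen after start-up.
bool appendWhenFullReallocates(std::vector<int>& buf) {
    while (buf.size() < buf.capacity()) {
        buf.push_back(0);  // use up any spare capacity first
    }
    const std::size_t before = buf.capacity();
    buf.push_back(0);      // no room left: the vector must grow
    return buf.capacity() > before;
}
```

Nothing in the type system stops a later maintainer from calling push_back, which is why a fixed-size array class is the safer choice here.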
Each time I create a new array with a different size, the compiler creates a new code-base for that array size, because the size is a part of the array type. For example:
std::array<unsigned char, 10> MyUC_10;
std::array<unsigned char, 20> MyUC_20;
std::array<unsigned char, 30> MyUC_30;
All get their own code base. Is there any way to avoid this type of compiler behaviour?
I would like all of the above to reference the same code-base.
As a test I created:
template<typename _T_base, unsigned int _C_Size = 10> class MyTest
{
public:
_T_base _myVal;
};
MyTest<unsigned char, 10> T1;
MyTest<unsigned char, 20> T2;
In the test I do not use or reference _C_Size anywhere in my template class, but the compiler still creates two different code-bases. If I remove _C_Size from the template, they become the same code-base. I’ve tried many constructions without any luck. From a functional point of view they should be identical; it might be the way the C++ compiler identifies types that causes the problem.
I would really appreciate some help on this topic.
Have you looked at the code for both debug and release builds? In a debug build (i.e. compiler optimization disabled) I would expect each different instantiation to get separate code. In a release build (i.e. optimization enabled) I would expect most of the member functions to get inlined because they are so simple. Code sharing for those functions is not an issue. You may also find that code that is not inlined but does not depend on _C_Size gets shared between instantiations, depending on the compiler. If that isn’t the case and you find there is lots of duplicate code even with compiler optimization enabled, then you could write your own array class, with the code you want to share declared in a non-templated base class.
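As a rough illustration of that last suggestion, here is one possible shape for such a class. All names are hypothetical and the element type is fixed at unsigned char to match the example in the question. The size-independent work lives in a non-templated base and is compiled once. The templated class only holds the storage and forwards to the base with a pointer and a length.

```cpp
#include <cstddef>  // std::size_t

// Non-templated base: these functions are compiled once, whatever sizes
// the program uses.
class byte_array_base {
protected:
    static void fillImpl(unsigned char* p, std::size_t n, unsigned char value);
    static unsigned sumImpl(const unsigned char* p, std::size_t n);
};

void byte_array_base::fillImpl(unsigned char* p, std::size_t n,
                               unsigned char value) {
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = value;
    }
}

unsigned byte_array_base::sumImpl(const unsigned char* p, std::size_t n) {
    unsigned total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];
    }
    return total;
}

// Thin templated wrapper: only these trivial forwarding members are
// instantiated per size, and they are good candidates for inlining.
template <std::size_t N>
class byte_array : private byte_array_base {
    unsigned char storage[N];
public:
    std::size_t size() const { return N; }
    unsigned char& operator[](std::size_t i) { return storage[i]; }
    const unsigned char& operator[](std::size_t i) const { return storage[i]; }
    void fill(unsigned char value) { fillImpl(storage, N, value); }
    unsigned sum() const { return sumImpl(storage, N); }
};
```

With this layout, byte_array<10> and byte_array<20> are still distinct types, but both call the same fillImpl and sumImpl.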