The format of non-volatile data may change with a new version of software. Carefully planning data layouts and using data version numbers can make these upgrades easier.
Have you ever upgraded a package on your PC, only to find that all of the data files generated by previous versions of that product are no longer readable? With so many devices using flash memory to allow software upgrades in the field, a similar issue can arise on your embedded system. If data is stored in persistent memory (NVRAM or EEPROM usually), the format of that data may change from one version of the software to the next, to support new features.
In How to Protect Non-Volatile Data, we discussed the use of checksums to confirm that non-volatile data remained valid during a reset or power cycle [1]. We also introduced the idea of breaking up non-volatile data into multiple blocks, each of which has its own checksum, to allow better management of data. In this article, we will look at how to organize non-volatile data for a smooth transition during software upgrades.
Non-volatile data layout
In a C program, the layout of the data in persistent store will usually be represented as a structure. The structure may be copied to/from serial EEPROM, or, if the persistent storage is mapped directly into the program's memory map, the structure simply lives in that area of memory. Listing 1 shows an example of such a structure. Because we want to be able to trust our calibration data even if the user settings become corrupt, we have broken the data into two groups: calibration and user settings. For simplicity, I do not show a second copy of the data as would be needed for double buffering, as discussed in How to Protect Non-Volatile Data, but that technique could be combined with everything we will discuss in this article.
typedef struct 
{
   float32_t  sensor1_offset;
   float32_t  sensor1_slope;
} calib_t;
typedef struct
{
   int  speed;
   int  alarm_limit;
} settings_t;
typedef struct
{
   calib_t     calib;
   uint16_t    calib_crc;
   settings_t  settings;
   uint16_t    settings_crc;
} persistent_store_t;
persistent_store_t * persistent = (persistent_store_t *) 0x00FF;
Listing 1. Structures to describe the layout of persistent store
The pointer persistent is declared to point to the place in the memory map where the persistent storage is located. If the space is memory mapped, there is no need to declare an instance of this structure; we can simply point at the space that always exists on the NVRAM chip. If this were a serial EEPROM implementation, you would probably declare an instance of the structure as a global variable, then copy that structure through the serial interface when required.
We want to be able to add other data to this structure in future versions. The first thing we want to ensure is that as the structure grows, it does not become too big for the physical medium on which it is being stored. In Listing 2, I check that the size of the structure is not greater than the number of bytes available, which in this case is 100. The CASSERT macro is a check that leads to a compile-time error if the condition being checked is false, because the array in the typedef is then given a negative size. Thanks and credit to Bill Emshoff for telling me about this macro [2].
#define CASSERT(ex) { typedef char cassert_type[(ex) ? 1 : -1]; }
int main(void)
{
   CASSERT(sizeof(persistent_store_t) <= 100);
...
Listing 2. How to check that your structures are not too big for the size of persistent store
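On toolchains that support C11, the standard static_assert facility can perform the same check at file scope and attach a readable message. The following is a minimal sketch under that assumption, using the unpadded structures from Listing 1. PERSISTENT_STORE_BYTES is a name introduced here for illustration, and plain float stands in for float32_t.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Assumed capacity of the persistent medium in bytes (100 in the text). */
#define PERSISTENT_STORE_BYTES 100u

typedef struct { float sensor1_offset; float sensor1_slope; } calib_t;
typedef struct { int speed; int alarm_limit; } settings_t;

typedef struct
{
   calib_t     calib;
   uint16_t    calib_crc;
   settings_t  settings;
   uint16_t    settings_crc;
} persistent_store_t;

/* Compilation stops here if the layout outgrows the medium. */
static_assert(sizeof(persistent_store_t) <= PERSISTENT_STORE_BYTES,
              "persistent_store_t does not fit in persistent storage");
```

Unlike the CASSERT macro, this form does not need to sit inside a function body, so it can live next to the structure definition it protects.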
If we add a data member to calib_t, we face the problem that the location of all of the members of settings_t will move, probably by the size of the item that was added. If we load the software onto a device that was previously running an older version of the software, any of the fields after the new one will be corrupted. A problem will be detected immediately, since the CRCs will have moved, and the misread calib_crc value will not match the value calculated from the calib structure. Similarly, the check of settings_crc will also fail. This is fine if we accept that every time the software is upgraded, all of the persistent data will be lost and we will have to recalibrate the device.
In some environments, we want to do a bit better than this. This is possible only if we plan for it up front and leave some padding for the new fields. Listing 3 shows the same structures with padding inserted. By using the size of the structure in the calculation of the size of the padding, we have an array that gradually shrinks as the size of the structure grows.
typedef struct
{
   float32_t  sensor1_offset;
   float32_t  sensor1_slope;
} calib_t;
typedef struct
{
   int  speed;
   int  alarm_limit;
} settings_t;
#define CALIB_SIZE 20
#define SETTINGS_SIZE 30
typedef struct
{
   calib_t     calib;
   uint8_t     calib_pad[CALIB_SIZE - sizeof(calib_t)];
   uint16_t    calib_crc;
   settings_t  settings;
   uint8_t     settings_pad[SETTINGS_SIZE - sizeof(settings_t)];
   uint16_t    settings_crc;
} persistent_store_t;
Listing 3. Using padding to ensure that the position of the CRCs doesn't change across versions of software
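The padding scheme only works while each structure stays smaller than its reserved block. Because sizeof yields an unsigned value, an oversized structure makes the subtraction wrap around to a very large number rather than go negative, and the resulting compiler diagnostic is typically not very helpful. The sketch below, which assumes a C11 compiler and uses float in place of float32_t, adds explicit checks that the blocks have not overflowed and that the CRCs sit where earlier releases expect them.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

typedef struct { float sensor1_offset; float sensor1_slope; } calib_t;
typedef struct { int speed; int alarm_limit; } settings_t;

#define CALIB_SIZE 20
#define SETTINGS_SIZE 30

typedef struct
{
   calib_t     calib;
   uint8_t     calib_pad[CALIB_SIZE - sizeof(calib_t)];
   uint16_t    calib_crc;
   settings_t  settings;
   uint8_t     settings_pad[SETTINGS_SIZE - sizeof(settings_t)];
   uint16_t    settings_crc;
} persistent_store_t;

/* Strictly smaller: a zero-length pad array is not valid ISO C. */
static_assert(sizeof(calib_t) < CALIB_SIZE, "calib_t has outgrown its block");
static_assert(sizeof(settings_t) < SETTINGS_SIZE, "settings_t has outgrown its block");

/* Each CRC must sit exactly one block size after the start of its data. */
static_assert(offsetof(persistent_store_t, calib_crc) == CALIB_SIZE,
              "calib_crc has moved");
static_assert(offsetof(persistent_store_t, settings_crc)
              - offsetof(persistent_store_t, settings) == SETTINGS_SIZE,
              "settings_crc has moved relative to settings");
```

These checks catch a block overflow at build time, but they cannot tell you that a field was inserted in the middle of a structure rather than appended at the end.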
We also have to be sure that the padding is always included in the CRC calculation. Assuming we have a function called calc_crc() that calculates a CRC over a given number of bytes, Listing 4 shows two functions: one to update the CRC after the value of a data member has been changed, and one to check that the CRC is sane after a reboot. Note that we have set 20 bytes aside for the calibration section, and so we run the CRC or checksum over all 20 bytes, regardless of how many of those bytes are actually being used in the current version.
void calib_crc_set(void)
{
   persistent->calib_crc = calc_crc(&persistent->calib, CALIB_SIZE);
}
int calib_crc_check(void)
{
   return (persistent->calib_crc==calc_crc(&persistent->calib, CALIB_SIZE));
}
Listing 4. Setting and checking the CRC
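The text leaves calc_crc() unspecified. One possible implementation, shown here purely as an assumption, is a bitwise CRC-16 with polynomial 0x1021 and initial value 0xFFFF (the variant often called CRC-16/CCITT-FALSE). Any CRC or checksum with the same signature would serve Listing 4 equally well, as long as every software version uses the same one.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16, polynomial 0x1021, initial value 0xFFFF, no reflection.
   The choice of CRC is an assumption; the article does not name one. */
uint16_t calc_crc(const void *data, size_t len)
{
   const uint8_t *p = (const uint8_t *)data;
   uint16_t crc = 0xFFFFu;

   while (len--)
   {
      /* Fold the next byte into the top of the register. */
      crc ^= (uint16_t)((uint16_t)*p++ << 8);

      for (int bit = 0; bit < 8; bit++)
      {
         if (crc & 0x8000u)
         {
            crc = (uint16_t)((crc << 1) ^ 0x1021u);
         }
         else
         {
            crc = (uint16_t)(crc << 1);
         }
      }
   }
   return crc;
}
```

A table-driven version would be faster, at the cost of 512 bytes of ROM for the lookup table.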
Now each time we add a field, the padding will shrink, but the CRC and any following structures remain in the same place. The main problem with this scheme is that you have to guess how many extra bytes you are going to need. If we have broken up the persistent store into many separate blocks, for the reasons discussed in How to Protect Non-Volatile Data, then we have to guess the amount of padding for each block. In the example shown here, if the padding for the calibration section were completely used up, it wouldn't be possible to borrow any space from the settings section, because that would cause the CRC and other fields to move. This can lead to cases where fields are put into the wrong block because there was no room in the block where they really belonged. This issue should be considered when you are deciding how many divisions you want to create in your persistent store.
Data versioning
We now have a structure that will pass its CRC test after the new software has been installed. However, we still want to know that the upgrade happened. We need this information because we have fields living in locations that used to be part of the padding. The padding was probably all zeros at the beginning of the product's life so we may need to put some sensible initial values in those fields as soon as we start running a version of software that uses that area.
In some cases, there might be an ad-hoc way of detecting the change. For example, if the value of a new field is zero, when zero is not a legal value for that field, we can assume that this is the first time running with the updated software. Such schemes quickly turn ugly when a number of upgrade paths are possible—you have to assume that some customers out there will have all the previous versions and they might upgrade to any version newer than the one they have.
A better option is to put a version number in the persistent store. There is a bit of a dilemma about where to place this version number. In some cases, in spite of the method just discussed, the CRCs might move. Reading the version number might tell us where they are for this particular version. If the version number is inside a block that has a CRC, then we cannot check the CRC of that block before reading the version number, because we do not yet know the size of the block. Because of this possible catch-22, a reasonable approach is to store the version number in the first byte and the inverse of it in the second byte, as a check. Changes to the structures that follow will never reposition the version number, and the version number will never depend on the size of the blocks that we are checking against a CRC.
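The version byte and its inverse might be handled as follows. This is a sketch only: store_version_t, store_version_write() and store_version_read() are names invented for illustration, and "inverse" is taken to mean the bitwise complement.

```c
#include <stdbool.h>
#include <stdint.h>

/* First two bytes of the persistent store (hypothetical layout). */
typedef struct
{
   uint8_t version;       /* layout version number */
   uint8_t version_inv;   /* bitwise complement of version, as a check */
} store_version_t;

/* Record a version number together with its complement. */
void store_version_write(store_version_t *hdr, uint8_t version)
{
   hdr->version     = version;
   hdr->version_inv = (uint8_t)~version;
}

/* Returns true and fills *version only if the two bytes agree. */
bool store_version_read(const store_version_t *hdr, uint8_t *version)
{
   if ((uint8_t)(hdr->version ^ hdr->version_inv) != 0xFFu)
   {
      return false;   /* corrupt, or never written */
   }
   *version = hdr->version;
   return true;
}
```

A useful side effect of the complement check is that a blank device fails it whether the memory reads back as all zeros or all ones, since neither 0x00/0x00 nor 0xFF/0xFF is a valid pair.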
The version number of persistent store is different from the version number of software. It is possible that many releases of the software will be made that do not change the layout of the persistent store and, so, the version number stored there does not need to be upgraded. Any time that the layout of our structure is changed, we must update the constant in software that says which version of persistent storage is in use. We will call this constant the persistent store version number (PSVN). It is important to be aware that any version of software will have a PSVN defined as a compile time constant. The PSVN stored in persistent storage is the same as the PSVN of the program that wrote the data to that persistent store.
When the program boots, it needs to check the PSVN in persistent storage. If it doesn't match the compile time constant, we know that this version of software is being run for the first time on this device. It will be necessary to set any new fields to sensible defaults. The number of new values in the structure depends on how many versions there are between the current PSVN and the PSVN found in the persistent store.
Once you have set up reasonable values in all fields, update the PSVN stored in the persistent store to match the number defined in the current revision of software. In addition to setting the new values, the conversion process may involve massaging some of the old ones. On one project, the programmers changed the range of values that the user was allowed to enter. The old range was 0 to 60. The new range was 0 to 50. If the setting was 55 when you did the upgrade, then the first time the new version of software is run it will have an illegal setting, which, on this system, would cause an assert to fire, since there was no legal way for the user to set a value of 55 in the new version. The solution was to check the setting, and if it was above the new maximum, 50, then set it to be equal to 50. This meant that we were not preserving all of the settings exactly, but it was a reasonable compromise.
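One way to cope with customers upgrading from any older version is to apply the conversions one version at a time until the stored data reaches the current PSVN. The sketch below illustrates that idea with a RAM image. The field backlight, the version numbers, and the default value are all hypothetical, and for brevity the PSVN is held inside the structure rather than in a separate two-byte header.

```c
#include <stdint.h>

#define PSVN_CURRENT 3u   /* hypothetical compile-time PSVN of this build */

typedef struct
{
   uint8_t psvn;          /* PSVN of the software that last wrote the data */
   int     speed;         /* range was 0..60 up to version 2, 0..50 from 3 */
   int     alarm_limit;
   int     backlight;     /* hypothetical field added in version 2 */
} settings_image_t;

/* Bring the image up to PSVN_CURRENT one version at a time.
   Returns the number of steps applied, or -1 for an unknown version. */
int settings_upgrade(settings_image_t *s)
{
   int steps = 0;

   while (s->psvn < PSVN_CURRENT)
   {
      switch (s->psvn)
      {
         case 1u:   /* 1 -> 2: the new field used to be padding */
            s->backlight = 5;
            break;

         case 2u:   /* 2 -> 3: legal range of speed shrank to 0..50 */
            if (s->speed > 50)
            {
               s->speed = 50;
            }
            break;

         default:   /* no conversion known for this version */
            return -1;
      }
      s->psvn++;
      steps++;
   }
   return steps;
}
```

After a successful upgrade the block's CRC would need to be recalculated before the data is trusted again. A stored PSVN higher than PSVN_CURRENT (a downgrade) is left untouched by this sketch and would need its own policy.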
Other problems are caused by values that change their meaning, as when an enumerated type is reordered. For these types of changes, which require the persistent store to be updated, it is necessary to increase the PSVN for the new version of the program, even if the actual layout has not changed.
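For a reordered enumerated type, the conversion step can be a small lookup table that maps each old stored value to its new equivalent. The enumerators below are hypothetical and exist only to illustrate the mapping.

```c
#include <stdint.h>

/* Order used by the previous layout version (hypothetical). */
enum { OLD_MODE_OFF = 0, OLD_MODE_AUTO = 1, OLD_MODE_MANUAL = 2, OLD_MODE_COUNT };

/* Reordered enumeration used by the new software (hypothetical). */
typedef enum { MODE_OFF = 0, MODE_MANUAL = 1, MODE_AUTO = 2 } run_mode_t;

/* Indexed by the old stored value, yields the new value. */
static const uint8_t old_to_new_mode[OLD_MODE_COUNT] =
{
   [OLD_MODE_OFF]    = MODE_OFF,
   [OLD_MODE_AUTO]   = MODE_AUTO,
   [OLD_MODE_MANUAL] = MODE_MANUAL,
};

/* Convert a stored mode; out-of-range values fall back to MODE_OFF. */
uint8_t mode_convert(uint8_t stored)
{
   return (stored < OLD_MODE_COUNT) ? old_to_new_mode[stored] : (uint8_t)MODE_OFF;
}
```

The conversion must run exactly once per device, which is why it belongs in the upgrade step guarded by the PSVN check.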
Pointers in non-volatile data
In most cases you would not store pointers in the structures that are persistently stored. Pointers into RAM lose all meaning after a reset. Occasionally, however, pointers to ROM can be usefully stored. They may point to constant strings. In other cases there may be a set of structures in ROM. Each structure represents a mode of operation. Storing a pointer to the structure represents the current mode of operation. You can avoid pointers by storing the structures in an array and simply storing the index into the array. In that case you are paying the price of an array access every time the structure is used.
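The index-based alternative might look like the following sketch. The table contents and the names mode_desc_t, mode_table and mode_from_index() are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per mode of operation; the table lives in ROM. */
typedef struct
{
   const char *name;
   int         max_speed;
} mode_desc_t;

static const mode_desc_t mode_table[] =
{
   { "idle",    0 },
   { "normal", 40 },
   { "boost",  60 },
};

#define MODE_COUNT (sizeof(mode_table) / sizeof(mode_table[0]))

/* Turn the index held in persistent store back into a pointer.
   An out-of-range index falls back to the first entry. */
const mode_desc_t *mode_from_index(uint8_t stored_index)
{
   if (stored_index >= MODE_COUNT)
   {
      stored_index = 0;
   }
   return &mode_table[stored_index];
}
```

The stored index survives a recompile because it does not depend on where the linker places the table. It does depend on the order of the entries, so reordering the table changes the meaning of the stored value.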
If you are one of the brave souls who decides to store pointers in persistent store, you should take note of the following warnings:
- Any recompilation may change the location in ROM of the objects that the pointers refer to, so the PSVN needs to be increased for every compile.
- The first time a new version of software is run, all of the pointer fields need to be set to defaults.
With this scheme, I have sometimes found it useful to set a flag during debugging to indicate that all persistent data needs to be set to defaults during start-up. This avoids the need to change the PSVN after every recompile. Such a flag may also be useful if your layout is constantly changing during development. Once things settle down, it is obviously important to test the system without this flag so you can observe proper persistence of the data through a reset or power cycle.
Dropping fields from non-volatile data
So far, we have considered what happens when a field is added. If a field is removed from a structure, the layout will also change. My usual policy is therefore to leave dropped fields in place and rename them dummy1, dummy2, and so on. By renaming them, you can be fairly sure that no other part of the program will use them, believing them to be valid. Since they have not been removed, the layout does not change. If space gets tight later on, there is no harm in reusing one of them, so long as the new field is exactly the same size and does not change the layout in any other way, for example, by causing padding to be inserted for alignment reasons.
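As an illustration of the renaming policy, the sketch below keeps a copy of the old definition purely so that the compiler can confirm that nothing has moved. It assumes a C11 compiler, and the field names are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Layout in the earlier release. */
typedef struct
{
   uint16_t speed;
   uint16_t fan_delay;     /* no longer needed in the next release */
   uint16_t alarm_limit;
} settings_v1_t;

/* Layout in the new release: the dropped field keeps its slot. */
typedef struct
{
   uint16_t speed;
   uint16_t dummy1;        /* was fan_delay; do not use */
   uint16_t alarm_limit;
} settings_v2_t;

/* The old definition is retained only to prove the layout is unchanged. */
static_assert(sizeof(settings_v2_t) == sizeof(settings_v1_t),
              "size changed when fan_delay was retired");
static_assert(offsetof(settings_v2_t, alarm_limit)
              == offsetof(settings_v1_t, alarm_limit),
              "alarm_limit moved when fan_delay was retired");
```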
Padding in non-volatile data
While on the subject of padding, it is worth considering that there may be padding between any of the fields that you insert into the structures. The compiler inserts it because, on some processors, certain types may only be located on word boundaries. For example, a char may follow another char with no unused bytes in between. However, a char followed by an int may have unused bytes inserted in between to ensure that the int starts on a word boundary. On some processors, the software can work with any alignment, but misaligned accesses carry a performance penalty. How much padding is required is a function of the processor and of the compiler used. In some cases, compiler flags can change the amount of padding.
Compiler flags may also change the size of certain types. Enumerated types can often fit in a single byte. Some compilers have a flag that can force all enumerated types to be 32 bits. Changing such a compile-time switch between releases of your software can change the layout. If you are forced to make such a change you may be able to compensate by adjusting the size of your padding array, or inserting a dummy field to consume space where the compiler used to insert padding. Ideally, such byte shuffling should be avoided.
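The amount of padding a particular compiler inserts can be measured with offsetof, and a structure can be made less sensitive to compiler settings by spelling the padding out as an explicit member. The sketch below shows both, assuming a C11 compiler and a target on which uint32_t needs at most 4-byte alignment. The structure names are illustrative.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* The compiler is free to insert unused bytes after flags. */
typedef struct
{
   uint8_t  flags;
   uint32_t count;
} implicit_pad_t;

/* The same data with the gap made explicit. */
typedef struct
{
   uint8_t  flags;
   uint8_t  reserved[3];   /* explicit padding, included in the CRC */
   uint32_t count;
} explicit_pad_t;

/* Pin the layout so a change of compiler flags is caught at build time. */
static_assert(offsetof(explicit_pad_t, count) == 4, "count has moved");
static_assert(sizeof(explicit_pad_t) == 8, "unexpected trailing padding");

/* Number of unused bytes the compiler placed between flags and count. */
size_t implicit_pad_bytes(void)
{
   return offsetof(implicit_pad_t, count) - sizeof(uint8_t);
}
```

Explicit padding has the further advantage that its contents are under the program's control, whereas the value of compiler-inserted padding bytes is not guaranteed.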
Storing C++ objects in non-volatile data
On a number of projects, I've considered the alluring proposition of placing C++ objects in persistent store. The placement new operator allows you to choose the location of an object, and to run a constructor which will use the exact location you specify [3]. This allows you to run a constructor on the space the first time it is used, or when reinitializing after a failed CRC.
My view of this technique is that you get very little return on investment. The objects in persistent store do not behave like ordinary objects. If you are using inheritance, the size of the objects changes, and that affects the layout of the objects. If you use virtual functions, a pointer to the virtual function table is stored with the object, and may therefore become invalid each time you recompile your software (just like the pointers discussed previously). Unlike the pointers discussed earlier, you may not be able to fix this pointer, since the compiler knows its value, but your program does not have direct access to it.
While it is no doubt possible to get your objects to live in persistent memory with some restrictions, I prefer the approach of placing simple structures in persistent store and then using an object living in normal RAM to wrap the persistent data. If the object in RAM contains a pointer to the structure within the persistent store, the user of that object will get all of the benefits of the data being preserved across power cycles, without the hassle of worrying about some of the more esoteric pieces of C++. If any readers have experimented with this, and found some real advantages to placing objects in persistent store on an embedded system, let me know.
User-friendly data upgrades
It is usually a good idea to indicate to the user that an upgrade of the persistent data has been performed with a message on the display. If the user performed the software upgrade, this will reassure him that it went smoothly. It will also make it more likely that problems will be spotted, for example, if a corrupted persistent store is falsely interpreted as being a valid, but older, set of data.
Saving some persistent data can hugely enhance the user experience, especially in underdeveloped locations such as California, where power outages can happen at any time. Just remember that persistent storage is the one place where bad data will not get fixed by cycling the power, so you have to take more care with persistent data than you do with any other.
Endnotes
1. Murphy, Niall. "How to Protect Non-Volatile Data."
2. Emshoff, Bill. Feedback to "Assertiveness Training for Programmers," in response to the article as originally published in Embedded Systems Programming, March, 2001.
3. Cline, Marshall. "What is 'placement new' and why would I use it?" C++ FAQ (part 4 of 10), Question 11.10, February, 2000.



