There are many sources of bugs in software programs. Some are created by the original programmer. Others result from misunderstandings by those who later maintain, extend, port, or reuse the original code. Both types of bugs can be kept out by following simple coding standard rules. To increase security and keep bugs out of medical devices and other safety-critical embedded systems, add these 10 bug-killing rules to your embedded C coding standard.
The remainder of this page is a transcript of a 1-hour webinar. A recording of the webinar is available at https://vimeo.com/98478085.
Slide 1: Top 10 Bug-Killing Coding Standard Rules
Good afternoon and thank you for attending Barr Group's webinar on the Top 10 Bug-Killing Coding Standard Rules. My name is Jennifer and I will be moderating today's webinar.
The presenters today will be Michael Barr, Chief Technical Officer at Barr Group, and Dan Smith, Principal Engineer at Barr Group.
The presentation will last for approximately 40 minutes, after which there will be a moderated question and answer session. Please hold your questions until the end, so the presentation can go on without interruption. Once the presentation ends, you will have time to submit questions for our presenters to answer for the group. If you think of a question during the presentation, write it down so you don't forget it later.
During the webinar, please close out background programs and turn off anything that might affect your audio feed as this webinar is being recorded. And, with that, I am pleased to present Michael Barr and Dan Smith's webinar the Top 10 Bug-Killing Coding Standard Rules.
Slide 2: Michael Barr, CTO
Thank you Jennifer. Welcome everyone. My name is Michael Barr. I am the Chief Technical Officer of the Barr Group and I am pleased to have you all here for this webinar.
My background, briefly, is as an Electrical Engineer and also a Software Developer. My focus, for more than 20 years, has been on the design of embedded software and in particular I have focused in my practice as a consultant and as a trainer in best practices for embedded software process and also for embedded software architecture.
I've worked across a range of different industries including some products that can kill the users such as medical devices and industrial controls. I’ve also been a teacher speaking at conferences around the world, speaking at private companies as well and also at the University of Maryland and at Johns Hopkins University where I have been an adjunct lecturer.
Some of you may know that for about three and a half years I was Editor-in-Chief of Embedded Systems Programming Magazine, which has now become embedded.com, but I was an editor back in the day when it was a print publication with about 60,000 subscribers, and later a columnist for the magazine. And I have also worked, in addition to doing engineering, consulting, and training, as an expert witness, for example working in a number of patent and copyright cases related to embedded systems software and also in the Toyota unintended acceleration litigation. Finally, I am the author of 3 books and more than 70 articles and papers, all of them about embedded software.
Slide 3: Book : Embedded C Coding Standard
One of those books is the Embedded C Coding Standard, which the slides for today's webinar are based in part upon. This is a book that we publish at the Barr Group; it was initially developed for our own internal use as our own internal coding standard as we engaged in our consulting business. And our engineers follow this coding standard, but it's also available for purchase at Amazon and on our own website in various different formats such as PDF and print, Kindle, etc.
In a nutshell the book contains the 10 bug-killing rules you are going to learn about here as well as other bug-killing rules and our own internal stylistic rules. Overall, it is designed to be complementary to the MISRA-C guidelines, meaning that you can use our coding standard in addition to those MISRA-C guidelines, which do not address style. Dan, who will teach the course today, will go into a little bit more detail about that.
Generally, the philosophy behind this coding standard, which Dan will also go into, is that bugs are expensive to find and kill once they are in the software. So, it's always cheaper and easier to keep a bug out and so the rules in our coding standard favor keeping bugs out from the beginning.
Slide 4: Barr Group - The Embedded Systems Experts
The Barr Group is a group of very talented engineers and what we do is we don’t make products of our own, but rather we help companies make their products safer and more secure. Our focus is always on the embedded software and the electronics that it runs upon and we do product development where we take on pieces of the project, sometimes the whole and we also do consulting providing engineering guidance to Vice Presidents and other engineering leaders. And also we do training for engineers and you can find out more about us at barrgroup.com. We serve a number of different industries and you can find a lot of good information there.
Slide 5: Upcoming Public Courses
I mentioned that we do trainings and we have some trainings coming up. These are public trainings that anyone can attend, one person or a small group from a company can attend and these are hands-on one week programming courses. The course titles, dates, and other details for upcoming public courses are available at the Training Calendar area of our website.
Slide 6: Overview of Today’s Webinar
Briefly, to summarize the goal of today's webinar, it's to make sure that everyone who is listening understands that there are a few simple rules you can follow that will reduce the number of bugs and the severity of bugs in your embedded systems, and ultimately what you are going to take away from this course are 10 specific rules. All of them are easy to follow, and following them will keep bugs out. So, these are good rules to include in any coding standard; whether you choose to adopt our coding standard is a separate issue. We hope that you will adopt these 10 rules in your coding standard, whatever that might be.
The only thing that we are assuming in terms of teaching this course is that you are already familiar with the syntax of C or C++ and that you are also familiar with embedded programming somewhat at least to the sense of understanding the constraints on embedded developers and the differences between embedded development and other types of software development.
Slide 7: Dan Smith, Principal Engineer
I would like to introduce Dan Smith, Principal Engineer here at the Barr Group, who is going to be taking over and teaching the course. Dan has a bachelor’s degree in Electrical Engineering focused in Computer Engineering from Princeton and more than 20 years of experience as an embedded systems designer.
He has designed electronics, he has designed software, he has worked in a number of different industries. You can see some of them listed here. Medical devices, transportation equipment, control systems etcetera. And also he has used many different technologies in doing that different real time operating systems, different CPU families and other computing platforms and tools.
Finally, through his role at the Barr Group, Dan is a seasoned engineer, instructor, he is a developer of our Embedded Security Boot Camp course and he will be teaching it again this fall and he also speaks at various industry conferences and he is a consultant working with a number of different clients.
Dan’s focus throughout his career has been on making embedded software and embedded systems that are reliable, safe and secure, and you can see we have provided his e-mail address here, should you have any follow-up questions after the course.
Now with that I turn it over to Dan and I hope you all will enjoy and get a lot of good information out of today's webinar. Thank you.
Slide 8: Why Adopt a Coding Standard?
Thanks Michael and thank you Jennifer as well. Before we get into the actual 10 Coding Standard Rules that we are going to recommend here let's talk briefly about why to adopt a coding standard. Certainly one of the reasons to adopt the coding standard is to improve the readability of the code, to make the code more consistent. Particularly, if you are working in a large development organization it's nice to have one style, one look and feel for the code. It makes it more readable, more maintainable, we can probably all agree on that.
We’ve probably also seen the justification for portability, in other words having a coding standard, which focuses on removing portability issues can make your code more portable down the road when the hardware changes etcetera. However, that said probably the best reason to have a coding standard is to keep bugs out of your system whether it's during the initial development process or during the maintenance process.
Each of these 10 Coding Standard Rules that we are going to talk about today is focused entirely on keeping bugs out of your system because we all know how difficult it is to debug code.
Slide 9: Where Bugs Come From
So as I just mentioned one of the primary reasons for having a good coding standard is to keep bugs out of your system. So, let's talk briefly about where bugs come from in the first place. We can all agree that there are bugs in compilers, there are silicon errata but by far and away the majority of bugs in a system are introduced by us the developers. And let's not forget that most of the time somebody else is probably going to be the person who takes over and maintains our code.
When we are writing code we are making assumptions about the underlying platform, about future developers and their capabilities, etcetera. All sorts of things we are not even really aware of, and these assumptions are often undocumented. So there is an opportunity for disconnect between the original programmer and how he or she wrote the code and the person who comes along next who has to maintain the code.
We strongly encourage you, any time you find yourself making assumptions, to document these very deliberately in comments in the code. That’s not one of the 10 rules we are going to talk about today, but we want to mention it because this can help overcome the disconnect that can happen between the original developer and the next person who comes along to change the code.
Slide 10: MISRA–C Guidelines
No talk on an embedded C coding standard would be complete without mentioning the MISRA-C coding guidelines. The MISRA-C guidelines came out of the automotive industry, precipitated by the need to move from assembly language to C. Let's face it, we all love the C programming language, but it is a dangerous language indeed.
MISRA-C has become the standard for embedded C programming in essentially all safety-related industries such as medical devices, industrial controls, avionics, you name it. I am sure there are people listening today who use the MISRA guidelines on a daily basis.
MISRA-C defines a subset of the C programming language that’s designed to increase the safety and reliability of code. Now, the MISRA guidelines are not a coding standard per se; it's a set of rules. For each rule a rationale is provided, and this is really important because as long as a rationale is provided the rule is more likely to be followed. The engineers are given a reason for it, so they feel that the rule is not arbitrary or dictated, but they actually understand why the rule is in place. So, even if you are not in automotive, even if you are not doing anything that’s safety critical, the MISRA rules can still benefit your software project. So, take a look at the rules and adopt and incorporate those that make sense for your project.
Slide 11: Coding Standard Enforcement
At Barr Group we work with a lot of companies and a lot of different code bases, and probably about half the companies and projects we work with have a coding standard. However, when we get the code and we look at the coding standard the two don’t seem to work together. In other words the code often does not seem to resemble the coding standard, and we ask ourselves, well, how does this happen, what's going on here? Generally it comes down to enforcement. So, good coding rules have to be enforceable and we found that objective rules tend to be more enforceable.
By objective I mean rules that are not subject to interpretation, rules that can be enforced automatically by a tool for example.
Lastly, the coding standard has to be embraced and adopted by every member of the team. There can’t be exceptions for this developer or that developer. In fact, any deviations from the coding standard need to be documented and they should also be very rare.
Slide 12: Rule #1 –Always Use Braces
Okay, on to our first rule, rule number 1, which states to always use braces. There are certain key words in the C programming language such as for, while, if, else etcetera; which are followed by a statement to be executed. The way the language is designed, this can either be a single statement such as an assignment or a function call or what have you or it could be a compound statement, which is one or more statements surrounded by braces.
The way the language is designed, if you only have a single statement or even an empty statement following an if, while, else, etcetera, the language does not require you to use braces. So many developers don’t use braces for a single statement, but almost always this is a disaster waiting to happen. First of all, lack of braces can make it more confusing for others who come along to maintain your code. This is somewhat subjective, so if that’s not enough to convince you to always use braces I will give a couple more reasons that deal directly with code maintenance, so let's go look at some code.
Slide 13: Always–Braces Example Code
Okay, so here we have code. I think we can all agree that the way this code reads, when foo == 5 we are going to call bar, so we are going to call bar conditionally, and then we are going to call the function always run all the time regardless of the value of foo.
Now let's say it's late at night you are debugging and you need to make a change and you are thinking that call to bar is causing a problem. So, just temporarily you want to comment out the call to bar. So you go ahead and put 2 forward slashes in front of it and you run your code and you are testing now.
Well, what's going to happen is that now always run is going to be called not all the time but conditionally based on the value of foo, only when foo == 5. This is almost certainly not what you wanted; you just wanted to comment out the call to bar, but you did not want to affect how and when always run is called. So that’s one little surprise you are going to run into because there were not braces surrounding the call to bar.
Now let's approach it from the other side where we want to add in code. So, now when foo == 5 we want to call not only bar, but we also want to call function fred. Well, again because we didn’t put the call to bar inside braces, what's actually going to happen the way this code reads is that when foo == 5 we are going to call bar, but we are actually going to call fred unconditionally, all the time, just like always run, in spite of how the code is formatted.
So we can see just by surrounding the call to bar in braces, it makes the code easier to maintain whether you are removing code or adding code and whether it's you maintaining the code or somebody else who comes along later.
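The slide's code is not reproduced in this transcript, so the following is only a rough sketch of the two maintenance hazards just described. The names foo, bar, fred and always_run follow the spoken description; the function bodies and the call counters are assumptions added so the behavior can be observed.

```c
/* Call counters so the effect of each variant can be observed (illustrative only). */
static int bar_calls = 0;
static int fred_calls = 0;
static int always_run_calls = 0;

static void bar(void)        { bar_calls++; }
static void fred(void)       { fred_calls++; }
static void always_run(void) { always_run_calls++; }

/* Variant 1: bar() commented out. always_run() silently becomes the body
   of the if, so it now runs only when foo == 5. */
void after_comment_out(int foo)
{
    if (foo == 5)
        // bar();
    always_run();
}

/* Variant 2: fred() added. Despite the indentation, only bar() is
   conditional; fred() runs on every call. */
void after_add_fred(int foo)
{
    if (foo == 5)
        bar();
        fred();
    always_run();
}
```

Some compilers can warn about the second variant (GCC, for example, has a misleading-indentation warning), but the first variant typically compiles silently.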
Slide 14: Always–Braces Example Code
So here we have two examples of using the braces even when it's not strictly necessary. Here we have a single function call to bar inside the conditional. This allows us to add code or remove code inside the braces and the behavior will be exactly what we expect. And the second example here there is effectively a null statement, an empty statement that gets executed while the timer is not expired, but that’s again exactly what we want and we have a nice expressive comment there, so that the next person who comes along knows exactly what we are doing in this code.
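A minimal sketch of what those two braced forms might look like is shown here. The timer_expired() helper and its countdown are hypothetical stand-ins for real timer hardware, added only so the loop terminates when run on a PC.

```c
#include <stdbool.h>

static int ticks_left = 3;   /* hypothetical stand-in for a hardware timer */
static int bar_calls = 0;

static void bar(void) { bar_calls++; }

/* Hypothetical helper: counts down and reports true once the count reaches zero. */
static bool timer_expired(void)
{
    if (ticks_left > 0)
    {
        ticks_left--;
    }
    return (ticks_left == 0);
}

void run(int foo)
{
    /* Braces around a single statement: lines can be added or removed safely. */
    if (foo == 5)
    {
        bar();
    }

    /* Braces around an intentionally empty body, with a comment saying so. */
    while (!timer_expired())
    {
        /* Do nothing: busy-wait until the timer expires. */
    }
}
```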
Slide 15: Rule #2 – Whenever Possible ‘const‘
Okay, on to our second rule, rule number 2, which states to use the const keyword whenever possible, as much as possible. Here on this slide we list four different opportunities for using it. There are certainly other cases where you would want to use it as well. What I want to mention, however, is that for embedded programmers const is particularly important because any object that is const, anything that is not going to be modified at runtime, could be placed by your development toolset into your ROM section, and especially if you are working on a resource-constrained system on a small microcontroller, typically your ROM, your flash, is a lot more plentiful than your RAM.
Another thing I want to point out by using const is that it catches problems at compile time. So any attempt to write to a field of a structure or an object that’s marked as const will be caught at compile time as opposed to at runtime. And I think we all know anything you can catch at compile time, at build time instead of a run time that’s a win.
Slide 16: Maximize-const Example Code
So let's look at a few examples very quickly. So, here we have a couple of objects. There is gp_model_name, which is a pointer to a constant char; it's really a pointer to a string, and there is no reason for this string to be modified at runtime, since this is the product's name. Similarly, here we have a build number; it's defined as a constant integer. It's initialized at build time and it should not be modified at runtime.
The next example function parameters that must not be modified. So here we have strncpy, a function many of you are familiar with. We want to copy from the source to the destination. So, obviously there is no reason that anybody should be writing to the region that source is pointing to. So it's marked as a pointer to a constant character.
Lastly, in the final example, we define an object called heap size of type size_t and initialize it to 8192; presumably that’s going to be the size of our memory heap that we are going to use at runtime. So, by declaring it as a constant object of type size_t, we have type safety, and we are not using a preprocessor #define macro.
There are some cases where you still have to use a #define preprocessor macro, for example a case label in a switch statement or the size of an array. In C++ it's a little bit different, but in C you can’t get away with using a constant object for that; you still need to use the preprocessor macro.
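The const uses described above might be sketched as follows. Only gp_model_name, size_t, the value 8192 and strncpy are named in the talk; the product name, g_build_number, HEAP_SIZE, copy_n and CMD_RESET are hypothetical names and values chosen for illustration.

```c
#include <stddef.h>

/* Pointer to constant characters: the product name is never modified at run time. */
const char *gp_model_name = "Acme 9000";   /* hypothetical product name */

/* Constant integer fixed at build time (name and value are assumptions). */
const int g_build_number = 1234;

/* Typed constant instead of a #define, so the compiler knows its type. */
const size_t HEAP_SIZE = 8192;

/* Read-only source parameter, in the style of strncpy(): any write through
   p_src inside this function is rejected at compile time. */
char *copy_n(char *p_dest, const char *p_src, size_t n)
{
    size_t i = 0;

    for (; (i < n) && (p_src[i] != '\0'); i++)
    {
        p_dest[i] = p_src[i];
    }
    for (; i < n; i++)
    {
        p_dest[i] = '\0';
    }
    return p_dest;
}

/* A case label needs an integer constant expression in C, so a macro is still used here. */
#define CMD_RESET 1

int is_reset(int cmd)
{
    switch (cmd)
    {
        case CMD_RESET:
            return 1;
        default:
            return 0;
    }
}
```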
Slide 17: Rule #3 Whenever Possible 'static’
Okay, moving on rule number 3, which states whenever possible use the static keyword. So, the static keyword should be used to declare all functions and variables or objects that don’t need to be visible or accessible outside of the module the source file, the translation unit in which they are declared. The whole point of this is encapsulation. This is all about localizing data and functions to provide better encapsulation. By preventing modules from accessing data or functions that they shouldn’t have access to, you make the software more maintainable and less brittle.
For example, you might have a timer driver, timer.c or something like that. You don’t necessarily want all the variables and all the functions in that module to be accessible from the outside world. The nice thing about static is that it's enforced by the toolset during a build. So, if somebody tries to reference something that they don’t have access to your build will fail.
Again the nice thing about this, just like the const keyword, is that you will fail at build time as opposed to at runtime. The upshot of this is that it makes your code more maintainable by limiting visibility and access from the outside world.
Slide 18: Maximize–static Example Code
Okay, so let's look at some code here. Here we have a file timer.c the first thing we are going to do is include our associated header file timer.h many of you probably implement your code like this. So timer.h is what we will expose the public aspects of this file. So not everything in this file, not every function and not every object is going to be exposed to the outside world.
So here we have a variable called g_next_timeout. It's a uint32_t, but it's declared as static, and what that means is that no one outside of timer.c will be able to see this object. So, this is an implementation detail in your timer driver that no one except for the actual implementation needs to access.
We also have a function called add_timer_to_active_list, and this is declared as static. So this is a helper function, a function that the outside world cannot call; it is not part of the API. Perhaps the API is something like start timer and then this helper function is used as part of the implementation, so perhaps we have a linked list of timers. There is no reason that the outside world needs to care about the implementation. Therefore, we make this function static, preventing anybody on the outside from ever calling this function directly.
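A timer.c along the lines described might be sketched like this. The names g_next_timeout and add_timer_to_active_list come from the talk; timer_start, timer_next_timeout and the "track the soonest timeout" logic are assumptions, and the public prototypes that would normally live in timer.h are inlined so the sketch stands on its own.

```c
#include <stdint.h>

/* Normally declared in timer.h; inlined here so the sketch is self-contained. */
void timer_start(uint32_t timeout_ms);   /* hypothetical public API */
uint32_t timer_next_timeout(void);       /* hypothetical accessor */

/* Private to this file: not visible to any other translation unit. */
static uint32_t g_next_timeout = 0;

/* Private helper, not part of the API. A real driver might insert into a
   linked list; this sketch only tracks the soonest timeout. */
static void add_timer_to_active_list(uint32_t timeout_ms)
{
    if ((g_next_timeout == 0) || (timeout_ms < g_next_timeout))
    {
        g_next_timeout = timeout_ms;
    }
}

void timer_start(uint32_t timeout_ms)
{
    add_timer_to_active_list(timeout_ms);
}

uint32_t timer_next_timeout(void)
{
    return g_next_timeout;
}
```

Another source file that tried to call add_timer_to_active_list() or read g_next_timeout directly would fail to link, which is the build-time enforcement mentioned above.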
Slide 19: Rule #4 Whenever Necessary ‘volatile’
Okay, on to rule number 4, which states whenever necessary use the volatile keyword. There are a few different use cases for the volatile keyword. We will go through 3 of these right here, right now.
The first is when you have a global variable or an object that’s shared between an interrupt service routine and non-interrupt code, so for example if you have a foreground/background architecture where you have interrupts running in the foreground communicating by writing things to variables that are swept up by your background that’s a perfect example of where those objects, those variables need to be declared as volatile.
Second example is if you have a variable where 2 or more tasks are communicating by sharing that variable again not the way you generally want to write your code, but if you do have such a model those variables, those objects need to be declared as volatile.
The third example and the example that’s probably most familiar to those of us who know the volatile keyword is when you have hardware. So you have got memory-mapped I/O and you are writing to that hardware through a structure overlay or a pointer that needs to be declared as volatile. Otherwise your compiler and your optimizer are going to play some games and it might actually cause your code not to work. Just a quick interesting side note here, it's our experience that only about half of embedded developers even know about or understand the volatile keyword.
Slide 20: Optimization: Redundant Reads
Let's look at a code example. Here we have a hardware peripheral, a timer, that’s memory-mapped and it has not been declared as volatile. You guys have all seen code like this before. First we reset the count to zero, we start the timer and then we wait for it to count up to 100. Remember I said that the timer peripheral has not been declared as volatile. So the compiler sees count written to zero and then it never sees it changed again.
So as far as the compiler is concerned timer.count will always be zero. The compiler is doing its very best to remove code, to make your code smaller and faster, so what the compiler is going to do is say, “Hey, in that while loop I don’t need to keep reading from timer.count; it's always going to be zero. So that while zero is less than 100, I am just going to turn that into a while 1.” It's a win-win: smaller code, faster execution time. The problem is it's actually not going to go read your timer peripheral each time through that while loop, so it's going to get stuck there forever.
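One way the corrected polling code might look is sketched here. The register layout, the TIMER_START bit and the function names are assumptions; on a real target the pointer would refer to the peripheral's fixed address from the datasheet rather than being passed in.

```c
#include <stdint.h>

/* Hypothetical register layout for a memory-mapped timer. */
typedef struct
{
    uint16_t count;     /* advanced by hardware while the timer runs */
    uint16_t control;   /* bit 0 starts the timer (assumed) */
} timer_reg_t;

#define TIMER_START 0x0001U

/* Reset the count and start the timer. */
void timer_restart(volatile timer_reg_t *p_timer)
{
    p_timer->count = 0;
    p_timer->control |= TIMER_START;
}

/* Wait until the count reaches target. Because the access goes through a
   pointer to volatile, count is re-read on every pass through the loop. */
void timer_wait_until(volatile timer_reg_t const *p_timer, uint16_t target)
{
    while (p_timer->count < target)
    {
        /* Busy-wait for the hardware to advance the count. */
    }
}
```

Without the volatile qualifier, an optimizer would be free to read count once and turn the loop into the infinite loop described above.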
Slide 21: Optimization: Unnecessary Writes
Now let's look at another example of hardware not being declared as volatile. Except this time, instead of reading from the hardware, we are going to be writing to the hardware. So here we have a hardware peripheral called led_reg, and presumably the way this works is that setting a bit low in the register turns on the corresponding LED. So, in this first line of code presumably we’re turning on the rightmost LED, indicating a patient is dying. Then we have a bunch of other code that does not read from the LED register, and then we have a line of code that says LED register = 0xFF.
So again what the compiler is going to do, in its very best effort to do a good job for you and to make your code as small and fast as possible, is say, “He wrote 0xFE up there and then he never read from it and then he wrote 0xFF, so I can just go ahead and do that final write of 0xFF and no one is going to be the wiser, and I save CPU cycles and code space.” Awesome, it's a win-win, right?
So we are never going to see that LED turn on indicating that the patient is dying. That write operation was optimized away by the compiler, and it had every right to do that because the LED register was not declared as volatile.
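A sketch of the fixed version, with the register accessed through a pointer to volatile, might look like this. The variable g_fake_led_reg is an assumption that stands in for the real register so the code can run on a PC; on a target, p_led_reg would hold the register's fixed address. The function names are also hypothetical.

```c
#include <stdint.h>

/* PC stand-in for the LED register (assumption); all bits high means all LEDs off. */
static volatile uint8_t g_fake_led_reg = 0xFFU;

/* Constant pointer to a volatile 8-bit register. */
static volatile uint8_t * const p_led_reg = &g_fake_led_reg;

/* Turn on the rightmost LED and leave it on (active-low, as described). */
void alarm_on(void)
{
    *p_led_reg = 0xFEU;
}

/* Turn the alarm LED on, do other work, then turn all LEDs off. */
void alarm_pulse(void)
{
    *p_led_reg = 0xFEU;   /* first write: must reach the hardware */

    /* ... other code that never reads the LED register ... */

    *p_led_reg = 0xFFU;   /* second write: volatile means the first is not dropped */
}
```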
Slide 22: More on ‘volatile’
So, just a few more words on the volatile keyword here; so often times what happens is as you are developing code and you are getting near the end of a project you begin to run out of memory or you begin to run out of “CPU cycles”. So what's the first thing you often do, you turn on the optimizer to try to get back some memory, to try to get back some CPU cycles.
What often happens is, once the optimizer is enabled the code stops working. Most developers will immediately say, “Ah, I just found a bug in the compiler's optimizer,” but very often that’s not the case. Typically it's a missing volatile keyword either on your access to hardware or to a shared variable. So, the first thing we recommend you do when you turn on the optimizer and something breaks is look for places where volatile is missing and should be used. The examples I showed you a couple of slides ago are our recommendations for when to use volatile.
Another thing is that the volatile keyword is essentially unused outside of embedded software. So this is a very good question to ask when you are interviewing people. One other notable use of the volatile keyword that I will point out is when you are doing security-related functions and you have sensitive data in memory, such as plain text that needs to be encrypted, or encryption keys, or things like that. What often happens is that these are local, and when you are done with them you want to wipe them, you want to overwrite them with zeros, because as we all know, when you return from a function that information is still sitting there on the stack even though it's not supposed to be accessed.
Well, say you go to wipe these keys or this plain text on your stack at the end of a routine. The compiler might say, “After this person overwrites this with zeros no one ever reads back from this, so I am just going to optimize out that wiping, that overwriting,” not understanding that this is security-sensitive information. By marking those objects as volatile you are guaranteed that any write operations, specifically the wiping, are actually going to take place.
Slide 23: Volatile Usage – Example Code
So let's just look at a few examples where we are using the volatile keyword in the way it should be used. So in the first example here we have a global variable called g_state, and again we all know that global variables are not good and we don’t want to use them, but in this example we have one. We are initializing it to SYSTEM_STARTUP and it's marked as volatile, and that means in any task or in any thread or in any interrupt, any access to g_state, any read or any write, will be performed; it will not be optimized away.
In the next example we have a structure overlay called my_fpga_t and we are declaring a pointer called p_timer, and it's a constant pointer to a volatile my_fpga object. So the pointer itself isn’t going to change what it points to, but what it points to is volatile: a piece of hardware, presumably over in an FPGA. It's very important that we mark hardware accesses as volatile.
And then lastly we have an example of an array presumably local called plaintext, size MAX_PLAINTEXT and it's marked as volatile that means any wiping that we are going to do at the end of the routine where this is declared will actually be performed.
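The three declarations just described might be sketched as follows. The names g_state, SYSTEM_STARTUP, my_fpga_t, p_timer, plaintext and MAX_PLAINTEXT come from the talk; the enum's other member, the struct fields, the g_fake_fpga stand-in, the buffer size, and the checksum (used here in place of real encryption) are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { SYSTEM_STARTUP, SYSTEM_RUNNING } state_t;   /* SYSTEM_RUNNING is hypothetical */

/* Shared between tasks or ISRs: every read and write is actually performed. */
volatile state_t g_state = SYSTEM_STARTUP;

/* Hypothetical FPGA register overlay. */
typedef struct
{
    uint16_t count;
    uint16_t control;
} my_fpga_t;

static my_fpga_t g_fake_fpga;   /* PC stand-in for the memory-mapped device (assumption) */

/* Constant pointer to volatile hardware: the pointer is fixed, the target may change. */
volatile my_fpga_t * const p_timer = &g_fake_fpga;

#define MAX_PLAINTEXT 16U   /* assumed size */

/* Copies a message into a local buffer, sums it (a placeholder for real
   processing), then wipes the buffer before returning. */
uint8_t checksum_and_wipe(const uint8_t *p_msg, size_t len)
{
    volatile uint8_t plaintext[MAX_PLAINTEXT];
    uint8_t sum = 0;

    if (len > MAX_PLAINTEXT)
    {
        len = MAX_PLAINTEXT;
    }

    for (size_t i = 0; i < len; i++)
    {
        plaintext[i] = p_msg[i];
        sum = (uint8_t)(sum + plaintext[i]);
    }

    /* Because plaintext is volatile, these wiping writes cannot be optimized away. */
    for (size_t i = 0; i < MAX_PLAINTEXT; i++)
    {
        plaintext[i] = 0U;
    }

    return sum;
}
```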
Slide 24: Rule #5 Don’t Disable Code with Comments
On to rule number 5 which says, “don’t disable code by commenting it out.” What we are saying here is if you need to disable code, for example while you are testing or debugging, use the preprocessor's conditional compilation feature. The thing is nested comments are not part of the C standard. The preprocessor’s conditional compilation functionality is designed to nest properly, so that is going to work on any toolset, on any compiler, on any platform, whereas with nested comments you might move the code to a different platform and it's not going to work anymore, and that’s because the support is not part of the C standard.
If you have to make experimental code changes, make a branch in your version control system; that’s what it's there for. Don’t leave commented-out code there in the code base for the next person to come along; that’s dead code. Remember, commented-out code is not compiled, so it could be that that code has been there for 10 years and it wouldn’t even compile anymore. Maybe the dependencies have changed, maybe it was even for a different hardware platform, but the next person who comes along is going to look at that code, scratch his or her head and say, “Hmm, I wonder if this is important.” The longer that code lingers in your system the more it rots, the more it becomes out of date and the more potential it has to cause problems. So get rid of that code. Keep it in your version control, but get rid of it so that it's not polluting the minds and code base of everyone who is working on it.
Slide 25: Commented–out Code Example
So let's look at some code that’s been commented out here in the DON’T section here. Originally we have 3 lines of code incrementing A and then a comment and then incrementing B. Now someone comes along and they just want to comment out or disable this big block of code. So they wrap this thing in your classic C comment with a /* and then after at the end of all that code that they want to disable they put a */ thinking that they are effectively disabling all of that code.
Well, because you have got this nested comment in here it's very possible that the end of the comment where we see nested comment, that */, is actually going to end the comment that was started where we see outer comment in red. And so we are going to comment out the a = a + 1, but the b = b + 1 is not going to be commented out, and of course that last red */ is probably going to give you a compiler error. So, this is code that will work on one compiler, but if you go to a different compiler it might behave differently.
So our recommendation again is to use the preprocessor's #if 0 to temporarily disable code. As I mentioned before, please don’t check in code like this to your version control. This is only for while you are developing and debugging; what you eventually check in should not have this in there at all. It's going to be confusing to people and it's another opportunity for code to be unintentionally reactivated by someone with good intentions who just doesn’t understand that that code needs to be disabled. Remove it from your source code, check it into your version control, and you are good to go.
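A small sketch of the recommended form is shown here. The variables a and b follow the slide description; the update() function and the extra statement after the disabled block are assumptions added so the effect is visible.

```c
int a = 0;
int b = 0;

void update(void)
{
#if 0   /* Temporarily disabled while debugging; remove before check-in. */
    a = a + 1;
    /* An ordinary comment inside the disabled block causes no trouble. */
    b = b + 1;
#endif
    a = a + 10;   /* hypothetical code that remains active */
}
```

Everything between #if 0 and #endif is skipped by the preprocessor, so existing comments inside the block do not interfere the way they would with an enclosing /* ... */ pair.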
Slide 26: Rule #6 Fixed-width Data Types
Okay, rule number 6, which is about the usage of fixed-width data types. So in embedded programming sometimes you really need to specify the specific width of an object. A perfect example is when you are doing memory-mapped I/O and you have a structure overlay and certain registers in your hardware are 16 bits, some are 8 bits, some might be 32 bits. This is defined by your hardware, and when you are defining a structure that you are going to overlay you need to make sure that the data types that you are using match up exactly against the hardware registers.
And another reason for using fixed-width data types: sometimes you know the size and the range of the objects that you are dealing with, and if you use a type such as int that might work on the current platform you are using, but then if you port that code to a different platform where an integer is no longer 16 bits, perhaps now it's 32 bits, that code is going to break. So our recommendation is to use C99’s fixed-width data types for signed and unsigned values, and I am going to show you on the next slide what those are. And one of the upshots of this is that in general you are not going to want to use things like int, long, short, char, etcetera, because these are not portable. The width of these types varies from platform to platform.
Slide 27: Recommended Fixed-width Type Names
So let's look at the types that are provided for you in C99. The header file you are going to want to include, stdint.h, is shown here for you. Notice that we have both signed and unsigned types, as small as 8 bits and as large as 64 bits. Notice also that this is independent of the underlying architecture, so whether you are using a little 8-bit microcontroller or a 32- or 64-bit microcontroller, all of these types are available to you. How they are implemented on your processor platform, that’s a different story.
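To illustrate the structure overlay mentioned under rule 6, here is a small sketch using the stdint.h types. The register map, the names timer_regs_t and TIMER_CTRL_ENABLE, and both functions are hypothetical and invented for illustration; a real layout comes from the hardware's data sheet. Because padding is chosen by the compiler, the sketch also checks that the struct offsets match the assumed layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register map for an imaginary timer peripheral. */
typedef struct
{
    uint32_t count;     /* offset 0x00: 32-bit free-running count */
    uint16_t reload;    /* offset 0x04: 16-bit reload value       */
    uint8_t  control;   /* offset 0x06: 8-bit control register    */
    uint8_t  status;    /* offset 0x07: 8-bit status register     */
} timer_regs_t;

#define TIMER_CTRL_ENABLE  (0x01U)   /* Assumed bit position. */

/* Returns 1 if the compiler laid the struct out like the assumed hardware. */
int timer_layout_matches(void)
{
    return (sizeof(timer_regs_t) == 8U)
        && (offsetof(timer_regs_t, reload)  == 4U)
        && (offsetof(timer_regs_t, control) == 6U)
        && (offsetof(timer_regs_t, status)  == 7U);
}

/* Writes the reload value, then sets the enable bit. */
void timer_start(volatile timer_regs_t * p_timer, uint16_t reload)
{
    p_timer->reload  = reload;
    p_timer->control = (uint8_t)(p_timer->control | TIMER_CTRL_ENABLE);
}
```

On real hardware p_timer would point at the peripheral's base address; for a desk check it can point at an ordinary timer_regs_t variable instead.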
If you are unfortunate enough to not have a C99 development toolset, what we advise is that you create a header file that uses typedefs to create the same types, so that your code looks as if it's written for C99. Then when you move over to a C99 development platform, your code will still work and run just as it did before.
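One possible shape for such a fallback header is sketched below. The file name and include guard are hypothetical, and the typedefs in the pre-C99 branch are an assumption about one particular platform (8-bit char, 16-bit short, 32-bit long); they would have to be adjusted for each compiler. The negative-array-size checks make the build fail if an assumed width is wrong.

```c
/* fixed_width.h (hypothetical file name) */
#ifndef FIXED_WIDTH_H
#define FIXED_WIDTH_H

#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
#include <stdint.h>             /* C99 or later: use the standard header. */
#else
/* ASSUMPTION: char is 8 bits, short is 16 bits, long is 32 bits. */
typedef signed char     int8_t;
typedef unsigned char   uint8_t;
typedef signed short    int16_t;
typedef unsigned short  uint16_t;
typedef signed long     int32_t;
typedef unsigned long   uint32_t;
#endif

/* Compile-time checks: a wrong width produces a negative array size. */
typedef char check_int8_width [(sizeof(int8_t)  == 1U) ? 1 : -1];
typedef char check_int16_width[(sizeof(int16_t) == 2U) ? 1 : -1];
typedef char check_int32_width[(sizeof(int32_t) == 4U) ? 1 : -1];

#endif /* FIXED_WIDTH_H */
```

Code that includes this header can use int8_t through uint32_t on either kind of toolchain, and the fallback branch drops out once a C99 compiler is in use.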
Slide 28: Rule #7 – Bit-wise Operators
Rule number 7 basically all comes down to this advice: “don’t use bit-wise operators on signed data.” A corollary to this is that anytime you have an integer literal, a decimal constant, and you want it to be treated as an unsigned value, put the U suffix at the end. Most people rely on the fact that their underlying architecture is probably a 2’s complement architecture, but the C standard does not specify that and there is no guarantee of it. Bit-wise operations on signed values rely on those assumptions, and that’s why our rule does not allow bit-wise operators on signed integers or signed data. Let me show you some examples and you will see the kind of trouble you can get into.
Slide 29: No Bit-wise Signed Example Code
So here in the first example we have something called signed data. It’s a signed 8-bit value that is initialized to -4, no problem with that. Then in the next line, where he or she is shifting signed data right by 1, the developer is almost certainly trying to divide it by 2. But you don’t need to try to outsmart the compiler and tell it to shift. Just tell it to divide by 2 and it will figure out the best way to do that. A right shift of a negative signed value is implementation-defined, so what happens is not necessarily going to be portable across different architectures. You have no guarantee that the right shift is actually going to divide the value by two.
In the second example we have a 32-bit signed value. We are initializing it to -100 and then we are left shifting it by 1. Presumably the intention is to multiply it by 2, but this is in fact undefined behavior, and all bets are off as soon as you venture into undefined behavior.
In the last example, here we have an 8-bit unsigned value, we are calling it max_unsigned, and it's ~ 0. Well, we all know what 0 looks like; ~ 0 is going to flip all the bits and that’s going to give us the maximum unsigned value, which is 255 because it's 8 bits. Now someone comes along and wants to do the same thing for an 8-bit signed value, so they do ~ 0 as well. Well, that’s actually not going to represent the maximum signed value, which would be +127; it's actually going to give you -1, because that is what all bits set means in 2’s complement notation. So these are all examples of how performing a bit-wise operation on signed data can burn you.
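The three examples above could be rewritten without bit-wise operators on signed data roughly as follows. This is a sketch, not the code from the slide: the function names are hypothetical, and twice() assumes the caller keeps the value small enough that doubling does not overflow.

```c
#include <stdint.h>

/* Divide instead of right-shifting a signed value. */
int8_t halve(int8_t value)
{
    return (int8_t)(value / 2);
}

/* Multiply instead of left-shifting a signed value.
 * ASSUMPTION: the caller guarantees the result fits in an int32_t. */
int32_t twice(int32_t value)
{
    return value * 2;
}

/* Bit-wise NOT applied to an unsigned constant (note the U suffix). */
uint8_t max_unsigned_8(void)
{
    return (uint8_t)~0U;
}

/* For the signed maximum, use the limit macro from stdint.h. */
int8_t max_signed_8(void)
{
    return INT8_MAX;
}
```

With these, halve(-4) is -2, twice(-100) is -200, max_unsigned_8() is 255, and max_signed_8() is 127, on any conforming C99 implementation.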
Slide 30: Rule #8 – Don’t Mix Signed & Unsigned
This next rule, rule number 8, is one of my favorites. It's one of my favorites because this is a rule that I find is eye-opening even for experienced developers. The rule is to not mix signed and unsigned values in a comparison or expression. The reason is that C has very, very complex integer conversion rules, part of which are what's called the integer promotion rules, and mixing signed and unsigned values in the same expression or comparison can lead to very unexpected behavior. And the problem with this is that things can work for a long period of time, because you have never actually hit the combination of values where you are going to see it.
Sometimes these bugs can even escape to the field: they pass all the tests, they get out to the field, and then something very unexpected happens. So let me show you some code and how this actually works; it might be eye-opening for a lot of you.
Slide 31: No Mix – Signed Example Code
So here is a very simple example that illustrates exactly the kind of hazard that you can run into when you mix signed and unsigned in the same expression. We have an int S initialized to -9; we have an unsigned int U initialized to 6. Now if you ask any 3rd grader what is -9 + 6, that child is going to tell you it's -3. So we would expect, because -3 is less than 4, that we would go into the if clause there. But what actually happens is we are going to go into the else clause, and that’s because S, the signed integer, is converted to an unsigned int before it's added to U. So the sum is not -3 at all; it is a very large unsigned number, which is certainly not less than 4.
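The behavior described above can be reproduced with a short sketch like the one below. The function names are hypothetical; the values -9, 6, and 4 are the ones from the talk.

```c
#include <stdbool.h>

/* DON'T: s is converted to unsigned int before the addition,
 * so for s = -9 and u = 6 the sum is a huge unsigned value. */
bool mixed_sum_below_4(int s, unsigned int u)
{
    return (s + u) < 4;
}

/* DO: keep both operands signed so the arithmetic is ordinary. */
bool signed_sum_below_4(int s, int u)
{
    return (s + u) < 4;
}
```

Here mixed_sum_below_4(-9, 6U) returns false, which corresponds to taking the else branch, while signed_sum_below_4(-9, 6) returns true as a 3rd grader would expect.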
Slide 32: Rule #9 Favor Inline Over Macros
Okay, our second-to-last rule, rule number 9, states to favor inline functions over macros. A lot of people don’t realize that as of C99 inline functions are part of the C standard. Now it's important to understand, as we are noting here, that the inline keyword is not a command to the compiler. It's more of a hint or a suggestion; just because you put inline doesn’t mean that the compiler is actually going to inline the function. And something people are often not aware of is that if you turn on the optimizer, the compiler might also inline functions that do not have the inline keyword. So, let's take a look at some code and we will show you how this can actually save you from getting into trouble.
Slide 33: Inline vs. Macros Example Code
So here we have an example. We have a macro called square that takes a single parameter A. Notice that there is no type for A, because square is a macro, not a function. Macros have no concept of type checking; macros are handled by your preprocessor, not by your compiler. Presumably the intention is that whenever you use this square macro it expands into an expression that is the square of the parameter that’s passed. But now let's say you initialize int I = 5 and then you call square with ++I.
Well, you can look here at the macro expansion and you can guess pretty quickly what's going to happen. I is going to be pre-incremented twice. So, assuming I was initialized to 5 before we called square, instead of the value you would expect, which is 36, 6 times 6, you are likely to get 6 times 7, because that pre-increment operator, that side effect, is going to be evaluated twice. Strictly speaking, modifying I twice in one expression like this is undefined behavior, so even that result is not guaranteed.
Now let's look at a similar implementation using inline functions. Here we now have type checking: the input parameter is a uint16, and obviously for squaring it we need the return value to be a uint32. So if someone is calling square with a value that won't fit inside a uint16, your compiler is going to give you a warning, or at least your static analysis tool will.
Similarly, if you call this inline function square with ++I when I is set to 5, you are going to invoke the function, it's going to be passed the value 6, and it's going to return 6 times 6, which is 36, exactly what you expect. So this illustrates the fact that inline functions give you type safety and safety from side effects.
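The double evaluation can be demonstrated with the sketch below. It departs from the slide in one respect: the side effect comes from a hypothetical helper function, next_value(), rather than from ++I, because applying ++ twice to the same variable in one expression is undefined behavior in C, whereas two function calls are well defined. All names other than square are assumptions made for illustration.

```c
#include <stdint.h>

/* DON'T: the argument text is pasted in twice by the preprocessor. */
#define SQUARE(a)  ((a) * (a))

/* DO: the argument is evaluated exactly once and is type checked. */
static inline uint32_t square(uint16_t a)
{
    return (uint32_t)a * (uint32_t)a;
}

static uint16_t g_counter = 5U;     /* Hypothetical state with a side effect. */

/* Increments the counter and returns the new value. */
static uint16_t next_value(void)
{
    g_counter++;
    return g_counter;
}

uint16_t counter_value(void)
{
    return g_counter;
}

/* Calls next_value() twice: 6 * 7 (or 7 * 6), and the counter ends at 7. */
uint32_t square_with_macro(void)
{
    g_counter = 5U;
    return (uint32_t)SQUARE(next_value());
}

/* Calls next_value() once: 6 * 6, and the counter ends at 6. */
uint32_t square_with_inline(void)
{
    g_counter = 5U;
    return square(next_value());
}
```

square_with_macro() yields 42 and leaves the counter at 7, while square_with_inline() yields 36 and leaves it at 6. The function is declared static inline here, a common way to keep a C99 inline definition local to one translation unit.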
Slide 34: Rule #10 One Variable Declaration Per Line
Okay, last but certainly not least, rule number 10. Our final rule is to give each variable declaration its own line in your source code. I mean, after all, each of these little variables is going to work so hard on your behalf during the program's execution. The least you can do is give it its own line in the source file, all right. The main reason for this is to improve the readability of the code. The compilation process is not going to go any slower. The code is not going to be any bigger or run any slower if you give each variable its own line, but now the code is going to be a lot more readable.
Case in point, let's look at the last line of code here. If I were to ask everybody what the type of variable Y is, I am sure some people would say pointer to character and some people would say character. It turns out that the answer is character; that asterisk binds to the X. But by putting these two variable declarations on their own lines, the code becomes unambiguous. So all we are saying is write your code thinking about the person who comes after you, who might not have the same strong command of the C programming language that you do.
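A short sketch of the declaration pitfall, assuming the slide's last line resembles `char * x, y;` (the exact slide text is not in the transcript, and the function and variable names below are hypothetical):

```c
#include <stddef.h>

size_t size_of_y_shared_line(void)
{
    char * x, y;    /* DON'T: x is a pointer to char, but y is a plain char. */

    y = 'A';
    x = &y;         /* Legal only because y is a char, not a char *. */
    (void)x;

    return sizeof(y);
}

size_t size_of_y_own_line(void)
{
    char * p_x = NULL;  /* DO: one declaration per line, so each type is */
    char * p_y = NULL;  /* stated explicitly and cannot be misread.      */

    (void)p_x;
    (void)p_y;

    return sizeof(p_y);
}
```

The first function returns 1, the size of a char, and the second returns sizeof(char *), which confirms that only the one-per-line version declares two pointers.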
Slide 35: Key Takeaways
So as we discussed, the coding style of the original programmer will influence not only the bugs introduced by him or her, but also the bugs that are introduced later on during the maintenance phase. And then lastly, a coding standard is only as good as its enforcement, so make sure that your coding standard rules are enforceable and that your process mandates that the coding standard is actually enforced.
So, I hope that you found these 10 rules to be valuable. If you have a coding standard, you might want to integrate some of these rules into it. If you don’t have a coding standard, perhaps these 10 rules can be the basis for one that you begin to use in your own development process.
So that’s it. Thank you very much for joining us and we look forward to seeing you next time. I am now going to turn the presentation back over to Jennifer.
Question and Answer
Q: How / where can I buy Barr Group's Embedded C Coding Standard book?
In addition to being for sale at Amazon and from various other booksellers, our coding standard is available as a book or PDF from our website at:
Q: Does Barr Group have a coding standard for embedded C++?
That said, we use C++ in our own work, and believe it has its place in embedded development. One book that several of us own is "C++ Coding Standards" by Sutter & Alexandrescu. Also recommended is Martin Reddy's "API Design for C++".
Q: Where can I get the latest C standard?
C99 - just search for "WG14/N1336" and look for PDFs. Our coding standard is based on C99.
The following sections might be of particular interest:
- Bitwise operators - section 6.5.x
- Conversion rules - section 6.3
Our coding standard precedes C11, the latest C standard, which is available here:
Q: Where can I get the MISRA C Coding Guidelines?
The MISRA C Coding Guidelines are available for purchase at:
Barr Group has no official affiliation with MISRA, by the way.
Q: What static analysis tools can check these rules?
Most of the rules can be checked by mainstream static analysis tools. Some rules (e.g., preferring const over #define, not commenting out code, use volatile when needed) are more difficult for a tool to check and enforce.
In our daily work, we use a variety of tools, including PC-Lint / Flexelint (Gimpel), LDRA, Klocwork, Coverity, PRQA, etc. We encourage you to visit vendor websites and research tool capabilities and costs.
Q: Does Barr Group offer private trainings?
Q: For legacy code, do you recommend going back and fixing it all now (brackets, CONST, etc.) or fixing it as things come up?
We recommend that you apply coding standard rules incrementally to legacy code as you go back and maintain it. Of course, apply & enforce the coding standard vigilantly to any new code.
Q: In your experience, do most organizations use some kind of coding standard?
In our experience, we'd say approximately half of organizations *have* a standard, fewer actually use it, and even fewer enforce it.
Q: Are there any issues with over-using the volatile keyword?
The primary issue is performance. If you use "volatile" somewhere it doesn't need to be used, you'll be constraining the set of optimizations available to the compiler.
Generally, under-use of volatile is more of a problem than over-use.
Q: Can I incorporate the Barr Group coding standard into my own coding standard?
Yes. This text is directly from the beginning of the book:
This document as well as the selection and arrangement of the rules it comprises is Copyright © 2013 by Barr Group. It is permissible for individuals, companies, and institutions to adopt all or a subset of the coding rules herein; indeed we hope that many more will. This may be done simply by identifying the “Barr Group Embedded C Coding Standard” as the source of your rules and retaining this paragraph in its entirety. All other rights in copyright law are reserved by Barr Group.
Q: Could you discuss using "enums" vs. const vs. #define for constants?
Each of those options has their own pros and cons.
Dan Saks, author of the "Programming Pointers" column, has a good article on these 3 options, including usage under the C++ programming language. The article can be found here:
Q: A "static where possible" rule was mentioned -- do you have any caveats to using static within a function such that the variable is static instead of automatic causing unexpected concurrency bugs?
Excellent question, and in fact this is a problem / question posed in one of our Embedded Software Boot Camp courses. Whereas a static local variable "lives" throughout the life of the program, an automatic local variable is "destroyed" when the function ends. A static local variable retains its state across invocations of the function, whereas an automatic local variable requires initialization each time the function is called. As you pointed out, this can lead to concurrency issues, because multiple tasks / threads could call the function with the static local variable, causing race conditions. A static local variable shares the same hazard as any other "global" variable; the only difference is that the variable can only be accessed from within its defining function (but from the context of any thread).
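The difference in lifetime can be seen in a small single-threaded sketch (function names are hypothetical). Note that count_calls_static() is exactly the kind of function that is not safe to call from multiple tasks without protection, since every caller shares the one calls variable.

```c
#include <stdint.h>

/* The static local is initialized once and keeps its value between calls. */
uint32_t count_calls_static(void)
{
    static uint32_t calls = 0U;

    calls++;
    return calls;
}

/* The automatic local is re-initialized on every call. */
uint32_t count_calls_automatic(void)
{
    uint32_t calls = 0U;

    calls++;
    return calls;
}
```

Called twice from a fresh program, count_calls_static() returns 1 and then 2, while count_calls_automatic() returns 1 both times.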
Q: What advice do you have for me to convince my team that fixing our code is worth the time it will take? My team leader is all about keeping to schedule, and obviously following these rules for our existing code will take some time.
There is always tension between quality, schedule, and cost. My experience is that taking shortcuts and cutting corners in the short term always costs more in the long run. Thus, it's our contention that in the end, following a good coding standard will actually reduce the overall development time, although during the larger development push, the "overhead" of following a coding standard is more tangible than the reduced debugging time on the back end.
Essentially this is a process issue, and ultimately it's a cultural issue. Some good reading on the topic would include:
- The Mythical Man Month
- Code Complete (2nd ed.)
- Clean Code: A Handbook of Agile Software Craftsmanship
Q: If access to a non-volatile global is wrapped in mutex lock/unlock, is the compiler still possibly going to optimize out a read/write that may affect behavior?
Absolutely, this problem can still happen. The mutex will protect the object from concurrency issues, but lack of volatile can still cause the compiler to remove reads or writes when it's performing aggressive optimizations.
Q: If bit-wise operation is not recommended for division on signed data, what should I use if I want to divide ‘a’ by 2 where ‘a’ is a signed INT32?
Just use C's division operator (/), the compiler (assuming it was written after 1982!) will generate the best code. Same goes for multiplying by 2 - don't shift, just say "a *= 2;" and let the compiler do the work for you!
Q: Are tools available that could detect and report usage of magic numbers, i.e., direct numbers instead of #defines?
I'm not aware of any such tool, but I'd like to hear from anyone who knows of such a tool.