How to Choose an Embedded C Compiler

Let's face it, there's nothing sexy about the topic of cross compilers. We embedded programmers couldn't get the job done without one, but we spend very little time thinking about how they work or how they could make our work easier.

Most of the time our choice of compiler is limited. It may be dictated to us by the hardware or system designer's choice of a processor or by our own choice of a real-time operating system or debugging tool. In such cases, we must put up with all of the annoyances of the particular compiler we're tied to. But what if you have more choice? What are the little things you should look for when comparing two or more cross compilers that will both work with your required hardware and software?

Just to be clear, let's understand that we're talking about C/C++ compilers that run on a PC, Mac, or UNIX workstation and produce code for a specific target processor that is used in an embedded system. That's what we mean by a cross compiler. The target processor may be an 8-bit microcontroller, a 32-bit microprocessor, or even a DSP. Neither the host platform nor target processor matters for the purposes of this discussion.

As with life in general, the little details are easily overlooked, and yet they often matter most to our happiness in the long run. Little details make our use of a particular cross compiler easier and reduce our frustration with the project as a whole. Ideally, you should never have to think about your compiler. It should simply be a tool that you use to turn algorithms and rules for system behavior into executable programs.

It's obvious to most embedded systems designers that the efficiency and compactness of the code produced by a compiler are often critical to the success of a project. And if that's the case for you, be sure to select the best compiler in that regard. But if more than one compiler satisfies those requirements, or if those issues are not as important on your next project, you can make a decision based on little differences like those described below.

Inline assembly

Though it has been more than 25 years since the introduction of the C programming language, it is still commonplace to use some amount of assembly language when developing software for embedded systems. On almost every project I've worked on there have been a few critical functions or algorithms that ran significantly faster when re-implemented by hand in assembly.

But interfacing assembly language routines with high-level language functions can be difficult. The programmer must study the parameter passing rules for function calls in the high-level language to learn what registers should be saved and restored on function entry and exit and how to return any result to the calling function. These are details that are much better handled by the compiler than the programmer. And there is often no advantage to implementing the entire function in assembly. Rather, all of the speedup can be achieved with just a few instructions of assembly placed strategically within the larger C/C++ function.

For example, early in my career I implemented a digital filtering algorithm in C and targeted it to a TI TMS320C30 DSP. The only cross compiler available to us at that time was unable to take advantage of a special processor instruction that performed exactly the mathematical operations I needed extremely fast. By replacing one of the for() loops in the filtering function with that one special 'C30 instruction, I was able to speed up the overall calculation by more than a factor of ten.

The compiler feature that made this possible was called inline assembly. This feature is not available in all cross compilers, and there is no standard for how it is implemented. The best implementations I have seen simply add a new asm keyword to the C language. Whatever follows on that line (or within the brace-enclosed block that follows) is assembled rather than compiled. Even better, you can still refer to variables and the other symbols of your C/C++ program within the assembly language code. You need not know in advance which register or memory location the compiler will select as the container for the data you need.

Listing 1 contains a simple example of the use of inline assembly. In this example, assembly language is used to access an I/O port on an 80x86 processor. That processor family's in and out instructions cannot be invoked directly from C/C++. Assembly language is necessary at some level. Implementing it within the high-level language function is attractive because there is no programmer overhead involved in saving and restoring registers and because it is more efficient than calling a general-purpose I/O wrapper function (like inport() or outport() from <dos.h>).

#define LEDPORT 0xFF5E  /* LED Control Register */

/**********************************************************
 *
 * Function:    setLedMask()
 *
 * Description: Change the state of a set of 8 LEDs.
 * 
 * Notes:       This function is 80x86-specific.
 *
 * Returns:     The previous state of the LEDs.
 *
 **********************************************************/
unsigned char
setLedMask(unsigned char newMask)
{
    unsigned char  oldMask;

    asm {
       mov dx, LEDPORT  /* Load the register address      */
       in al, dx        /* Read the current LED state     */
       mov oldMask, al  /* Save the old register contents */
       mov al, newMask  /* Load the new register contents */
       out dx, al       /* Modify the state of the LEDs   */
    };

    return (oldMask);

} /* setLedMask() */

Listing 1. Inline assembly example

Of course, the asm keyword is not a part of the ANSI C standard. And I'm sure that some people might argue that extending the language in a non-standard way like this is a bad idea. Don't get me wrong. I do think standards are a good thing, especially language standards. But let's face it. You're writing software for one particular embedded system, and if you need to use assembly language at all then your program will not be easily portable. In the embedded systems case, code portability is not as important as the ease of getting the program right the first time, even on a target processor you aren't that familiar with. Inline assembly makes the programmer's life easier, and it should be considered an important feature of your next cross compiler.
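Not every compiler spells inline assembly the way Listing 1 does. As a rough sketch of how the same idea can look elsewhere, the fragment below uses GCC's extended asm syntax, in which an operand constraint lets the compiler choose the register that holds a C variable. The function names and the choice of the x86 bswap instruction are assumptions made for illustration only. A plain C fallback keeps the code buildable with other compilers and processors, which is one way to contain the portability cost discussed above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * swapBytes32() is a hypothetical helper used only for illustration.
 * It reverses the byte order of a 32-bit value.
 */
uint32_t
swapBytes32(uint32_t value)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /*
     * GCC-style extended asm. The "+r" constraint asks the compiler
     * to keep 'value' in any general-purpose register and to treat it
     * as both read and written. We never name the register ourselves.
     */
    __asm__ ("bswap %0" : "+r" (value));
    return (value);
#else
    /* Portable fallback for other compilers and processors. */
    return ((value >> 24) |
            ((value >> 8) & 0x0000FF00UL) |
            ((value << 8) & 0x00FF0000UL) |
            (value << 24));
#endif
}

/* Example use: the result is the same on either code path. */
void
swapBytes32Example(void)
{
    assert(swapBytes32(0x11223344UL) == 0x44332211UL);
}
```

Keeping the processor-specific instruction behind one small function means that only this function needs attention if the target processor or the compiler changes.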

Interrupt functions

Another desirable feature for a cross compiler is the interrupt type specifier. This non-standard keyword is a common addition to the C language for the PC platform. When used as part of a function declaration, it tells the compiler that that function is an interrupt service routine (ISR). The compiler can then generate the extra stack information and register saves and restores required for any ISR. (A good compiler will also prevent a function declared this way from being called by some other part of the program.)

It should be clear that the overhead associated with entering and exiting an ISR is no more or less in C/C++ than it is in assembly. Either way, the same set of opcodes must appear at the beginning and end of that block of code. It's within the body of the ISR that efficiency issues may arise. If the ISR is not particularly time-sensitive, the entire ISR could be written in C/C++. This would certainly make the implementation easier to write and understand. However, it is likely that the programmer will want to augment a high-level language ISR with inline assembly where it will improve performance.

The advantages of the interrupt keyword are similar to those of inline assembly. The programmer doesn't have to know as much about the ISR requirements of a particular processor. He need know neither which additional registers are saved and restored nor which special instruction, if any, is used to return from an interrupt. All of this makes his program more likely to work on the first try.

I have also seen this feature implemented as a processor-specific #pragma. For example, the GNU compiler (GCC) recognizes #pragma interrupt to mean that the next function in the file should be treated as an interrupt service routine. Unfortunately, only a few target processors (Hitachi H8/300 and Tandem ST-2000) are supported by this feature at this time.

If I were in the business of writing and selling compilers myself, I think I'd take the interrupt feature one step further. It's not a big stretch for the compiler to understand the structure of a processor's interrupt vector table. (This is the processor's table of addresses of ISRs, indexed by interrupt type.) That being the case, it would be simple to add an interrupt type to the ISR marker (e.g., #pragma interrupt(0x1E)). This would make automatic generation of the interrupt vector table possible and eliminate the potential for programmer misunderstanding or error.
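To make the vector table idea concrete, here is a minimal sketch that can be built and run on a host computer. It fills in a table of ISR addresses by hand, at run time, which is the bookkeeping that a compiler-generated table would take over. Everything named here is hypothetical: the INTERRUPT marker (defined as empty so that any ANSI C compiler accepts it), the table size, the 0x1E interrupt type, and the raiseInterrupt() function that stands in for the hardware.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical marker. On a real cross compiler this would expand to
 * the vendor's own keyword or pragma for ISRs. Here it expands to
 * nothing so that the sketch builds with any ANSI C compiler.
 */
#define INTERRUPT

#define NUM_VECTORS   256
#define TIMER_VECTOR  0x1E   /* Illustrative interrupt type */

typedef void (*IsrPointer)(void);

/* Stand-in for the processor's interrupt vector table. */
static IsrPointer vectorTable[NUM_VECTORS];

static volatile unsigned long tickCount = 0;

/* An ISR written entirely in C. */
INTERRUPT void
timerIsr(void)
{
    tickCount++;
}

/* Record the address of an ISR in the table, indexed by type. */
int
installHandler(unsigned int type, IsrPointer isr)
{
    if (type >= NUM_VECTORS || isr == NULL)
    {
        return (-1);
    }

    vectorTable[type] = isr;
    return (0);
}

/* Host-side simulation of the hardware dispatching an interrupt. */
void
raiseInterrupt(unsigned int type)
{
    if (type < NUM_VECTORS && vectorTable[type] != NULL)
    {
        vectorTable[type]();
    }
}

/* Example use: install the timer ISR and trigger it once. */
void
vectorTableExample(void)
{
    unsigned long before = tickCount;

    assert(installHandler(TIMER_VECTOR, timerIsr) == 0);
    raiseInterrupt(TIMER_VECTOR);
    assert(tickCount == before + 1);
}
```

The pairing of timerIsr() with type 0x1E lives in a separate call to installHandler(), so it is easy to get wrong or forget. Attaching the type to the ISR marker itself would put that information in one place.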

Of course, all of these features should be options. I'm not suggesting that every project or programmer would be well served by C-wrappers for entirely assembly language ISRs, or by a tool that generates the interrupt vector table automatically. But there are many situations in which the programmer's task would be made easier and the entire project finished more quickly as a result.

Assembly language generation

Can your current cross compiler generate an assembly-language listing? Mine can. A command-line argument causes this compiler to produce assembly-language listings as part of the compilation process. Each input C/C++ source file results in a single assembly-language file being created.

The assembly language files contain the results of each compilation, exactly as the target processor will execute it. But this listing is in a human-readable form. The original C/C++ code is provided in comments that are interspersed with the assembly. Each line of source code is followed by the compiled result.

I find this feature very helpful for manual code optimization. This is because you can easily see what code is produced for each line of your high-level language program. And if a particular function is too slow for a given application, you'll be able to easily select the best part of the function to rewrite in assembly.

Standard libraries

When you're developing application software for a general-purpose computer, you expect that your compiler will include a set of standard C libraries, math libraries, and C++ classes. These provide facilities like memcpy(), sin(), and cout, respectively. But the C standard requires the complete library only of a hosted implementation. A freestanding implementation, which is how many cross compilers are classified, need supply only a handful of headers, so a compiler vendor may choose to omit the rest. Such omissions are more common among vendors of the cross compilers used by embedded systems programmers. So in many cases, you've got to fight for your right to the standard libraries.

Just think about how much time you'd spend rewriting all of those functions yourself. Then spend some of that time insisting that standard libraries be included with any compiler that you buy. Of course, it's unlikely you'll be using printf() in the majority of embedded systems applications. But you may not realize just how many of the other functions you will need until it's already too late.

If standard libraries are provided, be sure that they are reentrant; in other words, that each of the functions within those libraries can safely be in the middle of more than one call at the same time. Reentrant functions can be called recursively or from within multiple threads of execution. What this means in the case of a library routine is that it may not keep its working data in global or static variables. All of its internal data must be on the stack or in storage supplied by the caller.
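The difference is easiest to see side by side. The sketch below contrasts a routine that keeps its result in a static buffer with one whose working storage is supplied by the caller. Both function names are hypothetical and are not taken from any particular library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Non-reentrant: every caller shares one static buffer. */
char *
formatHex(unsigned int value)
{
    static char  buffer[16];

    snprintf(buffer, sizeof(buffer), "0x%X", value);
    return (buffer);
}

/* Reentrant: all working storage is supplied by the caller. */
char *
formatHexR(unsigned int value, char * buffer, size_t size)
{
    snprintf(buffer, size, "0x%X", value);
    return (buffer);
}

/* Example use: shows how the static buffer gets overwritten. */
void
reentrancyExample(void)
{
    char   first[16];
    char   second[16];
    char * a = formatHex(0x1E);
    char * b = formatHex(0xFF);   /* Overwrites the text a points to */

    assert(a == b);
    assert(strcmp(a, "0xFF") == 0);

    formatHexR(0x1E, first, sizeof(first));
    formatHexR(0xFF, second, sizeof(second));
    assert(strcmp(first, "0x1E") == 0);
    assert(strcmp(second, "0xFF") == 0);
}
```

In this single-threaded example the damage is limited to a stale string. If an interrupt or a second task called formatHex() while the first call was still in use, the result could be corrupted in ways that are much harder to reproduce.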

Fortunately, if you absolutely must buy a compiler that does not come with standard libraries, there is an alternative. Cygnus (now Red Hat) developed a standard C library and math library specifically for embedded systems. All of the functions include source code and are designed with reentrancy in mind. And porting these libraries to new platforms is made easier by a design that puts all board and processor-specific interfaces into a single directory. The package is called newlib and the latest version is available for download at https://sourceware.org/newlib/.

Startup code

Another thing that non-embedded software development tools usually do for you automatically is to include startup code. Startup code is an extra piece of software that executes prior to main(). The startup code is generally written in assembly language and linked with any executable that you build. It prepares the way for the execution of programs written in a high-level language. Each such language has its own set of expectations about the run-time environment in which programs are executed. For example, many languages utilize a stack. Space for the stack must be allocated and some registers or data structures initialized before software written in the high-level language can be properly executed.

Startup code for an embedded system should be provided with the cross compiler. If the compiler is designed to be used for embedded software development, it generally will be. But it is also important to consider whether this code and its proper use are well documented. The startup code will likely be written in the assembly language of your target processor and should, ideally, be provided to you in source code form. If it is properly implemented, you shouldn't ever need to make any changes, but it's still helpful to look at it and understand what it does.

Startup code for C/C++ usually performs the following actions:

1. Disable interrupts

2. Copy any initialized data from ROM to RAM

3. Zero the uninitialized data area

4. Allocate space for and initialize the stack

5. Create and initialize the heap

6. Execute the constructors and initializers for all global variables (C++ only)

7. Enable interrupts

8. Call main()
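The memory-related steps in the list above (2, 3, and 8) can be sketched in C and tried out on a host computer. This is only a simulation under stated assumptions: ordinary arrays stand in for the ROM and RAM regions that a linker would define, applicationMain() stands in for main(), and every name is hypothetical. Disabling and enabling interrupts and setting up the stack and heap are left out because they depend on the processor and are typically done in assembly.

```c
#include <assert.h>
#include <string.h>

#define DATA_SIZE  4
#define BSS_SIZE   8

/*
 * Stand-ins for memory regions. On a real target these would be
 * addresses exported by the linker, not arrays.
 */
static const unsigned char romDataImage[DATA_SIZE] =
                               { 0x12, 0x34, 0x56, 0x78 };
static unsigned char       ramData[DATA_SIZE];
static unsigned char       ramBss[BSS_SIZE];

/* Fill RAM with junk, as it might be found after power-on. */
void
simulatePowerOn(void)
{
    memset(ramData, 0xAA, sizeof(ramData));
    memset(ramBss, 0xAA, sizeof(ramBss));
}

void
initMemory(void)
{
    unsigned int  i;

    /* Step 2: copy any initialized data from ROM to RAM. */
    for (i = 0; i < DATA_SIZE; i++)
    {
        ramData[i] = romDataImage[i];
    }

    /* Step 3: zero the uninitialized data area. */
    for (i = 0; i < BSS_SIZE; i++)
    {
        ramBss[i] = 0;
    }
}

/* Stand-in for main(). It relies on both areas being set up. */
int
applicationMain(void)
{
    return (ramData[0] + ramBss[0]);
}

/* Step 8: hand control to the high-level language program. */
int
startup(void)
{
    initMemory();
    return (applicationMain());
}

/* Example use: junk in RAM does not reach the application. */
void
startupExample(void)
{
    simulatePowerOn();
    assert(startup() == 0x12);
}
```

If either loop in initMemory() were skipped, applicationMain() would see the 0xAA fill pattern instead of the values the program was written to expect, which is why the placement of initialized data in ROM matters so much.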

To properly utilize the compiler-supplied startup code, you'll need to know how it should be linked with your program. You'll also need to know where and how to place the initialized data in ROM (so they can be copied to RAM) and how to set the size of the stack and heap. In the best case, these steps are documented in the literature provided by the compiler vendor. But I have seen many cases in which they were not.

Target Processor

The first step in selecting a cross compiler is finding one that will produce code for your target processor.

Host Platform

The next step is to decide on a development platform. If there are several platforms available, you may want to check some of the other items in this list before making a decision about the host.

RTOS Support

If you're planning to use an RTOS, does the compiler vendor have a working relationship with your RTOS vendor? This is important because part or all of the RTOS may be provided in object files or libraries. For compatibility, your compiler and linker must support that same object file format.

Integration with Other Tools

Is the compiler compatible with any debugging environments? Is a make utility included? If the compiler is shipped with an IDE, is it extensible so that you can integrate your version control tool?

Standard Libraries

Will you need functions from the standard C library, math library, or C++ classes? If so, are they provided with the compiler? Are all of the functions in those libraries reentrant?

Startup Code

Is startup code for embedded systems provided? Are the code and its use well documented? If you can't find any mention of startup code in the user's manuals for a potential cross compiler, consider that a bad sign.

Execution Speed Optimizations

If your program is too slow, you'll want the compiler to try to speed it up. Will the compiler do this? If so, what specific optimizations are supported? Can they be individually enabled or disabled?

Program Size Optimizations

If your program is too big for your target memory, you'll want the compiler to attempt to reduce the amount of code space used. Will the compiler do this? If so, what specific optimizations are supported?

Support for Embedded C++

The Embedded C++ (EC++) standard is a proper subset of the C++ language and libraries that reduces run-time overhead. In order to restrict yourself to EC++ functionality, you'll need a cross compiler that knows what features of the language it is not allowed to use.

Table 1. Checklist for cross compiler selection

Details

Obviously, a compiler that lacks some of the features I've mentioned above may still be a good compiler. And there will undoubtedly be a few people who would argue that non-standard features like the asm and interrupt keywords reduce code portability. But if you've got a choice between two or more cross compilers that are otherwise equivalent, you may want to look for these things. Even if they aren't strictly required to get the job done, they may just make your work easier. And they will certainly reduce programmer frustration.

This article was published in the May 1999 issue of Embedded Systems Programming. If you wish to cite the article in your own work, you may find the following MLA-style information helpful:

Barr, Michael. "Choosing a Compiler: The Little Things," Embedded Systems Programming, May 1999, pp. 71-78.