Controller Area Network (CAN) is the most widely used automotive bus architecture. Here are some reasons why.

At peak, some automobiles contained up to three miles of cabling. To cut the cost and weight of that wiring while still allowing electronic control units (ECUs) to become more intelligent, manufacturers needed a different way to connect them. The CAN bus has since found application in other industries as well.

While wiring weight leads to some problems, the complexity of the wiring leads to other difficulties, specifically in diagnosing faults and making minor modifications. I once encountered an emergency vehicle that was abandoned because faults in its wiring harness couldn't be diagnosed, and rewiring the whole vehicle would have been as expensive as purchasing a new one. Reducing wiring could alleviate this problem.

Cars are not the only application domain affected by wiring weight and complexity. In some luxury yachts, manufacturers add concrete blocks to one side of the boat to compensate for the heavy wiring loom on the far side.

A bus architecture is the only way to keep the volume of wiring from becoming unmanageable. In this article, I discuss CAN, the most widely used automotive bus architecture.

Controller Area Network (CAN)

The Controller Area Network (CAN) bus has come to dominate the automotive industry in Europe, and U.S. manufacturers are starting to adopt it. Hundreds of millions of CAN controllers are sold every year and most go into cars. Typically, CAN controllers are sold as on-chip peripherals in microcontrollers.

For the physical layer, a twisted pair multidrop cable is specified with a length ranging from 1,000m at 40Kbps to 40m at 1Mbps. The maximum payload of a message is 8 bytes, and all messages carry a cyclic redundancy code (CRC). Each message has an identifier, which can be interpreted differently depending on the application or higher-level protocols used. All nodes on the network receive each message and then decide whether that identifier value is of interest.
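The broadcast-and-filter behavior described above can be sketched in a few lines of Python. This is only an illustration: the Frame and Node classes, the broadcast helper, and the identifier values are invented for the example, and the frame is limited to an 11-bit identifier. Real controllers do this filtering in hardware.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 8  # maximum data bytes in a message, per the text above


@dataclass(frozen=True)
class Frame:
    """Hypothetical, simplified view of a message: identifier plus payload."""
    identifier: int      # assumed to be an 11-bit identifier in this sketch
    data: bytes = b""

    def __post_init__(self):
        if not 0 <= self.identifier < (1 << 11):
            raise ValueError("identifier does not fit in 11 bits")
        if len(self.data) > MAX_PAYLOAD:
            raise ValueError("payload longer than 8 bytes")


class Node:
    """Hypothetical receiver that keeps only the identifiers it cares about."""

    def __init__(self, name, wanted_ids):
        self.name = name
        self.wanted_ids = frozenset(wanted_ids)
        self.inbox = []

    def on_frame(self, frame):
        # Every node sees every frame; keeping it is the receiver's decision.
        if frame.identifier in self.wanted_ids:
            self.inbox.append(frame)


def broadcast(nodes, frame):
    """Deliver one frame to every node on the (simulated) bus."""
    for node in nodes:
        node.on_frame(frame)


# Example: both nodes see the frame, but only the dashboard keeps it.
dashboard = Node("dashboard", {0x100})
door_locks = Node("door_locks", {0x200})
broadcast([dashboard, door_locks], Frame(0x100, bytes([55])))
```

Note that nothing in the frame names a recipient; the sender simply labels the message and lets interested nodes pick it up.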

Choosing a CAN controller defines the physical and data-link portions of your protocol stack. In a closed system, you could choose to implement your own higher-level protocol. If you need to interoperate with other vehicle components, though, the vehicle manufacturer will most likely mandate that you use one of the standard higher-level protocols. For engine management, the J1939 protocol is common, while CANopen is preferred for body management, such as lights and locks. Both protocols run on the same CAN hardware; different application-specific needs are met by the higher-level protocols.

CAN is a relatively slow medium and can't satisfy all automotive needs. For example, in-car entertainment requires high-speed audio and video streaming. These needs are being addressed by Media-Oriented Systems Transport (MOST) and IDB-1394b, which is based on FireWire. Diverse requirements mean that vehicles will generally have to run more than one bus.

Nondestructive Bus Arbitration

The requirements of a bus in an automotive environment are different from those of desktop networks, where Ethernet is the technology of choice. In an embedded environment, the bus needs better real-time performance. Delays caused by the bus will form only one part of the total delay, but if there's a nondeterministic component in the bus architecture, it will be impossible to completely compensate for it at higher levels.

If two Ethernet nodes start transmitting at the same time, the resulting signal on the bus is an abnormal voltage. Both nodes detect this, back off for a random period, and then try again. Neither node has priority, so whichever node retries first gains the advantage. If the nodes clash again or clash with a third node, there will be further delay. A well-managed Ethernet network is operated well below full capacity, keeping such clashes to a minimum, but still leaving us with a nondeterministic component in our communications. Since the original clashing messages were both destroyed, this situation is sometimes referred to as destructive arbitration.

CAN takes a different approach. Every bit transmitted on the bus is defined as recessive or dominant, which maps to 1 or 0, respectively. All nodes can listen and transmit at the same time. If more than one node is transmitting, the result will carry a dominant bit if at least one node is transmitting a dominant bit. When a node transmits a dominant bit, it will see a dominant bit on the bus. In this case, the node will not know if anyone else was trying to transmit. If a node transmits a recessive bit, but a dominant bit is seen on the bus, the node knows that someone else is on the bus.
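The rule that a dominant bit always wins can be modeled as a tiny function. The sketch below is a simulation only; the names bus_level and detects_other_sender are made up for illustration and don't correspond to any CAN library.

```python
DOMINANT, RECESSIVE = 0, 1  # bit values as described in the text


def bus_level(transmitted_bits):
    """Resolve the bus state for one bit time.

    The bus is dominant if at least one node drives a dominant bit,
    and recessive otherwise (including when nobody transmits).
    """
    return DOMINANT if DOMINANT in transmitted_bits else RECESSIVE


def detects_other_sender(sent_bit, observed_bit):
    """A node notices another sender only if it sent recessive but reads dominant."""
    return sent_bit == RECESSIVE and observed_bit == DOMINANT


# Three nodes transmit in the same bit time; one of them sends dominant.
level = bus_level([RECESSIVE, DOMINANT, RECESSIVE])
```

A node that sent the dominant bit reads back exactly what it sent, so it learns nothing about the other two; each recessive sender sees the mismatch.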

The clever part of CAN bus arbitration is that a node backs off as soon as it sends a recessive bit but sees a dominant bit on the bus, because that means some other node is transmitting a higher-priority message. The identifier is the first part of the message transmitted; by the time the identifier has been sent, all nodes bar one will have backed off. The message identifier is sometimes called the arbitration field because it decides which messages get priority.

All nodes transmit a single dominant bit when starting a message. This is the start of frame (SOF) bit. Any node just listening will see bus activity and will not attempt to start a transmission until the current packet is complete. So the only possibility for collision is between nodes that simultaneously send an SOF bit. These nodes will remain synchronized for the duration of the packet or until all but one of them back off. After the SOF bit, the arbitration field is transmitted. The winning node will always be the one with the arbitration field of the lowest value, because it's the one that will transmit a dominant (0) bit first, while the other nodes are transmitting recessive bits. Thus, you could consider the numerical value of the arbitration field to be the priority of the message, with lower values taking precedence.
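To make the bit-by-bit contest concrete, here is a minimal Python simulation of a single arbitration round. It assumes 11-bit identifiers sent most significant bit first, assumes every contending message has a distinct identifier, and skips the start bit because all contenders send the same dominant bit there. The function name arbitrate is hypothetical.

```python
DOMINANT, RECESSIVE = 0, 1


def arbitrate(identifiers, id_bits=11):
    """Return the identifier that survives arbitration.

    Simulates nodes that start transmitting in the same bit time.
    Assumes distinct identifiers, sent most significant bit first.
    """
    contenders = set(identifiers)
    if not contenders:
        raise ValueError("no node is transmitting")
    for bit_pos in range(id_bits - 1, -1, -1):
        # Each remaining contender drives its next identifier bit.
        sent = {ident: (ident >> bit_pos) & 1 for ident in contenders}
        bus = DOMINANT if DOMINANT in sent.values() else RECESSIVE
        # A node that sent recessive but reads dominant backs off;
        # everyone whose bit matches the bus carries on.
        contenders = {ident for ident, bit in sent.items() if bit == bus}
    (winner,) = contenders  # exactly one remains when identifiers are distinct
    return winner


winner = arbitrate([0x65A, 0x123, 0x124])
```

Here 0x65A drops out at the top bit, and 0x124 drops out near the end of the field, at the first bit where it differs from 0x123. The winner is always the numerically lowest identifier, which is why the identifier doubles as a priority.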

This is nondestructive bus arbitration, since the highest priority message doesn't get destroyed. In fact, the node transmitting that message doesn't even know that a collision happened. The only way for a node to know there is a collision is for the node to see something on the bus that's different from what it transmitted. So the successful node and any other listening nodes never see any evidence of a collision on the bus.

The highest priority message always gets through, but at the expense of the lower-priority messages. Thus, CAN's real-time properties are analogous to the properties of a preemptive real-time kernel on a single processor. In both cases, the goal is to ensure that the highest-priority work gets completed as soon as possible. It's still possible to miss a hard real-time deadline, but there should never be a case where a high priority job misses its deadline because it was waiting for a lower-priority task to complete.

If a number of nodes clash, one will win out. After that message has completed, all of the "losers" will try again. In this second round, the highest-priority message among those remaining (the one with the lowest-value arbitration field) will win out, and the process will repeat. There's nothing to stop the node that won the first round from transmitting its high-priority message again. This is similar to the situation in a preemptive real-time kernel where a high-priority task could choose to run continuously and thereby prevent some lower-priority tasks from completing their work. In both cases, it would be bad design to lock out lower priorities in this way, but it's important to realize that the CAN bus doesn't prevent this scenario. It's the designer's responsibility to ensure that no one message type hogs the bus.
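The retry behavior, and the way one message type can starve the others, can be illustrated with a short simulation. It relies on the rule that the lowest identifier wins each round; the function simulate_rounds and its hog_id parameter are invented for this sketch.

```python
def simulate_rounds(pending_ids, rounds, hog_id=None):
    """Simulate repeated arbitration rounds on a busy bus.

    pending_ids: identifiers waiting to be sent when the first round starts.
    hog_id: hypothetical identifier whose sender re-queues it after every
            successful transmission (None means nobody does this).
    Returns (identifiers in the order sent, identifiers still waiting).
    """
    waiting = set(pending_ids)
    sent = []
    for _ in range(rounds):
        if not waiting:
            break
        winner = min(waiting)      # lowest identifier wins the round
        sent.append(winner)
        if winner != hog_id:
            waiting.remove(winner)  # the hog stays queued and competes again
    return sent, sorted(waiting - {hog_id})


fair_sent, fair_waiting = simulate_rounds({0x010, 0x200, 0x300}, rounds=3)
hog_sent, hog_waiting = simulate_rounds({0x010, 0x200, 0x300}, rounds=3, hog_id=0x010)
```

In the first run, the three messages go out in priority order and nothing is left waiting. In the second, identifier 0x010 wins all three rounds and the other two messages never reach the bus.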

The arbitration field can be 11 or 29 bits long, depending on which variation of the protocol is used. You can use the first few bits for priority and the remaining bits to identify the message type. The CAN standard doesn't dictate what meaning you attach to those bits, but the many higher-level protocols that sit on top of CAN do define them. For example, the J1939 standard sets aside one portion of the identifier bits for a source address and allows another portion to carry a destination address. The CAN protocol itself defines neither; its identifier describes the message, not the node that sent it or the node that should receive it. Leaving the destination optional is quite reasonable since much of the traffic on an automotive bus consists of broadcasts of measured information, which isn't destined for one specific node.
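As an illustration of carving up the identifier yourself, the sketch below packs a priority and a message type into an 11-bit identifier. The 3-bit/8-bit split and the function names are assumptions made for the example; they aren't taken from J1939, CANopen, or any other standard.

```python
PRIORITY_BITS = 3   # hypothetical split: top 3 bits carry priority
TYPE_BITS = 8       # remaining 8 bits identify the message type


def pack_id(priority, msg_type):
    """Build an 11-bit identifier with the priority in the leading bits."""
    if not 0 <= priority < (1 << PRIORITY_BITS):
        raise ValueError("priority out of range")
    if not 0 <= msg_type < (1 << TYPE_BITS):
        raise ValueError("message type out of range")
    return (priority << TYPE_BITS) | msg_type


def unpack_id(identifier):
    """Split an identifier back into (priority, msg_type)."""
    return identifier >> TYPE_BITS, identifier & ((1 << TYPE_BITS) - 1)


urgent = pack_id(0, 0xFF)    # priority 0 is the most urgent in this scheme
routine = pack_id(1, 0x00)
```

Because the priority occupies the bits transmitted first, any priority-0 identifier is numerically lower than any priority-1 identifier and so wins arbitration, whatever the message type.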

Fault Tolerance

CAN provides a number of fault tolerance mechanisms. One is the inclusion of a 2-bit acknowledgment field. During the acknowledgment time after each packet is sent, the transmitter sends a recessive bit while any receivers send a dominant bit. The transmitter can thus determine that at least one node has received the packet. This prevents a disconnected node from continuing its transmission, blissfully ignorant that no one is listening.
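The acknowledgment check follows from the dominant-wins rule, as this small model shows. The function names are hypothetical, and the model looks only at the single bit time in which receivers respond.

```python
DOMINANT, RECESSIVE = 0, 1


def ack_slot_level(receivers_that_got_frame):
    """Bus level while receivers acknowledge a packet.

    The transmitter drives recessive; every receiver that got the
    packet drives dominant, and a dominant bit always wins.
    """
    driven = [RECESSIVE] + [DOMINANT] * receivers_that_got_frame
    return DOMINANT if DOMINANT in driven else RECESSIVE


def frame_acknowledged(receivers_that_got_frame):
    """True if the transmitter sees at least one acknowledgment."""
    return ack_slot_level(receivers_that_got_frame) == DOMINANT
```

The transmitter can tell that somebody received the packet, but not how many nodes did: one receiver and five receivers produce the same dominant bit.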

CAN also provides globalized message error handling. If one node on the bus receives a corrupted message, it sends an error frame to alert the other nodes to discard the message and the transmitter to resend the message. This increases the likelihood that all nodes will receive the same message at the same time.
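One way a receiver can recognize a corrupted message is by checking the CRC. The sketch below computes a 15-bit CRC with the generator polynomial used by classic CAN (0x4599) over a list of bits. It is a simplification: it ignores bit stuffing and leaves it to the caller to decide which bits are covered, whereas a real controller does all of this in hardware. The helper names are made up for the example.

```python
CAN_CRC15_POLY = 0x4599  # x^15 + x^14 + x^10 + x^8 + x^7 + x^4 + x^3 + 1


def crc15(bits):
    """Compute a 15-bit CRC over an iterable of 0/1 values, first bit first."""
    crc = 0
    for bit in bits:
        feedback = bit ^ ((crc >> 14) & 1)
        crc = (crc << 1) & 0x7FFF      # shift left, keep 15 bits
        if feedback:
            crc ^= CAN_CRC15_POLY
    return crc


def to_bits(value, width):
    """Expand an integer into a list of bits, most significant first."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


# Transmitter side: append the CRC to the message bits.
message = to_bits(0x123, 11) + to_bits(0xA5, 8)
sent = message + to_bits(crc15(message), 15)

# Receiver side: running the CRC over message plus CRC gives zero if intact.
intact = crc15(sent) == 0

# Flip one bit in transit; the check now fails.
corrupted = list(sent)
corrupted[4] ^= 1
damaged = crc15(corrupted) != 0
```

A receiver that sees a nonzero result knows the message was damaged, which is the situation in which it would signal an error to the other nodes.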

In addition, when the bus speed is kept at or below 125Kbps, the bus can use a fault-tolerant mode where the bus will function if one of the two wires is cut. The motivation for this design is to let the bus continue operating after a car crash has severed one of the lines. One-wire mode is also used if one of the lines is shorted to ground or to the supply voltage. In this mode, noise tolerance is reduced. Each node continues to monitor the faulty line and will resume dual-wire operation if the fault condition goes away.

Noise Tolerance

Information is carried on the bus as a voltage difference between the two lines. If both lines are at the same voltage, the signal is a recessive bit. If the CAN_H line is higher than the CAN_L line by at least 0.9V, the signal is a dominant bit. Receivers look only at the difference between the two lines, not at either line's voltage relative to ground. The bus is therefore largely immune to ground noise, which in a vehicle can be considerable.

The signals on the two CAN lines will both be subject to the same electromagnetic influences, and so the difference in voltages between the two lines will barely vary. Because of this, the bus is also highly resistant to electromagnetic interference.
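A short numerical sketch shows why noise that hits both lines equally does not change the decoded bit. The 0.9V threshold comes from the text above. The line voltages (3.5V and 1.5V for dominant, 2.5V on both lines for recessive) are illustrative nominal values. Treating everything below the threshold as recessive is a simplification, since real receivers leave an undefined band between the two states.

```python
DOMINANT, RECESSIVE = 0, 1
DOMINANT_THRESHOLD_V = 0.9  # minimum CAN_H - CAN_L difference for a dominant bit


def decode_bit(can_h, can_l):
    """Decode one bit from the two line voltages (simplified receiver)."""
    return DOMINANT if (can_h - can_l) >= DOMINANT_THRESHOLD_V else RECESSIVE


def decode_with_noise(can_h, can_l, noise_v):
    """Decode after the same noise voltage has been added to both lines."""
    return decode_bit(can_h + noise_v, can_l + noise_v)


# Illustrative voltages: dominant at 3.5V/1.5V, recessive at 2.5V/2.5V.
results = [
    (decode_with_noise(3.5, 1.5, noise), decode_with_noise(2.5, 2.5, noise))
    for noise in (-1.0, 0.0, 0.7, 2.0)
]
```

Every entry in results decodes as (dominant, recessive), no matter how far the common noise shifts both lines.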

Stateless Messaging

If two nodes are communicating, it's reasonable for the receiving node to request that a message be repeated if the first attempt is corrupted. On a CAN bus, much of the traffic is broadcast messages. Because there are many receivers, it's possible (though unlikely, thanks to the error handling described previously) that one node will be affected by a local failure, while other nodes have successfully received the message.

For this reason, you should avoid using messages that depend on previous state or contain relative information. Consider a hypothetical message that indicates that vehicle speed has increased by 10mph. If one node receives a corrupted message and triggers a retransmission, some of the other nodes will receive two complete, identical messages. Those receivers will then believe that the total change in speed was 20mph.
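The speed example can be played out in a few lines of Python. The two receiver functions below are hypothetical: one applies relative updates and the other simply stores the latest absolute reading.

```python
def apply_relative(speed_mph, deltas):
    """Receiver that treats each message as a change in speed."""
    for delta in deltas:
        speed_mph += delta
    return speed_mph


def apply_absolute(speed_mph, readings):
    """Receiver that treats each message as the current speed."""
    for reading in readings:
        speed_mph = reading
    return speed_mph


# The vehicle goes from 50mph to 60mph, and the message is delivered twice.
relative_result = apply_relative(50, [10, 10])   # "speed increased by 10mph", twice
absolute_result = apply_absolute(50, [60, 60])   # "speed is 60mph", twice
```

The relative receiver ends up at 70mph, which is wrong, while the absolute receiver ends up at 60mph however many times the message is repeated.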

I consider it good practice to avoid these types of messages, regardless of the communications architecture. Messages that depend on state information make it more difficult to design one node so that it can be reset independently of the rest of the system. If a node resets and then receives a message that depends on some state information, such as the current speed, you have to ensure that this state information can be retrieved after each reset.

Event-Driven and Time-Triggered

CAN is an event-driven protocol. The bus architecture doesn't impose any restrictions on when nodes are allowed to place messages on the bus. An alternative approach is a time-triggered protocol where messages have preallocated time slots. FlexRay is an example of a time-triggered automotive bus protocol. FlexRay has a maximum bandwidth of 10Mbps, and may prove to be the successor to CAN when the complexity of automotive networks leads to requirements that can't be met by CAN. However, current investment in CAN will ensure that such a transition is many years away.

While the basic CAN bus definition doesn't contain a time-triggered scheduling mechanism, the Time-Triggered CAN (TTCAN) protocol, which sits on top of standard CAN hardware, provides a mechanism for scheduling messages. You can alternatively design your own schedule if your application is running on a closed network.

In many designs, it's simpler to allow each node to send messages at arbitrary times. A time-triggered schedule changes what the software has to do. If you're simply transmitting a measured value, the software can read the value just when it's needed for the next scheduled transmission. However, if the message is the result of some event, such as an alarm condition, then the software is responsible for delaying that message until its slot becomes available.
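If you design your own schedule, the delay imposed on an event message is easy to compute. The sketch below assumes a hypothetical scheme, not TTCAN itself, in which the schedule repeats every cycle_us microseconds and each message owns a slot starting at a fixed offset within that cycle.

```python
def next_slot_start(now_us, slot_offset_us, cycle_us):
    """Earliest time at or after now_us when the message's slot opens.

    Assumes a hypothetical schedule that repeats every cycle_us, with
    this message's slot starting slot_offset_us into each cycle.
    """
    if not 0 <= slot_offset_us < cycle_us:
        raise ValueError("slot offset must lie within the cycle")
    cycles_elapsed, position = divmod(now_us, cycle_us)
    if position <= slot_offset_us:
        return cycles_elapsed * cycle_us + slot_offset_us
    return (cycles_elapsed + 1) * cycle_us + slot_offset_us


def wait_for_slot(now_us, slot_offset_us, cycle_us):
    """How long an event raised at now_us must be held back."""
    return next_slot_start(now_us, slot_offset_us, cycle_us) - now_us


# A 10ms cycle with the alarm message's slot 2ms into the cycle.
just_before = wait_for_slot(1_500, 2_000, 10_000)
just_missed = wait_for_slot(2_001, 2_000, 10_000)
```

An alarm raised just before the slot waits 500 microseconds, while one raised just after the slot has opened waits almost a full cycle. The cycle length therefore bounds the delay the schedule adds.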

One of the main motivations for time-triggered communications is that it fits well with the design of process control loops. If you need to use the velocity of a wheel as feedback for a control loop, then having a guarantee that the velocity will appear on the bus at fixed intervals means that the control loop has a fixed worst-case latency for that data.

Other Considerations

Of almost 300 million CAN nodes sold in 2002, only 15 million were stand-alone chips. The remainder were built into microcontrollers, usually 16-bit parts. So if you are using CAN, you'll probably be programming an on-chip peripheral. No doubt some CAN peripherals are sold into applications that don't use that particular peripheral, but you can still interpret 300 million as meaning this protocol is mighty popular.

Because the CAN hardware looks after the entire packet, including CRC checks, the overhead on the processor is far less than it would be for an equivalent serial port. Failed messages are retried automatically, with no software interaction.

The CAN controller queues incoming and outgoing messages. The length of this queue will have a big impact on how long your processor can spend processing a single message. In other words, a short queue will increase the risk that you'll miss a message.
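The effect of queue length can be explored with a small simulation. It assumes a simplified model invented for this example: messages arrive at known times, the processor takes a message out of the queue as soon as it's free, and it then spends a fixed service_ticks working on it. A message that arrives while the queue is full is lost.

```python
from collections import deque


def count_lost(arrival_times, service_ticks, queue_depth):
    """Count messages lost because the receive queue was full.

    Simplified model: the processor removes the oldest queued message
    whenever it is free and then spends service_ticks processing it.
    """
    queue = deque()
    busy_until = 0   # time at which the processor next becomes free
    lost = 0
    for now in sorted(arrival_times):
        # Let the processor take whatever it can before this arrival.
        while queue and busy_until <= now:
            arrived = queue.popleft()
            busy_until = max(busy_until, arrived) + service_ticks
        if len(queue) < queue_depth:
            queue.append(now)
        else:
            lost += 1
    return lost


# A burst of four messages, one tick apart, each needing 10 ticks of processing.
burst = [0, 1, 2, 3]
lost_by_depth = {depth: count_lost(burst, 10, depth) for depth in (1, 2, 3)}
```

With the same burst and the same processing time, a queue one message deep loses two messages, a depth of two loses one, and a depth of three loses none.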

Since most CAN nodes are destined for the automotive market, battery consumption is important. For this reason most CAN controllers have a sleep mode, where they'll be awakened if a message does appear on the bus. This sort of power saving becomes important when you leave your car in an airport for two weeks, and you would like to have some juice left in the battery when you get back.

De Facto Standard

CAN has become a de facto standard for automotive communications and is going to dominate the automotive scene for many years to come. It's also having considerable impact in other industries where noise immunity and fault tolerance are more important than raw speed. Because CAN hardware has become so cheap and is integrated into so many microcontrollers, it's a design option well worth considering the next time you want to get your embedded systems talking to each other.

This article was published in the August 2003 issue of Embedded Systems Programming. If you wish to cite the article in your own work, you may find the following MLA-style information helpful:

Murphy, Niall. "A Short Trip on the CAN Bus." Embedded Systems Programming, August 2003.