Finding and killing latent bugs in embedded software is a difficult business. Heroic efforts and expensive tools are often required to trace backward from an observed crash, hang, or other unplanned run-time behavior to the root cause. In the worst-case scenario, the root cause damages the code or data so subtly that the system still appears to work fine, or mostly fine, at least for a while.
Too often, engineers give up trying to discover the cause of infrequent anomalies that cannot be easily reproduced in the lab, frequently dismissing them as "user errors" or "glitches." Yet these ghosts in the machine live on.
So here's a guide to the most frequent root causes of difficult-to-reproduce bugs.
#10: Jitter
#9: Incorrect Priority Assignment
#7: Deadlock
#6: Memory Leak
#4: Stack Overflow
#1: Race Condition
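To make the top entry concrete, here is a minimal sketch of a lost update, the classic symptom of a race condition. It is an illustration under stated assumptions, not code from any particular project: the counter names, `fake_isr_unsafe()`, and `fake_isr_safe()` are hypothetical, and the "interrupt" is simulated by an ordinary function call placed between the load and the store so that the failure is reproducible every time.

```c
#include <stdint.h>
#include <stdatomic.h>

/* Hypothetical shared counters, updated by both "ISR" and main-loop code. */
static volatile uint32_t unsafe_count;
static atomic_uint_fast32_t safe_count;

/* Stand-ins for an interrupt handler that increments the shared counter. */
static void fake_isr_unsafe(void) { unsafe_count = unsafe_count + 1u; }
static void fake_isr_safe(void)   { atomic_fetch_add(&safe_count, 1u); }

/* Non-atomic read-modify-write with a simulated preemption in the middle.
 * Two increments are performed, but one of them is lost. */
uint32_t unsafe_increment_with_preemption(void)
{
    unsafe_count = 0u;
    uint32_t tmp = unsafe_count;   /* load the current value (0)          */
    fake_isr_unsafe();             /* "interrupt" fires: counter becomes 1 */
    unsafe_count = tmp + 1u;       /* store stale value + 1: still 1       */
    return unsafe_count;
}

/* Same scenario with an indivisible increment: the "interrupt" can only
 * run before or after the update, never in the middle of it. */
uint32_t safe_increment_with_preemption(void)
{
    atomic_store(&safe_count, 0u);
    fake_isr_safe();                    /* counter becomes 1 */
    atomic_fetch_add(&safe_count, 1u);  /* counter becomes 2 */
    return (uint32_t)atomic_load(&safe_count);
}
```

In this sketch the unsafe version returns 1 after two increments, while the atomic version returns 2. Whether C11 atomics compile to truly uninterruptible instructions depends on the target; on parts where they do not, briefly disabling interrupts around the read-modify-write is a common alternative.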
Keep an eye out for all of these Top 10 Firmware Bugs whenever you are reviewing source code. And be sure to follow the recommended best practices to prevent them from happening to you again.