These built-in troubleshooting resources for DMA controllers can pave the way for smoother embedded software integration.
Many blocks within chips process data read from or written to memory using direct memory access (DMA) controllers. In these operations, firmware initiates the data processing by setting up the blocks and DMA controllers. Firmware sets up the DMA controllers by writing the starting address and byte count into their respective registers. After the DMA transfer has been initiated, the address and byte count values continually change as the DMA controller moves the data.
The prevalence of DMA controllers in embedded devices means that DMA transfers can be a common source of problems during testing and integration. In "Firmware-Friendly FPGA (and ASIC) Design Tips," I suggested some useful resources to include in chip designs to facilitate firmware integration. This article continues in the same vein, focusing on facilities to assist troubleshooting problems with DMA controllers.
Status Registers
Tip: Design DMA controllers to provide current address and byte count values throughout the transfer.
When DMA problems occur, the first task is to figure out if the problem is in the block, in the firmware, or in the data. The current values of the address and byte count registers in the DMA controllers can provide clues to troubleshoot the problem. Temporary debug statements in firmware can retrieve those values for analysis.
In addition, DMA controllers should be designed to allow firmware to read both the starting values (that firmware originally wrote) and the current values (that the DMA controller is currently using) of the address and byte count registers. These starting and current values can assist troubleshooting activities in different scenarios.
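A temporary debug helper along these lines could capture both sets of values in one place. This is only a minimal sketch: the register layout `dma_regs_t`, the names `dma_snapshot` and `dma_bytes_moved`, and the assumption that the byte count register counts down toward zero are all hypothetical, and a real controller will differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block: both the values firmware wrote and the
 * live values are readable, as the tip above recommends. */
typedef struct {
    volatile uint32_t start_addr;   /* address firmware programmed      */
    volatile uint32_t start_count;  /* byte count firmware programmed   */
    volatile uint32_t cur_addr;     /* address the controller is using  */
    volatile uint32_t cur_count;    /* bytes remaining (assumed to count down) */
} dma_regs_t;

/* Plain copy of the registers, suitable for logging or later analysis. */
typedef struct {
    uint32_t start_addr;
    uint32_t start_count;
    uint32_t cur_addr;
    uint32_t cur_count;
} dma_snapshot_t;

/* Read all four registers once and return them as a snapshot. */
static dma_snapshot_t dma_snapshot(const dma_regs_t *regs)
{
    assert(regs != NULL);
    dma_snapshot_t s;
    s.start_addr  = regs->start_addr;
    s.start_count = regs->start_count;
    s.cur_addr    = regs->cur_addr;
    s.cur_count   = regs->cur_count;
    return s;
}

/* Bytes transferred so far, under the count-down assumption. */
static uint32_t dma_bytes_moved(const dma_snapshot_t *s)
{
    assert(s != NULL);
    return s->start_count - s->cur_count;
}
```

On real hardware the four reads are not atomic, so a snapshot taken while the transfer is running may be slightly inconsistent; for troubleshooting a stuck transfer that is usually not a concern.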
Table 1 lists DMA register states that can indicate problems during DMA transfers either from or to memory. While not comprehensive, this table shows how current state information in DMA registers can provide information useful for troubleshooting data processing problems.
DMA Register Status | Potential Problem: DMA from Memory | Potential Problem: DMA to Memory |
Both the address and byte count registers are unchanged from what firmware wrote. | The DMA transfer has not started, maybe due to an incorrect setup of the DMA registers. | The block has not yet given data to the DMA controller. The block may be set up incorrectly. |
The address and byte count registers are one DMA burst size away from their starting or ending values. | One burst size from the beginning indicates that the DMA transfer has started but the block has not consumed the data. The block might not be set up correctly. | One burst size from the end of the buffer might indicate that a last byte is stuck somewhere in the block. |
The address and byte count registers are somewhere in the middle of the transfer. | Maybe the data read in had a corrupted spot causing the block to generate an error and quit. The address indicates the general vicinity where the corrupted data is in memory. | This might indicate that the block terminated early, stopping it from sending more data to the DMA controller. The byte count indicates how much was transferred to memory. |
The address and byte count indicate a completed transfer but the block has not finished. | The block may be expecting more data than the DMA controller was programmed to transfer. | The block might have more data than the DMA controller was programmed to transfer. |
Table 1. Sample DMA register troubleshooting guide
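The register states in Table 1 can be recognized mechanically once the starting and current byte counts are available. The following is a rough sketch, not a definitive implementation: the enum, the function name `dma_classify`, and the assumption of a count-down byte count register are hypothetical. It only identifies the row of the table; deciding which potential problem applies still depends on the transfer direction and on the state of the block.

```c
#include <assert.h>
#include <stdint.h>

/* Rows of Table 1, plus "complete" (hypothetical names). */
typedef enum {
    DMA_NOT_STARTED,     /* registers unchanged from what firmware wrote */
    DMA_ONE_BURST_IN,    /* exactly one burst moved from the beginning   */
    DMA_ONE_BURST_LEFT,  /* one burst or less remains before the end     */
    DMA_MID_TRANSFER,    /* somewhere in the middle                      */
    DMA_COMPLETE         /* byte count has reached zero                  */
} dma_progress_t;

/* Classify progress from the starting and current byte counts.
 * Assumes the current count decrements from start_count to zero. */
static dma_progress_t dma_classify(uint32_t start_count,
                                   uint32_t cur_count,
                                   uint32_t burst_size)
{
    assert(cur_count <= start_count);
    uint32_t moved = start_count - cur_count;

    if (moved == 0u) {
        return DMA_NOT_STARTED;
    }
    if (cur_count == 0u) {
        return DMA_COMPLETE;
    }
    if (moved == burst_size) {
        return DMA_ONE_BURST_IN;
    }
    if (cur_count <= burst_size) {
        return DMA_ONE_BURST_LEFT;
    }
    return DMA_MID_TRANSFER;
}
```

For a transfer of exactly two bursts, one burst in and one burst left are the same state; this sketch reports it as `DMA_ONE_BURST_IN`.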
Reading the current values in the DMA registers not only helps troubleshooting but also can be used to work around defects in the hardware. While useful for FPGAs, this capability is especially beneficial for non-FPGA devices, where respinning a chip costs far more.
I experienced this benefit first-hand during one firmware development effort for a block with a faulty state machine (alas, without the state machine registers discussed previously). In this instance, I could not abort the block unless I knew that the data flowing into the block was stalled. I determined this by monitoring the DMA byte count register until it quit changing. As long as data was flowing, the byte count register kept changing; once the flow stalled, the register showed no change across several reads. Of course, I had to consider how often that register would change and make sure I sampled across enough time to catch a change if the data was still flowing. This technique became part of normal operation so we could ship without respinning the chip. It was indeed a good thing that the DMA controller had this capability for this contingency.
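The stall check described above can be sketched roughly as follows. Everything here is illustrative rather than the code we shipped: the callback types, the name `dma_flow_stalled`, and the simulated register `sim_count_t` (included only so the sketch can run on a host) are assumptions. On a target, the read callback would return the byte count register, and the delay callback would wait long enough for at least one burst to complete if data were still flowing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*read_count_fn)(void *ctx);  /* reads the byte count */
typedef void (*delay_fn)(void);                /* waits between reads  */

/* Sample the byte count 'samples' times. Return true only if every
 * sample equals the first one, i.e. no change was observed. */
static bool dma_flow_stalled(read_count_fn read_count, void *ctx,
                             delay_fn delay, unsigned samples)
{
    assert(read_count != NULL);
    assert(samples >= 2u);

    uint32_t first = read_count(ctx);
    for (unsigned i = 1u; i < samples; i++) {
        if (delay != NULL) {
            delay();
        }
        if (read_count(ctx) != first) {
            return false;  /* the count moved: data is still flowing */
        }
    }
    return true;
}

/* Host-side stand-in for the register: drops by 'step' on every read. */
typedef struct {
    uint32_t value;
    uint32_t step;
} sim_count_t;

static uint32_t sim_read(void *ctx)
{
    sim_count_t *c = ctx;
    uint32_t v = c->value;
    c->value = (c->value >= c->step) ? (c->value - c->step) : 0u;
    return v;
}
```

The number of samples and the delay between them are the tuning knobs: too short a window risks declaring a stall while a slow transfer is still progressing.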
Chaining (Scatter/Gather)
Tip: Design DMA controllers to provide the starting values of the address and byte count for the current buffer in the chain, and the pointer to the next buffer in the chain, throughout each chaining (scatter-gather) operation.
Some DMA controllers are equipped with chaining (scatter-gather) capabilities that use linked lists to indicate to the controller the location and size of multiple buffers in memory. In addition to the current value of the address and byte count register, the DMA controller should make available the starting address, byte count, and next pointer for the buffer in the chain that the controller is currently working on. This information can help firmware engineers diagnose linked list and chaining problems, as it did once for one of my project teams: It helped us discover that we had not correctly translated the virtual addresses of the linked list pointers into the proper physical addresses needed by the DMA hardware.
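One way firmware might use this information is to compare what the controller reports for the buffer in progress against the chain that firmware built. The sketch below assumes a simple three-field descriptor and a matching status readout; `sg_desc_t`, `sg_status_t`, and `sg_locate` are hypothetical names, and real descriptor formats vary widely between controllers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chain entry as firmware builds it in memory. */
typedef struct {
    uint32_t buf_phys;    /* physical address of the data buffer          */
    uint32_t byte_count;  /* size of the buffer in bytes                  */
    uint32_t next_phys;   /* physical address of next entry, 0 ends chain */
} sg_desc_t;

/* Hypothetical readout of the entry the controller is working on. */
typedef struct {
    uint32_t start_addr;
    uint32_t start_count;
    uint32_t next_ptr;
} sg_status_t;

/* Return the index of the chain entry that matches the controller's
 * status, or -1 if the controller is using values firmware never wrote
 * (for example, a pointer that was not translated to a physical address). */
static int sg_locate(const sg_desc_t *chain, size_t entries,
                     const sg_status_t *st)
{
    assert(chain != NULL && st != NULL);
    for (size_t i = 0; i < entries; i++) {
        if (chain[i].buf_phys == st->start_addr &&
            chain[i].byte_count == st->start_count &&
            chain[i].next_phys == st->next_ptr) {
            return (int)i;
        }
    }
    return -1;
}
```

A result of -1 narrows the search to how the list was built or fetched, which is the kind of clue that pointed us to our address translation mistake.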
Cyclic Redundancy Check (CRC) Generator
Tip: Include a CRC generator in the DMA controller module instantiated throughout the chip.
Another useful debugging tool is a cyclic redundancy check (CRC) generator that calculates the CRC of the data moving through the DMA controller. After the DMA controller transfer is completed, firmware can read the CRC value. Since data corruption problems are typically not noted until the end of the pipeline, looking at the DMA controller CRCs at the various steps within the pipeline may give clues to corruption problems.
Memory corruption can occur for a variety of reasons, such as a rogue process writing to that location, cache flush and invalidation problems, or—as happened to us once—a bug in the memory controller generating a bad address occasionally. Whatever the reason, CRCs can help narrow down where to look for the sources of data corruption. If the CRC from the DMA controller writing data to memory does not match the CRC from the DMA controller reading data from memory, then that is a clue that corruption happened to the data in that memory location.
Note that the CRC may not be the same throughout the data pipeline. A block processing the data is likely to be modifying the data. So the DMA controller CRC when the block reads the data may be different from the CRC when it writes the data. Adding the CRC generator in the DMA controller module that is instantiated throughout the chip will assure that the same CRC algorithm is used in all locations.
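Firmware can also compute a CRC over the buffer in memory and compare it with the value read from the DMA controller. The sketch below uses the common reflected CRC-32 (polynomial 0xEDB88320) purely as a stand-in; the algorithm in a given DMA controller is a design choice, and the software version must match it exactly. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32: initial value 0xFFFFFFFF, final inversion.
 * Slow but table-free, which is adequate for debug builds. */
static uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0u);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* XOR in the polynomial only when the low bit is set. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}

/* True if the buffer in memory still matches the CRC the DMA
 * controller reported when it wrote (or read) that buffer. */
static bool buffer_matches_dma_crc(const uint8_t *buf, size_t len,
                                   uint32_t dma_crc)
{
    return crc32_ieee(buf, len) == dma_crc;
}
```

A mismatch between the CRC reported by the writing DMA controller and a CRC computed over the same buffer later suggests the data changed while it sat in memory, assuming caches have been flushed or invalidated appropriately before the software read.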
Byte-Swapping
Tip: Include a byte-swapping capability in the DMA controller module instantiated throughout the chip.
Like data corruption, byte swapping issues are a common source of annoyance, confusion, and defects. Building a byte-swapping capability into the DMA controller provides another useful debugging tool.
In my own experience with LaserJet printer development, our team has dealt with endianness issues in processors, blocks, communication channels, and print jobs. On one of our previous projects, the printer under development was experiencing a severe performance penalty. Our team discovered that the data was being byte-swapped 11 times; many of the swaps were redundant, and many were done via firmware. Several conditions led to that situation:
- The firmware was leveraged from a previous product with different swapping needs.
- Engineers knew their own little piece of the pipeline but did not know what byte swapping was done outside it.
- Engineers invoked an additional byte swapping step when they should have disabled one before or after it.
- Engineers invoked a firmware-based swapping routine, not knowing the hardware could do the swap itself.
After a thorough analysis of the whole pipe, we were able to reduce the number of byte swaps to three and to do most of them using the byte-swapping feature in our DMA controllers. Reducing the byte swaps and using the DMA controller byte-swapper improved the printer's performance dramatically.
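For reference, a firmware byte-swapping pass of the kind that a DMA controller byte-swapper can replace might look like the sketch below. It assumes 32-bit words and hypothetical function names; it is not the routine from our printer firmware. It also illustrates why redundant swaps are pure overhead: two swaps in a row return the original data while touching every word in the buffer twice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}

/* Swap every word of a buffer in place. The processor reads and writes
 * each word, which is the cost a DMA byte-swapper avoids. */
static void swap32_buffer(uint32_t *words, size_t count)
{
    assert(words != NULL || count == 0u);
    for (size_t i = 0; i < count; i++) {
        words[i] = swap32(words[i]);
    }
}
```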
Even if you don't think you will need byte swapping capability in your DMA controllers for a current platform, you just might when you develop a new platform.