


Using Deep Memory To Find The Cause Of Elusive Problems

Introduction

The frequency of an anomaly in a hardware or embedded software system can vary from once every bus cycle to once every million bus cycles or more. The ability to capture anomalies in a digital stream of data is enhanced significantly when a logic analyzer with "deep memory" is used. This application note will show how deep memory can be used effectively to debug hardware and embedded software errors. Problems such as memory leaks, stack overruns, and hardware glitches will be used as examples to demonstrate the usefulness of deep memory. Additionally, this application note will discuss how to select a logic analyzer with deep memory based on features such as memory acceleration that affect the speed of operation, increase productivity, and prevent the "swallow and wallow" phenomenon.*

 

*"Swallow and wallow" is the term that some people use to describe acquiring vast amounts of data (swallow) and then "wallowing" around in it aimlessly trying to find what you need. As we describe here, sophisticated data handling techniques can eliminate the confusion and help you quickly find the exact data you need.

Why Do I Need Deep Memory?

Figure 1 shows a simplified view of how a logic analyzer captures data. The logic analyzer is connected to a data source and is set to continuously collect data into a circular buffer. When the buffer fills, new data overwrites the oldest data in the buffer. If you stop the data capture at any point, you have a "window" of data that you can scan to look back and get a picture of what happened up to that point. How far you can look back in time is determined by how much memory you have in the circular buffer - i.e., how deep your memory is. Earlier logic analyzers functioned well with 512 Ksamples of memory per channel. However, the speed and complexity of today's designs demand troubleshooting and analysis tools with ever-increasing power. To answer that need, the Tektronix TLA7P2/4 and TLA7N1/2/3/4 deep memory modules for the TLA700 series logic analyzers provide up to 16 Msamples of memory per channel.

 

Figure 1. How a logic analyzer captures data.
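
The circular-buffer behavior described above can be sketched in a few lines of Python. This is only an illustration, not how any particular analyzer is implemented. The function names, the 100 MHz sample rate, and the reading of "Ksamples" and "Msamples" as 10^3 and 10^6 samples are assumptions made for the example.

```python
from collections import deque


def capture(samples, depth):
    """Keep only the most recent `depth` samples, as a circular buffer would."""
    buffer = deque(maxlen=depth)  # when full, appending drops the oldest sample
    for sample in samples:
        buffer.append(sample)
    return list(buffer)


def lookback_seconds(depth_samples, sample_rate_hz):
    """Time window covered by a full buffer at a fixed sample rate."""
    return depth_samples / sample_rate_hz


# Stand-in for a stream of 20 bus samples; capture is stopped after the last one.
window = capture(range(20), depth=8)
print(window)  # [12, 13, 14, 15, 16, 17, 18, 19]

# Assumed 100 MHz sample rate, for illustration only.
shallow = lookback_seconds(512_000, 100e6)
deep = lookback_seconds(16_000_000, 100e6)
print(shallow, deep)
```

Under these assumed figures, the shallower buffer looks back about 5 ms while the deeper one looks back about 160 ms, roughly 31 times further.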

 

Typical Problems That Can Be Solved With Deep Memory

Real-time software problems. Real-time software problems are difficult to debug because they only occur when the system is running "at speed." In these instances, debug monitors fail to provide visibility because they do not have real-time trace capability. Emulators can often help, but sometimes lack the triggering capability or the acquisition memory depth to find the problem. Logic analyzers with deep memory allow users to perform "real-time trace" on large amounts of historical data to identify the problem. Real-time trace records the activity of the program without stopping execution. Debug monitors and emulators only display the current status of the program (at the time it is stopped), not how you got there. In real-time systems, you often cannot stop the program while data is coming in without losing a significant amount of information. Thus, real-time trace is critical for debugging these types of routines. The deeper the trace memory the better.

Crash problems. Embedded systems differ from computer applications in that they generally do not have protection from a stray program crashing the entire system. Computer operating systems have many schemes for isolating the system from a misbehaving application - embedded systems often do not. Thus, when your embedded software system crashes, it frequently takes the whole system down, losing any information that may help determine the cause. Logic analyzers can provide the history to quickly determine the cause of the crash. Here again, the deeper the memory the more data you have to analyze to find the problem.

Deeper memory also means that the source of the crash can be further away, or "decoupled," from the actual crash. As embedded application software complexity increases, the decoupling of cause (problem) and effect (crash) can greatly increase.

Memory leaks. A memory leak is an error in a program's dynamic memory allocation logic that causes it to fail to free up memory that is no longer used, leading to eventual collapse due to memory exhaustion. These leaks often caused immediate crashes in older designs with small fixed-size address spaces. With the increasing amount of memory available in systems today, it may take a longer amount of time for the crash to occur and this makes it more difficult to isolate the fault. However, deep memory provides a trace buffer large enough to accommodate the needs of modern designs.
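
Once a trace long enough to cover the suspect period has been captured, one way to look for a leak is to pair each allocation with its matching free. The sketch below assumes the trace has already been reduced to a time-ordered list of hypothetical ("alloc", address) and ("free", address) records; that record format and the function name are illustrative, not an analyzer export format.

```python
def find_unfreed(events):
    """Return addresses allocated but never freed within the captured window.

    `events` is a time-ordered iterable of ("alloc", addr) or ("free", addr).
    """
    live = {}  # address -> index of the allocation that is still outstanding
    for index, (op, addr) in enumerate(events):
        if op == "alloc":
            live[addr] = index
        elif op == "free":
            # A free whose allocation predates the window is simply ignored.
            live.pop(addr, None)
    return sorted(live, key=live.get)  # oldest outstanding allocation first


trace = [
    ("alloc", 0x2000),
    ("alloc", 0x2040),
    ("free", 0x2000),
    ("alloc", 0x2080),
    ("free", 0x2080),
]
leaks = find_unfreed(trace)
print([hex(addr) for addr in leaks])  # ['0x2040']
```

An allocation reported here is only a leak candidate: it may simply be freed after the capture stopped, which is one more reason a longer window helps.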

Combining deep trace memory with conditional storage that also captures context can greatly extend the time window the analyzer acquires. For example, storing only the memory allocation and deallocation routines, along with the activity just before and after each call, limits the acquisition to the areas of interest and greatly extends the capture window. Contextual storage lets the analyzer automatically capture windows of activity around one or more events of interest.
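
The effect of storing only events plus their surrounding context can be sketched as a filter over a sample stream. A real analyzer makes this decision in hardware as samples arrive; the version below works on an already-collected list purely to show which samples would be kept. The addresses, the `ROUTINE_ENTRIES` set, and the function name are hypothetical.

```python
def contextual_store(samples, is_event, pre, post):
    """Keep each event sample plus `pre` samples before and `post` after it.

    Returns (index, sample) pairs in time order; overlapping windows merge.
    """
    keep = set()
    for i, sample in enumerate(samples):
        if is_event(sample):
            first = max(0, i - pre)
            last = min(len(samples) - 1, i + post)
            keep.update(range(first, last + 1))
    return [(i, samples[i]) for i in sorted(keep)]


# Hypothetical entry addresses of the allocation and deallocation routines.
ROUTINE_ENTRIES = {0x4000, 0x5000}

bus = [0x1000, 0x1004, 0x1008, 0x4000, 0x4004, 0x100C,
       0x1010, 0x1014, 0x1018, 0x5000, 0x5004, 0x101C]

stored = contextual_store(bus, lambda addr: addr in ROUTINE_ENTRIES, pre=1, post=1)
print([i for i, _ in stored])  # [2, 3, 4, 8, 9, 10]
```

Here only 6 of the 12 samples are stored, so the same memory depth covers twice as much elapsed time. In a real capture, where the events of interest are far sparser, the gain is correspondingly larger.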

Hardware glitch and timing. A glitch or timing error can occur when the inputs of a circuit change, causing the outputs to change to some random value for a brief time before they settle down to the correct value. If another circuit inspects the output at the wrong time and reads the random value, the results can be wrong and very hard to debug. If the logic analyzer has deep memory and glitch storage, the user has the ability to trigger on the symptom and still acquire the glitch that occurred much earlier, causing the eventual failure.
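
Whether an earlier glitch is still in memory when the symptom finally triggers the analyzer comes down to simple index arithmetic. The sketch below assumes the trigger sits at the very end of the buffer, so the whole depth is history, and that glitches are flagged by sample index; the numbers and the function name are illustrative only.

```python
def last_glitch_before(trigger_index, glitch_indices, depth):
    """Return the most recent glitch still held in a buffer of `depth` samples
    that ends at the trigger, or None if every glitch has been overwritten."""
    oldest_kept = max(0, trigger_index - depth + 1)
    in_window = [g for g in glitch_indices
                 if oldest_kept <= g <= trigger_index]
    return max(in_window) if in_window else None


# A glitch at sample 1,000 causes a visible failure at sample 900,000.
glitches = [1_000]
symptom = 900_000

print(last_glitch_before(symptom, glitches, depth=512_000))     # None
print(last_glitch_before(symptom, glitches, depth=16_000_000))  # 1000
```

With the shallower depth, the glitch was overwritten long before the symptom appeared; with the deeper one, both cause and effect are in the same acquisition.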

Stack overruns. A stack overrun occurs when a program attempts to push more information onto the stack than it can hold. The maximum size of a stack is limited first by the range of values the stack pointer register can hold, and second by the initial value of the stack pointer. Without deep memory, it is difficult to trace the history of stack pointer cycles far enough back to capture the overrun.
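
Given a captured history of stack pointer values, finding the first overrun is a linear scan. The sketch below assumes a descending stack (the pointer decreases on each push) and a known stack size; the addresses and the function name are hypothetical.

```python
def first_overrun(sp_trace, initial_sp, stack_size):
    """Return (index, sp) for the first stack pointer value that falls below
    the reserved stack region, or None if the trace never leaves it.

    Assumes a descending stack occupying [initial_sp - stack_size, initial_sp).
    """
    lowest_valid = initial_sp - stack_size
    for index, sp in enumerate(sp_trace):
        if sp < lowest_valid:
            return index, sp
    return None


# Hypothetical 256-byte stack starting at 0x20001000.
sp_history = [0x20001000, 0x20000FF0, 0x20000F80,
              0x20000F00, 0x20000EF8, 0x20000F40]

hit = first_overrun(sp_history, initial_sp=0x20001000, stack_size=0x100)
print(hit[0], hex(hit[1]))  # 4 0x20000ef8
```

The scan can only report an overrun that is still in the trace, so the further back the stack pointer history reaches, the more likely the first offending push is captured rather than only its aftermath.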

 
