
Debugging multiprocessor code

Jakob Engblom

7/17/2008 11:00 PM EDT

Debugging code running on multiprocessor computing systems--and, in particular, parallel code on multicore devices--is an old problem that has achieved new prominence because of the profound transformation of hardware from single-processor to multiprocessor and multicore solutions.

A survey performed by Freescale and Virtutech at this year's Embedded Systems Conference indicated that the top issues in multicore software development are a lack of determinism and repeatability of bugs, an inability to stop an entire system to debug software, getting existing software to run on multicore systems, and inadequate visibility of all states in an embedded system.

Operating systems offer a good case study for those interested in software ports to parallel computers. First, OSes are by necessity the first code that must be put in place if a new machine is to be useful. Second, OSes are fairly well studied, with data available from several generations of parallelization efforts: on mainframes in the 1960s, on desktops in the 1990s and, now, in embedded systems.

In general, porting an operating system to a symmetric multiprocessing (SMP) system involves splitting the OS data into one portion that's local to each processor and one global portion, with locks to protect the shared data. For an initial port, most of the work is in splitting up the data; locking can be kept simple, sacrificing some performance to ensure correctness. Long term, most of the work on a parallel OS is spent successively refining the locking regime and mechanisms to increase parallelism and thus performance.
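
The split between per-processor and global data can be illustrated with a minimal sketch. It uses Python threads as stand-ins for processors, and every name in it (`run_workers`, `local_counts`, `shared_lock`) is hypothetical rather than taken from any real OS. Each worker updates its own slot without locking and takes a single coarse lock only when it touches the shared total, which mirrors the "keep locking simple first" approach of an initial port.

```python
import threading


def run_workers(num_workers=4, events_per_worker=10_000):
    """Count events using per-worker data plus one lock-protected global."""
    # Per-processor portion: slot i is written only by worker i,
    # so no lock is needed for it.
    local_counts = [0] * num_workers

    # Global portion: shared by all workers, guarded by one coarse lock.
    shared = {"total": 0}
    shared_lock = threading.Lock()

    def worker(cpu_id):
        for _ in range(events_per_worker):
            local_counts[cpu_id] += 1  # local data, lock-free
        with shared_lock:  # shared data, always under the lock
            shared["total"] += local_counts[cpu_id]

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return local_counts, shared["total"]


if __name__ == "__main__":
    counts, total = run_workers()
    print(counts, total)
```

Because the lock is taken once per worker rather than once per event, the shared portion is rarely contended; refining a real locking regime is largely a matter of finding more places where this kind of separation is possible.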

Each change to the locking regime requires extensive testing and debugging. The main issue is getting locking right, so that no lock is missing where one is needed and locks aren't taken in an order that leads to a deadlock. From conversations with software engineers working on both general software and operating systems, the pattern for porting existing code onto parallel platforms is clear: First, get locks in place, then tune them to lock as little and as seldom as possible, making sure correctness is retained.
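
One common way to keep lock order from causing deadlock is to give every lock a fixed rank and always acquire locks in rank order. The sketch below shows that idea under stated assumptions: the `Resource` class, its `rank` field, and the `move` and `stress` helpers are all hypothetical names invented for illustration, and the technique is a general one rather than the method of any particular OS.

```python
import threading


class Resource:
    """A shared object with its own lock and a fixed rank in the lock order."""

    def __init__(self, rank, value):
        self.rank = rank  # hypothetical: position in the global lock order
        self.value = value
        self.lock = threading.Lock()


def move(src, dst, amount):
    """Move `amount` from src to dst; assumes src and dst are different."""
    # Always lock the lower-ranked resource first, whatever the direction
    # of the move, so two threads can never hold one lock each while
    # waiting for the other.
    first, second = sorted((src, dst), key=lambda r: r.rank)
    with first.lock:
        with second.lock:
            src.value -= amount
            dst.value += amount


def stress(iterations=2000):
    """Run opposing moves concurrently and return the final values."""
    a = Resource(rank=0, value=1000)
    b = Resource(rank=1, value=1000)

    def a_to_b():
        for _ in range(iterations):
            move(a, b, 1)

    def b_to_a():
        for _ in range(iterations):
            move(b, a, 1)

    threads = [threading.Thread(target=a_to_b), threading.Thread(target=b_to_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.value, b.value


if __name__ == "__main__":
    print(stress())
```

If `move` instead locked `src` first and `dst` second, the two threads in `stress` would take the locks in opposite orders and could deadlock, which is exactly the kind of failure that appears only under particular timing.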

Multicore debug

In general, debugging boils down to three steps:

1. Provoke. Apply inputs or sequences of events that create conditions under which the code might fail. This is essentially testing to find bugs, or it is what happens when customers' actions trigger failures in your code. Sometimes, the code provoking bugs is itself buggy.

