Design Article
Dynamic power management techniques for multimedia processors
Arthur Musah and Andy Dykstra, Texas Instruments
7/27/2008 10:25 AM EDT
Active power management
On-chip power management techniques fit into two main categories: managing active system power consumption, and managing standby power consumption.
Active power management falls into three areas: dynamic voltage and frequency scaling (DVFS), adaptive voltage scaling (AVS), and dynamic power switching (DPS). Static power consumption management, on the other hand, involves keeping an idle system in a power-efficient state until more processing is required. This type of power management uses static leakage management (SLM), which often relies on several low-power modes ranging from standby to power off.
Let's look at the active modes. With DVFS, software lowers clock rates and voltages according to the performance the application requires. Consider, for example, an applications processor that includes an ARM (advanced RISC machine) core and a digital signal processor (DSP). Even though the ARM core can run at rates as high as 600 MHz, not all of that computational power is always required. Typically, the software selects from pre-defined operating performance points (OPPs). Each OPP pairs a frequency with the minimum voltage that ensures the processor runs at that frequency, and the software picks the lowest OPP that meets the system's processing requirements. For additional flexibility in optimizing power to suit different applications, a separate set of OPPs is pre-defined for the interconnects and peripherals in the processor.
Corresponding to the given OPP, software sends control signals to external regulators in order to set the minimum voltage. For instance, DVFS applies to two voltage supplies: VDD1 (which supplies the DSP and ARM processor) and VDD2 (which supplies the interconnects between subsystems and peripherals). Together these rails deliver most of the chip's power, typically around 75 to 80 percent. MP3 decoding can be performed with plenty of margin for other tasks by moving the processor to a low operating performance point, where the ARM can run at up to 125 MHz. To achieve that functionality with optimal power consumption, we can lower VDD1 to 0.95 volt, as opposed to the maximum of 1.35 volts, which secures 600 MHz operation.
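The OPP selection described above can be sketched in a few lines. This is a minimal illustration, not TI's software: the OPP names and the 100 MHz workload figure are hypothetical, and only the 125 MHz / 0.95 volt and 600 MHz / 1.35 volt points come from the text. The power comparison relies on the usual approximation that dynamic power scales with the square of voltage times frequency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OPP:
    name: str          # hypothetical label, not a TI identifier
    freq_mhz: int      # maximum ARM clock at this point
    vdd1_volts: float  # minimum VDD1 that secures freq_mhz


# Only these two points are given in the article; a real table has more.
VDD1_OPPS = [
    OPP("low", 125, 0.95),
    OPP("max", 600, 1.35),
]


def select_opp(required_mhz, opps=VDD1_OPPS):
    """Return the lowest-frequency OPP that still meets the requirement."""
    for opp in sorted(opps, key=lambda o: o.freq_mhz):
        if opp.freq_mhz >= required_mhz:
            return opp
    raise ValueError(f"no OPP reaches {required_mhz} MHz")


def relative_dynamic_power(opp, reference):
    """Approximate ratio of dynamic power, assuming P is proportional to V^2 * f."""
    return (opp.vdd1_volts ** 2 * opp.freq_mhz) / (
        reference.vdd1_volts ** 2 * reference.freq_mhz
    )


if __name__ == "__main__":
    low = select_opp(100)   # hypothetical MP3-decode requirement
    top = select_opp(600)
    print(low.name, low.vdd1_volts)                    # low 0.95
    print(round(relative_dynamic_power(low, top), 3))  # 0.103
```

Under that approximation, the low OPP draws roughly a tenth of the dynamic power of the 600 MHz point. Leakage is ignored here, so the figure is only a rough guide.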
The second active power management technique, adaptive voltage scaling (AVS), exploits the variations that arise during chip manufacturing and over a device's operational lifetime. This contrasts with DVFS, where every device uses the same preprogrammed OPPs. As expected in most established manufacturing processes, the chip's performance for a given frequency requirement follows a fairly well-defined distribution. Some devices (known as "hot" devices) can achieve a given frequency with a lower voltage than "cold" devices can. Here, AVS comes into play: the processor senses its own performance level and adjusts voltage supplies accordingly. Dedicated on-chip AVS hardware implements a feedback loop that requires no processor intervention and dynamically optimizes voltage levels to account for variations in process results, temperature and silicon degradation (Fig. 1).
In operation, the software sets up the AVS hardware for each OPP, and the control algorithm sends commands to external voltage regulators, over an I2C bus, to lower the appropriate regulators' outputs in incremental steps until the processor just exceeds target-frequency requirements.
For example, a developer starts by designing in a voltage that fits all cases, and for 125 MHz that is 0.95 volt (just above V1 in Fig. 1). If, however, a "hot" device with AVS is inserted in the system, the on-chip feedback mechanism automatically lowers the voltage to the ARM to 0.85 volt or less (just above V2 in Fig. 1).
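The feedback loop can be simulated with a short sketch. Everything device-specific here is an assumption: the linear speed-versus-voltage model, the 10 mV step and the 600 mV floor are hypothetical stand-ins for the on-chip performance sensor and the regulator's real step size. Only the 0.95 volt starting point, the 0.85 volt "hot" result and the 125 MHz target come from the text.

```python
def make_device(min_mv_for_125mhz):
    """Hypothetical device model: max frequency grows linearly with voltage.

    min_mv_for_125mhz is the lowest supply (in millivolts) at which this
    particular piece of silicon still reaches 125 MHz.
    """
    return lambda mv: 125 * mv / min_mv_for_125mhz


def avs_settle(start_mv, target_mhz, max_freq_at, step_mv=10, floor_mv=600):
    """Lower the supply in steps while the device still meets target_mhz.

    max_freq_at(mv) stands in for the on-chip performance sensor. The
    simulation queries the model at the next lower voltage before stepping;
    real hardware would rely on sensor margin instead.
    """
    mv = start_mv
    if max_freq_at(mv) < target_mhz:
        raise ValueError("starting voltage does not meet the target frequency")
    while mv - step_mv >= floor_mv and max_freq_at(mv - step_mv) >= target_mhz:
        mv -= step_mv  # in hardware: an I2C command to the external regulator
    return mv


if __name__ == "__main__":
    hot = make_device(850)   # needs only 0.85 V for 125 MHz
    cold = make_device(950)  # needs the full 0.95 V
    print(avs_settle(950, 125, hot))   # 850
    print(avs_settle(950, 125, cold))  # 950
```

Working in integer millivolts avoids floating-point drift as the steps accumulate.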
These first two methods of active power management find the minimum operating voltage necessary to run part of a device at the desired speed. In contrast, the third method, dynamic power switching (DPS), determines when a device has completed its current computational tasks and, if it is not needed at the moment, puts it into a low-power state (Fig. 2). For example, a processor enters a low-power state while waiting for a DMA transfer to complete. On wakeup, the processor returns to its normal state in a matter of microseconds.
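A rough way to see what DPS buys is to weight the active and low-power figures by the fraction of time the block is actually busy. The sketch below assumes the microsecond-scale transitions cost nothing, and the milliwatt values in the usage example are hypothetical, not measurements.

```python
def average_power_mw(active_mw, low_power_mw, busy_fraction):
    """Time-weighted average power for a block that sleeps whenever it is idle.

    Transition time and energy are ignored, which is reasonable only when
    wakeup takes microseconds and idle periods are much longer.
    """
    if not 0.0 <= busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be between 0 and 1")
    return busy_fraction * active_mw + (1.0 - busy_fraction) * low_power_mw


if __name__ == "__main__":
    # Hypothetical block: 100 mW when running, 10 mW when switched down,
    # busy a quarter of the time.
    print(average_power_mw(100.0, 10.0, 0.25))  # 32.5
    # Without DPS the block would stay at 100 mW the whole time.
```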

Passive power management
While DPS puts only a section of a multimedia system-on-chip (SoC) in a low-power state, there are situations where it makes sense to put the entire device in low-power mode, either automatically when no application is running or upon user request. To do this, we apply static leakage management (SLM), which is used to initiate a standby or device-off mode. One key difference is that in standby mode the device retains internal memory and logic, whereas in device-off mode all system state is saved in external memory. With SLM, the wakeup time is far faster than a cold boot because the program is already loaded in external memory and users do not have to wait for a full operating system (OS) restart. One example of using SLM would be a media player that, following ten seconds of on time with no processing and no user input, shuts off the display and enters standby or device-off mode.
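The media-player policy above might look like the following sketch. The mode names, the function names and the prefer_off switch are hypothetical; the ten-second timeout and the standby versus device-off distinction are taken from the text.

```python
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"
    STANDBY = "standby"
    DEVICE_OFF = "device-off"


IDLE_TIMEOUT_S = 10.0  # from the media-player example in the text


def next_mode(idle_seconds, processing, prefer_off=False):
    """Pick the SLM mode for a player given its idle time and workload.

    prefer_off is a hypothetical policy knob: choose device-off instead of
    standby once the idle timeout has expired.
    """
    if processing or idle_seconds < IDLE_TIMEOUT_S:
        return Mode.ACTIVE
    return Mode.DEVICE_OFF if prefer_off else Mode.STANDBY


def must_save_state_externally(mode):
    """Standby retains internal memory and logic; device-off does not."""
    return mode is Mode.DEVICE_OFF


if __name__ == "__main__":
    print(next_mode(3.0, processing=False))                    # Mode.ACTIVE
    print(next_mode(12.0, processing=False))                   # Mode.STANDBY
    print(next_mode(12.0, processing=False, prefer_off=True))  # Mode.DEVICE_OFF
```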
TI's OMAP35x single-chip processor devices with an ARM Cortex-A8 core, for example, implement the device-off mode, the lowest-power mode from which the devices can wake up autonomously. All power domains are off except the wakeup domain, so power is consumed only in the wakeup domain and through I/O leakage currents. The system clock is turned off, and in this case the wakeup domain is separately clocked at 32 kHz. The OMAP35x also sends signals to external regulators automatically, and the regulators can be turned off during this deep-sleep state. No memory or logic is retained inside the processor. The system state is saved in external memory before entry into the device-off mode; after a post-wakeup reset, the microprocessor unit (MPU) jumps to a user-defined function and the SDRAM controller configuration is restored from scratchpad memory.
A technique for every use
By combining the aforementioned power management techniques, we can handle a variety of operating scenarios in an optimal manner. When system activity on a portable multimedia player is high, such as when viewing high-resolution video, an overdrive OPP can be set on VDD1. For web browsing, which requires medium power consumption, nominal OPPs can be set for VDD1 and VDD2. For listening to music, which has relatively low power demands, the lowest OPPs can be set for VDD1 and VDD2. In all these examples, AVS can be activated to flatten power-consumption differences between "hot" and "cold" devices. Finally, when the user leaves the media player on but does not use it for several hours or days, it uses SLM to automatically place the device into its off mode.
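These scenarios amount to a small lookup table. The sketch below is only one way to write it down: the scenario keys, the dictionary layout and the power_plan function are hypothetical, and no VDD2 entry is listed for high-resolution video because the text specifies only VDD1 for that case.

```python
# Scenario-to-OPP table following the examples in the text.
SCENARIO_OPPS = {
    "high_res_video": {"vdd1": "overdrive"},  # text gives VDD1 only
    "web_browsing": {"vdd1": "nominal", "vdd2": "nominal"},
    "music": {"vdd1": "lowest", "vdd2": "lowest"},
}


def power_plan(scenario):
    """Return the rail settings and techniques to apply for a scenario."""
    if scenario == "idle":
        # Unused for hours or days: SLM places the device in its off mode.
        return {"slm": "device-off"}
    if scenario not in SCENARIO_OPPS:
        raise KeyError(f"unknown scenario: {scenario}")
    # AVS can be active in every running scenario.
    return {**SCENARIO_OPPS[scenario], "avs": True}


if __name__ == "__main__":
    print(power_plan("music"))  # {'vdd1': 'lowest', 'vdd2': 'lowest', 'avs': True}
    print(power_plan("idle"))   # {'slm': 'device-off'}
```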
To better appreciate the power savings possible by taking advantage of these features, consider the following cases. These examples do not use TI's AVS/SmartReflex technology unless otherwise noted. In these descriptions, IVA refers to the image, video, and audio accelerator subsystem.
•Case 1: Device-off mode, 0.590 mW
This is the lowest power mode from which TI's OMAP 3 can still wake up autonomously. In this mode the entire device, except for the wakeup domain, is off, and the wakeup domain runs at 32 kHz. Unused regulators are turned off (VDD1 = VDD2 = 0), SDRAM is self-refreshed, and a special boot sequence restores the SDRAM controller and system state upon wakeup.
•Case 2: Standby mode, 7 mW
In this device state, the wakeup domain is active and all other power domains are in low-power retention (VDD1 = VDD2 = 0.9 volt). All logic and memory are maintained. AVS is off.
•Case 3: Audio decode, 22 mW (excluding DPLL and IO power)
The ARM, although running at 125 MHz, merely sets up DMA to read input data from a multimedia card, after which it goes into a sleep state. The IVA decodes MP3 frames (44.1 kHz, 128 kbps stereo) and sends decoded data to buffers located in SDRAM. An on-chip multichannel buffered serial port sends data to an audio codec for playback. As for system configuration, the DSP runs at 90 MHz and transitions into low-power states to save power when cycles are not required for processing. Here, VDD1 = 0.9 volt and VDD2 = 1 volt.
•Case 4: Audio/video encode, 540 mW (excluding DPLL and IO power)
In this case, audio is captured and encoded (AACe+ at 48 kHz and 32 kbps stereo), video is captured and encoded (H.264 VGA resolution at 20 frames/sec, 2.4 Mbps), and both are stored. At the same time, video is displayed. In this configuration, the ARM runs at 500 MHz, the DSP runs at 360 MHz, VDD1 = 1.2 volts, and VDD2 = 1.15 volts. An on-chip camera subsystem captures video input coming from an external sensor, a multichannel buffered serial port captures audio PCM input, the IVA performs video and audio encoding, the encoded data is stored in a multimedia card, and the display subsystem rotates the video and sends it out on LCD and TV output interfaces.
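To put the four figures in perspective, the sketch below converts them into processor-only runtimes for a battery. The 3.7 volt, 1000 mAh battery is a hypothetical example, and the calculation ignores the display, memory, regulator losses and the DPLL and IO power excluded from cases 3 and 4, so real runtimes would be shorter.

```python
# Power figures from the four cases, in milliwatts.
CASE_POWER_MW = {
    "device_off": 0.590,
    "standby": 7.0,
    "audio_decode": 22.0,        # excludes DPLL and IO power
    "audio_video_encode": 540.0, # excludes DPLL and IO power
}


def battery_energy_mwh(volts, capacity_mah):
    """Nominal battery energy in milliwatt-hours."""
    return volts * capacity_mah


def runtime_hours(energy_mwh, power_mw):
    """Hours a given energy budget lasts at a constant power draw."""
    if power_mw <= 0:
        raise ValueError("power_mw must be positive")
    return energy_mwh / power_mw


if __name__ == "__main__":
    energy = battery_energy_mwh(3.7, 1000)  # hypothetical battery
    for case, power in CASE_POWER_MW.items():
        print(f"{case}: {runtime_hours(energy, power):.1f} h")
```

With that assumed battery, the processor's share of the budget lasts roughly 168 hours in audio decode but under 7 hours in audio/video encode, which shows how much the choice of OPP matters.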
Implementing power management
To achieve this extensive power-management flexibility, the processor relies on an on-chip power, reset and clock manager (PRCM). The OMAP3530 processor divides its functional blocks into 18 power domains, each with its own switch. The PRCM can switch all the power domains, but many of them can also be controlled by the user. Further, each power domain can be put into one of four states (active, inactive, retention or off), depending on whether power is applied to logic and memory and whether clocks are active.
These states require coordination with the auxiliary voltage regulators typically needed for ARM- and DSP-based devices. Many regulators on the market can do the job; each must, of course, meet the processor's voltage, current, and power slew rate specifications as well as its power up/down sequencing requirements. To implement DVFS and AVS operations on ARM- and DSP-based processors, the associated regulators must also have I2C programmability. In the device-off mode, circuitry must be able to turn the VDD1 and VDD2 regulators on and off, either with I2C commands issued automatically or by a dedicated GPIO signal. The latter option brings a slightly faster wakeup time because there are no I2C latencies. To lighten the burden on design engineers, all of these separate functions are ideally integrated into a single device, which greatly reduces parts count (Fig. 3).



