Design Article
Introduction to Hot Swap
Jonathan M. Bearfield
9/24/2001 12:00 AM EDT
ABOUT THE AUTHOR
Jonathan M. Bearfield has
worked in the field of electronics for 15 years, the last five
supporting Texas Instruments' hot swap and power distribution
product lines.
Hot swap, hot plug, and hot dock are terms used interchangeably to refer to hot insertion and removal. Hot swap lets you insert and remove cards, PC boards, cables, and/or modules from a host system without removing power. Because of the need for High Availability (HA) systems, hot swap has quickly become part of every designer's vocabulary. To increase system availability, hot swap is used to reduce down time, simplify system repair, and allow for system upgrade. Because of these advantages, hot-swap solutions are finding their way into a wide variety of applications.
Applications using hot swap designs include Telecom Systems, Servers, RAID, Hot Plug PCI, CompactPCI, USB, and Device Bay. Employing hot swap techniques allows each of these applications to be upgraded, expanded, and repaired without affecting the rest of the system. If implemented properly, a Hot Swap IC solution creates a highly reliable, HA system.
In an HA system, hot-swap power management is used to maintain 100% up time. In some systems, power is not applied to the card socket until after the card is plugged in. Once the card is detected, the system polls its power requirements and then applies power. You need to address every application's power-management needs up front to design the correct product, and each application has key considerations to address early in the design.
Hot-swap solutions control the power up of uncharged cards and manage system response. Cards mating into a live system connector will connect and disconnect power (bounce power on and off) as the card is rocked into the connector. It can take several milliseconds for the card to mate properly. As the card is inserted, the capacitors on the card start to charge and draw current from the live system. As the capacitors initially charge, the card appears as a short and instantaneously draws a large amount of current. This inrush current produces a large demand on the system and can cause the system capacitors to discharge and the system voltage to droop.
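As a rough illustration of why an uncharged card pulls the system voltage down, the sketch below applies charge conservation between the system's bulk capacitance and the card's input capacitance. The component values and the function name are assumptions chosen for illustration, and the model ignores series resistance and the power supply's own response.

```python
def droop_after_insertion(v_system, c_system, c_card):
    """Rail voltage right after the charged system capacitance shares
    its charge with an uncharged card capacitance.

    Charge is conserved: v_system * c_system = v_final * (c_system + c_card).
    Series resistance and the supply's recovery are ignored.
    """
    return v_system * c_system / (c_system + c_card)


# Hypothetical values: 5 V rail, 1000 uF on the backplane, 220 uF on the card.
v_final = droop_after_insertion(5.0, 1000e-6, 220e-6)
print(f"Rail dips to about {v_final:.2f} V before the supply recovers")
```

The larger the card capacitance is relative to the system capacitance, the deeper the dip, which is why uncontrolled insertion can disturb other cards on the same rail.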
Simple Discrete Solutions
The basic discrete or mechanical solution combines staggered pins
with resistors and capacitors or implements Positive Temperature
Coefficient Resistors (PTC)/Negative Temperature Coefficient
Resistors (NTC) to manage the power event.
In the staggered-pin solution, the connector has a combination of long and short pins. The longer pins mate first and start charging the board capacitance through a series of resistors. When the board is fully seated, the shorter pins mate, bypassing the resistors connected to the longer pins and creating a low-impedance path for powering the inserted card. One flaw with this solution is that it requires a specialized connector, which can be expensive. Another flaw is that the card capacitance charge rate is impossible to control because the rate of card insertion is impossible to control. The card capacitance charge is variable and unknown prior to connecting the low impedance path to the live system. The simple discrete solution exhibits large inrush currents.
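The dependence on insertion speed can be sketched with a simple RC charging model. The resistor value, capacitance, and timings below are assumptions for illustration only. The function estimates how far the card capacitance has charged through the pre-charge resistor by the time the short pins mate, and the loop prints the voltage step that still remains at that moment.

```python
import math


def precharge_voltage(v_system, r_precharge, c_card, t_between_pins):
    """Card capacitor voltage when the short pins mate (RC charging)."""
    tau = r_precharge * c_card
    return v_system * (1.0 - math.exp(-t_between_pins / tau))


# Hypothetical: 12 V rail, 10 ohm pre-charge resistor, 470 uF card capacitance.
for t_ms in (1, 5, 20):
    v = precharge_voltage(12.0, 10.0, 470e-6, t_ms * 1e-3)
    print(f"{t_ms:>2} ms between pins: card at {v:5.2f} V, "
          f"step of {12.0 - v:5.2f} V remains")
```

A fast insertion leaves most of the rail voltage to be applied as a step through the low-impedance path, which is the source of the large inrush current.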
Two solutions that are popular in older systems use PTCs or NTCs with a standard connector. Their limitations are the additional impedance in the power path, the lack of fault management features, and the dependence of power management on temperature. Although PTCs have been used in hot-swap designs, they are not suited for high availability applications. The basic function of a PTC is to change from a low impedance to a high impedance in high-current situations, effectively turning off the card. However, a PTC reacts to temperature, not current, and therefore takes a long time to respond to a change in load. A PTC cannot respond rapidly enough to suppress the large inrush currents seen during a hot-swap insertion event. Also, a PTC cannot sense over-current or respond to fault conditions.
The basic function of an NTC is to decrease from a high impedance to a low impedance during start-up when current is charging the card. However, an NTC adds a voltage drop to the system that varies with the load. The change in resistance from near open to a few tenths of an ohm is driven by a change in temperature and takes a long time. An NTC does not respond rapidly enough to suppress the large currents of a fault event.
Each of the simple discrete solutions requires some type of fuse or other rapid fault protection device. The fuse adds a voltage drop to the power path and service costs for board repair.
Discrete MOSFET
The feature set of a discrete MOSFET makes it popular in power
management solutions. MOSFETs provide a low drain-to-source
on-resistance (RDS(on)), and the MOSFET functionality is
similar to that of an ideal switch. As a voltage controlled device,
MOSFETs require a small additional current from the system to
operate. MOSFETs can be turned on and off almost instantaneously,
several orders of magnitude faster than fuses, PTCs, or NTCs.
A MOSFET alone does not provide all of the sensing and control circuitry necessary for an HA hot swap solution. A discrete MOSFET circuit requires additional resistors and capacitors to control rise time and sense over-current or fault conditions. One negative feature of discrete MOSFETs is that most of them have a parasitic diode connected between the drain and the source. If the voltage on the output of the device is higher than the voltage on the input, current is conducted back through the device. Preventing this back-flow of current is of critical importance in most HA systems.
Hot-Swap Switch
The next step beyond the discrete MOSFET is a solution using a
hot-swap switch IC. These are ICs with n-channel or p-channel
MOSFETs that have drive, sense, and reporting circuitry designed
in. Hot-swap switches are readily available in the industry today.
They incorporate the drive circuitry necessary to control the rise
time and/or the current ramp rate of the MOSFET. In addition to
sensing faults and protecting against over-current conditions,
these ICs also have the ability to report these conditions to a
system controller. Hot-swap switch solutions with current limit,
shut down, and thermal protection features may remove the need for
fuses.
Hot-swap switch solutions are easy to design. A solution can be as simple as a timing capacitor and a hot-swap switch IC. Hot-swap switch ICs reduce design times and lower engineering costs that are required when trying to implement a discrete MOSFET solution.
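The benefit of a controlled rise time follows from I = C × dV/dt. The sketch below assumes a linear output-voltage ramp and uses hypothetical capacitance and rail values to show how the charging current scales with the rise time that the timing capacitor sets.

```python
def inrush_current(c_load, v_rail, t_rise):
    """Charging current for a linear output-voltage ramp: I = C * dV/dt."""
    return c_load * v_rail / t_rise


# Hypothetical: 470 uF card capacitance on a 12 V rail.
for t_rise_ms in (0.01, 1.0, 10.0):
    i = inrush_current(470e-6, 12.0, t_rise_ms * 1e-3)
    print(f"rise time {t_rise_ms:>5} ms -> about {i:8.2f} A")
```

Stretching the ramp from tens of microseconds to a few milliseconds reduces the charging current by orders of magnitude for the same load capacitance.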
Hot-Swap Power Manager and Controller
Some hot swap applications require more than the previous solutions
can deliver. In HA applications like CompactPCI and Hot-Plug PCI,
there are many control and interface requirements that disqualify
simpler solutions. Most HA applications have very specific di/dt,
sequencing, fault management, and reporting requirements that
prohibit discrete solutions.
Two types of hot-swap power managers (HSPMs) exist. In lower current applications, the MOSFET can be integrated into the hot swap manager, similar to a hot-swap switch. For higher current applications, the HSPM works with a discrete external MOSFET. The external MOSFET solution has the advantage that the IC and the MOSFET can each be tailored to the needs of the application. Almost all HSPMs have rise-time turn-on control and circuit breaker fault protection features. Other HSPM features include di/dt turn-on ramp, sequencing, healthy signal, and I/O command.
Selecting A Solution
The decision of what hot swap solution to implement should be made
early in the design. Every hot swap application requires a unique
power management feature set. The power supply has to manage the
transients generated by the hot swap insertion and removal events.
There needs to be enough bulk and bypass capacitance to maintain
stable system power during card insertion. Inrush current needs to
be limited, including current ramping, and the connector selection
will depend on the need to pre-charge the inserted card.
You can design hot swap solutions on the card or the system backplane. In server applications, the hot swap solution is designed on the backplane. For CompactPCI applications, the hot swap solution is designed on the card. Standards exist to define and specify hot swap requirements. CompactPCI, Hot-Plug PCI, and USB hot swap solutions are specified and mandatory.
For low-availability systems where boards are rarely removed or inserted, a simple discrete implementation may suffice. For HA computer and telecom systems, hot-swap switch or HSPM ICs are the most appropriate solution.
Sequencing refers to the control of the relative levels and timing between two or more supply voltages during both power-up and power-down transitions. This requirement may be specified as a minimum time delay between a primary supply reaching its steady-state level and a secondary supply being turned on. Or, it may be stated as a fixed threshold of one supply at which the second is ramped up. Other systems or processors may require that two supplies are ramped together, such that the voltage differential between them never exceeds a specified maximum. This is often referred to as supply tracking.
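A tracking requirement of this kind can be checked numerically. The sketch below is only an illustration: the rail voltages, ramp times, delay, and the 2.0V limit are all assumed values, and the supplies are modeled as ideal linear ramps.

```python
def linear_ramp(v_final, t_rise, t_delay, times):
    """Supply that waits t_delay, then ramps linearly to v_final in t_rise."""
    return [min(v_final, max(0.0, (t - t_delay) / t_rise * v_final))
            for t in times]


def max_differential(ramp_a, ramp_b):
    """Largest |Va - Vb| over two equally sampled supply ramps."""
    return max(abs(a - b) for a, b in zip(ramp_a, ramp_b))


times = [i * 1e-4 for i in range(201)]            # 0 to 20 ms, 0.1 ms steps
io_rail = linear_ramp(3.3, 5e-3, 0.0, times)      # I/O rail starts at once
core_rail = linear_ramp(1.8, 5e-3, 2e-3, times)   # core rail starts 2 ms later

LIMIT_V = 2.0  # hypothetical maximum allowed differential
worst = max_differential(io_rail, core_rail)
print(f"worst-case differential {worst:.2f} V, within limit: {worst <= LIMIT_V}")
```

With these assumed numbers the delayed core rail lets the differential exceed the limit during power-up, even though the steady-state difference of 1.5V is acceptable.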
As an example, consider a server system with ±5V and ±12V routed to each plug-in connector. The proper hot swap of a mating card assembly may require specific di/dt control on each supply line, and a time delay for power-on reset (POR) of a slave controller before the higher voltage supplies are ramped up to drive the Rx/Tx circuitry. In addition, an over-current or under-voltage on any one of the supplies will need to be signaled to the host, for either external (host) or automatic (module) shut-down of the remaining voltages. The sequencing order is set either by the hardware configuration at the mating interface or under host software control.
Dual-supply devices, such as DSPs and ASICs, often impose supply sequencing requirements. Many of the devices now on the market use a 3.3V, 2.5V, or even 1.8V supply that powers all the core logic, including the CPU, clock-generation circuitry, on-chip memory, and hardware peripherals. A second, isolated supply drives the I/O circuitry to interface to other 5V or 3.3V devices. Two issues arise from this configuration:
- Current flow within internal isolation or ESD structures between supply rails
- Bus contention between the I/O pins of the processor or ASIC and other drivers on the bus.
Excessive current flow within the device can occur if the voltage differential between the two supplies exceeds a particular threshold. Nominal operating voltages for the two supplies are such that one supply is at a higher potential. But during power-up and power-down transients, differences in load characteristics can affect voltage ramp rates such that the relative level of the supplies overstresses the device.
The caveats regarding voltage differentials also apply during power-down events. Differences in the load current drawn from each supply during power-down, as well as the capacitance associated with each rail, affect the rate of voltage decay. If these factors can vary based on board configuration (number of ports or channels and processor speed), or the level of peripheral activity (active, standby, or sleep mode), then predicting the ramp-down rates of the different supplies becomes increasingly difficult. When multiple supply nodes are present at the hot swap interface, the system design should address the orderly turn-on and subsequent status of these supplies.
The following list provides insight into the functions of a well-designed hot swap controller. Different systems may require more or less functionality, and the designer should evaluate the needs of the design against the reliability and up-time specifications, industry standards, and the anticipated frequency of live insertions and removals. The list is generally organized in order of importance, with the basic components shown first, progressing towards functions that become more peripheral to the hot swap event:
- Current limiting
- Controlled di/dt or dv/dt (soft-start)
  v(t) = L * (di/dt)
- Circuit breaker function
- Over-current time-out
- Tight fault threshold tolerance
- Programmable / adjustable
- Sequencing control
- Power good reporting.
All of this means that a good hot-swap power management solution will have a high level of integration; otherwise, it will be very cumbersome and difficult to use.
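To make the circuit breaker and over-current time-out items in the list above more concrete, here is a minimal behavioral sketch. It is not modeled on any particular controller. The threshold, the time-out length, and the sample data are assumptions. The breaker trips only when the current stays above the threshold for a set number of consecutive samples, so a brief inrush peak is tolerated while a sustained fault is not.

```python
def circuit_breaker(samples, limit, timeout_samples):
    """Return the sample index at which the breaker trips, or None.

    The breaker trips once the current has stayed above `limit` for
    `timeout_samples` consecutive samples.
    """
    over = 0
    for i, current in enumerate(samples):
        over = over + 1 if current > limit else 0
        if over >= timeout_samples:
            return i
    return None


inrush = [0.2, 3.0, 2.5, 0.8, 0.5, 0.5]        # brief peak at insertion
fault = [0.5, 0.5, 3.0, 3.1, 3.2, 3.3, 3.4]    # sustained over-current

print(circuit_breaker(inrush, limit=2.0, timeout_samples=3))  # None
print(circuit_breaker(fault, limit=2.0, timeout_samples=3))   # 4
```

A tight fault threshold tolerance matters here because any spread in `limit` directly widens the range of currents that may or may not trip the breaker.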
USB
Perhaps the most common hot swappable interface in the industry
today, USB is a simple four-wire interface that contains both data
and power. The USB interface is very similar to the 1394 interface
in that they are both hot pluggable interfaces. However, USB
transmits data at 1.5Mbps and 12Mbps (USB 1.0) or 480Mbps (USB
2.0), and distributes power at approximately 5V in increments of
500 mA and 100 mA, well below the 1394 capabilities and
requirements. The power management requirements of USB are also
much more explicitly defined than those of the 1394 interface,
although there are still three tiers in the USB peripheral
platform: Host/Self-power hub, Bus-powered hub, and USB
function.
USB data is a 3.3V level signal, but power is distributed at 5V to allow for voltage drops in cases where power is distributed through more than one hub. Each function must provide its own regulated 3.3V from the 5V input or its own internal power supply.
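The 100 mA and 500 mA increments can be expressed as a simple per-port budget check. The sketch below assumes that a host or self-powered hub port can supply up to 500 mA and that a bus-powered hub port can supply up to 100 mA. This mapping of tiers to limits, and all of the names, are illustrative assumptions rather than text taken from the specification.

```python
# Per-port supply limits in mA. The mapping of tiers to limits is an
# assumption made for this illustration.
PORT_LIMIT_MA = {
    "host_or_self_powered_hub": 500,
    "bus_powered_hub": 100,
}


def can_attach(port_type, function_draw_ma):
    """True if the port can supply the current the function asks for."""
    return function_draw_ma <= PORT_LIMIT_MA[port_type]


print(can_attach("host_or_self_powered_hub", 500))  # True
print(can_attach("bus_powered_hub", 500))           # False
print(can_attach("bus_powered_hub", 100))           # True
```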
Designing the simplest power management solution for port power control may protect your power supply, but will probably not ensure the continuous operation of the system. Taking just a few items into consideration can turn this simple solution into an effective means of both managing power and ensuring reliable operation over a wide range of conditions. Selecting capacitors with an appropriate ESR, choosing the position and type of ferrite bead, and using individual current limit devices or power switches all maximize the performance of the USB power interface without adding much complexity to the design.
Infiniband
Truly an emerging application, Infiniband (IB) is a switched-fabric
architecture in which daughter cards are hot-inserted into a
server-type system. IB is a scalable, modular, channel-based,
switched-fabric architecture with a performance range from 500Mbps
to 6Gbps. Supporting a wide range of applications, IB is defined
for the connection of servers with other servers, as well as with
storage and networking devices. The application requires the
management of two separate supplies during insertion and removal.
The main supply is a 12V rail capable of delivering approximately
2.5A, and the auxiliary supply is only 5V at 250mA. Due to the high
availability requirements of the system, card insertion and removal
must take place without generating any adverse effects in the
system. On the card side of the hot-swap power management device,
power is regulated up or down based on what the card requires. The
IB specification stipulates the di/dt requirements of the hot swap
event.
There are two power connections available for each IB module. Bulk power is intended for the major functions of the module. This 12V, ±2V supply will have to be regulated down to a more useful voltage level, such as 3.3V. The maximum load current is 2.5A dc. Auxiliary power is intended for management and enumeration functions of the module, even when the bulk power is not available. Auxiliary power is supplied to the module as 5V and may be regulated within the module as required. The maximum load current is 0.26A dc.
A single IB port is capable of providing up to 50W, but provisions are made to allow the chassis backplane the option to supply 25W and indicate that only 25W is available. Concurrently, the IB module indicates that up to either 25W or 50W is required. This ensures that an appropriate match of power required versus power delivered is in place while allowing for both flexible systems and module designs.
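The matching of required power against available power can be sketched as a small check. The function and constant names are hypothetical, and the actual signaling mechanism defined by the IB specification is not modeled here.

```python
POWER_LEVELS_W = (25, 50)  # the two port power levels mentioned in the text


def power_match(backplane_w, module_w):
    """True if the backplane can supply what the module says it requires."""
    if backplane_w not in POWER_LEVELS_W or module_w not in POWER_LEVELS_W:
        raise ValueError("power level must be 25 or 50 W")
    return module_w <= backplane_w


print(power_match(50, 25))  # True: a 25 W module in a 50 W slot
print(power_match(25, 50))  # False: a 50 W module in a 25 W slot
```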
Device Bay Power Requirements
Device Bay is intended to support a wide variety of applications,
including mass storage, communications, and security. Typical
devices include CD-ROM drives, DVD-ROM drives, hard disk drives,
and smart card readers. Device Bay is a new form factor for PCs
and PC peripherals. Just as PC Cards allow the expansion of the
notebook PC platform, Device Bay adds this flexibility to
everything from the desktop PC to the monitor. One intention of the
Device Bay specification is to enable compatibility between any
Device Bay device and any bay. Device Bay provides a simple path
for upgrade and/or system expansion by allowing peripherals to be
changed without opening the chassis. These applications allow the
user to remove a hard drive and take their entire operating system
on the road with them, as well as provide a rather instantaneous
means for upgrading a system.
The Device Bay communication protocol uses the 1394 and USB interfaces. This provides a broader range of bandwidths with almost unlimited scaleable performance. Supporting both interfaces in the host also allows the device designer to select the interface that is best suited for the functional requirements of the device application.
In a desktop, the peak power requirement for Device Bay, on a per-bay basis, is 45W. This is derived from the combination of required voltages VID, 3.3V, 5V, and 12V at 0.45A, 3.75A, 3A, and 5.25A peak, respectively. The continuous power requirement is lower, at 30W in total. For a notebook computer, the power requirement is much lower. The peak power requirement is 8W and is derived from the combination of required voltages VID, 3.3V, and 5V at 0.45A, 1.6A, and 2.4A peak, respectively. Note that the 12V supply is not required in the notebook computer.
Hot-Plug PCI Power Requirements
PCI hot-plug is a specification derived from the standard PCI local
bus specification. PCI provides a means of defining the interface
and power requirements of add-in cards for a PC system.
Graphics-oriented systems and applications have created throughput
problems between the main host processor and any of its
peripherals. The PCI bus provides a means of moving the more
demanding, higher bandwidth peripherals closer to the processor on
a communications level. This provides measurable gains to most
applications. The PCI hot-plug addendum provides a means of
hot-plugging these add-in cards, which increases system
functionality and allows system upgrade and repair without powering
down a system, thereby minimizing the effective down-time.
The add-in card has been the standard upgrade path for the PC since the first PC was created, providing a means of altering the operating environment and changing the system functionality. Utilizing the PCI interface and add-in boards as a method of upgrading a system has a cost in terms of power. All PCI connectors must support four power rails, regardless of the need for all four rails anywhere else in the system. The power requirements are 3.3V, 5V, 12V, and -12V at 7.6A, 5A, 0.5A, and 0.1A, respectively. In theory, a PCI expansion card could draw more than 25W, and there are, at a minimum, four PCI slots per host system. This is a worst-case additional power consumption requirement exceeding 100W per system.
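The arithmetic behind these figures is easy to reproduce. The sketch below simply multiplies the rail voltages and currents quoted above. The four-slot worst case takes the single largest rail as the per-card figure, which is an assumption about how the 100W estimate is formed.

```python
# (rail voltage in V, maximum current in A), as quoted in the text
PCI_RAILS = [(3.3, 7.6), (5.0, 5.0), (12.0, 0.5), (-12.0, 0.1)]


def rail_power(volts, amps):
    """Power available on one rail in watts (sign of the rail ignored)."""
    return abs(volts) * amps


per_rail = [rail_power(v, a) for v, a in PCI_RAILS]
for (v, a), p in zip(PCI_RAILS, per_rail):
    print(f"{v:>6} V at {a} A -> {p:.2f} W")

SLOTS = 4  # minimum number of slots per host system, per the text
print(f"worst case for {SLOTS} slots: {SLOTS * max(per_rail):.2f} W")
```

The 3.3V rail alone slightly exceeds 25W, so four fully loaded slots push the host supply past 100W.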
Although using the PCI add-in cards for upgrades has existed for several years, it can still weigh heavily on a system in terms of power requirements. Most designs do not demand the maximum amount of power available. However, as with the other interfaces, the host power supply must be designed to handle the worst-case loads.
Telecom
Perhaps one of the highest voltage hot swap applications, Telecom
systems definitely fall into the HA requirement category. Typically
distributing 48V and sometimes 12V or 24V on the backplane, these
systems cannot afford to fail. If a single card goes out, only a
handful of users may be affected. However, having to turn off the
system to repair or upgrade the system would take down an entire
city grid.
These systems distribute higher voltages so that they do not have to distribute high currents. The voltage drops in low-current, high-voltage systems are negligible, especially since regulators on the plug-in cards and modules will not be affected by even sizable changes in the input voltages. This advantage does not remove the critical need to insert and remove cards without impacting system performance. Strict attention must be paid to inrush current, voltage and current transients, and voltage tolerances. The latter matters because system voltages often fluctuate between 30V and 76V. Not only does the hot-swap solution need to manage the insertion and removal event, it must function properly across this entire input voltage range.
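The trade-off between distribution voltage and current can be illustrated with I = P / V. The sketch below assumes a hypothetical 100W card whose on-board regulators draw constant power, so the input current rises as the backplane voltage falls.

```python
def input_current(power_w, v_in):
    """Input current of a constant-power load: I = P / V."""
    return power_w / v_in


CARD_POWER_W = 100.0  # hypothetical card power

for v_in in (30.0, 48.0, 76.0):
    i = input_current(CARD_POWER_W, v_in)
    print(f"{v_in:>4.0f} V input -> {i:.2f} A")
```

Under this assumption the hot-swap circuit sees its highest steady-state current at the low end of the input range, so current limits and circuit breaker thresholds need to be set with the full 30V to 76V span in mind.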
Many of the hot swap modules are driven by multiple power sources. The removable assembly may contain logic blocks operating from different regulation points, may need to interface to legacy logic-levels elsewhere in the system, or may contain devices that operate from split-rail supplies. These systems and modules require a power interface that provides a sophisticated level of hot swap control combined with proper supply sequencing, and sequencing order and timing may be defined at different levels in the system hierarchy.
For possible device issues, always look to the manufacturer's specifications to identify any unique considerations. The combination of hot swap requirements with supply sequencing calls for a power interface that performs the functions of both. The degree of the interface's complexity can vary from system to system. However, the feature set previously presented in this article can serve as a checklist to identify the pertinent functions of each target application. This, together with the functional blocks presented, should help the system or module designer develop the appropriate solution. Today, the more complex solutions are often built around an integrated hot swap controller, which provides the functionality needed at a lower cost and uses less board space. In addition, this approach generally results in a full-featured solution, providing a robust design that contributes to the up-time performance and long-term reliability of the overall system.




