Failure Analysis Is a Useful Feedback Tool for System Designers

Venkataraman Lakshminarayanan

12/19/2000 4:45 PM EST

The complexity of electronic systems has increased over the years while package sizes have shrunk. Although quality is the buzzword in every industry today, electronic systems still fail because of component failures, and when a system fails suddenly it is not always easy to trace the exact cause. Analysing failures gives valuable insight into their causes and provides inputs for product improvement, which makes failure analysis a tool for evaluating the reliability of a system. This article describes the techniques commonly used for failure analysis of electronic components. Case studies of failure analysis of some components are also presented to illustrate the various techniques used.

Studying component failures reveals the causes of failure, and the stresses and mechanisms that lead to failure and degraded performance, so that corrective measures can be devised to avoid such failures. A failure mechanism leads to an identifiable change in the component. Semiconductor device failures can be explained using a physics-of-failure approach. The bath-tub curve is a commonly used model for describing the failure of components, electronic or mechanical.

Infant mortality failures are caused by defects introduced during manufacture, faulty design or usage of the component, and freak components that did not fail during screening. The normal operating region of the curve, which is the useful-life phase of the component, is quite long for electronic components compared with mechanical parts, and most product designs are revised before the electronic components leave this phase.

Failures during this phase are caused by stresses such as high temperature (thermal overstress), high voltage or current (electrical overstress, EOS), humidity (for example, in coastal areas), vibration, and mechanical or thermal shock. Wear-out failures occur after the useful-life phase of the component has passed. Corrosion, electrical leakage, insulation breakdown, migration of metallic ions in the direction of current flow, cracking of the encapsulating material as it deteriorates, and cracks in the bond wires due to repeated stresses are examples of wear-out failures. Wear of contacts and increased contact resistance in connectors are examples of wear-out in mechanical components.
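The three regions of the bath-tub curve can be sketched numerically. The example below is a minimal illustration, assuming each region is approximated by a Weibull hazard term. That is a common modelling choice rather than something this article prescribes, and every shape and scale value is made up for illustration.

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate of a Weibull distribution at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three hazard terms; every parameter here is hypothetical."""
    infant = weibull_hazard(t, shape=0.5, scale=2_000_000.0)  # falling rate: early failures
    useful = weibull_hazard(t, shape=1.0, scale=100_000.0)    # constant rate: useful life
    wearout = weibull_hazard(t, shape=5.0, scale=80_000.0)    # rising rate: wear-out
    return infant + useful + wearout


if __name__ == "__main__":
    for hours in (10, 100, 1_000, 10_000, 50_000, 100_000):
        print(f"{hours:>7} h  hazard = {bathtub_hazard(hours):.3e} per hour")
```

With these assumed values the printed rate falls at first, flattens, and then rises again, which is the shape the model is named after.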

Failure analysis (FA) is an important tool for evaluating the reliability of a product under actual operation. FA gives product designers feedback for improving the design, or even for correcting minor design faults that were overlooked initially. It can also point out faults in the device itself and provide useful hints for improving the design of components. The aim of failure analysis is to identify the cause of failure and initiate corrective action. The outcome may be an improvement in the design or construction of the component, or an improvement in the design of the product where the component is used, by incorporating additional components or by modifying the application circuit. A failed component can provide a wealth of information for enhancing the reliability of a product. Depending on the type of failure, we can identify the failure mode, the mechanism, and the factors (such as stresses) that induced the failure, and then initiate appropriate corrective measures. FA is therefore a tool for improving the reliability of a product through appropriate corrective action.

Several techniques are used in FA work to find the cause of failure. The method used in an FA investigation depends on the severity and type of problem. For electronic devices, the techniques range from simple electrical measurements on the failed samples to examination of decapsulated samples under a microscope. As much information as possible should be gathered from the failed samples using non-destructive methods before the devices are opened. A suggested sequence of steps for an FA investigation is shown in Table 1. Based on experience and the past history of failures, some steps in the sequence may be skipped.

Steps in failure analysis

For proper corrective action to be initiated, the failure mechanism of the failed component must be correlated with field observations of the operating conditions that caused the failure. The first step when a failure is reported is to collect data on the number of failures observed and the sample size. Collected over a large number of locations where the system is deployed, this information helps the analyst judge the statistical significance of the failure. Information should also be collected on the problem observed and the conditions under which the failure was observed. If the failure occurs during a particular test condition or operation, the stresses likely to be encountered by the component can give a clue to the possible cause of failure.
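One way to judge whether a handful of failures in a given sample size is significant is to put a confidence interval around the observed failure fraction. The sketch below uses the Wilson score interval, which is an assumption of this example and not a method named in the article. The field figures are hypothetical.

```python
import math


def wilson_interval(failures, sample_size, z=1.96):
    """Wilson score interval for a failure fraction (z=1.96 is roughly 95% confidence)."""
    if sample_size <= 0 or not 0 <= failures <= sample_size:
        raise ValueError("need 0 <= failures <= sample_size and sample_size > 0")
    p = failures / sample_size
    denom = 1 + z * z / sample_size
    centre = (p + z * z / (2 * sample_size)) / denom
    half = (z / denom) * math.sqrt(
        p * (1 - p) / sample_size + z * z / (4 * sample_size ** 2)
    )
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Hypothetical field data: 3 failed cards out of 400 deployed.
    low, high = wilson_interval(3, 400)
    print(f"observed {3 / 400:.2%}, plausible range {low:.2%} to {high:.2%}")
```

A wide interval suggests that more field data is needed before concluding that a failure pattern is real.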

Sometimes equipment that functions well in one place develops problems at another geographical location because of heavy lightning or the crossing of high-voltage power lines. Such operational and locational details help steer the analysis along the right lines. There can be situations in which the problem is not with the failed component at all; a wrong application of the component, or a lack of proper circuit protection, may have caused the failure by exposing the device to stress levels higher than it is rated to withstand. The procedure to be used for the failure analysis investigation is decided on the basis of the data collected.

As a preliminary step, a rough flow-chart of the FA procedure is drawn up based on the type of problem. Information is collected about the design: a circuit diagram of the card where the component is used, the datasheet of the failed component, make and batch number details, the number of failures, and the conditions under which failures occur. Based on this preliminary study, a hypothesis is made about the cause and type of failure. In the next step, fresh samples of the devices from the same batch are sought for further testing. This helps in identifying component-level faults, inherent device problems, application problems, batch-related problems, and related causes of failure. The failed components, unless they are totally destroyed or burnt, are tested electrically in a component tester, which checks the functionality of the device, or in a curve tracer to study the V-I characteristics. The same test is carried out on a good sample to reveal differences in electrical characteristics between the good and failed devices.

Once the component is confirmed to have failed, the mode of failure is identified using any of the failure analysis techniques described in the following section. The next step is to study the card where the component was used, to identify the cause of failure. Components will not fail on their own. The cause of failure is very rarely found to be infant mortality or a batch problem. Generally, failures are caused by overstress, whether electrical (EOS) or thermal, by mishandling (e.g., ESD damage due to non-observance of precautions), or by problems created by nearby components or associated circuits. Other components in the circuit can sometimes cause a device to fail; examples are the leakage inductance of a defective transformer, and hot heat sinks mounted close to electrolytic capacitors, which can cause such capacitors to fail. Failure can also be caused by defective PCB construction, the operating environment, and similar factors, and not by component-related factors alone.

After the failure analysis is completed, a report should be prepared giving details of the analysis and the corrective action suggested. The report should cover the problem reported, the analysis carried out, test results, readings of parameters if taken, the techniques used for the investigation, the batch number and make of the component involved, the exact cause of failure identified, and the corrective action recommended to overcome the failure.
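The report contents listed above map naturally onto a simple record. The sketch below is one possible layout; the class name, field names, and the sample entries are all illustrative and not part of any standard report format.

```python
from dataclasses import dataclass, field, fields


@dataclass
class FailureAnalysisReport:
    """Field names are illustrative; they mirror the items listed in the text."""
    problem_reported: str
    analysis_carried_out: str
    test_results: str
    techniques_used: list          # names of the FA techniques applied
    component_make: str
    batch_number: str
    cause_of_failure: str
    corrective_action: str
    parameter_readings: dict = field(default_factory=dict)  # optional: "if taken"

    def missing_items(self):
        """Names of mandatory entries that are still empty."""
        return [f.name for f in fields(self)
                if f.name != "parameter_readings" and not getattr(self, f.name)]


if __name__ == "__main__":
    # Hypothetical draft report with the conclusion still to be written.
    draft = FailureAnalysisReport(
        problem_reported="Card resets under load",
        analysis_carried_out="Visual check, curve tracing, X-ray",
        test_results="Output pin shorted to ground",
        techniques_used=["External visual examination", "X-ray examination"],
        component_make="(make)",
        batch_number="(batch)",
        cause_of_failure="",
        corrective_action="",
    )
    print(draft.missing_items())
```

A check such as `missing_items()` makes it harder to close a report before the cause and the corrective action have been filled in.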

Failure analysis methods

Several techniques have evolved over the years for carrying out failure analysis of components of various types (see Table 1).

Table 1 — Failure analysis
Method | Description
External visual examination | Failed samples are examined thoroughly, if necessary under optical microscope and notes are made on the observations.
Electrical measurement and testing | Component is tested to check functionality. Measurement of critical parameters is made to identify failure mode.
IR or X-ray examination | This helps in non-destructively viewing the internal connections and structure of the component, before decapsulation is done. For IR microscopy, back-polishing of plastic encapsulated component is done.
Decapsulation | The internal structure of the component is exposed for examination. This is done by mechanical or chemical methods to remove the outer encapsulation.
Optical or Electron microscopy | The die is examined under high magnification optical microscope or electron microscope. In most cases, the failure cause can be identified by this examination.
Destructive analysis | Selective etching of layers, bond pull and die shear, internal circuit probing are some of the techniques used.

The decision in each case on the method to be used for failure analysis depends on the extent and type of failure observed.
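The sequence in Table 1, together with the option of skipping steps, can be written down as a small planning helper. This is a sketch only: the `alters_sample` flags are this example's own reading of the table, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    alters_sample: bool  # True once the step changes the sample irreversibly


# Order follows Table 1; the alters_sample flags are an assumption of this sketch.
SEQUENCE = (
    Step("External visual examination", False),
    Step("Electrical measurement and testing", False),
    Step("IR or X-ray examination", False),
    Step("Decapsulation", True),
    Step("Optical or electron microscopy", False),
    Step("Destructive analysis", True),
)


def plan(skip=()):
    """Return the Table 1 steps in order, minus any the analyst chooses to skip."""
    unknown = set(skip) - {s.name for s in SEQUENCE}
    if unknown:
        raise ValueError(f"unknown step(s): {sorted(unknown)}")
    return [s for s in SEQUENCE if s.name not in skip]


def steps_before_opening(steps):
    """Names of the steps that can be completed while the sample is still intact."""
    intact = []
    for step in steps:
        if step.alters_sample:
            break
        intact.append(step.name)
    return intact


if __name__ == "__main__":
    print(steps_before_opening(plan(skip=("IR or X-ray examination",))))
```

Keeping the order fixed and only allowing steps to be dropped preserves the rule that evidence is gathered from the intact sample first.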

Commonly used techniques for failure analysis investigations

When a system failure is reported, it is necessary to investigate the cause of the problem systematically. The faulty card or module should be examined thoroughly for visible and obvious signs of failure, such as charring of a device or other obvious damage. Try to collect as much data as possible about the extent of the failure, its frequency, and the conditions under which it occurs: whether it occurs during a particular load or test condition, the number of failures and the corresponding sample size, and whether any correlation can be drawn between related failures of the cards or components. It is advisable to carry out the analysis step by step, starting with non-destructive techniques on the failed components and progressing gradually to destructive methods.

The main objective of this approach is to avoid destroying evidence and to ensure that the chemical action of destructive methods does not lead to a faulty analysis. For example, using acids to etch a plastic package can corrode metallic parts of the device, such as the metallization, in the presence of corrosive chemicals and moisture, and it may then be difficult to tell whether the corrosion existed before the device was opened or was caused by chemical action afterwards. Therefore, non-destructive examination techniques should be used first in failure analysis work.

Collect information about the conditions under which the failure occurred. Some stressful conditions such as high voltage transients, lightning occurrence, and ESD can cause failure and information about such occurrences will be useful to trace the exact cause of failure. Environmental conditions such as humidity, temperature, dust, salinity, or presence of chemical contaminants in the atmosphere in the area of operation of the system where failure occurred should be noted. Any drift in the parameters of operation of any component, or an associated fault in some other component which triggered the failure should be looked into.

Functional and parametric electrical tests

In the case of components, the failed sample is electrically tested in an automatic tester or using laboratory instruments (oscilloscope, test pattern generator, curve tracer) to verify failure of the device and observe the critical parameters. Device malfunction and any deviation from standard device characteristics can be observed by this method. Generally, a curve-tracer is used to study the device's input / output characteristics. Faults such as open circuit / short circuit, degradation of device characteristics, etc. can be detected by this method. X-ray examination may be done at this stage to find out any internal defects in the component. The relevant parameters should be measured in the failed as well as a good device at test conditions which nearly correspond to the application circuit.
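The comparison between a failed device and a good one can be reduced to a few numeric checks once both V-I sweeps have been recorded. The sketch below is one hedged way to do that; the function name, the labels it returns, and all three thresholds are hypothetical and would have to be chosen per device type.

```python
def classify_vi(reference, failed, open_fraction=0.01, short_factor=10.0, tolerance=0.2):
    """Compare currents measured at the same voltages on a good and a failed device.

    reference, failed: lists of (voltage, current) pairs taken at identical voltages.
    The three thresholds are hypothetical defaults, not datasheet values.
    """
    if len(reference) != len(failed) or not reference:
        raise ValueError("both sweeps need the same, non-zero number of points")
    ratios = []
    for (v_ref, i_ref), (v_fail, i_fail) in zip(reference, failed):
        if v_ref != v_fail:
            raise ValueError("sweeps must use the same voltage points")
        if i_ref != 0:
            ratios.append(abs(i_fail) / abs(i_ref))
    if not ratios:
        raise ValueError("reference current is zero at every point")
    if max(ratios) <= open_fraction:
        return "open circuit suspected"          # almost no current anywhere
    if min(ratios) >= short_factor:
        return "short circuit or heavy leakage suspected"
    if any(abs(r - 1.0) > tolerance for r in ratios):
        return "degraded characteristics"
    return "within tolerance of reference"


if __name__ == "__main__":
    good = [(0.5, 1e-3), (1.0, 5e-3), (1.5, 12e-3)]   # made-up sweep data
    bad = [(0.5, 0.0), (1.0, 0.0), (1.5, 1e-6)]
    print(classify_vi(good, bad))
```

A result from such a check is only a pointer to the failure mode; it does not replace the examination steps that follow.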

Microscopy techniques

Low-magnification microscopy:

The component is first examined under a low-power optical microscope, with a magnification in the range of 10x to 100x, to look for external damage, check the logo on the package (to compare later with the logo on the die and detect spurious devices), and spot handling damage, hairline fractures in the component leads or pins, etc. These observations should be recorded as part of the analysis.

Photograph 3 shows a typical fault which can be observed by this method. Observe the corrosion inside the crystal oscillator module, which occurred due to ingress of moisture during cleaning of assembled boards with water. The sealing in the oscillator module was defective and this provided a path for entry of water.

High-magnification microscopy:

After a device is decapsulated, its inner structure can be examined under a higher-magnification microscope (up to 2000x) to reveal any internal damage. Before examination, the sample is generally cleaned ultrasonically to remove fine particles of dust which may be adhering to the device structure after the decapsulation process. Damage due to electrical overstress (EOS), corrosion of metallization patterns, damage to bond-wires, oxide layer damage, spiking faults, etc. can be identified by this method.

Photograph 1 shows an example of such an observation. EOS has induced damage in the internal structure of the IC, as seen here.

Infra-red microscopy:

This method relies on the transparency of silicon to IR wavelengths. Using this technique, certain types of failure in devices, such as ball bond defects, corrosion, intermetallic diffusion, overstress effects, and spiking across layers, can be identified. To prepare a sample for IR microscopy, the leads are bent backwards and the back side of the package is polished to remove the encapsulation. Polishing is stopped on reaching the die surface. The device's lower layers, which cannot be seen by decapsulating the upper package, can be viewed by this technique. Since the upper layer of the device is not damaged by chemical etching, electrical measurements can be made on the sample if required. ESD and corrosion damage in the inner layers of devices can be identified using the IR technique.

Other techniques used to study the internal structure of devices are scanning acoustic microscopy, scanning electron microscopy, X-ray, and thermal imaging. These techniques are used when optical techniques do not help in identifying the problem in the device. Micro-probing may be used to make measurements and identify the failed nodes in the device.

Photographs 5 and 6 show cases of EOS damage in two ICs, as observed using an IR microscope. Photograph 2 shows the X-ray view of the internal structure of a failed IC, where fusing of bond-wires has occurred due to EOS.



Decapsulation techniques

Depending on the package material used, different techniques are used for opening devices for internal examination. Some of the commonly used packaging materials are plastic, ceramic, and metal-can packages. Plastic encapsulation is etched out by chemical agents such as hot fuming nitric acid or sulphuric acid delivered through a jet delivery system. Many etching agents exist and the reader is referred to reference (4) which gives a broad list of chemical agents used. With rapid strides being made in the area of device technology, new types of packaging materials for device encapsulation are being developed and used in the manufacture of semiconductor devices.

Ceramic packages are opened by removing the encapsulation mechanically. The tools used for this rely on fracturing the brittle ceramic package by applying pressure.

Metal-can packages, such as those used for transistors, are opened using rotary cutters fitted with sharp blades.

Metal-lidded packages, such as those used in some LSI devices, are opened mechanically by lifting a corner of the seal after a little sawing. Thermally opening the solder seal is also possible, provided care is taken to avoid thermal overstress damage to the die within. In all cases, when a device is opened by mechanical means, care should be taken to ensure that the tool does not damage the interconnections or the die.

Glass packages:

These are quite delicate and require careful handling. Glass packages are opened by mechanically lapping the package along the axis till the active device region is reached.



Failure modes and mechanisms of commonly used components

The following section gives a brief overview of commonly observed failure mechanisms in semiconductor devices and passive components used in electronic circuits.

Semiconductor devices:

  • Cracking of encapsulation due to thermal overstress

  • Penetration of moisture, flux contaminants during soldering, washing of boards, storage under humid conditions, etc. due to seal integrity problems.

  • Mechanical stress cracks due to differential thermal expansion of plastic encapsulant, metal leads, die.

  • Chip to substrate attachment failure leading to voids and thermal stress problems.

  • Bond wire snapping due to EOS.

  • Deformation of bond wires due to improper bonding.

  • Cracks at the bonding pad-bond wire junction

  • Metallization damage due to EOS, ESD, corrosion.

  • Electromigration of metal along the direction of current flow.

  • Hillock formation by metal ions.

  • Degradation of metallization at high temperature.

  • Oxide layer faults due to impurities, ESD damage, pin-hole due to etching processes.

  • Defects in the bulk semiconductor material such as crystal defects.

  • Design and fabrication faults, misalignment of layers, geometric defects.

  • Leakage at p-n junction.

  • Deviation from the normal characteristics of the device.

  • Changes in threshold voltage/current characteristics.

Resistors:

  • Open circuit caused by thermal overstress due to EOS (high current flow leading to increased I²R loss).

  • Cracks at the lead-body interface leading to open-circuit.

  • Degradation in value due to application of high levels of stress, exposure to high humidity conditions, high temperature operating environment.
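The resistive heating behind the first item above grows with the square of the current, so a modest over-current can push a resistor well past its rating. The sketch below works through that arithmetic; the part values are hypothetical and the function names are this example's own.

```python
def resistor_dissipation(current_a, resistance_ohm):
    """Power dissipated in a resistor, P = I^2 * R, in watts."""
    return current_a ** 2 * resistance_ohm


def stress_ratio(current_a, resistance_ohm, rated_power_w):
    """Dissipated power as a fraction of rated power (above 1.0 means overstress)."""
    if rated_power_w <= 0:
        raise ValueError("rated power must be positive")
    return resistor_dissipation(current_a, resistance_ohm) / rated_power_w


if __name__ == "__main__":
    # Hypothetical part: 100 ohm resistor rated at 0.25 W.
    for amps in (0.02, 0.05, 0.10):
        print(f"{amps:.2f} A -> {stress_ratio(amps, 100.0, 0.25):.2f} x rated power")
```

In this made-up case, doubling the current from 0.05 A to 0.10 A takes the part from its full rating to four times its rating.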

Capacitors:

  • Rupture of oxide film in electrolytic capacitors due to application of high electric field.

  • Leakage of electrolyte in electrolytic capacitors due to high temperature, faulty seal.

  • Moisture ingress due to voids between the leads and body leading to a short circuit.

  • Leakage current

  • Degradation of dielectric material due to exposure to humidity, high temperature, aging.

  • A unique property of aluminium electrolytic capacitors is that they get set to the voltage at which they are operated even though the rated voltage may be higher. Hence excessive derating of applied voltage should not be done in the case of such capacitors.

  • Shift in parameters.

  • Lowering of insulation resistance.

  • Open circuit failure.

  • Short circuit failure.

  • Corrosion of the electrodes due to chemical action caused by contaminants and moisture.

  • Polarity reversal in electrolytic capacitors can cause damage.

  • Disconnection of lead wires from the terminations.

  • Drying up of electrolyte due to operation at high temperature.

  • Dielectric breakdown due to application of high voltage beyond the rating.
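The effect of operating temperature on electrolytic capacitors can be put into rough numbers. The sketch below uses a widely quoted rule of thumb, not taken from this article, that expected life roughly doubles for every 10 °C the operating temperature falls below the rated temperature. The function name and the example part are hypothetical, and manufacturer data should take precedence.

```python
def estimated_life_hours(rated_life_h, rated_temp_c, operating_temp_c):
    """Rule-of-thumb life estimate: life doubles per 10 degC below rated temperature.

    This is an approximation assumed for illustration, not a datasheet formula.
    """
    if rated_life_h <= 0:
        raise ValueError("rated life must be positive")
    return rated_life_h * 2 ** ((rated_temp_c - operating_temp_c) / 10.0)


if __name__ == "__main__":
    # Hypothetical capacitor rated for 2000 h at 105 degC.
    for temp in (105, 95, 85, 65):
        print(f"{temp} degC -> about {estimated_life_hours(2000, 105, temp):.0f} h")
```

Read the other way round, the same rule suggests why a capacitor placed next to a hot heat sink can dry out far sooner than its rated life implies.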

Coils:

  • Open circuit of coil wire due to thermal overstress caused by shorting of adjacent turns where insulation has been damaged during the winding process or due to a manufacturing process fault.

  • Nicks and kinks in the wire can cause the above failure to occur.

Transformers:

  • Open circuit fault in primary and secondary windings due to excessive thermal stress caused by EOS, or shorting of windings as in the case of coils.

  • High levels of parasitics such as leakage inductance and inter-winding capacitance due to faulty design and manufacturing technique.

  • Short circuit between primary and secondary due to poor isolation, low dielectric withstanding voltage.

  • High levels of copper and eddy current losses, which lead to high heat dissipation in the transformer and affect adjacent components. These are basically caused by poor design.

  • Corona discharge can sometimes occur between adjacent turns or windings. To prevent this, the transformer should be properly impregnated.

Relays:

  • Arcing-induced damage of contacts.

  • Corrosion of contacts due to ingress of moisture, flux, cleaning agents due to improper sealing.

  • Melting of contacts due to electrical overstress (EOS).

  • Coil damage due to EOS.

  • Damage to plastic body due to exposure to high temperature during soldering, or internally generated heat due to EOS.

Printed circuit board:

  • Discolouration due to exposure to high temperature during soldering, heat dissipation of components on the board.

  • Delamination due to exposure to high temperature.

  • Warping due to exposure to high temperature, faulty board design, insufficient thickness of laminate, faulty layout and mounting of components on the board.

The commonly observed failure mechanisms, their causes, the analysis techniques that can detect them, and the test screens that can be used to precipitate them are listed in Table 2 below.

    Table 2 — Failure mechanisms
Failure mechanism observed | Possible cause of failure | Analysis technique which can be used to detect this fault | Test screen which can be used to precipitate this failure mechanism
Cracking of plastic package | Thermal shock | Visual examination with low magnification optical microscope | Temperature cycling
Cracking of die | Thermal shock, thermal processing, fabrication defects | Visual examination with high magnification optical microscope | Temperature cycling
Voids in plastic package | Manufacturing process / material defects | Acoustic microscopy | Temperature cycling
Delamination of plastic / die | Differential thermal expansion | Acoustic microscopy | Temperature cycling, thermal shock
Bond pad and wirebond corrosion | Entry of moisture, contaminants | Visual examination, electrical testing, wire bond pull test | Storage at high temperature
Corrosion of metallization | Presence of contaminants during manufacture, entry of contaminants later | Optical, acoustic or electron microscopy, electrical testing | Electrical testing, temperature cycling
Wirebond intermetallics | Contamination of bond area, high temperature | Visual, electron microscopy, x-ray examination | Temperature cycling, high temperature storage, mechanical shock, vibration testing, burn-in, electrical testing
Shorts and opens in die | Contamination of bond pads, excess intermetallic formation in wire-bond structure | Electrical testing | High temperature storage, temperature cycling, burn-in, electrical testing
Snapping of bondwire | Electrical overstress, contamination of bond pad, wirebond area | Visual examination | Temperature cycling, high temperature storage
Contaminants in die | Ingress through faulty seal | Visual inspection, electrical testing, acoustic / electron microscopy | Temperature cycling, high temperature storage
Voids in die attach | Thermo-mechanical stress | Optical microscopy | Temperature cycling
Corrosion of metal package | Moisture, saline or chemically corrosive atmosphere | Visual examination | Use a good plating on the metal case, such as a noble metal
Corrosion of leads | Moisture, chemical action due to contaminants | Visual examination | Solderability test for leads
Poor seal in hermetic package | Manufacturing process defects, gaps at sealing edges due to wetting problems | Hermeticity test, visual examination under microscope | Hermeticity test, thermal shock, high temperature storage, electrical testing, visual examination
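Table 2 can also be read in the other direction: given a test screen, which failure mechanisms is it expected to precipitate? The sketch below encodes a subset of the rows as a lookup table. The dictionary layout and function name are this example's own, and only five rows are included to keep it short.

```python
# Subset of Table 2: mechanism -> (detection techniques, test screens).
TABLE_2 = {
    "Cracking of plastic package": (
        ["Visual examination with low magnification optical microscope"],
        ["Temperature cycling"],
    ),
    "Voids in plastic package": (
        ["Acoustic microscopy"],
        ["Temperature cycling"],
    ),
    "Delamination of plastic / die": (
        ["Acoustic microscopy"],
        ["Temperature cycling", "Thermal shock"],
    ),
    "Bond pad and wirebond corrosion": (
        ["Visual examination", "Electrical testing", "Wire bond pull test"],
        ["Storage at high temperature"],
    ),
    "Poor seal in hermetic package": (
        ["Hermeticity test", "Visual examination under microscope"],
        ["Hermeticity test", "Thermal shock", "High temperature storage",
         "Electrical testing", "Visual examination"],
    ),
}


def mechanisms_precipitated_by(screen):
    """Mechanisms (from the subset above) whose test screens include `screen`."""
    wanted = screen.lower()
    return sorted(
        mechanism
        for mechanism, (_techniques, screens) in TABLE_2.items()
        if wanted in (s.lower() for s in screens)
    )


if __name__ == "__main__":
    print(mechanisms_precipitated_by("thermal shock"))
```

The same structure could be queried by detection technique instead, for example to list what acoustic microscopy is expected to reveal.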

Case studies

A few case studies of failure analysis are presented here as examples to illustrate the various techniques of investigation discussed.

Final analysis

Miniaturization of electronic components is progressing at a rapid pace, and device geometries have been shrinking over the years with a corresponding increase in circuit complexity. New materials are being developed to package devices at lower cost and to protect them against thermal stresses, moisture, etc. Higher complexity and finer device geometries will present problems for failure analysis work and will require new types of instrumentation techniques to probe into the die-level world. A failure analyst should have a thorough knowledge of component and system design techniques to tackle failure analysis tasks effectively. The rapid strides in integrated circuit technology and microelectronic packaging techniques will throw up many challenges for failure analysts in the coming years.


    References

V. Lakshminarayanan has over 17 years of experience in the area of design and development of electronic systems. He is a member of the IEEE and is Coordinating Engineer - Failure Analysis & Reliability at the Centre for Development of Telematics, Bangalore, India.




