
Fault Diagnosis Overview

David Lavo
UC Santa Cruz
January 13, 2005
Outline
• Introduction: What is Fault Diagnosis?
• Components: What’s involved?
• Algorithm details: How does it work?
• Diagnosis in practice: How does it really
work?
• Research: Why does (or doesn’t) it work?
How should it work?

©2005 David Lavo Fault Diagnosis Overview 2


What is Fault Diagnosis?

• A guess as to what’s wrong with a
malfunctioning circuit
• Narrows the search for physical root cause
• Makes inferences based on observed
behavior
• Usually based on the logical operation of the
circuit



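VLSI Fault Diagnosis
(in One Slide)
[Figure: flow diagram. Tests applied to the defective circuit yield the observed behavior; the diagnosis algorithm turns that behavior into a diagnosis (a location or a fault), which guides physical analysis.]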


Two Types of Diagnosis

• Circuit Partitioning (“Effect-Cause” Diagnosis)
– Identify fault-free or possibly-faulty portions
– Identify suspect components, logic blocks,
interconnects
• Model-Based Diagnosis (“Cause-Effect”
Diagnosis)
– Assume one or more specific fault models
– Compare behavior to fault simulations



Circuit Partitioning
• Separate known-good portions of circuit from
likely areas of failure
• Simplest method: identify failing flip-flops
– Tester can identify failing flops or outputs
– Input cone of logic is suspect
– Intersection of multiple cones is highly
suspect
– Single clock pulse with scan can be used
for sequential/functional fails
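
The cone-intersection idea above can be sketched in a few lines of Python. This is a minimal illustration, not a production flow: the `DRIVERS` netlist, the signal names, and the helper functions are all hypothetical, and taking the intersection implicitly assumes a single defect that explains every failing flop.

```python
from collections import deque

# Hypothetical netlist: each signal maps to the signals that drive it.
# Primary inputs (a, b, c, d) have no entry because nothing drives them.
DRIVERS = {
    "n1": ["a", "b"],
    "n2": ["b", "c"],
    "n3": ["c", "d"],
    "ff1": ["n1", "n2"],
    "ff2": ["n2", "n3"],
}

def input_cone(signal, drivers):
    """Return every signal that can influence `signal`, including itself."""
    cone = {signal}
    queue = deque([signal])
    while queue:
        current = queue.popleft()
        for source in drivers.get(current, []):
            if source not in cone:
                cone.add(source)
                queue.append(source)
    return cone

def suspects(failing_points, drivers):
    """Intersect the input cones of all failing flip-flops or outputs."""
    cones = [input_cone(point, drivers) for point in failing_points]
    return set.intersection(*cones) if cones else set()

# Both flops fail, so only logic shared by both cones remains highly suspect.
common = suspects(["ff1", "ff2"], DRIVERS)
print(sorted(common))  # ['b', 'c', 'n2']
```

With multiple defects the intersection can be empty or misleading, in which case the union of the cones is the safer suspect region.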



Back-Tracing Failures
aka Effect-Cause Diagnosis

• Reasoning based on observed behavior and
expected (good-circuit) functions
• Commonly used at system and board-levels
• Tries to separate good and suspect areas
• Advantage: Simple and general
• Disadvantage: Not very precise, often gives
no indication of defect mechanism



Cause-Effect Diagnosis
• Start from possible causes (fault models),
compare to observed effects
• A simulator is used to predict behavior of the
circuit in the presence of various faults
• Match prediction(s) against observed behavior
• Advantage: Implicates a mechanism as well as a
location
• Disadvantage: Can be fooled by unmodeled
defects



Cause-Effect Diagnosis
[Figure: tests applied to the defective circuit yield a behavior signature (010001010100010101010 …). The fault simulator produces candidate signatures (010100110000101010100 …, 101000100001011101100 …, 010100010100011101100 …, 000111000101010011110 …). The diagnosis algorithm compares the behavior signature with the candidate signatures and reaches a conclusion.]


Components of Fault Diagnosis

• Fault models
• Fault simulators
• Fault dictionaries
• Diagnosis algorithms



Fault Models
• A fault model is an abstraction of a type of
defect behavior
• A fault instance is the application of a model
to a circuit wire, node, gate, etc.
• Used to create and evaluate test sets
• For diagnosis, they can be used to simulate
and predict faulty behaviors



Stuck-at Fault Model
• The most-used fault model (by far)
• Simple to simulate and enumerate
• Effective for testing, fault grading, and
diagnosis of some defects
• Many defects are not well represented by the
stuck-at model
[Figure: node A stuck-at 1 on a gate with inputs A and B; signals are annotated with fault-free/faulty logic values, e.g. 0/1]
Bridging Fault Model
• Shorts are a common defect type in CMOS
• Different bridging fault models have varying
accuracy and precision, from simplistic to very
sophisticated
• Difficult or impractical to enumerate
[Figure: nodes X and Y bridged; node X forces Y to a value of 0, so Y shows 1/0 (fault-free/faulty)]
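
To see why stuck-at faults are easy to enumerate while bridges are not, it helps to count instances. The sketch below assumes a hypothetical 100-node circuit and treats every unordered pair of nodes as a possible two-line bridge; real bridge lists are usually pruned, for example by layout adjacency, so the pair count is an upper bound rather than a realistic fault list.

```python
from itertools import combinations

def stuck_at_faults(nodes):
    """Two stuck-at instances per node: stuck-at-0 and stuck-at-1."""
    return [(node, value) for node in nodes for value in (0, 1)]

def two_line_bridges(nodes):
    """Every unordered pair of distinct nodes is a potential two-line bridge."""
    return list(combinations(nodes, 2))

nodes = [f"n{i}" for i in range(100)]  # hypothetical 100-node circuit

print(len(stuck_at_faults(nodes)))   # 200: grows linearly with node count
print(len(two_line_bridges(nodes)))  # 4950: grows quadratically
```

At a million nodes the same arithmetic gives two million stuck-at instances but roughly 5 × 10^11 candidate pairs, which is why bridge candidate selection matters so much.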
Some Diagnostic Fault Models
[Figure: four example diagnostic fault models: Gate Fault, Net Fault, Bridging Fault, Path Fault]

Fault Simulators
• A fault simulator can simulate instances of a
particular fault model
• Inputs:
– Circuit (netlist)
– Test set
– Faultlist (list of fault instances)
• Output: circuit response
• Usually, simulates the presence of a single
fault instance (“single-fault assumption”)
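
A toy single-fault simulator makes the inputs and output listed above concrete. Everything here is an illustrative assumption: the two-gate netlist, the test set, and the faultlist are invented, and the sketch only models stuck-at faults on whole nets (no separate fanout-branch faults). The output is reduced to one pass/fail bit per vector.

```python
# Hypothetical netlist in topological order: (output, gate type, inputs).
NETLIST = [
    ("n1", "AND", ("a", "b")),
    ("y", "OR", ("n1", "c")),
]
PRIMARY_INPUTS = ("a", "b", "c")
OUTPUTS = ("y",)
GATES = {"AND": lambda x, y: x & y, "OR": lambda x, y: x | y}

def simulate(vector, fault=None):
    """Evaluate one test vector; `fault` is (node, stuck_value) or None."""
    values = dict(zip(PRIMARY_INPUTS, vector))

    def force(node):
        # Single-fault assumption: at most one node is overridden.
        if fault is not None and fault[0] == node:
            values[node] = fault[1]

    for node in PRIMARY_INPUTS:
        force(node)
    for out, kind, ins in NETLIST:
        values[out] = GATES[kind](*(values[i] for i in ins))
        force(out)
    return tuple(values[o] for o in OUTPUTS)

def fault_simulate(test_set, faultlist):
    """Return {fault: pass/fail bits}, one bit per vector (1 = fail)."""
    good = [simulate(v) for v in test_set]
    return {
        fault: tuple(int(simulate(v, fault) != g) for v, g in zip(test_set, good))
        for fault in faultlist
    }

TEST_SET = [(1, 1, 0), (0, 1, 0), (0, 0, 1), (1, 0, 0)]
FAULTLIST = [("n1", 0), ("n1", 1), ("c", 1)]
signatures = fault_simulate(TEST_SET, FAULTLIST)
for fault, bits in signatures.items():
    print(fault, bits)
```

In this example `n1` stuck-at-1 and `c` stuck-at-1 produce identical signatures, because a stuck-at-1 on either input of an OR gate is equivalent to a stuck-at-1 on its output. No test set can tell such equivalent faults apart.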



Fault Dictionaries
• A fault dictionary is a database of the
simulated responses for all faults in faultlist
• Used by some diagnosis algorithms for
convenience:
– Fast: no simulation at time of diagnosis
– Self-contained: netlist, simulator, and test
set not needed after dictionary creation
• Can be very large, however!



The Full-Response Dictionary

• For each fault (f), store the response to each
test vector (v)
• One bit per vector, pass (0) or fail (1)
• For each vector, store the expected output
response (o)
• Total storage requirement: f × v × o bits



The Pass-Fail Dictionary

• For each fault, store only the test vector
responses
• One bit per vector, pass (0) or fail (1)
• Total storage requirement: f × v bits
• Much smaller than full-response, and often
practical for even very large circuits
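
A quick calculation shows how far apart the two dictionary sizes are. The sketch reads f, v, and o as the number of faults, test vectors, and observed outputs, and the circuit size used is a made-up example rather than data from any real design.

```python
def full_response_bits(faults, vectors, outputs):
    """Full-response dictionary: one bit per fault, vector, and output."""
    return faults * vectors * outputs

def pass_fail_bits(faults, vectors):
    """Pass-fail dictionary: one bit per fault and vector."""
    return faults * vectors

# Hypothetical circuit: 100,000 faults, 1,000 vectors, 500 outputs/scan cells.
f, v, o = 100_000, 1_000, 500
full = full_response_bits(f, v, o)
pf = pass_fail_bits(f, v)

print(full / 8 / 1e9, "GB")  # 6.25 GB
print(pf / 8 / 1e6, "MB")    # 12.5 MB
```

The ratio between the two is exactly o, so the saving grows with the number of observed outputs, at the cost of losing the information about which outputs failed.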



Dynamic Diagnosis
• Alternative to dictionary-based diagnosis
• Fault simulation is only done for certain faults,
based on test results
– Only simulate faults in input cones of failing
flip-flops/outputs
• Dictionary is eliminated, but requires complete
netlist and test pattern file
• Used by most commercial ATPG tools: Mentor
Fastscan, Synopsys, Cadence, etc.



Algorithm Details

• Role of a diagnosis algorithm
• Scoring methods
• Types of diagnosis algorithms



Diagnosis Algorithms

• Algorithms compare observed behavior to
predicted behaviors
• An algorithm attempts to “explain” the
observed failures with fault candidates
• The job of a diagnosis algorithm is to report
the best fault candidate(s)
• “Best” is determined by scoring method



Fault Candidate Scoring
• Two common scoring methods
– Match/mismatch points
– Fault candidate probability
• Other common scorings:
– Hamming distance
– Set intersection/overlap
– Nearest neighbor
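
The three "other" scorings can be illustrated on pass/fail signatures written as bit strings. The signatures and candidate names below are invented, and the overlap measure shown (intersection over union of the failing-vector sets) is just one plausible reading of "set intersection/overlap".

```python
def hamming(a, b):
    """Number of vectors on which two pass/fail signatures disagree."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    return sum(x != y for x, y in zip(a, b))

def failing_set(signature):
    """Indices of the failing vectors (bits equal to '1')."""
    return {i for i, bit in enumerate(signature) if bit == "1"}

def overlap(a, b):
    """Intersection over union of the two failing-vector sets."""
    fa, fb = failing_set(a), failing_set(b)
    union = fa | fb
    return len(fa & fb) / len(union) if union else 1.0

observed = "0101100"
candidates = {"A": "0101100", "B": "0100100", "C": "1101101"}

distances = {name: hamming(observed, sig) for name, sig in candidates.items()}
overlaps = {name: overlap(observed, sig) for name, sig in candidates.items()}
# Nearest neighbor: the candidate whose signature is closest to the observation.
nearest = min(distances, key=distances.get)

print(distances)  # {'A': 0, 'B': 1, 'C': 2}
print(nearest)    # A
```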



Match/mismatch Point Scoring
• Award points for matching observed failures
• Optionally deduct points for not predicting fails
• Nonprediction: A behavior not predicted by
candidate
• Misprediction: A prediction not fulfilled by
behavior
• Commercial tools (e.g. Fastscan) are usually
biased to lowest nonprediction
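
The definitions above translate directly into set arithmetic on failing-vector indices. The following sketch uses invented candidates and a simple "fewest nonpredictions first, then fewest mispredictions" ordering; it illustrates the bias described above and is not the ranking code of any commercial tool.

```python
def score(observed_fails, predicted_fails):
    """Return (matches, nonpredictions, mispredictions) for one candidate."""
    matches = len(observed_fails & predicted_fails)
    nonpredictions = len(observed_fails - predicted_fails)  # seen, not predicted
    mispredictions = len(predicted_fails - observed_fails)  # predicted, not seen
    return matches, nonpredictions, mispredictions

def rank(observed_fails, candidates):
    """Order candidates by fewest nonpredictions, then fewest mispredictions."""
    scored = {name: score(observed_fails, fails) for name, fails in candidates.items()}
    return sorted(scored, key=lambda name: (scored[name][1], scored[name][2], name))

observed = {2, 5, 7}  # indices of failing test vectors on the tester
candidates = {        # hypothetical fault candidates and their predicted fails
    "A/0": {2, 5, 7, 9},
    "B/1": {2, 5},
    "C/0": {2, 5, 7},
}

print(rank(observed, candidates))  # ['C/0', 'A/0', 'B/1']
```

`B/1` ranks last even though it never mispredicts, because it leaves one observed failure unexplained.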



Probabilistic Scoring
• Probability score based on matches and
mismatches and error assumptions
– Weights for non- and mis-prediction
– Different prediction probabilities for different
fault candidates (bridges vs. stuck-at)
• Usually normalized so that total of all
candidates equals 1.0
• UCSC method uses probabilities to compare
stuck-at candidates to bridges in same
diagnosis
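
A minimal sketch of probabilistic scoring follows. It is not the UCSC method: the likelihood model (a fixed penalty per nonprediction and per misprediction) and the per-model weights in `WEIGHTS` are assumptions chosen only to show how different fault models can be compared in one normalized ranking.

```python
# Hypothetical per-model penalties: (per nonprediction, per misprediction).
# Bridges are assumed to predict less reliably, so a misprediction costs less.
WEIGHTS = {"stuck-at": (0.01, 0.1), "bridge": (0.01, 0.5)}

def likelihood(observed, predicted, p_non, p_mis):
    """Unnormalized score: multiply in one penalty per mismatch."""
    nonpredictions = len(observed - predicted)
    mispredictions = len(predicted - observed)
    return (p_non ** nonpredictions) * (p_mis ** mispredictions)

def diagnose(observed, candidates):
    """Return {candidate: probability}, normalized to sum to 1.0."""
    raw = {
        name: likelihood(observed, predicted, *WEIGHTS[model])
        for name, (model, predicted) in candidates.items()
    }
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}

observed = {2, 5, 7}
candidates = {  # hypothetical candidates: (fault model, predicted failing vectors)
    "n4 stuck-at-0": ("stuck-at", {2, 5, 7, 9}),
    "n4-n8 bridge": ("bridge", {2, 5, 7, 9}),
    "n6 stuck-at-1": ("stuck-at", {2, 5}),
}

for name, p in sorted(diagnose(observed, candidates).items(), key=lambda kv: -kv[1]):
    print(f"{name}: {p:.3f}")
```

Both `n4` candidates predict the same failing vectors, yet the bridge ranks higher because the assumed weights treat its single misprediction as weaker evidence against it.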
Types of Diagnosis Algorithms
• Stuck-at
– Most common, best supported by tools
– Surprisingly effective (~60% exact matches)
– Very fast
• IDDQ
– Orthogonal set of failing data
– Requires interpretation of tester results
– Not well supported by tools



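IDDQ Threshold Setting
[Figure: plot for IDDQ threshold setting; vertical axis ticks run from 0 to 180 and horizontal axis ticks from 0 to 200; axis labels and data are not recoverable]
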
Types of Diagnosis Algorithms (cont)
• Bridging-fault
– May better represent common CMOS faults
– More complicated fault model
– Biggest problem: candidate selection
• Other possible (future) directions:
– Functional fails
– Delay fails
– Parametric failures



Diagnosis in Practice

• Using a diagnosis
• Translating the results: circuit navigation
• Evaluating diagnosis quality
• Commercial diagnosis tools



Using a Diagnosis
• Fault diagnosis is used to aid physical
inspection and root-cause identification
• Diagnosis output is logical, not physical:
– Abstract faults (such as stuck-at)
– Gates, ports (nodes), and nets
– No information about location or size
• Translation to physical location requires
navigation of circuit



Types of Circuit Navigation
• Netlist
– Examine RTL (Verilog/VHDL etc) for gates
and data paths
• Schematic
– Symbolic view of gates and wires
• Layout/artwork
– Graphical view of metal lines, poly, vias,
cell boundaries, etc.



Circuit Netlist
module TOP (CLK, Reset, StartOut, SiReady, Rst_CntN, Up_DnN, Wr, SDin, Wr_RAM, Wr_Rreg,
            RAM_Addr, ATG_TESTMODE, BIST_TESTMODE, SDout, TwoOnes, OneOne, NoOnes, TwoZeros,
            OneZero, NoZeros);

input CLK;
inout Reset, StartOut, SiReady, Rst_CntN, Up_DnN, Wr, SDin, Wr_RAM;
inout [2:0] RAM_Addr;
inout ATG_TESTMODE;
inout BIST_TESTMODE;
inout SDout, OneZero, NoZeros;
inout TwoOnes, OneOne, NoOnes, TwoZeros, Wr_Rreg;

// Tie off cells
TLOW tielow1 (.Q(tielow));
THIGH tiehigh1 (.Q(tiehigh));

// Inverted CLK
wire CLK_N;
INVFF clkinv (.Q(CLK_N), .A(CLK));

// PADS
PADNMIOSCM0H08N05B50 PAD001_StartOut (.PUEN(tiehigh), .PDE(tielow),
    .IEN(tielow), .I(StartOut_I), .SIGNAME(StartOut),
    .INMODE(in_mode_avail), .TESTI(jumper001),
    .TESTIEN(tiehigh), .SCANIN(jumper001),
    .OUTMODE(out_mode_avail), .TESTO(tiehigh), .TESTOEN(tiehigh),
    .O(tielow), .OEN(tiehigh));

Netlist Navigation
• Either use text editor on netlist, or use
browser function in simulator
• Browsers allow you to trace forward and
backward and see logic values
• Can be used to view hierarchy and functional
blocks
• Can be tedious



Circuit Schematic
Schematic Navigation
• Either hand-drawn (from netlist navigation) or
tool-generated gate symbols and wires
• Schematic tools in simulators also allow
forward and backward traversal and display
of logic values
• Used to verify fault propagation
• Does not reflect physical distances



Circuit Artwork
Layout (Artwork) Navigation
• Use routing/floorplanning tools to view artwork
• Can usually input cell or wire name and tool will
highlight the object
• Useful for determining (x,y) values
• Also good for evaluating physical implications of
a set of fault candidates
– Faults clustered in a small area are good
– Faults/nets spread around large die areas are
bad
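
One rough way to judge whether a candidate set is physically clustered is the area of the bounding box around the candidates' layout coordinates. The coordinates below are invented, and the bounding-box metric is an illustrative choice rather than something any particular layout tool reports.

```python
def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) for a list of (x, y) points."""
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs), max(ys)

def spread(points):
    """Area of the bounding box, in squared layout units."""
    x_min, y_min, x_max, y_max = bounding_box(points)
    return (x_max - x_min) * (y_max - y_min)

# Hypothetical (x, y) locations of fault candidates, in microns.
clustered = [(100, 200), (110, 205), (104, 212)]
scattered = [(100, 200), (4800, 3500), (2500, 90)]

print(spread(clustered))  # 120: a small window that can be inspected
print(spread(scattered))  # 16027000: candidates span most of the die
```

A single long net can defeat this metric, since one candidate may itself run across the die, so net length is worth checking as well.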



Fault Proximity

[Figure: two layout views. In one, the net runs across the die, so physical examination is almost impossible. In the other, the faults are contained in a small area, so physical examination is possible.]

Evaluating a Diagnosis
• A diagnosis without one or a few strong (high-
scoring) candidates is usually poor
• Can indicate:
– Multiple defects
– Unmodeled (complex) behavior
– Inappropriate algorithm
• If the diagnosis is poor, either try another
algorithm or look for more data (failures)



Evaluating a Diagnosis (cont)
• Many diagnoses (~60%) implicate a single
stuck-at fault
• Usually a good sign, but you must consider
equivalent faults
• Many defects can mimic a stuck-at fault,
without being a short to Vdd or Gnd
• Consider nearby nodes also, if practical



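Dominance Bridging Fault
[Figure: a FIB short connecting a strong inverter and a weak inverter, with one node annotated "Top candidate is stuck-at fault on this node."]

Candidate #2 is Best
[Figure: locations of Candidate #1, Candidate #2, and Candidate #3 relative to the FIB short]
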
Commercial Tool: Mentor Graphics
• ATPG tool: Fastscan
• Stuck-at diagnosis only
• No IDDQ capability
• Orders candidates by number of matched
failures (biased to lowest non-prediction)
• Also has netlist & schematic browser
• Based on Waicukauski & Lindbloom (D&T ’89)



Commercial Tool: Synopsys

• ATPG tool: TetraMAX
• J. Waicukauski moved to Synopsys after
writing Fastscan
• Diagnosis capability unknown: assumed to be
similar to Fastscan



Commercial Tool: Cadence
• ATPG tool: Encounter Test
• Test and diagnosis tools purchased from IBM
• IBM has had good diagnosis research, but
Encounter’s capabilities are unknown
• Also of interest: Silicon Ensemble (routing tool)
– Graphical artwork viewer
– Good for highlighting nets and cells based on
diagnosis results
– Good for determining (x,y) and producing
screenshots



Prior Art
• Waicukauski & Lindbloom, IEEE Design & Test, Aug. ’89
– Most widely used algorithm for commercial tools
– Finds candidates to match individual tests, attempts to “explain”
all failing tests
• Abramovici & Breuer, IEEE Trans. Computers, June ’80
– Effect-cause diagnosis
– Permanent stuck-at fault assumption
• Aitken & Maxwell, HP Journal, Feb. ’95
– Analysis of relative importance of models vs. algorithms
• Lavo, Larrabee, et al., Proceedings of ITC ’98
– Probabilistic scoring
– Mixed-model diagnosis
• Bartenstein et al., Proceedings of ITC ’01
– SLAT: Single Location At-a-Time diagnosis
– Focus on matching per-vector results



Prior Art (cont)
• Jee & Ferguson, Proceedings of ISTFA ’93
– Carafe – Inductive Fault Analysis (IFA)
– Examine circuit to determine likely failure locations
• Aitken, Proceedings of ITC ’95
– Using FIBs to insert defects
– Calibrate/evaluate diagnosis methods
• Henderson & Soden, Proceedings of ITC ’97
– Probabilistic physical failure analysis
• Nigh, Vallett, et al., Proceedings of ITC ’98
– Large-scale, multi-company SEMATECH experiment
– Failure analysis of timing and IDDQ fails



Research Directions
• Complex defect behaviors
– Beyond stuck-at and 2-line bridges
– Intermittent faults
– Delay and timing-related defects
– Parametric & process-related defects
– Multiple simultaneous defects
– Is there a simple, inductive way to infer
complex defects?



Research Directions (cont)
• Diagnosability
– What makes a particular circuit easy or
hard to diagnose?
– What can we do to make diagnosis easier?
• Evaluation of diagnoses
– What makes a good diagnosis?
– Can we quantify our confidence in a
diagnosis?



Research Directions (cont)
• Integration with physical FA & yield improvement
– Can we incorporate process information?
– Can we produce a “physical diagnosis”?
– On-line (or even on-chip) diagnosis
• Commercial toolflow integration
– Can diagnosis tools use industry-standard data
formats?
– Can commercial tools be scripted or
programmed to do better diagnosis?
