Secondary Method Validation
We run controls and samples, collect data, create graphs, charts and plots, and crunch numbers.
Then we stuff it all in a folder and hand it to an inspector. It's called method validation.
• Method validation studies verify that the lab can achieve product performance claims.
• They provide useful information about what to expect with a new method and how results may differ from the method being replaced.
When is it Done & Revalidated
• Whenever the conditions change for which the method has been validated (e.g., an instrument with different characteristics)
Process of establishing a Routine Test
• Validate Method Performance: AMR Verification / Range, Precision, Method Comparison
• Maintain Method
• Implement Method
• Prevent Problem
Statistical tools
Common tools:
Mean, SD, CV, Confidence intervals, Chi Squared, Regression analysis
All calculations made with statistical tools are estimates, not absolute truth.
Estimates will vary between studies.
Samples Used
Why is matrix an issue?
Matrix: everything in the sample except what we are measuring
Sample matrix can be altered by handling / processing
• Matrix proteins in particular
• Proteins may denature
Reference Guidelines
Clinical & Laboratory Standards Institute (CLSI):
Accuracy / Range:
• EP15-A2 User Verification of Performance for Precision and Trueness (2005)
Precision:
• EP5-A2 Evaluation of Precision Performance of Quantitative Measurement Methods (2004)
• EP15-A2 User Verification of Performance for Precision and Trueness (2005)
Method Comparison:
• EP9-A3 Measurement Procedure Comparison and Bias Estimation Using Patient Samples (2013)
Reference Interval:
• C28-A3c Defining, Establishing, and Verifying Reference Intervals in the Clinical Laboratory (2008)
Agenda
• AMR Verification / Range
• Precision
• Method Comparison
AMR Verification / Assay Range Study
Ideal: compare to true reference method using fresh patient samples
Next best: use special samples validated for the method
Study:
• Test accuracy samples in replicates, typically triplicate
• Compare mean of replicates to acceptable limits
Potential issues:
• Need to verify that assigned acceptable values are current for the method and reagent lot number
• Comparing results to the method in current use is NOT an accuracy study unless the current method is a true reference method
TSH : Method Validation : Instrument name (S/N. 546722)
AMR/Linearity/Dilution Recovery using Linearity Panels Lot# 076859
Date: 12-Jan-16
Instrument: ADVIA CENTAUR
Reagent: Lot # 045 : Exp: 20 Mar 2016
Linearity Material: AUDIT
Calibrators Lot#: 076859 Exp.: 31 Jan 2014
Units of Measure: uIU/ml
Calibrator  Target   1st Rep  2nd Rep  3rd Rep  4th Rep  Mean    Bias     % Bias  Limit (+/- 2 CV%)  Acceptability
S0          0.40     0.35     0.32     0.40     0.38     0.36    -0.037   -9.1%   13.3%              PASS
S1          2.50     2.30     2.40     2.35     2.33     2.345   -0.155   -6.2%   13.3%              PASS
S2          5.50     5.60     5.67     5.45     5.89     5.652   0.152    2.8%    13.3%              PASS
S3          10.00    11.00    10.40    10.54    10.50    10.61   0.61     6.1%    13.3%              PASS
S4          25.50    24.50    24.70    25.50    23.50    24.55   -0.95    -3.7%   13.3%              PASS
S5          45.50    43.78    44.01    43.68    43.89    43.84   -1.66    -3.6%   13.3%              PASS
S6          75.00    72.08    71.09    73.00    73.23    72.35   -2.65    -3.5%   13.3%              PASS
S7          105.00   99.00    99.32    100.00   98.50    99.21   -5.79    -5.5%   13.3%              PASS
[Figure: TSH Linearity, Achieved vs. Target]
Manufacturer's claimed linear range: 0.01 to 150 uIU/ml
Conclusion: Linearity on Instrument name S/N. 546722 demonstrated from 0.0 to 150 uIU/ml.
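The mean, bias, % bias and PASS/FAIL columns of the table above can be reproduced with a few lines of Python. This is a minimal sketch, not the worksheet's actual formula: the function name `check_level` is hypothetical, and it assumes the acceptability limit is applied to the absolute % bias. The values are those of the S1 row.

```python
from statistics import mean

def check_level(target, replicates, limit_pct):
    """Compare the mean of replicate results with the assigned target value."""
    observed = mean(replicates)
    bias = observed - target
    pct_bias = 100.0 * bias / target
    return {
        "mean": observed,
        "bias": bias,
        "pct_bias": pct_bias,
        # Assumption: the limit applies to the absolute % bias.
        "pass": abs(pct_bias) <= limit_pct,
    }

# S1 row: target 2.50, four replicates, +/- 13.3% limit.
result = check_level(2.50, [2.30, 2.40, 2.35, 2.33], limit_pct=13.3)
print(result)
```

Running the same function over every level gives one pass/fail decision per calibrator, which is how the table is laid out.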
Precision
Precision - Recommendation
At least two or three different control materials that represent low and high medical decision points are used.
• 20, 10 or 5 replicates of each control material, with one run per day over 5 days
• Calculate mean, SD and CV %
• Validate: determine whether the long-term precision is acceptable.
THCG ADVIA CENTAUR - Method Validation
Instrument Serial Number:
Date Performed: 18 JAN 2008
Control Name: IMMUNOASSAY PLUS    Manufactured by:
Lot No: 40191    Exp Date: 31 JULY 2009
Intra-Assay (Within Run) Precision, Conventional Units (mIU/mL)
• Replicates 1 to 20 recorded for Level 1, Level 2 and Level 3
• n, Mean, SD and CV(%) calculated for each level
Inter-Assay (Between Run) Precision, Conventional Units (mIU/mL)
• Replicates 1 to 25 recorded for Level 1, Level 2 and Level 3, five per run
• n, Mean, SD and CV(%) calculated for each level
Run  Date        Time   Replicates
1    08/01/2012  13:34  1-5
2    09/01/2012  14:30  6-10
3    10/01/2012  15:30  11-15
4    11/01/2012  16:50  16-20
5    12/01/2012  18:45  21-25
What does the SD represent?
$$SD = s = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n - 1}}$$
Precision: Assumptions & Limitations
Number of replicates affects calculated SD
• Fewer replicates tend to make SD appear larger
• Fewer replicates means less confidence in how accurately the calculated SD represents assay performance
Replicate  n = 20  n = 10  n = 6
1          93      93      -
2          96      -       -
3          100     100     100
4          93      -       -
5          89      89      -
6          94      -       94
7          87      87      -
8          89      -       -
9          86      86      86
10         88      -       -
11         86      86      -
12         91      -       91
13         92      92      -
14         91      -       -
15         102     102     102
16         90      -       -
17         90      90      -
18         85      -       85
19         91      91      -
20         92      -       -
Mean       91.3    91.6    93.0
SD         4.39    5.52    7.04
%CV        4.8%    6.0%    7.6%
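The mean, SD and %CV rows for the three columns can be recomputed from the 20 raw results. A minimal sketch, using only the Python standard library; it relies on the observation that the n = 10 and n = 6 columns appear to be every second and every third result of the n = 20 column.

```python
from statistics import mean, stdev

# The 20 replicate results from the n = 20 column.
full = [93, 96, 100, 93, 89, 94, 87, 89, 86, 88,
        86, 91, 92, 91, 102, 90, 90, 85, 91, 92]

subsets = {
    "n=20": full,
    "n=10": full[0::2],   # every second result, starting with the first
    "n=6": full[2::3],    # every third result, starting with the third
}

summary = {}
for label, values in subsets.items():
    m = mean(values)
    s = stdev(values)     # sample SD, (n - 1) in the denominator
    summary[label] = (round(m, 2), round(s, 2), round(100 * s / m, 1))
    print(label, summary[label])
```

The same assay data give a larger calculated SD as the number of replicates shrinks, which is the point the slide makes.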
How good are our estimates?
87 Data Points
• Overall mean: 224
• Overall SD: 9.91
• Overall CV: 4.42%
How well do smaller data sets estimate mean and SD?
Estimating Mean and SD
For a Mean:
• 8 to 10 replicates can give a useful estimate
For the SD:
• An SD calculated from a single study of 20 replicates is an estimate of the “true” SD
• This estimate can be from 30% smaller to 25% larger than the “true” SD and still be equivalent to the “true” SD
• To reliably estimate the SD to within ±10% of the “true” value requires 100+ values
• To conclude that a problem exists anytime an SD calculated from a single study exceeds the reference SD (IFU) is statistically incorrect and unrealistic
Groups of 10 results
Data     Mean    Δ Mean   SD      Δ SD
Overall  224     -        9.90    -
1-10     221.7   -1.1%    9.88    -0.2%
11-20    230.1   2.6%     11.76   18.8%
21-30    224.3   0.0%     10.41   5.2%
31-40    223.7   -0.2%    9.86    -0.4%
41-50    224.7   0.2%     8.17    -17.5%
51-60    226.7   1.1%     7.72    -22.0%
61-70    226.1   0.8%     9.15    -7.6%
71-80    220.2   -1.8%    11.42   15.4%
Groups of 20 results
1-20     -       -        11.42   15.4%
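How much an SD estimate can wander with the number of replicates can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of the 87 data points: it assumes normally distributed results with a true mean of 224 and a true SD of 9.9, and the function name `sd_spread`, the number of trials and the seed are all arbitrary choices.

```python
import random
from statistics import stdev

def sd_spread(n, true_mean=224.0, true_sd=9.9, trials=2000, seed=1):
    """Simulate many studies of n replicates and return the 2.5th and 97.5th
    percentiles of the calculated SD, expressed as a ratio to the true SD."""
    rng = random.Random(seed)
    ratios = sorted(
        stdev([rng.gauss(true_mean, true_sd) for _ in range(n)]) / true_sd
        for _ in range(trials)
    )
    return ratios[int(0.025 * trials)], ratios[int(0.975 * trials)]

for n in (10, 20, 100):
    low, high = sd_spread(n)
    print(f"n={n:3d}: calculated SD ranges from {low:.2f}x to {high:.2f}x the true SD")
```

The interval narrows as n grows, which is why a single small study that exceeds the reference SD is weak evidence of a problem.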
Validating Precision
Chi Squared
Using Chi Squared
Tests whether precision data collected on-site is comparable to expected performance
Provides objective test to support comparability, even if study SD exceeds reference SD
Replicate          Sample1  Sample2  Sample3
1                  93.00    93.00    100.00
2                  96.00    100.00   94.00
3                  100.00   89.00    86.00
4                  93.00    87.00    91.00
5                  89.00    86.00    102.00
6                  94.00    86.00    85.00
7                  87.00    92.00    -
8                  89.00    102.00   -
9                  86.00    90.00    -
10                 88.00    91.00    -
11                 86.00    -        -
12                 91.00    -        -
13                 92.00    -        -
14                 91.00    -        -
15                 102.00   -        -
16                 90.00    -        -
17                 90.00    -        -
18                 85.00    -        -
19                 91.00    -        -
20                 92.00    -        -
Mean               91.25    91.60    93.00
SD                 4.39     5.52     7.04
CV                 4.8%     6.0%     7.6%
N                  20       10       6
IFU Mean           95       95       95
IFU claimed CV     5.2%     5.2%     5.2%
Chi-square         15.8     12.5     12.2
Crit. Chi-square   31.4     18.3     12.6
CV acceptable      Yes      Yes      Yes
Chi squared - Assumptions & Limitations
$$\chi^2 = \frac{s^2 \, (n - 1)}{\sigma^2}$$
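A minimal sketch of this test applied to Sample1 from the table above. It follows the formula as written, with n - 1 degrees of freedom. The claimed SD is assumed to be the IFU claimed CV times the IFU mean, and the critical values are the upper 5% points from standard chi-square tables. The tabulated values in the table (15.8 against 31.4) appear to use n rather than n - 1, so the numbers here differ slightly while the conclusion is the same.

```python
from statistics import stdev

# Upper 5% critical values of the chi-square distribution (standard tables),
# keyed by degrees of freedom.
CHI2_CRIT_95 = {5: 11.070, 9: 16.919, 19: 30.144}

def chi_square_check(values, claimed_sd):
    """Return (chi-square statistic, critical value, acceptable?)."""
    df = len(values) - 1
    chi2 = stdev(values) ** 2 * df / claimed_sd ** 2
    crit = CHI2_CRIT_95[df]
    return chi2, crit, chi2 <= crit

sample1 = [93, 96, 100, 93, 89, 94, 87, 89, 86, 88,
           86, 91, 92, 91, 102, 90, 90, 85, 91, 92]

# Assumption: claimed SD = IFU claimed CV (5.2%) x IFU mean (95).
claimed_sd = 0.052 * 95

chi2, crit, ok = chi_square_check(sample1, claimed_sd)
print(f"chi-square = {chi2:.2f}, critical value = {crit}, acceptable = {ok}")
```

A statistic below the critical value means the study SD is statistically consistent with the claim, even when it is numerically larger than the claimed SD.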
What
Why
Typical Scenario
Now there is an issue
• Regression slope is not the same as the method comparison published in the package insert
• Difference between methods not the same as between respective peer groups on EQA reports or QC peer group reports
• Most common conclusion: There is something wrong with the new method or instrument
DO NOT use the original results obtained with fresh samples by one method to compare with results obtained later by another method using the same samples after being stored, especially if stored frozen
Considerations when testing samples
Range of concentration
• Cover as much of the shared analytical range as possible
• Focus on the range of clinical interest
• Sample concentrations should ideally be uniformly distributed over the range covered
• Single samples at extreme limits of the range are generally not useful; they can strongly bias statistical analysis
• Be very cautious about using proficiency testing, QC or other manufactured samples to extend the range of concentration due to potential for matrix issues
• If dilutions are important, keep the data for diluted samples separate
[Figure: scatter plots contrasting good and poor distribution of sample concentrations]
Testing Protocol
• Samples should be tested by both methods within a few hours or, at worst, within the same day
• If using stored samples, do not thaw or prepare samples until the day they will be tested
• Ideally, study should be done over several days, testing a different group of fresh samples each day
• Ideally, both methods should be recalibrated during course of study
• If using pooled, diluted, spiked, or manufactured samples (EQA, QC, etc.), make sure these samples are identified so they can be tracked throughout the study
Reviewing the data
[Figures: scatter plot of Method 2 vs. Method 1; difference plot of Difference (µg/L) vs. Concentration (µg/L)]
Scatter Plots
• Always the same scale for X and Y
• X: Comparative method; Y: Test method
[Figure: scatter plot of Method 2 vs. Method 1]
Difference Plots
Scatter Plots: Constant SD vs Constant CV
[Figures: Constant SD and Constant CV scatter plots, each with the identity line]
Bland Altman Plot
Reviewing the Data
Linear Regression
Regression Analysis
Misconceptions about correlation coefficient
Good correlation means all the points fall on or ...
r ≠ Bias
[Figure: scatter plot of Method 2 vs. Method 1]
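That r says nothing about bias can be shown with a tiny example. The data below are hypothetical: Method 2 is constructed to read exactly 20% lower than Method 1 at every concentration, and `pearson_r` is a plain implementation of the usual correlation coefficient.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

method_1 = [2.0, 5.0, 10.0, 15.0, 20.0, 25.0]   # hypothetical results
method_2 = [0.8 * v for v in method_1]          # reads 20% low everywhere

r = pearson_r(method_1, method_2)
mean_bias_pct = 100 * (sum(method_2) - sum(method_1)) / sum(method_1)
print(f"r = {r:.3f}, mean bias = {mean_bias_pct:.0f}%")
```

The correlation is essentially perfect while every result is biased, so r alone cannot show that two methods agree.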
Dealing with low r value
How to address:
• Obtain additional samples that span a broader concentration range
• Often not practical, especially for some methods
• Instead of regression analysis, use difference plot and estimate average difference
• Use an alternate regression model (Weighted, Deming, Passing-Bablok)
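As an illustration of one alternate model, here is a minimal sketch of Deming regression, which allows for measurement error in both methods. The function name `deming` and the sample values are hypothetical. `error_ratio` is the ratio of the error variance of the new method (y) to that of the comparison method (x), and the default of 1.0 (equal imprecision) is an assumption that should be replaced by a value from precision data where available.

```python
from math import sqrt

def deming(x, y, error_ratio=1.0):
    """Deming regression of y on x; returns (slope, intercept).

    error_ratio = (error variance of y) / (error variance of x).
    Assumes x and y are not perfectly uncorrelated (sxy != 0).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    d = error_ratio
    slope = (syy - d * sxx + sqrt((syy - d * sxx) ** 2 + 4 * d * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx

# Hypothetical paired results over a narrow concentration range.
comparison = [1.0, 2.1, 2.9, 4.2, 5.0, 6.1]
new = [1.2, 1.9, 3.2, 3.9, 5.3, 5.8]

slope, intercept = deming(comparison, new)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Ordinary least squares assumes the comparison method is error-free, which tends to flatten the slope when the concentration range is narrow; Deming regression relaxes that assumption.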
Case Studies
Dealing with data
Theory is great, but my data never looks like that!
Case #1: Few data points
Difference plot
How to address:
• Test additional samples or ...
• Estimate average difference
• Use difference plot
[Figure: difference plot against Mean of All]
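Estimating the average difference needs no regression at all. A minimal sketch with hypothetical paired results; the function name `difference_summary` is illustrative, and the limits of mean ± 1.96 SD are the conventional Bland-Altman limits of agreement rather than anything specified in this deck.

```python
from statistics import mean, stdev

def difference_summary(comparison, new):
    """Return the average difference (new - comparison) and the
    mean +/- 1.96 SD limits of agreement."""
    diffs = [b - a for a, b in zip(comparison, new)]
    avg, sd = mean(diffs), stdev(diffs)
    return avg, (avg - 1.96 * sd, avg + 1.96 * sd)

# Hypothetical paired results from a small study.
comparison = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
new = [4.5, 6.2, 8.6, 10.1, 12.7, 14.3]

avg, (low, high) = difference_summary(comparison, new)
print(f"average difference = {avg:.2f}, limits of agreement = {low:.2f} to {high:.2f}")
```

With only a handful of samples the limits are wide, but the average difference is still a more honest summary than a regression line fitted through a few points.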
Case #2: Troponin I
Number of Samples 30
Range of Observations 0.016 to 12.1
Correlation Coefficient (r) 0.999
Linear Slope 0.77
[Figure: scatter plot against Comparative Method]
Troponin: Difference plot
[Figure: difference plot against Comparative Method, logarithmic axis from 0.01 to 100]
Troponin I: Excluding extreme samples
Number of Samples 26
Range of Observations 0.016 to 0.1
Correlation Coefficient (r) 0.962
Linear Slope 1.00
Linear Intercept 0.00
[Figure: scatter plot of New Method vs. Comparative Method]
Case #3: HCG
Number of Samples 18
Range of Observations 1.82 to 68,651
[Figure: scatter plot against Comparison method]
Dealing with diluted samples
[Figure: scatter plot against Comparison method]
Case #4
[Figure: scatter plot of New Method vs. Comparison Method]
Case #4 - suggestions
[Figure: scatter plot of New Method vs. Comparison Method]
Case #4: Difference plot
• Is that difference significant?
• Could it be an error?
Can collect more samples in higher range, but if that is not practical this shows overall excellent agreement
[Figure: difference plot against Comparison Method]
Dealing with Outliers, etc.
What’s an outlier?
A result that does not represent the overall relationship / performance due to some sample-specific characteristic
How do you know it’s an outlier?
• EP9-A3:
• Samples with method-to-method differences more than 4x the average difference can be considered outliers.
• Up to 2.5% of data points, identified as outliers, can be deleted without re-assessing the study (1 point in 40)
• Statisticians recommend against removing outliers if no specific cause can be found, and suggest using statistics robust to outliers
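The two rules quoted above can be sketched as a simple screen. This is an illustration, not the EP9-A3 procedure itself: the function name `flag_outliers` is hypothetical, "average difference" is interpreted here as the mean absolute difference (an assumption), and the paired data are made up.

```python
def flag_outliers(comparison, new, factor=4.0, max_one_in=40):
    """Flag pairs whose absolute difference exceeds `factor` times the
    average absolute difference, and report whether the number flagged
    stays within 1 point in `max_one_in` (2.5% when max_one_in is 40)."""
    abs_diffs = [abs(b - a) for a, b in zip(comparison, new)]
    threshold = factor * sum(abs_diffs) / len(abs_diffs)
    flagged = [i for i, d in enumerate(abs_diffs) if d > threshold]
    within_allowance = len(flagged) * max_one_in <= len(abs_diffs)
    return flagged, within_allowance

# Hypothetical study of 40 pairs: a constant difference of 1.0,
# except one sample that differs by 30.0.
comparison = [float(v) for v in range(10, 50)]
new = [v + 1.0 for v in comparison]
new[7] = comparison[7] + 30.0

flagged, within_allowance = flag_outliers(comparison, new)
print("flagged sample indices:", flagged)
print("within the 1-in-40 allowance:", within_allowance)
```

Flagging only identifies candidates; whether to exclude them still depends on finding a specific cause, as the last bullet notes.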
Outliers
Case #7: LD
[Figure: scatter plot of New Method vs. Comparison Method]
True Method to Method Difference
Will need to update reference interval and notify clinicians of expected change to results
[Figure: scatter plot of New Method vs. Comparison Method]
Truth Table and Concordance
Used with:
• qualitative assays (infectious disease)
• large method to method differences to
show clinical equivalence
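A truth table reduces to three agreement figures. A minimal sketch with hypothetical counts; the function name `concordance` is illustrative, and the comparison method is treated as the reference only for the purpose of naming the cells.

```python
def concordance(both_pos, new_pos_only, comp_pos_only, both_neg):
    """Agreement statistics from a 2x2 truth table of new vs. comparison method."""
    total = both_pos + new_pos_only + comp_pos_only + both_neg
    return {
        "overall_agreement": (both_pos + both_neg) / total,
        # Of the samples positive by the comparison method, fraction positive by the new method.
        "positive_agreement": both_pos / (both_pos + comp_pos_only),
        # Of the samples negative by the comparison method, fraction negative by the new method.
        "negative_agreement": both_neg / (both_neg + new_pos_only),
    }

# Hypothetical counts for 100 samples.
stats = concordance(both_pos=45, new_pos_only=2, comp_pos_only=3, both_neg=50)
for name, value in stats.items():
    print(f"{name}: {100 * value:.1f}%")
```

For quantitative assays the same table can be built by classifying each result against a clinical cutoff before counting.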
Concordance and Quantitative Methods
Transferring Reference Intervals
CLSI C28-A3: Defining, Establishing, and Verifying Reference Intervals in the Clinical Laboratory (2008), p28
Verifying a Reference Interval (CLSI C28-A3)
1. Regression analysis
• If regression results are robust (enough samples, good distribution, appropriate regression model, etc.) can use regression equation
• New reference limit = slope × old reference limit + intercept
• If calculated new limits match proposed reference interval: verified
2. Using small study
• Select 20 individuals that match criteria used for proposed reference interval and analyze with method
• If no more than 2 results exceed limits of proposed reference interval: verified
3. Subjective judgment
• On careful review of all data and description of how proposed reference interval was established, lab director may accept new reference interval
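The first two approaches are simple enough to sketch in code. The function names, the regression coefficients, the old reference interval and the 20 results below are all hypothetical, chosen only to show the arithmetic.

```python
def transfer_limit(old_limit, slope, intercept):
    """New reference limit = slope x old reference limit + intercept."""
    return slope * old_limit + intercept

def verify_with_small_study(results, lower, upper, max_outside=2):
    """Verified if no more than `max_outside` of the results fall
    outside the proposed reference interval."""
    outside = sum(1 for r in results if r < lower or r > upper)
    return outside <= max_outside, outside

# Approach 1: hypothetical regression (slope 0.95, intercept 0.10)
# applied to a hypothetical old interval of 0.40 to 4.00.
new_lower = transfer_limit(0.40, slope=0.95, intercept=0.10)
new_upper = transfer_limit(4.00, slope=0.95, intercept=0.10)
print(f"transferred interval: {new_lower:.2f} to {new_upper:.2f}")

# Approach 2: 20 hypothetical results from qualifying individuals.
results = [0.6, 1.1, 1.4, 0.9, 2.2, 3.1, 1.8, 2.7, 0.5, 1.2,
           3.6, 2.0, 1.5, 4.3, 0.8, 2.9, 1.7, 2.4, 1.0, 3.3]
verified, outside = verify_with_small_study(results, new_lower, new_upper)
print(f"{outside} of {len(results)} outside the interval, verified = {verified}")
```

The transfer calculation is only as good as the regression behind it, which is why the first bullet insists on robust regression results.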
Summary
• Method Validation studies are very useful, good lab practice, and generally required by regulation
• Keep the assumptions and limitations of all statistical tools used in mind when reviewing studies
• There is no simple single answer for all data sets; each requires thoughtful selection of the correct tool for the situation