
Measurement Uncertainty in Chemical Analysis

Springer-Verlag Berlin Heidelberg GmbH


Paul De Bièvre · Helmut Günzler (Eds.)

Measurement Uncertainty
in Chemical Analysis

Springer
Prof. Dr. Paul De Bièvre
Duineneind 9
2460 Kasterlee
Belgium

Prof. Dr. Helmut Günzler


Bismarckstr. 4
69469 Weinheim

ISBN 978-3-642-07884-2 ISBN 978-3-662-05173-3 (eBook)


DOI 10.1007/978-3-662-05173-3

Cataloging-in-Publication Data applied for


A catalog record for this book is available from the Library of Congress.
Bibliographic information published by Die Deutsche Bibliothek
Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data is available in the Internet at http://dnb.ddb.de

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in other ways, and storage in data banks. Duplication of this publication or
parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations
are liable for prosecution under German Copyright Law.

Springer-Verlag Berlin Heidelberg 2003


Originally published by Springer-Verlag Berlin Heidelberg New York in 2003
Softcover reprint of the hardcover 1st edition 2003

springeronline.com
The use of general descriptive names, registered names, trademarks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant protective
laws and regulations and therefore free for general use.
Product liability: The publisher cannot guarantee the accuracy of any information about dosage and
application contained in this book. In every individual case the user must check such information by
consulting the relevant literature.

Cover design: Design & Production, Heidelberg


52/3111 Printed on acid-free paper - 5 4 3 2 - SPIN: 11008095
Preface

For over six years, the journal "Accreditation and Quality Assurance" (ACQUAL) has been publishing contributions from the chemical measurement community on various aspects of reliability in chemical measurement, the key mission of ACQUAL.

One of these aspects is uncertainty. Although even its very concept is still controversial, ACQUAL authors are quite proficient in writing about it. Their papers show that uncertainty is interpreted - and used - in many divergent ways.

It seemed a good idea to the publisher, Springer-Verlag, to present some of the most prominent contributions on the topic that have appeared in ACQUAL in the course of the years. The result lies before you.

It should be clear to the reader that this is a collection of papers and not an "integrated" book. We are still far from a homogeneous, internationally accepted common perception of what uncertainty means in chemical measurement. The result is that we use it in different ways. But dramatic changes in the perception and interpretation of "uncertainty" amongst chemists are becoming visible. They are already reflected in the papers selected. Clearly, more time is needed for end-of-20th-century uncertainty concepts to be implemented, and to be accepted by beginning-of-21st-century minds.

The spectrum of what we read and hear in matters of uncertainty in chemical measurement is very broad: it goes from interpreting uncertainty as a mere repeatability of measurement results obtained from replicate measurements of the quantity subject to measurement*, all the way to the full uncertainty of the result of a measurement procedure applied to a quantity intended for measurement.

The latter interpretation has the consequence that the uncertainty of any quantity influencing the result, including the chemical sample preparation prior to the measurement, must be included in the final uncertainty evaluation, thus yielding a combined uncertainty. That, however, almost invariably entails an increase in the size of the uncertainty bar of the measurement result, previously called the "error bar". Most chemists do not yet agree on this. Thus, uncertainty is increased as the result of more work because the whole measurement process must be evaluated for possible uncertainty contributions. All of this makes uncertainty evaluation more elaborate but more realistic, and therefore more responsible. This marks a truly dramatic change.

It would be very helpful if the ongoing revision of the "International Vocabulary of Basic and General Terms in Metrology" (VIM) would define "measurement result" unequivocally in order to promote one meaning of the term in our work as well as in international discussions. A common language is absolutely essential in this matter.

It gives us great pleasure to deliver what we consider a useful compendium from ACQUAL authors and editors to ACQUAL readers and to other colleagues in the art and science of chemical measurement. For those among the readers who consider themselves newcomers in the field, I propose to read the articles by Dube and/or Kadis as an introduction to the current state and to the problems involved.

Prof. Dr. P. De Bièvre
Editor-in-Chief
Accreditation and Quality Assurance
Kasterlee
2002-09-20

* quantity (German: "Größe", French: "grandeur", Dutch: "grootheid") is not used here in the meaning 'amount', but as the generic term for the quantities we measure: concentration, volume, mass, temperature, time, etc.
Contents

Analytical procedure in terms of measurement (quality) assurance ... 1
Metrology in chemistry - a public task ... 8
Chemical Metrology, Chemistry and the Uncertainty of Chemical Measurements ... 13
From total allowable error via metrological traceability to uncertainty of measurement of the unbiased result ... 19
The determination of the uncertainty of reference materials certified by laboratory intercomparison ... 24
Evaluation of uncertainty of reference materials ... 29
Should non-significant bias be included in the uncertainty budget? ... 34
Evaluation of measurement uncertainty for analytical procedures using a linear calibration function ... 39
Measurement uncertainty distributions and uncertainty propagation by the simulation approach ... 44
Evaluation of uncertainty utilising the component by component approach ... 52
Uncertainty - Statistical approach, 1/f noise and chaos ... 59
Calibration uncertainty ... 64
Measurement uncertainty in microbiology cultivation methods ... 70
The use of uncertainty estimates of test results in comparison with acceptance limits ... 74
A model to set measurement quality objectives and to establish measurement uncertainty expectations in analytical chemistry laboratories using ASTM proficiency test data ... 80
Uncertainty calculations in the certification of reference materials. 1. Principles of analysis of variance ... 88
Uncertainty calculations in the certification of reference materials. 2. Homogeneity study ... 94
Uncertainty calculations in the certification of reference materials. 3. Stability study ... 99
Some aspects of the evaluation of measurement uncertainty using reference materials ... 106
Uncertainty - The key topic of metrology in chemistry ... 113
Estimating measurement uncertainty: reconciliation using a cause and effect approach ... 115
Measurement uncertainty and its implications for collaborative study method validation and method performance parameters ... 120
Uncertainty in chemical analysis and validation of the analytical method: acid value determination in oils ... 125
A practical approach for assessment of sampling uncertainty ... 131
Quality Assurance for the analytical data of trace elements in food ... 138
Customer's needs in relation to uncertainty and uncertainty budgets ... 143
Evaluating uncertainty in analytical measurements: pursuit of correctness ... 147
A view of uncertainty at the bench analytical level ... 152
Uncertainty of sampling in chemical analysis ... 158
Appropriate rather than representative sampling, based on acceptable levels of uncertainty ... 163
Experimental sensitivity analysis applied to sample preparation uncertainties: are ruggedness tests enough for measurement uncertainty estimates? ... 170
Relationship between the performance characteristics from an interlaboratory study programme and combined measurement uncertainty: a case study ... 174
The evaluation of measurement uncertainty from method validation studies Part 1: Description of a laboratory protocol ... 180
The evaluation of measurement uncertainty from method validation studies Part 2: The practical application of a laboratory protocol ... 187
Is the estimation of measurement uncertainty a viable alternative to validation? ... 197
Validation of the uncertainty evaluation for the determination of metals in solid samples by atomic spectrometry ... 201
Statistical evaluation of uncertainty for rapid tests with discrete readings - examination of wastes and soils ... 207
Influence of two grinding methods on the uncertainty of determinations of heavy metals in AAS-ETA of plant samples ... 211
Measurement uncertainty and its meaning in legal metrology of environment and public health ... 216
Uncertainty evaluation in proficiency testing: state-of-the-art, challenges, and perspectives ... 223
Uncertainty calculation and implementation of the static volumetric method for the preparation of NO and SO2 standard gas mixtures ... 227
Assessment of uncertainty in calibration of a gas flowmeter ... 237
Measurement uncertainty - a reliable concept in food analysis and for the use of recovery data? ... 242
In- and off-laboratory sources of uncertainty in the use of a serum standard reference material as a means of accuracy control in cholesterol determination ... 248
Assessment of limits of detection and quantitation using calculation of uncertainty in a new method for water determination ... 252
Study of the uncertainty in gravimetric analysis of the Ba ion ... 257
Assessment of permissible ranges for results of pH-metric acid number determinations using uncertainty calculation ... 263
Uncertainty and other metrological parameters of peroxide value determination in vegetable oils ... 267
Uncertainty of nitrogen determination by the Kjeldahl method ... 273
Glossary of analytical terms: Uncertainty ... 280
Contributors

Anton Alink
Nederlands Meetinstituut, P.O. Box 654, 2600 AR Delft, The Netherlands

Hans Andersson
SP Swedish National Testing and Research Institute, P.O. Box 857, 501 15 Borås, Sweden

Thomas Anglov
Department of Metrology, Novo Nordisk A/S, Krogshøjvej 51, 2880 Bagsværd, Denmark

Sabrina Barbizzi
Agenzia Nazionale per la Protezione dell'Ambiente - Unità Interdipartimentale di Metrologia Ambientale, Via Vitaliano Brancati 48, 00144 Rome, Italy

Vicky J. Barwick
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Maria Belli
Agenzia Nazionale per la Protezione dell'Ambiente - Unità Interdipartimentale di Metrologia Ambientale, Via Vitaliano Brancati 48, 00144 Rome, Italy

Ricard Boqué
Department of Analytical and Organic Chemistry, Institute of Advanced Studies, Rovira i Virgili University of Tarragona, Pl. Imperial Tarraco, 1, 43005 Tarragona, Catalonia, Spain

David Bradley
ASTM Headquarters, West Conshohocken, Pa., USA

A.J.M. Broos
Nederlands Meetinstituut, P.O. Box 654, 2600 AR Delft, The Netherlands

Lutz Brüggemann
UFZ Centre for Environmental Research Leipzig-Halle, Department of Analytical Chemistry, Permoserstrasse 15, 04318 Leipzig, Germany

Mirella Buzoianu
National Institute of Metrology, Reference Materials Group, Sos. Vitan-Barzesti No. 11, 75669 Bucharest, Romania

M. Filomena G. F. C. Camões
CECUL, Faculdade de Ciências da Universidade de Lisboa, P-1700 Lisbon, Portugal

Ivo Cancheri
Agrarian Institute, I-38010 San Michele all'Adige, Trento, Italy

John R. Cowles
LGC (Teddington) Ltd., Teddington, England TW11 0LY, UK

Larry F. Crawford
Bethlehem Steel Corporation, Bethlehem, Pa., USA

Ricardo J. N. Bettencourt da Silva
CECUL, Faculdade de Ciências da Universidade de Lisboa, P-1700 Lisbon, Portugal

Simon Daily
LGC (Teddington) Ltd., Teddington, England TW11 0LY, UK

Paolo de Zorzi
Agenzia Nazionale per la Protezione dell'Ambiente (ANPA) - Unità Interdipartimentale di Metrologia Ambientale, Via Vitaliano Brancati 48, 00144 Rome, Italy

Andrea Delusa
Ente Regionale per lo Sviluppo Agricolo del Friuli Venezia-Giulia (ERSA), Via Sabbatini 5, 33050 Pozzuolo del Friuli (UD), Italy

Elias Diaz
Beca de Ampliación de Estudios del F.I.S., Instituto de Salud Carlos III, Madrid, Spain

Günther Dube
Physikalisch-Technische Bundesanstalt, Bundesallee 100, 38116 Braunschweig, Germany

René Dybkær
Copenhagen Hospital Corporation, Department of Standardization in Laboratory Medicine, H:S Kommunehospitalet, Øster Farimagsgade 5, DK-1399 Copenhagen K, Denmark

João Seabra e Barros
Instituto Nacional de Engenharia e Tecnologia Industrial, Estrada do Paço do Lumiar, P-1699 Lisbon Codex, Portugal

Stephen L.R. Ellison
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Osvaldo Failla
Faculty of Agriculture, University of Milan, Milan, Italy

John Fleming
LGC, Queens Road, Teddington, Middlesex TW11 0LY, UK

Dean A. Flinchbaugh
Flinchbaugh Consulting, Bethlehem, Pa., USA

Blandine Fourest
Institut de Physique Nucléaire, 91406 Orsay Cedex, France

Michel Gerboles
European Reference Laboratory of Air Pollution (ERLAP), Commission of the European Communities, Joint Research Centre, I-21020 Ispra, Italy

Rattanjit S. Gill
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Manfred Golze
Federal Institute for Materials Research and Testing (BAM), Unter den Eichen 87, 12205 Berlin, Germany

William A. Hardcastle
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Werner Hässelbarth
Federal Institute for Materials Research and Testing (BAM), 12200 Berlin, Germany

André Henrion
Physikalisch-Technische Bundesanstalt, Bundesallee 100, D-38116 Braunschweig, Germany

Kaj Heydorn
Department of Chemistry, Technical University of Denmark, 2800 Lyngby, Denmark

Zhengzhi Hu
Chinese National Center for Food Quality Supervision & Testing, 32 Xiaoyun Road, Chaoyang District, Beijing 100027, China

Rouvim Kadis
D. I. Mendeleyev Institute for Metrology, 19 Moskovsky pr., 198005 St. Petersburg, Russia

Elena Kardash-Strochkova
The National Physical Laboratory of Israel, Danciger A Building, Givat Ram, Jerusalem 91904, Israel

Jesper Kristiansen
The National Institute of Occupational Health, Lersø Parkallé 105, DK-2100 Copenhagen, Denmark

Daniela Kruh
Rafael Calibration Laboratories, P.O. Box 2250, 31021 Haifa, Israel

Stephan Küppers
Schering AG, In-Process-Control, Müllerstrasse 170-178, 13342 Berlin, Germany

Ilya Kuselman
The National Physical Laboratory of Israel, Danciger A Building, Givat Ram, Jerusalem 91904, Israel

Andrée Lamberty
European Commission, Joint Research Centre, Institute for Reference Materials and Measurements (IRMM), B-2440 Geel, Belgium

Jim Lauda
Department of Analytical Chemistry, CHTF STU, Radlinskeho 9, 812 37 Bratislava, Slovak Republic

Yunqiao Li
National Research Center for Certified Reference Material, No. 18 Bei San Huan Dong Lu, Chaoyang Qu, Beijing, 100013, P. R. China

Thomas Linsinger
European Commission, Joint Research Centre, Institute for Reference Materials and Measurements (IRMM), B-2440 Geel, Belgium

Li Liu
Chinese National Center for Food Quality Supervision & Testing, 32 Xiaoyun Road, Chaoyang District, Beijing 100027, China

John L. Love
Institute of Environmental Sciences and Research, P.O. Box 29 181, Christchurch, New Zealand

Xiaohua Lu
National Research Center for Certified Reference Material, No. 18 Bei San Huan Dong Lu, Chaoyang Qu, Beijing, 100013, P. R. China

Hans Malissa
University of Salzburg, Institute of Chemistry and Biochemistry, Hellbrunnerstrasse 34, 5020 Salzburg, Austria

Alicia Maroto
Department of Analytical and Organic Chemistry, Institute of Advanced Studies, Rovira i Virgili University of Tarragona, Pl. Imperial Tarraco, 1, 43005 Tarragona, Catalonia, Spain

Sandra Menegon
Ente Regionale per lo Sviluppo Agricolo del Friuli Venezia-Giulia (ERSA), Via Sabbatini 5, 33050 Pozzuolo del Friuli (UD), Italy

Frank Moller
Faculty of Agriculture, University of Milan, Milan, Italy

Bernd Neidhart
Philipps-Universität Marburg, Hans-Meerwein-Strasse, 35032 Marburg, Germany

Seppo I. Niemelä
Finnish Environment Institute, P.O. Box 140, 00251 Helsinki, Finland

Riitta Maarit Niemi
Finnish Environment Institute, P.O. Box 140, 00251 Helsinki, Finland

Alberto Noriega-Guerra
European Reference Laboratory of Air Pollution (ERLAP), Commission of the European Communities, Joint Research Centre, I-21020 Ispra, Italy

Viliam Pätoprstý
Slovak Institute of Metrology, Karloveská 63, 842 55 Bratislava, Slovak Republic

Jean Pauwels
European Commission, Joint Research Centre, Institute for Reference Materials and Measurements (IRMM), B-2440 Geel, Belgium

Inge M. Petersen
Department of Metrology, Novo Nordisk A/S, Krogshøjvej 51, DK-2880 Bagsværd, Denmark

Mark J.Q. Rafferty
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Michael H. Ramsey
Centre for Environmental Research, School of Chemistry, Physics and Environmental Science, University of Sussex, Falmer, Brighton BN1 9QJ, UK

Wolfgang Riepe
University of Salzburg, Institute of Chemistry and Biochemistry, Hellbrunnerstrasse 34, 5020 Salzburg, Austria

Angel Rios
Department of Analytical Chemistry, Faculty of Sciences, University of Cordoba, E-14004 Cordoba, Spain

Jordi Riu
Department of Analytical and Organic Chemistry, Institute of Advanced Studies, Rovira i Virgili University of Tarragona, Pl. Imperial Tarraco, 1, 43005 Tarragona, Catalonia, Spain

F. Xavier Rius
Department of Analytical and Organic Chemistry, Institute of Advanced Studies, Rovira i Virgili University of Tarragona, Pl. Imperial Tarraco, 1, 43005 Tarragona, Catalonia, Spain

Matthias Rösslein
EMPA St. Gallen, Department of Chemistry/Metrology, Lerchenfeldstrasse 5, 9014 St. Gallen, Switzerland

Heinz Schimmel
European Commission, Joint Research Centre, Institute for Reference Materials and Measurements (IRMM), B-2440 Geel, Belgium

Petras Serapinas
Plasma Spectroscopy Laboratory, Institute of Theoretical Physics and Astronomy, A. Gostauto 12, 2600 Vilnius, Lithuania

Avinoam Shenhar
The National Physical Laboratory of Israel, Danciger A Building, Givat Ram, Jerusalem 91904, Israel

Felix Sherman
The National Physical Laboratory of Israel, Givat Ram, Jerusalem 91904, Israel

Naijie Shi
National Research Center for Certified Reference Material, No. 18 Bei San Huan Dong Lu, Chaoyang Qu, Beijing, 100013, P. R. China

Gino Stringari
Agrarian Institute, I-38010 San Michele all'Adige, Trento, Italy

Pavol Tarapcik
Department of Analytical Chemistry, CHTF STU, Radlinskeho 9, 812 37 Bratislava, Slovak Republic

Christoph Tausch
Philipps-Universität Marburg, Hans-Meerwein-Strasse, 35032 Marburg, Germany

Michael Thompson
Department of Chemistry, Birkbeck College, Gordon House, 29 Gordon Square, London WC1H 0PP, UK

Guanghui Tian
National Research Center for Certified Reference Material, No. 18 Bei San Huan Dong Lu, Chaoyang Qu, Beijing, 100013, P. R. China

Yakov I. Tur'yan
The National Physical Laboratory of Israel, Danciger A Building, Givat Ram, Jerusalem 91904, Israel

Miguel Valcarcel
Department of Analytical Chemistry, Faculty of Sciences, University of Cordoba, E-14004 Cordoba, Spain

Adriaan van der Veen
Nederlands Meetinstituut, P.O. Box 654, 2600 AR Delft, The Netherlands

Wolfhard Wegscheider
Montanuniversität Leoben, Franz-Josef-Strasse 18, A-8700 Leoben, Austria

Rainer Wennrich
UFZ Centre for Environmental Research Leipzig-Halle, Department of Analytical Chemistry, Permoserstrasse 15, 04318 Leipzig, Germany

Paul Willetts
Food Labelling and Standards Division, Ministry of Agriculture, Fisheries and Food, CSL Food Science Laboratory, Norwich Research Park, Colney, Norwich NR4 7UQ, UK

Alex Williams
19 Hamesmoor Way, Mytchett, Camberley, Surrey, GU16 6JG, UK

Carole Williams
LGC (Teddington) Ltd., Teddington, England TW11 0LY, UK

Roger Wood
Food Labelling and Standards Division, Ministry of Agriculture, Fisheries and Food, CSL Food Science Laboratory, Norwich Research Park, Colney, Norwich NR4 7UQ, UK

Accred Qual Assur (2002) 7:294-298
DOI 10.1007/s00769-002-0484-9
© Springer-Verlag 2002

Rouvim Kadis

Analytical procedure in terms of measurement (quality) assurance

Abstract In the ordinary sense the term "analytical procedure" means a description of what has to be done while performing an analysis without reference to quality of the measurement. A more sound definition of "analytical procedure" can be given in terms of measurement (quality) assurance, in which a specified procedure to be followed is explicitly associated with an established accuracy of the results produced. The logic and consequences of such an approach are discussed, with background definitions and terminology as a starting point. Close attention is paid to the concept of measurement uncertainty as providing a single-number index of accuracy inherent in the procedure. The appropriateness of the uncertainty-based approach to analytical measurement is stressed in view of specific inaccuracy sources such as sampling and matrix effects, and methods for their evaluation are outlined. The question of a clear criterion for analytical procedure validation is also addressed from the standpoint of the quality requirement which measurement results need to meet as an end-product.

Keywords Accuracy · Analytical procedure · Measurement assurance · Measurement uncertainty · Quality assurance · Validation

R. Kadis (✉)
D. I. Mendeleyev Institute for Metrology, 19 Moskovsky pr., 198005 St. Petersburg, Russia
e-mail: rkadis@mail.rcom.ru
Tel.: +7-812-3239644
Fax: +7-812-3279776

Introduction

There are different ways in which quality assurance concepts play a role in analytical chemistry. Most of them, such as stipulating requirements for the competence and acceptance of analytical laboratories, writing quality manuals, performing systems audits, etc., can be viewed as something foreign to common analytical thinking, forced upon analysts by external authorities. Perhaps another possible way is to try to integrate quality assurance aspects into common analytical concepts and (re)define them in such a way as to explicitly include the quality matters required. This may facilitate an effective quality assurance strategy in analytical chemistry. In ordinary usage the term "analytical procedure" hardly needs a referential definition, and maybe for this reason there are few official definitions of the term. The only definition quoted in the references [1] is rather diffuse:

"The analytical procedure refers to the way of performing the analysis. It should describe in detail the steps necessary to perform each analytical test. This may include but is not limited to: the sample, the reference standard and the reagents preparations, use of the apparatus, generation of the calibration curve, use of the formulae for the calculation, etc." [2].

In brief, this simply means a description of all that should be done in order to perform the analysis.

Leaving aside some prolixity in the definition above, the main thing that is lacking is the goal requirement needed in considering quality matters. As is shown in this paper, a sound definition of an analytical procedure can be given in terms of measurement (quality) assurance. The case in point is not simply "the way of performing the analysis" but that which ensures obtaining the results of a specified quality. What this eventually means is a
prescribed procedure to follow in producing results with a known uncertainty.

If we have indeed recognized chemical analysis to be measurement, though possessing its own peculiarities, we can apply the principles and techniques of quality assurance developed in measurement to analytical work. These principles and techniques constitute the field of measurement assurance [3], a system affording a confidence that all the measurements produced in a measurement process maintained in statistical control are good enough for their intended use. "Good enough" implies here nothing more than having an allowable uncertainty. Although measurement assurance was originally developed for instrument calibration, i.e. with emphasis on measurement traceability, it is reasonable to treat it more generally. One can say that a fixed measurement procedure is a means of assigning an uncertainty to a single measurement, and this is the essence of measurement assurance. This also reveals the role a prescribed (analytical) procedure plays in routine analytical measurement. We will focus on different aspects involved in the concept of an analytical procedure defined in terms of measurement assurance, such as terminology, content, evaluation, and validation.

Starting point

Chemical analysis generally consists of several operational stages, beginning with taking a sample representative of the whole mass of the material to be analysed and ending with calculation and reporting of results. In this sequence the measurement proper usually makes a relatively small contribution to the overall variability involved in the entire chemical measurement process (CMP) [4], the largest portion of which is concerned with "non-measurement" operations such as isolation, separation, and so on. Because of this, everything in the chain that affects the chemical measurement result must be predetermined as far as practically possible: the experimental operations, the apparatus and equipment, the materials and reagents, the calibration and data handling. Thus, a "complete analytical procedure, which is specified in every detail, by fixed working directions (order of analysis) and which is used for a particular analytical task" [5] - a concept presented by Kaiser and Specker as far back as in the 1950s [6] - becomes a point of critical importance in obtaining meaningful and reproducible results. We use the term "analytical procedure", or merely "procedure" for short, in the sense outlined above.

"Method", "procedure", or "protocol"

The importance of the correct usage of relevant terms, in particular, the term "procedure" rather than "method", is noteworthy. The terms actually correspond to different levels in the hierarchy of analytical methodology [7] expressed as a sequence from the general to the specific:

technique → method → procedure → protocol

Indeed, the procedure level provides the specific directions necessary to utilize a method, which is in line with the definition of measurement procedure given in the International Vocabulary of Basic and General Terms in Metrology (VIM): "set of operations, described specifically, used in the performance of particular measurements according to a given method" [8].

This nomenclature is however not always adhered to. In many cases, i.e. scientific publications, codes of practice, or official directives, an analytical procedure is virtually implied when an analytical method is spoken about. Commonly used expressions such as "validation of analytical methods" or "performance characteristics of analytical methods" are typical examples of incorrect usage. Such confusion appears even in the definition suggested by Wilson in 1970 for the term "analytical method" [9]. As he then put it, "an analytical method is to be regarded as the set of written instructions completely defining the procedure to be adopted by the analyst in order to obtain the required analytical result". It is actually difficult to make a distinction between the two notions when one of them is defined in terms of the other. On the other hand, there is normally no reason to differentiate the two most specific levels in the hierarchy above, carrying the term "procedure" over to the designated "protocol". The latter was defined [7] as "a set of definitive directions that must be followed, without exceptions, if the analytical results are to be accepted for a given purpose". So, written directions have to be faithfully followed in both cases. In many instances the term "procedure" actually signifies a document in which the procedure is recorded - this is specifically noted in VIM in respect to "measurement procedure". Besides, the term "standard operating procedure" (SOP), especially applied to a procedure intended for repetitive use, is popular in quality assurance terminology.

A clear distinction needs to be drawn between analytical procedure as a generalized concept and its particular realization, i.e. an individual version of the procedure arising in specific circumstances. In practice, an analytical procedure exists as a variety of realizations, differing in terms of specimens, equipment, reagents, environmental conditions, and even the analyst's own routine. Not distinguishing between these concepts can lead to a misinterpretation embodied, for instance, in the viewpoint that with a detailed specification the procedure will change "each time the analyst, the laboratory, the reagents or the apparatus changed" [9]. What will actually change is realizations of the procedure, only provided that all specified variables remain within the specification. Also one cannot but note
Analytical procedure in terms of measurement (quality) assurance 3

that the hierarchy of methodology above concerns, in fact, a level of specificity rather than
the extent to which the entire CMP may be covered. Although sampling is the first (and even
the most critical) step of the process, it is often treated as a separate issue when addressing
analytical methodology. A "complete analytical procedure" may or may not include sampling,
depending on the particular analytical problem to be solved and the scope of the procedure.

An analytical procedure yields the results of established accuracy

In line with Doerffel's statement which refers to analytical science as "a discipline between
chemistry and metrology" [10], one may define analytical service - as a sort of analytical
industry, that is practical activities directed to meeting customer needs - as based upon
concepts of chemistry, metrology, and industrial quality control. The intention of any
analytical methodology in service is to produce data of appropriate quality, i.e. those that
are fit for their intended purpose. The question to answer is what kind of criteria should be
addressed in characterizing fitness-for-purpose.

From the viewpoint of objective of measurement, which is to estimate the true value of the
quantity measured, and its applicability for decision-making, closeness of the result to the
true value, no matter how it is expressed, should be such a criterion. If a measurement serves
any practical need, it is to meet an adequate level of accuracy. It is compliance with an
accuracy requirement that fundamentally defines the suitability of measurement results for a
specific use, and hence corresponding demands are to be made on a measurement process that
produces the results. Next it is assumed that the process is generated by the application of a
measurement procedure and thus, the accuracy requirements should be finally referred to in the
procedure itself. (The "requirements sequence" first implies substantiation of the demands on
accuracy in a particular application area, the problem that needs special consideration in
chemical analysis [11].)

Following this basic pattern, it is reasonable to re-define Kaiser's "complete analytical
procedure", so that the fitness-for-purpose criterion is explicitly allowed for. There must be
an accuracy constraint built in the definition so as to give a determining aspect of the
notion. It is probably unknown to most analytical chemists worldwide that such a definition has
long since been adopted in analytical terminology in Russia. This was formulated in 1975 by the
Scientific Council on Analytical Chemistry of the Russian Academy of Sciences. As defined by
the latter, an analytical procedure is "a detailed description of all the conditions and
operations that ensure established characteristics of trueness and precision" [12]. This
wording, which goes beyond the scope of analytical chemistry, specifically differs from the VIM
definition of measurement procedure quoted above by including an accuracy requirement as a goal
function. It clearly points out that adhering to such a fixed procedure ensures (at least
conceptually) that the results obtained are of a guaranteed degree of accuracy.

Two basic statements underlie the definition above. First, an analytical procedure when
performed as prescribed, with the chemical measurement process operating in a state of control,
has an inherent accuracy to be evaluated. Second, a measure of the accuracy can be transferred
to the results produced, providing a degree of their confidence. In essence, the measure of
accuracy typical of a given procedure is being assigned to future results generated by the
application of the procedure under specified conditions. The justification for both the
propositions was given by Youden in his work on analytical performance [13, 14] where methods
for determining accuracy in laboratories were discussed in detail.

As a prerequisite for practical implementation of the analytical procedure concept it is
assumed that the chemical measurement process remains in a state of statistical control, being
operated within the specifications. To furnish evidence for this and to avoid the reporting of
invalid data the analytical system needs to be tested for continuing performance. A number of
control tests may be used with this aim, for instance, testing the difference between parallel
determinations when they are prescribed to be carried out, duplicating complete analysis of a
current test material, and analysis of a reference material. An important point is that the
control tests are to be performed in such a manner and in such a proportion as a given
measurement requires, and are an integral part of the whole analytical procedure. The control
tests may be more specific in this case and relate to critical points of the measurement
process. As examples, calibration stability control with even one reference sample or
interference control by spiking provide useful means of expeditious control in an analytical
procedure.

A principal point in this scheme is that accuracy characteristics should be estimated before an
analytical procedure is regularly used and should be the characteristics of any future result
obtained by application of the procedure under specified conditions. Measurements of this type
are most commonly performed (by technicians, not measurement scientists) in control engineering
and are sometimes called "technical measurements". It is such measurements that are usually
referred to as routine in chemical analysis. In fact, the problems of evaluation of routine
analyses faced by chemists are treated more generally in the "technical measurements" theory
[15].
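
Two of the control tests named above, the difference between parallel determinations and the
analysis of a reference material, can be sketched in code. This is only a minimal sketch under
stated assumptions, not a criterion given in the text: it assumes that the repeatability
standard deviation `s_r` and the standard uncertainty `u_proc` of the procedure were estimated
beforehand, it uses the conventional repeatability limit r = 2.8 s_r (about the 95 % level for
two results), and it uses a coverage factor k = 2 for the reference-material check. All function
names and numbers are hypothetical.

```python
import math


def duplicates_in_control(x1, x2, s_r):
    """Check two parallel determinations against the repeatability limit.

    Assumption: r = 2.8 * s_r, the usual limit for the absolute difference
    of two results obtained under repeatability conditions.
    """
    r = 2.8 * s_r
    return abs(x1 - x2) <= r


def reference_material_in_control(result, certified, u_proc, u_cert, k=2.0):
    """Check a result on a reference material against its certified value.

    Assumption: the deviation is acceptable if it does not exceed
    k * sqrt(u_proc**2 + u_cert**2), with k = 2 by default.
    """
    limit = k * math.sqrt(u_proc ** 2 + u_cert ** 2)
    return abs(result - certified) <= limit


if __name__ == "__main__":
    # Hypothetical data, same arbitrary concentration unit throughout.
    print(duplicates_in_control(10.12, 10.20, s_r=0.05))
    print(reference_material_in_control(5.30, 5.00, u_proc=0.10, u_cert=0.05))
```

With these hypothetical numbers the duplicates pass (difference 0.08 against a limit of 0.14),
while the reference-material result fails (deviation 0.30 against a limit of about 0.22), which
would indicate that the measurement process is not operating within its specification.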

Uncertainty as an index of accuracy of an analytical procedure

It is generally accepted that accuracy as a qualitative descriptor can be quantified only if
described in terms of precision and trueness corresponding to random and systematic errors,
respectively. Accordingly, the two measures of accuracy, the estimated standard deviation and
the (bounds for) bias, taken separately, have to be generally evaluated and reported [16]. As
the traditional theory of measurement errors holds, the two figures cannot be rigorously
combined in any way to give an overall index of (in)accuracy. Notice that accuracy, as such,
("closeness of the agreement between the result of a measurement and a true value of the
measurand" [8]) by no means involves any measurement error categorization.

On the other hand, it has long been recognized that the confidence to be placed in a
measurement result is conveniently expressed by its uncertainty that was thought, from the
outset, to mean an estimate of the likely limits to the error of measurement. So, uncertainty
has traditionally been treated as "the range of values about the final result within which the
true value of the measured quantity is believed to lie" [17]. However, there was no agreement
on the best method for assessing uncertainty. Consistent with the traditional subdivision, the
"random uncertainty" and the "systematic uncertainty" each arising from corresponding sources
should be kept separate in the evaluation of a measurement, and the question of how to combine
them was an issue of debate for decades.

Now a unified and widely applicable approach to the uncertainty statement set out in ISO Guide
(GUM) [18] is being accepted in many fields of measurement, particularly in analytical
measurements due to the helpful adaptation in the EURACHEM Guide [19]. Some peculiarities of
the new approach can be intimated, specifically, the abandonment of the previous distinction
between random and systematic uncertainties, treating all of them as standard-deviation-like
quantities (after the corrections for known systematic effects have been made), and their
possible estimation by other than statistical means. Fundamental, however, is that any
numerical measurement is not thought of in isolation, but in relation to the process which
generates the measurements. All the factors operative in the process being defined, they
virtually determine the relevant uncertainty sources, so making practicable their
quantification to finally derive the value of total uncertainty. One can say that the
measurement uncertainty methodology fits neatly the starting idea of a procedure specified in
every detail, since the procedure itself defines the context which the uncertainty statement
refers to.

This is true of the component-by-component ("bottom-up") method for evaluating uncertainty that
is directly in line with GUM. Also this is true for the "top-down" approach [20] that provides
a valuable alternative when poorly understood steps are involved in the CMP and a full
mathematical model is lacking. An important point is that the top-down methodology implies a
reconciliation of information available with the required one that is based on a detailed
analysis of the factors which affect the result. For both approaches to work advantageously a
clear specification of the analytical procedure is evidently a necessary condition.

The break with the traditional subdivision of measurement errors has a crucial impact on the
way accuracy may be quantified and expressed. In 1961, Youden wrote [14]: "There is no solution
to the problem of devising a single number to represent the accuracy of a procedure". He was
indeed right in the sense that a strict probability statement cannot be made about a
combination of random and systematic errors. Today, thanks to the present uncertainty concept,
we maintain the other opinion that such a solution does exist. It is measurement uncertainty
that can be regarded as a single-number index of accuracy inherent in the procedure. In doing
so we must not be confused by the fact that the operational definition of measurement
uncertainty that GUM presents does not use the unknown "true value" of the measured quantity
following pragmatic philosophy. The old definitions and, in particular, that cited above are
equally valid and are now considered ideal. Consequently, we can define an analytical procedure
as leading to results with a known uncertainty, as in Fig. 1 in which typical "constituents" to
be specified in an analytical procedure are shown.

Specific inaccuracy sources in an analytical procedure

What has been said in the previous section generally refers to specified measurement procedures
used in many fields of measurement. There are, however, some special reasons, specific to
chemical analysis, that make the uncertainty methodology particularly appealing in analytical
measurements. This is because of specific inaccuracy sources in an analytical procedure which
are difficult to be allowed for otherwise. Two such sources, sampling and matrix effects, will
be mentioned here, with an outline of the methods for their evaluation.

Sampling

Where sampling forms part of the analytical procedure, all operations in producing the
laboratory sample such as sampling proper, sample pre-treatment, carriage, and sub-sampling
require examination in order to be taken into

account as possible sources contributing to the total uncertainty.

It is generally accepted that a reliable estimate of this uncertainty can be obtained
empirically rather than theoretically. Accordingly, an appropriate methodology has been
developed [e.g. 21, 22] aimed at separating the sampling contribution from the total
variability of the measurement results in a specially designed experiment. This is not,
however, the only way of quantifying uncertainty in sampling. Explicit use of scientific
judgement is now equally approved when experimental data are unavailable. An illustrative
example from the EURACHEM Guide (Ref. 19, Example A4) clearly demonstrates the potential of
mathematical modelling of inhomogeneity as an alternative to the sampling assessment
experiment.

It is significant that with the uncertainty methodology both the major analytical properties,
"accuracy" and "representativeness" [23], which quality of analytical data relies on, can be
quantified and properly taken into account to give a single index of accuracy. This index
expresses consistency between the measurement results and the true value that refers to a bulk
sample of the material rather than the test portion analysed.

Matrix effects

The problem of matrix mismatch is always attendant when one analyses an unknown sample "with
the same matrix" using a fixed, previously determined, calibration function. Not uncommonly, an
analytical procedure is developed to cover a range of sample matrices in such a way that an
"overall" calibration function can be used. An error due to matrix mismatch is therefore
inevitable if not necessarily significant. Commonly regarded as systematic for a sample with a
particular matrix, the error becomes random when a population of samples to which the procedure
applies is considered; this in fact constitutes an inherent part of the total variability
associated with the analytical procedure.

Meanwhile, these effects are in no way included in the usual measures of accuracy as they
result from a "method-performance study" in accordance with the accepted protocols [24, 25].
The accuracy experiment defined by ISO 5725 (Ref. 24, Part 1, Section 4) does not presuppose
any variable matrix-dependent contribution, being confined to identical test items. The
underlying statistical model assumes that solely laboratory components of bias and their
distribution must be considered.

It is notable that such kinds of error sources are fairly treated using the concept of
measurement uncertainty which makes no difference between "random" and "systematic". When
simulated samples with known analyte content can be prepared, the effect of the matrix is a
matter of direct investigation in respect of its chemical composition as well as physical
properties that influence the result and may be at different levels for analytical samples and
a calibration standard. It has long since been suggested in examination of matrix effects
[26, 27] that the influence of matrix factors be varied (at least) at two levels corresponding
to their upper and lower limits in accordance with an appropriate experimental design. The
results from such an experiment enable the main effects of the factors and also interaction
effects to be estimated as coefficients in a polynomial regression model, with the variance of
matrix-induced error found by statistical analysis. This variance is simply the (squared)
standard uncertainty we seek for the matrix effects.

In many ways, this approach is similar to ruggedness testing aimed at the identification of
operational (not matrix-related) conditions that are critical to the analytical performance.

"Method validation" in terms of measurement assurance

The presented concept of analytical procedure offers a clear perspective on the problem of
"method validation" which is an issue of great concern in quality matters. Validation is
generally taken to mean a process of demonstration that a methodology is suitable for its
intended application. The question is how should suitability be assessed, based on customer
needs? It is commonly recommended [e.g. 2, 28-30] that a number of characteristics such as
selectivity/specificity, limits of detection and quantitation, precision and bias, linearity
and working ranges be considered as criteria for analytical performance and evaluated in the
course of a validation study. In principle, they need to be compared to some standard; based on
this, judgement is made as to whether the procedure under issue is capable of meeting the
specified analytical requirements, that is to say, whether a "method is fit-for-purpose" [28].
However, from the perspective of end users of analytical results, it is important only that the
data be of the required quality and thus appropriate for their intended purpose. In other
words, the matter of primary concern is quality of analytical results as an end-product. In
this respect, a procedure will be deemed suitable when the data produced are fit-for-purpose.

It follows that the common criteria of validation should be made more specific in terms of
measurement assurance. It is (the index of) accuracy that requires overriding consideration
among the characteristics of analytical performance if quality of the results is primarily kept
in mind. Other performance characteristics are desirable to ensure that a methodology is
well-established and fully understood, but validation of an analytical procedure on those
criteria seems impractical also in view of the lack of corresponding requirements as is
commonly the case.

Fig. 1 Typical "constituents" to be specified within analytical procedure, which ensures obtaining the
results with a known uncertainty

(Strictly speaking, there is no validation unless a particular requirement has been set.)

We have every reason to consider the estimation of measurement uncertainty in an analytical
procedure followed by the judgement of compliance with a target uncertainty value as a kind of
validation. This is in full agreement with ISO 17025 that points to several ways of validation,
among them "systematic assessment of the factors influencing the result" and "assessment of the
uncertainty of the results ..." [31]. In line with this is also a statistical modelling
approach to the validation process that has recently been developed and exemplified as applied
to in-house [32] and interlaboratory [33] validation studies.

A concrete example of such validation is worthy of notice. Certification (attestation) of
analytical procedures used in regulated fields such as environmental control and safety is
operative in the Russian state measurement assurance system as a process of establishing
metrological properties and confirming their compliance with relevant requirements. (By
metrological properties we mean herein the assigned measurement error characteristics, i.e.
measurement uncertainty.) This is introduced by the Russian Federation state standard
GOST R 8.563 [34] which also covers procedures for quantitative chemical analysis. This
certification is, in fact, a legal metrology measure similar, to some extent, to pattern
evaluation and approval of measuring instruments. Some scepticism concerning the efficiency of
legal metrology practice in ensuring the quality of analytical measurements may be in order.
Nevertheless, the conceptual (measurement assurance) basis of this approach to validation
deserves attention beyond doubt.

Conclusions

This debate allows the following propositions to be made:
1. The term "analytical procedure" commonly used without reference to the quality of data is
   best defined in terms of measurement (quality) assurance to explicitly include quality
   matters. This means a specified procedure which ensures results with an established
   accuracy.
2. The measurement uncertainty methodology neatly fits the idea of a specified measurement
   procedure and furthermore provides a tool for covering specific inaccuracy sources peculiar
   to analytical measurement. Uncertainty can be regarded as a single-number index of accuracy
   of an analytical procedure.
3. When an analytical procedure is so defined, uncertainty becomes the performance parameter
   that needs overriding consideration over and above all the others assessed during validation
   studies. This kind of validation gives a direct answer to the question whether the data
   produced are of required quality and thus appropriate for their intended use.
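
The validation criterion discussed in this article, estimating the uncertainty of the procedure
and judging its compliance with a target value, can be illustrated with a short sketch. It is
not taken from the article: it assumes that all components are standard uncertainties in the
same unit, that they are uncorrelated and enter the result with unit sensitivity (so that they
combine as a root sum of squares, as in the GUM for uncorrelated inputs), and that the expanded
uncertainty uses a coverage factor k = 2. The component names follow the sources mentioned in
the text (sampling, matrix effects); the function names, the numerical values and the target
are hypothetical.

```python
import math


def combined_standard_uncertainty(components):
    """Root-sum-of-squares combination of standard uncertainties.

    Assumption: components are uncorrelated and have unit sensitivity.
    """
    return math.sqrt(sum(u ** 2 for u in components.values()))


def fit_for_purpose(components, target_expanded_uncertainty, k=2.0):
    """Return the expanded uncertainty U = k * u_c and whether U meets the target."""
    expanded = k * combined_standard_uncertainty(components)
    return expanded, expanded <= target_expanded_uncertainty


if __name__ == "__main__":
    # Hypothetical uncertainty budget, arbitrary concentration unit.
    budget = {
        "precision": 0.03,
        "calibration": 0.02,
        "sampling": 0.04,
        "matrix_effects": 0.02,
    }
    expanded, ok = fit_for_purpose(budget, target_expanded_uncertainty=0.15)
    print(f"U = {expanded:.3f}, meets target: {ok}")
```

In this hypothetical budget the sampling term dominates, which illustrates why a procedure that
includes sampling cannot be judged on laboratory precision alone.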

References
1. Holcombe D (1999) Accred Qual Assur 4: 525-530
2. International Conference on Harmonization of Technical Requirements for Registration of
   Pharmaceuticals for Human Use (1994) Text on validation of analytical procedures. ICH
   Quality topic Q2A: Definitions and terminology (http://www.ifpma.org/ich5q_html)
3. Cameron JM (1976) J Qual Technol 8: 53-55
4. Currie LA (1978) Sources of error and the approach to accuracy in analytical chemistry. In:
   Kolthoff IM, Elving PE (eds) Treatise on analytical chemistry. Part I. Theory and practice,
   vol. 1, 2nd edn. Wiley, New York, pp 95-242
5. Kaiser H (1978) Spectrochim Acta 33B: 551-576
6. Kaiser H, Specker H (1956) Fresenius Z Anal Chem 149: 46-66
7. Taylor JK (1983) Anal Chem 55: 600A-604A, 608A
8. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) International vocabulary of basic and
   general terms in metrology, 2nd edn. International Organization for Standardization (ISO),
   Geneva
9. Wilson AL (1970) Talanta 17: 21-29

10. Doerffel K (1998) Fresenius J Anal Chem 361: 393-394
11. Shaevich AB (1989) Fresenius Z Anal Chem 335: 9-14
12. Terms, definitions, and symbols for metrological characteristics in analysis of substance
    (1975) Zh Anal Chim 30: 2058-2063 (in Russian)
13. Youden WJ (1960) Anal Chem 32(13): 23A-37A
14. Youden WJ (1961) Mat Res Stand 1: 268-271
15. Zemelman MA (1991) Metrological foundations of technical measurements. Izdatelstvo
    standartov, Moscow (in Russian)
16. Eisenhart C (1963) J Res Nat Bur Stand 67C: 161-187
17. Campion PJ, Burns JE, Williams A (1973) A code of practice for the detailed statement of
    accuracy. National Physical Laboratory, Her Majesty's Stationery Office, London
18. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in
    measurement. ISO, Geneva
19. EURACHEM/CITAC Guide (2000) Quantifying uncertainty in analytical measurement, 2nd edn
    (http://www.eurachem.bam.de/guides/quam2.pdf)
20. Ellison SLR, Barwick VJ (1998) Analyst 123: 1387-1392
21. Ramsey MH (1998) J Anal Atom Spectr 13: 97-104
22. van der Veen AMH, Alink A (1998) Accred Qual Assur 3: 20-26
23. Valcarcel M, Rios A (1993) Anal Chem 65: 781A-787A
24. ISO 5725 (1994) Accuracy (trueness and precision) of measurement methods and results.
    Parts 1-6. International Organization for Standardization, Geneva
25. IUPAC (1995) Protocol for the design, conduct and interpretation of method performance
    studies. Pure Appl Chem 67: 331-343
26. Makulov NA (1976) Zavod Lab 42: 1457-1464 (in Russian)
27. Parczewski A, Rokosz A (1978) Chem Analityczna 23: 225-230
28. EURACHEM (1998) The fitness for purpose of analytical methods. A laboratory guide to method
    validation and related topics. LGC, Teddington
29. Wegscheider W (1996) Validation of analytical methods. In: Günzler H (ed.) Accreditation
    and quality assurance in analytical chemistry. Springer, Berlin, etc., pp 135-158
30. Bruce P, Minkkinen P, Riekkola M-L (1998) Mikrochim Acta 128: 93-106
31. ISO/IEC 17025 (1999) General requirements for the competence of testing and calibration
    laboratories. International Organization for Standardization, Geneva
32. Jülicher B, Gowik P, Uhlig S (1999) Analyst 124: 537-545
33. van der Voet H, van Rhijn JA, van de Wiel HJ (1999) Anal Chim Acta 391: 159-171
34. GOST R 8.563-96 State system for ensuring the uniformity of measurements. Procedures of
    measurements. Gosstandart of Russia, Moscow (in Russian)
Accred Qual Assur (2001) 6:3-7
© Springer-Verlag 2001

Gunther Dube Metrology in chemistry - a public task

Abstract The importance of analytical chemistry is increasing in many public fields, and the
demand for reliable measurement results is growing accordingly. A measurement result will be
reliable only if its uncertainty has been quantified. This can be achieved only by tracing the
result back to a standard realizing the unit in which the measurement result is expressed. The
National Metrology Institutes (NMIs) can contribute to the reliability of the measurement
results by developing measuring methods, and by providing reference materials and standard
measuring devices. In fields in which the comparability of measurement results is of particular
importance, they establish traceability structures. Responding to the globalization of trade
and industry the International Committee for Weights and Measures (CIPM) agreed on an
arrangement on the mutual recognition of calibration certificates (CIPM MRA) issued by the
NMIs.

Keywords Reliability · Uncertainty · Traceability · National Metrology Institutes · CIPM MRA

Presented at Analytica Conference 2000, 11-14 April 2000, Munich, Germany

G. Dube
Physikalisch-Technische Bundesanstalt, Bundesallee 100, 38116 Braunschweig, Germany
e-mail: gunther.dube@ptb.de
Tel.: +49-531-5923210
Fax: +49-531-5923015

Introduction

In many fields in which quantitative analyses are called for, analytical chemistry is
confronted with new challenges. Particularly in such spheres, on which countries spend a
considerable part of their revenue like health care, environmental protection and nutrition,
reliable measurement results are of great importance. "To judge analytical methods and results
critically, this belongs at all times to the analyst's tasks" [1]. Therefore, analytical
chemists also use metrological ways of thinking and terms like traceability and uncertainty of
measurement results. On the other hand, metrologists understand their responsibility in
determining amount-of-substance measurements, and try to reach uniformity of measurement to
achieve comparability of the results. In doing so, they make use of the worldwide metrological
infrastructure. The demand of the public for reliability of the measurement results in
analytical chemistry can be best achieved by cooperation between the two parties.

The need for reliability of measurement results

The need for reliability of measurement results is demonstrated by the following examples. In
Germany the expenditure for health care in 1994 amounted to DM 344.6 billion [2]. This was 10%
of the gross national product. One-third of this went into medical services: 10% of these was
spent on laboratory services, i.e. for the most part on measurements, and it is well known that
30% went into repeating measurements [3]. Repeat measurements are carried out only if the
results do not seem reliable. In this particular case, the financial loss due to repetitions
amounted to DM 3 billion.

The second example comes from the area of natural gas. In Germany the import of natural gas has
increased over the last 30 years and in 1998 amounted to

nearly 3 million terajoule (TJ) [4].

Fig. 1 Import of natural gas, Germany (Ref.: Bundesministerium für Wirtschaft und Technologie,
BMWi, http://www.bmwi.de). Plot of the imported energy in TJ (scale 0 to 3e+6) against the year
(1965 to 1995)

The data shown in Fig. 1 are given in TJ, since the natural gas price is calculated from the
energy consumed. This is the product of calorific value and volume. For the determination of
the energy, the calorific value H of the natural gas must be known. The well-known method for
the calorimetric determination of calorific values is increasingly replaced by a new method.
The main feature of this method is the determination of the mole fractions of the gas
components using gas chromatography. The mole fractions $x_j$ are multiplied by the molar
calorific values $H_j^{\circ}(t_1)$ of the gas components. These products are summed and
multiplied by $p_2/(R\,T_2)$ according to Eq. (1).

$$H^{\circ}[t_1, V(t_2, p_2)] = \sum_{j=1}^{N} x_j \, H_j^{\circ}(t_1) \cdot \frac{p_2}{R\,T_2} \qquad (1)$$

where $t_1$ is the temperature of combustion, $V$ is the volume, $t_2$ and $p_2$ are the
temperature and pressure at measurement, $T_2$ is the corresponding absolute temperature,
$H_j^{\circ}$ is the molar calorific value, $x_j$ is the mole fraction, and $R$ is the gas
constant.

The effects that a wrong gas chromatographic result has on the natural gas price should be
considered. If it is assumed that one component accounting for 10% of the natural gas was
determined incorrectly by 1%, the calculated energy is affected by an error of 0.1%. If the
price of the natural gas amounts to DM 20 billion per year, this error of 0.1% leads to a price
difference of DM 20 million.

Uncertainty and traceability of measurement results

It follows from these examples that the reliability of the measurement results is of great
public interest. Measurement results are reliable only if their uncertainty is known and
quantified. Uncertainty is a metrological term which is defined as follows: Uncertainty:
parameter, associated with the result of a measurement, that characterizes the dispersion of
the values that could reasonably be attributed to the measurand [5].

The uncertainty can be stated only if the traceability of the measurement result to a system of
units is guaranteed. Traceability is defined as follows [5]: Traceability: property of a result
of a measurement or the value of a standard whereby it can be related to stated references,
usually national or international standards, through an unbroken chain of comparisons, all
having stated uncertainties.

Fig. 2 Traceability scheme (levels and responsibility: SI, CGPM; primary method, CIPM and NMI;
secondary method, accredited calibration laboratories; routine method with working standard,
routine laboratories)

Such a traceability system is demonstrated in Fig. 2. The International System of Units (SI) is
at the top of the system. Its units are realized by standards. A measurement is a process, in
the course of which the measurand is compared to a standard. For practical measurements,
usually a working standard not a primary standard is used. To state the uncertainty of the
measurement result, the uncertainty of the value assigned to the working standard must be
known. It results from the uncertainty of the comparison measurement of the working standard
with the reference standard. The uncertainty of the value assigned to the reference standard
results from the uncertainty of the comparison measurement of the reference standard with the
primary standard. This chain of comparison measurements is exactly what the definition of the
term "traceability" means. If the traceability of a measurement result is

guaranteed, its uncertainty can be stated. From these considerations it follows that metrology
can provide the tools necessary to get reliable measurement results.

In analytical chemistry, traceability of measurement results to SI units is not always possible
and the traceability hierarchy ends below the level of the SI units. For example, in the case
of standard measuring devices, or reference materials, the values are fixed by mutual agreement
or in the case of methods are generally agreed upon. In these cases, the comparability of
measurement results is limited.

Tasks of the National Metrology Institutes (NMIs)

The tasks of the NMIs are:
- Realisation, maintenance and dissemination of the units
- Development and application of primary measurement methods
- Establishment of traceability structures
- Guaranteeing the equivalence of measurement standards.

In Germany, the Physikalisch-Technische Bundesanstalt (PTB) is responsible for the national
standards. In chemistry, reference materials and standard measuring devices are the national
standards. To cover the huge demand for reference materials, the PTB cooperates with other
competent national institutions on the basis of agreements, first of all, with the Federal
Institute of Material Research and Testing (BAM) and also with companies which produce and
distribute reference materials. PTB also cooperates with other NMIs within the scope of joint
projects aiming at the development of new reference materials not yet available on the market.

To trace back the values assigned to reference materials to the SI, the NMIs develop and apply
"primary methods". It is the main feature of these methods which are sometimes called "absolute
methods" that they do not make reference to standards of the same unit in which the result is
expressed. Examples are coulometry, gravimetry and isotope dilution mass spectrometry [6, 7].

Fig. 3 Traceability in clinical chemistry (levels and responsibility: SI, CGPM; primary method,
CIPM and NMI; secondary method, accredited reference laboratories, producers and calibration
laboratories, control material; routine method, clinical laboratories, patient sample;
uncertainty values shown: 5 x 10E-3, 1 x 10E-2, 5 x 10E-2)

material. The uncertainty of the pH value on this level amounts to 0.002. The primary buffer
solutions are used by accredited calibration laboratories as reference solutions for measuring
the pH values of secondary buffer solutions in an electrochemical comparison cell. The
uncertainty of the pH of these solutions is higher than the uncertainty of the primary buffer
solution and amounts to 0.003. The secondary buffer solutions are used in the routine
laboratories for calibrating commercial pH meters. The pH values measured by commercial pH
meters show an uncertainty of 0.01. This traceability chain guarantees that the uncertainty of
the pH values measured in the routine laboratories is correctly stated. They thus are reliable.
Other NMIs also keep such pH measuring devices, and it was possible to demonstrate by
comparison measurements that the standard measuring devices of different countries can provide
measurement results which agree very well [9].

The establishment and the support of traceability structures is one of the most essential tasks
of the NMIs. Traceability is of particular importance also in clinical chemistry. It must be
ensured that the results obtained in the measurement of patient samples are reliable.
Therefore, in 1988, the Federal Medical Association (Bundesärztekammer) in Germany issued
"Guidelines for Quality Assurance in Medical Laboratories" [10]. Although at present under
revision, they
In cases, where traceability to the SI is not possible prescribe the use of quality control samples to check
or can be attained only with a relatively high uncertain- the measurement results obtained in medical laborato-
ty, standard measuring devices form the highest refer- ries. For control materials used in internal quality con-
ence points of the traceability chain. At PTB for exam- trol the producer and for control materials used in ex-
ple, a standard measuring device was set up to provide ternal quality control a reference laboratory must make
traceability of pH measurements [8]. It consists of a sys- sure that an uncertainty is stated for the assigned value.
tem of reference electrodes, which form a electrochem- This is possible only if the control materials are linked
ical cell. The electrolyte of the measuring cell is the pH to primary reference materials and to SI units. The link
buffer solution to be measured. From the cell voltage to the SI can be provided by reference laboratories as
the pH value of the buffer solution is determined by a well as by the laboratories of control material produc-
well-defined measuring procedure. By the measure- ers, provided they have been accredited as calibration
ment this solution gains the rank of a primary reference laboratories. In Germany this accreditation can be pro-
Metrology in chemistry - a public task II

vided by PTB and is carried out by the German Cali- Table 1 CCOM international comparisons, clinical diagnostic
bration Service (DKD). markers. NIST: National Institute of Standards and Technology,
USA; LGC: Laboratory of the Government Chemist, UK;
IRMM: Institute for Reference Materials and Measurements,
Belgium; SP: Sveriges Provnings- och Forskningsinstitut, Sweden
The International Committee for Weights and Measures
Mutual Recognition Arrangement (CIPM MRA) Reference Pilot lab Date
No.
The NMls are obliged by national law to realize, main- Cholesterol in serum CCOM-P6 NIST 199H
tain and disseminate the national standards. However, CCOM-K6 NIST 1999
they also take care of the uniformity of measurement Glucose in serum CCOM-PH NIST 1999
worldwide. The first activity for this was the signing of Creatinine in serum CCOM-P9 NIST 1999
Creatinine in serum CCOM-KI2 N 1ST 200!)
the Meter Convention in 1875. On the basis of this trea- Ca in serum CCOM-PI4 IRMM/SP 20m
ty, the General Conference of Weights and Measures Anabolic steroids in urine In preparation -
(CGPM) and CIPM work today. However, the NMls Hormones in serum In preparation -
are confronted today with new challenges [11]. The cal-
ibration certificates issued by them are generally valid
only in the country of issue and are not accepted world-
wide. This turned out to be a barrier to the internation- - Health
al trade. So, different activities have been launched by - Food
various institutions to overcome these obstacles. The Environment
contribution of the NMIs to these efforts is the "Mutual - Advanced materials
Recognition Arrangement" (CIPM MRA), which was - Commodities
signed by the presidents of 38 NMIs in October 1999 - Forensic matters
during the twenty-first session of the CGPM [12]. Its - Pharmaceuticals
objectives are: - Biotechnology
- To establish the degree of equivalence of national In the field of amount-of-substance measurements,
measurement standards maintained by NMls 70 comparisons have been planned. Some of them have
- To provide for the mutual recognition of calibration already been started. Up to now, the measurement pro-
and measurement certificates issued by the NMIs gramme for clinical chemistry has carried out the com-
- Thereby to provide governments and other parties parisons given in (Table 1) [13].
with a safe technical foundation for wider agree- The results of the key comparisons - including the
ments related to international trade, commerce and uncertainty statement - will be stored in an Internet-
regulatory affairs. accessible database. This will enable companies, ac-
The technical basis of the CIPM MRA is a system of crediting bodies, and institutions to evaluate the equi-
key intercomparisons. Furthermore, the NMls have to valence of the measurement results performed by the
prove that they work in accordance with a quality sys- NMIs. The database will make it easier for businesses
tem. The key comparisons are international compari- and organizations relying on these services to prove
son measurements. The Consultative Committee for compliance with the measurement-related require-
Amount of Substance (CCQM) of CIPM is responsible ments of regulations and standards. The database will
for the comparisons in the field of chemistry. It selects be an integral part of the infrastructure necessary to ex-
the substance systems, organizes the realization of the pand free trade and to eliminate technical barriers to
measurements and the evaluation of the measurement export.
results. The substance systems are chosen from areas of
public interest in which traceability is necessary. Prior- Acknowledgements Stimulating discussions with Mrs. P. Spitzer
ity areas are: and Dr. P. Ulbig are thankfully acknowledged.

References

1. Doerffel K (1987) Preface to: Statistik in der analytischen Chemie, 4th edn. VCH, Weinheim, Germany
2. Bundesministerium für Bildung und Forschung, Bundesministerium für Gesundheit, Statistisches Bundesamt (2000) Die Gesundheitsberichterstattung des Bundes; http://www.gbe-bund.de
3. Semerjian HG (1998) Metrology: Impact on national economy and international trade. In: Seiler E (ed) The role of metrology in economic and social development. PTB-Texte, Band 9, Braunschweig, pp 99-133
4. Bundesministerium für Wirtschaft und Technologie (1999) Entwicklung der Einfuhr Naturgas in die Bundesrepublik; http://www.bmwi.de
5. Deutsches Institut für Normung (1994) Internationales Wörterbuch der Metrologie, 2nd edn. Beuth, Berlin Wien Zürich
6. Quinn T (1997) Metrologia 34:61-65
7. Richter W (1997) Accred Qual Assur 2:354-359
8. Spitzer P, Eberhardt E, Schmidt I, Sudmeier U (1996) Fresenius J Anal Chem 356:178-181
9. Spitzer P (1997) Metrologia 34:375-376
10. Bundesärztekammer (1988) Dt Ärzteblatt 85:699-706; (1994) 91:211-212
11. Richter W (1999) Fresenius J Anal Chem 365:569-573
12. Comité international des poids et mesures (CIPM) (1999) Mutual recognition of national measurement standards and of calibration and measurement certificates issued by national metrology institutes. Bureau international des poids et mesures (BIPM), Sèvres, France
13. BIPM (1999) Comité consultatif pour la quantité de matière (CCQM). Report of the 5th Meeting (February 1999). Bureau international des poids et mesures (BIPM), Sèvres, France
DOI 10.1007/s00769-001-0438-7

John L. Love

Chemical metrology, chemistry and the uncertainty of chemical measurements

Abstract Chemical results normally involve traceability to two reference points, the specific chemical entity and the quantity of this entity. Results must also be traceable back to the original sample. As a consequence, any useful estimation of uncertainty in results must include components arising from any lack of specificity of the method, the variation between repeats of the measurement and the relationship of the result to the original sample. Chemical metrology does not yet incorporate uncertainty arising from any lack of specificity from the method selected or the traceability of the result to the original sample. These sources of uncertainty may however have much more impact on the reliability of the result than will any uncertainty associated with the repeatability of the measurement. Uncertainty associated with sampling may amount to 50-1000% of the reported result. Chemical metrology must be expanded to include estimations of uncertainty associated with lack of specificity and sampling.

Keywords Metrology · Sampling · Chemical · Specificity · Uncertainty

J.L. Love
Institute of Environmental Sciences and Research,
P.O. Box 29181
Christchurch, New Zealand
e-mail: john.love@esr.cri.nz
Tel.: +64-3-3510017
Fax: +64-3-351 00 10


Introduction

Dependable measurement is critical to both science and trade. Without a common understanding of the meaning of results of measurements, science would not function and systems of trade would become inefficient. Trade requires reliable measurements for quantity, quality and safety of goods and without these, delivery slows and disagreements as to their compliance with specifications proliferate. Reliable measurements in science and trade depend on having defined standards for analytes, demonstrable traceability of results to the defined standards and an understanding of the uncertainties of these processes.

International trade agreements under the World Trade Organization are now emphasizing the current, less than satisfactory, state of chemical measurements. In trading relationships, both the buyer and seller usually repeat tests and often, regulatory agencies require their own independent check. This replication of effort is obviously inefficient [1] but reflects the current inability of chemical measurement to produce consistent results over distance and time.

The situation with respect to physical measurements is in complete contrast, with results from different sources generally accepted as being comparable. Metrology, the science of measurement, has been developed from physical measurement and emphasizes results traceable to defined reference points, normally the International System of Units (SI), and fully analysed uncertainty budgets based on the processes set out in the Guide to the Expression of Uncertainty in Measurement (GUM) [2]. This process involves identifying each component of the measurement that contributes to uncertainty, estimating the contribution of each component of uncertainty, then combining these estimations to calculate the total uncertainty. Much of the improvement in consistency of physical measurements has been achieved by use of the uncertainty budget to better define and control the test environment.
In the last ten years much effort has been applied to introduce these same concepts of physical measurement into chemical measurement. For example:

- The Bureau International des Poids et Mesures (BIPM) has put in place a consultative committee, the Consultative Committee for Amount of Substance (CCQM) [3], to strengthen the relationship of chemical measurements to its SI unit, the mole.
- EURACHEM and CITAC [4] have developed a guide for quantifying uncertainty in chemical analysis based on metrological principles and the GUM [2].
- ISO/IEC 17025:1999 [5] is replacing ISO Guide 25 [6] as the standard against which laboratories are accredited and supports these moves by having an increased emphasis on this metrological approach.

Incorporating traceability to the mole and uncertainty budgets into chemical analysis is more complex than is their application to physical measurement. Normally a chemical measurement depends on a combination of physical measurements, chemical separation of the compounds of interest and the selection of the test portion from the bulk material. An understanding of the chemistry involved in these separation processes is vital before reliable results can be achieved and chemical analysts have tended to concentrate on this area of analysis. It is however a part of the measurement that is tending to be ignored in moves to align chemical measurement with the traditional physical metrological process. The sampling process, both in the laboratory and outside in the field, also contributes to the uncertainty of the measurement but has tended to be ignored by analysts. Understanding of the uncertainty of chemical measurements will not be achieved without an understanding of the whole process.

Discussion

Chemical measurement has a fundamental difference from physical measurement in that it does not take place under controlled and defined conditions. Almost always, the primary objective of a chemical measurement is to determine the amount of components of interest, not the total composition of the sample. Total composition will almost always remain unknown and therefore the total environment under which the measurement is taking place cannot be defined or controlled. Unknowns will always increase the uncertainty associated with any measurement.

Three components can be considered as contributing to uncertainty in chemical measurement. These are sources of uncertainty associated with the sampling process, the underlying chemistry of the chosen method, including its selectivity, and the more readily quantifiable aspects of uncertainty associated with the repeatability of the measurement. The cause and effect diagram in Fig. 1 represents this situation. Each of these components is important. Get one wrong and the result is unlikely to be 'fit for purpose'.

Fig. 1 Cause and effect diagram showing sources of uncertainty associated with chemical measurements

For many years, analytical chemists have used reference methods as a means of limiting the numbers of unknowns by removing those associated with traceability of the measurement to the defined chemical entity. Although reference methods remove uncertainty associated with traceability of the result to the named chemical entity and thereby eliminate most chemical unknowns as an issue, they always have the disadvantage that they redefine the analyte in terms of a method rather than as a chemical species. Amongst the best reference methods are those published by the Association of Official Analytical Chemists International (AOAC) [7]. These methods will have been validated within a number of laboratories from a collaborative study and will have associated estimations of uncertainty based on repeatability and reproducibility results from the study.

In reference methods, uncertainty in the result will directly relate to the measured repeatability. Defining the analyte as the method result eliminates any uncertainty related to the underlying chemistry. It may also define the procedure for taking the test portion in the laboratory and thereby include some of the uncertainty associated with sampling.

Modern methods of analytical chemistry are less conducive than traditional methods to the reference method approach. Instrument and equipment combinations are much more variable between laboratories and change over time as manufacturers add technical improvements. Reference methods also cause problems between countries unless they have international acceptance and they limit the adoption of new analytical methodology and equipment.

As a consequence of the problems associated with reference methods, there is now more emphasis on an absolute measure of analytes of interest where these are distinct chemical entities. For instance, the Codex Committee on Sampling and Analysis is presently debating whether analytical requirements for discrete chemical components in foods can be defined by method performance criteria or whether a prescribed method is also required in dispute situations [8]. Reference methods must still remain for those analytes not readily definable as a distinct chemical entity [8].

Moves away from reference methods towards performance criteria will make problems with the underlying chemistry of the selected method or in taking the test portion more significant. A problem in either of these areas may have the consequence of making the result meaningless. However, much of the recent work on the analyses of uncertainty in chemical measurement has neglected these issues and instead has tended to concentrate on alternatives, based on GUM [2], to repeatability estimated from collaborative trials, examples being the work of EURACHEM [4] and the survey of King [9].

Uncertainty arising from the repeatability of chemical measurement is a characteristic of the method. Its calculation has similarities to the calculation of uncertainty in physical measurement and many components are identical to those involved in physical measurement, components such as uncertainty in mass and volume. However, others such as purity of reference materials and recovery are more specific to chemistry, but once determined, can still be incorporated into the uncertainty budget using the standard techniques developed for physical metrology and described in the GUM [2]. Once determined, this estimation of uncertainty can be applied to future tests using the same method in the same laboratory with the same equipment, reagents and staff.

Uncertainty due to the underlying chemistry and sampling is much more difficult to estimate. Both may vary with changes of sample. Realistic estimations of uncertainty as to the specificity of the method when testing samples from one source may not be applicable to samples from other sources. However, laboratories must be able to incorporate uncertainty arising from any lack of specificity into their estimation of total uncertainty if they are to be able to judge if results will be "fit for purpose" and give reliable information on uncertainty as required by ISO 17025 [5].

This same issue, that uncertainty in analytical chemistry involves more than the uncertainty in the reported result, has been highlighted previously in a small number of papers including those of Wells and Smith [10] and Alexandrov [11].

Uncertainty in the underlying chemistry

The underlying chemistry involved in the test method is obviously important. A simple cause and effect diagram for the underlying chemistry is shown in Fig. 2.

Fig. 2 Cause and effect diagram showing the issues around understanding the chemistry

Almost without exception, modern methods of trace analysis require separation of the analyte of interest from the sample matrix, then estimation of the amount of analyte present using some unique characteristic, or combination of characteristics, of the molecules involved. Examples are systems involving chromatography, with or without mass spectrometric detection, atomic absorption spectrometry and emission spectrometry. These methods do not work unless the measured characteristic, or combination of characteristics, is unique to the compound of interest or the impact of known interferences can be removed by calculation. The analyst should have used enough properties of the compound to make it unlikely that responses from interfering substances could be incorporated into the reported result.

Interference has been a concern of analysts using chromatography for a long time. All positive results from analyses based on chromatographic methods must have an appropriate level of confirmation if they are to be specific to the analyte of interest and results are to be reliable. Methods of confirmation have included:

- Independent analysis of the sample by a different method.
- Re-analysis of the sample on a column of different polarity known to separate compounds in a significantly different order.
- Re-analysis of the sample using a different wavelength on the detector. The second wavelength must be chosen so that it will give a good indication of the shape of the absorption curve.
- Re-analysis of the sample using a detector that operates using a different principle. An example could be an FID and nitrogen specific detector but the different sensitivities of detectors will limit this option.
- A statement from the client of the expected level of the analyte in the sample. Normally, it could be considered that the test has an appropriate degree of specificity if the result is similar to this expected value.
- Use of a detector such as a mass spectrometer that gives additional information on the nature of the compound detected.
- Recording of the spectrum of the detected compound using for instance a diode array detector.
- Use of a spike of the compound of interest at a level that gives a similar response to the measured compound. A spike gives good evidence that a compound differs from the compound of interest when retention times are close but it does not provide good confirmation if retention times coincide. Spikes also reveal modifications to responses that can arise with new matrixes.
- Experience from a number of known similar samples. This is unlikely for ad hoc samples from a number of sources but may be relevant for a project or a process control laboratory.

Many of these approaches to confirmation involve an additional check procedure following that used to generate the analytical result. A change in the method of confirmation may have a major impact on the specificity of the result and its potential "fitness for purpose" without having any impact on the metrological estimation of uncertainty. In fact, a laboratory could ignore confirmation to get a commercial advantage and still be able to demonstrate the same uncertainty using the current metrological approach. Confirmation will normally have no effect on the components of uncertainty included in the metrological approach but lack of confirmation is likely to make the specificity of the result so uncertain that it will be "unfit for any purpose".

Suspect chemistry does not only apply to chromatography; it can make any measurement meaningless. The December 1999 report from the Canadian Food Inspection Agency on their Histamine Quality Assurance Programme [12] shows major differences between results obtained by high performance liquid chromatography (HPLC) methods, mainly with fluorescence detection, and some of those obtained by immunoassay based systems. Using immunoassay methods for the sample of fish sauce (code HI6), 2 of the 3 results, at 48.00 and 105.83 mg/100 g, were considerably higher than all except 1 of the 21 HPLC based results, which had a mean of 11.83 mg/100 g. It seems unlikely these laboratories or the manufacturers of the immunoassay kits did not understand or had failed to validate the method. Obviously something is missing from these validations and the present understandings of these chemical measurements, and is unlikely to have been included in any uncertainty associated with the method. Alternatively, but less likely, an interference with the same retention time as histamine has quenched the observed fluorescence and resulted in low results for HPLC based methods.

Uncertainty arising from any lack of specificity of the underlying chemistry is not going to go away. It is also not a new concern. Interference has always dominated thinking and precautions in analytical chemistry and is well discussed in traditional reference textbooks such as Vogel [13]. Uncertainty may not be decreased by the new methods analysts continue to introduce or by manufacturers who continue to improve the ease of operation of existing equipment. Improvements can be very good for the expert analyst who understands the limitations of all measurements but can also allow laboratories to downgrade the skill level of equipment operators. Automated equipment can also lead to a decreased level of appraisal of individual results with an increase in uncertainty arising from overlooking problems with the underlying chemistry.

Automated equipment often allows replacement of expert technicians with less skilled staff, which will decrease reliability of judgements made during the analytical process. A decrease in the skill level of laboratory staff seems to be a problem throughout the world [14]. It is partly addressed by accreditation authorities requiring a minimum level of expertise within laboratories accredited to ISO/IEC 17025:1999 [5] but it also requires support of laboratory owners. Too often this is not forthcoming. Accreditation authorities can probably control major reductions in skill levels within a laboratory over a short time frame but may have less ability to resist longer-term incremental shifts that have the same overall consequence. Skill levels in the laboratory certainly have a major impact on quality of analytical judgements and analytical reliability without necessarily any effect on the calculated uncertainty of the recorded numerical result.

It is often stated, by for instance King, that a significant number of reported chemical measurements are wrong [9]. This is almost certainly true and likely arises from problems in the chemistry underlying the method resulting in lack of specificity and/or loss of the analyte. Uncertainty arising from problems with the chemistry will be relevant to most chemical measurements aimed at measuring discrete chemical entities. In addition, lower limits of detection are often associated with increased uncertainty unless significantly improved equipment is used.

The sampling process

Sampling is the process used to obtain a portion of the bulk material for testing. It may take place in the field or in the laboratory. Most chemical analyses will involve at least two stages of sampling, primary sampling in the field and secondary sampling in the laboratory to give the final test portion. All samples are heterogeneous if looked at on a small enough scale [15]. The uncertainty arising from sampling will depend on the degree of heterogeneity in that sample.

Chemical laboratories are usually not responsible for primary sampling in the field but this must be appropriate or the result will be meaningless. They will however have to prepare the sample delivered to the laboratory and take a representative secondary sample for testing. The uncertainty associated with sampling is dependent on the sample and may vary between nominally identical samples although procedures used are identical. This is amply demonstrated by some test data on rice flour that are shown in Table 1.

Table 1 Protein and moisture content of rice flour

               Sample 1                          Sample 2
               Moisture        Protein           Moisture        Protein
Replicate      1       2       1       2         1       2       1       2
               13.38   13.36   8.69    8.71      12.36   12.41   7.44    7.47
               13.58   13.66   8.24    8.24      12.50   12.50   7.45    7.44
               13.60   13.57   8.26    8.25      12.54   12.38   7.46    7.45
               13.36   13.36   8.74    8.69      12.51   12.53   7.44    7.45
               13.46   13.43   8.62    8.68      12.50   12.50   7.44    7.46
               13.48   13.45   8.24    8.26      12.48   12.56   7.64    7.69
               13.28   13.29   8.64    8.65      12.56   12.56   7.63    7.69
Mean           13.45           8.49              12.49           7.52
SD Overall     0.12            0.21              0.07            0.10
SD Replicate   0.026           0.022             0.050           0.023
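The summary rows of Table 1 can be checked with a short calculation. The sketch below assumes, since the article does not say how the rows were computed, that "SD Overall" is the sample standard deviation of all 14 values and that "SD Replicate" is the pooled standard deviation of the duplicate pairs, sqrt(sum(d^2) / 2n) for n pairs with differences d. It uses only the Sample 1 moisture values from the table.

```python
import math
import statistics

# Sample 1 moisture duplicates from Table 1 (replicate 1, replicate 2).
pairs = [
    (13.38, 13.36), (13.58, 13.66), (13.60, 13.57), (13.36, 13.36),
    (13.46, 13.43), (13.48, 13.45), (13.28, 13.29),
]

values = [x for pair in pairs for x in pair]
mean = statistics.mean(values)
# Assumed definition: sample SD (n - 1 denominator) over all 14 values.
sd_overall = statistics.stdev(values)
# Assumed definition: pooled SD of duplicates, sqrt(sum(d^2) / (2 * number of pairs)).
sd_replicate = math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs)))

print(f"mean = {mean:.2f}, SD overall = {sd_overall:.2f}, SD replicate = {sd_replicate:.3f}")
```

Under these assumptions the calculation gives 13.45, 0.12 and 0.026, which agrees with the Sample 1 moisture column. The overall SD is several times the replicate SD, which is the gap between bag location and measurement repeatability that the table is meant to show.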
The test data in Table 1 are taken from homogeneity data from two sets of samples each produced from two bags of rice flour and intended for use in a proficiency system. Seven replicates for each set of samples are too few to produce a reliable estimate of the standard deviation. However, it is clear even from these limited data that for sample 1, the uncertainty associated with the location of the sample in the bag is about twice that of the second sample for both analyses. Variation between replicates of the same sample from within the batch is similar within both sets of samples for the protein determination and possibly slightly higher in batch 2 for the moisture determination.

An even more extreme example of the effect of lack of information on sample heterogeneity relates to the analysis of infant formula for iodine content [16]. A number of standard methods of analysis for the iodine content in dairy based products take 1 g test portions for the analysis and this size test portion is completely satisfactory for most formulations. One product on the New Zealand market when this testing was carried out was however formulated by dry mixing and 1 g test portions were completely inappropriate. Replicates from 1 g test portions of this sample were repeatable with no indication of an analytical problem although the sampling procedure meant results were well below the expected level and unfit for purpose. Test portions of 25 g gave more variable results but at least these were centered on the expected level. Test portions of 100 g gave results within a tight range still centered at the expected level. These results are summarized in Fig. 3 [16].

Fig. 3 Iodine levels in a dry mix infant formula (x-axis: test sample size, g)

Uncertainty associated with sampling depends on three main attributes: the heterogeneity of the sample, the size of the sample and the method of taking this sample from the material in the field or in taking the test portion from the sample in the laboratory. If these attributes are all known then the uncertainty can be calculated from statistical relationships such as have been developed by Gy [15] and incorporated into the metrological approach to estimating the uncertainty in the defined quality of the measurand.

Without knowledge of the uncertainty associated with sampling, any estimation of uncertainty in the result is meaningless. Gy [15] notes that the bias at the primary stage of sampling can be as high as 1000% of the result and at the secondary stage, as high as 50%. Any analytical uncertainty of a few percent is then essentially irrelevant to the uncertainty budget while the potential biases in sampling are this large.

Estimating uncertainty associated with the sampling process requires that test portions be representative of the parent material, that is, they have to be selected by a method that is both accurate and reproducible [15]. The term for "representative sampling" ($r_G^2$) has been defined by Gy [15] as a composite quantity dependent on both the maximum variance allowed ($s_G^2$) and the square of the maximum bias allowed ($m_G^2$), that is:

$r_G^2 = m_G^2 + s_G^2$

Thus, representative sampling is characterized by the absence of bias together with an acceptable variance [15]. Sub-samples that are not representative of the whole will make the final measurements irrelevant to any decision making.
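The relation $r_G^2 = m_G^2 + s_G^2$ can be illustrated numerically. The following sketch estimates the bias, the variance and their composite from a set of relative sampling errors. The error values are hypothetical, and taking the variance as a population variance (so that the composite equals the mean squared error) is an assumption made for this illustration rather than a procedure taken from Gy [15].

```python
import statistics


def representativeness(relative_errors):
    """Return (m, s2, r2) with r2 = m**2 + s2 for a list of relative sampling errors."""
    m = statistics.fmean(relative_errors)       # mean error, i.e. the bias
    s2 = statistics.pvariance(relative_errors)  # population variance of the errors
    return m, s2, m ** 2 + s2


# Hypothetical relative errors (sample result / reference value - 1) from repeated sampling.
errors = [-0.10, 0.05, -0.02, 0.08, -0.06]

m, s2, r2 = representativeness(errors)
# With the population variance, r2 coincides with the mean squared error.
mean_square = statistics.fmean([e ** 2 for e in errors])

print(f"bias m = {m:.3f}, variance s2 = {s2:.5f}, r2 = {r2:.5f}")
```

Because the two terms add, a sampling method can fail to be representative either through bias alone, as with the repeatable but low 1 g iodine results, or through variance alone, so both have to be estimated before a limit on the composite can be checked.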
18 J.L. Love

Analysts must understand and include the uncertainty associated with sampling if they are to realistically estimate uncertainty of chemical measurements for clients. They must understand what is meant by a representative sample and have appropriate data to ensure samples tested are representative. However, as Gy [15] has pointed out, this is not a traditional part of the education or training of analysts although it is critical to the use of measurements and their "fitness for purpose". This must change and sampling processes must be included as part of the metrological approach to chemical measurement.

In the past laboratories may have excluded primary sampling from their concern but ISO/IEC 17025:1999 [5] now imposes a requirement that the test and calibration methods selected are capable of meeting the client's requirements. Without knowledge of the traceability of the measurement back to the client's bulk sample, this is impossible.

Conclusion

Two parameters define the result of a chemical measurement. These are the named chemical entity and the amount of this entity estimated by the defined procedure. Any estimation of uncertainty in the result must consider traceability of the measurement to both these reference points. To be useful, the result must also be traceable back to the original sample. The uncertainty in chemical measurements must include:
- Uncertainty arising from assumptions made in the chemistry on which the method is based and the possibility that the measured result does not solely represent the chemical entity of interest and/or has not measured all of this entity present.
- Uncertainty arising from variation between repeats of the measurement.
- Uncertainty arising from the sampling procedure and the likelihood that the test portion does not contain a representative amount of the analyte of interest.

Present approaches to uncertainty based on metrological principles developed for physical measurement concentrate on estimating the uncertainty inherent in the traceability of the quantity of measurand to its reference point. Incorporation of metrological practices of physical measurement into chemical measurement is certainly an improvement but more effort is needed to incorporate uncertainty associated with the sampling process. The metrological practices of physical measurement do not address uncertainty arising from the choice of chemistry and the traceability of the result to the defined chemical entity, that is, the specificity of the method or uncertainty associated with sampling. At present chemists use judgement as to when uncertainty associated with sampling and specificity can be ignored. However, judgements are subjective and incompatible with formal methods to estimate uncertainty. Effort must be spent in developing a metrological approach for chemical measurements of discrete chemical entities that will allow realistic estimations of the total uncertainty associated with the reported result, not just the uncertainty associated with the numerical value.

Acknowledgements The author thanks Dr. Don Ferry from International Accreditation New Zealand (IANZ) for the supply of the data on rice flour replicates. This paper arose from some discussions between the author and the late Dr. John Nicholas at New Zealand Measurements Standards Laboratory (MSL).

References

1. CITAC (2000) Traceability in chemical measurement. CITAC web page at http://www.vtt.fi/ket/citac/traceability.pdf
2. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1995) Guide to the expression of uncertainty in measurement. ISO, Geneva
3. Consultative Committee for Amount of Substance (Bureau International des Poids et Mesures) http://www.bipm.org/enus/2_Committees/CCQM.shtml
4. EURACHEM/CITAC Guide (2000) Quantifying uncertainty in analytical measurement, 2nd edn., Final Draft April 2000. EURACHEM
5. ISO/IEC 17025 (1999) General requirements for the competence of testing and calibration laboratories. ISO, Geneva
6. ISO 25 (1990) General requirements for the competence of calibration and testing laboratories. ISO, Geneva
7. Horwitz W (2000) Official methods of analysis of AOAC International. AOAC International, Gaithersburg, Md., USA
8. Codex Committee on Methods of Analysis and Sampling 23rd Session, Budapest, Hungary 26 February-2 March, 2001. Proposed draft guidelines for the application of the criteria approach by the committee on methods of analysis and sampling. Agenda item 4a (CX/MAS 01/4)
9. King B (2000) Accred Qual Assur 5:173-179
10. Wells RJ, Smith RJ (1996) Chem Aust April 1996:167-168
11. Alexandrov YI (1997) Fresenius J Anal Chem 357:563-571
12. Burns-Flett E (2000) Report by the Histamine Quality Assurance Co-ordinator, Canadian Food Inspection Agency dated 31 March 2000. Canadian Food Inspection Agency, 501 University Crescent, Winnipeg, Manitoba R3T 2N6
13. Vogel AI (1961) A textbook of quantitative inorganic analysis including elementary instrumental analysis, 3rd edn. Longmans, London, UK
14. Clapp S (2000) Professional qualifications: How close should we look? Inside Laboratory Management June 2000:18-20. AOAC International, Gaithersburg, Md., USA
15. Gy P (1998) Sampling for analytical purposes. Wiley, Chichester, UK
16. Love JL (2000) Sampling - What should analytical chemists learn from microbiologists? Inside Laboratory Management February 2000:17-18. AOAC International, Gaithersburg, Md., USA
Accred Qual Assur (1999) 4:401-405
© Springer-Verlag 1999

Rene Dybkaer

From total allowable error via metrological traceability to uncertainty of measurement of the unbiased result

Abstract The concept of "total allowable error", investigated by Westgard and co-workers over a quarter of a century for use in laboratory medicine, comprises bias as well as random elements. Yet, to minimize diagnostic misclassifications, it is necessary to have spatio-temporal comparability of results. This requires trueness obtained through metrological traceability based on a calibration hierarchy. Hereby, the result is associated with a final uncertainty of measurement purged of known biases of procedure and laboratory. The sources of bias are discussed and the importance of commutability of calibrators and analytical specificity of the measurement procedure is stressed. The practicability of traceability to various levels and the advantages of the GUM approach for estimating uncertainty are shown.

Key words Metrological traceability · Total allowable error · Trueness · Unbiased result · Uncertainty of measurement

Presented at: 4th Conference on Quality [R]evolution in Clinical Laboratories, Antwerp, Belgium 29-30 October 1998

R. Dybkaer
Copenhagen Hospital Corporation, Department of Standardization in Laboratory Medicine, H:S Kommunehospitalet, Øster Farimagsgade 5, DK-1399 Copenhagen K, Denmark
Tel.: +45-33-38-37 85/86
Fax: +45-33-38-37-89

Introduction

The important contributions of Prof. James O. Westgard to quality assurance in laboratory medicine have spanned a quarter of a century. His initial interest in statistical comparison of measurement procedures [1] soon led to criteria for judging precision and accuracy in a procedure [2]. Based on the concept "total analytic error", comprising constant systematic error, proportional systematic error, and random error, the concept "allowable total error" (originally called total allowable error) was defined with respect to clinical requirements, usually as a 95 % limit. This measure has been maintained during all later developments by Westgard and his co-workers and has been recently applied to the "analytical model" used in a paper on the Validator® 2.0, which is a computer programme for automatic selection of statistical quality control procedures [3]. In worded form, the following equation is said to apply (where QC = internal quality control rules):

allowable total error
= constant inaccuracy of procedure
+ varying inaccuracy due to sample matrix
+ unstable inaccuracy detectable by QC
+ z (unstable imprecision detectable by QC)

where z = 1.65 yields a maximum allowable number fraction of defects of 5 %.

The present discussion is about the constant bias of the measurement procedure (the first term, called constant inaccuracy, in the equation above). This component of overall bias is, in principle, a known detriment to trueness of measurement (defined as average closeness to a reference value).

Trueness and consequences of procedure-dependent bias

It is relevant to ask whether trueness is important or whether the sometimes heard pronouncement "precision is better than accuracy [meaning trueness]" relegates trueness to a lower priority. The reliance on precision is repeatedly seen in the results from external quality assessment (or proficiency testing) schemes all over the world, where method-dependent groupings of results for a given measurand are abundant.

Bias always impairs the comparability over space and time of the results for a given type of quantity and distorts the relationships between different types of quantity. Biological reference intervals are changed in comparison with a true distribution [e.g. 4, 5]. Harris even suggested a new term for such intervals, "medical indifference ranges" [6]. Whereas serial monitoring for change can sometimes live with a constant bias, this is not the case with screening, initial diagnosis, and movement towards a fixed discriminatory true limit, where diagnostic misclassifications are the outcome [e.g. 6-10]. A positive or negative bias of, say, 1 mmol/l in the amount-of-substance concentration of cholesterol or glucose in blood plasma has enormous effects on population health and economy.

Reduction of bias

Several approaches to the elimination of known bias should be considered when selecting, describing and operating a measurement procedure for a given type of quantity:
1. The type of quantity that is to be measured must be defined sufficiently well. This is particularly demanding when analyte isomorphs or speciation are involved.
2. The principle and method of measurement must be carefully selected for analytical specificity.
3. A practicable measurement procedure including sampling must be exhaustively described.
4. A calibration hierarchy must be defined to allow metrological traceability, preferably to a unit of the International System of Units (SI). Traceability involves plugging into a reference measurement system of reference procedures and commutable calibration materials.
5. An internal quality control system must be devised to reveal increases in bias.
6. Any correction procedures must be defined and validated.
7. Where possible, there should be participation in external quality assessment ("proficiency testing") using material with reference measurement values.

Metrological traceability

The necessary anchor for the trueness of a measurement procedure is obtained by strict metrological traceability of result, based on a calibration hierarchy. The official definition of traceability in metrology is: "property of the result of a measurement or the value of a measurement standard whereby it can be related to stated references, usually national or international measurement standards, through an unbroken chain of comparisons all having stated uncertainty" [11]. As stressed in the first resolution of the 20th General Conference on Weights and Measures (CGPM) in 1995 [12], the top of the calibration hierarchy, when possible, should be the definition of an SI unit.

The physical calibration hierarchy

In physics, the use of calibration hierarchies is well established and is used in any laboratory, e.g. for balances, volumetric equipment, spectrometer wavelengths, cuvette light path lengths, thermometers, barometers and clocks.

The chemical calibration hierarchy

For chemical quantities, involving the SI base unit for amount of substance, the "mole", its definition demands specification of the elementary entities of the component under consideration. According to the physical calibration hierarchy, a primary standard would be needed for each of the huge number of different compounds that are defined in the measurements. To circumvent this obstacle, the Consultative Committee for Amount of Substance of the International Committee on Weights and Measures (CIPM-CCQM) defines a primary reference method, which is claimed directly to give amount of substance in moles without prior calibration by a primary standard [13, 14]. Current examples of primary reference methods are isotope dilution-mass spectrometry and gravimetry. It should be realized, however, that establishing the more complicated measurement procedures based on such primary methods is by no means simple [15] and may require the expertise of the International Bureau of Weights and Measures (BIPM) or a national metrology institute (NMI). A primary reference measurement procedure (prim. RMP) assigns a value with uncertainty of measurement to a primary reference material [13], usually purified and stable, used as a primary calibrator (prim. C). The steps of the calibration series may be as follows with the responsible bodies in parentheses (accr. CL = accredited calibration laboratory; mf. = manufacturer).

SI unit (definition) (CGPM)
prim. RMP (BIPM, NMI)
prim. C (BIPM, NMI)
sec. RMP (NMI, accr. CL)
sec. C (NMI → accr. CL → mf.'s lab.)
mf.'s selected MP (mf.'s lab.)
mf.'s working C (mf.'s lab.)
mf.'s standing MP (mf.'s lab.)
mf.'s product C (mf. → user)
routine MP (mf., user)
routine sample (user)
result (user)

The length of the hierarchy can be reduced by eliminating pairs of consecutive steps, thereby reducing uncertainty.

Commutability and analytical specificity

There are two major reasons why a traceability chain may be broken and trueness lost due to the introduction of bias: insufficient commutability of a calibration material and non-specificity of a measurement procedure. The effects of these separate properties are often indiscriminately lumped together as "matrix effect". Commutability refers to the ability of a material, here a calibrator, to show the same relationships between results from a set of procedures as given by routine samples [16, 17]. Analytical specificity refers to the ability of a measurement procedure to measure solely that quantity which it purports to examine [16, 18]. Discrepancies between results of a reference procedure and a routine procedure applied to routine samples are often caused by non-specificity of the routine procedure. The use of a set of human samples as a manufacturer's calibrator to eliminate so-called matrix effects should only be accepted if the relationship between the results from reference and routine procedures is sufficiently constant to allow explicit correction with consequent increased uncertainty of assigned values.

Traceability in practice

It is relevant to ask how often the routine measurement procedures currently used in laboratory medicine provide results that are traceable to high-level calibrators and reference measurement procedures (Lequin: personal communication). It turns out that primary reference measurement procedures and primary calibrators are only available for about 30 types of quantity such as blood plasma concentration of bilirubins, cholesterols and sodium ion. International reference measurement procedures from the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) and corresponding certified reference material from BCR are available for the catalytic activity concentration of a few enzymes such as alkaline phosphatase and creatine kinase in plasma. For another 25 types of quantity, such as the catalytic activity concentration of aspartate aminotransferase in plasma and number concentration of erythrocytes in blood, no high-level calibrators exist. International calibrators, e.g. from WHO, but no high-level in vitro procedures characterize a couple of hundred types of quantity involving, for example, choriogonadotropin. An overwhelming number of types of quantity have no high-level ending of the traceability chain, but rely on the internal best-measurement procedure and calibrator of the reagent set manufacturer or individual laboratory. The end-user, as a rule, cannot be expected to establish the entire traceability chain if that goes above an in-house procedure. The laboratorian usually has to rely on the manufacturer which, in turn, may claim traceability of its product calibrators to the highest available level, preferably provided by a national metrology institute, an accredited calibration laboratory, or a reference measurement laboratory. In fact, this responsibility of the manufacturer is now enshrined in the EU Directive on in vitro diagnostic medical devices [19], which will be supported by four EN/ISO standards under development. The laboratorian should, however, bolster his or her belief in trueness and comparability - especially if the traceability chain does not reach high - by recovery experiments [20], comparison with a selected procedure [21], and interlaboratory parallel measurements [22], including external quality assessment [23], preferably on material with reference measurement procedure assigned values [24]. The internal quality control system finally checks, with a given probability, whether the current measurements are in statistical control with no sign of change in the assumed zero bias.

Uncertainty of measurement

The definition of metrological traceability (see above) stipulates that each link in the chain has a known uncertainty. Nowadays, this concept and its application have been reformulated by the BIPM and recently detailed in the "Guide to the expression of uncertainty in measurement" (GUM) [25]: "parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand". Useful explanations are provided in several other guides [26-30] as well as commentaries [e.g. 31-33]. The philosophy is to apply a bottom-up approach by formulating a function of all input quantities giving the measurand as output. An uncertainty budget of all sources of uncertainty is established. Important items to consider are:
- definition of the measurand
- realization of the measurand
- sampling
- speciation and matrix
- instability
- environment and contamination
- measuring system
- published reference data
- calibrator values
- commutability
- algorithms and software
- corrections and correction factor.

Each contribution is assessed as a standard uncertainty, either by statistical procedure on experimental data in the form of an a posteriori distribution, the so-called Type A evaluation, or by scientific judgement based on an a priori chosen distribution, Type B evaluation. The few standard uncertainties of important magnitude are combined quadratically, including any covariances, and the combined uncertainty, $u_c$, is obtained as the positive square root.

The advantages of this approach are important:
- The transparent budget invites improvement where major contributions are identified in the total sequence from definition onwards.
- There is no known significant bias, allowing one, usually symmetric, measure of uncertainty.
- The combined uncertainty is comparable with that of other results.
- The combined uncertainty can be quadratically added to those of other results as demanded for traceability.
- The combined uncertainty can be compared with the classical top-down approach of calculating an uncertainty directly from replicate final results to reveal any discrepancy requiring further investigation.

The role of certified reference materials (with assigned value and uncertainty) in obtaining traceability and avoiding bias is obvious.

The GUM approach to uncertainty is rapidly gaining acceptance in metrological institutes and industry, and must be applied in ISO and CEN standards. It should be used in accredited laboratory work but chemists often find the implementation difficult and therefore hesitate [34]. Additionally, sometimes, there is a fear that honest GUM uncertainty intervals, which may be wider than classical precision intervals, are bad for business. Also, the perceived psychological effect on the customer of the term "uncertainty" seems to have led the food industry - naturally concerned about palatability - to propose the substitute term "reliability". Although it would be possible to define a concept with a "comforting" term inversely related to the measures of uncertainty - analogously to accuracy, trueness, and precision - the term reliability is already used for a more comprehensive concept covering several analytical performance criteria. There should be no doubt, however, that, as the GUM says, "The evaluation of uncertainty is neither a routine task nor a purely mathematical one; it depends on detailed knowledge of the nature of the measurand and of the measurement" [25]. To alleviate the calculations involved, commercial EDP programmes are being offered.

Conclusions

The upshot of these considerations is that one should cease to define a so-called allowable total error of result, with assessable biases of procedure and laboratory included. Instead, it is necessary to provide corrected results with a defined allowable maximum uncertainty at an agreed level of confidence. Likewise, a manufacturer may be asked to specify an expected uncertainty for a measuring system performing according to a measurement procedure under statistical control. Finally, the laboratorian can provide the customer with a corrected result and an accompanying uncertainty interval comprising a stated proportion of values that could reasonably be attributed to the measurand. This view is not in conflict with the 25-year-old statement by Westgard and co-workers - using classical terminology - that "In principle, only random error need be tolerated. Systematic errors can be eliminated by appropriate improvements in methodology" [1].

Acknowledgements Ms Inger Danielsen is gratefully thanked for her excellent secretarial assistance.
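The quadratic combination of standard uncertainties described under "Uncertainty of measurement" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each contribution is taken to be already expressed in the unit of the result (sensitivity coefficients applied), covariances are supplied as correlation coefficients, and the function name and the numerical values are hypothetical.

```python
import math


def combined_standard_uncertainty(contributions, correlations=None):
    """Combine standard uncertainties quadratically.

    contributions : list of standard uncertainties u_i, each already
                    expressed in the unit of the result (assumption)
    correlations  : optional dict {(i, j): r_ij} with i < j giving the
                    correlation coefficient between contributions i and j
    Returns the positive square root of the summed (co)variances.
    """
    total = sum(u ** 2 for u in contributions)
    for (i, j), r in (correlations or {}).items():
        # Covariance term 2 * r_ij * u_i * u_j for each correlated pair.
        total += 2.0 * r * contributions[i] * contributions[j]
    return math.sqrt(total)


# Hypothetical budget with two dominant contributions (e.g. in mmol/l).
budget = [0.03, 0.04]
u_c_independent = combined_standard_uncertainty(budget)
u_c_correlated = combined_standard_uncertainty(budget, {(0, 1): 1.0})
print(round(u_c_independent, 4), round(u_c_correlated, 4))
```

With independent contributions the two hypothetical values combine to about 0.05; with full positive correlation they add linearly to about 0.07, which shows why covariances cannot be ignored in the budget.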

References

1. Westgard JO, Hunt MR (1973) Clin Chem 19:49-57
2. Westgard JO, Carey RN, Wold S (1974) Clin Chem 20:825-833
3. Westgard JO, Stein B, Westgard SA, Kennedy R (1997) Comput Method Programs Biomed 53:175-186
4. Gowans EMS, Hyltoft Petersen P, Blaabjerg O, Hørder M (1988) Scand J Clin Lab Invest 48:757-764
5. Hyltoft Petersen P, Gowans EMS, Blaabjerg O, Hørder M (1989) Scand J Clin Lab Invest 49:727-737
6. Harris EK (1988) Arch Pathol Lab Med 112:416-420
7. Ehrmeyer SS, Laessig RH (1988) Am J Clin Path 89:14-18
8. Hyltoft Petersen P, Lytken Larsen M, Hørder M, Blaabjerg O (1990) Scand J Clin Lab Invest 50 (Suppl 198):66-72
9. Hyltoft Petersen P, Hørder M (1992) Scand J Clin Lab Invest 52 (Suppl 208):65-87
10. Hyltoft Petersen P, de Verdier C-H, Groth T, Fraser CG, Blaabjerg O, Hørder M (1997) Clin Chim Acta 260:189-206
11. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) International vocabulary of basic and general terms in metrology. ISO, Geneva
12. Comité International des Poids et Mesures (1998) National and international needs relating to metrology. Bureau International des Poids et Mesures, Sèvres
13. Kaarls R, Quinn TJ (1997) Metrologia 34:1-5
14. Quinn TJ (1997) Metrologia 34:61-65
15. Adams F (1998) Accred Qual Assur 3:308-316
16. Dybkaer R (1997) Eur J Clin Chem Clin Biochem 35:141-173
17. Fasce CF, Rej R, Copeland WH, Vanderlinde RE (1973) Clin Chem 19:5-9
18. Kaiser H (1972) Z Anal Chem 260:252-260
19. EU Directive 98/79/EC (1998) Off J Eur Comm L 331:1-37
20. Willets P, Wood R (1998) Accred Qual Assur 3:231-236
21. Hyltoft Petersen P, Stöckl D, Blaabjerg O, Pedersen B, Birkemose E, Thienpont L, Flensted Lassen J, Kjeldsen J (1997) Clin Chem 43:2039-2046
22. Groth T, de Verdier C-H (1993) Upsala J Med Sci 98:259-274
23. Hirst AD (1998) Ann Clin Biochem 35:12-18
24. Stamm D (1982) J Clin Chem Clin Biochem 20:817-824
25. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
26. Taylor BN, Kuyatt CE (1994) NIST Technical Note 1297. National Institute of Standards and Technology, Washington
27. Eurachem (1995) Quantifying uncertainty in analytical measurement.
28. EAL-G23 (1996) The expression of uncertainty in quantitative testing.
29. EAL-R2 (1997) Expression of the uncertainty of measurement in calibration.
30. ISO TR 14253-2 (1998) Geometrical product specifications (GPS) - Inspection by measurement of workpieces and measuring equipment - Part 2: Guide to the estimation of uncertainty in GPS measurement, in calibration of measuring equipment and in product verification. ISO, Geneva
31. Kadis R (1998) Accred Qual Assur 3:237-241
32. Bremser W (1998) Accred Qual Assur 3:398-402
33. Hässelbarth W (1998) Accred Qual Assur 3:418-422
34. Golze M (1998) Accred Qual Assur 3:227-230
Accred Qual Assur (1998) 3:180-184
© Springer-Verlag 1998

Jean Pauwels · Andree Lamberty · Heinz Schimmel

The determination of the uncertainty of reference materials certified by laboratory intercomparison

Abstract A pragmatic method is proposed for the implementation of the Guide to the expression of uncertainty in measurement in the certification of reference materials by laboratory intercomparison. It is based on the establishment of a full uncertainty budget for each laboratory result and the estimation of the impact of various laboratory standard uncertainties and of between-units variability on the certified reference material (CRM) uncertainty.

Key words Reference material · Laboratory intercomparison · Certified value · Uncertainty

Jean Pauwels (✉) · Andree Lamberty · Heinz Schimmel
European Commission, Joint Research Centre, Institute for Reference Materials and Measurements (IRMM), B-2440 Geel, Belgium
Tel.: +32-14-571722
Fax: +32-14-590406
e-mail: pauwels@irmm.jrc.be

Introduction

Many reference materials, produced worldwide, are certified by laboratory intercomparison, involving a large number of independent and, if possible, equally competent laboratories [1]. Normally, methods used are based on a variety of chemical and/or physical principles. It is then assumed that the differences between individual results, both within and between laboratories, are all of a statistical nature regardless of their causes. Each laboratory mean is considered as an unbiased estimate of the property of the material to be certified, and usually an unweighted mean of the laboratory means is assumed to be the best estimate of that property. In general, a reference material certification involves different laboratories, each of which measures the requisite property on different samples, with each sample measurement consisting of a number of independent repeated observations. The certified value and its uncertainty are then estimated on the basis of an analysis of variance, after verification that all data belong to the same normally distributed population.

If this is the case, the mean value of all individual data is taken as the certified value, and the half-width of the 95% confidence interval of the mean value of all individual data as its uncertainty. If, on the contrary, pooling is not allowed because individual data do not belong to the same normally distributed population, the mean value of the laboratory means is taken as the certified value and the half-width of the 95% confidence interval of the mean value of the laboratory means as its uncertainty.

The limitation of such procedures is that the distribution of the considered values should be normal and that no other sources of uncertainty than "random experimental uncertainties" should exist [1].

The above procedure finds its justification in the fact that one presumes that, if a large variety of independent laboratories and methods is used, possible systematic effects in the individual laboratory results will be "randomized" and that, eventually, both the residual systematic error and its uncertainty are reduced to zero.

Determination of an uncertainty according to the Guide to the expression of uncertainty in measurement

According to the Guide to the expression of uncertainty in measurement (GUM) [2], the result of a measurement corresponds to the estimate of the value of a measurand and should, therefore, always be accompanied by an uncertainty statement. It is, generally, determined on the basis of a series of observations obtained under repeatability conditions; its standard uncertainty is expressed as a standard deviation. It is assumed that measurement results are corrected for recognized significant systematic effects and that every effort has been made to identify and quantify such effects. Moreover, any other sources of uncertainty should be estimated and taken into account.

Uncertainty components are of two different types based on the method used for their evaluation: type A uncertainties are evaluated statistically on the basis of a series of observations, and type B uncertainties on the basis of all means other than statistical ones (e.g. previous experimental data, knowledge or experience, manufacturer's specifications, data from certificates, published reference data, etc). Both type A and type B uncertainties can be of a "random" as well as of a "systematic" nature.

A measurand $Y$ is, however, generally not measured directly, but determined from $N$ other quantities $X_1, X_2, \ldots, X_N$ through a functional relationship $f$:

$$Y = f(X_1, X_2, \ldots, X_N) \qquad (1)$$

The set of input quantities $X_1, X_2, \ldots, X_N$ may be categorized as
- quantities whose values and uncertainties are directly determined in the current measurement; they may then be obtained from a single observation, repeated observations or judgement based on experience; they may involve the determination of corrections to instrument readings and corrections for influence quantities
- quantities whose values and uncertainties are brought into the measurement from external sources, such as quantities associated with calibrated measurement standards, certified reference materials, reference data obtained from handbooks, etc.

The estimated standard deviation associated with the output estimate $y$ of $Y$, termed combined standard uncertainty and denoted $u_c(y)$, is determined from the estimated standard deviation associated with each input estimate $x_i$ of $X_i$, termed standard uncertainty and denoted $u(x_i)$. In its second recommendation, the Comité International des Poids et Mesures (CIPM) requested that this combined standard uncertainty be used "by all participants in giving results of all international comparisons or other work done under the auspices of the CIPM and Comites Consultatifs" [3].

Although $u_c(y)$ can be universally used to express the uncertainty of the result of a measurement, it may be required to give a measure of uncertainty that defines an interval about the measurement result that may be expected to encompass a large fraction of the distribution of values that could reasonably be attributed to the measurand. This additional measure is termed the expanded uncertainty and is denoted $U$. It is obtained by multiplying $u_c(y)$ by a coverage factor $k$:

$$U = k\,u_c(y) \qquad (2)$$

The generally chosen value of the coverage factor $k$ is 2 or 3. If the probability distribution characterized by $y$ and $u_c(y)$ is approximately normal and the effective degrees of freedom of $u_c(y)$ of significant size, $k = 2$ or 3 corresponds to a level of confidence of approximately 95 or 99%.

The result of a measurement is conveniently expressed as:

$$Y = y \pm U \qquad (3)$$

which means that the best estimate of the value attributable to the measurand $Y$ is $y$, and that

$$y - U < Y < y + U \qquad (4)$$

is the interval that may be expected to encompass a large fraction ($p$) of the distribution of values that could reasonably be attributed to $Y$. The fraction $p$ of the probability distribution is named coverage probability or level of confidence.

The Eurachem document "Quantifying Uncertainty in Analytical Measurement" [4] shows how the GUM concept should be applied in chemical measurement and illustrates this by four worked examples. These examples are however limited to simple analytical determinations, and the document discusses neither the problem of laboratory intercomparisons nor their use for the certification of reference materials.

Application of the GUM to the determination of the uncertainty of CRMs by laboratory intercomparison

A typical example of a certification exercise by laboratory intercomparison (e.g. for BCR CRMs) is shown in Fig. 1:

[Figure: bar graphs for laboratory means and 95% CI; horizontal axis from 140.0 to 170.0; rows LAB 01 to LAB 12 and LAB 14, with the certified interval marked]

Fig. 1 Example of certification by laboratory intercomparison as performed to-day
- Between 6 and 15 laboratories carry out each six measurements spread on two different units.
- Samples of each of both units are measured on two different days.
- The measurement is (e.g. for BCR CRMs) carried out under reproducibility conditions, i.e. such that each replicate has its own calibration, dissolution, extraction, blank determination, etc.

The comparison of the results is, however, limited to the bare values of the six replicates carried out by each laboratory, with the immediate consequence that laboratories very often do not overlap between each other. Frequently, it is observed that the results of several laboratories participating in the certification do not even overlap with the value which is certified. The reason for this is not, as is generally believed on the basis of routine statistical tests, that there are significant differences between the results of the different laboratories, but because only the standard uncertainty on the six replicates is considered and because calculation of a combined standard uncertainty for each participating laboratory result is omitted. As already indicated, each analyst carrying out a measurement should always make up a complete uncertainty budget considering all recognized components of standard uncertainty affecting his measurement result. This should a fortiori apply to any laboratory which is invited to contribute to the certification of a reference material. The standard deviation $s(j)$ of the six replicates carried out by laboratory $j$, further denoted as $u_1(j)$, already includes part of the uncertainties of a purely statistical nature due to day-to-day variation, calibration (at least if each replicate has its own calibration), recovery yield (same remark), etc., as the measurements are in principle executed under reproducibility conditions. However, the standard uncertainties $u_i(j)$ (for $i$ ranging from 2 to $n$) due to sampling, dry mass determination, calibration, recovery yield, blank correction, matrix effect, possible interferences, etc., generally also contain components of a more systematic nature which are not included in $s(j)$ and which are in general of a much larger magnitude. These should then as well be taken into account in the calculation of the combined uncertainty $u_c(j)$ and the expanded uncertainty $U(j)$ of each laboratory result:

$$U(j) = k \cdot u_c(j) = k \cdot \sqrt{\sum_{i=1}^{n} [u_i(j)]^2}$$ (5)

$i$ = identification number of all uncertainties considered in each individual laboratory $j$, varying from 1 to $n$, with $n$ not necessarily identical for each laboratory

From this moment on, it can be assumed that all laboratory results are corrected for recognized significant systematic effects, that every effort has been made to identify and quantify them, and that all sources of uncertainty have been estimated and taken into account.

Therefore, all results should in principle overlap and any discrepancies as shown in Fig. 1 should no longer exist (see Fig. 2). At this point, it should however be noted that components of standard uncertainty which are not laboratory specific but which are common to all or to part of the participating laboratories (e.g. those using identical methods) should be considered separately. For this reason it is essential that each laboratory supplies the project leader with a fully detailed uncertainty budget and that these uncertainty budgets are extensively discussed with the experts of all participating laboratories.

[Figure: bar graphs of laboratory means and expanded uncertainties for LAB 01 to LAB 12 and LAB 14, with a final bar labelled EXPANDED UNCERTAINTY, on a scale from 140.0 to 170.0]

Fig. 2 Example of certification by laboratory intercomparison with consideration of combined standard uncertainties

The certified value can then be calculated as either the unweighted or as the weighted mean of the laboratory means. In principle the former should be preferred, but in practice it may be unfair towards some laboratories, as especially type B components of uncertainty may have been evaluated differently from one laboratory to another. The certified uncertainty can be calculated after deconvolution (and later recombination) of all laboratory standard uncertainties in distinct categories of (combined) standard uncertainties, which may be evaluated as type A and/or as type B:

1. Uncertainties which are exclusively laboratory-dependent [$u_c(\mathrm{I})$]
These affect the certified uncertainty interval in such a way that the more laboratories are involved in the intercomparison the smaller their contribution becomes:

$$u_c(\mathrm{I}) = \frac{\sqrt{\sum_{j=1}^{l} [u_c(j)]^2}}{l}$$ (6)
$j$ = laboratory identification number, varying from 1 to $l$
$l$ = total number of laboratories

2. Uncertainties which are common to all laboratories participating in the certification [$u_c(\mathrm{II})$]
These affect the certified uncertainty interval in such a way that their contribution is independent of the number of participating laboratories:

$$u_c(\mathrm{II}) = \sqrt{\sum_{i=1}^{n} [u_i(\mathrm{II})]^2}$$ (7)

$i$ = category II uncertainty identification number, varying from 1 to $n$

Typical examples of this category are the use of a common calibrant by all laboratories or material-related effects such as between-units variation (see "Effect of possible inhomogeneity and instability on the certified uncertainty").

3. Uncertainties in between the two above categories [$u_c(\mathrm{III})$]
These are common to groups of limited numbers of laboratories ($\sum_{q=1}^{g} h_q = l$), such as those using an identical analysis procedure:

$$u_c(\mathrm{III}) = \sqrt{\frac{\sum_{q=1}^{g} h_q \cdot [u_c(q)]^2}{g \cdot l}}$$ (8)

with:

$$u_c(q) = \sqrt{\sum_{i=1}^{n} [u_i(q)]^2}$$ (9)

$q$ = group identification number, varying from 1 to $g$
$g$ = total number of groups
$l$ = total number of laboratories
$h_q$ = number of laboratories in group $q$
$i$ = category III uncertainty identification number in group $q$, varying from 1 to $n$

4. Moreover, as all laboratory means are not completely identical, a residual component [$u(R)$] corresponding to the standard uncertainty of the average of the laboratory means should be considered as well:

$$u(R) = \frac{s_{betw}}{\sqrt{l}}$$ (10)

$s_{betw}$ = standard deviation of the laboratory means
$l$ = total number of laboratories

As already indicated, if the expanded uncertainty of each laboratory is correctly estimated, all laboratory results should overlap. More specifically, one can state that in fact laboratories within the same group should have mean values $\bar{x}(j)$ differing from the mean group value $\bar{x}(q)$ by less than:

$$|\bar{x}(q) - \bar{x}(j)| \le k \cdot u_c(\mathrm{I}, j)$$ (11)

whereby $u_c(\mathrm{I}, j)$ corresponds to the combined category I uncertainty of laboratory $j$, whereas all laboratory mean values $\bar{x}(j)$ should differ from the overall mean value $\bar{x}$ by less than:

[equation not recoverable from the source] (12)

whereby $u_c(\mathrm{III}, q)$ corresponds to the combined category III uncertainty of the group to which laboratory $j$ belongs.

Laboratories whose results do not overlap within these limits are either affected by unrecognized systematic errors and/or by uncertainties that have been underestimated or omitted. Their results should therefore not be considered for certification.

The final uncertainty of the laboratory intercomparison can then be calculated as:

[equation not recoverable from the source]

Effect of possible inhomogeneity and instability on the certified uncertainty

Most frequently the between-units variability resulting from a homogeneity study is not insignificant compared to the uncertainty of the mean value. In addition it is generally preferred to assign a single certified value to all units of the entire CRM batch. Therefore, the uncertainty associated with the (possible) between-units inhomogeneity of the material should be included in the total uncertainty of the CRM. As indicated in [5], this can be done either by basing the CRM uncertainty on the statistical tolerance interval of the homogeneity study or by including the between-units standard uncertainty in the "category II" combined uncertainty [$u_c(\mathrm{II})$] calculated according to Eq. 7.

The within-unit inhomogeneity, on the contrary, should in general not be included in the CRM uncertainty, except if such small sample intakes are used (e.g. in microanalysis techniques) that the sample inhomogeneity becomes significant compared to the certified uncertainty of the CRM. The main difference with between-units homogeneity testings is that if the observed within-unit inhomogeneity is significantly larger than the CRM uncertainty, it is sufficient to recommend the use of a larger sample intake on the basis of the fact that the uncertainty due to material inhomogeneity is inversely proportional to the square root of the mass of the analysed sample [6]. It is on the basis of this property that microanalysis was effectively proposed to determine experimentally the minimum sample mass down to which CRM certificates remain valid [7].

Linear regression and correlation can be used for the prediction of the possible instability of CRMs [8]. Quantitative characteristics expected to decrease (or increase) with time are determined by calculating the time at which the 95% lower (or higher) confidence limit intersects the acceptable lower (or higher) specification limit, i.e. the lower or higher limit of the certified interval. The time so determined may then be considered as the expiration date, as one may be 95% confident that the average value of the batch characteristic will remain within specification until that date. As was the case for the within-unit variation, this possible instability should, in general, not be included in the CRM uncertainty, except if the degradation is significant compared to the certified uncertainty of the CRM. In such cases it might be preferred, rather than to reject the material as CRM, to certify an arbitrarily chosen interval within which the material can be expected to remain stable during a significant period of time, i.e. until the expiry date of the certificate.

Conclusion

The Guide to the expression of uncertainty in measurement provides a framework for assessing uncertainty which can and should be used for the certification of reference materials by laboratory intercomparison. However, as is stated in its paragraph 3.4.8, the following should be noted:
- It cannot substitute for critical thinking, intellectual honesty, and professional skill.
- The evaluation of uncertainty is neither a routine task nor a purely mathematical one and depends on detailed knowledge of the nature of the measurand and of the measurement.
- The quality and utility of the uncertainty quoted for the result of a measurement therefore ultimately depend on the understanding, critical analysis, and integrity of those who contribute to the assignment of its value.

This is particularly the case for the certification of reference materials. The above procedures can be used to obtain an estimation of both the certified value of a reference material and its uncertainty. However, there must be room for critical evaluation of the results by the people and organizations taking up responsibility for the values assigned to a CRM. Therefore it may be common practice in some organizations to increase the calculated uncertainty as it is felt to be optimistic. One should however be careful not to give lower uncertainties just on the basis of the fact that large uncertainty intervals may be interpreted as being the consequence of e.g. an analytical artefact.
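The deconvolution into categories described in Eqs. 5 to 11 can be illustrated with a minimal Python sketch. The function names are hypothetical, the coverage factor k = 2 is an assumed default that the text does not prescribe, and the category III function assumes the reading of Eq. 8 in which the weighted sum is divided by g·l under the square root. The equations for the limit of Eq. 12 and for the final combination are not legible in this copy and are therefore not implemented.

```python
import math
import statistics


def combined(components):
    """Root sum of squares of standard uncertainties (form of Eqs. 5, 7 and 9)."""
    return math.sqrt(sum(u * u for u in components))


def expanded(components, k=2.0):
    """Expanded uncertainty U(j) = k * u_c(j) of one laboratory (Eq. 5).

    k = 2 is an assumed default coverage factor, not a value from the text.
    """
    return k * combined(components)


def u_category_1(lab_uc):
    """Category I (Eq. 6): sqrt(sum of u_c(j)^2) divided by the number of laboratories."""
    return combined(lab_uc) / len(lab_uc)


def u_category_2(common_components):
    """Category II (Eq. 7): components common to all laboratories."""
    return combined(common_components)


def u_category_3(group_uc, group_sizes):
    """Category III (Eq. 8, assumed reading): sqrt(sum of h_q * u_c(q)^2 / (g * l))."""
    g = len(group_uc)
    l = sum(group_sizes)
    weighted = sum(h * u * u for h, u in zip(group_sizes, group_uc))
    return math.sqrt(weighted / (g * l))


def u_residual(lab_means):
    """Residual component (Eq. 10): s_betw / sqrt(l)."""
    return statistics.stdev(lab_means) / math.sqrt(len(lab_means))


def within_limit(group_mean, lab_mean, uc_1_j, k=2.0):
    """Overlap criterion of Eq. 11: |x(q) - x(j)| <= k * u_c(I, j)."""
    return abs(group_mean - lab_mean) <= k * uc_1_j
```

Under the assumed reading of Eq. 8, a single group containing all laboratories gives the same result as a category II component, and one laboratory per group gives the same result as Eq. 6, which is the behaviour the text describes for a category lying between the two.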

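The expiry-date estimate described for instability testing (the time at which the lower confidence limit of a regression line meets the lower specification limit) can be sketched as follows. This is only an illustration under stated assumptions: an ordinary least-squares straight line, the confidence limit of the fitted mean, a Student t quantile `t_crit` supplied by the caller, and a single crossing within `t_max`. The function name and all parameters are hypothetical and are not taken from reference [8].

```python
import math


def shelf_life(times, values, lower_spec, t_crit, t_max=600.0, tol=1e-6):
    """Time at which the lower confidence limit of the fitted line meets lower_spec.

    Returns 0.0 if the limit is already at or below lower_spec at time zero and
    None if no crossing is found up to t_max. Assumes a single crossing.
    """
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    # Residual standard deviation of the fit with n - 2 degrees of freedom.
    ss_res = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, values))
    s = math.sqrt(ss_res / (n - 2))

    def lower_limit(t):
        # Fitted mean minus the confidence half-width of the regression line at t.
        half_width = t_crit * s * math.sqrt(1.0 / n + (t - t_bar) ** 2 / sxx)
        return intercept + slope * t - half_width

    if lower_limit(0.0) <= lower_spec:
        return 0.0
    if lower_limit(t_max) > lower_spec:
        return None
    lo, hi = 0.0, t_max
    while hi - lo > tol:  # bisection on the sign of lower_limit(t) - lower_spec
        mid = 0.5 * (lo + hi)
        if lower_limit(mid) > lower_spec:
            lo = mid
        else:
            hi = mid
    return hi
```

With scattered data the confidence band widens away from the mean measurement time, so the estimated expiry time falls earlier than the point where the fitted line itself reaches the specification limit.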
References

1. Guidelines for the production and certification of BCR reference materials (1997) - document BCR/01/97, European Commission, DG XII-5-C (SMT Programme).
2. Guide to the expression of uncertainty in measurement (1995) ISO, Geneva, ISBN 92-67-10188-9
3. Giacomo P (1987) Metrologia 24:49-50
4. Quantifying uncertainty in analytical measurement, 1st edn (1995) Eurachem, ISBN 0-948926-08-2
5. Pauwels J, Lamberty A, Schimmel H, Homogeneity testing of reference materials, Accred Qual Assur 2:51-55
6. Ingamells CO, Switzer P (1973) Talanta 20:547-568
7. Pauwels J, Vandecasteele C (1993) Fres J Anal Chem 345:121-123
8. Pauwels J, Lamberty A, Schimmel H, Quantification of the expected shelf-life of certified reference materials, Fres J Anal Chem (accepted)
Accred Qual Assur (2000) 5:95-99
© Springer-Verlag 2000

Evaluation of uncertainty of reference materials

Jean Pauwels · Adriaan van der Veen · Andree Lamberty · Heinz Schimmel

Abstract Certification of reference materials is far more than just the characterisation of a selected homogeneous batch of material. From the perspective of the ISO Guide on the Expression of Uncertainty in Measurement (GUM) all uncertainty sources relevant to the user of an individual certified reference material (CRM) sample at a moment in time should be part of the CRM uncertainty. This not only includes the full uncertainty of the batch characterisation (rather than the statistical variation), but also all uncertainties related to possible between-bottle variation, instability upon long-term storage and instability during transport to the customer.

Key words Certified reference materials · Uncertainty · Characterisation · Uncertainty analysis

Presented at: EURACHEM Workshop on Efficient Methodology for the Evaluation of Uncertainty in Analytical Chemistry, Helsinki, Finland, 14-15 June 1999

J. Pauwels (✉) · A. Lamberty · H. Schimmel
Institute for Reference Materials and Measurements, EC-JRC-IRMM, 2440 Geel, Belgium
e-mail: jean.pauwels@irmm.jrc.be
Tel.: +32-14-571722
Fax: +32-14-590406

A. van der Veen
Nederlands Meetinstituut, P.O. Box 654, 2600 AR Delft, The Netherlands

Introduction

The accurate and traceable determination of a mean value of a quantity (content, amount) in a sample or a batch of material can be obtained in various ways, such as carrying out a number of independent repetitions using a primary method of analysis [1], comparing the results of a limited number of reference methods, or comparing the results of various independent methods applied in a series of laboratories. These three different methods are used by various producers to certify the values assigned to their reference materials (RMs), whereby this assignment is done using quite similar statements, but these statements may sometimes have very different meanings. Moreover, it must also be realised that the certification of a RM is much more than just carrying out a series of precise and accurate measurements traceable to the SI or to any other system of units, to written or agreed standards or to an artefact, such as, e.g. the primary WHO materials to which several clinical RMs are traceable. The certification of a RM involves, in the first instance, the preparation of a larger number of homogeneous, stable and adequately packaged samples which are all representative of the complete batch, as well as the proper assessment of their homogeneity and stability. Ignoring this is not only one of the main reasons why problems occur with certified reference materials (CRMs), but also why they are the subject of needless discussions about primary, secondary, consensus, working, etc. RMs. This distinction in classes of RMs mainly exists in the mind of some metrologists, but is fully absent in the existing ISO-REMCO Guides. The latter only differentiate between (just) RMs and CRMs, whereby a RM is defined as "a material or substance one or more of whose property values are sufficiently homogeneous and well established to be used for the calibration of an apparatus,
the assessment of a measurement method, or for assigning values to materials", and a CRM is just "a RM with a certificate in which the certified values are accompanied by an uncertainty at a stated level of confidence" [2].

What is a CRM user interested in?

CRMs are sometimes forced into a hierarchical system depending on the fact that their certified values were determined using a primary method of analysis or are based on "less traceable" measurements obtained in a laboratory intercomparison. In reality, such a differentiation is meaningless, considering that very often the uncertainty component which originates from the characterisation of the RM is dominated by uncertainty components originating from several other sources such as insufficient guarantee of absence of inhomogeneity and/or instability. Therefore, it is not correct when producers certify their RMs just considering the results of their accurate and traceable determinations of the mean value of the content of the CRM batch, knowing that their customers (users) are only interested in the mean value of the single bottle they ordered on condition that it is received on the day of dispatch.

The ongoing revision of the ISO-Guide 35 [3] - which constitutes a complete rewriting - is therefore a unique opportunity to reconsider the production of CRMs. It will consider production as an integrated process of correct preparation, positive demonstration of homogeneity and stability, and accurate and traceable characterisation, and thus of full implementation of the principles laid down in the Guide to the Expression of Uncertainty in Measurement (GUM) [4]. This means that all components of uncertainty of "the sample on the desk of the user" should be properly evaluated and accounted for. Thereby, it must be strongly emphasised that the inability to demonstrate between-bottle variation or instability during storage or transportation, as well as confining the uncertainty of the batch characterisation to the statistical between-laboratory variation, is no longer acceptable. Ignoring this is one of the major causes of the so-called "Jorhem paradox" discussed at BERM-7 [5] where it was (rightly) found unacceptable that "results found to be unacceptable for user laboratories are good enough to be used in the certification of the CRM", even if it is statistically just logic [6]! The consequence is, however, that one will have to accept - just as was the case for testing laboratories introducing GUM - that uncertainties of CRMs will increase "from fiction to reality": an idea which is apparently difficult for many analysts to become accustomed to, and which, moreover, may confuse those who tend to compare the quality of the CRMs of various producers just on the basis of the quoted uncertainty.

Uncertainty analysis in the preparation of a CRM

From the reasoning given above, it becomes apparent that the certification of a RM includes far more than just the characterisation of the material. This step, often carried out as a collaborative study between multiple laboratories, is crucial for the quality of the material as a CRM, but it is generally insufficient.

From the perspective of the EURACHEM Guide [7] as well as from GUM [4], a producer should include all uncertainty sources that are relevant to the package sold to the customer. Internal consistency of the uncertainty analysis requires the inclusion of the (residual) uncertainty from the experiments carried out for homogeneity and stability testing. So, even if the producer cannot demonstrate any inhomogeneity or instability, there is still a (small) uncertainty budget to be included. Usually, this budget will be small, but in cases where only poorly repeatable methods of measurement are available, this contribution may be of significance.

A further consequence of this is that it really "pays off" in terms of uncertainty if a sufficient number of replicate measurements is carried out in homogeneity and stability testing. The use of methods with good linearity, selectivity and repeatability will also greatly contribute to reducing the uncertainty from these experiments. These factors are all in the hands of the producer. Implementing them correctly and consistently will reduce the costs of "after sales" of a CRM producer, not to speak of the subsequent damage due to wrongly certified RMs.

This way of thinking may seem new, but those who have already gained experience with inhomogeneous and/or unstable RMs have already developed ways to deal with these aspects. A CRM producer should include in an uncertainty statement everything that "reasonably attributes" (GUM) to the uncertainty of the measurand, i.e. the property value to be certified. This ends where accidents and incidents start: if something happens to a CRM during transport that goes beyond what can be foreseen, it is not part of an uncertainty statement, as the information on the certificate will stipulate under what conditions the certificate (and the CRM) are valid.

What is important in the preparation of a CRM?

Good measurements carried out on bad quality candidate RMs are a nonsense and a complete waste of time and money! Therefore, extreme care should be taken not only to prepare a stable and homogeneous base material, but also to sample it in a tight and inert containment [8]. Matrix CRMs require in general to be clean and dry, to be transformed into an optimal physical and chemical form, and to be stored at the correct temperature from a very early stage in the production process. In general, microbiological degradation can be minimised by reducing the water content of the material to a level between 1 and 3%. Packaging is best carried out in an atmosphere of argon - not under vacuum as this may become a source of leaks - whereby all precautions must be taken to guarantee absolute tightness. This can be achieved using bottles with inserts, penicillin vials or ampoules, whereby it must be stressed that all three solutions have failed in the past: bottles and vials due to insufficiently tight or retracting inserts (e.g. due to ageing or freeze-temperature effects) or ampoules due to cracks appearing during storage as a consequence of stresses present in the glass.

What is important in homogeneity testing?

Homogeneity testing addresses a double problem: What is the variation in mean value which exists between the various units of a batch of candidate RM? And, how inhomogeneous is the material contained in a bottle?

The first problem is of utmost importance to the user as he/she will, in general, buy just one bottle, and will not care about the other ones! Therefore, between-units variation is an important component of uncertainty which must be included in the certified value of the CRM. The determination of the between-units variation is carried out by measuring the value of a significant number of units. As the result of such measurements is a combination of two effects, the between-bottle variability [$s_{bb}$] and the measurement repeatability [$s_{meas}$]

[equation not recoverable from the source] (1)

the variation between the mean value of the bottles can only be obtained from measurements carried out with the highest repeatability: i.e. that each bottle must be analysed, using a highly repeatable method, on sample intakes of optimal size and carrying out a number of repetitions which is sufficient to obtain a measurement uncertainty which is negligible compared to the variation between the bottles, i.e. $s_{meas}^2 \ll s_{bb}^2$. Usually, this is however not the case. Then, $u_{bb}^2$ should, as far as possible, be corrected for $s_{meas}^2$ to obtain the best estimate of $s_{bb}^2$ [9].

To evaluate the inhomogeneity of the material contained in a bottle, within-bottle measurements have to be carried out. Also here, the result of such measurements is a combination of two effects, the within-bottle inhomogeneity [$s_{inh}$] and the method repeatability [$s_{meth}$]

[equation not recoverable from the source] (2)

and the variation between the different samples within a bottle can only be obtained from measurements carried out using a highly repeatable method so that the method repeatability is negligible compared to the variation between the samples in a bottle, i.e. $s_{meth}^2 \ll s_{inh}^2$. In this case, sample intakes must however be minimal, as the contribution of $s_{meth}^2$ to $u_{inh}^2$ becomes negligible when extrapolating $s_{inh}^2$ from smaller ($m$) to larger ($M$) sample sizes according to:

$$[s_{inh}^2]_M = [s_{inh}^2]_m \cdot m/M \approx [u_{inh}^2]_m \cdot m/M$$ (3)

It must be emphasised that $s_{inh}$ is irrelevant for the CRM uncertainty, provided the minimum representative sample intake is properly determined. The value of $s_{inh}$ is, however, of prime importance to estimate this minimum representative sample intake correctly [10].

In both cases it should be noted that:
- Not correcting $u_{bb}$ or $u_{inh}$ for $s_{meas}$ or $s_{meth}$ is not really a problem, but leads to (too) conservative CRM uncertainty estimates.
- Corrected $s_{bb}$ or $s_{inh}$ values may never be taken smaller than their respective combined uncertainties, i.e. $u(s_{bb})$ and $u(s_{inh})$ [9].

What to do with stability data?

Stability testing at higher temperatures simulating possible transport conditions and conditions of long-term storage is often part of procedures describing the production of CRMs [11]. In most cases it does not, however, give quantitative information on presumed instability, mainly as a consequence of insufficient measurement reproducibility and of an insufficient number of replicates. With the upcoming requirements of fixing expiry dates [12], it will be mandatory that not only quantitative data be available, but that their quality is such that high precision extrapolations can be made. This requires however that data are produced with measurement reproducibilities (or repeatabilities when isochronous measurements are carried out [13]) which are negligible compared to the certified uncertainty. An extrapolation method was recently proposed by Pauwels et al. [14] to determine the time for which the certified value of a CRM remains valid, based on the determination of the intersection of the lower 95% confidence bound with the lower limit of the certified confidence interval (see Fig. 1). Such calculations show however that, with the levels of uncertainty presently certified, either unrealistically high precisions are required, or that shelf-lives must be reduced to unrealistically short periods of time, even if one considers that further stability monitoring during the lifetime of the CRM makes regular re-evaluation and updating of the shelf-life possible. Therefore, in many cases, it may become necessary to re-evaluate the certified uncertainties of
RMs taking into account a realistic stability uncertainty. Possibly, other approaches may be found to solve this extremely important problem, such as the one proposed by a group of experts working in the framework of a "Standards, Measurements and Testing Accompanying Measure" under the co-ordination of LGC (S. Burke, personal communication), consisting in extrapolating the certified value to mid-way of an arbitrarily chosen life-time and calculating the associated supplemental uncertainty.

A similar reasoning may be appropriate for possible degradation of the CRM during transportation to the customer.

[Figure: data points plotted against time (months, 0 to 60) on a vertical scale from 0.9 to 1.1]

Fig. 1 Example of determination of the long-term stability of certified reference material (CRM): Cr in CRM 278R (mussel tissue)

The characterisation of a homogeneous batch of material

The estimation of the mean value of a quantity of a CRM batch using: (1) a primary method of analysis, or (2) by comparing the results of a limited number of reference methods, or (3) the results of various independent methods applied in a series of laboratories should, in fact, only be variants of one and the same philosophy. The third characterisation method, however, requires that a number of analyses are carried out by one or more techniques in one or more laboratories, whereby each series of measurements is carried out with maximal guarantees of accuracy and traceability, and must be documented by a full uncertainty budget. For each set of determinations an expanded standard uncertainty according to GUM should then be calculated. The final estimation of the uncertainty of the characterisation of the batch ($u_{char}$) should then take into account all these standard uncertainties, considering that those uncertainties which have been repeatedly determined in an independent way, decrease proportionally with the square root of the number of degrees of freedom. A proposal to handle this problem was published by Pauwels et al. [15]. It is based on a separate consideration of three types of standard uncertainties:
- Those which are exclusively laboratory dependent.
- Those which are common to all laboratories, such as the effect of between-bottle variation or the use of a common calibrant.
- Those which are common to groups of laboratories, e.g. those using the same measurement procedure.

In this context it should be noted that matrix CRMs are generally certified for mass fractions related to dry matter, i.e. that not only the amount of substance but also the dry sample mass has to be assessed and its uncertainty evaluated: a problem that is ignored and/or underestimated by many analytical chemists and a potential source of significant errors and unaccounted uncertainties in CRMs.

The CRM uncertainty according to GUM

The final uncertainty of a CRM according to GUM should consider all sources of uncertainty described above:

$$u_{CRM} = \left[ u_{char}^2 + u_{bb}^2 + u_{lts}^2 + u_{sts}^2 \right]^{1/2}$$ (4)

whereby lts and sts refer to long-term stability (upon storage) and short-term stability (during transport), respectively.

It is good practice to quantitatively determine all sources of uncertainty, be they significant or not. In the latter case they will anyhow disappear in the rounding-off of the calculation, but it will:
- Avoid the risk of overlooking sources of uncertainty due to ignorance.
- Demonstrate to users that they have been considered and what is their magnitude.
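The combination in Eq. 4 can be written as a short Python sketch. The function names are hypothetical; the second function is an illustrative addition, not part of the source, that reports the share of each source in the combined variance, in the spirit of showing users the magnitude of every contribution even when it is insignificant.

```python
import math


def u_crm(u_char, u_bb, u_lts, u_sts):
    """Combined CRM uncertainty (Eq. 4): root sum of squares of the four sources."""
    return math.sqrt(u_char ** 2 + u_bb ** 2 + u_lts ** 2 + u_sts ** 2)


def variance_shares(u_char, u_bb, u_lts, u_sts):
    """Fraction of the combined variance contributed by each source (illustrative)."""
    squares = {
        "char": u_char ** 2,  # batch characterisation
        "bb": u_bb ** 2,      # between-bottle variation
        "lts": u_lts ** 2,    # long-term stability (storage)
        "sts": u_sts ** 2,    # short-term stability (transport)
    }
    total = sum(squares.values())
    return {name: value / total for name, value in squares.items()}
```

A source whose share is close to zero still appears in the budget, which makes it visible that it was evaluated and not overlooked.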

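The correction of homogeneity data for method repeatability and the extrapolation to larger sample intakes (Eq. 3) can be sketched in the same way. Eqs. 1 and 2 are not legible in this copy, so the sketch assumes, loudly, that they have the usual additive form u² = s² + s_repeat², which is consistent with the statement that u² should be corrected for the repeatability variance; this form is an assumption, not a quotation. The function names and parameters are hypothetical.

```python
import math


def corrected_sd(u_observed, s_repeat, u_of_s):
    """Inhomogeneity standard deviation corrected for method repeatability.

    Assumes u_observed^2 = s^2 + s_repeat^2 (assumed form of Eqs. 1 and 2).
    The result is never taken smaller than u_of_s, the uncertainty of the
    corrected value, following the rule stated in the text.
    """
    variance = u_observed ** 2 - s_repeat ** 2
    s = math.sqrt(variance) if variance > 0.0 else 0.0
    return max(s, u_of_s)


def scale_to_mass(s_inh_small, m, M):
    """Eq. 3: the variance scales with m / M, so the standard deviation scales with sqrt(m / M)."""
    return s_inh_small * math.sqrt(m / M)
```

When the repeatability is larger than the observed spread, the subtraction would give a negative variance; the sketch then falls back on the floor value rather than reporting zero inhomogeneity.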
References

1. Quinn TJ (1997) Metrologia 34:61-65
2. ISO Guide 30 (1981) Terms and definitions used in connection with reference materials. ISO, Geneva, Switzerland
3. ISO Guide 35 (1989) Certification of reference materials - General and statistical principles. ISO, Geneva, Switzerland
4. ISO (1995) Guide to the expression of uncertainty in measurement. ISO, Geneva, ISBN 92-67-10188-9
5. Jorhem L (1998) Fresenius J Anal Chem 306:370-373
6. Pauwels J (1999) In: Fajgelj A, Parkany M (eds) The use of matrix reference materials in environmental analytical processes. The Royal Chemical Society, London, pp 31-45
7. EURACHEM (1995) Quantifying uncertainty in analytical measurement. EURACHEM, London, ISBN 0-948926-08-2
8. Kramer GN, Pauwels J (1996) Mikrochim Acta 123:87-93
9. Pauwels J, Lamberty A, Schimmel H (1998) Accred Qual Assur 3:51-55
10. Pauwels J, Vandecasteele C (1993) Fresenius J Anal Chem 345:121-123
11. European Commission: DG XII-C-5 - document BCR/01/97 (1997) Guidelines for the production and certification of BCR reference materials. European Commission, Brussels
12. ISO Guide 31 (1998) Reference materials - Contents of certificates and labels (draft). ISO, Geneva, Switzerland
13. Lamberty A, Schimmel H, Pauwels J (1998) Fresenius J Anal Chem 360:359-361
14. Pauwels J, Lamberty A, Schimmel H (1998) Fresenius J Anal Chem 361:395-399
15. Pauwels J, Lamberty A, Schimmel H (1998) Accred Qual Assur 3:180-184
Accred Qual Assur (2002) 7:90-94
DOI 10.1007/s00769-001-0434-y

© Springer-Verlag 2002

Should non-significant bias be included in the uncertainty budget?

Alicia Maroto
Ricard Boqué
Jordi Riu
F. Xavier Rius

Abstract The bias of an analytical procedure is calculated in the assessment of trueness. If this experimental bias is not significant, we assume that the procedure is unbiased and, consequently, the results obtained with this procedure are not corrected for this bias. However, when assessing trueness there is always a probability of incorrectly concluding that the experimental bias is not significant. Therefore, non-significant experimental bias should be included as a component of uncertainty. In this paper, we have studied if it is always necessary to include this term and which is the best approach to include this bias in the uncertainty budget. To answer these questions, we have used the Monte-Carlo method to simulate the assessment of trueness of biased procedures and the future results these procedures provide. The results show that non-significant experimental bias should be included as a component of uncertainty when the uncertainty of this bias represents at least 30% of the overall uncertainty.

Keywords Bias · Uncertainty · Assessment of trueness

A. Maroto (✉) · R. Boqué · J. Riu · F. X. Rius
Department of Analytical and Organic Chemistry, Institute of Advanced Studies, Rovira i Virgili University of Tarragona, Pl. Imperial Tàrraco, 1, 43005 Tarragona, Catalonia, Spain.
e-mail: maroto@quimica.urv.es
Tel.: +34-977-558187
Fax: +34-977-559563

Introduction

One of the most important steps in the validation of an analytical procedure is the assessment of trueness. In this process, the experimental bias of the analytical procedure is estimated. If this bias is statistically not significant, we assume that the procedure is unbiased and, consequently, results are not corrected for the experimental bias. However, can we be sure that the procedure does not have any bias? In fact, when assessing trueness, there is always a probability of incorrectly concluding that the experimental bias is statistically not significant. As a result, this probability should be included (expressed as a quantity related to the experimental bias) as a component of the uncertainty of the results obtained with the analytical procedure. However, several questions arise, i.e. is it always necessary to include this component of the uncertainty? Moreover, if it is necessary, how should this non-significant experimental bias be included?

Non-significant experimental bias has not been included so far as a component of uncertainty in chemical measurements. However, different approaches have been proposed to include bias as a component of uncertainty when physical measurements are not corrected for systematic errors [1]. In this paper we study whether these approaches can be applied to include non-significant experimental bias in the uncertainty budget of chemical measurements and whether it is always necessary to include this term. To answer these questions, we have simulated the process of assessment of trueness of biased analytical procedures and, subsequently, the future results these procedures provide. We simulated these results covering most of the possible situations that may happen in practice.

Assessment of trueness

Checking the trueness of an analytical procedure involves estimating its experimental bias. If the routine samples have similar levels of concentration, we can assume that we have the same bias in the whole concentration range and, consequently, the experimental bias can be estimated using one reference sample with a concentration similar to the routine samples. If this is the case, the experimental bias is calculated as the difference between the reference value, $c_{ref}$, and the mean value found, $c_{found}$, i.e. $bias = c_{ref} - c_{found}$. The experimental bias is not significant if:

$$|bias| \le t_{\alpha/2,eff} \cdot u(bias) \qquad (1)$$

where $t_{\alpha/2,eff}$ is the two-sided t tabulated value for the effective degrees of freedom, $\nu_{eff}$ [2], associated with $u(bias)$, and can be replaced by the coverage factor k if the effective degrees of freedom are large enough [3,4]. The uncertainty of the experimental bias, $u(bias)$, depends on the reference used to assess trueness. If a certified reference material (CRM) is used, this uncertainty is calculated as:

$$u(bias) = \sqrt{\frac{s_I^2}{p} + u(c_{ref})^2} \qquad (2)$$

where $s_I$ is the standard deviation of the p results obtained when analysing the CRM and $u(c_{ref})$ is the standard uncertainty of the CRM (i.e. $U(c_{ref})/k$, where k is normally equal to 2 and $U(c_{ref})$ is the uncertainty of the CRM provided by the manufacturer).

If the experimental bias is significant, the procedure should subsequently be revised in order to identify and eliminate the systematic errors which produced the bias. Otherwise, we assume that the procedure is unbiased and, consequently, we do not correct results for the experimental bias. However, several questions arise in this latter case because, from a chemical point of view, some bias is always to be expected in an analytical procedure.

Calculation of uncertainty

Uncertainty can be obtained either by calculating all the sources of uncertainty individually [3,4] or by grouping different sources of uncertainty whenever possible [5-9]. In this paper, the latter strategy is followed to calculate uncertainty using information obtained in the process of assessment of trueness [5, 6]:

$$u = \sqrt{u(proc)^2 + u(trueness)^2 + u(pret)^2 + u(other\ terms)^2} \qquad (3)$$

where u is the standard uncertainty [3], $u(proc)$ is the uncertainty of the procedure and corresponds to the intermediate standard deviation of the procedure, i.e. $u(proc) = s_I$. $u(trueness)$ is the uncertainty of the experimental bias and corresponds to $u(bias)$. $u(pret)$ is the uncertainty associated with subsampling and with sample pretreatments not considered in the assessment of trueness. Finally, $u(other\ terms)$ considers other terms of uncertainty due to factors not representatively varied when estimating precision. In this paper, this latter term will be considered to be negligible. The overall expanded uncertainty, U, is then calculated by multiplying the standard uncertainty, u, by the two-sided t tabulated value, $t_{\alpha/2,eff}$, for the effective degrees of freedom, $\nu_{eff}$ [2], i.e. $U = t_{\alpha/2,eff} \cdot u$. A coverage factor of k=2 is recommended for most purposes when the effective degrees of freedom, $\nu_{eff}$, are large enough. This value represents a level of confidence of approximately 95%. Strictly, the uncertainty calculated in Eq. 3 corresponds to results of future samples obtained after correcting the concentration found for the experimental bias. However, analytical results are never corrected for non-significant experimental bias. As a result, this bias should be included as a component of uncertainty because the procedure may have a true bias.

Approaches for including non-significant experimental bias in the uncertainty budget

Different approaches have been proposed in the field of physical measurements to include bias as a component of uncertainty when results are not corrected for systematic errors [1]. In this paper, we will study whether these approaches can be applied to include non-significant experimental bias in chemical measurements. The first approach consists of including this bias as another component of uncertainty and simply adding it in the usual root-sum-of-squares (RSS) manner, i.e.

$$U(RSSu) = t_{\alpha/2,eff} \cdot \sqrt{u^2 + bias^2}.$$

The second approach sums this bias in an RSS manner with the expanded uncertainty, U, i.e.

$$U(RSSU) = \sqrt{U^2 + bias^2}.$$

The third procedure consists of adding this bias to the expanded uncertainty. This approach is denoted as SUMU [1] and is equivalent to correcting the results:

$$U_+ = U + bias; \quad U_- = U - bias. \qquad (4)$$

Finally, the last procedure to be studied consists of adding the absolute value of the experimental bias to the expanded uncertainty, i.e.

$$U(bias) = U + |bias|.$$

Numerical example: bias in the assessment of trueness

To investigate the effects of including the non-significant experimental bias as a component of uncertainty, the
process of assessment of trueness was simulated with the Monte-Carlo method (see Fig. 1). A true reference value, $c_{ref,true}$, together with its standard uncertainty, $u(c_{ref})$, was assigned to a "hypothetical CRM". A true bias, $\delta_{procedure}$, together with a true intermediate standard deviation, $\sigma_I$, was associated to the analytical procedure. Moreover, the possibility of having other steps in the analytical procedure (i.e. pretreatments and/or subsampling) not carried out when analysing the CRM was also studied. In this case, an additional true standard deviation, $\sigma_{pret}$, was also added to the future results obtained with the analytical procedure.

In the assessment of trueness, we simulated the certified reference value, $c_{ref}$, and the p results obtained when analysing the CRM. The true value and the standard deviation used to simulate the p results were, respectively, $c_{ref,true} + \delta_{procedure}$ and $\sigma_I$. Results were simulated assuming that they followed a normal distribution. After this, we calculated the mean of these results, $c_{found}$, and the experimental bias. The uncertainty of this bias, $u(bias)$, was calculated with Eq. 2. Afterwards, it was checked with Eq. 1 whether this bias was statistically significant or not. If it was not significant, a future concentration obtained for a routine sample, $c_{fut}$, was simulated with the Monte-Carlo method. The true value and the true variance used to simulate this result were, respectively, $c_{fut,true}$ and $\sigma_{fut}^2$. The variance, $\sigma_{fut}^2$, corresponded to $\sigma_I^2 + \sigma_{pret}^2$. After this, we calculated the uncertainty of this future result following the section on calculating uncertainty. The experimental bias was included as a component of uncertainty using the four approaches explained in the section following that one. Then we checked whether these approaches included the true concentration of the routine sample, $c_{fut,true}$, within the interval $c_{fut} \pm$ Uncertainty.

Fig. 1 Scheme of the process of assessment of trueness simulated with the Monte-Carlo method. Future results of routine samples were simulated if the experimental bias was identified as non-significant

This process was simulated 300,000 times for 25 different values of $\delta_{procedure}$. After this, we calculated the percentage of times that the experimental bias was found to be non-significant. This corresponded to the probability of β error (or probability of false negative) because in the assessment of trueness we state that the method is unbiased when in fact it is biased. We also calculated the percentage of times that, once the experimental bias was identified as non-significant, the different approaches studied included the true concentration of the routine sample, $c_{fut,true}$, within the interval $c_{fut} \pm$ Uncertainty. We simulated this process for three different cases which cover different situations that may happen in practice (i.e. presence/absence of pretreatment steps and different number of replicated analyses of the CRM). Table 1 shows, for the three cases, the values of $\sigma_I$, $\sigma_{pret}$, $u(c_{ref})$ and p used for simulating the results.

Table 1 Three different situations simulated in the assessment of trueness

         p    u(c_ref)   σ_I   σ_pret   u      u(bias)   u(bias)/u · 100%
Case 1   5    0.2        1.5   0        1.64   0.70      43
Case 2   10   0.2        1.5   0        1.59   0.51      32
Case 3   10   0.2        1.5   3        3.39   0.51      15

Results and discussion

Table 2 shows the probability of β error committed for different values of the true bias of the analytical procedure, $\delta_{procedure}$, for the three cases studied. We can see that the higher $\delta_{procedure}$ is, the lower the probability of β error. This is because the higher the true bias is, the more likely it is that the procedure is detected as biased. The probability of β error depends also on $\sigma_I$, $u(c_{ref})$ and p. We can see that for the same values of $\sigma_I$, $u(c_{ref})$ and $\delta_{procedure}$, the lower p is, the higher the probability of β error.

Table 2 True bias of the analytical procedure, δ_procedure, and percentage of times that the experimental bias is identified as non-significant, i.e. % β error, for the three cases described in Table 1

Case 1                      Cases 2 and 3
δ_procedure   % β error     δ_procedure   % β error
0             95            0             95
0.4           91            0.4           88
0.8           79            0.8           66
1.2           60            1.2           36
1.6           37            1.4           22
2             18            1.6           13
2.4           7             1.8           6

We also studied the percentage of times that, once the experimental bias was identified as being non-significant, the different ways of calculating uncertainty included the true concentration of the routine sample, $c_{fut,true}$, within the interval $c_{fut} \pm$ Uncertainty. Uncertainty was calculated using the z-value for a level of significance α=5%. Therefore, if the uncertainty is correctly calculated, this percentage, i.e. % traceable future results, should be 95% (i.e. 100-α%). If uncertainty is underestimated, this percentage is lower than 95% and, if it is overestimated, it is higher than 95%. This percentage was calculated and plotted as a function of the β error committed in the assessment of trueness.

Fig. 2 Percentage of traceable future results for case 1 versus the probability of β error (x-axis: % β error). Uncertainty is calculated without including non-significant experimental bias, U, and with its inclusion using the four approaches described for including non-significant experimental bias in the uncertainty budget: U(RSSu), U(RSSU), SUMU and U(bias)

Fig. 3 Percentage of traceable future results for cases 2 and 3 versus the probability of β error (x-axis: % β error). Uncertainty is calculated without including the experimental bias, U, and with the SUMU approach, for case 2 and for case 3

Figure 2 shows these results for case 1. In this case the contribution of $u(bias)$ to the overall uncertainty, $u(bias)/u$, is 43%. We see that uncertainty can be greatly
underestimated when the experimental bias is not included as a component of uncertainty. This underestimation depends on the ratio $u(bias)/u$: the higher this ratio is, the greater the underestimation of uncertainty. The best approach to include non-significant experimental bias is the SUMU approach because it gives the percentage of traceable future results closest to 95%. The uncertainty, U(bias), is also a good approach for including this bias. However, this approach gives higher uncertainty values than the SUMU approach. The U(RSSU) and the U(RSSu) uncertainties are clearly inferior for including this bias because they overestimate uncertainty for higher probabilities of β errors and underestimate uncertainty for lower probabilities of β errors. Moreover, these approaches give higher uncertainty values than the SUMU approach.

Figure 3 shows the percentage of traceable results versus the percentage of β error for case 2 (i.e. 32% of contribution of $u(bias)$ to the overall uncertainty) and for case 3 (i.e. 15% of contribution). In this figure, uncertainty is calculated without including the experimental bias and with the SUMU approach. We see that the experimental bias should be included in case 2 but this is not necessary in case 3. This is because in case 3 the uncertainty of the experimental bias is negligible when compared to the overall uncertainty.

Conclusions

Non-significant experimental bias should be included in the uncertainty budget when the uncertainty of this bias represents about 30% or more of the overall uncertainty. The higher this contribution is, the more important it is to include the non-significant experimental bias. In contrast, it is not necessary to include this bias when its uncertainty has a low contribution to the overall uncertainty, i.e. 15% or lower. The best approach for including this bias is the SUMU approach. The uncertainty U(bias) also gives good results and, unlike the SUMU approach, has the advantage that it gives a symmetric confidence interval around the estimated result. However, it gives higher uncertainty values than the SUMU approach.
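The simulation scheme described in this paper can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program: the function name `simulate`, the arbitrary true concentration of 100, the reduced number of runs and the fixed coverage factor z = 1.96 (used both in the significance test of Eq. 1 and for the expanded uncertainty, instead of a t-value with effective degrees of freedom) are all choices made here. The pretreatment uncertainty u(pret) is also assumed to be known and equal to the true value. The percentages it prints therefore need not coincide with the tabulated ones.

```python
import math
import random

Z = 1.96  # two-sided z-value for alpha = 5 % (assumed in place of t)


def simulate(delta, p, u_cref, sigma_i, sigma_pret, runs=20000, seed=1):
    """Return (% beta error, % traceable future results per approach)."""
    rng = random.Random(seed)
    c_true = 100.0  # arbitrary true value for CRM and routine sample
    hits = dict.fromkeys(("U", "RSSu", "RSSU", "SUMU", "U(bias)"), 0)
    accepted = 0
    for _ in range(runs):
        # Assessment of trueness on the hypothetical CRM.
        c_ref = rng.gauss(c_true, u_cref)
        results = [rng.gauss(c_true + delta, sigma_i) for _ in range(p)]
        c_found = sum(results) / p
        s_i = math.sqrt(sum((r - c_found) ** 2 for r in results) / (p - 1))
        bias = c_ref - c_found
        u_bias = math.sqrt(s_i ** 2 / p + u_cref ** 2)           # Eq. 2
        if abs(bias) > Z * u_bias:                               # Eq. 1
            continue  # bias found significant: procedure is revised
        accepted += 1
        # Future routine result, still affected by the true bias.
        c_fut = rng.gauss(c_true + delta, math.hypot(sigma_i, sigma_pret))
        u = math.sqrt(s_i ** 2 + u_bias ** 2 + sigma_pret ** 2)  # Eq. 3
        U = Z * u
        # Interval limits for (c_true - c_fut) under each approach.
        rss_u = Z * math.hypot(u, bias)
        rss_U = math.hypot(U, bias)
        intervals = {
            "U": (-U, U),
            "RSSu": (-rss_u, rss_u),
            "RSSU": (-rss_U, rss_U),
            "SUMU": (-(U - bias), U + bias),  # U- = U - bias, U+ = U + bias
            "U(bias)": (-(U + abs(bias)), U + abs(bias)),
        }
        error = c_true - c_fut
        for name, (lower, upper) in intervals.items():
            if lower <= error <= upper:
                hits[name] += 1
    beta = 100.0 * accepted / runs
    coverage = {
        name: (100.0 * count / accepted if accepted else float("nan"))
        for name, count in hits.items()
    }
    return beta, coverage


if __name__ == "__main__":
    # Settings of case 2 in Table 1 for three values of the true bias.
    for delta in (0.0, 0.8, 1.6):
        beta, coverage = simulate(delta, p=10, u_cref=0.2,
                                  sigma_i=1.5, sigma_pret=0.0)
        summary = ", ".join(f"{k}: {v:.1f}" for k, v in coverage.items())
        print(f"delta={delta}: beta={beta:.1f} %; traceable % -> {summary}")
```

Because the same simulated results are scored against every interval, the approaches can be compared directly: the symmetric intervals are nested (U within U(RSSU) within U(bias)), whereas SUMU shifts the interval by the experimental bias rather than widening it.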

References
1. Phillips SD, Eberhardt KR, Parry B (1997) J Res Natl Inst Stand Technol 102:577-585
2. Satterthwaite FE (1941) Psychometrika 6:309-316
3. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement, ISO, Geneva
4. EURACHEM (1995) Quantifying uncertainty in analytical measurements, EURACHEM Secretariat, P.O. Box 46, Teddington, Middlesex, TW11 0LY, UK
5. Maroto A, Riu J, Boqué R, Rius FX (1999) Anal Chim Acta 391:173-185
6. Maroto A, Boqué R, Riu J, Rius FX (1999) Trends Anal Chem 18/9-10:577-584
7. Ellison SLR, Williams A (1998) Accred Qual Assur 3:6-10
8. Barwick VJ, Ellison SLR (2000) Accred Qual Assur 5:47-53
9. EURACHEM/CITAC Guide (2000) Quantifying uncertainty in analytical measurement, EURACHEM, 2nd Edition. Helsinki
Accred Qual Assur (2002) 7:269-273
DOI 10.1007/s00769-002-0485-8

© Springer-Verlag 2002

Evaluation of measurement uncertainty for analytical procedures using a linear calibration function

Lutz Brüggemann
Rainer Wennrich

Abstract In the EURACHEM/CITAC draft "Quantifying uncertainty in analytical measurement" estimations of measurement uncertainty in analytical results for linear calibration are given. In this work these estimations are compared, i.e. the uncertainty deduced from repeated observations of the sample vs. the uncertainty deduced from the standard residual deviation of the regression. As a result of this study it is shown that an uncertainty estimation based on repeated observations can give more realistic values if the condition of variance homogeneity is not correctly fulfilled in the calibration range. The complete calculation of measurement uncertainty including assessment of trueness is represented by an example concerning the determination of zinc in sediment samples using ICP-atomic emission spectrometry.

Keywords Measurement uncertainty · Linear calibration · Method validation

L. Brüggemann (✉) · R. Wennrich
UFZ Centre for Environmental Research Leipzig-Halle, Department of Analytical Chemistry, Permoserstrasse 15, 04318 Leipzig, Germany
e-mail: bruegge@ana.ufz.de
Tel.: +49-341-2352512
Fax: +49-341-2352625

Introduction

For some years international guidelines and recommendations [1,2] have existed for the evaluation of measurement uncertainty of analytical procedures. In the draft of the EURACHEM/CITAC Guide [2] two formulas for estimating uncertainty are indicated relating to analytical procedures with linear calibration. Application of these different formulas is only briefly described in the Guide. The aim of this study was to discuss the use of these estimations for the purpose of method validation. Moreover, the calculation of measurement uncertainty in a closed form using an analytical task is demonstrated.

The linear regression model

$$y = b_0 + b_1 x \qquad (1)$$

with the response y of the measuring system to the analyte content x of the measured component is applied. The regression coefficients $b_0$ and $b_1$ are estimated from the calibration data set $\{x_i, y_i\}$ and the (predicted) content is calculated from the observed response with the help of the inverse function

$$x_{pred} = \frac{y_{obs} - b_0}{b_1} \qquad (2)$$

The analytical procedure (sample preparation, measuring process) is regarded as a whole. On condition that the calibration is statistically justified [3] (in particular, the variance homogeneity must hold within the calibration range and the regression model must be suitable) and that the trueness (referring to the ISO 5725 standard [4]) of the process can be checked from recovery of the analyte content of a reference material, the measurement uncertainty belonging to $x_{pred}$ can be determined.

The two estimations of the variance of $x_{pred}$, indicated in [2], are calculated from different inputs. For given variances/covariances of the inputs $y_{obs}$, $b_0$ and $b_1$, the variance of $x_{pred}$ is estimated by

$$var(x_{pred}) = \frac{1}{b_1^2} \left( var(y_{obs}) + x_{pred}^2\, var(b_1) + 2\, x_{pred}\, cov(b_0, b_1) + var(b_0) \right) \qquad (3)$$

(Eq. 3 marked in [2] as E3.3). A second estimation
$$var(x_{pred}) = \frac{s_{y.x}^2}{b_1^2} \left[ \frac{1}{p} + \frac{1}{n} + \frac{(x_{pred} - \bar{x}_{cal})^2}{\sum_i (x_i - \bar{x}_{cal})^2} \right] \qquad (4)$$

is based on the calibration data (Eq. 4, E3.5 in [2]). It concerns the well-known formula for the variance estimation of an average value predicted from p repetitions (n number of calibration levels, cal symbolizes here that the average values belong to the calibration data set, and $s_{y.x}^2$ defines the residual variance of the regression model).

Comparison of the estimations of measurement uncertainty

With the help of the relations $var(b_1) = \dfrac{s_{y.x}^2}{\sum_i (x_i - \bar{x}_{cal})^2}$ and $var(b_0) = s_{y.x}^2 \left[ \dfrac{1}{n} + \dfrac{\bar{x}_{cal}^2}{\sum_i (x_i - \bar{x}_{cal})^2} \right]$, Eqs. (3) and (4) are identical if $s_{y.x}^2/p$ from Eq. (4) is used as an estimate of $var(y_{obs})$ in Eq. (3) or vice versa.

Sometimes the available analytical quality assurance software gives different estimates of the input values of Eqs. (3) or (4). If no estimate for the covariance term in Eq. (3) is available, this term is replaced using the expression $cov(b_0, b_1) = -\bar{x}_{cal}\, var(b_1)$ valid for this linear regression (derived from the rules of covariance algebra, see for instance [5]), thus

$$var(x_{pred}) = \frac{1}{b_1^2} \left( var(y_{obs}) + (x_{pred}^2 - 2\, \bar{x}_{cal}\, x_{pred})\, var(b_1) + var(b_0) \right). \qquad (5)$$

The relation

$$b_0 = \bar{y}_{cal} - b_1 \cdot \bar{x}_{cal} \qquad (6)$$

applies to the regression parameters. On the assumption that the variance of calibration contents can be neglected, according to Eq. (6) one obtains

$$var(b_0) = var(\bar{y}_{cal}) + \bar{x}_{cal}^2\, var(b_1) \qquad (7)$$

as well as the following equations for Eqs. (1), (2) and (3), respectively:

$$y = \bar{y}_{cal} + b_1 (x - \bar{x}_{cal}) \qquad (8)$$

$$x_{pred} = \bar{x}_{cal} + \frac{y_{obs} - \bar{y}_{cal}}{b_1} \qquad (9)$$

and

$$var(x_{pred}) = \frac{1}{b_1^2} \left( var(y_{obs}) + \left( \frac{y_{obs} - \bar{y}_{cal}}{b_1} \right)^2 var(b_1) + var(\bar{y}_{cal}) \right) \qquad (10)$$

According to [1] now the variance estimations of the measurement uncertainty are marked by $u^2$ and their standard measurement uncertainty by $u = \sqrt{u^2}$. Thus for the estimation of the standard measurement uncertainty of a content $x_{pred}$ one obtains the expressions

$$u(x_{pred}) = \frac{s_{y.x}}{b_1} \sqrt{ \frac{1}{p} + \frac{1}{n} + \frac{(x_{pred} - \bar{x}_{cal})^2}{\sum_i (x_i - \bar{x}_{cal})^2} } \qquad (11)$$

$$u(x_{pred}) = \frac{1}{b_1} \sqrt{ u^2(y_{obs}) + (x_{pred}^2 - 2\, \bar{x}_{cal}\, x_{pred})\, u^2(b_1) + u^2(b_0) } \qquad (12)$$

$$u(x_{pred}) = \frac{1}{b_1} \sqrt{ u^2(y_{obs}) + \left( \frac{y_{obs} - \bar{y}_{cal}}{b_1} \right)^2 u^2(b_1) + u^2(\bar{y}_{cal}) } \qquad (13)$$

following from Eqs. (4), (5) and (10) (Eq. 11 corresponds to E3.5; Eqs. 12 and 13 correspond to E3.3 [2]).

Equations (12) and (13) only formally differ concerning the used regression parameters and supply identical results. Equation (13) has an advantage in comparison to Eq. (12). The individual uncertainty contributions for the sample measurement, the calibration model and the calibration measurement, given by $u_{obs}(x_{pred}) = |1/b_1|\, u(y_{obs})$; $u_{b_1}(x_{pred}) = |-(y_{obs} - \bar{y}_{cal})/b_1^2|\, u(b_1)$ and $u_{cal}(x_{pred}) = |-1/b_1|\, u(\bar{y}_{cal})$, respectively, referring to the formal representation $u(x_{pred}) = \sqrt{ u_{obs}^2(x_{pred}) + u_{b_1}^2(x_{pred}) + u_{cal}^2(x_{pred}) }$ of (13), are easy to interpret.

In contrast to Eqs. (12) and (13), Eq. (11) contains the residual standard deviation $s_{y.x}$ of the regression instead of the standard uncertainty $u(y_{obs})$ derived from repeated observations. For $x_{pred} = \bar{x}_{cal}$ Eq. (11) gives a minimum. If $var(y_{obs}) = s_{y.x}^2/p$, Eqs. (12) and (13) give the same results as Eq. (11). Normally, Eq. (11) is applied for the estimation of measurement uncertainty in the case of linear least squares calibration. In the case of special applications, an actual estimate of $u(y_{obs})$ can be used instead, and the estimate $u(x_{pred})$ calculated according to Eq. (12) or (13) then gives a more realistic result.

Assessment of trueness

In order to check the trueness of an analytical procedure concerning the analysis of a sample with the content $x_s$ of one component, an additional analytical quality control (AQC) measurement of a reference material with the certified content $x_r$ is to be executed. The determined content $x_q$ (observed from the reference material) is compared with $x_r$. Two further uncertainties have to be considered: the uncertainty of the AQC measurement $u(x_q)$,
based on the standard uncertainty $u(y_q)$ of the appropriate response values, and the uncertainty $u(x_r)$ concerning the content specification of the reference material [6]. With the help of a correction factor $f_r$, defined by

$$f_r = x_q / x_r \qquad (14)$$

and the model equation for the corrected content

$$x_{corr} = x_s / f_r \qquad (15)$$

$u(x_q)$ and $u(x_r)$ can be included in the calculation of combined measurement uncertainty.

The standard uncertainties of $f_r$ and $x_{corr}$ are estimated by

$$u(f_r) = f_r \cdot \sqrt{ (u(x_q)/x_q)^2 + (u(x_r)/x_r)^2 } \qquad (16)$$

and

$$u(x_{corr}) = x_{corr} \cdot \sqrt{ (u(x_s)/x_s)^2 + (u(f_r)/f_r)^2 }, \qquad (17)$$

respectively. By multiplication of the combined measurement uncertainty $u(x_{corr})$ with a coverage factor (k=2) the expanded measurement uncertainty

$$U(x_{corr}) = k \cdot u(x_{corr}) \qquad (18)$$

is obtained and reported in the result of the analysis.

Using the test statistic

$$T = \frac{|1 - f_r|}{u(f_r)} \qquad (19)$$

based on the t-distribution, it can be proven whether $f_r$ significantly differs from 1. If T is larger than 2 (according to the size of k), an existent method bias is suggested [7]. This method bias should be eliminated in the context of further investigations. If this is not possible, then it must be considered with the calculation of the sample content.

Example

The evaluation of measurement uncertainty is presented on the basis of a calibration data set (Table 1) for the determination of zinc in aqua regia extracts of polluted sediment samples. The aqua regia extracts were prepared according to DIN ISO 11466 [8]. The concentration of zinc was determined in the diluted (deionized water) extracts by ICP-atomic emission spectrometry with pneumatic nebulization (Spectroflame MfP, Spectro A.I.). The aim of this work was to estimate the applied calibration procedure based on a set of diluted ICP multielemental standards (Merck IV) in 0.1 mol l⁻¹ nitric acid for "true" results in the aqua regia extracts. The trueness of this analytical procedure should be proved on the basis of SRM 2709, Montana soil (NIST), which was handled as a sample within the procedure. The certified value for zinc (aqua regia soluble) is reported to be 100 mg kg⁻¹ (range 87-120 mg kg⁻¹). If one considers the conversion factor from the sample preparation, the mean concentration of zinc in the solution amounts to 0.285 mg l⁻¹.

Spread-sheet programs and special software solutions can be used for the necessary calculations. For example the program tool "SQS98" [9] supplies the standard uncertainties $u(b_0)$, $u(b_1)$, and $u(\bar{y}_{cal})$ needed in Eqs. (12) and (13). However, a small side-calculation is necessary (viz. Table 2).

Table 1 Calibration data set for n=6 calibration levels (p=5 repetitions)

Analyte content^a   Response from repeated measurements          Mean      SD
x                   y1      y2      y3      y4      y5
0                   2.3     6.3     0       4.4     11.8         4.96      4.49
0.1                 210.6   216.4   233.7   224.3   216          220.2     8.99
0.5                 1053    1042    1070    1024    1033         1044.4    17.90
1.0                 2144    2142    2126    2106    2125         2128.6    15.39
2.0                 4288    4380    4387    4431    4376         4372.4    52.06
5.0                 11155   11050   11061   11150   11127        11108.6   49.76

^a content in mg l⁻¹

Table 2 Standard uncertainty in the parameters of linear regression

Quantity                                             Standard uncertainty
Residual standard deviation  s_{y.x} = 48.33
Intercept  b_0 = -42.5                               CI(b_0) = 71.16; u(b_0) = CI(b_0)/t_{4;95%} = 71.16/2.776 = 25.63
Slope (sensitivity)  b_1 = 2224.9                    CI(b_1) = 31.68; u(b_1) = CI(b_1)/t_{4;95%} = 31.68/2.776 = 11.41
Abscissa centroid  x̄_cal = 1.433                     u(x̄_cal) = 0
Ordinate centroid  ȳ_cal = 3146.6                    u(ȳ_cal) = sqrt(u²(b_0) - x̄_cal² · u²(b_1)) = sqrt(25.63² - 1.433² · 11.41²) = 19.74
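The regression quantities and the uncertainty estimates of Eqs. (11) and (13) can be recomputed from the mean responses and standard deviations of the calibration data set (Table 1). The sketch below uses only the Python standard library; the variable and function names are illustrative choices made here, and the standard deviation of the five repeats is used directly as u(y_obs), which is an assumption about how that input was formed.

```python
import math

# Mean responses and standard deviations per calibration level (Table 1).
x = [0.0, 0.1, 0.5, 1.0, 2.0, 5.0]            # analyte content, mg/l
y_mean = [4.96, 220.2, 1044.4, 2128.6, 4372.4, 11108.6]
y_sd = [4.49, 8.99, 17.90, 15.39, 52.06, 49.76]
p = 5                                          # repetitions per level
n = len(x)                                     # calibration levels

# Unweighted least-squares fit through the level means.
x_cal = sum(x) / n
y_cal = sum(y_mean) / n
s_xx = sum((xi - x_cal) ** 2 for xi in x)
b1 = sum((xi - x_cal) * (yi - y_cal) for xi, yi in zip(x, y_mean)) / s_xx
b0 = y_cal - b1 * x_cal                        # Eq. (6)
s_yx = math.sqrt(
    sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y_mean)) / (n - 2)
)
u_b1 = s_yx / math.sqrt(s_xx)                  # standard uncertainty of slope
u_ycal = s_yx / math.sqrt(n)                   # uncertainty of the centroid


def x_pred(y_obs):
    """Predicted content, Eq. (9)."""
    return x_cal + (y_obs - y_cal) / b1


def u_eq13(y_obs, u_yobs):
    """Eq. (13): uncertainty from repeated observations of the sample."""
    u_obs = u_yobs / abs(b1)
    u_slope = abs(y_obs - y_cal) / b1 ** 2 * u_b1
    u_cal = u_ycal / abs(b1)
    return math.sqrt(u_obs ** 2 + u_slope ** 2 + u_cal ** 2)


def u_eq11(y_obs):
    """Eq. (11): uncertainty from the residual standard deviation."""
    dx = x_pred(y_obs) - x_cal
    return s_yx / abs(b1) * math.sqrt(1 / p + 1 / n + dx ** 2 / s_xx)


print(f"b0={b0:.1f}  b1={b1:.1f}  s_yx={s_yx:.2f}  "
      f"u(b1)={u_b1:.2f}  u(y_cal)={u_ycal:.2f}")
for yi, sdi in zip(y_mean, y_sd):
    print(f"x_pred={x_pred(yi):.3f}  Eq.13: {u_eq13(yi, sdi):.4f}  "
          f"Eq.11: {u_eq11(yi):.4f}")
```

The two estimates agree closely at the levels whose standard deviation is near s_yx/sqrt(p), and diverge at the two highest levels, where the observed scatter is about three times larger than at the lower levels.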
Table 3 Measurement uncertainty concerning analyte contents in calibration range (without AQC measurement)

Analyte     Response   Stand. unc.   Analyte     Unc.-contr.   Unc.-contr.   Unc.-contr.   Measurement uncertainty^b
content^a   y_obs      u(y_obs)      content^a   u_obs         u_b1          u_cal         Eq. (13)      Eq. (11)
x_i                                  x_pred                                                u(x_pred)     u(x_pred)
0           4.96       4.49          0.021       0.0020        0.0072        0.0089        0.011629      0.015015
0.1         220.2      8.99          0.118       0.0040        0.0067        0.0089        0.011856      0.014782
0.5         1044.4     17.90         0.488       0.0080        0.0048        0.0089        0.012920      0.014017
1           2128.6     15.39         0.975       0.0069        0.0023        0.0089        0.011492      0.013361
2           4372.4     52.06         1.984       0.0234        0.0028        0.0089        0.025183      0.013454
5           11108.6    49.76         5.012       0.0224        0.0184        0.0089        0.030264      0.022585

^a content in mg l⁻¹
^b The Eqs. (13) and (11) correspond to EURACHEM Guide [2], E3.3 and E3.5, respectively.

Table 4 Evaluation of measurement uncertainty for the special application (analyte content x=2) including the assessment of trueness (contents given in mg l⁻¹)

Quantity                                  Standard uncertainty

Response sample measurement:
$y_s = 4372.4$                            $u(y_s) = 52.06$

Response AQC measurement:
$y_q = 679.3$                             $u(y_q) = 30.95$

Sample content:
$x_s = 1.433 + \frac{4372.4 - 3146.5}{2224.9} = 1.984$
$u(x_s) = \frac{1}{2224.9} \sqrt{ 52.06^2 + \left( \frac{4372.4 - 3146.5}{2224.9} \right)^2 11.41^2 + 19.74^2 } = 0.0252$

RM content from AQC measurement:
$x_q = 1.433 + \frac{679.3 - 3146.5}{2224.9} = 0.324$
$u(x_q) = \frac{1}{2224.9} \sqrt{ 30.95^2 + \left( \frac{679.3 - 3146.5}{2224.9} \right)^2 11.41^2 + 19.74^2 } = 0.0175$

RM content, certified:
$x_r = 0.285$

Correction factor:
$f_r = x_q / x_r = 0.324/0.285 = 1.137$

Corrected content:
$x_{corr} = 1.984/1.137 = 1.745$
$u(x_{corr}) = x_{corr} \sqrt{ \left( \frac{u(x_s)}{x_s} \right)^2 + \left( \frac{u(f_r)}{f_r} \right)^2 } = 1.745 \sqrt{ \left( \frac{0.0252}{1.984} \right)^2 + \left( \frac{0.0877}{1.137} \right)^2 } = 0.136$

In Table 3, according to Eqs. (13) and (11) and without consideration of the AQC measurement, the calculated standard uncertainties are arranged for all calibration levels. One can see that the measurement uncertainties in the part of the calibration range within which the condition of the variance homogeneity is correctly fulfilled (see the third column in Table 3 and Fig. 1) are nearly equal. For $u(y_{obs}) = \sqrt{s_{y.x}^2/p} = \sqrt{48.33^2/5} = 21.6$ the curves in Fig. 1 have an intersection point.

Fig. 1 Comparison of the measurement uncertainty estimations (Eqs. (13) and (11) correspond to EURACHEM Guide [2], E3.3 and E3.5, respectively); x-axis: analyte content [mg/l]

In Table 4 the determination of the combined measurement uncertainty, including the assessment of trueness, for a special analytical application (analyte content x=2 mg l⁻¹), is represented. Here $f_r = 1.137$ and $u(f_r) = 0.088$ for the test statistic $T = 1.56 \le 2$, so that a significant method bias cannot be proven, although the relatively large difference between $x_r$ and $x_q$ suggests a biased error. Because of the non-significance of the method bias the corrected sample content $x_{corr} = 1.75$ is not used. Only its measurement uncertainty $u(x_{corr}) = 0.136$ is used, thus for the analysis result $x_s = 1.98$ the associated expanded measurement uncertainty is $U = 2 \cdot 0.136 = 0.27$.

Conclusion

The application of unweighted ordinary least squares regression for linear calibration can lead to a slightly underestimated uncertainty value, if the condition of variance homogeneity in the calibration range is not correctly fulfilled and the uncertainty calculation is based on the residual standard deviation of the regression (Eq. E3.5 in [2]). In this case the other indicated possibility of uncertainty calculation (uncertainty deduced from repeated measurements, Eq. E3.3 in [2]) can result in more realistic estimations.

References
1. ISO (1995) Guide to the expression of uncertainty in measurement. ISO, Geneva
2. EURACHEM/CITAC Guide (2000) Quantifying uncertainty in analytical measurement, 2nd edn., Final Draft April 2000. EURACHEM: http://www.measurementuncertainty.org
3. IUPAC Recommendations (1998) Guidelines for calibration in analytical chemistry: http://www.iupac.org
4. ISO 5725-2 (2000) Accuracy (trueness and precision) of measurement methods and results, Draft May 2000. ISO, Geneva
5. Rawlings JO (1988) Applied regression analysis. Wadsworth and Brooks, California, USA
6. Kurfürst U (1998) Accred Qual Assur 3:406-411
7. Barwick VJ, Ellison SLR (2000) Accred Qual Assur 5:47-53
8. DIN ISO 11466: 06.97 Soil quality - extraction of trace elements soluble in aqua regia. Beuth, Berlin
9. Kleiner J, Lernhardt U (1998) Program SQS98. Perkin Elmer GmbH, Überlingen, Germany
Accred Qual Assur (2001) 6:352-359
© Springer-Verlag 2001

Pavol Tarapcik
Jan Labuda
Blandine Fourest
Viliam Patoprsty

Measurement uncertainty distributions and uncertainty propagation by the simulation approach

Paper based on a talk given at the 3rd EURACHEM Workshop "Status of Traceability in Chemical Measurement", 6-8 September 1999, Bratislava, Slovak Republic

P. Tarapcik (✉) · J. Labuda
Department of Analytical Chemistry, CHTF STU, Radlinskeho 9, 812 37 Bratislava, Slovak Republic
e-mail: tarapcik@chtf.stuba.sk
Tel.: +4217-59325, ext. 311 or 302
Fax: +4217-52926043 or 52493198

B. Fourest
Institut de Physique Nucléaire, 91406 Orsay Cedex, France

V. Patoprsty
Slovak Institute of Metrology, Karloveska 63, 842 55 Bratislava, Slovak Republic

Abstract A complete and accurate evaluation of measurement uncertainty requires knowledge of the uncertainty distributions. The latter are rarely determined or verified experimentally, and hence up to now only crude estimates or assumptions based on intuition have been used. The simulation of experimental results is readily accessible and provides a more reliable solution to this problem. When using an appropriate model of measurement and after determination of input value parameters by present state-of-the-art techniques, simulation data supply reliable information about the distribution of the output results of a complex measurement. The method permits simple variation of the assumptions and therefore ready analysis of the various features influencing the uncertainty intervals of a measurement. In the paper we describe examples of such evaluations: the preparation of certified reference materials, where there is excellent agreement between the traditional and simulation approaches, and the more complex measurement of diffusion coefficients by the open capillary method, where the uncertainty of the simulated result is more realistic than the result from the traditional error propagation method, due to non-linearity and a probable Cauchy distribution in some steps.

Keywords Measurement · Uncertainties · Chemical analysis · Distribution law · Monte Carlo simulation

Introduction

Chemical measurement uncertainty expressed according to the latest metrological requirements [1-3] is a clear improvement compared to the traditional "error approach." Up to now, the results of measurement of the same parameter would often differ widely from one paper to another, although both papers claimed very narrow confidence intervals (see for example stability constants of complexes). This is because some significant sources of uncertainty had been ignored or precisions were overestimated. In the process of uncertainty evaluation it is essential to make the most exhaustive assessment of all involved effects. All partial sources of uncertainty are identified, then their single contributions are assessed and subsequently combined to provide the estimate of the uncertainty of the result. The combination of single constituents of uncertainty is made by common procedures using standard uncertainties; the standard uncertainty of the result is used to derive the uncertainty interval by applying a probability distribution law.

Traditional applied statistics in chemistry textbooks is based on Fisher statistics [4, 5]. However, application of the latter is now considerably less strenuous due to

the ability of computing devices to treat large data sets automatically, and hence its use is more widespread. Unfortunately, users are often unaware of the limitations of the corresponding models and tools. Mathematical statistics is developing continuously and a number of new procedures (e.g. robust procedures) [6] are built into new specialized statistics software. Improved software to calculate precision has been published [7, 8]. These new tools provide more correct results compared to traditional ones. It is therefore reasonable to expect that their use will become standard.

Complete characterisation of analytical results and uncertainty distribution law

The use of standard uncertainty has one important (but often forgotten) limitation: results can be compared only in the case of the same uncertainty distribution law. The other interval measures of precision and derived values, such as confidence intervals, detection limit or determination limit, cannot be reliably determined without knowledge of the distribution, and likewise further statistical deductions are not valid without this information.

A Gaussian distribution is often supposed. Analytical determinations are, however, generally complex problems: the measurements are indirect and many single operations are carried out, each of them with its own uncertainty (considered as a direct measurement or as a more or less complex indirect measurement). The uncertainty of a result is the composition of these elemental uncertainties. There is a widely accepted opinion that, due to the large number of uncertainty sources, the result distribution is often Gaussian. This is valid only in the case of addition or subtraction of the individual constituents; in other cases the solution of the distribution law problem is rather difficult, and simple application of the central limit theorem without careful consideration of its limitations leads to overestimation of the applicability of the Normal distribution.

The identification of the distribution law is reliable only for very large data sets [9], which are practically impossible in real chemical measurements. Verification using common statistical tests also requires large data sets for good reliability. In a real situation often only an assumption can be made. As this assumption about the distribution law is not well-founded, there is a risk of false information about uncertainty. The assumption of a domination of the Gaussian distribution was not confirmed in a study where a number of large data sets of various types [10] were assessed. Only about 25% of the studied measurements could be considered as Gaussian distributed.

In recommendations on uncertainty evaluation [1-3], the Gaussian distribution is said to be fundamental, but some other distributions may be used. The application of the assumed distribution is also made less strict by using a less firm relation between probability and standard deviation than is used in traditional statistics.

No consideration is given to the relation between the distributions of input and output values. While the result of the measurement is obtained as a function of several values with their own uncertainties, the standard deviation of the result is determined using error propagation and the interval estimate is based on the assumption that the distribution is Gaussian. But the shape of the distribution depends on the distributions of the input values and on their functional relation too.

The applied functional relation can be relatively simple or more complex. The measurement of individual input values is often a complex process and one input number is the result of many single operations. In the evaluation of chemical analytical results these relations are of various complexity, but they can mostly be expressed as an explicit function and their evaluation using the propagation rule is simple. But some important types of measurement are more difficult to treat. Two examples are described below.

Example 1. A simple case: preparation of a standard solution of zinc

When a certified reference material (CRM) is prepared from a weighed amount of Zn metal by dissolving in acid and adjusting to the desired volume, the concentration obtained from this process is influenced by several factors, and a relatively complete description can be obtained as an extensive equation comprising 16 parameters, the values of which carry some uncertainty [11]. For comparison, the simple but not complete relationship in this case is:

$c = \frac{m}{V \cdot M}$   (1)

where m is the amount weighed using an analytical balance, V is the volume measured using a volumetric flask and M is the molar mass. The more complete form is as follows:

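$c = \frac{\left(m_0 + m_{1read} + m_{1cal} + m_{1oper} + m_{2read} + m_{2cal} + m_{2oper}\right) / \left(1 - \rho_{air} \left(1/\rho_{mb} - 1/\rho_{weight}\right)\right) \times \left(1 + K_1 t_1 - K_2\right)}{\left(V_{decl} + V_{read} + V_{decl} \cdot E \cdot (t_2 - 20)\right) \cdot M}$   (2)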

where all used parameters (except $m_0$, $K_1$ and $K_2$) are considered as uncertain values and the meanings of their symbols are described in [11]. The actual values and origin of non-experimental values (molar mass, densities, etc.) are given in [11]. This relationship is even more complex if the contribution of impurities of the materials used is included.

From the relationship above, the standard uncertainty can be obtained, but one has to do relatively extensive work: the error propagation law must be applied, which means one needs 16 sensitivity coefficients (squares of the partial derivatives) and one has to estimate the standard uncertainties of all 16 constituents. The derivatives can be obtained without deep knowledge of mathematical analysis using appropriate software, or using a numerical approach similar to that described in [12] with a common spreadsheet. This procedure, however, after the large amount of work, gives no information about the distribution law of the result, even if the input distributions are known, and so the interval estimates of the result are usually based on a non-verified assumption.

Example 2. A more complex case: measurement of diffusion coefficient

The situation is more difficult if an analytical expression is not possible in the explicit form $y = f(x_i)$. This differs only in the more complicated method needed to obtain the sensitivity coefficients. The method for this is described, for example, in [2]. Even more complex was our particular measurement of diffusion coefficients using the open capillary method, where the process is described by an infinite series [13]:

$y = \frac{c_t}{c_0} = \sum_{n=0}^{\infty} \frac{8}{\pi^2 (2n+1)^2} \exp\left[-\frac{\pi^2 (2n+1)^2 D t}{4 l^2}\right]$   (3)

The sum runs over the integers n from 0 to infinity, $c_0$ and $c_t$ are the concentrations in the capillary before and after diffusion, l is the length of the capillary, D is the measured diffusion coefficient (important, for example, for the characterisation of particle size), t is the diffusion time, π has its common meaning and n is the index of the term in the series.

This series converges rapidly. For a simple mathematical solution, the time of diffusion is usually adjusted so that the second term of the series (n=1) is 1000 times lower than the first term (n=0).

The solution for the first term only (n=0) is:

$D_{(n=0)} = \frac{4 l^2}{\pi^2 t} \ln\frac{8}{\pi^2 y}$   (4)

and the relative standard uncertainty from the error propagation rule is:

$\delta_D^2 = 4\delta_l^2 + \delta_t^2 + \delta_y^2 \left(\frac{1}{\ln\left(8/(\pi^2 y)\right)}\right)^2$   (5)

If one needs to use two terms, due to somewhat higher values of r (lower rate of diffusion), the equation can be written in the form:

$y = \frac{8}{\pi^2} \exp\left(-\frac{\pi^2 D t}{4 l^2}\right) \left\{1 + \frac{1}{9} \exp\left(-\frac{8 \pi^2 D t}{4 l^2}\right)\right\}$   (6)

The solution for D is treated for example in [14]:

$D = \frac{4 l^2}{\pi^2 t} \ln\frac{8}{\pi^2 r} + \frac{4 l^2 \pi^{14} r^8}{9\, t\, 8^8}$   (7)

and for the uncertainty, after very awkward work:

$\delta_D^2 = 4\delta_l^2 + \delta_t^2 + \delta_r^2 \left\{\frac{8\left(\pi^{16} r^8 - 9 \cdot 8^7\right)}{9 \cdot 8^8 \ln\left(8/(\pi^2 r)\right) + \pi^{16} r^8}\right\}^2$   (8)

The value r is conveniently measured as the radioactivity ratio (after background corrections), hence it is also a typical indirect measurement, which is a combination of four individual measurements of radioactive decay and four measurements of time.

If one needs a more complete equation (n = 2 or more), due to slow diffusion of larger particles, when r values are higher over a reasonable time period, the solution is more complex. D can be calculated by an iterative procedure, but determination of the uncertainty according to the usual procedure [2] is very awkward. The general relationship for the relative standard uncertainty is as follows:

$\delta_D^2 = 4\delta_l^2 + \delta_t^2 + \delta_r^2 \left\{\frac{l^2 r}{2 t D \sum_n \exp\left[-\pi^2 (2n+1)^2 D t / (4 l^2)\right]}\right\}^2$   (9)

and nothing is known about the distribution of the result.

The examples above show that excessive work is required using the traditional approach, without achieving completely reliable output information from well-measured input data.

Looking at the distribution law for composition of more values

More complete output information requires knowledge about the distribution law. The composition of distributions can be considered for two values (e.g. a ratio of two Gaussians gives a Cauchy distribution, multiplication gives a Laplace distribution in this case). More complex relations are very awkward and for most analysts also technically impossible. By treating the problem of uncertainty distribution in the above way, two values at a time, one can divide the problem and work in single steps, dealing with one problem at a time.

The distribution of results is obtained as a combination of two uncertain values

The procedures to obtain the probability density functions in this case can be found in advanced mathematical statistics [5]; here we provide some examples. Table 1 presents convolutions of some distributions

Table 1 Combined uncertainty distributions of results obtained by arithmetic combination of two measurements with known and common uncertainty distributions

f(A,B)  Distributions of constituents  Resulting distribution type, probability density                          Graph type
        A            B

A+B     R(-a,a)      R(-b,b)      trapezoid (a>b)                                                          1
                                  p(x) = 0                     for |x| ≥ a+b
                                  p(x) = (a+b+x)/(4ab)         for -a-b ≤ x ≤ b-a
                                  p(x) = 1/(2a)                for b-a ≤ x ≤ a-b
                                  p(x) = (a+b-x)/(4ab)         for a-b ≤ x ≤ a+b

A·B     R(-a,a)      R(-b,b)      p(x) = ln(ab/|x|)/(2ab)      for |x| ≤ ab                                2
                                  p(x) = 0                     for |x| > ab

A/B     R(-a,a)      R(-b,b)      p(x) = b/(4a)                for |x| ≤ a/b                               3
                                  p(x) = a/(4b x²)             for |x| > a/b

A+B     N(0,σ_A)     N(0,σ_B)     p(x) = exp(-x²/(2(σ_A²+σ_B²))) / √(2π(σ_A²+σ_B²))   (Gauss)              4

A·B     N(0,σ_A)     N(0,σ_B)     p(x) ≈ exp(-|x|/(σ_A σ_B)) / √(2π|x| σ_A σ_B)                            5

A/B     N(0,σ_A)     N(0,σ_B)     p(x) = σ_B / (π σ_A (1 + x² σ_B²/σ_A²))   (Cauchy)                       6

A·B     L(0,σ_A)     L(0,σ_B)     p(x) ≈ √(π √(σ_A σ_B/(2|x|))) · exp(-√(8|x|/(σ_A σ_B))) / (σ_A σ_B)      5*

A/B     L(0,σ_A)     L(0,σ_B)     p(x) = σ_B / (2 σ_A (1 + |x| σ_B/σ_A)²)                                  6

generally known and often used. The Gaussian and Laplace distributions are denoted N(μ, σ) and L(μ, σ), respectively (where μ is the true value and σ is the standard deviation); R(-a, a) means a rectangular distribution on the interval <-a, a>. The graphic presentations of the convolutions are shown in Fig. 1; graph type 5* is similar to graph type 5 with the exception that values of high deviation have a higher probability.

The traditional suggestion (ISO) in such cases recommends as a first approximation a Gaussian distribution (graph type 4 in Table 1 and Fig. 1), where the standard uncertainty of the result is obtained by a combination of the standard uncertainties of the constituents according to the rules of differential calculus. The other case is, of course, relatively simple and can be found in basic textbooks or derived by a person with moderate skills.

An illustrative example is the comparison of distributions obtained by the addition of two rectangularly distributed values (graph type 1 in Table 1 and Fig. 1). According to ISO one obtains the probability density function as:

$p(x) = \frac{1}{\sqrt{2\pi(\sigma_A^2 + \sigma_B^2)}} \exp\left(-\frac{(x - \mu_A - \mu_B)^2}{2(\sigma_A^2 + \sigma_B^2)}\right)$   (10)

and a comparison for the same input values in Fig. 2 emphasises a sharp difference in this case. The example presented is, however, quite academic. Real measurements are not so simple and the distribution laws and their parameters are only known from small data sets, i.e. in most cases only an estimate is used.

Some remarks can be made on this basis:
- The distribution of a value obtained by dividing two uncertain values often has no defined statistical moments other than the zeroth moment (e.g. Cauchy), so there is no sense in determining characteristics such as the standard deviation or skewness, which are based on the higher moments.
- When multiplication of uncertain values is used, the resulting distribution has a large value of kurtosis and the usual precision characteristics have small statistical efficiency. Estimates based on the median or on robust methods are more suitable.

Simulation approach for uncertainties

A possible solution is to exploit the ability of standard software to generate data with defined statistical properties, that is, with a defined distribution and its parameters, e.g. mean value, standard deviation and the like. In this way, sets of data can be obtained, each representing a single "experiment". The number of these simulated data sets can be very large and the set of results can be treated traditionally, that means by determining statistical moments and derived values, constructing histograms, polygons and the like. Alternatively, interval estimates can be obtained simply from percentiles of the set of results (this is really more correct than the moment method, where the assumption about the distribution often plays a negative role). This procedure requires no new

mathematical knowledge or abilities, and so does not change the analyst into a theoretician. Only moderate skills in applying standard software are required.

Fig. 1 Combined uncertainty distributions corresponding to the cases given in Table 1 [six panels, graph types 1 to 6]

Fig. 2 Comparison of the true distribution in the case of type 1 in Table 1 and the shape obtained according to the traditional approximation by a Gaussian distribution for the same input values R(-0.2; 0.2) and R(-0.1; 0.1)

Simulating procedure

The steps of identifying the partial uncertainty sources and the design of the measurement model represent the basis of the uncertainty concept and cannot be changed. The main goal of the simulation procedure is to "calculate the combined uncertainty". The previous steps are conducted as usual and provide the input values for the subsequent simulation procedure. The assumed distribution is an important parameter for each individual source of uncertainty. The treatment differs from the traditional one as follows:
- There is no need to calculate sensitivity coefficients (calculation of sensitivity coefficients is sometimes very awkward).
- By simulating on a PC, large sets of the individual values influencing the final uncertainty are generated. Each set of values has its mean value and standard deviation and originates from a distribution which is known or assumed and, in the latter case, can be readily changed. (To generate these, a built-in function in standard software can be used or some simple procedure can be written by a skilled user.)
- Using a known measurement model (relation between input data and result), a set of results is calculated from the simulated sets of individual input data.
- The number of simulated results can be so large that traditional statistical treatment will give correct interval estimates. The assumptions about input data can be readily controlled and so one can simply estimate the influence of changes in assumptions by recalculation. Such calculations can also provide information useful for the optimization of measurement conditions, similar to the sensitivity coefficient analysis common in traditional methods. These calculations do not require more skill than a simple result calculation: no new mathematical equation, no derivation.

Solution using a spreadsheet

The working sheet includes, for each input value, the mean and standard deviation (or other data concerning uncertainty and the distribution law, from which the standard deviation can be calculated). The result is calculated using a known measurement model:
- In the 1st line, the input measured values are written and from these the result of the measurement is calculated.
- The next line (or lines) includes further parameter(s) of the distribution - for two parameters distribution, as

in a Gaussian, it is for example the standard uncertainty.
- Further, in the next line there is information about the type of distribution law.
- In a new line a simulated set of input data is calculated using the parameters above and known generators of random values.
- The last line is then repeated (simple copy) n times (e.g. 1000 times); each new line includes new simulated data automatically.
- The sets of results are then in one column, and are treated by the usual methods (a common integral part of a spreadsheet).

For example: the mean value is in cell A1 and its standard deviation is in cell A2, cell A3 contains information about the distribution law; then, if in cells A11-A1010 we want to obtain 1000 values from a Normal distribution with the parameters above, we input into each cell the term:

= NORMINV(RAND(); $A$1; $A$2)   (11)

or, if the distribution law has to be rectangular:

= $A$1-$A$2*SQRT(3) + RAND()*2*$A$2*SQRT(3)   (12)

(NORMINV, SQRT and RAND are the names of the appropriate built-in functions of the spreadsheet.)

How to generate numbers from many other distributions is a well known and solved problem [6]:
- If the cells of column B hold the simulated values of another input and column C holds the result of the calculation for the functional relation of A and B (for example the ratio), one obtains 1000 lines with simulated sets of input data and a set of 1000 calculated results.
- Statistical analysis of the last set gives standard statistical outputs, e.g. interval estimates (standard deviation, percentiles) or, better still, a more complete picture using a graph (polygon or histogram) of the simulated distribution, which can then be readily compared with an a priori assumed distribution, simply visually or by application of standard statistical tests. The result distribution can be described by an empirical equation and used in further utilisation of the result.

Examples of the simulation approach

Example 1. Preparation of standard solution of zinc

We used the parameters and assumptions for the individual influencing values and the model of measurement as stated in the procedure according to the usual recommended method [2]. The above mentioned procedure was used, only more complex.

The results are given in Table 2 and in graphical form as an integral distribution function (polygon) in Fig. 3. In this graph there is also a curve for the Gaussian distribution using the moment method, the 95% interval obtained from the percentiles, and the same estimate from the parameters of the Gaussian distribution. It is evident that in this case the assumptions were adequate. The simulating procedure gave results in agreement with the results of the standard procedure but without the element of vagueness due to the unknown composition of the individual distributions.

Table 2 Statistical evaluation of simulated certified reference material "preparations"

Mean                          0.0100004 mol/l
Median                        0.0100002 mol/l
Standard uncertainty          0.000004 mol/l
Asymmetry                     0.0902
Kurtosis                      -0.2527
Relative standard deviation   0.03996%

Fig. 3 Integral distribution function (polygon) of simulated certified reference material preparations [x-axis: probability, 0% to 100%; legend entries include simul, 0.025 simul, 0.025 norm and 0.975 norm]

Example 2. Measurement of diffusion coefficient

This more complex example requires only a few more skills. Each simulated data set was solved iteratively, but automatically, by the macro "find solution" function of the spreadsheet.

Table 3 Statistical evaluation of simulated diffusion experiment

Measurement at "favourable" conditions (y about 0.4)
                                                  Type 1                            Type 2
Traditional  Stand. rel. uncertainty              1.15%                             1.15%
             95% interval (2σ)                    8.241×10⁻⁶ - 8.640×10⁻⁶ cm²/s     8.250×10⁻⁶ - 8.649×10⁻⁶ cm²/s
Simulation   Stand. rel. uncertainty              1.14%                             1.19%
             95% interval (2σ)                    8.249×10⁻⁶ - 8.633×10⁻⁶ cm²/s     8.249×10⁻⁶ - 8.651×10⁻⁶ cm²/s
             95% interval (from percentiles)      8.243×10⁻⁶ - 8.639×10⁻⁶ cm²/s     8.263×10⁻⁶ - 8.626×10⁻⁶ cm²/s
             Shape                                Gaussian                          Trapezoid

Measurement at "unfavourable" conditions (y about 0.75)
                                                  Type 1                            Type 2
Traditional  Stand. rel. uncertainty              6.71%                             6.62%
             95% interval (2σ)                    2.180×10⁻⁶ - 2.497×10⁻⁶ cm²/s     2.152×10⁻⁶ - 2.457×10⁻⁶ cm²/s
Simulation   Stand. rel. uncertainty              11.0%                             12.75%
             95% interval (2σ)                    1.824×10⁻⁶ - 2.850×10⁻⁶ cm²/s     1.717×10⁻⁶ - 2.892×10⁻⁶ cm²/s
             95% interval (from percentiles)      1.857×10⁻⁶ - 2.851×10⁻⁶ cm²/s     1.715×10⁻⁶ - 2.935×10⁻⁶ cm²/s
             Shape                                Triangular                        "skewed" Gaussian
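Part of the contrast between the two halves of Table 3 can be made plausible from the first-term formulas, Eqs. (4) and (5). The sketch below assumes Eq. (5) with the sensitivity factor squared, as the error propagation rule requires, and evaluates the factor that multiplies the relative uncertainty of y. At y about 0.75 the single-term formula is no longer adequate, so the second value only indicates the trend; the function names are illustrative.

```python
import math

def d_first_term(y, t, l):
    """Eq. (4): diffusion coefficient from the first term of the series only."""
    return 4.0 * l ** 2 / (math.pi ** 2 * t) * math.log(8.0 / (math.pi ** 2 * y))

def y_sensitivity(y):
    """Factor multiplying the relative uncertainty of y in Eq. (5)."""
    return 1.0 / math.log(8.0 / (math.pi ** 2 * y))

def rel_u_d(y, rel_u_y, rel_u_l, rel_u_t):
    """Eq. (5): relative standard uncertainty of D (all inputs relative, same units)."""
    return math.sqrt(4.0 * rel_u_l ** 2 + rel_u_t ** 2 + (rel_u_y * y_sensitivity(y)) ** 2)

favourable = y_sensitivity(0.424)
unfavourable = y_sensitivity(0.75)
print(f"sensitivity factor at y = 0.424: {favourable:.2f}")
print(f"sensitivity factor at y = 0.75:  {unfavourable:.2f}")
print(f"contribution of 0.71% in y at y = 0.424: {0.71 * favourable:.2f}%")
```

As y approaches 8/π² (about 0.81) the logarithm tends to zero and the factor grows without bound, which is one way to see why measurements with too short a diffusion time are so sensitive to the uncertainty of y.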

For a rapid diffusion of Na⁺, at a mean measured value of y = 0.424, using the uncertainty propagation rule we obtained a value of $\delta_y$ = 0.71%. In this case simulation gave a value of 0.70%, and the relative uncertainty of the calculated D was 1.15% by uncertainty propagation and 1.14% by the simulation. At the same time, the simulation procedure gave details of the uncertainty distribution of the result (expressed, e.g., as a histogram, moments of distribution, etc.), and it was possible to show the interval limits directly from the simulated set of results, regardless of the distribution type. For the activity measurements, the assumptions usual for radiometric measurements were used: a Normal distribution for counts, and the measurements of time and length were also taken as Normally distributed (type 1). The histogram and parameters of the simulated set of results indicated a normal-like distribution, and this character was maintained also in the case of a rectangular distribution used for the time and length measurements (type 2).

For the slow diffusion and more critical measurement of Eu³⁺, with a diffusion time that was not long enough, y was about 0.75 and the situation is different. The application of the uncertainty propagation rule leads to an overestimation of precision by about 40-50% of the standard deviation, as compared to the values from the simulation. Moreover, the simulation gave information about the shape of the distribution. It is also simple to judge the influence of an assumption; the simulation gave an objective result. Illustrative examples are shown in Table 3.

Favourable conditions means a measurement in which y is about 0.4, so higher terms of the diffusion equation are not important. Under unfavourable conditions the value of y was about 0.75, and about four members of the series are of importance. In both cases the first eight members of the series were applied in the iterative calculation.

Under favourable conditions there is no difference in a number of characteristics between the traditional and simulation procedures, and similarly there is no significant difference in relation to the assumption about the distribution. However, the simulation gave the shape of the distribution, and for the often used assumptions of type 2 a slightly narrower interval at 95% probability is seen, most evidently with the percentile technique.

In the less favourable conditions, the traditional procedure overestimated the precision. A given choice of assumptions about the distributions even led to a skewed distribution and an evident enlargement of the interval, which shows the importance of well-founded assumptions.

For such measurements, several capillaries are used in the same experiment to obtain repeated values, although in very limited number (8 for example) and with a risk of correlations, but the details are known to a skilled person. The uncertainties thus obtained show that the result of the simulation is more realistic compared to the result of the traditional method.

Conclusions

Almost every analytical result is obtained as an indirect measurement. The distribution of the results is given by the distributions of the directly measured values and their relations. There is no reason to suppose a Gaussian distribution of elementary measured values and of results simultaneously. In some cases the simple error propagation law is sufficient, but in other cases it could be in error and a simulation approach would be worth adopting. It is almost impossible to decide a priori which method to use.

If an assumption about the distribution is used in one step of the analysis, further conclusions in the other steps must be in harmony with this assumption, e.g.

if the parameters of a straight calibration line were tested as Gaussian distributed, the assumption of a Cauchy distribution of the concentrations obtained by using these parameters is automatically made at the same time, and further statistical deductions must agree with this fact.
If assumptions about the directly measured values are used, the result of the determination of the distribution law is also an assumption and it is not worth making a difficult exact analysis of the distribution.
The determination of the distribution of an analytical result is a rather difficult problem. Even if the distributions of the directly measured values are known, it is reasonable to turn to statisticians for a solution, but a simulation procedure is available for the moderately skilled person.
The simulation approach for interval estimates of measured values does not solve this problem entirely; it cannot overcome the problem of small numbers of degrees of freedom, but in most cases it is a good tool:
- It can be used in cases where the distribution and distribution parameters of the constituents are known and in cases where only approximate values of parameters, derived from small data sets, are used.
- This method is also only an approximation, but in many cases it is more correct and less strenuous than the error propagation method.
- The result varies readily with the assumptions made, including the assumed distribution law.
- The method is able to solve difficulties originating from complex calculations.
- It is easy to realise, and the procedure does not require any special mathematical knowledge or ability.
- Only a moderate knowledge of standard software is required.
- The result distribution is given in tabular and graphical form.

The simulated distribution, in many cases without known (assumed) input distributions, remains a model of the frequency distribution to be expected. This model is particularly useful when it is compared to experimentally determined distributions, to check whether all significant sources of uncertainties have been identified and properly assessed.

References
1. EURACHEM (1995) Quantifying uncertainty in analytical measurement. EURACHEM, London, UK; ISBN 0-948926-08-2
2. TPM 0051-93 (1993) Stanovenie neistot pri meraniach (Determination of measurement uncertainties). Slovak Institute of Metrology, Bratislava, Slovak Republic
3. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva, Switzerland; ISBN 92-67-10188-9
4. Taylor JK (1990) Statistical techniques for data analysis. Lewis Publishers, Chelsea, Mich., USA
5. Koroliuk VS, Portenko NI, Skorokhod AV, Turbin AF (1985) Spravotchnik po teorii veroyatnostei i matematitcheskoi statistike (Handbook of probability theory and mathematical statistics). Nauka, Moscow, Russia
6. Antoch J, Vorlickova D (1992) Vybrane metody statisticke analyzy dat (Selected statistical methods). Academia, Prague, Czech Republic
7. Eckschlager K (1991) Collect Czech Chem Commun 56:505-559
8. Tarapcik P (1992) Chem Lett 86:648-652
9. Thompson M, Howarth RJ (1980) Analyst 105:1188-1195
10. Novitskii PV, Zograf IA (1985) Otsenka pogreshnostei rezultatov izmerenii (Errors evaluation of results of measurements). Energoatomizdat, Leningrad, Russia
11. Tarapcik P, Buzinkaiova T, Polonsky J, Dlouha M, Chromek F (1997) Metrologia a skusobnictvo, 2:6-10; 3:10-14
12. Kragten J (1994) Analyst 119:2161-2166
13. Gosman A, Jech C (1989) Jaderne metody v chemickem vyskumu (Nuclear method in chemical research). Academia, Prague, Czech Republic
14. Fourest B (1983) Coefficients de diffusion limites et structure de quelques ion aquo d'elements 5f et 4f. PhD Thesis, Institute de Physique Nucleaire, Orsay, France
Accred Qual Assur (2000) 5:88-94
© Springer-Verlag 2000

Matthias Rosslein

Evaluation of uncertainty utilising the component by component approach

Abstract This paper reviews the so-called "component by component approach" of evaluating measurement uncertainty. An overview of the evaluation process is given followed by an in-depth discussion of some of the differences between this approach and the approach of utilising validation data. Some of the advantages and disadvantages of using the component by component approach are outlined at the end.

Key words Uncertainty · Component by component approach · Evaluation of measurement uncertainty · Measurement uncertainty

Presented at: EURACHEM Workshop on Efficient Methodology for the Evaluation of Uncertainty in Analytical Chemistry, Helsinki, Finland, 14-15 June 1999

M. Rosslein
EMPA St. Gallen, Department of Chemistry/Metrology, Lerchenfeldstrasse 5, 9014 St. Gallen, Switzerland
e-mail: matthias.roesslein@empa.ch
www.empa.ch/metrology

Introduction to the evaluation of measurement uncertainty

The evaluation process of measurement uncertainty consists of four steps (see Fig. 1):

1. Specification

A clear statement of what is being measured is written down. This includes the relationship between the measurand and the parameters upon which it depends, and the scope of the measurement.
GUM makes an important remark about this step, which is quite often overlooked in the day-to-day business of calculating measurement uncertainty:

The measurand cannot be specified by a value but only by a description of a quantity. However, in principle, a measurand cannot be completely described without an infinite amount of information. Thus, to the extent that it leaves room for interpretation, incomplete definition of the measurand introduces into the uncertainty of the result of a measurement a component of uncertainty that may or may not be significant relative to the accuracy required of the measurement [1].

2. Identify uncertainty sources

A feasible approach to identify the uncertainty sources is as follows [2]:
A. Write down the complete calculation involved in obtaining the result, including all intermediate measurements. List the parameters involved.
B. Study the method, step by step, and identify any other factors acting on the result. Add these to the list. For example, ambient conditions such as temperature and pressure affect many results.
C. Consider factors which will affect the parameters identified in the previous two paragraphs and add them to the list. Continue the process until the effects become too remote to be worth consideration.
D. Resolve any duplicate entries in the list. Listing uncertainty contributions separately for every input parameter might result in duplications in the list. Three cases arise and the following rules should be applied to resolve duplication:
- Cancelling effects: remove both instances from the list.
Fig. 1 Procedure to evaluate the measurement uncertainty

- Similar effects, same time: check carefully whether similar effects are accounted for twice. In that case resolve the duplication.
- Similar effects, different instances: re-label.

3. Quantifying uncertainty components

Measure or estimate the size of the uncertainty associated with each potential source of uncertainty identified. The goal of the component by component approach is to find ways and tools to quantify each uncertainty component individually.

4. Calculate the combined standard uncertainty

The information obtained will consist of a number of
quantified contributions to overall uncertainty, whether
associated with individual parameters or with the com-
bined effects of several factors. The contributions
should be expressed as standard deviations, and com-
bined according to the appropriate rules to give a com-
bined standard uncertainty.
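For orientation, the "appropriate rules" are usually taken to be the law of propagation of uncertainty from the GUM. A sketch for uncorrelated inputs follows; the notation $y = f(x_1, \dots, x_n)$ with standard uncertainties $u(x_i)$ is assumed here for illustration and is not taken from the text above. The second expression is the special case of a model that contains only products and quotients of the inputs.

```latex
u_c(y) = \sqrt{\sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^{2} u^{2}(x_i)}
\qquad \text{and, for products and quotients only,} \qquad
\frac{u_c(y)}{y} = \sqrt{\sum_{i=1}^{n} \left( \frac{u(x_i)}{x_i} \right)^{2}}
```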

An example for evaluating the measurement uncertainty utilising the component by component approach

The evaluation process for the measurement uncertainty utilising the component by component approach is illustrated in the following textbook example:

1. Specification

The concentration of a hydrogen chloride solution (HCl) is determined by titration against freshly standardised sodium hydroxide solution (NaOH). It is assumed that the NaOH concentration is known to be of the order of 0.1 mol/l. The end-point of the titration is determined by an automatic titration system using a combined pH-electrode to measure the shape of the pH-curve.

Procedure
The measurement sequence to determine the HCl concentration has the following stages (Fig. 2):
1. Transfer an aliquot of 15 ml of HCl into the titration vessel using a bulb pipette.
2. Approximately 50 ml of ion-free water is added to the vessel and then titrated using the NaOH and the pH-curve is recorded. The end-point of the titration is determined from the shape of the recorded curve.

Fig. 2 Flow chart of the analytical procedure

Calculation

$$c_{HCl} = \frac{c_{NaOH} \cdot V_{Tit}}{V_{HCl}} \quad (\mathrm{mol/l})$$

$c_{HCl}$: concentration of the HCl solution (mol/l)
$c_{NaOH}$: concentration of the NaOH solution (mol/l)
$V_{Tit}$: titration volume of NaOH solution (ml)
$V_{HCl}$: aliquot of HCl titrated with NaOH solution (ml)

2. Identifying and analysing uncertainty sources

The aim of this step is to identify all major uncertainty sources and to understand their effect on the measurand and its uncertainty. This is best done by drawing a cause and effect diagram using the procedure described by Ellison et al. [2] (Fig. 3).

3. Quantifying uncertainty component by component

Within step 3 each uncertainty source identified in step 2 has to be quantified using relevant data and then converted to a standard uncertainty.

The concentration of the NaOH solution is 0.10215 mol/l with a standard uncertainty of 0.00009 mol/l.

$V_{HCl}$

1. Repeatability. The uncertainty due to variability in filling and delivery is determined as a standard deviation of 0.0037 ml.

2. Calibration. The uncertainty on the stated internal volume is given by the manufacturer as ± 0.02 ml. This value is transformed into a standard uncertainty assuming a triangular distribution: $0.02/\sqrt{6} = 0.0082$ ml.

3. Temperature. The effect of the temperature difference between the pipette calibration temperature and the laboratory environment can be calculated from an estimated temperature range and the coefficient of volume expansion.
Fig. 3 Cause and effect diagram displaying the effect of the different components. Branches leading to c(HCl): c(NaOH); the titration volume with End-point (Repeatability, Bias), Temperature, Calibration and Repeatability; V(HCl) with Repeatability, Calibration and Temperature

$$\frac{15\ \mathrm{ml} \cdot 2.1 \cdot 10^{-4}\ {}^{\circ}\mathrm{C}^{-1} \cdot 4\ {}^{\circ}\mathrm{C}}{\sqrt{3}} = 0.0073\ \mathrm{ml}$$

Combining the three contributions to the uncertainty $u(V_{HCl})$ of the volume $V_{HCl}$ gives a value of

$$u(V_{HCl}) = \sqrt{0.0037^2 + 0.0082^2 + 0.0073^2} = 0.012\ \mathrm{ml}$$

$V_{Tit}$

1. Repeatability of the volume delivery. The variability of the delivered volume of the piston burette is determined as a standard deviation of 0.004 ml.

2. Calibration. The limits of accuracy of the delivered volume are indicated by the manufacturer as ± 0.03 ml for a 20-ml piston burette. The standard uncertainty is calculated assuming a triangular distribution: $0.03/\sqrt{6} = 0.012$ ml.

3. Temperature. The uncertainty due to the lack of temperature control is calculated in the same way as for $V_{HCl}$: 0.0073 ml.

4. Repeatability of the end-point detection. The repeatability of the end-point detection is thoroughly investigated during the method evaluation under the given conditions and a standard uncertainty of 0.004 ml is found appropriate.

5. Bias of the end-point detection. During the method evaluation no indication for any bias was found, because a strong base (NaOH) is used to titrate a strong acid (HCl). This leads to a large change of the pH-values around the end-point, resulting in a very accurate determination of the value of the end-point.

$V_{Tit}$ is determined to be 14.89 ml and combining the four remaining contributions to the uncertainty $u(V_{Tit})$ gives a value of:

$$u(V_{Tit}) = \sqrt{0.004^2 + 0.012^2 + 0.0073^2 + 0.004^2} = 0.015\ \mathrm{ml}$$

The above section shows that even for a relatively simple textbook example the process to calculate the standard uncertainties for each of the components is time-consuming.

4. Calculate the combined standard uncertainty

The intermediate values, their standard uncertainty and their relative standard uncertainty are shown in Table 1.
Using the values obtained above:

$$c_{HCl} = \frac{0.10215 \cdot 14.89}{15} = 0.10140\ \mathrm{mol/l}$$

and

$$u(c_{HCl}) = 0.00016\ \mathrm{mol/l}$$
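The arithmetic of steps 3 and 4 can be retraced with a short script. This is only a sketch that reuses the numbers quoted in the example; the helper name `combine` and the variable names are illustrative assumptions, and the relative uncertainties are combined in quadrature because the model contains only a product and a quotient.

```python
import math


def combine(*components):
    """Root sum of squares of uncorrelated standard uncertainties."""
    return math.sqrt(sum(u * u for u in components))


# V(HCl): repeatability, calibration (triangular, +/- 0.02 ml), temperature
u_temp = 15 * 2.1e-4 * 4 / math.sqrt(3)          # rectangular, +/- 4 degC
u_v_hcl = combine(0.0037, 0.02 / math.sqrt(6), u_temp)

# V(Tit): delivery repeatability, calibration (triangular, +/- 0.03 ml),
# temperature (value quoted in the text), end-point repeatability
u_v_tit = combine(0.004, 0.03 / math.sqrt(6), 0.0073, 0.004)

c_naoh, u_c_naoh = 0.10215, 0.00009              # mol/l
v_tit, v_hcl = 14.89, 15.0                       # ml

c_hcl = c_naoh * v_tit / v_hcl
relative_u = combine(u_c_naoh / c_naoh, u_v_tit / v_tit, u_v_hcl / v_hcl)
u_c_hcl = c_hcl * relative_u

print(round(u_v_hcl, 3), round(u_v_tit, 3))
print(round(c_hcl, 5), round(u_c_hcl, 5))
```

Working with unrounded intermediate values, as here, reproduces the rounded figures quoted in the example.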

Table 1 The standard uncertainties and relative standard uncertainties of the components used to calculate the combined standard uncertainty

Symbol    Description                         Value           Standard uncertainty    Relative standard uncertainty
c_NaOH    Concentration of NaOH               0.10215 mol/l   0.00009 mol/l           0.0009
V_HCl     HCl aliquot for NaOH titration      15 ml           0.012 ml                0.0008
V_Tit     Volume of NaOH for HCl titration    14.89 ml        0.015 ml                0.001
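The entries of Table 1 can also be turned into relative variance shares, which show how much each parameter contributes to the combined result. The sketch below uses the rounded table values, so the percentages it produces may differ slightly from values derived from unrounded data; the function name is a hypothetical choice.

```python
def relative_variance_shares(components):
    """Percentage of the summed relative variance carried by each component.

    `components` maps a label to a (value, standard uncertainty) pair.
    """
    rel_var = {name: (u / value) ** 2
               for name, (value, u) in components.items()}
    total = sum(rel_var.values())
    return {name: 100.0 * rv / total for name, rv in rel_var.items()}


# Rounded values taken from Table 1.
table_1 = {
    "c(NaOH)": (0.10215, 0.00009),   # mol/l
    "V(HCl)": (15.0, 0.012),         # ml
    "V(Tit)": (14.89, 0.015),        # ml
}

shares = relative_variance_shares(table_1)
for name, share in shares.items():
    print(name, round(share, 1))
```

Because variances add while standard uncertainties do not, the shares sum to 100 % by construction, which is what makes them convenient for a histogram.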
Within step 4 one also wants to find out which of the relevant contributions should be re-evaluated. To accomplish this the relevant contributions first have to be found. One possible way to do this is to compare the relative variances of the different components in a histogram. It is better to compare the relative variances instead of the relative standard uncertainties, because they reflect more accurately their influence on the final result, i.e. the combined standard uncertainty (Figs. 4-6).

Fig. 4 Contribution of the main parameters to the overall uncertainty (relative variance [%]): c(NaOH) 33.1, V(HCl) 26.1, V(Tit) 40.8, c(HCl) 100

Fig. 5 Contribution of the components V(rep), V(cal) and V(temp) of $V_{HCl}$ to the overall uncertainty of a single parameter (variance [%])

Fig. 6 Contribution of the components V(rep;d), V(cal), V(temp) and V(rep;e) of $V_{Tit}$ to the overall uncertainty of a single parameter (variance [%])

The contribution of $V_{Tit}$ is the largest one according to the histograms in Fig. 4. The volume of NaOH for titration of HCl ($V_{Tit}$) itself is affected by four influencing quantities, which are the repeatability of the volume delivery, the calibration of the piston burette, the difference in temperature between the bench and the calibration of the burette, and the repeatability of the end-point detection. Checking the size of each contribution shows that the calibration is by far the largest one. Therefore this contribution has to be investigated more thoroughly, because its size is based on a rough estimate of the shape of the distribution function.
The standard uncertainty of the calibration of $V_{Tit}$ was calculated assuming a triangular distribution. The influence of the choice of shape of the distribution is shown in Table 2.

Table 2 The influence of the choice of shape of the distribution on the combined standard uncertainty

Distribution    Factor        u(V(T;cal))    u(V(T))     u_c(c_HCl)
Rectangular     $\sqrt{3}$    0.017 ml       0.019 ml    0.00018 mol/l
Triangular      $\sqrt{6}$    0.012 ml       0.015 ml    0.00016 mol/l
Normal          $\sqrt{9}$    0.010 ml       0.014 ml    0.00015 mol/l

It is not surprising that the combined standard uncertainty $u_c(c_{HCl})$ shows little effect from the choice of the distribution function of the largest influencing quantity. The contributions of the different components found during the evaluation can also be combined in different ways. One possibility is shown in Fig. 7, which indicates that the calibration of the experimental equipment (38.7%) is the largest contribution to the overall uncertainty. The uncertainty of the titer concentration of NaOH (33.1%) is around the same size, whereas the effect of the repeatability (8.3%) and the temperature influence (19.9%) are considerably smaller.

Fig. 7 Analysis of the effect of the different components

Similarities and differences between the two approaches

Similar components, such as repeatability or temperature dependence, are combined in Fig. 7. These components influence different parameters of the equation of the measurand, but combining them in this new way allows their overall effect on the measurement and the measurement uncertainty to be visualised. Employing this approach, the user of the data might gain a better understanding of the measurement process. This is another benefit of the evaluation of measurement uncertainty besides its major objective to obtain comparable results.
A closer look at the process of combining components shows that it is very similar to the grouping of influencing quantities in step 3 of the approach utilising validation data (Figs. 8, 9). In this approach the grouping of influencing quantities has to be done to accommodate the sometimes limited amount of information obtained during a validation study.

Fig. 8 Process of grouping the repeatability within step 3 of the evaluation process utilising validation data

Fig. 9 Grouped repeatability components: the repeatability contributions V(Tit;delivery), V(Tit;End-point) and V(HCl) are collected on a single Repeatability branch of the cause and effect diagram

GUM comments about combining individual components in its introduction [1]: "The actual quantity used to express uncertainty should be:
- internally consistent: it should be directly derivable from the components that contribute to it, as well as independent of how these components are grouped and of the decomposition of the components into subcomponents".

Some drawbacks of the component by component approach

The component by component approach has a few drawbacks, which mostly reduce the efficiency of the evaluation process. For example there are always components whose effect cannot be directly measured. In the above textbook example, the repeatability of the end-point determination is such a component. The variation of the end-point detection is always part of the overall repeatability of the experiment and there is no other means to directly determine its size in an independent measurement. In contrast the other two repeatability components, the variation of the two volume deliveries, have been determined independently in a series of ten delivery and weight experiments.
The determination of the different components in independent experiments increases the risk of neglecting correlation between these components in a given analytical procedure. The aim of experiments to determine the size of a single component is to reduce any other influences; in particular, one tries to avoid correlations. In a given routine analysis correlations between some components might occur. Therefore the relevance of these correlations has to be investigated if one wants to use the component by component approach correctly. In contrast, overall method performance parameters are determined during validation studies. These parameters already take into account most of the correlations between different components.
The drawbacks outlined above show that the utilisation of validation data can be more efficient than the component by component approach to evaluate measurement uncertainty. However, the former approach does not provide information about the relative size of the components; this information is important when it is necessary to develop the method further to reduce its uncertainty.
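The role of correlation can be made explicit with the general form of the law of propagation of uncertainty given in the GUM. The notation below is assumed for illustration: $c_i = \partial f / \partial x_i$ are sensitivity coefficients and $r(x_i, x_j)$ is the correlation coefficient between two input estimates. The second sum is the part that is lost when components quantified in separate experiments are treated as independent.

```latex
u_c^{2}(y) = \sum_{i=1}^{n} c_i^{2}\, u^{2}(x_i)
  + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} c_i\, c_j\, u(x_i)\, u(x_j)\, r(x_i, x_j)
```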

References

1. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva, Switzerland
2. Ellison SLR, Barwick VJ (1998) Accred Qual Assur 3:101-105
Accred Qual Assur (1998) 3:145-149
© Springer-Verlag 1998

Petras Serapinas

Uncertainty - statistical approach, 1/f noise and chaos

Abstract Some general reasons for poor applicability of the statistical approach based on approximation of normal data distribution to interlaboratory test results and analytical measurements at high data dispersion are considered. They include asymmetry of the concentration scale, low-frequency noise, and nonlinear phenomena in atomization processes and chemical reactions. The relationship of 1/f noise and nonlinear phenomena to uncertainty balance, experimental verification of the assigned uncertainty value, ruggedness tests and statistical data distribution are briefly discussed.

Key words Uncertainty · Interlaboratory test · Noise · Data distribution · Nonlinear phenomena

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29-30 September 1997

P. Serapinas
Plasma Spectroscopy Laboratory, Institute of Theoretical Physics and Astronomy, A. Gostauto 12, 2600 Vilnius, Lithuania
Tel.: +370-2-616018
Fax: +370-2-225361
e-mail: pserapin@itpa.lt

Introduction

Attention to uncertainty in chemical analysis is mainly for two reasons: (a) uncertainty is an indispensable part of any test results; and (b) the dispersion between measurement results of different analytical laboratories is usually much greater than that between the results of experiments repeated by one laboratory. There is a tendency to explain this difference as inherent in the uncertainties of the individual results, and the need to consider the full list of uncertainty sources and their contributions is usually stressed [1-3]. Nevertheless, the interlaboratory character of the final experimental uncertainty test remains a problem, and gaps exist not only between repeatability and reproducibility of RSD figures but also in links between the two concepts. Repeatability and reproducibility definitions [4] concentrate on the difference between these, but more detailed understanding of the connections between them is also needed. The relationship of uncertainty to 1/f noise, oscillations and chaos in chemical reactions seems to be of importance in the context and is considered in the paper.
Gaps also exist in the preparation of methods and the study of correspondence between the assessed contributions of the identified sources of uncertainty and their direct manifestation in real measurement sets. Moreover, at least some comparatively large uncertainty sources seem to be the typical situation (if only one main source is identified, it can usually be reduced until it approximates the group of comparatively intensive contributions). So, essential tasks seem to include the development of means to obtain information on uncertainty in the measurement process itself as well as the application of this information for uncertainty reduction.

On asymmetry of the data distribution

Applicability of statistical control is a general characteristic of analytical methods. For this, analytical data distributions ought to be normal. Large data sets are needed for detailed examination of the data distributions. Often, an obvious asymmetric character of the distribution manifests itself, indicating deviations from the normal distribution. This is almost the usual state of
affairs for large data dispersions, e.g. in interlaboratory tests. Nevertheless, the very definition of uncertainty implies at least the nearly symmetric distribution of values about the measured result (see e.g. [5]).
The applicability of other data distributions, such as the logarithmically normal, for analytical data is also discussed, but the procedures currently used in analytical data analysis (e.g., homogeneity, outlier tests etc. [6, 7]) have mainly been developed for normal data distributions. Asymmetry can be due to contamination and other errors, but one general reason is asymmetry of the concentration scale: zero acts as the lower limit and there is no upper limit. According to our analysis, the distribution peak position in relation to the mean value and the symmetry of the data scatter of the interlaboratory test data can be improved essentially if the symmetric scale of logarithms of concentration $(-\infty, +\infty)$ instead of the concentration itself is used. The difference is important at large data scatters only, and all the data scales are almost symmetric if the data dispersion is small enough. So the problem can be formulated as follows: can scales be found in which the applicability of normal distribution would be wider than in the linear scale?
In some cases, the applicability of the log-normal distribution has been strongly contested (e.g. at the detection limit [8]). Even in these cases, some arguments seem to be reformulated. Of course, no distribution is universal (the normal one as well), and if more suitable types of distribution (or data scales) exist, these ought to be applicable only to some definite types of measurement data. Concerning data distributions in interlaboratory tests, more extensive analysis of the interlaboratory data distributions from different test schemes ought to be carried out. Better applicability of the normal distribution could enable wider fitness of the statistical approach and wider possibilities for validation of the analytical methods. Besides this, possibly some of the results at the large concentration "tail" could be regarded as normal. Estimation of uncertainty in the case of other distributions, including the asymmetric ones, also needs attention.

The 1/f noise approach to understanding bias, error functions and uncertainty reduction possibilities

Repeatability standard deviation of the measurements with the same apparatus at identical conditions is usually limited by low-frequency noise. This noise is $1/f^{\alpha}$ frequency dependent. Often $\alpha$ is close to 1, and the noise is known as 1/f noise (in parallel, the term flicker noise is used). The phenomenon, while not fully understood (see e.g. [9]), takes part in many natural, physical, chemical, social, economical etc. phenomena and is of special interest when accuracy of the measurements is considered. In analytical measurements, the measurement time usually exceeds the time at which the 1/f noise, which increases with time, becomes comparable to other noises. Usually this takes place in the order of seconds.
Long-lasting measurement sets seem to be characteristic of chemical analysis. Quality control charting [10], interlaboratory studies [6, 7, 11], and applications of the certified reference materials are typical examples. In spite of this, the concept of 1/f noise is hardly ever applied to analytical measurements with large time intervals. The following question in relation to 1/f noise is considered in this paper: is the 1/f noise concept applicable to the long-term variations of analytical data, where the information on the characteristic variation time or variation time dependences can be of importance?
The result of an attempt to measure the low-frequency noise spectrum from AAS soil analysis quality control data lasting for almost two years is presented in Fig. 1. Fourier analysis of the traditional quality control data time series was carried out. It follows that 1/f noise seems to be a useful approximation for such a low-frequency spectrum. In addition to the common 1/f character of the noise some characteristic frequencies at about 1/(25 days) and 1/(2 months) are observed.
The example presented seems to be of interest from different points of view. First of all, characteristic noise frequencies or time constants of the main slowly varying uncertainty sources are displayed. This is an additional, parallel source of information compared to the ruggedness tests and uncertainty evaluation, and can be used to confirm, support or supplement the obtained conclusions or to control the effectiveness of uncertainty reduction procedures. Not only the characteristic frequencies but the contributions of the corresponding uncertainty sources to the combined uncertainty can be obtained as integrals of the noise spectral density in the characteristic frequency ranges.

Fig. 1 Noise spectrum of the AAS quality control analytical results: solid line from measured results for the period 1995.01-1996.09, dashed line 1/f dependence (horizontal axis: frequency, 1 div = $3.6 \cdot 10^{-8}$ Hz, with marks at 1/year, 1/month and 1/(10 days))
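The integral of the spectral density over a frequency band has a closed form when the spectrum is purely 1/f. The sketch below assumes a spectral density S(f) = A/f with an arbitrary amplitude A (an idealisation that leaves out the excess peaks of a measured spectrum) and compares the variance contributed by two time-scale ranges; the function and constant names are hypothetical.

```python
import math


def band_variance_one_over_f(t_short_s, t_long_s, amplitude=1.0):
    """Integral of S(f) = amplitude / f between 1/t_long_s and 1/t_short_s.

    The integral equals amplitude * ln(f_high / f_low), so only the ratio
    of the two time scales matters, not their absolute values.
    """
    f_low = 1.0 / t_long_s
    f_high = 1.0 / t_short_s
    return amplitude * math.log(f_high / f_low)


SECOND = 1.0
HOUR = 3600.0
HALF_YEAR = 0.5 * 365.25 * 24 * HOUR

fast_band = band_variance_one_over_f(SECOND, HOUR)      # 1 s to 1 h
slow_band = band_variance_one_over_f(HOUR, HALF_YEAR)   # 1 h to half a year

print(fast_band, slow_band, slow_band / fast_band)
```

Because the result depends only on the logarithm of the ratio of the limits, bands spanning a similar number of decades contribute similar variance, however different their absolute durations are.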
It is easy to understand that the number of such long measurement data sets available from an individual laboratory is restricted. More extensive and profound studies can hardly be carried out without interlaboratory collaboration. Comparison of similar data from different laboratories would allow one to distinguish experimentally between the uncertainty sources characteristic of the method and those of individual laboratories.
The integral of the noise spectrum up to the frequencies comparable to the inverse characteristic time of the individual measurement would contain the full uncertainty due to all sources, representing the transition from one measurement to the long time scale. Of course, the integration frequency range is a problem. The high-frequency limit of the order of the inverse time of the individual measurement seems to be natural. The low-frequency limit, the longest time scale, depends on the practical problems being considered. In any case, we ought to take into account that if the measurement time being regarded is comparable with the characteristic time of some uncertainty source, such a source will reveal itself only as a tendency, trend, or bias. If the measurement lasts for an essentially longer time, such a source will manifest itself as an error function. These circumstances represent a more rigid, mathematical, basis to the bias-error relativity [5] problem and could be important when the collaborative trial repeatability and reproducibility data are subjected to uncertainty and bias estimates [12].
If the noise spectrum is regarded as information for assessment of the integral uncertainty, in spite of the spectral density increment at low frequencies, an accurate value of the lowest frequency is not important. As an example, a time interval of between 1 and 100 years covers a frequency range of only about $3 \cdot 10^{-8}$ Hz, while time intervals from 1 year to 1 min cover about $2 \cdot 10^{-2}$ Hz. So, for example, contributions to uncertainty for pure 1/f noise from the 1 s to 1 h and the 1 h to half-year time intervals would be comparable.
Probably the low-frequency noise studies need too much data for routine uncertainty estimates. It is almost accepted that the total uncertainty can be assessed as the combined effect of the identified uncertainty sources. But the situation is possible when it is mainly due to a large number of comparatively weak sources, and experimental uncertainty measurement from data dispersion or a noise spectrum would be highly desirable in such cases.
Another problem is how far the uncertainty can be reduced. Many methods can be used for such reduction. The most straightforward procedure seems to be the "bottom-up" assessment of the contributions from all the uncertainty sources and modification of the analytical method to eliminate the largest of them. As was discussed earlier, the noise spectrum studies can aid such identification and assessment as well. If further elimination of the uncertainty sources is not possible, multichannel measurements can be used to monitor the variations of the important conditions of the experiment (see e.g. [13]). If such information is available, many methods can be used to exclude the effect of the varying experimental conditions from the measurement results through determination of the relations between those variations and variations of the analytical signal. Cross-correlation measurement between the final signal and the variations of the experimental conditions is a very sensitive tool to ascertain how complete the exclusion was. How far can we go in such a process? If the 1/f noise is, as we understand till now, really a general phenomenon due to a large number of effects, this component can hardly be reduced significantly, while the characteristic (excess) noise components (see Fig. 1) ought to be reducible. For the case represented in the figure (the accuracy of the spectrum itself is still under study), contributions of the two noises to the uncertainty seem to be comparable, and uncertainty reduction by a factor of about 2 can be expected, and only fundamental changes to the system (including preparation of extremely low-noise methods as a special case) could result in a different 1/f noise level.

Nonlinear phenomena - oscillations and chaos

Hitherto, stochastic (or random) fluctuations have been considered. Sometimes special, quasi-oscillatory variations of the analytical signal are observed. The amplitude of the oscillations is comparable with the analytical signal itself, approaching two or more possible values of the signal. Variations of the integral analytical signal depend on both the signal intensity dynamics and the number of the spikes observed. Because of the limited number of such spikes, some special values of the integral analytical signal are preferable, and the distribution of this quantity would not be normal in this case. A well-known example of such variations is the spikes in graphite furnace atomic absorption measurements (e.g. [14]). In the still-continuing discussion on the detailed mechanism of this phenomenon, processes being considered usually include autocatalytic, essentially nonlinear, reactions.
More or less similar signal-time dependences seem to follow analytical sample transformations more often than was traditionally expected. Arc discharge is well known as a radiation source susceptible to instabilities. An example of the quasi-oscillatory radiation intensity time dependence from original measurements in a carbon arc is presented in Fig. 2. Two repetitive intensity peaks were very often observed and were traditionally attributed to different volatilities of the sample components or reaction products or different volatilization mechanisms. The studies reported in [15] were possibly

the first to take notice of the oscillatory character of the process. It seems characteristic that often the quantity of the sample material involved is too small (or the observation time is too short) for the dynamical structure of the sample transformations to be explicitly displayed, and only one or two peaks are observed. In the atomization of solids and slurries such a situation can be expected more often now than hitherto [16].

Fig. 2 Two individual Cu 261.8-nm spectral line intensity time dependences. Spiked multielemental oxide sample atomization in carbon arc: current 20 A, sample 15 mg, analyte concentrations 1 mg/g. Data points are means from 14 measured time signal values. (Horizontal axis: Time, s, from 0 to 300.)

The spike phenomenon, while interesting in itself, reduces repeatability and needs special attention in organization of measurements, but can be considered in the context of the common uncertainty assessment. The essential feature seems to be the limited reproducibility or irreproducibility of the time scenario or even the characteristic time constants observed in measurements similar to those of Fig. 2. From the mathematical point of view, if at least three reactions or processes, one of them being nonlinear, are involved, it can result in chaotic dynamics of the phenomenon in the sense that very small variations of the initial conditions can change the character of the phenomenon as a whole. Such numbers of reactions may well be the rule rather than the exception in the atomization of multielement samples. Thus, competition between different reaction mechanisms, autocatalytic and other nonlinear atomization processes, seem to be possible causes of oscillations and chaos in the analytical chain.

One of the early basic principles of analytical chemistry was to allow the reactions providing the analysis results to proceed to completion. In the fast modern methods of sample volatilization, this principle seems hardly to be maintained. Also, observation of only the initial stages of the process cannot improve the situation: the observed analytical signals depend on the character and reproducibility of the process.

Because of the high sensitivity of the causes of nonlinear phenomena to the initial and dynamic characteristics of the system, such phenomena, especially in interlaboratory tests, can be appreciable sources of uncertainty that are difficult to assess. Of course, even in chaotic phenomena, limits of variation of the signal exist, but the phenomenon as it affects analytical applications is hardly studied and can hardly be properly accounted for in certain cases. Understanding of the problem as such is in progress: "To date, chemometrics has dealt with systems as deterministic or random, yet many chemical systems behave chaotically" [17]. Studies of nonlinear phenomena should help in understanding atomization mechanisms, revealing links between the reproducibility, repeatability and ruggedness tests, and should present additional information for the assessment of balance and reduction of uncertainty.

Conclusions

The problems of uncertainty understanding and assessment in the range of small (a few per cent) and larger uncertainties seem to be different. Experimental on-line uncertainty analysis should be of interest in both cases. Comprehensive information inherent in interlaboratory tests and quality control measurements could (and possibly should) be more effectively gained, summarized and used in method development. Preliminary results show both the 1/f and characteristic noises in the quality control data. Attention to the nonlinear phenomena including chaos seems to be essential in method preparation and performance studies. From this point of view, problems in uncertainty understanding and reduction remain, and more intensive and in-depth interlaboratory collaboration would be highly desirable for achieving faster progress.

Acknowledgements The author thanks Dr. J. Lubyte, Agrochemical Research Center, Kaunas, for presenting quality control data and the Regional Programme on Quality Assurance PRAG-III for financial support for the participation at the 2nd Workshop "Measurement Uncertainty in Chemical Analysis - Current Practice and Future Directions".

References

1. Eurachem (1995) Quantifying Uncertainty in Analytical Measurement (1st edn). Laboratory of the Government Chemist, London
2. Horwitz W (1997) VAM Bulletin no. 16:5-6
3. Williams A (1997) VAM Bulletin no. 16:6-7
4. ISO (1993) International Vocabulary of Basic and General Terms in Metrology. International Organization for Standardization, Geneva
5. Analytical Methods Committee (1995) Analyst 120:2303-2308
6. Horwitz W (1988) Pure Appl Chem 60:855-864
7. Sutarno R (1993) Procedure for statistical evaluation of analytical data resulting from international tests. ISO TC 102 N 458
8. Thompson M, Howarth RJ (1980) Analyst 105:1188-1195
9. Hooge FN (1997) In: Claeys C, Simoen E (eds) Noise in physical systems and 1/f fluctuations, Proc 14th Int Conf, World Scientific, Singapore New Jersey London Hong Kong, pp 3-10
10. Howarth RJ (1995) Analyst 120:1851-1873
11. Thompson M, Wood R (1993) Pure Appl Chem 65:2123-2144
12. Ellison S (1997) In: Measurement uncertainty in chemical analysis - current practice and future directions, Proc 2nd Workshop. BAM, Berlin
13. Oberauskas J, Serapinas P, Salkauskas J, Svedas V (1981) Spectrochim Acta B 36:799-807
14. L'vov BV (1996) Spectrochim Acta B 51:533-541
15. Katasus Portuondo MR, Petrov AA, Sheinina GA (1980) Zh Prikl Spektr 33:19-24
16. Jackson KW, Chen G (1996) Anal Chem 68:243R-244R
17. Brown SD, Sum ST, Despagne F, Lavine K (1996) Anal Chem 68:23R
Accred Qual Assur (2002) 7:153-158
DOI 10.1007/s00769-002-0440-8

© Springer-Verlag 2002

Kaj Heydorn
Thomas Anglov

Calibration uncertainty

Abstract Methods recommended by the International Standardization Organisation and Eurachem are not satisfactory for the correct estimation of calibration uncertainty. A novel approach is introduced and tested on actual calibration data for the determination of Pb by ICP-AES. The improved calibration uncertainty was verified from independent measurements of the same sample by demonstrating statistical control of analytical results and the absence of bias. The proposed method takes into account uncertainties of the measurement, as well as of the amount of calibrant. It is applicable to all types of calibration data, including cases where linearity can be assumed only over a limited range.

Keywords Metrology in chemistry · Calibration · Uncertainty · Traceability · Verification

Presented at the 10th International Metrology Congress, 22-25 October 2001, Saint Louis, France

K. Heydorn
Department of Chemistry, Technical University of Denmark, 2800 Lyngby, Denmark
e-mail: heydorn@kemi.dtu.dk
Tel.: +45-4525-2342
Fax: +45-4588-3136

T. Anglov
Department of Metrology, Novo Nordisk A/S, Krogshøjvej 51, 2880 Bagsværd, Denmark

Introduction

In chemical analysis, the first and most essential link in the traceability chain is the calibration of the measurement system with known calibrants. The uncertainty from this calibration is therefore a most important uncertainty component, which is sometimes the largest contribution to the combined uncertainty of the analytical result.

Both classical [1] and contemporary [2] guidelines present methods intended for the determination of calibration uncertainty, and many of their recommendations complement each other. The application of these methods to actual calibration data should lead to a correct estimation of the contribution of calibration to the uncertainty budget. However, neither the classical nor the contemporary approach are capable of extracting the maximum information from the data, corresponding to the minimum uncertainty of the calibration; a new approach is therefore warranted.

We have applied a novel combination of classical and contemporary methods to actual calibration data for the determination of Pb in aqueous solution by ICP-AES in the concentration range of 0-10 mg/L. The resulting uncertainty budget was used to test the linearity of calibration using the T-statistic originally developed for the Analysis of Precision [3]; when the deviations of the calibration points from the straight line relationship are completely attributable to the uncertainty of the measurement, the value of T is closely approximated by a chi-square distribution with n-2 degrees of freedom.

Verification of the resulting uncertainty budget can be carried out by the analysis of a particular sample by a number of independent operators, each using their own calibration data. With these data in statistical control, the new method for estimating calibration uncertainty is well suited for propagation of uncertainty in accordance with the BIPM philosophy [4].

Materials and methods

All measurements were carried out by atomic emission spectrometry with an inductively coupled plasma unit for

excitation of the sample. The ICP-AES instrument was a Perkin-Elmer model Plasma II emission spectrometer.

Calibrants

All samples were prepared by dilution of a Merck certified lead standard with a nominal concentration of 1000 mg/L. Aliquots of the standard were taken with a calibrated pipette and diluted with a mixture of hydrochloric and nitric acids in calibrated volumetric flasks.

Pure acid mixture was used as a blank, and calibration samples were prepared with nominal concentrations of 0.25, 1.00, 2.00, 5.00, and 10.00 mg/L. A nominal 1.5 mg/L sample was chosen as an unknown. Samples were labelled A, B ... G in random order.

Uncertainty budgets prepared in accordance with Example A1 in Ref. [2] provided an estimated relative standard uncertainty of 0.22% for the dilution process.

Measurements

Ten people working together two and two carried out all measurements in one afternoon. Each participant measured all samples in alphabetic order, and the instrument was zero-adjusted at the beginning of each series.

Readings in arbitrary units - counts - are presented in Table 1 together with the mean and standard deviation for each level.

The calibration data - Levels A to F - were tested in accordance with the procedure recommended by ISO [1] as Example 3. No outliers were detected. In addition we used ANOVA to detect any influence from the order of measurement or from a possible difference between the teams; no such effects were found.

The variability of measurements expressed as standard deviations - SD - are therefore assumed to depend only on the measurement level.

Results and discussion

The first step in predicting the uncertainty of calibration is to establish the relationship between the standard deviation and the level of the signal, and this is dealt with in paragraph 7.5 in [1]. Three types of functional relationship are considered for the reproducibility standard deviation, $s_r$, and the mean level, $m$

I   $s_r = bm$, where b is the slope
II  $s_r = a + bm$, where a is the intercept
III $s_r = Cm^d$, $d \le 1$, where d is the logarithmic slope

Simple linear regression analysis may be used for the determination of the parameters a and b, and from the logarithm of expression III

$$\log s_r = \log C + d \cdot \log m$$

we can determine in the same way log C as the intercept a, and d as the slope b.

These expressions are mathematically simple, but make no physical or chemical sense. Much more sense is found in Ref. [2] Appendix E4, which reflects the normal situation that the standard deviation is independent of the result, x, at low signal levels and proportional at high levels

$$\text{E4:}\quad u(x) = \sqrt{s_0^2 + (x \cdot s_1)^2} \qquad (1)$$

As proposed in paragraph E.4.5.2 in Ref. [2] linear regression of $u(x)^2$ on $x^2$ can be used to determine $s_0$ as the square root of the intercept, and $s_1$ as the square root of the slope.

Relationships found by simple linear regression

The quality of these representations can be judged by the statistic T [3], which compares the deviations of the observed values from the calculated values with the uncertainty of the observations

$$T = \sum_{1}^{n} \frac{(\text{observed} - \text{expected})^2}{(\text{standard uncertainty})^2} \qquad (2)$$

Table 1 Readings obtained by 10 participants for 6 reference solutions of Pb by ICP-AES

Team  Level A  Level B  Level C  Level D  Level E  Level F  Level G
1     1074.0   159.6    2855.6   573.5    13.1     5780.5   830.3
1     1177.9   157.7    2946.1   564.7     9.3     5668.6   888.7
2     1109.3   146.6    2824.1   567.4     9.0     5736.9   875.4
2     1180.3   139.7    2829.7   568.1     7.7     5824.4   883.0
3     1091.9   149.6    2828.4   556.6    13.1     5741.9   860.8
3     1158.1   129.0    2861.0   573.5    -3.3     5667.9   864.4
4     1084.8   145.4    2837.4   568.7    14.9     5673.4   864.9
4     1137.0   151.3    2875.5   555.3    26.2     5794.3   861.2
5     1115.4   147.9    2864.5   560.9    -1.9     5709.5   861.7
5     1145.4   155.5    2920.7   547.4     6.9     5628.7   848.0
Mean  1127.4   148.2    2864.3   563.6     9.5     5722.6   863.8
SD      38.2     9.0      40.7     8.5     8.4       63.9    16.7

Table 2 Test for adequacy of simple functional relationships for representation of the uncertainty of a reading

       SD      u(SD)   Linear regression          Log/log regression      sqr/sqr regression
Level  counts  counts  Type I or II  Sq residual  Type III  Sq residual   E4.5.2  Sq residual
A      38.2     9.0    21.7           3.35        24.8       2.20         23.0     2.85
B       9.0     2.1    12.1           2.01        12.8       3.18         19.6    24.29
C      40.7     9.6    38.8           0.04        33.6       0.55         36.5     0.19
D       8.5     2.0    16.2          14.82        19.8      32.33         20.4    35.81
E       8.4     2.0    10.7           1.32         5.3       2.53         19.5    31.10
F      63.9    15.1    66.9           0.04        42.1       2.10         64.7     0.00
G      16.7     3.9    19.1           0.37        22.8       2.35         21.6     1.53
                                     T=21.94                T=45.24               T=95.77
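
The entries of Table 2 can be checked with a few lines of Python. The sketch below is only an illustration, not the authors' spreadsheet: it takes the rounded SD values and the "Type I or II" calculated values printed in Table 2, applies u(SD) = SD/sqrt(2(m-1)) for m = 10 replicates, and sums the normalised squared residuals into T. The function names `u_sd` and `t_statistic` are hypothetical helpers introduced here.

```python
import math

# SDs (counts) for levels A..G, copied from Table 1 (m = 10 readings per level).
sd = {"A": 38.2, "B": 9.0, "C": 40.7, "D": 8.5, "E": 8.4, "F": 63.9, "G": 16.7}
# Calculated values from the "Type I or II" column of Table 2 (rounded as printed).
fitted = {"A": 21.7, "B": 12.1, "C": 38.8, "D": 16.2, "E": 10.7, "F": 66.9, "G": 19.1}
M_REPLICATES = 10


def u_sd(s, m=M_REPLICATES):
    """Standard uncertainty of a standard deviation estimated from m replicates."""
    return s / math.sqrt(2 * (m - 1))


def t_statistic(observed, expected, uncertainty):
    """Sum of squared residuals, each normalised by its squared standard uncertainty."""
    return sum(((o - e) / u) ** 2 for o, e, u in zip(observed, expected, uncertainty))


levels = sorted(sd)
u = [u_sd(sd[k]) for k in levels]
T = t_statistic([sd[k] for k in levels], [fitted[k] for k in levels], u)
relative_u = 1 / math.sqrt(2 * (M_REPLICATES - 1))
print(f"relative u(SD) = {relative_u:.1%}, T = {T:.1f}")
```

Because the inputs are the rounded table entries, the computed T should land close to, but not exactly on, the printed value of 21.94.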

and which closely follows a chi-squared distribution with n-2 degrees of freedom, when two parameters are estimated from the observations.

In the set of data in Table 1 we have n = 7 levels, and the uncertainty of each SD from m = 10 replicates is calculated from

$$u(SD) = \frac{SD}{\sqrt{2(m-1)}} \qquad (3)$$

in good agreement with the value of 24% presented in Table E.1 in Ref. [4].

In Table 2, results for SD from Table 1 are presented together with their uncertainties at each level, and calculated values are shown for expressions I to III and E4 together with the normalized, squared residuals, contributing to the value of T. The intercept a is not significantly different from 0, which means that expression II is not significantly better than I, although it makes more physical sense.

However, at 5 degrees of freedom all values of T are significant at the 0.1% level, which means that none of the expressions are capable of predicting a correct standard deviation from the mean level.

Fig. 1 Standard deviations ± their standard uncertainties according to Eq. 3 as a function of the level. The curve is drawn according to Eq. 4 with the parameters from Table 4 obtained by non-linear regression. (Plot title: ln u(x) versus ln x; horizontal axis: ln x.)

Relationships found by alternative regression methods

Conditions for estimating parameters by simple linear regression are not strictly fulfilled. First of all the independent variable is not without uncertainty; in fact its uncertainty is approximately $\sqrt{2}$ times larger than that of the dependent variable. However, b is usually below 0.1, which means that variations in x have very little influence on u(x).

More important is that the uncertainty of the dependent variable SD is not independent of the value of the variable: in fact it is proportional to this variable, as indicated in Eq. 3.

There are two methods for alleviating this problem:
1. to resort to weighted linear regression to determine slope and intercept
2. to use log u(x) as the dependent variable.

1. Weighted linear regression is carried out as described in paragraph 7.5 in Ref. [1] using initial weights, $w_i$, equal to $u(SD)^{-2}$ for relationships I and II. Three iterations were needed to obtain a stable solution. For relationship III, $w_i$ equals $u(\log SD)^{-2}$, which is independent of SD, and therefore degenerates to simple linear regression.

In the linear regression of $u(x)^2$ on $x^2$ the initial weight of $u(x)^2$ is taken as the squared reciprocal of

$$u\left(u(x)^2\right) = u(x)^2 \cdot \sqrt{\frac{2}{n-1}}$$

In this case we needed 4 iterations to obtain the stable solution shown in Table 3.

2. With log SD = log u(x) as the dependent variable all results have the same weight in the transformed Eq. (1)

$$\log u(x) = 0.5 \log\left(s_0^2 + (s_1 \cdot x)^2\right) \qquad (4)$$

but the parameters have to be found by non-linear regression [5].

This may be carried out by means of the Solver program that can be accessed from the usual Microsoft Excel spreadsheet; the starting values were taken from the results obtained by weighted quadratic regression.

The quality of the resulting functional approximations is shown in Table 3, which presents the value of T calculat-

Table 3 Test for goodness of fit of functional relationships for representation of the uncertainty of a reading

       SD      u(SD)   Linear weighted       sqr/sqr weighted      Non-linear weighted
Level  counts  counts  Type II  Sq residual  E4.5.2  Sq residual   E4.2.1  Sq residual
10      8.4     2.0     7.9     0.08           8.4   0.00           8.4    0.00
148     9.0     2.1     9.6     0.06           8.9   0.01           8.7    0.02
564     8.5     2.0    14.7     9.88          13.5   6.38          12.1    3.28
864    16.7     3.9    18.5     0.20          18.2   0.15          15.7    0.06
1127   38.2     9.0    21.7     3.33          22.7   2.94          19.3    4.41
2864   40.7     9.6    43.3     0.07          54.3   2.02          44.9    0.19
5723   63.9    15.1    78.8     0.97         107.6   8.41          88.4    2.65
                               T=14.60              T=19.90               T=10.61

Table 4 Absolute and relative contributions to the standard deviation determined by different types of weighted regression

Type of regression   linear/linear  squared/squared  non-linear
Absolute value s0    7.75           8.42             8.43
Relative value s1    0.0124         0.0187           0.0154
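
For readers without Excel Solver, the non-linear fit of the logarithmic form of Eq. (1) can be imitated with a brute-force grid search. This is a minimal sketch under stated assumptions, not the authors' procedure: the search ranges and step sizes are arbitrary choices made here, the inputs are the rounded means and SDs from Table 1, and the helper names `model` and `log_sse` are hypothetical.

```python
import math

# Mean level (counts) and SD (counts) for the seven levels, from Table 1.
level = [9.5, 148.2, 563.6, 863.8, 1127.4, 2864.3, 5722.6]
sd = [8.4, 9.0, 8.5, 16.7, 38.2, 40.7, 63.9]


def model(x, s0, s1):
    """u(x) = sqrt(s0^2 + (x*s1)^2): constant at low level, proportional at high level."""
    return math.sqrt(s0 ** 2 + (x * s1) ** 2)


def log_sse(s0, s1):
    """Unweighted sum of squared residuals of log u(x)."""
    return sum((math.log(s) - math.log(model(x, s0, s1))) ** 2
               for x, s in zip(level, sd))


# Exhaustive grid search; ranges and steps are illustrative assumptions.
best_sse, s0_fit, s1_fit = float("inf"), None, None
for i in range(141):            # s0 from 5.00 to 12.00 counts
    s0 = 5 + 0.05 * i
    for j in range(251):        # s1 from 0.0050 to 0.0300
        s1 = 0.005 + 0.0001 * j
        sse = log_sse(s0, s1)
        if sse < best_sse:
            best_sse, s0_fit, s1_fit = sse, s0, s1

# Goodness of fit in the original units, with u(SD) = SD / sqrt(2(m-1)), m = 10.
T = sum(((s - model(x, s0_fit, s1_fit)) / (s / math.sqrt(18))) ** 2
        for x, s in zip(level, sd))
print(f"s0 = {s0_fit:.2f}, s1 = {s1_fit:.4f}, T = {T:.2f}")
```

The fitted parameters should fall near the non-linear column of Table 4; exact agreement is not expected because of the grid resolution and the rounded inputs.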

ed according to Eq. (2). With 5 degrees of freedom the critical values for T are 15.1 for p < 1% and 11.1 for p < 5%, which means that neither of the weighted regressions give satisfactory agreement with the observations. Only the non-linear - unweighted - fit is acceptable at the 5% level of significance.

Moreover, the actual deviations seen in the last column of Table 3 are much more evenly distributed over the range than in the other two cases, which confirms that the non-linear case is the best overall representation of the data from Table 1. This is confirmed by the logarithmic plot of the standard deviation as a function of the level shown in Fig. 1.

With the ready availability of non-linear regression to users of Microsoft Excel the computational effort is probably less than that associated with the iterative, weighted regression. Consequently we are going to use the non-linear results presented in Table 4 for estimating the uncertainty of our calibration.

Linearity of calibration function

The calibration function expresses the relationship between the known concentrations of Pb in the reference solutions A to F and the readings of the instrument; to the extent that the superposition principle is observed this function is a straight line. However, deviations are expected to occur at higher levels of the indicator, so that the linearity range becomes restricted.

Instrument calibration is found by weighted linear regression of the mean readings using the reciprocal square of their respective uncertainties as the initial weight and Eq. (1) to calculate weights for subsequent iterations. The linearity is tested by the value of T from Eq. (2) with the standard uncertainties calculated from Eq. (1) using the original data, and results are presented in Table 5. Simple linear regression gave in all cases a less satisfactory fit.

It is concluded that a perfectly satisfactory linear calibration is maintained up to approx. 2 mg/L, which means that the observed deviations from the linear relationship are fully accounted for by the assumed uncertainty of the readings. Thus no additional uncertainty contribution needs to be included in the uncertainty budget.

The uncertainty budget

The uncertainty budget for the determination of Pb by ICP-AES is reduced to

$$u(y) = \sqrt{s_0^2 + (y \cdot s_1)^2}$$

with $s_0 = 8.43$ and $s_1 = 0.0154$, where y is the reading. The conversion of a reading to a concentration is linear up to at least 2 mg/L, and the additional uncertainty of 0.22% is negligible in comparison with u(y), which is always larger than 1.5%.

Experimental verification of the budget is based on the separate determination by each of the participants of the concentration of Pb in the unknown sample G from Table 1. With the uncertainty budget above each participant will calculate a result and its uncertainty from his own data alone.

The participant may choose to assume a linear calibration and use all his observations to arrive at a result, or the participant may calculate the result by interpolation only between the two calibration points bracketing the unknown.

Results based on linear calibration

Each participant determines his own calibration data from his readings in Table 1 of the reference solutions A

Table 5 Tests for linearity of calibration based on weighted linear regression over a decreasing range

Reference  Mean    Uncertainty  Linear weighted           Linear weighted           Linear weighted
mg/L       counts  u(counts)    calibration  Sq residual  calibration  Sq residual  calibration  Sq residual
0             9.5   2.7            5.0       2.89            6.3       1.47            8.6       0.11
0.248       148.2   2.9          147.1       0.18          147.6       0.05          148.3       0.00
0.992       563.6   2.7          573.4       6.54          571.5       4.31          567.2       0.87
1.979      1127.4  12.1         1138.9       3.57         1134.0       1.16         1122.9       0.54
4.956      2864.3  12.9         2844.7       1.90         2830.3       5.73
9.88       5722.6  20.2         5666.1       4.08
                                4 d.f. ***                3 d.f. **                 2 d.f. n.s.
                                T=19.15                   T=12.72                   T=1.52
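
The linearity test of Table 5 can be sketched as a weighted least-squares line followed by the T statistic. The code below is a single-pass illustration only: it uses the printed u(counts) as fixed weights and does not repeat the authors' iterative re-weighting, so the fitted line and T will differ somewhat from the last column group of Table 5. The function name `weighted_line` is a hypothetical helper.

```python
# Four lowest calibration points from Table 5: concentration (mg/L),
# mean reading (counts) and standard uncertainty of the mean (counts).
conc = [0.0, 0.248, 0.992, 1.979]
counts = [9.5, 148.2, 563.6, 1127.4]
u_counts = [2.7, 2.9, 2.7, 12.1]


def weighted_line(x, y, u):
    """Weighted least-squares line y = a + b*x with weights 1/u^2."""
    w = [1.0 / ui ** 2 for ui in u]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


a, b = weighted_line(conc, counts, u_counts)
# Normalised squared residuals summed over the calibration points.
T = sum(((yi - (a + b * xi)) / ui) ** 2
        for xi, yi, ui in zip(conc, counts, u_counts))
CHI2_95_2DF = 5.99  # 95% point of chi-square with n - 2 = 2 degrees of freedom
print(f"a = {a:.1f}, b = {b:.1f}, T = {T:.2f}, linear: {T < CHI2_95_2DF}")
```

A value of T below the chi-square limit for n - 2 degrees of freedom indicates that the deviations from the line are explained by the stated uncertainties of the readings.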

Table 6 Results obtained by using a linear calibration function over a range from zero to at least 2 mg/L

Participant  Calibration  Linear calibration  T statistic  Sample G  Result  Uncertainty
initials     points       a        b                       reading   mg/L    u(x)
TAng         4            20.42    542.45     3.69         830.3     1.493   0.039
HkW          6             8.56    581.20     6.51         888.7     1.514   0.032
EIS          6             5.00    569.12     3.11         875.4     1.529   0.030
RiX          6             1.03    581.93     5.90         883.0     1.516   0.032
LonM         6             6.39    565.10     7.73         860.8     1.512   0.033
FinC         6            -6.58    580.02     2.37         864.4     1.502   0.029
IMP          6             7.90    563.65     6.91         864.9     1.520   0.033
DB           5            14.90    565.60     6.61         861.2     1.496   0.036
JNoS         6            -1.04    573.21     2.39         861.7     1.505   0.029
Dorman       6             4.88    573.28     8.79         848.0     1.471   0.033
Mean                                                                 1.507   0.010
                                                                     T=2.29
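
The last two rows of Table 6 can be reproduced from the individual results. The following sketch assumes, as is usual for a weighted mean, that each result is weighted by the reciprocal square of its stated uncertainty; the variable names are chosen here for illustration.

```python
import math

# Results (mg/L) and standard uncertainties u(x) for sample G, from Table 6.
result = [1.493, 1.514, 1.529, 1.516, 1.512, 1.502, 1.520, 1.496, 1.505, 1.471]
u_x = [0.039, 0.032, 0.030, 0.032, 0.033, 0.029, 0.033, 0.036, 0.029, 0.033]

weights = [1.0 / u ** 2 for u in u_x]
mean = sum(w * x for w, x in zip(weights, result)) / sum(weights)
u_mean = 1.0 / math.sqrt(sum(weights))

# T compares the scatter of the results with their stated uncertainties;
# with 10 results and one estimated mean it has 9 degrees of freedom.
T = sum(((x - mean) / u) ** 2 for x, u in zip(result, u_x))
print(f"weighted mean = {mean:.3f} mg/L, u = {u_mean:.3f} mg/L, T = {T:.2f}")
```

A T value well below the number of degrees of freedom means the results agree more closely than their individual uncertainties would suggest.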

to F with the known concentrations given in Table 5. The zero-offset and the slope are determined by weighted regression assigning to each reading y a weight of $u(y)^{-2}$, as calculated from the uncertainty budget above. The linear range that may be used is determined in exactly the same way as in Table 5.

With a 95% confidence level of T = 9.49 at 4 degrees of freedom 8 out of 10 participants could use all 6 calibration points to determine their calibration data. One participant had to eliminate the highest level and one the two highest levels; only the last case was therefore restricted to the range found in Table 5.

Calibration data and results for the concentration of Pb in sample G are shown for individual participants in Table 6 together with the estimated uncertainty of their predicted result, $x_{pred}$. Its uncertainty is calculated in accordance with the approach presented in Appendix E.3 of Ref. [2], while using the T-values calculated for the weighted regression from paragraph 7.5 in Ref. [1]

$$u^2(x_{pred}) = \frac{1}{b^2}\left[u^2(y_{obs}) + \frac{T}{\sum w \,(n-2)}\left(1 + \frac{(x_{pred} - \bar{x})^2}{\overline{x^2} - \bar{x}^2}\right)\right] \qquad (5)$$

The agreement between individual results is actually better than expected from their estimated uncertainties, and the calculated T-statistic for 9 degrees of freedom is far below the 95% level of significance of 16.9.

The calibration uncertainty as expressed by the second term in the expression (5) is based on 4-6 observations and therefore a minor contribution compared to the measurement uncertainty of the reading for sample G based on only one observation. Thus the variability among final results is probably less than the variability of the readings on which these results are based, which indicates that the readings obtained by a particular participant are not completely independent of each other.

Results based on interpolation

Calculation of results may also be carried out without assuming a linear calibration function over a particular range, but only between two calibration points bracketing the unknown. This is simply done as linear interpolation and has the further advantage that uncertainty in both x and y values are easily taken into account.

Results for the unknown sample G are obtained by each participant using simple interpolation between reference levels A and D; the interpolation formula is entered into a spreadsheet, and the procedure described by Kragten in Appendix E.2 in Ref. [2] calculates the com-

Table 7 Results obtained by interpolation between two references

Participant  Reference samples         Sample G  Result  Uncertainty
initials     1.979 mg/L  0.992 mg/L    reading   mg/L    u(x)
TAng         1074.0      573.5         830.3     1.498   0.037
HkW          1177.9      564.7         888.7     1.514   0.032
EIS          1109.3      567.4         875.4     1.553   0.036
RiX          1180.3      568.1         883.0     1.500   0.032
LonM         1091.9      556.6         860.8     1.553   0.036
FinC         1158.1      573.5         864.4     1.483   0.033
IMP          1084.8      568.7         864.9     1.558   0.038
DB           1137.0      555.3         861.2     1.511   0.033
JNoS         1115.4      560.9         861.7     1.527   0.035
Dorman       1145.4      547.4         848.0     1.488   0.032
Mean                                             1.516   0.011
                                                 T=5.71
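
The first row of Table 7 can be approximated with a short script that imitates Kragten's spreadsheet method: each input is shifted by its standard uncertainty in turn and the resulting changes in the interpolated result are combined in quadrature. This is a sketch under assumptions made here, not the authors' spreadsheet: the readings are given the uncertainty u(y) from the budget with s0 = 8.43 and s1 = 0.0154, each reference concentration is given the 0.22% relative uncertainty quoted for the dilution process, and all inputs are treated as independent. The names `u_reading`, `interpolate` and `kragten` are hypothetical.

```python
import math

S0, S1 = 8.43, 0.0154   # uncertainty budget parameters (non-linear column of Table 4)
REL_U_REF = 0.0022      # assumption: 0.22% dilution uncertainty on each reference


def u_reading(y):
    """Standard uncertainty of a single reading y (counts) from the budget."""
    return math.sqrt(S0 ** 2 + (y * S1) ** 2)


def interpolate(x_lo, x_hi, y_lo, y_hi, y_g):
    """Linear interpolation of the unknown between two bracketing references."""
    return x_lo + (y_g - y_lo) * (x_hi - x_lo) / (y_hi - y_lo)


def kragten(func, values, uncertainties):
    """Shift one input at a time by its standard uncertainty and combine
    the resulting changes in the output in quadrature."""
    base = func(*values)
    total = 0.0
    for k, u in enumerate(uncertainties):
        shifted = list(values)
        shifted[k] += u
        total += (func(*shifted) - base) ** 2
    return base, math.sqrt(total)


# First row of Table 7 (participant TAng).
x_lo, x_hi = 0.992, 1.979                  # reference concentrations, mg/L
y_lo, y_hi, y_g = 573.5, 1074.0, 830.3     # readings, counts
values = [x_lo, x_hi, y_lo, y_hi, y_g]
uncertainties = [REL_U_REF * x_lo, REL_U_REF * x_hi,
                 u_reading(y_lo), u_reading(y_hi), u_reading(y_g)]
x_g, u_g = kragten(interpolate, values, uncertainties)
print(f"result = {x_g:.3f} mg/L, u = {u_g:.3f} mg/L")
```

The reading of the unknown dominates the combined uncertainty, with the two reference readings contributing less and the reference concentrations almost nothing.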

bined uncertainty from the uncertainties of the readings, as well as from the assigned value of the reference samples.

Table 7 lists results and their uncertainties, which show that also in this case the variability may be less than expected from the uncertainties with a value of the statistic T that is well below the 95% level.

Bias

The unknown sample G was prepared by dilution of the same lead standard as the calibrants, and its concentration was 1.485 mg/L with a standard uncertainty of 0.008 mg/L.

Weighted mean values of the analytical results obtained by the participants are presented in Tables 6 and 7 together with their standard uncertainty. In either case the bias is positive, but barely significant at the 5% level of confidence.

For the results based on linear calibration the bias is 0.022 mg/L with a standard uncertainty of 0.013 mg/L, which is not significantly different from zero. For the interpolated results the bias is 0.031 mg/L with a standard uncertainty of 0.014 mg/L which slightly exceeds the critical value of z=1.96.

Conclusion

Neither simple nor weighted linear regression yielded satisfactory results for the functional representation of standard deviation as a function of level with any of the alternatives proposed in Ref. [1]. Only the use of non-linear regression of the logarithmic standard deviation on the expression recommended in Ref. [2], gave acceptable agreement with experimental data.

An uncertainty budget based on this representation for the determination of Pb in an aqueous solution yielded experimental results in statistical control and without bias.

Acknowledgements The authors are indebted to the Danish Institute of Occupational Health for making their ICP-AES available to the first course in "Metrology in Chemistry" held at Novo Nordisk in the year 2000. The original readings made by the participants in the course are used in this study exactly as they were made on 2000-10-09. In particular we acknowledge Miss Dorrit Meincke for supervising the ICP-AES instrument and for preparing the calibration standards under careful statistical control.

References

1. ISO 5725-2, Accuracy of Measurement Methods and Results, 1st Edition, 1994
2. EURACHEM/CITAC Guide. Quantifying Uncertainty in Analytical Measurement, 2nd edn. 2000
3. Heydorn K (1991) Mikrochim Acta (Wien) III:1-10
4. ISO Guide to the Expression of Uncertainty in Measurement, Geneva, 1993
5. Heydorn K, Griepink B (1990) Fresenius Z Anal Chem 338:287-292
Accred Qual Assur (2001) 6:372-375
© Springer-Verlag 2001

Riitta Maarit Niemi
Seppo I. Niemela

Measurement uncertainty in microbiological cultivation methods

Abstract Microbiological analyses are carried out on clinical, food, feed and environmental samples. The aims of the analyses are diagnostic or estimation of the safety or the quality of the sample. Important decisions are made on the basis of microbiological analyses. Little attention, however, is paid to the uncertainty of measurement of microbiological analyses. In microbiological cultivation techniques the result is obtained by counting individual objects. The normally low number of counted objects strongly affects the result of the analysis and its uncertainty. Because of the importance of the particle statistical variation to the uncertainty, the approaches developed for chemical analyses are not directly applicable to microbiology. This paper discusses microbial analyses and describes a novel guidance document for the estimation of measurement uncertainty in culturing methods [1].

Keywords Microbiology · Cultivation methods · Measurement uncertainty

R.M. Niemi (corresponding author) · S.I. Niemela
Finnish Environment Institute, P.O. Box 140, 00251 Helsinki, Finland
e-mail: maarit.niemi@vyh.fi
Tel.: +358-9-40300853
Fax: +358-9-40300890

Introduction

It has been asked whether microbiology is more like the art of cooking than science. Cooking, and doing it correctly (e.g. utilising gravimetric, volumetric, temperature- and time-related metrological information) is an integral part of microbiological cultivation methods. Science is the ever present basis for analytical work in microbiology.

The aim in microbiological analyses is usually to detect and enumerate a known species or group of microorganisms in a measured amount of sample. If measurement means counting and identification, quantitative microbiological analyses belong to the sphere of metrology. Traceability to primary measurement standards can hardly be achieved, but the measurement units are the numbers per gravimetric or volumetric units.

Internationally available reference strains and materials, and international performance tests for microbiology are becoming increasingly available to aid traceability and comparability evaluation of analyses carried out in different laboratories.

In microbiological analysis, culture techniques are important because it is often relevant to detect viable microorganisms. Often the target microorganisms in human, animal, food or environmental samples constitute only a minor fraction of the microorganisms present. Different selection principles and indicator systems are applied in order to facilitate the growth of the target microorganism yielding characteristic reactions, while suppressing the growth of other microorganisms. The primary cultivation result is usually not sufficiently reliable, but necessitates the use of further tests to confirm the identity of the target microorganism. In practice, confirmation tests in routine work do not provide taxonomically valid identification, but rely on a limited set of tests confirming the identity of the target with high probability. The valid identification of microorganisms can be regarded as a rather complex measurement and the uncertainties inherent to microbial taxonomy complicate evaluation of the uncertainty involved. In this paper, the uncertainty of taxonomically valid identification is not discussed. In routine work, identification is usually limited to the agreed confirmation tests. Uncertainty of confir-
Measurement uncertainty in microbiological cultivation methods 71

mati on is addressed only to the extent of estimating the ble in microbiology and it is not worthwhile investing
binomial sampling variance of fractions confirmed. much effort in calculating values for these parameters
The Finnish guidance document for measuring uncer- (see Type A below). Instead, the uncertainty of the meth-
tainty is a novel approach for the estimation of uncer- od can be expressed as a formula into which observed
tainties in microbiological measurements based on culti- values of each measurement can be inserted.
vation and including confirmation [1]. This document Because the Type A uncertainty estimation (based on
has been elaborated as an activity of the Advisory Com- replicate measurements) is usually not economically fea-
mittee for Metrology in Finland and the Centre for Me- sible in microbiology, the emphasis in the guidance doc-
trology and Accreditation in Finland will produce a ument [1] is on the Type B approach.
translation into English. The complex measurements Type A: The standard uncertainty is calculated from n
such as taxonomically valid identification are not includ- independent replicate measurements Xl' X2' ... , xll as
ed in this document. It is hardly possible to know the ex- the experimental standard deviation [2]:
act number of atoms or molecules in chemical analysis.
Similarly, it is not possible to know the actual number of I;l_l (Xi - X)2
viable target cells or spores in a sample. Sr = -1
11
Therefore, it is only possible to measure the relative
The number of replicates must be rather high because
uncertainty of measurement. In this paper the uncertainty
even 30 replicates from sample sources following
measurements of microbiological cultivation methods
normal distribution yield estimates of the standard de-
are described according to Niemela [1].
viation with only 13% relative uncertainty.
Type B: According to [2], Type B uncertainty is obtained
using other approaches than replicate samples. The
Principles uncertainty variance u 2 or the standard uncertainty u
are based on the whole body of scientific information
The instructions for the calculation of uncertainty of available (with the exception of replicate measure-
measurement elaborated for chemistry [2, 3] are not di- ments) on the possible variation of the measurand. In-
rectly applicable to microbiology. In the guidance docu- formation of statistical theory, earlier measurements,
ment [1] the principles of these chemistry documents are experience or general beliefs on instruments and ma-
interpreted and adapted to microbiology. terials, specifications of the manufacturer, published
In microbiological measurements sample pretreat- reference values in calibration and certification re-
ment is normally limited to homogenisation and dilution ports, and uncertainty estimates in handbooks can be
(or concentration by filtration). The measured portion of utilised.
the sample or its dilution is transferred to the "detector"
The common sources of uncertainty in cultivation meth-
and the result is obtained by counting individual objects.
ods are sample stability, dilution, counting (including
The sole principle of the measurement is reflected in the
particle statistical variation and personal interpretation of
formula for calculation of the result:
the target), yield on the medium, crowding effect (coin-
y=F· VC cidence error) and uncertainty of confirmation. The com-
bined uncertainty can be calculated as the quadratic sum
where of different uncertainty components.
y = number of microorganisms per volume
F = dilution factor
V = volume of the portion of the final dilution Compilation of uncertainty in cultivation methods
C = number of microbial particles in V.
It is evident from the above formula that the counted In microbiological cultivation methods only four types of
number of microbial particles strongly affects the result. detection systems are used: the one-plate instrument, the
The particle statistical variation can be estimated by ap- set of plates instrument, the one-tube detector (Pres-
plying the Poisson theory: ence/Absence) and the set of tubes instrument (Most
Probable Number). All these classes necessitate different
RSDc =uc =jf formulas for uncertainty estimates. Different media and
incubation conditions, and confirmation and identifica-
Because the microbial detectors function optimally at tion tests offer the versatility needed for the enumeration
particle numbers between 25 and 100 per test portion, of different organisms. However, this versatility is not re-
the particle statistical variation often dominates the un- flected in the principles of the estimation of uncertainty.
certainty of the measurement. Therefore uncertainty esti- In microbiological cultivation methods the sample is
mates such as reproducibility and repeatability deter- homogenised, after which a measured amount is diluted
mined in collaborative efforts are not generally applica- or concentrated (e.g. by membrane filtration) to the
proper measuring range of the analytical method. Suitable aliquots are transferred to the plates, tubes or wells and, after incubation, colonies or numbers of positive and negative reactions in tubes or wells are counted.

It is often necessary to confirm that typical colonies or tubes yielding a positive reaction actually show the presence of the target microorganism. When it is possible to confirm all the presumptive results, there is no uncertainty due to the random error caused by sampling for binomial attributes. On the other hand, when only a fraction of the presumptive positive reactions are tested for confirmation, a significant increase in uncertainty will occur. The binomial sampling variation in confirmation tests should be addressed in addition to the error due to Poisson distribution of presumptive target colonies counted. In the primary cultivation step, counting results of typical colonies or reactions are susceptible to subjective judgement that can cause significant uncertainty.

The components causing uncertainty can be seen from the formula used for calculating the measurement result. Usually it includes all the dilution steps, colony counts from different plates (or most probable numbers) and numbers of isolates tested further and those confirmed. The uncertainty of counting of colonies caused by differences between technicians should be included. It is possible to correct systematic errors, e.g. confirmation rate (often expressed as % of typical colonies or reactions confirmed).

In the calculation of the result multiplication is needed and, therefore, the combined uncertainty is composed of the sum of the relative uncertainties. Fortunately, the components of uncertainty tend to be independent in microbiological measurements, so that covariances need seldom be considered.

Corrections for systematic errors, e.g. confirmation rate and dilution error, are mostly multipliers. The corrected final result is therefore expressed as:

$$y = k_1 \cdot k_2 \cdots k_n \cdot F \cdot \frac{C}{V}$$

where
k = coefficients of systematic errors.

The uncertainty of each correction coefficient should be included in the calculation of uncertainty.

As an example, the relative total uncertainty for a one-plate instrument can be calculated with the following formula:

$$u_y = \sqrt{u_{k_1}^2 + u_{k_2}^2 + \cdots + u_{k_n}^2 + u_p^2 + u_F^2 + u_C^2 + u_V^2 + u_I^2}$$

where
$u_k$ = correction uncertainties (e.g. dilution correction, yield correction)
$u_p$ = uncertainty of the confirmation coefficient
$u_F$ = dilution factor uncertainty
$u_C$ = particle statistical uncertainty
$u_V$ = inoculum volume uncertainty
$u_I$ = individual counting uncertainty.

Instructions for the estimation of individual uncertainty components are described by Niemelä [1].

A shortcut to uncertainty estimates of the instrument with several plates

Consider a detection instrument consisting of several plates with colony counts $c_i$ derived from the volumes $v_i$ ($i = 1, 2, \ldots, n$) of the final suspension. The average particle concentration x of the final suspension is estimated from

$$x = \frac{\sum c_i}{\sum v_i}$$

The microbial concentration of the original sample is obtained from the calculation

$$y = F x$$

where F is the dilution factor.

The uncertainty of the average particle density of the final suspension consists of particle statistical variation ($u_C$), counting uncertainty ($u_I$) and the uncertainty of inoculum volume measurements ($u_V$) including possible dilution effects within the detection instrument. These components merge to form the uncertainty of x. It can be estimated by using the log likelihood ratio statistic $G^2$ calculated with the following formula:

$$G^2_{n-1} = 2\left[\sum_{i=1}^{n} c_i \ln\left(\frac{c_i}{v_i}\right) - \left(\sum c_i\right) \ln\left(\frac{\sum c_i}{\sum v_i}\right)\right]$$

where
$c_i$ = colony count of the plate i
$v_i$ = inoculum size (in ml of final suspension) on the plate i
n = number of plates

The relative uncertainty of the microbial concentration x is calculated with the formula

$$u_x = \sqrt{\frac{G^2_{n-1}}{n-1} \cdot \frac{1}{\sum c_i}}$$

The uncertainty of confirmation and any other uncertainties and correction factors that are known to affect the result can be taken into account in the same way as before. The formula for the combined uncertainty in the shortcut systems is therefore:

$$u_y = \sqrt{u_{k_1}^2 + u_{k_2}^2 + \cdots + u_{k_n}^2 + u_p^2 + u_F^2 + u_x^2}$$

where
$u_k$ = correction coefficient uncertainties
$u_p$ = binomial sampling uncertainty of confirmation
$u_F$ = dilution factor uncertainty
$u_x$ = measurement uncertainty of the multi-plate "instrument".
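The multi-plate shortcut can be sketched in a few lines of Python. This is a minimal illustration and not the guidance document's own implementation: the function names, the colony counts, the volumes, the dilution factor and the 3% dilution factor uncertainty are all hypothetical. Two further assumptions are made in the code: plates without colonies are taken to contribute nothing to the first sum (c ln(c/v) tends to zero as c tends to zero), and G² is clipped at zero to absorb rounding error.

```python
import math


def poisson_relative_uncertainty(count):
    """Relative particle statistical uncertainty u_C = 1/sqrt(C) of one count."""
    if count <= 0:
        raise ValueError("count must be positive")
    return 1.0 / math.sqrt(count)


def multi_plate_estimate(counts, volumes):
    """Return (x, G2, u_x) for colony counts c_i obtained from volumes v_i."""
    if len(counts) != len(volumes) or len(counts) < 2:
        raise ValueError("need at least two plates, one volume per count")
    total_c = sum(counts)
    total_v = sum(volumes)
    if total_c <= 0:
        raise ValueError("at least one colony must be counted")
    x = total_c / total_v  # average concentration of the final suspension
    # Assumption: empty plates are skipped, since c * ln(c / v) -> 0 as c -> 0.
    plate_sum = sum(c * math.log(c / v) for c, v in zip(counts, volumes) if c > 0)
    g2 = 2.0 * (plate_sum - total_c * math.log(total_c / total_v))
    g2 = max(g2, 0.0)  # guard against tiny negative values from rounding
    u_x = math.sqrt(g2 / (len(counts) - 1) / total_c)
    return x, g2, u_x


def combined_relative_uncertainty(*components):
    """Quadratic sum of independent relative uncertainty components."""
    return math.sqrt(sum(u * u for u in components))


# Hypothetical example: two plates of 1 ml and two plates of 0.1 ml.
counts = [66, 80, 7, 4]
volumes = [1.0, 1.0, 0.1, 0.1]
dilution_factor = 1000.0   # F, assumed value
u_dilution = 0.03          # u_F, assumed relative uncertainty

x, g2, u_x = multi_plate_estimate(counts, volumes)
y = dilution_factor * x
u_y = combined_relative_uncertainty(u_x, u_dilution)
print(f"x = {x:.1f} per ml, G2 = {g2:.2f}, u_x = {u_x:.3f}")
print(f"y = {y:.0f} per ml, u_y = {u_y:.3f}")
```

Further components such as the confirmation uncertainty would simply be passed as additional arguments to `combined_relative_uncertainty`, mirroring the quadratic sum in the text.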
Discussion

Traditionally, microbiological uncertainty estimates have been based on replicate measurements or on the Poisson theory only. Uncertainty estimates that include different factors contributing to the uncertainty of microbiological measurements [1] offer the possibility to identify main sources of error and, therefore, aid in identification of the weak points in the analytical procedure. Attention should be directed to these estimates in order to improve reliability of the measurement. The novelty, however, means that it is challenging to the microbiologist to embark on the calculation of uncertainty estimates.

The temporal and spatial variations in numbers of microorganisms in the sample sources may be vast. Frequent sampling and replicate samples yield some information about this variation and should be considered together with uncertainty estimates from laboratory analyses.

Acknowledgements We thank Mr. Michael Bailey for correction of the language.

References
1. Niemelä SI (2001) Uncertainty of quantitative microbiological culture methods. (In Finnish, to be translated into English and later available in electronic form at: www.mikes.fi). The Centre for Metrology and Accreditation in Finland (MIKES) J1/2001, 69 p, ISBN 952-5209-3
2. ISO (1995) Guide to the expression of uncertainty in measurement, 1st edn. International Organization for Standardization (ISO), Geneva
3. EURACHEM (2000) Quantifying uncertainty in analytical measurement, 2nd edn. Laboratory of the Government Chemist (LGC), Teddington, UK
Accred Qual Assur (2002) 7:228-233
DOI 10.1007/s00769-002-479-6
© Springer-Verlag 2002

Hans Andersson

The use of uncertainty estimates of test results in comparisons with acceptance limits

Abstract When a test is performed in order to qualify a material or a product for a certain use, the result is generally compared with an acceptance limit. The test result has an uncertainty which should be estimated and stated (e.g. in accordance with GUM). Very often this is not the case. Further, discussions often arise on the issue of how the uncertainty shall be considered in relationship to the acceptance limit. The intention of this note is to describe, in simple terms, the statistical background and to give some recommendations. In short, there are two clean-cut, extreme situations. The first case is when the uncertainty of the testing procedure is the dominating factor. Here it is found that the estimates of single laboratories cannot, generally, be used for comparisons with acceptance limits. One should have standardised, well-verified estimates based on comprehensive investigations of the method. It can also be concluded that comparisons between test results and acceptance limits have to be made with regard to the actual circumstances, as, e.g. how the acceptance limit is related to the risk. In the second case, the variation in the property of the material or product dominates and the uncertainty of the testing procedure is negligible. When the results are non-quantitative (go - no go), statistical methods can be used to estimate the risk taken with a certain sampling and acceptance strategy that a certain proportion of the batch to be delivered does not qualify. This should be considered more often in standardisation of product test methods. When the results are quantitative, a statistical analysis should be performed and the uncertainty should be compared with the acceptance limit as before, from the actual circumstances. When effects of testing uncertainty and product variation are comparable a sound treatment requires extensive experimental work. No short cuts can be made without loss of confidence!

Keywords Uncertainty · Conformity assessment · Acceptance limit

H. Andersson (✉)
SP Swedish National Testing and Research Institute, P.O. Box 857, 501 15 Borås, Sweden
e-mail: hans.andersson@sp.se
Tel.: +46-33-165000
Fax: +46-33-165010

Assessment of risk

Consider a measurable property of a product, e.g. the content of a poisonous substance. There is an uncertainty in the measurement, x, which may depend on variations in the product and on imperfections in the measurement (sampling, weighing, etc.). This uncertainty may be described by a probability function $P_E(x)$, i.e. the probability that the true content lies between $x$ and $x + \Delta x$ is $P_E(x)\Delta x$.

Further, assume that the risk for damage, e.g. illness, can be described by a function $P_S(x)$ such that the risk if the content lies between $x$ and $x + \Delta x$ is $P_S(x)\Delta x$. Then the probability for damage is described by the general relationship

$$P_{Damage} = \int_{-\infty}^{\infty} P_E(x) \left[ \int_{-\infty}^{x} P_S(t)\,dt \right] dx \qquad (1)$$

See Fig. 1. Note that a similar relationship may be set up for the case when the value of the property shall exceed the acceptance value, e.g. the strength of a structure.

Fig. 1 Probability of property, $P_E(x)$, and risk, $P_S(x)$

If as well $P_E(x)$ as $P_S(x)$ have Gaussian distributions, not at all necessary but useful as an example, with means $m_E$ and $m_S$, and standard deviations $\sigma_E$ and $\sigma_S$, respectively, the risk, probability for damage is

$$P_{Damage} = \Phi\left(\frac{\pm(m_E - m_S)}{\sqrt{\sigma_E^2 + \sigma_S^2}}\right) \qquad (2)$$

where $\Phi$ is the normalised Gaussian probability distribution. The plus and minus signs relate to the two cases above, respectively. It is readily seen that if the distribution of property and risk are well separated, the risk will be small. In practice, there are considerable difficulties to find $P_E(x)$ and $P_S(x)$, and therefore assumptions and approximations are used. The implications of this are treated next.

Assessment of $P_E(x)$

$P_E(x)$ comprises a description of how the property varies within the product, the "population". This is described under "Qualitative testing of products". Here, we consider the variation in the measured value of the property only due to the uncertainty in the measurement. This gives an analogous probability function $P_E(x)$. In cases where both the uncertainty in the measurement and the variation in product property are of the same magnitude, a special treatment is required, where the probability functions are combined.

The experimental determination of $P_E(x)$ is very costly. In theory, it would mean the performance of a "large" number of tests in each of a "large" number of laboratories. In practice one is confined to a limited number of tests in a small number (5-10) of laboratories. Then the degree of confidence becomes low, which has to be compensated for in the resulting estimation of the standard deviation.

The tests give, through statistical analysis, estimates of the mean and standard deviation of $P_E(x)$. A parameter related to the estimated standard deviation is the reproducibility. Further, statistical analysis can also yield the part of the uncertainty which is due to variations within the laboratories, which is related to the parameter repeatability. The statistical analysis, analysis of variances, and definitions of parameters are given in, e.g. ISO 5725 [1].

However, in most cases, and contrary to metrology, there are no means or time available for experiments at all, so estimates of the uncertainties are made through professional judgement, so called Type B estimates in [2].

Generally, one has to combine a number of uncertainty estimates for different measurements in order to obtain the uncertainty estimate for a complete test result or chemical analysis. It is then presumed that there exists some functional relationship, not always possible to define in mathematical terms, between the measurement results, $t_i$ ($i = 1, \ldots, n$), and the test result ($x$)

$$x = f(t_1, t_2, \ldots, t_n) \qquad (3)$$

If $t_i$ are independent stochastic variables with standard deviations $\sigma_{t_i}$ the following formula is generally valid

$$\sigma_x^2 = \sum_{i=1}^{n} \left(\frac{\partial f}{\partial t_i}\right)^2 \sigma_{t_i}^2 \qquad (4)$$

If there is a correlation between some of the $t_i$:s, Eq. (4) becomes more complex, see [2]. In many cases $f$ has the form

$$x = \prod_{i=1}^{n} t_i^{k_i}$$

then

$$\left(\frac{\sigma_x}{x}\right)^2 = \sum_{i=1}^{n} k_i^2 \left(\frac{\sigma_{t_i}}{t_i}\right)^2 \qquad (5)$$

Obviously, if some $t_i$ has a large standard deviation, or if some $k_i$ is large, this component dominates.

According to GUM [2] one shall assess the components $\sigma_{t_i}$ in the following way and call them "standard uncertainties". If a $\sigma_{t_i}$ is based on experiments it is called a Type A standard uncertainty. It is simply the standard deviation estimate of the mean value. If it is not, it shall be estimated through professional judgement following certain, rather open, recommendations in GUM and be called a Type B standard uncertainty. These entities are then inserted in Eqs. (4) or (5), yielding a standard uncertainty, $\sigma_x$, for the test result x. The degree of confidence of $\sigma_x$ depends on the experimental evidence and on the Type B estimates. A detailed description of this is given in GUM [2], Appendix G. See also [3]. The value of $\sigma_x$ thus defined is then used for decision making, most often disregarding its degree of confidence.

It is worth noticing that Type A and Type B uncertainty estimates can both represent variations within as
well as between laboratories. Components of uncertainty may hence exist in all four of the positions of the matrix in Fig. 2.

                                  Type A estimate   Type B estimate
Variation between laboratories          X                 X
Variations within laboratories          X                 X

Fig. 2 Types of uncertainty components

It is often, erroneously, presumed that Type A uncertainties represent variations within a laboratory and that Type B uncertainties represent variations between laboratories. As an example, measurements with a volt meter of class 0.1% may be taken. Here, there is an uncertainty of each specimen, i.e. between "laboratories", within the nominal limits -0.1% to +0.1%, i.e. normally a Type B uncertainty. But there are also variations, uncertainties within each laboratory, due to operations, etc., which may be estimated as a Type A or Type B uncertainty.

As mentioned, the $\sigma_x$ found, the standard uncertainty for the test results, is now presumed to be useful as a measure of the standard deviation of $P_E(x)$. It is also presumed that $P_E(x)$ is approximately Gaussian so that "expanded" standard uncertainties may give coverages of $P_E(x)$ according to Fig. 3. This is based on the general relationship that the sum of a number of stochastic variables becomes approximately Gaussian, independently of the distributions of the various variables.

Fig. 3 Assumption of coverages of expanded standard uncertainties (intervals of 1 s and 1.96 s marked)

Very often, an expansion factor of K=2 is recommended (instead of 1.96) to represent a coverage of 95% (97.5% one-sided). Two reservations should be made. It is not quite true that Eq. (4) gives an approximately Gaussian distribution, e.g. if there are one or two dominating factors which are not Gaussian, but perhaps "rectangular". This gives errors, in particular for large K-values. Further, the degree of confidence becomes low if the experimental evidence is poor. This should be compensated for, through the Student (t) distribution, but this is very rarely done.

$P_E(x)$ estimates in the real world

The procedures described in "Assessment of $P_E(x)$" are best suited to metrology and certain areas of chemical analysis where, e.g. statistical analysis can be used extensively thanks to large numbers of repeated measurements. They are less easy to apply in product testing where only few, expensive experiments can be performed and where the varying properties of the products are of importance. See also "Qualitative testing of products".

Even in favourable situations there are problems, however. Extensive background data in chemical analysis has been collected through IMEP, large programmes for proficiency tests in "pure" chemistry. One example is shown in Fig. 4, which is typical for many investigations.

Here, ordered results from 168 laboratories are shown from analysis of lead in water together with their estimated, extended uncertainties (K = 2, meaning a coverage of approximately 95%). As can be seen, the estimates vary strongly and are generally much too small. Probably uncertainty components have been both overlooked and underestimated. This indicates that there is room for training and education and that guides such as [4] should be widely spread and used. It can also be seen that there is no distinction between accredited and non-accredited laboratories in this investigation performed some years ago. This goes both for mean values and uncertainty estimates.

With Fig. 4 as a basis one can discuss what would happen in some real cases. Assume, for instance, that there is a legal regulation that the lead content may not be more than 0.12 µmol/kg. Then, around 20 laboratories, to the left in the diagram, accredited and non-accredited, will approve the test sample although the content is well above the limit, even if one uses the whole stated uncertainty as a margin.

Hence, one should be careful about using the uncertainty estimate of a single laboratory for comparison with stated limiting values in testing, particularly when many Type B estimates have to be relied upon. There is a significant risk that the estimates made by a single person or a group in a single environment make omissions or misjudgments. If possible one should use an uncertainty estimate which includes the experience of several laboratories, e.g. from interlaboratory comparisons in a learning process.

If all the laboratories use the same method in the same way and make correct uncertainty estimates (K=2) one would end up with a diagram like Fig. 5, analogous to Fig. 4. All laboratories state approximately equal uncertainties and they cover the true value except in a few cases. (The S-shaped curve is an idealised, ordered sample of $P_E(x)$.)

So, if one could, anyhow, obtain a trustworthy estimate of $P_E(x)$ and express it with the uncertainty, how
can it be used for comparisons with an acceptance limit? This issue is treated next.

Fig. 4 Example of uncertainty estimates in a large interlaboratory comparison (IMEP-6: Trace elements in water, natural water, sample "2"; 168 results from all laboratories arranged by ascending values; certified range (= ±2 $u_c$): 0.1295 to 0.1369 µmol·kg⁻¹; laboratories marked as accredited, partially accredited and seeking further accreditation, seeking accreditation, or not accredited or no statement)

Fig. 5 Idealised diagram of results and uncertainties, cf. Fig. 4

Replacing the risk distribution $P_S(x)$ with a limiting value

For various reasons authorities, and written standards, routinely replace the real distribution of risk with a limiting value which is intended to "safely" encompass all risks. In analogy with Fig. 1, this corresponds to a situation according to Fig. 6.

Fig. 6 Replacing real risk with a limiting value

The limiting distribution is formally a Dirac delta function, and one gets

$$R_{\mathrm{Limit}} = \int_{-\infty}^{\infty} P_E(x) \left[ \int_{-\infty}^{x} \delta(t - x_0)\,dt \right] dx = \int_{-\infty}^{\infty} P_E(x)\,1(x - x_0)\,dx$$

where $1(x - x_0) = 1$ if $x > x_0$ and 0 else, i.e.

$$R_{\mathrm{Limit}} = \int_{x_0}^{\infty} P_E(x)\,dx$$

Assuming a Gaussian distribution yields

$$R_{\mathrm{Limit}} = \Phi\left(\frac{m_E - x_0}{\sigma_E}\right)$$

This is in accordance with Eq. (2), if $x_0 = m_S$ and $\sigma_S = 0$. The result may seem trivial, but the reasoning gives some insight in the process of replacing a real risk distribution with a limiting value.
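The step from the general double integral of Eq. (1) to the closed Gaussian form of Eq. (2), and from there to the limiting-value result, can be checked numerically. The following Python sketch assumes Gaussian $P_E$ and $P_S$ and the case where damage occurs above the risk threshold; the function names and all numerical values are hypothetical and serve only as illustration.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(x, mean, sigma):
    """Gaussian probability density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def damage_probability(m_e, sigma_e, m_s, sigma_s):
    """Closed form of Eq. (2), plus-sign case."""
    return phi((m_e - m_s) / math.hypot(sigma_e, sigma_s))


def limit_risk(m_e, sigma_e, x0):
    """Probability that the true value lies above the limiting value x0."""
    return phi((m_e - x0) / sigma_e)


def damage_probability_numeric(m_e, sigma_e, m_s, sigma_s, steps=20000):
    """Midpoint-rule evaluation of Eq. (1) over m_e +/- 10 sigma_e.

    For a Gaussian P_S the inner integral is phi((x - m_s) / sigma_s).
    """
    lo = m_e - 10.0 * sigma_e
    width = 20.0 * sigma_e / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += normal_pdf(x, m_e, sigma_e) * phi((x - m_s) / sigma_s) * width
    return total


# Hypothetical numbers in arbitrary concentration units.
m_e, sigma_e = 0.10, 0.01   # measured value and its standard uncertainty
m_s, sigma_s = 0.13, 0.01   # centre and spread of the risk distribution

closed = damage_probability(m_e, sigma_e, m_s, sigma_s)
numeric = damage_probability_numeric(m_e, sigma_e, m_s, sigma_s)
sharp = limit_risk(m_e, sigma_e, x0=m_s)
print(f"Eq. (2): {closed:.4f}, Eq. (1) numerically: {numeric:.4f}")
print(f"sharp limit at x0 = m_s: {sharp:.4f}")
```

Setting `sigma_s` to zero in `damage_probability` reproduces `limit_risk`, which is the correspondence with Eq. (2) stated in the text.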
It is evident that the relationships between the parameters, $m_S$, $\sigma_S$ and $x_0$ play an important part when it shall be decided how a test result with the value $m_E$ and the uncertainty estimate $2\sigma_E$ shall be related to a limiting value $x_0$.

Assume, as an example, that the limiting value $x_0$ is set in some relationship to the risk distribution $P_S(x)$, and that we require that the measure $m_E$ shall have certain "margins" to the limiting value. We use the following four cases:

1) $x_0$ is determined as $m_S$, and it is accepted that the value $m_E$ may be used for comparison with $x_0$.
2) $x_0$ is determined as $m_S$, and it is required that the value $m_E + 2\sigma_E$ shall be compared with $x_0$.
3) $x_0$ is determined as $m_S - 2\sigma_S$ and it is accepted that the value $m_E$ is compared with $x_0$.
4) $x_0$ is determined as $m_S - 2\sigma_S$ and it is required that the value $m_E + 2\sigma_E$ shall be compared with $x_0$.

This will give the corresponding risks for damage, with assumption of normal distributions.

1) $\Phi(0) = 50\%$
2) $\Phi\left(\frac{-2\sigma_E}{\sqrt{\sigma_E^2 + \sigma_S^2}}\right)$, which is around 8% if $\sigma_E = \sigma_S$
3) $\Phi\left(\frac{-2\sigma_S}{\sqrt{\sigma_E^2 + \sigma_S^2}}\right)$, which is around 8% if $\sigma_E = \sigma_S$
4) $\Phi\left(\frac{-2(\sigma_E + \sigma_S)}{\sqrt{\sigma_E^2 + \sigma_S^2}}\right)$, which is around 0.2% if $\sigma_E = \sigma_S$

Hence, an impression is obtained about the consequences of different strategies. If, for example, it is required that $x_0 = m_S - 3\sigma_S$ and $\sigma_E \ll \sigma_S$, which is a usual situation, the risk of accepting $m_E$ for comparison with $x_0$ is only

$$\Phi\left(\frac{-3\sigma_S}{\sqrt{\sigma_E^2 + \sigma_S^2}}\right) \approx \Phi(-3) \approx 0.1\%$$

It is noted that in most cases the risk distribution is not Gaussian, still the reasoning remains the same.

It is usual to set limits for contents of hazardous substances in food or in the occupational environment. Here, the determination of the risk distribution, $P_S(x)$, is very unsure for small probabilities of injury or health effects. For this reason, there is a tendency for responsible authorities to set the acceptance limit far below the lowest levels observed to cause effects, corresponding to large k-values (k = 4, 5 or even 6). This may cause difficulties in performing the measurement at all since the acceptance level and detection levels become comparable. It is also clear that in many such situations the uncertainty of the measurement can be allowed to be omitted in the comparison with the limiting value.

Another illustrative example is the case where the alcohol content in the blood of a car driver is measured from the air exhaled from the lungs by a direct spectrometric method. The uncertainty of the method gives a $P_E(x)$ distribution with a carefully determined uncertainty approximating $\sigma_E$. The limiting value in Sweden, $x_0$ = 0.2 ‰, is decided for political and pedagogical reasons and it is comparable to factors such as fatigue, irritation, etc. The expanded uncertainty can therefore be safely credited to the car driver. There is still a margin to real risks. Of course, the fact that a sentence has severe consequences also indicates that the technical uncertainty should be credited to the car driver.

In other cases other reasoning and conclusions have to be made. The general conclusion is that it is not possible to have a single "rule of thumb" for all cases when determining how the uncertainty of a measurement shall be related to a given acceptance limit. Different situations have to be treated separately.

There are, of course, also situations where the uncertainty should be included in the comparison. One such is when the material strength, e.g. fatigue properties are used for safe dimensioning of critical structures, such as pipe lines or nuclear reactors. (Even here, though, the limits are set with safety factors far exceeding the testing uncertainty.)

Again, the important principle to follow is that the relationship between limiting values and measurements and their uncertainties should be technically well founded. This requires a case by case decision.

Qualitative testing of products

It is now assumed that the uncertainty in the test method is small in relationship to the variation in product property and that the test is of the go - no go type.

The problem at hand is then to draw conclusions about the proportion of approved products in a batch from the proportion of approved specimens in a sample tested. Examples may be safety helmets (impact resistance), iron ore (contents of iron), pre-packed food (weight). This type of test is described, e.g. in ISO 2859 [5] and may be described by so-called OC-diagrams (operating characteristics).

Such diagrams are based on the binomial distribution and the functional relationship is

$$P_{Accept}(x) = \sum_{i=0}^{p} \binom{n}{i} x^i (1 - x)^{n-i}$$

where $P_{Accept}(x)$ is the probability to accept a batch with the proportion x of non-accepted units if a sample of size n is tested and if the batch is approved if at most p non-accepted units are found. The diagram has the general shape according to Fig. 7.
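The binomial acceptance probability is easy to tabulate, which gives a feel for how sample size and tolerated failures shape an OC-diagram. The following Python sketch is only an illustration: the function name and the three sampling plans are hypothetical and are not taken from ISO 2859.

```python
import math


def acceptance_probability(x, n, p):
    """Probability of accepting a batch with non-accepted proportion x
    when n units are tested and at most p failures are tolerated."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 and 1")
    return sum(math.comb(n, i) * x**i * (1.0 - x) ** (n - i) for i in range(p + 1))


# Hypothetical sampling plans (n, p) tabulated over a few batch qualities.
plans = [(3, 0), (20, 1), (100, 5)]
proportions = [0.0, 0.02, 0.05, 0.1, 0.2, 0.4]

print("x     " + "  ".join(f"n={n:>3},p={p}" for n, p in plans))
for x in proportions:
    row = "  ".join(f"{acceptance_probability(x, n, p):9.3f}" for n, p in plans)
    print(f"{x:<5} {row}")
```

Reading down a column traces one OC curve; the larger plans fall from near 1 to near 0 over a narrower range of x than the small one.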
The use of uncertainty estimates of test results in comparisons with acceptance limits 79

Paccept

Paccept
1.0

0.5

pin x 0.2 0.4 0.6 0.8 x


Fig. 7 General features of an OC-diagram Fig. 8 OC-diagram for p=O, n =3

Of course it is desirable that the diagram shall be decisive, i.e. that it is close to the rectangular shape of the limiting curve obtained for n→∞. If the non-accepted proportion is smaller than p/n the batch should be accepted, and vice versa.

As an example may be taken testing of packaging for dangerous goods, where a drop test from a certain height is approved if no leakage occurs. Three drops are made and the batch is approved if no leakage occurs in any of them.

If it is assumed that the batch has a proportion x which would not pass the test, the probability to accept the batch will be

$$P_{\mathrm{Accept}}(x) = \binom{3}{0}\, x^{0} (1-x)^{3} = (1-x)^{3}$$

The corresponding curve is shown in Fig. 8.

Hence, if the batch contains 20% (x = 0.2) erroneous units there is about a 50% probability to accept it ($0.8^{3} \approx 0.51$). This means that two laboratories may well get contradictory results or that the supplier may get different results from the same laboratory on two consecutive occasions!

It should be noted that when quantitative measurements are made one can determine in the usual fashion an approximate PE(x) function.

This type of product testing is very usual in written standards, and much confusion and debate occurs between customers and laboratories, due to lack of understanding of the properties of the testing procedure. Yet it is necessary to have these methods, for cost reasons, in product testing. What is needed is that the standards are produced with more care, and with clear explanations of the risks and properties of the test methods.
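The acceptance probability above can be evaluated numerically. The following is a minimal sketch in Python, assuming only the binomial relationship given in the text; the function name `p_accept` is illustrative and does not come from the original.

```python
from math import comb


def p_accept(x: float, n: int, p: int) -> float:
    """Probability of accepting a batch whose proportion of
    non-accepted units is x, when n units are tested and the batch
    is approved if at most p of them fail (binomial model)."""
    return sum(comb(n, i) * x**i * (1 - x) ** (n - i) for i in range(p + 1))


if __name__ == "__main__":
    # Drop-test example from the text: three drops, no leakage allowed.
    for x in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
        print(f"x = {x:.1f}  P_accept = {p_accept(x, n=3, p=0):.3f}")
```

For n = 3 and p = 0 the function reduces to (1 - x)^3, so the values trace the curve of Fig. 8. Increasing n at a fixed ratio p/n moves the curve toward the rectangular limiting shape described above.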

References
1. ISO 5725 (1994) Precision of test methods - Determination of repeatability and reproducibility by inter-laboratory tests. ISO, Geneva
2. ISO (1993) Guide to the expression of uncertainty in measurements. ISO, Geneva
3. Andersson H (1994) Assessment and practical use of uncertainty in test results. 2nd EUROLAB Symposium, Florence, 1994. EUROLAB
4. EURACHEM Guide (2000) Quantifying uncertainty in analytical measurements. 2nd edn. EURACHEM, LGC, Teddington, UK
5. ISO 2859 (1974) Sampling procedures and tables for inspection by attributes. ISO, Geneva
Accred Qual Assur (2001) 6:493-500
© Springer-Verlag 2001

Dean A. Flinchbaugh · Larry F. Crawford · David Bradley

A model to set measurement quality objectives and to establish measurement uncertainty expectations in analytical chemistry laboratories using ASTM proficiency test data

Abstract A model is presented that correlates historical proficiency test data as the log of interlaboratory standard deviations versus the log of analyte concentrations, independent of analyte (measurand) or matrix. Analytical chemistry laboratories can use this model to set their internal measurement quality objectives and to apply the uncertainty budget process to assign the maximum allowable variation in each major step in their bias-free measurement systems. Laboratories that are compliant with this model are able to pass future proficiency tests and demonstrate competence to laboratory clients and ISO 17025 accreditation bodies. Electronic supplementary material to this paper can be obtained by using the Springer LINK server located at http://dx.doi.org/10.1007/s007690100398-y.

Keywords Proficiency test · Uncertainty · Measurement quality objectives · Accreditation · ISO 17025

D.A. Flinchbaugh
Flinchbaugh Consulting, Bethlehem, Pa., USA
e-mail: DAFlinch@bellatlantic.net, Tel.: +1-610-6946473

L.F. Crawford (✉)
Bethlehem Steel Corporation, Bethlehem, Pa., USA
e-mail: Larry.Crawford@Bethsteel.com, Tel.: +1-610-694 6646, Fax: +1-610-694 1739

D. Bradley
ASTM Headquarters, West Conshohocken, Pa., USA
e-mail: Dbradley@ASTM.org, Tel.: +1-610-8329681

Introduction

This paper describes one model for use by analytical chemistry laboratories that carry out compositional analyses to apply reliable interlaboratory test data, such as from the ASTM Proficiency Test (PT) programs, to determine intralaboratory measurement quality objectives (MQOs) and to establish and consistently meet individual-method uncertainties consistent with those objectives. In this work, we define intralaboratory MQOs to be the maximum measurement uncertainty to be allowed in any series of test methods. Intralaboratory MQOs should be set so that the data quality needs of clients are met in a way that allows the laboratory to operate effectively, efficiently, and competitively. Once MQOs are established, laboratory personnel are responsible for implementing test methods that have estimated uncertainties that comply with the MQOs.

ISO defines measurement uncertainty [1] as "A parameter associated with the result of a measurement that characterizes the dispersion of the values that could reasonably be attributed to the measurand." A report value with an estimated measurement uncertainty must be accompanied by a confidence interval to be meaningful, for example, 0.100+/-0.008% (m/m, 95% confidence). Successful application of the model requires that the laboratory is compliant with ISO 17025 and is capable of successfully participating in PT programs for the method(s) to be applied to the model. Laboratories that do not meet these basic requirements can use the model to help establish their overall Quality System program but should not expect to consistently receive acceptable ratings on PTs without being in compliance with ISO 17025.

The implementation of the model is simple. It makes extensive use of the logarithmic correlation between the performance statistic of a test method and concentration, independent of analyte (measurand), matrix, or method, that was initially developed by Horwitz [2-5], beginning in the early 1980s, and further developed by Rocke and Lorenzato [6]. Horwitz et al. used this model to analyze the performance characteristics of test methods used for regulatory purposes by the Food and Drug Administration. As discussed in the following representative examples, others use the concept to model the uncertainty in analytical measurement systems.

ISO 5725 [7] provides practical numerical definitions for the repeatability, r, and reproducibility, R, of a standard test method and describes the organization and analysis of interlaboratory experiments for the numerical determination of r and R. As part of its work program, ISO Technical Committee 17 (Iron and Steel), Subcommittee 1 (Chemical Analysis) applied these definitions and practices to the Horwitz model [8]. That work included 45 published and ready-to-publish national and international standard test methods (BSI, ECISS/TC20, and ISO/TC 17/SC 1) that employed 6 method principles (gravimetric, spectrophotometric, flame AAS, titrimetric-visual, titrimetric-potentiometric, and combustion/infra-red) to determine 21 elements in iron and steel. The work showed that a clear correlation existed between the log of both r and R vs. the log of the analyte concentration. Based on that work, ISO/TC 17/SC 1 set a policy [9] to use the log-log plots of r and R vs. concentration to evaluate test data of candidate ISO test methods. Any method that has data that exceed specified limits beyond the historical log-log plots will not be submitted for international ballot to elevate to international standard status. ISO 15350 [10] is a recent example of a test method that met those requirements.

More recently, Flinchbaugh and Poholarz [11] described how the Horwitz model could be adapted to set MQOs for a Reference Materials (RM) Program. That program uses the ISO/TC 17/SC 1 plot in a manner similar to the model described in this paper to predict the MQOs (uncertainties) needed in certified homogeneity and concentrations. A laboratory using those RMs should be able to meet performance requirements consistent with the r and R limits set by the ISO/TC 17/SC 1 plot. The program based on that model is now accredited to ISO Guide 34 [12, 13].

ASTM Committee E01 recently developed a Standard Guide [14] for managing uncertainties within a laboratory organization that uses multiple locations (workstations) to perform the same tests. The guide presumes that the laboratory organization has established data quality objectives and is committed to meeting them. The Annex to that Guide describes a procedure for setting laboratory-wide data quality objectives based on the ISO/TC 17/SC 1 plot. Other recent ASTM Standards also make use of the logarithmic correlation between variation and concentration [15-17].

The referenced programs typically use classical statistical techniques, such as Dixon or Cochran, to reject individual data points and then use the log-log plot concept to identify data set outliers. Outliers are defined as points above or below a specified limit on the log-log plot. Root cause analysis of log-log plot outliers above the limit usually shows possible deficiencies including: 1) heterogeneity of one element in the test material, 2) unanticipated test method interference, 3) attempted application of the test method above or below the optimum concentration range or 4) inadequate test method calibration and control protocols. Conversely, outliers below the line, although infrequent, appear to be "too good", and usually indicate that the experimental design did not include all of the normal sources of variation.

ASTM's proficiency test programs as a source of interlaboratory data

ASTM currently sponsors PT programs in petroleum products and lubricants, stainless steel, plain carbon and low alloy steel, aluminum, gold in bullion; plastics: mechanical properties testing, plastics testing (polyethylene): melt index and ash; textiles, and engine coolants. More information regarding ASTM PTs can be obtained on their website, www.astm.org. These programs provide laboratories ongoing, statistically sound objective evidence of their performance on common test materials as compared with other competent laboratories around the world. With the exception of the petroleum-related programs, the PT reports provide feedback to laboratories on the current PT samples only and ASTM does not monitor a laboratory's long-term performance. The petroleum-related programs publish 2-year summary reports of each laboratory's robust standard deviations. These reports provide indications as to whether a lab may have a persistent relative bias on a test or if their precision performance has been poorer than the majority of labs. This analysis also attempts to identify labs with especially tight performance. ASTM's Gold in Bullion Program offers guidance to laboratories on how to track and monitor their own PT performance [18].

ASTM Committee E01, on the Chemical Analysis of Metals, Ores, and Related Materials, conducts PT programs in plain carbon and low alloy steels, stainless steels, aluminum, and gold in bullion. These programs are conducted in compliance with ISO Guide 43 and
ASTM E2027 [19, 20]. This paper utilizes data from the first two of the E01 programs for its source of interlaboratory test data collected over a 2-year period. Samples used to generate the PT data incorporated in this paper were supplied by the Brammer Standard Company and tested by ASTM E-826 [21], Standard Practice for Testing Homogeneity of Materials for Development of RMs. Typically, carbon, manganese, phosphorus, sulfur, silicon, copper, nickel, chromium, molybdenum, aluminum, and tin are analyzed and reported by participants with additional elements included periodically.

Once the PT data are collected, ASTM calculates the "robust" mean and "robust" [22] standard deviation for each element at each concentration. Robust statistics are computed using all data, but unusually large or small observations have little effect on the mean and standard deviation estimates. Details on how ASTM calculates these parameters are available from ASTM. As a measure of laboratory performance, ASTM provides a Z-score for each laboratory on each element reported. Z-scores are calculated by subtracting the average laboratory result from the overall average and dividing the difference by the overall standard deviation. Laboratories obtaining Z-scores greater than 2 (i.e. reported test data is greater than 2 standard deviations from the overall average) are advised to investigate their test data and measurement systems for evidence of systematic errors. Under the present system, conscientious laboratories react to high Z-scores by internally auditing their test reports and measurement systems and taking corrective actions as indicated by root-cause analysis. However, in many cases, the root-cause analysis does not yield a convincing cause or solution because no apparent system failure can be identified. One cause might be that the test result was within the laboratory's normal variation, but that the laboratory's normal variation was large enough to allow it to occasionally report data that did not meet the robust standard deviations expected by the PT program.

Compilation of historical PT data

For this study, we combined data from ASTM Committee E01's Plain Carbon and Low Alloy Program and their Stainless Steel Program over a 2-year period, beginning in the third quarter of 1998. In 1999, over 120 laboratories participated in the Plain Carbon and Low Alloy Steel Program and over 40 laboratories participated in the Stainless Steel Program. These two programs were selected to incorporate different matrixes, different sets of participants, and to expand the overall concentration ranges of the analytes evaluated.

A plot of the log of 2 times the robust standard deviation (95% confidence) vs. the log of the robust mean concentration is shown in Fig. 1. For the purposes of this publication, references to robust standard deviations, method performance statistic, and other probability parameters are presented at the 95% confidence level, or 2 times the parameter, unless otherwise noted. These data cover more than 4 orders of magnitude for concentration and include over 20 elements. Data were generated using about six analytical test methods and from two different test matrixes. Figure 1 clearly shows that a well-defined correlation exists between the robust standard deviation and robust mean concentration. A "power" fit of the data in Fig. 1 yields the equation

$$y = 0.0384\,x^{0.58} \tag{1}$$

which mathematically describes the correlation between the robust mean concentration and the interlaboratory robust standard deviation obtained by multiple labs/instruments/methods.

Fig. 1 ASTM Proficiency Test Data: plain carbon/low alloy and stainless steel. Interlaboratory robust standard deviation (95% confidence) vs. robust mean

Visual inspection of Fig. 1 shows that most of the data points form a well-defined band around the best-fit line through all of the data points. The notable exceptions are on the positive side of the standard deviation line at about 1.0 and 0.1% concentrations and shown as
blocks on Fig. 1 in lieu of diamonds. Table 1 shows the source of those points. This visual analysis indicates that the most clearly outlying points were at or beyond the specification ranges of the alloys typically produced, were at or beyond the maximum concentrations for which the Standard Test Methods were validated, or were beyond the scope of the standard test method. In the case of Al, neither of the standard test methods reported (ASTM E572 and ASTM E1086) includes Al in the method scope. For Si, ASTM E572 does not include Si in the method scope and the 1% silicon value is above the validated range of the test method. Laboratories should not use/report standard test methods for elements not included in the standard test method scope nor for samples above (or below) the analytical range. Based on these findings, there is a strong possibility that at least some of the laboratories provided the test data outside the demonstrated capability of ASTM test methods.

Table 1

Element, Conc.   Matrix      Method (a)    Method range
Si, 0.98%        Stainless   E1086/E572    0.01-0.9%/Not included in scope of method
Al, 1.02%        Stainless   (b)           Not applicable
Al, 1.04%        Stainless                 Not applicable
Al, 1.04%        Stainless                 Not applicable
Al, 1.18%        Stainless                 Not applicable
Al, (0.08%       Stainless                 Not applicable
Si, (0.07%       Stainless   E1086/E572    0.01-0.9%/Not included in scope of method
Si, 0.09%        Stainless   E1086/E572    0.01-0.9%/Not included in scope of method
Si, 0.03%        Stainless   E1086/E572    0.01-0.9%/Not included in scope of method

(a) Method(s) shown are those used most frequently in that test.
(b) Laboratories reported various methods of analysis for Al, including ASTM E572 (X-ray, Ref. [23]) and ASTM E1086 (optical emission (OES), Ref. [24]); neither of them includes the determination of Al. Other laboratories reported using techniques not supported by ASTM Test Methods, such as inductively coupled plasma and glow discharge OES.

This brief overview illustrates how PT providers can use the log-log plot of two related programs under one provider to identify and improve performance on any element/concentration pairs that tend to generate unusually high standard deviations. As progressive PT providers improve their robust standard deviations, their participating laboratories will need to improve their performance to maintain consistent Z-scores over time. Clearly, laboratories will not benefit by participating in PT programs with unusually large robust standard deviations, because those laboratories will earn relatively low Z-scores when their measurement uncertainties are relatively large. Conversely, laboratories will benefit by participating in PT programs that consistently provide low robust standard deviations. The long-term goal is to have all compositional-based PT programs show the same, minimum standard deviations at each tested concentration.

We believe that as this model is used and overall data quality is improved the best-fit line will drop to some minimum standard deviation, which is undefined at this time. However, we can compare log-log plots of interlaboratory standard deviation data from other programs. Figure 4 (Figs. 4-7 are available as electronic supplementary material) compares the best-fit line from Fig. 1 with the best-fit lines from the ISO/TC 17/SC 1 plot referred to in the Introduction. Note that the general agreement is good, especially considering that the two lines represent very large, totally independent data sets. Both measure interlaboratory standard deviations, but they were calculated from data derived under very different protocols. Many of the individual ISO data were from the final, most successful iteration of several optimization experiments. On the other hand, most of the ASTM data were generated under production conditions using many more pieces of equipment, each designed to handle a specific facility's fairly unique product mix. In many cases, the number of digits reported in test results is preset to meet local production requirements and may create rounding errors that make precision worse. Most of the ISO data resulted from calibrations with certified reference materials (CRMs) and pure chemicals that are highly reliable, while many of the ASTM data sets are from X-ray and optical emission instruments calibrated with secondary materials because appropriate calibrants are not always available. Also, the trace determinations of most residual elements in the 0.001 to 0.01 concentration range are not critical to routine spectrometric laboratories. Therefore, laboratories may not exercise as much diligence in controlling those elements.

Using data from two ASTM PT and ISO/TC 17/SC 1 test methods programs, we have shown that interlaboratory standard deviations from these diverse programs are very similar at any given concentration. We have also shown that opportunities exist for further improvement in the degree of curve-fit, once the respective organizations begin to evaluate and improve their procedures using the model. The similarities of these two curves tend to verify the model and should give laboratories the confidence they need to use this model to establish MQOs.
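The "power" fit behind Eq. 1 amounts to a straight-line least-squares fit on log-log axes. The following is a minimal sketch in Python, not the authors' procedure: the data are synthetic points generated from Eq. 1 (not the ASTM PT data), and the factor used in `flag_high` to mark points well above the line is a hypothetical threshold, since the text does not state the limit applied.

```python
import math


def fit_power_law(conc, sd):
    """Least-squares fit of log10(sd) = log10(a) + b*log10(conc).
    Returns (a, b) such that sd is approximately a * conc**b."""
    lx = [math.log10(c) for c in conc]
    ly = [math.log10(s) for s in sd]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = 10 ** (my - b * mx)
    return a, b


def flag_high(conc, sd, a, b, factor=2.0):
    """Indices of points whose sd exceeds `factor` times the fitted
    value. `factor` is an assumed, illustrative limit."""
    return [i for i, (c, s) in enumerate(zip(conc, sd)) if s > factor * a * c**b]


# Synthetic, illustrative data generated from Eq. 1 (wt.%).
conc = [0.001, 0.01, 0.1, 1.0, 10.0]
sd = [0.0384 * c**0.58 for c in conc]
a, b = fit_power_law(conc, sd)

if __name__ == "__main__":
    print(f"a = {a:.4f}, b = {b:.2f}")
    # One hypothetical point far above the line at 1 wt.%.
    print(flag_high(conc + [1.0], sd + [0.2], a, b))
```

Fitting in log space weights every decade of concentration equally, which suits data spanning more than four orders of magnitude.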
This correlation can be used to predict future performance of PT exercises. We will show how laboratories can use the predictability of future PT results to set MQOs and design control schemes to ensure satisfactory performance.

Establishing MQOs

Laboratories must understand client expectations and establish MQOs that are consistent with the agreed client expectations. High-quality PT programs that attract large numbers of ISO 17025 [25] compliant laboratories are reliable sources of competitive performance data that can be used by laboratories to negotiate realistic performance goals with clients. This approach is also useful to laboratories in helping clients better understand the uncertainty in their test results by establishing performance goals where none previously existed.

Starting with the interlaboratory robust standard deviation (data shown in Fig. 1), we estimate the intralaboratory standard deviation of the participants by dividing interlaboratory data by $\sqrt{2}$, a standard practice for estimating a reduction in variation by removing one major source of error [7]. These data are modeled using a "power" fit, shown as a line in Fig. 5 (available as supplemental electronic material), to yield

$$y = 0.0271\,x^{0.58} \tag{2}$$

The intralaboratory line represents the maximum uncertainty a single laboratory can have and still expect to receive Z-scores of less than 2 in 19 out of 20 PTs. This line can be used as the maximum allowable uncertainty for a laboratory. For example, to establish the maximum allowable uncertainty for a 0.3 wt.% Mn sample, Eq. (2) yields a result of 0.0135 wt.%.

Thus, if a laboratory's estimated uncertainty is 0.0135 wt.% (m/m) at 0.3 wt.% (m/m) they can be assured that they will receive Z-scores of less than 2 in any ASTM PT 95% of the time and that their instrument is operating correctly at this concentration level. This also means that a laboratory will receive Z-scores of greater than 2 in a PT 5% of the time due to random causes when a 95% confidence level is employed. It may be in the laboratory's best interest to reduce the MQO to allow them to work with less than a 5% probability of receiving a Z-score of greater than 2 due to random error. For the determination of some chemical species, only one method may be available and it may not be optimized, resulting in higher than the desired variation. In these cases, efforts should be made to continue optimizing the test method to reduce the amount of uncertainty in the reported value.

Developing control limits to comply with MQOs

Having defined the maximum allowable error to meet MQOs, such as to pass 95% of all PTs, it is easy to use the uncertainty budget concept to establish the maximum error to be allowed at each major step in the subject method. In a bias-free environment, measurement uncertainty can be described by pooling standard deviations as follows:

$$\sigma^{2}_{R,\mathrm{total}} = \sum_{i} \sigma^{2}_{R,i} \tag{3}$$

where:

$\sigma^{2}_{R,\mathrm{total}}$ = combined uncertainty in a measurement system,
$\sigma^{2}_{R,i}$ = each source of variance in measurement system R.

An essential step in utilizing the uncertainty budget is to identify all sources of uncertainty in the measurement system and then to quantify these sources to the extent possible. Individual sources of variation that contribute significantly to the combined uncertainty for a typical measurement system utilizing an analytical instrument are: instrument calibration (i.e. quantity and quality of CRMs used in the calibration protocol), instrument control, and field sampling. Other significant sources of variation may exist for specific situations and should be accounted for, as appropriate. Although field sampling is generally considered a significant source of uncertainty in an analytical measurement system, it will not be considered here because sampling variation is not significant in PT programs. It is recommended that accepted sampling techniques are used and documented when sampling is under the control of the laboratory organization to reduce this source of variation as much as possible.

The identified sources of variation can be used to expand Eq. 3 to better describe the measurement system:

$$\sigma^{2}_{R,\mathrm{total}} = \sigma^{2}_{R,\mathrm{Control}} + \sigma^{2}_{R,\mathrm{Calib}} \tag{4}$$

or

$$\sigma_{R,\mathrm{total}} = \sqrt{\sigma^{2}_{R,\mathrm{Control}} + \sigma^{2}_{R,\mathrm{Calib}}} \tag{5}$$

where:

$\sigma_{R,\mathrm{total}}$ = combined uncertainty in measurement system R
$\sigma^{2}_{R,\mathrm{Control}}$ = uncertainty due to instrumental control in measurement system R
$\sigma^{2}_{R,\mathrm{Calib}}$ = uncertainty due to instrument calibration in measurement system R.

$\sigma_{R,\mathrm{total}}$ from Eq. 5 can be set as equal to the MQO determined by Fig. 5. The laboratory must design and control its measurement systems so that the right-hand side of Eq. 5 is equal to or less than the combined uncertainty allowed in the system. This requires that the laboratory quantify the individual sources of variation contributing
to $\sigma_{R,\mathrm{total}}$. It is best, although not always practical, to quantify each source of variation separately.

Quantitation of the first variable in Eq. 5, calibration uncertainty ($\sigma^{2}_{R,\mathrm{Calib}}$), can be done using published references, such as uncertainty statements on Certificates of Analysis for CRMs and the degree of fit of the calibration curve. It is of particular importance to use the appropriate type and quantity of high quality CRMs when calibrating the instrument to minimize this source of variation. Using CRMs produced by an ISO Guide 34 [12] compliant producer will help ensure that the material is adequately characterized and that estimated uncertainties are given. ISO Guide 34, General Requirements for the Competence of Reference Material Producers, details the necessary elements a RM producer must have in place to generate high quality RMs.

A general means for distributing the allowable variance between the two sources and still meeting the combined MQO (Eq. 2) is by consecutive divisions by $\sqrt{2}$ as each source of variation is removed. Starting with the combined MQO, the first division will yield the MQO for control sample variation, the second division will yield the MQO for calibration variation. The MQO for control sample variation is described by

$$y = 0.0192\,x^{0.58} \tag{6}$$

and the MQO for calibration variation is described by

$$y = 0.0136\,x^{0.58} \tag{7}$$

Equations 6 and 7 are shown graphically in Fig. 2.

Fig. 2 Measurement quality objective (MQO) prediction vs. concentration (wt.%, m/m): squares - overall (intralaboratory), crosses - control material, circles - calibration

Equations 6 and 7 give the maximum allowable variation for control and calibration and are used to design test methods in a cost-effective manner. If the amount of variation is less than the maximum allowed in either control or calibration, it reduces the probability of failing a PT due to random causes or allows the variation to be given to other parts of the measurement system, as needed.

Applying the model to an ISO 17025 accredited laboratory

We offer the following data from Bethlehem Steel's Research Analytical Laboratory to demonstrate the practical utility of this model. The laboratory is accredited [26] for the analysis of steel, slag, and coated sheet steel.

For most accredited methods, the uncertainty in calibration is very low due to the fairly good supply of CRMs. With a good supply of reliable CRMs, the curve-fit of calibration curves is usually "tight" and there are abundant numbers of CRMs to calibrate and independently verify calibration curves. In these cases, the calibration errors are very small compared to the combined method uncertainty and usually do not have to be addressed as part of the uncertainty budget. Difficulties arise when there is a shortage of reliable calibrants such that the calibration errors become relatively significant and make it impossible to meet MQOs. In Bethlehem's laboratory, we try to keep the calibration contribution to the allowable variation in a measurement system between 10 and 30%. We found this to be an achievable goal for calibration in the majority of cases.

Figure 3 shows Bethlehem's two-sigma values for the statistical process control (SPC) samples associated with accredited methods. Note that most of the points are below the MQO for control, as reproduced from Fig. 2. The few points above the MQO for control samples are from 1.6% Al, 5.98% Mn, and 6.6% MgO in slag by x-ray fluorescence (XRF) and 0.022% S in steel by optical emission spectrometry (OES). We believe these slight problems are related to a minor homogeneity problem in the RMs used to control the OES and XRF.

Fig. 3 Control material comparison. MQO compared to statistical process control (SPC) data vs. concentration (wt.%, m/m): crosses - control material MQO, triangles - SPC data

As an ISO 17025 accredited laboratory, the Bethlehem Laboratory quantifies and reports to its clientele the estimated uncertainties (95% confidence) associated with its measurement systems. To comply with the log-log model, Bethlehem's estimated uncertainty (95% confidence) must be below the predicted MQO. The best-fit line of the reported estimated uncertainties for each accredited test method is shown in Fig. 6 (available as supplemental electronic material). Figure 6 shows that the best-fit line describing the estimated uncertainties (95% confidence) is very near the MQO, as reproduced from Fig. 2.

As final confirmation that the model will help laboratories pass PTs, we offer the Bethlehem Research Analytical Laboratory's historical performance on ASTM's Plain Carbon/Low Alloy Steel PTs. In 1999, the Bethlehem Laboratory had an average Z-score (defined as the summation of Z-score values/the total number of Z-score values) of about 0.04, indicating that they were an average of 0.04 robust standard deviations from the robust mean of the sample being tested, an exemplary record. Figure 7 (available as supplemental electronic material) is a frequency diagram of the individual Z-score ratings received by Bethlehem in 1999. Figure 7 shows a bell-shaped curve, with a maximum frequency near 0.1, indicating that laboratories in compliance with the presented model, as Bethlehem is, will consistently receive acceptable Z-scores.

Advantages to the laboratory

Use of this model offers many advantages to a laboratory organization. Some advantages are: a) consistently pass PTs, such as those administered by ASTM, b) realistic MQOs based on historical interlaboratory performance data from many competent laboratories, c) protocol for establishing cost effective control strategies, d) compliance with accreditation requirements, e) demonstrate competence to laboratory users, f) training tool to easily explain how MQOs are determined and how control limits are established, g) uniform approach to uncertainty calculations, and h) determine if an individual PT run was flawed, using historical data.

Summary

This paper presented a simple, empirical model to help determine MQOs for analytical systems based on interlaboratory PT data from ASTM Committee E01 programs. The model can be used to determine what estimated uncertainty can be expected in a measurement result generated during routine analysis if the measurement system is optimized and bias free. Using the model and the principles of the uncertainty budget, compliant laboratories can set MQOs, ensure that they consistently pass PTs, and establish control protocols in a cost-effective manner.

Acknowledgements The authors thank the management of Bethlehem Steel Corp. and ASTM for their support and for permission to publish this paper. In addition, we thank Bethlehem's Research Analytical Laboratory and support staff at ASTM for their valuable input and supporting data needed to develop this model. Finally, we acknowledge the tireless efforts of the volunteers who draft, debate, revise, and ballot the nationally and internationally accepted guides and standards used in developing this publication.

Accred Qual Assur (2000) 5:464-469
© Springer-Verlag 2000

Adriaan M.H. van der Veen
Jean Pauwels

Uncertainty calculations in the certification of reference materials.
1. Principles of analysis of variance

Abstract The preparation and certification of reference materials is a rapidly developing area. Many innovative reference materials have limited homogeneity and stability, and, additionally, the uncertainty estimation of the property values must be brought in agreement with the principles of the "Guide to the expression of uncertainty in measurement" (GUM). The results of the homogeneity and stability studies must be included to a certain extent in the uncertainty of the property values of the reference material, in order to comply with these requirements. The basic theory needed to accomplish this is essentially the theory of analysis of variance (ANOVA). As GUM also allows alternative evaluations other than Type A evaluations, a reinterpretation of the theory of ANOVA is necessary to establish a model for the certification of reference materials that is widely applicable. For this, analysis of variance can be used as a statistical technique to derive standard uncertainties from homogeneity, stability and characterisation data.

Keywords Reference materials · Measurement uncertainty · Analysis of variance · Homogeneity study · Stability study

A.M.H. van der Veen (✉)
Nederlands Meetinstituut,
Schoemakerstraat 97, 2600 AR Delft,
The Netherlands
e-mail: avdveen@nmi.nl
Tel.: +31-15-2691 733
Fax: +31-15-2612971

J. Pauwels
European Commission, Joint Research
Centre, Institute for Reference Materials
and Measurements, Retieseweg,
2440 Geel, Belgium
e-mail: Jean.Pauwels@irmm.jrc.be
Tel.: +32-14-571 722
Fax: +32-14-590406

Introduction

There are many experiments where two or more uncertainty components are involved, which are evaluated simultaneously by means of a Type A evaluation. In the certification of (batch) reference materials, this situation can be observed in the homogeneity study, stability study, and/or characterisation of the material. For instance, the homogeneity of a batch of subsamples is to be determined. The experiment involves two uncertainty components: the repeatability of the measurement (method) and the between-sample variation. In terms of analysis of variance (ANOVA), the samples are at the level of groups, and the repeatability of the measurement method is found within the groups. The basic model for this type of ANOVA reads as

$$Y_{ij} = \mu + A_i + \varepsilon_{ij} \qquad (1)$$

where $Y_{ij}$ is the result of a single measurement in the experiment. $\mu$ is the expectation of $Y_{ij}$, which is the value that $Y_{ij}$ takes up when the number of repeated measurements tends to infinity. It should be noted that $\mu$ is not the true value. It does not even include any aspect of traceability to external references, unless special precautions have been taken. $A_i$ is a bias term, due to the (random) differences in the extracts. The variable is assumed to be normally distributed, with mean zero and variance $\sigma_A^2$. Furthermore, it is assumed that $A_i$ is independent of all $\varepsilon_{ij}$, which are also normally distributed variables with mean zero and variance $\sigma^2$ [1]. There are $a$ groups, and each of them contains $n_i$ members. Ideally, the number of members in groups should be equal, but in practice this is often not the case. In order to make this paper useful to practitioners, the more complex formulae for incomplete data sets are given, rather than the simpler ones for complete data sets.
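To make the model of Eq. (1) concrete, the following minimal Python sketch draws an incomplete data set from it. It is an illustration only: the function name `simulate_one_way`, the parameter names and the numerical values are assumptions chosen for this example and do not come from the paper.

```python
import random


def simulate_one_way(mu, sigma_a, sigma, group_sizes, seed=0):
    """Draw Y_ij = mu + A_i + e_ij for groups of (possibly unequal) size n_i.

    A_i ~ N(0, sigma_a^2) is shared by all members of group i,
    e_ij ~ N(0, sigma^2) is drawn independently for every measurement.
    """
    rng = random.Random(seed)  # seeded so that the example is reproducible
    data = []
    for n_i in group_sizes:
        a_i = rng.gauss(0.0, sigma_a)  # group-level term A_i
        data.append([mu + a_i + rng.gauss(0.0, sigma) for _ in range(n_i)])
    return data


# Hypothetical incomplete data set: a = 4 groups with n_i = 3, 2, 3, 1 members.
groups = simulate_one_way(mu=10.0, sigma_a=0.5, sigma=0.2, group_sizes=[3, 2, 3, 1])
```

Setting `sigma=0.0` makes all members of a group identical, which shows that the within-group scatter carries only the repeatability term, while the differences between groups carry $A_i$.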
The objective of the uncertainty evaluation of the experiment sketched is to obtain estimates for $\sigma_A^2$ and $\sigma^2$, where the first refers to the variance of the $A_i$, and the second refers to the variance in $\varepsilon_{ij}$. The model for a two-way fully nested ANOVA reads as

$$Y_{ijk} = \mu + A_i + B_{ij} + \varepsilon_{ijk} \qquad (2)$$

where a second bias term has been introduced: $B_{ij}$. For this bias term, which is at the level of subgroups, the same assumptions are made as for $A_i$: that it is normally distributed with mean zero, and that it is independent from both any $A_i$ and any $\varepsilon_{ijk}$. The subscript "$B \subset A$" should be read as "among subgroups, within groups", as $A$ represents the level of groups (for example: samples), and $B$ represents the level of subgroups (for example: extracts). In a homogeneity study, a two-way ANOVA might be considered if, in addition to the between-sample variation, the repeatability of subsampling and extraction is also to be determined.
For the calculation of variances from these complex experiments, two things are needed. First a method is needed to partition the total scattering into contributions, attributed to the various levels in the ANOVA. In a second step, these contributions are converted into variances. These variances can directly be used in uncertainty evaluations that are compliant with the "Guide to the expression of uncertainty in measurement" (GUM) [2].

Partitioning sums of squares

Scattering of data can be represented in various ways. Probably the best known way is to express scattering of data in terms of variances, covariances, and standard deviations. In analysis of variance, the scattering is often expressed in terms of sums of squared differences, or in short "sums of squares". These sums of squares express the scattering at various (hierarchic) levels in the analysis of variance. At the top level, the total sum of squares ($SS_{\mathrm{total}}$) is defined

$$SS_{\mathrm{total}} = \sum_{i=1}^{a} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{\bar{Y}}\right)^2 \qquad (3)$$

where

$$\bar{\bar{Y}} = \frac{1}{\sum_{i=1}^{a} n_i} \sum_{i=1}^{a} \sum_{j=1}^{n_i} Y_{ij} \qquad (4)$$

denotes the grand mean; $a$ is the number of groups and $n_i$ is the number of members in the group. $SS_{\mathrm{total}}$, the total sum of squares, can be partitioned as follows. First, $SS_{\mathrm{within}}$ will be defined

$$SS_{\mathrm{within}} = \sum_{i=1}^{a} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}_i\right)^2 \qquad (5)$$

$SS_{\mathrm{within}}$ is the part of the total sum of squares that can be attributed to the variation within groups. At the level of groups, the expression for the sum of squares reads

$$SS_{\mathrm{among}} = \sum_{i=1}^{a} n_i \left(\bar{Y}_i - \bar{\bar{Y}}\right)^2 \qquad (6)$$

Without proof, the relationship between the three sums of squares reads as

$$SS_{\mathrm{total}} = SS_{\mathrm{among}} + SS_{\mathrm{within}} \qquad (7)$$

Each of these sums of squares has a well-defined number of degrees of freedom. $SS_{\mathrm{among}}$ has $a-1$ degrees of freedom, $SS_{\mathrm{within}}$ has $\sum n_i - a$ degrees of freedom and $SS_{\mathrm{total}}$ has $\sum n_i - 1$ degrees of freedom. Dividing $SS_{\mathrm{within}}$ and $SS_{\mathrm{among}}$ by their respective number of degrees of freedom leads to the respective mean squares, abbreviated as MS. $MS_{\mathrm{within}}$ can thus be calculated as

$$MS_{\mathrm{within}} = \frac{SS_{\mathrm{within}}}{\sum_{i=1}^{a} n_i - a} \qquad (8)$$

and $MS_{\mathrm{among}}$ is thus defined as

$$MS_{\mathrm{among}} = \frac{SS_{\mathrm{among}}}{a-1} \qquad (9)$$

The mean squares take up the form of variances, but they are, apart from $MS_{\mathrm{within}}$, not equal to the variance at their specific level.
The main objective of this partitioning is that it enables the separation of different effects that contribute to the combined standard uncertainty of the measurand. This separation only makes sense if an uncertainty component obtained from one experiment is used in another experiment, thus in the case of a kind of Type B evaluation [2].

Estimating uncertainties from ANOVA

In order to be able to calculate variances from mean squares, the first thing needed is the expectations of the mean squares, expressed in terms of variances $\sigma_A^2$ and $\sigma^2$. These relationships read as

$$MS_{\mathrm{within}} = \sigma^2 \qquad (10)$$

$$MS_{\mathrm{among}} = \sigma^2 + n_0 \sigma_A^2 \qquad (11)$$

where $n_0$ is a function of the number of degrees of freedom. For a complete data set, where for any value of $i$, $n_i = n$, then $n_0 = n$. It has been determined by mathematical statisticians that the appropriate value for $n_0$ for incomplete data sets reads as
$$n_0 = \frac{1}{a-1}\left[\sum_{i=1}^{a} n_i - \frac{\sum_{i=1}^{a} n_i^2}{\sum_{i=1}^{a} n_i}\right] \qquad (12)$$

Using the expressions for the mathematical expectations of the mean squares, and considering the fact that only estimates are available, the following equations result

$$s_{\mathrm{within}}^2 = MS_{\mathrm{within}} \qquad (13)$$

$$s_A^2 = \frac{MS_{\mathrm{among}} - MS_{\mathrm{within}}}{n_0} \qquad (14)$$

Although $s_A^2$ is an unbiased estimator for $\sigma_A^2$, and can be employed as such, there is some aspect to keep in mind. In various references [3], the following expression for the confidence interval for this variance ratio can be found

$$F_{0.975}\left(1 + n\frac{\sigma_A^2}{\sigma^2}\right) < \frac{s_A^2}{s^2} < F_{0.025}\left(1 + n\frac{\sigma_A^2}{\sigma^2}\right) \qquad (15)$$

where $F_{0.975}$ and $F_{0.025}$ are the lower and upper 2.5% one-tailed levels of $F$ with numbers of degrees of freedom of $a-1$ and $a(n-1)$, respectively. This expression can be rearranged to

$$\frac{1}{n}\left(\frac{s_A^2}{s^2 F_{0.025}} - 1\right) < \frac{\sigma_A^2}{\sigma^2} < \frac{1}{n}\left(\frac{s_A^2}{s^2 F_{0.975}} - 1\right) \qquad (16)$$

From this formula, it can be seen that $s^2$ has an impact on the estimator $s_A^2$; this can be understood by considering that $s^2/n$ defines the "resolution" of the method for obtaining $s_A^2$. The smaller $s^2/n$, the better the estimator for $s_A^2$. This fact plays an important role when transferring uncertainty components from one experiment to another.

Two-way fully nested design

The expressions of the variances $s_A^2$, $s_{B \subset A}^2$, and $s^2$ from a two-way fully nested design are developed in a similar way as those for the one-way layout. As in uncertainty calculations for especially stability studies often a two-way design is necessary (1: time; 2: samples; 3: repeatability of measurement), the formulae are given below. The model for a two-way ANOVA has been given in Eq. (2). The grand mean is computed using

$$\bar{\bar{Y}} = \frac{1}{\sum_{i=1}^{a} \sum_{j=1}^{b_i} n_{ij}} \sum_{i=1}^{a} \sum_{j=1}^{b_i} \sum_{k=1}^{n_{ij}} Y_{ijk} \qquad (17)$$

In this expression, $a$ denotes the number of groups, $b_i$ the number of subgroups within groups, and $n_{ij}$ the number of observations in the subgroups. The formula for the grand mean takes into consideration that often the data set is not complete. The expressions for the partitioned sums of squares read as [1]

$$SS_{\mathrm{among}} = \sum_{i=1}^{a} n_i \left(\bar{Y}_A - \bar{\bar{Y}}\right)^2 \qquad (18)$$

$$SS_{B \subset A} = \sum_{i=1}^{a} \sum_{j=1}^{b_i} n_{ij} \left(\bar{Y}_B - \bar{Y}_A\right)^2 \qquad (19)$$

$$SS_{\mathrm{within}} = \sum_{i=1}^{a} \sum_{j=1}^{b_i} \sum_{k=1}^{n_{ij}} \left(Y_{ijk} - \bar{Y}_B\right)^2 \qquad (20)$$

where $n_i = \sum_j n_{ij}$ is the number of observations in group $i$. The double bar over $Y$ denotes the grand mean; although one could argue that it should be a triple bar, a grand mean in the ANOVA literature is always denoted by a double bar. Likewise, group and subgroup means ($\bar{Y}_A$ and $\bar{Y}_B$) are always denoted by a single bar. The subscript denotes the level: $A$ is the top level, $B$ is the second level. The second step in developing expressions for the variances $\sigma_A^2$, $\sigma_{B \subset A}^2$, and $\sigma^2$ is to convert the sums of squares into their respective mean squares. The expressions for the three mean squares are

$$MS_{\mathrm{among}} = \frac{SS_{\mathrm{among}}}{a-1} \qquad (21)$$

$$MS_{B \subset A} = \frac{SS_{B \subset A}}{\sum_{i=1}^{a} b_i - a} \qquad (22)$$

$$MS_{\mathrm{within}} = \frac{SS_{\mathrm{within}}}{\sum_{i=1}^{a} \sum_{j=1}^{b_i} n_{ij} - \sum_{i=1}^{a} b_i} \qquad (23)$$

which are, after the discussion of the one-way ANOVA, self-explanatory. In the denominators, the expressions for the respective number of degrees of freedom are given. The number of degrees of freedom is equal to the number of observations, minus the number of parameters computed from them. Thus, the number of degrees of freedom among groups equals $a-1$, as there are $a$ group means and there is one parameter computed from them (in fact, the grand mean).
These mean squares can be converted into variances using the following expressions

$$\hat{\sigma}_{\mathrm{method}}^2 = MS_{\mathrm{within}} \qquad (24)$$

$$\hat{\sigma}_{B \subset A}^2 = \frac{MS_{B \subset A} - MS_{\mathrm{within}}}{n_0} \qquad (25)$$

$$\hat{\sigma}_A^2 = \frac{MS_{\mathrm{among}} - n'_0\, \hat{\sigma}_{B \subset A}^2 - \hat{\sigma}^2}{(nb)_0} \qquad (26)$$

Again, $MS_{\mathrm{within}}$ equals $\sigma^2$. $\sigma_{B \subset A}^2$ is calculated by subtracting $MS_{\mathrm{within}}$ from $MS_{B \subset A}$. Likewise, $\sigma_A^2$ could be obtained by subtracting $MS_{B \subset A}$ from $MS_{\mathrm{among}}$. For incomplete data sets, it is less effort to compute $\sigma_A^2$ as
stated in Eq. (26) [1]. For higher-order ANOVAs, the pattern is similar. In [1], higher-order ANOVA as well as other designs of ANOVA are given. The expressions for the denominators of Eqs. (25) and (26) as well as in the numerator of Eq. (26) read as follows

$$n'_0 = \frac{\sum_{i=1}^{a} \left(\dfrac{\sum_{j=1}^{b_i} n_{ij}^2}{\sum_{j=1}^{b_i} n_{ij}}\right) - \dfrac{\sum_{i=1}^{a} \sum_{j=1}^{b_i} n_{ij}^2}{\sum_{i=1}^{a} \sum_{j=1}^{b_i} n_{ij}}}{a-1} \qquad (27)$$

$$n_0 = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b_i} n_{ij} - \sum_{i=1}^{a} \left(\dfrac{\sum_{j=1}^{b_i} n_{ij}^2}{\sum_{j=1}^{b_i} n_{ij}}\right)}{\sum_{i=1}^{a} b_i - a} \qquad (28)$$

$$(nb)_0 = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b_i} n_{ij} - \dfrac{\sum_{i=1}^{a} \left(\sum_{j=1}^{b_i} n_{ij}\right)^2}{\sum_{i=1}^{a} \sum_{j=1}^{b_i} n_{ij}}}{a-1} \qquad (29)$$

As already pointed out, for practical reasons, the formulae for incomplete data sets are given deliberately. In most references, e.g. ISO 5725-3 [5] among others, usually only the formulae for complete data sets are given. From a theoretical point of view, this may find its justification in that ANOVA has been developed for complete data sets, but a few observations (or even subgroups) missing does not necessarily mean that the whole experimental set-up has become invalid. However, the formulae needed certain modifications in order to work with the (approximately) correct numbers of degrees of freedom. Obviously, the more "holes" in the data set, the poorer the method works, and the poorer the results.

Useful relationships and inferences

The significance of ANOVA goes beyond the applications sketched here. Traditionally, in chemistry ANOVA has always been associated with the F-test, testing mean square ratios for significance. Although there are cases where this becomes relevant, in uncertainty evaluations it is rarely needed. Often, it is sufficient to draw up the expression for the combined standard uncertainty, and if components coming from ANOVA are insignificant, then they will effectively drop out in the summation anyway. What matters is the significance of the uncertainty components in the ANOVA in relation to the combined standard uncertainty, not so much in relation to the repeatability of the measurement method used.
Returning to a one-way ANOVA, a useful relationship can be developed from the expression for $MS_{\mathrm{among}}$. It is defined as

$$MS_{\mathrm{among}} = \frac{\sum_{i=1}^{a} n_i \left(\bar{Y}_i - \bar{\bar{Y}}\right)^2}{a-1} \qquad (30)$$

The second relationship that is of interest is the expression for a sample variance

$$s^2 = \frac{\sum_{i=1}^{a} \left(Y_i - \bar{Y}\right)^2}{a-1} \qquad (31)$$

If, in the expression for $MS_{\mathrm{among}}$, $n_i$ is set to unity for all $i$, then this expression becomes identical with the one for a sample variance for a row of results. So, from a matrix of a one-way analysis of variance, $MS_{\mathrm{among}}$ can be computed directly from the variance of the group means.
Furthermore, returning to the model of a one-way ANOVA and the principle of propagation of uncertainties, the following expression can be developed

$$u^2(Y_{ij}) = u^2(A_i) + u^2(\varepsilon_{ij}) \qquad (32)$$

For the group means, the following expression can be derived

$$u^2(\bar{Y}_i) = u^2(A_i) + \frac{u^2(\varepsilon_{ij})}{n} \qquad (33)$$

whereby it has been assumed that all groups are complete, i.e. $n_i = n$ for all $i$. Combining this result with Eq. (31) leads to the interesting result that

$$s^2 = s_A^2 + \frac{s_{\mathrm{within}}^2}{n} \qquad (34)$$

which is consistent. Furthermore, under the assumption that $n_i = n$ for all $i$,

[equation missing in source] (35)

These formulae also open up other options. According to GUM [2], there is no difference in nature and properties of a standard uncertainty coming from a Type A or a Type B evaluation. Accepting this principle, the formulae given also open up possibilities to work with a combination of Type A/Type B evaluation of uncertainty. Especially in cases where a series of data is to be processed from which it is known that the values carry an uncertainty (apart from the variation inherent to the data set), the formulae given may provide a basis for developing a procedure for the uncertainty analysis, based on this ANOVA work. This actually is one of the
reasons why the ANOVA theory is very suitable for the description of the certification of batch reference materials.
An illustration of this runs as follows. Suppose a series of data is obtained, with a certain degree of scattering, but from which it is known that each member of the series has some additional measurement uncertainty. This measurement uncertainty may be a combination of uncertainty from Type A and Type B analysis, but it is assumed that there is only one additional uncertainty component ($u_{\mathrm{add}}$) to be considered, and it comes from a Type B evaluation. This assumption does not affect the general validity of this inference. Using Eq. (31), the variance can be computed. What about the uncertainty of the mean? It is known that each $Y_i$ in the series has this component $u_{\mathrm{add}}$. Given Eq. (33), it must be noted that it is impossible to determine $u^2(A_i)$ and $u^2(\varepsilon_{ij})$ separately. This is the "penalty" for not having more than one data point per "group".
Applying the principle of uncertainty propagation, two alternatives can now be developed, which represent extremes

$$u^2(m) = \frac{s^2}{a} + u_{\mathrm{add}}^2 \qquad (36)$$

and

$$u^2(m) = \frac{s^2 + u_{\mathrm{add}}^2}{a} \qquad (37)$$

The difference between the two is obvious. The first alternative leads to a greater value for $u^2(m)$ than the second. Under what conditions can the second be used, and under what conditions must Eq. (36) be used? It depends on the nature of $u_{\mathrm{add}}$, and at what level it affects the uncertainty in the mean $m$. If $u_{\mathrm{add}}$ is the same for all $Y_i$, it is clear that $u_{\mathrm{add}}$ affects $m$ at its own level: the scattering of $Y_i$ has no relationship with the value of $u_{\mathrm{add}}$. If $u_{\mathrm{add}}$ is specific to each $Y_i$, then it affects the uncertainty of $m$ at the level of $Y_i$, and in this case Eq. (37) can be used.
In terms of correlations, Eq. (36) represents the case where the system is fully correlated with respect to $u_{\mathrm{add}}$, whereas Eq. (37) represents the fully independent case. In practice, usually it is not clear whether this additional uncertainty source is correlated or not. As a principle, in lack of information, the conservative alternative should be chosen, unless positive evidence is available that the assumption of independence holds. That is, in cases of doubt, Eq. (36) should be used instead of Eq. (37).

Underlying assumptions revisited

As with most mathematical and statistical techniques, the computational methods as presented are only valid within certain restrictions. The first equation setting restrictions is the model, as specified in Eq. (1) for a one-way ANOVA and in Eq. (2) for a fully nested two-way ANOVA. These assumptions have already been discussed. The requirement of independence of the variables on the right-hand sides of Eqs. (1) and (2) is probably the most critical one. This assumption seems to be met in many cases in analytical chemistry as well as in physical testing, but it cannot be taken for granted that this assumption is always valid. For instance, heterogeneity of a material will in a homogeneity study lead to a greater value for $\mathrm{Var}(A_i)$, but usually $\mathrm{Var}(\varepsilon_{ij})$ also increases with increasing heterogeneity.
Another important point related to the model is that the data obtained do not show any trend. This may seem obvious, but for some applications (e.g. computing uncertainty budgets from stability studies) it is something that should be carefully investigated. In the absence of any kind of trend, the ANOVA approach is valid. Otherwise, some kind of trend analysis or regression technique is recommended. This requirement boils down to the usefulness of a grand mean: in the absence of a trend, the grand mean is (from a theoretical point of view) a useful property. If there is a trend, the concept of calculating a grand mean, other than for internal purposes of the regression analysis, is of doubtful value.
A further assumption already mentioned is the normality of data. This assumption is quite notorious, and has led to a variety of statistical tests for normality. Probably the best known test for this purpose is the Kolmogorov-Smirnov test [4] or one of its variants. Normality of data is often assumed, but not so often observed as desired. Statistical tests which require normality of data are known to be very sensitive with respect to deviations from normality. A skewed distribution may very well lead to completely wrong decisions.
In the applications discussed here, ANOVA is used as a method for obtaining values for uncertainty components. These values are variances, and their value is relatively insensitive with respect to the underlying distribution. This means that the evaluation method as discussed is quite robust with respect to the actual distribution of the data. This is an important aspect, as it makes testing for normality redundant. The variances thus obtained can be treated as any other variances from Type A or Type B evaluations [2].
The fact that non-normality of data does not play such a role as in the classical use of ANOVA is discussed in ISO 5725-1 and -3 [5, 6], and ISO Guide 35 [7]. In the classical approach, the F-test for testing significance of ratios of mean squares plays a dominant role. This F-test is very sensitive with respect to non-normality of the underlying data, thus leading to false-positive or false-negative results.
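For reference, the one-way evaluation whose assumptions are discussed above can be sketched in a few lines of Python. The sketch follows Eqs. (4)-(6), (8), (9), (12) and (14) for an incomplete data set; the function name `one_way_variance_components`, the dictionary keys and the example numbers are assumptions made for this illustration, not part of the paper.

```python
def one_way_variance_components(groups):
    """Estimate s^2 (within) and s_A^2 (among) from a list of groups.

    `groups` is a list of lists; group i holds its n_i results Y_ij.
    Unequal n_i are allowed. Requires at least two groups and at least
    one group with more than one member.
    """
    a = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)

    grand_mean = sum(sum(g) for g in groups) / n_total            # Eq. (4)
    means = [sum(g) / len(g) for g in groups]                     # group means

    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, means) for y in g)    # Eq. (5)
    ss_among = sum(n * (m - grand_mean) ** 2
                   for n, m in zip(sizes, means))                 # Eq. (6)

    ms_within = ss_within / (n_total - a)                         # Eq. (8)
    ms_among = ss_among / (a - 1)                                 # Eq. (9)

    # Effective group size for incomplete data sets, Eq. (12)
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (a - 1)

    # Among-group variance, Eq. (14); the raw estimate is returned,
    # so it can be negative when MS_among < MS_within.
    s2_a = (ms_among - ms_within) / n0

    return {"ms_within": ms_within, "ms_among": ms_among,
            "n0": n0, "s2_within": ms_within, "s2_A": s2_a}


# Hypothetical incomplete data set with n_i = 2, 3, 1
result = one_way_variance_components([[1.0, 3.0], [4.0, 6.0, 5.0], [8.0]])
```

For a complete data set the computed `n0` reduces to the common group size $n$, which is a quick check that Eq. (12) has been coded correctly.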
In the whole process of uncertainty evaluation, testing on ratios of mean squares or testing on ratios of variance for significance has become superfluous, as in the establishment of the combined standard uncertainty any components that are insignificant drop out automatically. There is something more to add to the problem of significance/insignificance. The classical F-test approach tests for significance within the context of the experiment. This may be relevant in some cases, but in most experiments carried out in this context, it is not. It is - in principle - not of interest whether an effect observed and quantified is significant with respect to some other uncertainty component present in the experiment.

Practical applications

The practical applications of partitioning of sums of squares are manifold, and many of them have been reported in the open literature in the past few years. In analytical chemistry, steps like extraction, destruction, subsampling and sampling can only be evaluated using some kind of ANOVA technique. GUM [2], however, requires the repeatability of these kinds of steps to be expressed as standard uncertainties. These components, however, only take into consideration the random effects of these steps. Effects systematic to the experiment, such as incomplete extraction, incomplete destruction, and biased (sub)sampling must be assessed separately if needed. The statistical techniques as discussed in this paper deal essentially with Type A evaluation of these experiments. However, this source of uncertainty is frequently overlooked in uncertainty evaluations.
In the process of preparation and certification of reference materials, ANOVA plays an important role in the evaluation of the uncertainty due to inhomogeneity, instability, and characterisation of the candidate reference material. In the parts to follow, the evaluation of data from these studies, as well as developing an expression for the combined standard uncertainty of the reference material, will be discussed.

Concluding remarks

The use of ANOVA in analytical chemistry, and more generally in physical, chemical and biological testing, has gained renewed interest as a suitable technique to perform complex Type A evaluations of experiments addressing multiple uncertainty components. These experiments include homogeneity and stability testing, as part of the process of producing (certified) reference materials. In routine laboratories, typical experiments that can be evaluated using ANOVA techniques are studies into the performance of destruction and extraction methods, methods for sample preparation and related studies.
The variances obtained from ANOVA can be used directly in uncertainty evaluations in compliance with GUM. The method of evaluation is less sensitive towards the influence of the actual distribution of data. Even if the distribution of data deviates considerably from the normal distribution, the method can be applied.

List of symbols

$A_i$ bias term at the group level
$B_{ij}$ bias term at the subgroup level (two-way ANOVA or higher)
$\varepsilon_{ij}$ random term at the within-group level (index from one-way ANOVA)
MS mean of sum of squares
SS sum of squares (sum of squared differences)
$\mu$ expectation of $Y_{ij}$
$s$ standard deviation
$s_A$ standard deviation at the top level of ANOVA
$s_{B \subset A}$ standard deviation at the subgroup level (two-way ANOVA)
$u_{\mathrm{add}}$ additional uncertainty component
$Y_{ij}$ result of a single measurement (index from one-way ANOVA)
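Using the symbols listed above, the two limiting cases of Eqs. (36) and (37) can be compared numerically. The following Python sketch is only an illustration under stated assumptions: the function names `u2_mean_correlated` and `u2_mean_independent` and the example values are hypothetical and are not taken from the paper.

```python
import statistics


def u2_mean_correlated(values, u_add):
    """Eq. (36): u_add is common to all Y_i (fully correlated case)."""
    s2 = statistics.variance(values)   # sample variance, Eq. (31)
    return s2 / len(values) + u_add ** 2


def u2_mean_independent(values, u_add):
    """Eq. (37): u_add is specific to each Y_i (fully independent case)."""
    s2 = statistics.variance(values)   # sample variance, Eq. (31)
    return (s2 + u_add ** 2) / len(values)


# Hypothetical series of a = 5 results, each carrying u_add = 0.1
series = [10.1, 9.9, 10.0, 10.2, 9.8]
u2_conservative = u2_mean_correlated(series, u_add=0.1)
u2_optimistic = u2_mean_independent(series, u_add=0.1)
```

With these numbers the correlated case gives the larger squared uncertainty of the mean, in line with the recommendation to use Eq. (36) when there is no positive evidence of independence.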

References

1. Sokal RR, Rohlf FJ (1995) Biometry, 3rd edn. Freeman, New York
2. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1995) Guide to the expression of uncertainty in measurement, 1st edn. ISO, Geneva
3. Snedecor GW, Cochran WG (1989) Statistical methods, 8th edn. Iowa State University Press, Iowa, USA, Chapter 13
4. Law AM, Kelton WD (1991) Simulation modeling and analysis, 2nd edn. McGraw Hill, New York, Chapter 6
5. ISO 5725-3:1994 (1994) Accuracy (trueness and precision) of measurement methods and results - Part 3: Intermediate measures of the precision of a standard measurement method in statistical methods for quality control. International Organization for Standardization (ISO), Geneva, pp 75-104
6. ISO 5725-1:1994 (1994) Accuracy (trueness and precision) of measurement methods and results - Part 1: General principles and definition of statistical methods for quality control, vol. 2. International Organization for Standardization (ISO), Geneva, pp 9-29
7. ISO Guide 35:1989 (1989) Certification of reference materials - General and statistical principles, 2nd edn. International Organization for Standardization (ISO), Geneva
Accred Qual Assur (2001) 6:26-30
© Springer-Verlag 2001

Adriaan M.H. van der Veen
Thomas Linsinger
Jean Pauwels

Uncertainty calculations in the certification of reference materials.
2. Homogeneity study

Abstract Many reference materials undergo a batch certification, which implies that a small number of samples is taken from a batch, characterised, and these results are then assumed to be representative of all remaining samples. An important aspect in this design is the translation of the characterisation data to a single sample, as usually the laboratory will be using only one sample of the batch. This form of homogeneity is very important and can be influenced to a certain extent by well-designed sample preparation procedures. Another subsampling problem associated with many reference materials is that only a small test portion is drawn from the sample to carry out the measurement. Obviously, this test portion must be representative of the sample, otherwise the certified value is still not applicable. Both kinds of homogeneity tests are examined in the paper and evaluated using practical examples.

Keywords Reference materials · Measurement uncertainty · Analysis of variance · Homogeneity study · Minimum sample intake

A.M.H. van der Veen (✉)
Nederlands Meetinstituut,
Schoemakerstraat 97, 2600 AR Delft,
The Netherlands
e-mail: avdveen@nmi.nl
Tel.: +31-15-2691 733
Fax: +31-15-2612971

T.P. Linsinger · J. Pauwels
European Commission, Joint Research
Centre, Institute for Reference Materials
and Measurements, Retieseweg,
2440 Geel, Belgium

Introduction

Homogeneity testing is a well-known phenomenon in the preparation and certification of reference materials. It also finds its application in the preparation and checking of proficiency testing material. The design of the homogeneity study is the same for both between-bottle and within-bottle testing, as the same question needs to be answered. For a "between-bottle" homogeneity test, where the differences among samples are of interest, the problem can be stated as follows. What do these differences contribute to the uncertainty of the characterisation of material, taking into consideration that the user of the material will only be using one sample at a time? The name "between-bottle" has been chosen, as most reference materials (and proficiency testing materials) are shipped in bottles; however, the whole discussion is also valid for vials, gas cylinders, and test pieces, to give a few examples.
Within-bottle homogeneity plays a role if a subsampling step is required. For instance, for measuring a gas mixture shipped in a cylinder, a subsample is taken, so within-bottle homogeneity is an issue here. In this case, as in most other cases of solutions and mixtures, the heterogeneity in the bottle can effectively be removed by shaking, rolling or some other handling that allows the mixture to become homogeneous again. For particulate materials, the problem is different. Apart from reprocessing the material, there is no option of improving the within-bottle homogeneity. A certain minimum amount of the sample must be taken, so that the test portion represents the contents of the package. In terms of measurement uncertainty, the smaller the amount taken the higher the measurement uncertainty. Thus, this would lead to a situation where the uncertainty of the characterisation of the material is no longer valid for the test portion, even in the case where the between-bottle homogeneity has been taken into consideration.
Uncertainty calculations in the certification of reference materials. 2. Homogeneity study

Experimental set-up and the relationship with analysis of variance (ANOVA)

A typical experimental set-up for a between-bottle homogeneity study is visualised in Fig. 1. On the left-hand side, the case is given where subsampling is impossible, or just not done. With test pieces, often only one test is possible, so in this case, n, the number of replicates, equals 1. In those cases where the sample allows multiple measurements after transformation, n will generally be greater. In those cases where n > 1, the data can be treated with ANOVA [1]. In this paper, a few examples will be given.

The effect of between-bottle homogeneity is in the variance "among groups" [1], as well as the effect of the transformation of the sample. The variance "within groups" [1] covers only the repeatability of the measurement. In contrast, if from each sample of the batch multiple test portions are taken, the variance "within groups" will cover the measurement, transformation, and subsampling. In this case, the variance "among groups" only covers the between-bottle heterogeneity. From the perspective of obtaining an unbiased estimate, this is the ideal situation.

For obvious reasons, it is not possible to obtain an exact estimate of the variance resulting from within-bottle heterogeneity. From a sample, multiple test portions must be drawn, which can obviously only be transformed once (Fig. 2).

This implies that the variance resulting from within-bottle heterogeneity will always be conservative, which is not really a problem. For the computation of the minimum sample intake [2], this implies in turn that the minimum sample intake will be overestimated, which is not a problem either. The amount of material stated will be greater than the critical amount, the amount actually needed to observe no effect of the sample intake on the repeatability of a measurement. However, the sample intake for the study to establish the minimum sample intake differs from the sample intake for a between-bottle study: in the latter case the optimal sample intake should be chosen to obtain a good repeatability, in the former, the minimum sample intake must be used. Doing so will increase influences of within-bottle heterogeneity and will thus reduce the influence of method repeatability on the estimated minimum sample intake.

Current practice versus uncertainty-based practice

In many cases, the well-known F-test [3-5] is still used to test for significance of a heterogeneity effect. Apart from the problems already stated in Part 1 [1], there is another big problem: this way of treating homogeneity data does not answer the question(s) raised in the introduction. In principle, it does not matter whether the heterogeneity observed is significant with respect to the repeatability of the test method. The repeatability of the test method is only of interest in the sense of the quality of what has been demonstrated. If the F-test does not indicate a significant result from the homogeneity test, but the repeatability of the test method used is poor, then it is still possible that the heterogeneity among samples (between-bottle homogeneity) is significant when compared to the uncertainty from characterisation.

As pointed out by Pauwels et al. [6] and by Van der Veen and Alink [7], a method used for assessing sampling and/or subsampling performance, including homogeneity tests, should have a good repeatability. The repeatability of the test method, in conjunction with the number of replicates on each sample, defines the resolution of the method.

Fig. 1 Lay-out of a between-bottle homogeneity study
A. M. H. van der Veen · T. Linsinger · J. Pauwels

The better the resolution, the smaller the effects that can be estimated.

Fig. 2 Lay-out of a within-bottle homogeneity study

Looking from the perspective of the "Guide to the expression of uncertainty in measurement" (GUM) [8], the variation of the bottle averages ($u_{c(bb)}$) is a combined uncertainty consisting of the between-bottle heterogeneity ($s_{bb}$) and the measurement variation ($s_{meas}$). The latter comprises analytical variation and within-bottle heterogeneity, which should be pooled for the estimation of between-bottle heterogeneity anyway. The relationship between $u_{c(bb)}$, $s_{bb}$ and $s_{meas}$ can be expressed as [6]

$$u_{c(bb)}^2 = s_{bb}^2 + s_{meas}^2 \qquad (1)$$

which implies that

$$s_{bb}^2 = u_{c(bb)}^2 - s_{meas}^2 \qquad (2)$$

Note that $u_{c(bb)}$, $s_{bb}$ and $s_{meas}$ were named $u_{exp}$, $u_{betw}$ and $u_{meas}$, respectively in [6]. $s_{meas}$ is the analytical variation divided by the square root of the number of replicates per bottle. By increasing the number of replicates per bottle a small $s_{meas}$ can be obtained even for methods with poor repeatability, thus allowing a good estimation of $s_{bb}$, the estimate of between-bottle variation sought for.

Equation (1) obviously cannot be used if the variation of the measurement is large compared to the heterogeneity, without looking at the repeatability standard deviation of the measurements, $s_{meas}$. In Part 1 [1], it has been demonstrated that the variance "among groups" is affected by the variance within groups. This means that for small values of $s_{bb}$ a problem of quantification arises. There are two principal choices to deal with this:

1. Acceptance of the value for $s_{bb}$, even if it is zero, because the samples are expected to be homogeneous (e.g. solutions) or known from previous experience to have negligible heterogeneity when prepared properly
2. Application of a more conservative estimation technique for this uncertainty source, based on $s_{bb}$ as observed and $s_{meas}$.

Both options comply with GUM. Apart from the expectations about the heterogeneity aspect, $s_{meas}$ should also fulfil the requirement of being smaller than the repeatability standard deviation of the measurements in the characterisation. If this requirement is not clearly fulfilled, then in any case a more conservative estimator than the value of $s_{bb}$ as observed from ANOVA is necessary. This is an effect of the existing correlation between both standard deviations. This topic will be covered in more detail in Part 4, about the certification process, as it also has consequences for the treatment of stability data.

For within-bottle homogeneity studies, a similar reasoning can be developed. The uncertainty from the experiment can be expressed as [6]

$$u_{c(wb)}^2 = s_{wb}^2 + s_{method}^2 \qquad (3)$$

which leads to

$$s_{wb}^2 = u_{c(wb)}^2 - s_{method}^2 \qquad (4)$$

with $u_{c(wb)}$ being the combined standard uncertainty of the experiment, $s_{wb}$ the within-bottle variation and $s_{method}$ the intrinsic variability of the method. These terms were named $u_{exp}$, $u_{inh}$ and $u_{meas}$, respectively in [6]. The second term of this expression differs from that of Eq. (1) due to the difference in experimental design. Usually, $s_{method}$ cannot be determined independently, as this would require a material of the same type with perfect within-unit homogeneity, which renders estimation of $s_{wb}$ impossible. In these cases, $u_{c(wb)}$ must be used to estimate the minimum sample intake. To diminish the influence of $s_{method}$ as much as possible, a sample intake should be chosen for which $s_{wb}$ is much larger than $s_{method}$. Examples of this approach for (trace) elements are solid sampling (SS) techniques such as solid sampling electrothermal atomic absorption spectrometry (SS-ETAAS) or solid sampling inductively coupled plasma-mass spectrometry (SS-ICP-MS). In any case, $s_{wb}$ is not part of the uncertainty of the certified reference material, as will be explained in Part 4. It is only needed to establish the minimum sample intake for which the stated uncertainty is valid.

Evaluation of a between-bottle homogeneity study

For a clay soil sample, 18 samples were taken out of a batch for a homogeneity study on barium. The results, expressed in mg/kg on a dry basis, are given in Table 1 [Van Son M, Van der Veen AMH, Verkuil D, unpublished data].

The measurements were carried out by ICP-MS on extracts obtained from aqua regia digestion according to NEN 6465. Using Excel, the following ANOVA table (one-way layout) can be computed (Table 2). The column "SS" provides the sums of squares, the column "df" the associated degrees of freedom, and the column "MS" the mean squares, which form the basis for the computation of variances as discussed in Part 1 [1]. The F-test indicates that the result of the homogeneity test is not significant ($F < F_{crit}$, the critical value of F for $\alpha = 5\%$). The P-value gives the level for which the observed F equals $F_{crit}$.

Table 1 Homogeneity study of barium in soil (results in mg/kg)

| Sample | Data #1 | Data #2 | Data #3 | Mean | s | n |
|---|---|---|---|---|---|---|
| # O1Hs | 323 | 301 | 310 | 311 | 11 | 3 |
| # 0201 | 340 | 334 | 316 | 330 | 12 | 3 |
| # 0383 | 320 | 321 | 309 | 317 | 7 | 3 |
| # 0442 | 315 | 338 | 321 | 325 | 12 | 3 |
| # 0557 | 326 | 338 | 325 | 330 | 7 | 3 |
| # 0666 | 325 | 302 | 304 | 310 | 13 | 3 |
| # 0791 | 324 | 331 | 317 | 324 | 7 | 3 |
| # 0918 | 310 | 310 | 331 | 317 | 12 | 3 |
| # 1026 | 336 | 321 | 328 | 328 | 8 | 3 |
| # 1133 | 310 | 328 | 312 | 317 | 10 | 3 |
| # 1249 | 314 | 314 | 302 | 310 | 7 | 3 |
| # 1464 | 329 | 300 | 299 | 309 | 17 | 3 |
| # 1581 | 320 | 329 | 311 | 320 | 9 | 3 |
| # 1607 | 322 | 312 | 311 | 315 | 6 | 3 |
| # 1799 | 332 | 317 | 299 | 316 | 17 | 3 |
| # 1877 | 313 | 294 | 293 | 300 | 11 | 3 |
| # 1996 | 324 | 314 | 335 | 324 | 10 | 3 |
| # 2000 | 321 | 342 | 316 | 327 | 14 | 3 |

Table 2 Analysis of variance (ANOVA) table for barium in soil

| Source of variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between groups | 3467 | 17 | 204 | 1.66 | 0.10 | 1.92 |
| Within groups | 4412 | 36 | 123 | | | |
| Total | 7880 | 53 | | | | |

The calculation of uncertainties is now very straightforward. The repeatability of the test method is just the square root of $MS_{within}$, equal to 11 mg/kg (= 3.5%). For the variance among groups, the following expression can be used

$$s_A^2 = s_{bb}^2 = \frac{MS_{among} - MS_{within}}{n} \qquad (5)$$

This equation can be used instead of the formulas from Part 1, as in this case the data matrix is complete (all groups have the same number of members, n = 3). The variance is 27 mg²/kg²; the standard deviation between bottles ($s_{bb}$) is 5.2 mg/kg (= 1.6%), which is the contribution of heterogeneity to the uncertainty of one bottle. The link between Eqs. (1) and (5) is as follows: $MS_{among}$ is equal to n times $u_{c(bb)}^2$ and $MS_{within}$ is equal to $s_{method}^2$ (see Part 1 of this paper). As $s_{meas} = s_{method}/\sqrt{n}$, the equivalence of Eqs. (2) and (5) follows. As can be seen, the computation of the grand mean is not needed for this uncertainty evaluation, although Excel will calculate this parameter internally.

Evaluation of a between-bottle homogeneity study in an alternative format

In the experimental set-up described above, method repeatability and between-bottle heterogeneity were estimated by analysing several units n times each. Frequently a different approach to obtain estimates for $s_{meas}$ and $s_{bb}$ is used: one unit is analysed several times to obtain an estimate for $s_{method}$ and several units are then analysed in one replicate each to obtain an estimate of $u_{c(bb)}$. For the estimation of the between-bottle variability by this approach, it is vital that the results from one unit and from the different units are obtained by the same technique using the same sample intake. As n = 1 in this case, $s_{meas} = s_{method}$ and $s_{bb}$ can be estimated according to Eq. (1).

For the certification of a mussel-tissue material [9], ten units were used for the homogeneity study on selenium. Five determinations on one unit were performed, whereas the other nine units were analysed once. All analyses were done by k0-NAA without sample pretreatment. The results of this study are given in Table 3. In this case, $s_{bb}$ amounts to 2.84%, which is the uncertainty contribution of homogeneity to the uncertainty of one bottle.

Table 3 Homogeneity study for selenium in mussel tissue

| | 5 replicates from one unit [mg/kg] | Results from 9 different units [mg/kg] |
|---|---|---|
| | 1.907 | 1.872 |
| | 1.917 | 1.874 |
| | 1.961 | 1.928 |
| | 1.901 | 1.833 |
| | 1.834 | 1.944 |
| | | 1.840 |
| | | 1.726 |
| | | 1.952 |
| | | 1.861 |
| Average | 1.904 | 1.870 |
| Standard deviation | 0.046 | 0.070 |
| Variation coefficient | 2.40% ($s_{meas}$) | 3.72% ($u_{c(bb)}$) |

Obviously, this method requires fewer measurements than performing the complete matrix of measurements for the ANOVA evaluation. This advantage is paid for with less significant results, as measurement repeatability is not reduced by replication.
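The two evaluations above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the inputs are the rounded summary values printed in Tables 2 and 3, and returning zero when the variance difference is negative is an assumption made here for convenience (the text instead calls for a more conservative estimator in that situation).

```python
import math

def s_bb_from_anova(ms_among, ms_within, n):
    """Eq. (5): between-bottle standard deviation from one-way ANOVA mean squares."""
    var_bb = (ms_among - ms_within) / n
    # Assumption of this sketch: a non-positive difference is reported as zero.
    return math.sqrt(var_bb) if var_bb > 0 else 0.0

def s_bb_from_single_results(u_c_bb, s_meas):
    """Eq. (2): remove the measurement contribution from the spread between units."""
    var_bb = u_c_bb ** 2 - s_meas ** 2
    return math.sqrt(var_bb) if var_bb > 0 else 0.0

# Barium in soil (Table 2): mean squares in mg^2/kg^2, n = 3 replicates per bottle
s_r = math.sqrt(123)                      # repeatability standard deviation, mg/kg
s_bb_ba = s_bb_from_anova(204, 123, 3)    # between-bottle standard deviation, mg/kg

# Selenium in mussel tissue (Table 3): variation coefficients in %, n = 1
s_bb_se = s_bb_from_single_results(3.72, 2.40)

print(f"barium:   s_r = {s_r:.1f} mg/kg, s_bb = {s_bb_ba:.1f} mg/kg")
print(f"selenium: s_bb = {s_bb_se:.2f} %")
```

With these inputs the sketch reproduces the values quoted in the text (about 11 and 5.2 mg/kg for barium, 2.84% for selenium).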

The effect is that $s_{meas}$ is more likely to be larger than $u_{c(bb)}$, leading to the problems already addressed about $s_{bb}$ values tending to zero. This format may therefore lead to a greater uncertainty for the certified reference material, as in many cases a more conservative value for the uncertainty due to between-bottle variation must be inserted rather than the value of $s_{bb}$ as obtained directly from ANOVA. On the other hand, this example also demonstrates how to conduct a homogeneity test in cases where repetition of measurements is impossible. In these cases, the results in the first column of Table 3 must be obtained from other sources ("Type B evaluation" [8]), such as the quality manual of the laboratory, validation data or some other source.

Conclusions

A general framework for the estimation of within- and between-bottle heterogeneity has been developed. The approach consists of correcting an estimation of heterogeneity, impaired by measurement variability, for measurement variability. Reliable estimates for between- and within-bottle heterogeneity can be obtained given that the measurement variability is small compared to the heterogeneity to be detected. If the requirement of low measurement variability compared to heterogeneity is not met, more conservative approaches should be employed.

The method as such works equally well on the fully nested ANOVA designs of experiments, and on other formats. The underlying theory and concepts are the same, but the implementation differs. This allows application of the method for homogeneity tests on test pieces in destructive testing as well, which has great benefits.

This work also shows that carrying out homogeneity tests cannot and must not be separated from other parts of the certification project (e.g. stability studies, characterisation measurements), as the accuracy of the measurements in the homogeneity study has important implications for the establishment of the combined standard uncertainty of the candidate reference material.

References

1. Van der Veen AMH, Pauwels J (2000) Accred Qual Assur 5:464-469
2. Pauwels J, Kurfürst U, Grobecker KH, Quevauviller P (1993) Fresenius J Anal Chem 345:478-481
3. ISO Guide 35:1989 (1989) Certification of reference materials - General and statistical principles, 2nd edn. International Organization for Standardization (ISO), Geneva, Switzerland
4. Schiller SB (1996) Statistical aspects of the certification of chemical batch SRMs. NIST Special Publication 260-125. NIST, Gaithersburg, USA
5. BCR Guidelines (1994) Standards, Measurement and Testing Programme, Brussels, Belgium
6. Pauwels J, Lamberty A, Schimmel H (1998) Accred Qual Assur 3:51-55
7. Van der Veen AMH, Alink A (1998) Accred Qual Assur 3:20-26
8. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1995) Guide to the expression of uncertainty in measurement, 1st edn. ISO, Geneva, Switzerland
9. Lamberty A, Muntau H (1999) The certification of the mass fractions of As, Cd, Cr, Cu, Hg, Mn, Pb, Se and Zn in mussel tissue Mytilus edulis. EUR 18840 EN
Accred Qual Assur (2001) 6:257-263
© Springer-Verlag 2001

Adriaan M.H. van der Veen
Thomas P.J. Linsinger
Andrée Lamberty
Jean Pauwels

Uncertainty calculations in the certification of reference materials
3. Stability study

Abstract To serve as a measurement standard, a (certified) reference material must be stable. For this purpose, the material should undergo stability testing after it has been prepared. This paper looks at the statistical aspects of stability testing. Essentially, these studies can be described with analysis of variance statistics, including variant regression analysis. The latter is used in practice for both trend analysis and for the development of expressions for extrapolations. Extrapolation of stability data is briefly touched upon, as far as the combined standard uncertainty of the reference material is concerned. There are different options to validate the extrapolations made from initial stability studies, and some of them might influence the uncertainty of the reference material and/or the shelf-life. The latter is the more commonly observed consequence of what is called 'stability monitoring'.

Keywords Uncertainty · Reference materials · Stability testing · Analysis of variance · Regression analysis

A.M.H. van der Veen (✉)
Nederlands Meetinstituut, Schoemakerstraat 97, 2600 AR Delft, The Netherlands
e-mail: avdveen@nmi.nl
Tel.: +31-15-2691733
Fax: +31-15-2612971

T.P.J. Linsinger · A. Lamberty · J. Pauwels
European Commission, Joint Research Centre, Institute for Reference Materials and Measurements, Retieseweg, 2440 Geel, Belgium

Introduction

Stability testing is, together with homogeneity testing, crucial in the process of certifying reference materials. As the lifetime of a typical certified reference material (CRM) is several years, the property value should be constant during the lifetime of the CRM. A further requirement is that during transport, under conditions to be specified, the stability of the material should be guaranteed by the producer. These aspects should be considered carefully in the design stage of the certification project, and the stability of reference materials is an aspect that should be demonstrable.

In this paper, the basic statistics for stability studies will be looked at, and it will be shown how these uncertainty components can be determined in order to develop a certification project in compliance with the "Guide to the expression of uncertainty in measurement" (GUM) [1]. The paper carries on from Parts 1 and 2 [2, 3]. The statistics developed in this paper mainly concern cases where no statistically significant or otherwise relevant trend in the property value has been observed. In current practice, it is not acceptable to have a time-dependent property value, apart from radioisotope CRMs where a well-defined decay mechanism is present. This decay will lead to a trend, which should be included in the certification.

It should be noted however that the data should be assessed with trend analysis. Trend analysis, as will be shown, shares its basis with analysis of variance (ANOVA). A brief introduction to the assessment of trends in data is given, although it is considered that it is a subject that needs - depending on the application - more complete coverage. As the property value of a candidate CRM should be independent of time, it is sufficient to treat trend analysis in a simplified way.

Stability testing is often more complex than homogeneity testing.

The reason for this is that at least one factor potentially influencing the measurement uncertainty has been added (time), but in some cases more (time, operator, calibration, etc.). It will be shown that these cases can be described with the same statistical theory, leading to different models for different cases.

Types of (in)stability

There are two types of (in)stability to be considered [4]:

1. Long-term stability of the material (e.g. shelf-life)
2. Short-term stability (e.g. stability of the material under "transport conditions")

The first kind of stability study is well known, and usually implemented in certification projects to a certain extent. The second kind of stability study is less common, but is possibly even more important than the first type. The behaviour of the sample during transport under the conditions specified may differ from that under the storage conditions of the producer/vendor due to external influences (e.g. temperature, daylight, etc.).

Often, the effect of short-term stability can be neglected. The producer will usually specify proper transport conditions, effectively reducing this uncertainty component. Usually, this specification of conditions leads to the case that the uncertainty due to short-term stability does not exceed that of long-term stability. In such cases, it is acceptable to set the (additional) effect of short-term stability, as obtained from stability testing, to zero, as it has already been accounted for in the long-term stability study. Under certain conditions, it is equally important to know what might happen to the sample if proper transport conditions have not been maintained. In many cases, a simple verification of the CRM prior to first use might be sufficient, whereas in other cases it is evident that the CRM has become useless. This allows the producer to give better advice and, from the perspective of the user, supply a better product.

Monitoring should be envisaged during the lifetime of the CRM. A fundamental problem of stability studies is that they only account for the past, not necessarily for the present or future. Some kinds of degradation or other instability problems proceed very slowly and very gradually, but in many cases some abrupt change in properties takes place at some time, practically ending the lifetime of the CRM. As these mechanisms are highly unpredictable, monitoring of the stability is a necessity. As extrapolation of stability data is an absolute necessity to improve the marketability of the CRM, it is reasonable to make assumptions based on the 'past' (e.g. stability study) and verify them through stability monitoring.

Trend analysis

The key point in a stability study is to monitor the property value as a function of time. Often, such a study is conducted at different temperatures to determine the optimal conditions for storage and/or transport. The data thus obtained should first be investigated for a trend.

Before doing so, an assumption must be made about the kind of trend that might be observed. From an empirical point of view, the change of the property value (if any) is expected to be small anyway, so a simple linear model as a first-order approach may do. However, in some cases some physical, chemical or biological phenomenon might dictate instability, and in these cases such mechanisms may suggest the need for another approach. In this paper, the empirical approach using a linear model will be developed, along a path that enables the use of other models.

The basic model for a simple linear regression can be expressed as [5]

$$Y = \beta_0 + \beta_1 X + \varepsilon \qquad (1)$$

where $\beta_0$ and $\beta_1$ are the regression coefficients, and $\varepsilon$ denotes the random error component. The random error component, $\varepsilon$, may be composed of only random error, but it may also contain one or more systematic factors. The theory applicable for "decomposing" $\varepsilon$ is given in [2]. In the case of stability studies, X denotes time and Y the property value of the candidate CRM. For a stable reference material, $\beta_1$ is expected to be zero. In those cases, the model can be simplified to

$$Y = \beta_0 + \varepsilon \qquad (2)$$

which is essentially the model underlying ANOVA. The development of expressions for estimates for the parameters $\beta_0$ and $\beta_1$, as well as the computation of variances of different kinds, follows the same paths as the development of the expressions for ANOVA, as shown in Part 1 [2].

Given a set of n pairwise observations of Y versus X, for each $Y_i$ the following expression can be developed

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \qquad (3)$$

Often, more than one value of $Y_i$ will be available for each $X_i$, due to repetition of measurement, the use of more than one bottle per point in time, etc. These aspects should be included in a real-life model of a particular stability study. Based on the theory of this paper and Part 1 [2], these extensions can be developed quite straightforwardly.

The sum of squared deviations can now be expressed as

$$S = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)^2 \qquad (4)$$

Frequently, the sum of squared deviations is expressed in terms of the $\chi^2$-statistic. The only difference between $\chi^2$ and S is a scaling factor, which equals $1/\sigma^2$.
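Eq. (4) can be evaluated directly for any trial pair of coefficients, which makes the role of S as the quantity to be minimised concrete. The following is a minimal sketch: the function name and the data (time in months, property value in mg/kg) are hypothetical and are not taken from the paper.

```python
def sum_squared_deviations(x, y, beta0, beta1):
    """Eq. (4): S = sum over i of (Y_i - beta0 - beta1 * X_i)^2."""
    return sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))

# Hypothetical stability data: time in months, property value in mg/kg
months = [0, 6, 12, 18, 24]
values = [10.0, 10.2, 9.9, 10.1, 9.8]

# No-trend model of Eq. (2), with beta0 set to the mean of the values
s_flat = sum_squared_deviations(months, values, 10.0, 0.0)

# A trial linear model of Eq. (1) with a small negative slope
s_trend = sum_squared_deviations(months, values, 10.1, -0.01)

print(s_flat, s_trend)
```

For these made-up data the trial linear model gives a smaller S than the no-trend model; whether such a reduction is statistically significant is a separate question that needs a t- or F-test.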

Equation (4) defines the regression problem: the objective is to minimise S (as function of $\beta_0$ and $\beta_1$). This can be accomplished by differentiating (4) with respect to $\beta_0$ and $\beta_1$ and setting the partial derivatives to zero. This is treated in detail in many statistical textbooks.

The regression parameters can be computed from the following expressions. For the estimator for the slope, solving Eq. (4), the following expression can be derived

$$b_1 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^2} = \frac{\sum_{i=1}^{n} X_i Y_i - \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{\sum_{i=1}^{n} X_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right)^2} \qquad (5)$$

whereby it should be noted that the first expression is more suitable for numerical work in computer programs, whereas the second is more suitable for pocket calculators, as the computations are less tedious. The first expression for the slope is less sensitive to round-off errors, so in most cases more accurate.

The estimate for the intercept can be computed from

$$b_0 = \bar{Y} - b_1 \bar{X} \qquad (6)$$

Using the error propagation formula [1], the standard deviations in $b_1$ and $b_0$ can be computed. The estimated standard deviation of $b_1$ is given by

$$s(b_1) = \frac{s}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2}} \qquad (7)$$

whereby

$$s^2 = \frac{\sum_{i=1}^{n}(Y_i - b_0 - b_1 X_i)^2}{n-2} \qquad (8)$$

The estimated variance of $b_0$ is given by

$$V(b_0) = V(\bar{Y} - b_1\bar{X}) = s^2\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^{n}(X_i - \bar{X})^2}\right] = \frac{s^2 \sum_{i=1}^{n} X_i^2}{n \sum_{i=1}^{n}(X_i - \bar{X})^2} \qquad (9)$$

whereby it should be noted that $b_1$ and $\bar{Y}$ are uncorrelated.

Based on the standard deviation of $b_1$, a judgement can be made. Using Eq. (7), and an appropriate t-factor (number of degrees of freedom equals n-2), $b_1$ can be tested for significance. Although this method is quite uncomplicated, it requires the computation of $s(b_1)$, a parameter that is often not calculated by software. Most software does however compute an F-table, which can also be used for evaluating the significance of regression (Table 1).

The mean square due to regression is often denoted as $SS(b_1|b_0)$, to be read as "sum of squares for $b_1$ after allowance has been made for $b_0$". The mean square about regression ($s^2$) is an estimate for the property denoted by $\sigma^2_{Y \cdot X}$ and called the variance about regression.

The ratio $MS_{reg} : s^2$ can be tested for significance using the F-tables. Table 1 provides the necessary information with respect to the degrees of freedom. The advantage of using the F-table instead of the method using the t-test is twofold:

1. The F-table is generated by most software systems by default.
2. The F-table can readily be extended to other regression models, which makes it more widely applicable.

Irrespective of what kind of test is used, it should be noted that the outcome is only meaningful if the repeatability standard deviation of measurement, possibly in conjunction with the between-bottle homogeneity, is sufficiently small. It can be demonstrated that if the repeatability standard deviation is comparable to that of the homogeneity study and the characterisation of the material (e.g. the determination of the property value), this requirement is met.

Experimental layouts

There are two basic experimental layouts for stability studies:

1. Classical stability study
2. Isochronous stability study.

Table 1 Analysis of variance table for linear regression

| Source of variation | Degrees of freedom | Sum of squares (SS) | Mean square (MS) |
|---|---|---|---|
| Due to regression | 1 | $\sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2$ | $MS_{reg}$ |
| About regression (residual) | n-2 | $\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2$ | $s^2 = SS/(n-2)$ |
| Total, corrected for mean | n-1 | $\sum_{i=1}^{n}(Y_i - \bar{Y})^2$ | |
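The trend test built on the slope estimate and the regression ANOVA table can be sketched as follows. This is a minimal illustration using only the Python standard library: the function name and the data are hypothetical, and the comparison against a critical value is left to t- or F-tables with n-2 degrees of freedom.

```python
import math

def linear_trend(x, y):
    """Slope, intercept and test statistics for a simple linear trend."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx                                  # slope, first form of Eq. (5)
    b0 = y_mean - b1 * x_mean                       # intercept, Eq. (6)
    ss_res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = ss_res / (n - 2)                           # variance about regression, Eq. (8)
    s_b1 = math.sqrt(s2 / sxx)                      # standard deviation of slope, Eq. (7)
    ss_reg = sum((b0 + b1 * xi - y_mean) ** 2 for xi in x)  # 1 degree of freedom
    return {"b0": b0, "b1": b1, "s_b1": s_b1,
            "t": b1 / s_b1, "F": ss_reg / s2, "df": n - 2}

# Hypothetical stability data: time in months, property value in mg/kg
result = linear_trend([0, 6, 12, 18, 24], [10.0, 10.2, 9.9, 10.1, 9.8])
print(result)
```

For a straight-line model with one regressor the two routes are equivalent, since F equals the square of t; the F route is the one that carries over to other regression models.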

The isochronous stability study, introduced by Lamberty et al. [6], has the great advantage that all measurements can be carried out in one run, with one calibration. This reduces the scattering on each point in time, and thus improves the 'resolution' of the stability study. As a consequence, an isochronous stability study will lead to a smaller uncertainty than a classical one.

A classical stability study measures a sample as a function of time. In this case, the work is carried out under (within-laboratory) reproducibility conditions, which leads to a higher uncertainty, as the instability of the measurement system is now also included. When certifying a single package, such as a gas cylinder, it is impossible to use the isochronous layout. The isochronous layout is specifically designed for batch certifications.

Furthermore, monitoring typically takes place using the classical design. The problem here is that the isochronous design only provides data at the end of the stability study, whereas for monitoring it is essential that information becomes available during the lifetime of the CRM. This has no further consequences for the uncertainty of the CRM, in contrast to the other two stability studies, as monitoring only involves the demonstration that the uncertainty on the certificate is still valid. This should obviously be done with care, so that not too much uncertainty is added during the verification of the CRM, but there is no necessity to account for these results in the combined standard uncertainty of the CRM. It should be noted that monitoring data could be used to re-evaluate the uncertainty due to instability, and thus affect the uncertainty of the CRM. This topic is beyond the scope of this paper.

Another way of implementing the requirement of monitoring the CRM is to use a kind of semi-continuous stability testing. In essence, the first stability study of, for example, 36 months is succeeded by a second one, with some months of overlap. The principle is illustrated in Fig. 1. Measuring "reference samples" (i.e. samples that have been stored at reference temperature) at each monitoring point could be seen as a special case of this semi-continuous study with each study consisting of one point in time.

Fig. 1 Semi-continuous stability testing

The reason that this kind of stability testing is not continuous has to do with the use of isochronous measurements: as these are done in a single run after the period of stability testing, it is necessary to make a cut in the stability testing. After such a cut, the uncertainty of the CRM can be reviewed as well, as these new stability data can be used as a renewed estimation of the uncertainty due to instability.

As with homogeneity testing, good repeatability of the test measurement method is an important prerequisite for stability testing as well. In Part 1 [2], it has been argued that the quality of the estimators for group and subgroup variances depends on the within-group variances. For a stability study, this means that the estimators for variance due to instability improve when the repeatability variance decreases. The within-bottle variance, as obtained from homogeneity testing [3], forms the basis for decreasing the repeatability variance to a level specific for the method. The number of repeated measurements (replicates) is the other variable that can be influenced to decrease the repeatability variance.

Uncertainty modelling

A stability study may include the following uncertainty components:

- Repeatability of measurement
- Instability of the material
- Instability of the measurement system (in the classical design)
- Between-bottle homogeneity (in batch certifications).

From this list, it can be seen that whenever possible, the isochronous design should be preferred over the classical one, as it reduces the number of components to look at. In a typical isochronous stability study, only three components of uncertainty are left, which can be separated through a fully nested two-way analysis of variance (see Part 1 [2] for details). The uncertainty for a single measurement in such an experiment can be expressed as

$$u^2(Y_{ijk}) = s_{stab}^2 + s_{bb}^2 + s_r^2 \qquad (10)$$

where $s_{stab}$ is the standard deviation due to instability (see footnote 1), $s_{bb}$ denotes the between-bottle standard deviation, and $s_r$ the repeatability standard deviation. As in the case of the homogeneity study, the quality of the estimator $s_{stab}$ depends on $s_{bb}$ (and $s_r$). Thus, the between-bottle homogeneity affects the quality of the estimator for instability! This is inevitable, as it is a property of analysis of variance, as discussed in Parts 1 and 2 [2, 3].

1 The subscript "stab" is used either to denote "lts" (long-term stability study) or to denote "sts" (short-term stability study).

It should be noted that the model is only valid if the homogeneity and stability of the material are indepen-
Uncertainty calculations in the certification of reference materials. 3. Stability study 103

dent. This might seem obvious, but it is not. If a material ty studies. First, it should be noted that stability monitor-
shows considerable between-bottle heterogeneity, it can ing does not affect the uncertainty statement of the refer-
also be expected that the stability of the material differ ence material on the certificate, UCRM. and it is unneces-
from bottle to bottle, as the stability of the material will sary as will be demonstrated. The uncertainty from mon-
depend (among others) on the composition of the materi- itoring can be expressed as
al. However, as the preparation of a reference material (13)
involves the reduction of heterogeneity and the improve-
ment of the stability, it is for most reference materials whereby uCRM denotes the combined standard uncertain-
reasonable to assume independence between effects from ty of the reference material, and umeas the uncertainty
heterogeneity and instability. from measurement, including calibration. Ideally, UCRM is
The model also raises another question: is it possible considerably greater than U meaS ' but it should be consid-
to estimate shh from a stability study? The answer is that ered that this is not always possible. Furthermore, the
statistically speakingit is possible. Especially in those measurements should be carried out in such a way, that
cases where the effect of (in)stability is expected to be their validity must not be demonstrated from using the
small anyway, it is certainly an option. A two-way fully CRM. One cannot check two things at the same time in
nested ANOVA will do the job and provide the three one experiment. The validity of the CRM is to be recon-
standard deviations sf' Sbb' and Sstab. As stability studies firmed, which can only be valid if the measurement is
are often carried out at different temperatures, it is to be demonstrably reliable.
recommended to pick the value at one of the lowest tem- If these experimental conditions are fulfilled, both the
peratures. property value and its expanded uncertainty are recon-
Furthermore, it can be noted that if S'ta" is zero or suf- firmed. There is, under these conditions, no need to in-
ficiently close to zero, it is possible to scatter the bottles crease U CRM' as the uncertainty from measurement is
along the time axis, and to consider the set of data as a something which must be accounted for separately. This
homogeneity study. For example, if at5 different points is true both for monitoring as well as for the normal use
in time 3 bottles have been measured, then, provided that of the CRM. It should however be noted that for the sake
S'tah - 0, the experiment can be evaluated as a between- of the validity of the monitoring measurement, ume£lS
bottle homogeneity study with 15 bottles. If there is should be as small as possible, and certainly not exceed-
some instability left, that it, SHa" is not exactly zero, this ing umeas from a typical user of the CRM, who will use a
will be accounted for in the estimate obtained for s"" similar approach for verifying her measurements.
when evaluating the data as a homogeneity study. An alternative approach is to consider the point ob-
In the classical design, the expression for the uncer- tained in monitoring as just the next point in the stability
tainty reads as study, and from the complete set of data, a new estimate
u2(YUk) =S~tah + stir + sF,,, + s; (II) for Sits can obtained. If necessary, the uncertainty of the
CRM can be reviewed, but usually the evaluation will
whereby one term has been added, the variance due to only reconfirm the value of Sits already obtained and just
lack of repeatability, sio/. This term represents the stabil- extend the shelf-life (i.e. the time for which the certifi-
ity of the measurement system. The measurements in a cate is considered to be valid).
classical stability study take place under (within-labora-
tory) reproducibility conditions. The other terms are
identical to the isochronous case. The problem with the Examples
classical stability study is that the separation between
SHah and sior is not possible; as a result, the model de- An example of the results of an isochronous stability
scribing a typical analysis of variance layout (two-way, study is shown in Table 2, which lists the results of a 12-
fully nested design) read as month isochronous measurement for the determination
u2 (YUk) = s.~tah' + sF,,, + s; (12) of total glucosinolate in rapeseed, BCR 190R. Two units
were analysed in triplicate for each time.
whereby S'tah' now denotes the uncertainty component A standard deviation between bottles of 0.28 Ilmol/kg
due to instability of the measurement system and the ma-
(l.l %) was calculated, which corresponds well with the
terial. This is the case for both the short-term stability
homogeneity study, in which a method repeatability of
study as well as for the long-term stability study.
3.7 % and a between unit variation of 1.4% was estimat-
ed [3]. Standard deviation between times was estimated
0.31 Ilmollkg (1.4%). The detected instability was refut-
Uncertainty evaluation of stability monitoring ed by subsequent stability studies. Results of a classical
stability study are shown in Table 3 [7].
The uncertainty evaluation of stability monitoring is Performing a one-way ANOVA gives a standard devi-
quite different from the long-term and short-term stabili- ation within groups of 0.063 Ilg/kg (7.9%) and a standard
104 A. M. H. van der Veen et al.

Table 2 Results of stability tests for total glucosinolate in rapeseed, BCR 190R. Concentrations are given in µmol/kg

            t=0 months   t=6 months   t=9 months   t=12 months
Bottle 1    21.84        23.26        23.25        22.25
            21.74        21.62        22.39        22.22
            22.45        22.55        22.51        22.22
Bottle 2    22.32        23.42        22.54        23.89
            21.89        23.16        22.61        22.78
            21.37        22.75        23.73        22.84

Table 3 Results for a stability study for aflatoxin M1 in milk powder in µg/kg [7]

           t=1 month   t=2 months   t=4 months   t=6 months   t=8 months   t=10 months   t=12 months
Value 1    0.72        0.63         0.83         0.85         0.89         0.73          0.74
Value 2    0.79        0.72         0.87         0.90         0.91         0.92          0.80

deviation between groups of 0.064 µg/kg (8.0%). This result is obviously strongly influenced by measurement reproducibility. Looking at the results, the aflatoxin content seems to decrease initially, rise afterwards and decrease again. Such behaviour is very unlikely, but would nevertheless be included in an estimation of uncertainty of stability.

Discussion

Evaluating stability studies using ANOVA seems to neglect the information on the relative position of the measurements in time. As is shown in the Annex to this paper, the statistics from ANOVA and those of regression analysis are closely related. For obtaining the estimate of $s_{\mathrm{lts}}$, it seems that a stability study with measurements after 0, 2, 4 and 6 months contains the same information as one with measurements after 0, 8, 16 and 24 months. When using the appropriate expressions for extrapolating the data and for the evaluation of measurement uncertainty, it will become clear that the 24-month stability study will be different from that of 6 months, as would be expected. The expressions for the uncertainty must account for the distance from the centre of gravity of the data.

In many cases, $s_{\mathrm{stab}}^2$ will be smaller than the sum of the other contributions to $u^2(y_{ijk})$ in Eqs. (1) and (3), which makes the estimation of $s_{\mathrm{stab}}$ impossible. This problem was already addressed in Part 2 for the homogeneity study [3]. The same options for treating these cases exist for the stability study:

1) Accept the low value and conclude that uncertainty of stability is negligible compared to the other uncertainty contributions.
2) Choose a more conservative approach, for example one of those outlined in [10].

If profound knowledge of the material and the production process is required to adopt the first possibility for the homogeneity case, this is even more true for stability testing. Homogeneity of a material can be assumed constant over time. Neglecting a small inhomogeneity will therefore result in only a slight underestimation of uncertainty. In contrast, instability worsens with time. Degradation between the monitoring measurements may therefore result in unrealistic uncertainty statements if no allowance for possible degradation is made.

All these points emphasise the importance of profound knowledge of the material and possible degradation pathways. Being able to predict degradation allows one to counteract it actively, which is to be preferred over any statistical evaluation of the facts. For those cases in which degradation cannot be prevented, such knowledge also allows a more reliable estimation of $u_{\mathrm{lts}}$ than statistical evaluation a posteriori.

Finally, it should be noted that the estimates $s_{\mathrm{lts}}$ and, to a lesser extent, $s_{\mathrm{sts}}$ form only the basis for the values of these uncertainty components in the expression of the measurement uncertainty of the property values. One of the aspects not covered here is the development of a recipe for a shelf-life, in conjunction with developing an appropriate estimate for $u_{\mathrm{lts}}$ at the shelf-life.

Conclusions

A framework for the estimation of uncertainty of stability from ANOVA has been developed. The approach separates between-bottle variation from measurement repeatability and variation in time. For isochronous measurements, this variation in time represents variation due to instability. For classical stability studies, this variation is confounded with the intralaboratory reproducibility.
Uncertainty calculations in the certification of reference materials. 3. Stability study 105

Estimates of uncertainty of stability are therefore smaller when using isochronous schemes.

It has been shown that estimates for between-unit variation can be obtained from stability studies. These can be used to back up original homogeneity studies or may even serve as the sole homogeneity study.

The method requires that variation due to stability is not negligible compared to measurement and between-unit variation. If this requirement is not met, a more conservative approach should be employed. However, the decision about this should be made based on a profound knowledge of the material.
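As an illustration of the variance-component calculations discussed in the Examples section, the following Python sketch re-evaluates the data of Tables 2 and 3. It is not the authors' own code: the function names `nested_anova` and `one_way_anova` are hypothetical, Table 2 is assumed to be laid out as bottles nested within time points with three replicates each, the designs are assumed to be balanced, and negative variance estimates are simply set to zero. Under these assumptions the results should come out close to the standard deviations quoted in the text.

```python
from math import sqrt
from statistics import mean

# Table 2: results[time in months] -> one list of replicates per bottle.
# Assumption: each table column is a time point, each block of three rows a bottle.
table2 = {
    0:  [[21.84, 21.74, 22.45], [22.32, 21.89, 21.37]],
    6:  [[23.26, 21.62, 22.55], [23.42, 23.16, 22.75]],
    9:  [[23.25, 22.39, 22.51], [22.54, 22.61, 23.73]],
    12: [[22.25, 22.22, 22.22], [23.89, 22.78, 22.84]],
}

# Table 3: duplicate results for each time point (classical design).
table3 = [[0.72, 0.79], [0.63, 0.72], [0.83, 0.87], [0.85, 0.90],
          [0.89, 0.91], [0.73, 0.92], [0.74, 0.80]]


def nested_anova(results):
    """Return (s_r, s_bb, s_stab) for a balanced design, bottles nested in times."""
    times = list(results.values())
    a, b, n = len(times), len(times[0]), len(times[0][0])
    bottle_means = [[mean(reps) for reps in bottles] for bottles in times]
    time_means = [mean(ms) for ms in bottle_means]
    grand = mean(time_means)
    # Mean squares for time, bottle-within-time and replicate-within-bottle.
    ms_time = b * n * sum((t - grand) ** 2 for t in time_means) / (a - 1)
    ms_bottle = n * sum(
        (m - t) ** 2 for ms, t in zip(bottle_means, time_means) for m in ms
    ) / (a * (b - 1))
    ms_rep = sum(
        (y - m) ** 2
        for bottles, ms in zip(times, bottle_means)
        for reps, m in zip(bottles, ms)
        for y in reps
    ) / (a * b * (n - 1))
    # Variance components from the expected mean squares; negative values -> 0.
    s_r = sqrt(ms_rep)
    s_bb = sqrt(max(0.0, (ms_bottle - ms_rep) / n))
    s_stab = sqrt(max(0.0, (ms_time - ms_bottle) / (b * n)))
    return s_r, s_bb, s_stab


def one_way_anova(groups):
    """Return (s_within, s_between) for equally sized groups."""
    n = len(groups[0])
    group_means = [mean(g) for g in groups]
    grand = mean(group_means)
    ms_between = n * sum((m - grand) ** 2 for m in group_means) / (len(groups) - 1)
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    ) / (len(groups) * (n - 1))
    return sqrt(ms_within), sqrt(max(0.0, (ms_between - ms_within) / n))


s_r, s_bb, s_stab = nested_anova(table2)
s_within, s_between = one_way_anova(table3)
print(f"Table 2: s_r={s_r:.3f}, s_bb={s_bb:.3f}, s_stab={s_stab:.3f}")
print(f"Table 3: s_within={s_within:.3f}, s_between={s_between:.3f}")
```

Setting negative variance estimates to zero is only one possible convention; the more conservative alternatives mentioned in the Discussion would replace that step.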

References

1. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1995) Guide to the expression of uncertainty in measurement, 1st edn. ISO, Geneva, Switzerland
2. Van der Veen AMH, Pauwels J (2000) Accred Qual Assur 5:464-469
3. Van der Veen AMH, Linsinger TPJ, Pauwels J (2000) Accred Qual Assur 6:26-30
4. Van der Veen AMH (2000) "Determination of the certified value of a reference material appreciating the uncertainty statements obtained in the collaborative study", presented at AMCTM 2000 conference, Monte de Caparica, May 2000
5. Draper NR, Smith H (1981) "Applied regression analysis", 2nd edn. Wiley, New York, chapter 1
6. Lamberty A, Schimmel H, Pauwels J (1997) Fresenius J Anal Chem 360:359-361
7. Van Egmond HP, Wagstaffe PJ (1992) "The certification of aflatoxin M1 in four milk powder samples. CRM No's 282, 283, 284 and 285", European Commission, EUR 10412
Accred Qual Assur (2000) 5: 231-237
© Springer-Verlag 2000

Mirella Buzoianu

Some aspects of the evaluation of measurement uncertainty using reference materials

Abstract In practice there are three aspects that need to be considered in order to achieve the required traceability according to its definition: the 'stated reference', the 'unbroken chain of calibrations' and the 'stated uncertainty'. For a certain chemical result, each of these aspects highly depends on the measurement uncertainty, both on its magnitude and how it was estimated. Therefore, the paper describes the experience of the Romanian National Institute of Metrology in estimating measurement uncertainty during the certification of reference materials (RMs), in metrological activities (calibration, pattern approval, periodical verification, etc.), as well as during the analytical measurement process. Practical examples of estimation of measurement uncertainty using RMs or certified reference materials are discussed for their applicability in spectrophotometric and turbidimetric analysis. Use of the analysis of variance to obtain some additional information on the components of measurement uncertainty and to identify the magnitude of individual random effects is described.

Key words Measurement uncertainty · Reference materials · Turbidimetry

Invited paper presented at the 2nd Central European Conference on Reference Materials (CERM.2), 9-10 September 1999, Prague, Czech Republic

M. Buzoianu
National Institute of Metrology, Reference Materials Group, Sos. Vitan-Barzesti No. 11, 75669 Bucharest, Romania
e-mail: office@inm.ro
Tel.: +401-334-5060
Fax: +401-334-5345

Introduction

In Romania, increased attention is being paid to the accuracy of all measurements performed in trade, environmental monitoring, public health, research and industry. According to the Law of Metrology, issued in 1992 and amended in 1999, all legal measurements and instruments should be uniform, comparable and accurate, and the instruments or measurement standards should be traceable to national or international standards. Note that the experience and main tasks of the Reference Materials Group of the Romanian National Institute of Metrology (INM) related to metrological assurance of all measurements performed in chemical laboratories are described in Refs. [1-3]. It is widely accepted that the strength of the traceability chain (starting from the measurand in the sample being analysed in a chemical laboratory, up to a unit of the SI, or to the value of a recognized measurement scale) depends on the measurement uncertainty. Also, the need to estimate and report the measurement uncertainty properly during the constantly increasing accreditation activities results in the utmost attention to the application of this concept in routine-level laboratories.

Within this framework, the paper describes the experience of the Reference Materials Group of INM in the estimation of measurement uncertainty both in metrology-related activities and in routine chemical applications. Considering some representative spectrophotometric and turbidimetric analyses, the paper attempts to discuss practical problems in the estimation of measurement uncertainty using the reference material (RM) and certified reference material (CRM) approach.

A brief review of the experience of INM in estimating measurement uncertainty

In order to estimate the measurement uncertainty associated with a measurement result reported in a metrological document (certificate of calibration, certificate of an RM, report of a highly accurate measurement, etc.), ISO GUM [4], adopted nationally as the Romanian standard SR 13434:1999, should be followed according to a fully defined measurement process.

Since the purpose of a measurement is to determine the value of a measurand (the value of a particular quantity to be measured), the measurement begins with an appropriate specification of the measurand and of the method of measurement or measurement procedure. The required specification (definition) of the measurand is dictated by the necessary accuracy of measurement, and it should be defined with sufficient completeness so that for all practical purposes associated with the measurement its value is unique. Further, an individually designed measurement method or procedure, taking into consideration the requirements of the data, is needed. Then, a measurement system, involving the coordinated interaction of a number of influencing factors, is chosen. This system is calibrated against physical and chemical standards traceable to national standards maintained by INM. Physical calibrations are needed for the measurement equipment itself and for ancillary measurements such as time, temperature, volume, mass, etc. Several kinds of chemical calibration, involving RMs and CRMs, are necessary to calibrate the system providing results expressed in concentration units.

Briefly presented in Fig. 1 are the necessary steps, recommended in ISO GUM for measurement uncertainty estimation, referring to the expression of the mathematical relationship between the measurand and the input quantities, the identification of uncertainty sources, the quantification of the uncertainty components and the calculation of the expanded uncertainty.

Fig. 1 Measurement uncertainty estimation process (flowchart; the legible steps are: express mathematically the relationship between the measurand and the input quantities; determine the value of the input quantities; evaluate the standard uncertainty of each input estimate; report the measurement result X ± U)

According to its definition, the combined standard uncertainty in calibration includes the main components described in Table 1 (components related to the knowledge of the true value of the measurement standard, components introduced when using the standard or the CRMs for calibrating another instrument and the components introduced by the instrument being calibrated). Note that an example of how to evaluate the calibration uncertainty of an analytical spectro(photo)meter is discussed in Ref. [3].

Table 1 Main sources of uncertainty in calibrating spectro(photo)meters

Main sources of uncertainty | Uncertainty | Some necessary characteristics | Method of evaluation
Due to CRMs used in calibration | u_CRM | Homogeneity, stability, certified value of concentration | As indicated in ISO Guide 35:1989, adopted as SR 13252-2:1995
Due to the use of the measurement standard | | Method of certification, calibration, conditions, matrix mismatch | As indicated in ISO Guide 33:1989, adopted as SR 13252-4:1995
Due to the spectrophotometer | | Photometric linearity, photometric accuracy, wavelength accuracy, stray light | From technical specification of the producer, from calibration certificate or within the calibration process

As a rule, the certification of RMs in the Reference Materials Group of INM follows the procedure described in ISO Guide 35:1989, adopted as the national standard SR 13252-2:1995. Mostly, certification based on interlaboratory testing and on a metrological approach is used for metrological purposes. Several types of CRMs developed and certified by INM are presented in Ref. [1], and examples of certification using the metrological approach are described in Ref. [2].

The 'metrological approach' relies on assessing possible sources of error, then, by means of subsidiary experiments and/or theoretical analysis, determining the correction for each source and building an uncertainty budget by evaluating the uncertainty on the correction. Sometimes this approach is a tedious and time-consuming operation, although only a few components of the uncertainty budget, associated with a few corrections, dominate. For routine measurements this approach can be considerably simplified by calibration of the measurement system with traceable measurement standards, since calibration considerably reduces the number of uncertainty components that have to be evaluated.

Also, for some routine measurements, both SR 13434:1999 and the EURACHEM Guide [5] are used in accredited chemical laboratories, and RMs, or some data and results from previous work, or even the judgment of the experienced analyst are chosen to do this evaluation. Some of the problems experienced by INM in the evaluation of measurement uncertainty of spectrometric results using the genealogical approach are discussed in Ref. [6].

The evaluation of measurement uncertainty using RMs

By definition [7], an RM is a material or a substance one or more of whose property values are sufficiently homogeneous and well established to be used for the calibration of an apparatus, the assessment of a measurement or for assigning values to materials.

Further, an RM, accompanied by a certificate, one or more of whose property values are certified by a procedure which establishes its traceability to an accurate realization of the unit in which the property values are expressed, and for which each certified value is accompanied by an uncertainty at a stated level of confidence, is a CRM.

Thus, according to the above definitions and if they are properly used, both RMs and CRMs may contribute to the evaluation of the performance of a measurement system and to the estimation of measurement uncertainty. Three general cases of how CRMs may be used to evaluate measurement uncertainty are illustrated in Fig. 2. When the matrix of the CRM matches the matrix of the sample being measured, and the homogeneity and stability of the sample have been proven (first situation), the uncertainty of the sample measurements can be equated to that observed in measurement of the CRM. Further, if this match is not possible, but the CRM has a matrix related to that of the sample (second situation), the test sample uncertainty may be related by a factor of k to that observed when measuring the CRM. When the previous situation does not apply, the measurement of an appropriate CRM (third situation) can give an indication of the measurement uncertainty. Even so, it is recommended that isolated results on CRMs should be complemented with the use of control charts if one consistently needs information on the measuring process.

A control chart is simply a graphical way to interpret test data. If a selected RM is measured periodically and the results are plotted sequentially on a graph (control chart type), a lot of information may be obtained on the combined effect of many potential sources of error occurring in the measuring process. Limits for acceptable values are defined and the chemical measurement system is assumed to be in control as long as the results stay within these limits. The monitored precision of measurement and the accuracy of measurement of the reference material may be transferred, by inference, to all other appropriate measurements made by the system while it is in a state of control (i.e. repeated measurements over a period of time of standard samples processed right through the system are consistent with the measured variance of the system). Thus, the resulting judgment of uncertainty could be assigned to the sample data output of the process, provided the following sources of uncertainty are taken into account:

- Uncertainty of the assigned value of the RM
- Reproducibility of the measurements made on the RM
- Any difference between the measured value of the RM and its assigned value
- Difference between the composition of RM and sample
- Difference in the measurement response due to interferences or matrix effects
- Operations that are carried out in the laboratory on samples but not on RMs due to subdivision of the original sample.

Fig. 2 Interpreting CRM measurements (panels: (I) Matrix-Match; (II) Matrix-Related; (III) Matrix-Relation Inferred)

One may easily note that one of the main prerequisites for the estimation of measurement uncertainty using RMs is the stability (statistical control) of the measurement system. Statistical control may be defined as the attainment of a state of predictability [8]. Working under the above-mentioned conditions, the mean of a large number of measurements will approach a limiting value and the individual measurements should have a stable distribution, described by their standard deviation. The limits within which any new measured value will fall can be predicted with a specified probability. Confidence limits for a single measurement or for the mean of a set of measurements can be calculated, and the number of measurements required to obtain a mean value with a given confidence may be estimated.

Some outcomes on the estimation of measurement uncertainty using RMs and CRMs

Two examples of estimation of measurement uncertainty using both the CRM and RM approach are discussed below.

Estimation of measurement uncertainty of a mass fraction result using CRMs

Three examples of this type of evaluation are given in Table 2 for methods based on molecular absorption spectrophotometry, flame atomic absorption spectroscopy (F-AAS), and ICP-OES. In each case samples of NIST-SRM 14 g (AISI 1078 type) were analysed using a Perkin-Elmer 192 flame atomic spectrophotometer, a UV2-100 ATI Unicam spectrophotometer, and a Spectroflame ICP-P. Prior to this experiment, each measurement system was metrologically verified in accordance with the legal metrological norms (NML 9-12-97, NML 9-02-94 and OIML R116). Each system was calibrated against INM's own CRMs and internal quality procedure. The practical operating conditions and the parameters of the calibration curves are also indicated in Table 2. Note that the uncertainty due to sampling and sample preparation took into consideration [5] aspects regarding the homogeneity estimate, dissolution, dilution errors, chemical effects, etc. Therefore, several samples were taken and analysed separately according to the considered methods. The variability between the individual results was considered as a measure of the reproducibility of the specific analytical method and of the uncertainty of sampling and sample preparation. Thus, the uncertainty due to sampling and sample preparation was determined as the difference between the above-mentioned variabilities divided by the number of samples analysed.

As one may note, the absolute expanded uncertainty (for k = 2) evaluated for the mass fraction result obtained with the molecular spectrophotometric method (0.005%) was equal to the interlaboratory standard deviation of the standard method of analysis (0.005%). Using the F-AAS method, the absolute uncertainty of the measurement result (0.004%) exceeded the standard method accuracy (0.003% abs). Comparing the two values, one may note that the calculated ratio, (0.004/0.003)² = 1.78, is less than the tabulated critical value for (4; 0.95), 3.65, and one may conclude that there is no evidence that the measurement process is not as precise as required. For the ICP method, the expanded uncertainty agreed with the uncertainty of the certified value of the CRM (0.003%) used in this experiment. Starting from the certified value of the mass fraction of Cu in the NIST-SRM (0.047%), one may also note that the measurement results obtained with both the molecular spectrophotometric and ICP methods were in good agreement. In the F-AAS method, a bias exceeding the prescribed limits was observed. It was therefore concluded that the measurement process needed to be optimized.

Table 2 Estimation of the measurement uncertainty of a mass fraction result using a CRM

Parameter | Molecular spectrometric method | Flame-AAS method | ICP-AES method
Operation conditions | | |
wavelength | 450 (copper diethyldithiocarbamate) | 324.8 | 324.75
repeated measurements | 10 | 5 | 10
measurement method | as described in STAS 1463-84 | as described in STAS 1463-84 | as described in internal procedure
calibration against | BCS (206/2, 224, 255, 257) | single element solution from high purity elements | multielement solution from high purity metals (synthetic matrix form)
Calibration curves parameters | | |
slope, b | 0.8455 | 2.1016 | 71517
intercept, a | -0.0093 | 0.0016 | 11
standard deviation of the regression, s_0 | 0.007 | 0.001 | 13.65
Uncertainties due to: | | |
- sampling and sample preparation, u_samp (rel) | 0.015 | 0.005 | 0.005
- measurement method (note 1), u_meth (rel) | 0.040 | 0.045 | 0.037
- instrument calibration (note 2), u_cal (rel) | 0.025 | 0.003 | 0.002
- CRM, u_CRM (rel) | 0.020 | 0.005 | 0.005
- data treatment (rel) | negligible | negligible | negligible
Measurement result, w (%) | 0.047 | 0.055 | 0.047
Expanded uncertainty (note 3) (k=2), U_w (rel) | 0.1068 | 0.0914 | 0.0755

Note 1: The following mathematical relations describing the measurement procedures were used: $w = (w_{rec} \cdot r / m) \cdot 100$ in the molecular spectrophotometric method, and $w = (w_{rec} \cdot V / m) \cdot 10^{-4}$ in the F-AAS and ICP-AES methods. Here $w_{rec}$ is the mass fraction calculated from the calibration curve ($w_{rec} = A(x)/b$), with $A(x)$ the absorbance measured for the unknown sample; $r$ is the dilution factor ($r = V_f/V_i$); $V$ is the volume of sample being analysed; $m$ is the weight of sample being analysed; $V_f$ is the final volume and $V_i$ the volume taken for dilution.

Note 2: Calculated with the equation

$$u_{cal}^2 = \frac{s_0^2}{b^2}\left(\frac{1}{N} + \frac{1}{n} + \frac{n\,(A_{meas} - \bar{A})^2}{b^2\left(n\sum c^2 - \left(\sum c\right)^2\right)}\right)$$

where N is the number of CRMs used for calibration and n is the number of repeated measurements on each CRM.

Note 3: Calculated by combining the relative standard uncertainties of the individual components as a root sum of squares to give $u_w/w$, with $U_w = 2 \cdot u_w/w$.
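The expanded relative uncertainties listed in Table 2 can be checked by combining the tabulated relative components in quadrature and multiplying by the coverage factor k = 2. The Python sketch below does this; it assumes that the four tabulated components are independent and that the data-treatment contribution is negligible, as stated in the table. The dictionary layout and the function name `expanded_relative_uncertainty` are hypothetical.

```python
from math import sqrt

# Relative standard uncertainties from Table 2, in the order:
# sampling and sample preparation, measurement method, instrument calibration, CRM.
components = {
    "molecular spectrometry": [0.015, 0.040, 0.025, 0.020],
    "F-AAS": [0.005, 0.045, 0.003, 0.005],
    "ICP-AES": [0.005, 0.037, 0.002, 0.005],
}

# Measurement results w (%) from Table 2, used to convert to absolute uncertainties.
results = {"molecular spectrometry": 0.047, "F-AAS": 0.055, "ICP-AES": 0.047}


def expanded_relative_uncertainty(rel_components, k=2.0):
    """Root-sum-of-squares combination of independent relative uncertainties, times k."""
    return k * sqrt(sum(u * u for u in rel_components))


expanded = {name: expanded_relative_uncertainty(u) for name, u in components.items()}

for name, U_rel in expanded.items():
    U_abs = U_rel * results[name]  # absolute expanded uncertainty, in % mass fraction
    print(f"{name}: U_rel = {U_rel:.4f}, U_abs = {U_abs:.4f} %")
```

Small differences in the last digit relative to the table are to be expected, since the tabulated components are themselves rounded.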

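Table 3 Example of determination of measurement uncertainty of a turbidity result using the ANOVA method

Instrument: 1, 2, 3, 4, 5
Mean turbidity measured: 100.42, 97.98, 103.60, 99.80, 103.34
$s(T_{ij})$, FNU: 0.638, 2.342, 3.188, 1.558 [one of the five values is missing]

Instrument: 6, 7, 8, 9, 10
Mean turbidity measured: 101.50, 102.00, 100.22, 99.88, 99.16
$s(T_{ij})$, FNU: 0.620, 2.722, 2.545, 3.569, 3.547

$\bar{T} = 100.79$; $s(\bar{T}_j) = 1.801$

$s_a^2 = K\,s^2(\bar{T}_j) = 5\,(1.801)^2 = (4.027)^2$; $s_b^2 = \overline{s^2(T_{ij})} = (2.577)^2$

$F = s_a^2 / s_b^2 = 2.44$; $F_{0.95}(9, 40) = 2.12$; $F_{0.975}(9, 40) = 2.45$

Case $F > F_{0.95}(9, 40)$:
$s^2(\bar{T}_j) = s_W^2/K + s_B^2$
$s_B^2 = \left[K\,s^2(\bar{T}_j) - \overline{s^2(T_{ij})}\right]/K = (1.384)^2$
$s_W^2 = \overline{s^2(T_{ij})} = (2.577)^2$
Measurement uncertainty: $(0.569)^2$
Reported result: $(100.79 \pm k \cdot 0.569)$

Case $F < F_{0.975}(9, 40)$:
$s^2(\bar{T}) = \dfrac{(J-1)\,s_a^2 + J\,(K-1)\,s_b^2}{J\,K\,(J\,K-1)}$
Measurement uncertainty: $(0.410)^2$
Reported result: $(100.79 \pm k \cdot 0.410)$

Here J = 10 is the number of instruments, K = 5 the number of observations per instrument, and k the coverage factor.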

On the use of analysis of variance (ANOVA) to estimate measurement uncertainty of a turbidity result

ANOVA methods are of special importance in measurements to identify and quantify individual random effects so that they can be properly taken into account when the uncertainty of the result of the measurement is evaluated.
An example of the application of ANOVA to estimate turbidity measurement uncertainty is presented in Table 3.
A standard of turbidity of (100.5 ± 3.7) FNU, prepared from a (4000 ± 20) primary standard of turbidity, was measured on ten stable turbidimeters, denoted 1 to 10. On each instrument, five independent turbidity measurements were made; the mean value $\bar{T}_j$ and the experimental standard deviation $s(T_{ij})$ determined are presented in Table 3. Note that $T_{ij}$ denotes the ith observation on the standard made on the jth turbidimeter. The experimental standard deviation of the mean, $s(\bar{T}_j)$, is the measure of the uncertainty of measurement only if the instrument-to-instrument variability of observations is the same as the variability of the observations made on a single instrument. If there is evidence that the between-instrument variability is significantly larger than the within-instrument variability, the use of this expression can lead to a considerable understatement of the uncertainty of the turbidity result. Therefore, the consistency of the within- and between-instrument variability of the observations was investigated by comparing the two independent estimates of the within-instrument component of variance, $\sigma_W^2$:
- The first one was denoted $s_a^2$, with (10-1) degrees of freedom. It was obtained from the observed variation of the ten instrument means. The mean of the measurements made on an instrument was the arithmetic average of five observations, and its estimated variance was calculated;
- The second estimate of $\sigma_W^2$ was denoted $s_b^2$ and was calculated as the pooled estimate of variance obtained from the ten individual variances $s^2(T_{ij})$. Note that this estimate has 10(5-1)=40 degrees of freedom.
112 M. Buzoianu

The difference between $s_a^2$ and $s_b^2$ indicates the possible presence of an effect that varies from one instrument to another but that remains constant when measurements are made on any instrument. To test this possibility the F-test was used, and the value calculated was 2.44. Since $F_{0.95}(9, 40)$ is 2.12 and $F_{0.975}(9, 40)$ is 2.45, it was concluded that there is a statistically significant between-instrument effect at the 5% level of significance but not at the 2.5% level. Two further situations were considered. Presented in the left column of the table is the case in which the existence of the between-instrument effect was rejected because the difference between $s_a^2$ and $s_b^2$ was not viewed as statistically significant. Following the procedure indicated in [4], a standard measurement uncertainty of 0.410 FNU was obtained. Presented in the right column is the situation where the existence of a between-instrument effect was accepted. Assuming that this effect was random, the estimated variance of the mean value of turbidity was calculated taking into account both the within- and between-instrument random components of variance. Also following the procedure indicated in [4], a standard measurement uncertainty of 0.569 FNU was obtained. Note that this value, reported when a between-instrument variance was accepted, is the more prudent choice for practical purposes.
In the example considered above, the same RM was also tested on several turbidimeters within a short period of time, and the 48 turbidity values measured were plotted in Fig. 3. Note that the upper limit of turbidity was 105 FNU and the lower limit 95 FNU. An overall mean value of 102.3 FNU and a standard uncertainty of 2 FNU were determined. Over 35 results fell within the range of ± 2 FNU.

Fig. 3 Dispersion of turbidity measurement results on a RM of 100 FNU

Conclusions

This paper has examined the role of RMs and CRMs in estimation of the uncertainty of a measurement result. A clearly defined specification of the measurand, knowledge of the main sources of uncertainty and the correct use of RMs and CRMs are the main targets for the adequate application of the ISO-GUM. Different examples describing the use of CRMs and RMs emphasize the importance of the appropriate certification of RMs.

References

1. Buzoianu M, Duta S (1996) National System for Reference Materials in Romania. Proceedings of Central European Conference on Reference Materials, Slovakia
2. Buzoianu M (1998) Accred Qual Assur 3:270-277
3. Buzoianu M (1999) Metrological Calibration in Traceability. Proceedings of the EURACHEM Workshop on the Status of Traceability in chemistry, Bratislava
4. ISO (1993) Guide to the expression of uncertainty in measurements (GUM), 1st edn. ISO, Geneva
5. EURACHEM (1995) Guide to quantifying uncertainty in analytical measurements. EURACHEM, London
6. Buzoianu M, Aboul-Enein HY (1997) Accred Qual Assur 2:11-17
7. ISO (1993) International vocabulary of basic and general terms in metrology (VIM), 2nd edn. International Organization for Standardization (ISO), Geneva
8. ISO Guide 33 (1989) Uses of certified reference materials. ISO, Geneva
Accred Qual Assur (1998) 3:115-116
© Springer-Verlag 1998

Werner Hässelbarth

Uncertainty - The key topic of metrology in chemistry

Abstract At the Second EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis the author had the pleasure of chairing a working group on chemical metrology. This note presents some propositions arising from the preparation of, as well as from the discussion at and after, the working group session.

Key words Uncertainty · Metrology · Comparability · Traceability · Validation · Bias

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29-30 September 1997

W. Hässelbarth
Federal Institute for Materials Research and Testing (BAM)
D-12200 Berlin, Germany
Tel.: +49306392 51161;
Fax: + 49 30 6392 5972;
e-mail: werner.haesselbarth@bam.de

There is no specific chemical metrology: Instead, as a truly horizontal discipline, metrology is applied in chemistry

Although "chemical metrology" is frequently used in chemical analysis, this term should be abandoned.
The main reason for this recommendation is that metrology is a truly horizontal discipline, operating on uniform principles, largely independently of the particular application field. Although first developed for physical measurements, the basic metrological terms, concepts and procedures are applicable throughout analytical chemistry. Nevertheless, there are specific challenges in chemical measurements such as the enormous diversity of measurands (i.e. analyte/level/matrix combinations) and the lack of direct measurement methods, which require specific strategies.
To promote the application of metrology in chemistry, its basic concepts and procedures have to be made crystal clear, emphasizing purposes instead of protocols.

The main objective of metrology in chemistry is known uncertainty of analytical results

By focussing on purposes instead of procedures, most of the current "buzz items" in measurement quality assurance such as comparability, traceability and validation are reduced to uncertainty as the primary performance characteristic, as follows.

Comparability

For comparing different measurement results on the same measurand, three basic requirements have to be fulfilled:
• The uncertainties of the measurement results have to be known.
• The units used to express the measurement results (including uncertainties) have to be the same, or at least convertible.
• The measures used to express the uncertainties have to be the same, or at least convertible.

Evidently, the last two requirements call for standardization. For the units of physical quantities the standardization problem was solved by establishing the International System of Units (SI) in 1960. The confusion about different uncertainty measures continued beyond that date and was solved only recently, with the appearance and world-wide acceptance of the "standard uncertainty" proposed by the Guide to the Expression of Uncertainty in Measurement (GUM) in 1993. The first requirement - known uncertainty - is still largely unsettled, although the concepts and methods of uncertainty evaluation proposed by the GUM have paved the way for substantial progress.

Traceability

Evaluation of measurement uncertainty according to the concept of the GUM basically includes two steps. In the first step, the measurement process is investigated for bias. If significant bias is found, a correction is applied. In the second step, the uncertainty on the bias correction is combined with the uncertainty due to random effects to yield the overall uncertainty of the corrected measurement.
Traceability serves the purpose of excluding, or of determining and correcting, significant measurement bias by comparison between measured values and corresponding reference values. Thus traceability, where applicable, provides a firm basis for valid uncertainty statements.

Validation

The performance characteristics considered in method validation serve the purpose of specifying, for a given analytical method, an application range with defined uncertainty. For example
• Specificity and selectivity are intended to specify the range of matrices where the uncertainty budget is valid. For other matrices, a correction and/or an additional uncertainty component are necessary.
• Robustness and ruggedness are intended to specify a range of operating conditions where the uncertainty budget is valid. For other operating conditions a correction and/or an additional uncertainty component are necessary.
• Linearity is intended to specify that range of analyte content where a linear calibration function applies. Beyond this range a correction and/or an additional uncertainty are necessary.
• Reproducibility aims at establishing a "top-down" estimate of the uncertainty of an analytical method, including interlaboratory bias as an uncertainty component, but excluding method bias.

Discussion and guidance on determination and correction of bias in analytical methods is urgently needed

To date analytical chemists as a rule have been content with stating agreement or disagreement of analytical results, obtained on a certified reference material, with the corresponding certified values. In cases of disagreement, usually no attempt is made to derive a correction. Neither is the uncertainty on the correction taken into account, which is also necessary in cases of agreement, because a correction factor of unity also comes with an uncertainty.
At the workshop, the topic of bias handling was raised on many occasions, indicating an urgent demand for discussion and guidance, for example on
• Practical traceability procedures, i.e. procedures for performing valid comparisons between measured values and reference values and for evaluating the comparison results
• Criteria for when to apply corrections on the basis of bias information
• Procedures for bias correction and for estimating correction uncertainty.
The topic of bias correction will be addressed in the forthcoming revision of the EURACHEM Guide Quantifying Uncertainty in Analytical Measurement.
In practice, rigorous and complete traceability of analytical results to established references will be the exception rather than the rule. Therefore it will be an important task to agree on levels of rigour and completeness of traceability statements required for, and feasible in, specific analytical sectors.
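The two-step evaluation described under Traceability (estimate and correct bias, then combine the uncertainty of the correction with that of random effects) might be sketched as follows. Everything specific here is an assumption made for illustration: the additive form of the correction, the independence of the terms, the function name `corrected_result` and all numerical values are hypothetical and are not taken from the note or from the GUM.

```python
import math
import statistics

def corrected_result(crm_readings, certified, u_certified, sample_value, u_random):
    """Hypothetical two-step evaluation with an additive bias correction."""
    n = len(crm_readings)
    # Step 1: bias from comparing measured values with the reference value.
    bias = statistics.fmean(crm_readings) - certified
    # Uncertainty of the bias estimate: scatter of the CRM mean combined with
    # the uncertainty of the certified value (assumed independent).
    u_bias = math.sqrt(statistics.variance(crm_readings) / n + u_certified ** 2)
    # Step 2: correct the result and combine both uncertainty components.
    corrected = sample_value - bias
    u_combined = math.sqrt(u_random ** 2 + u_bias ** 2)
    return corrected, u_combined, bias, u_bias

# Illustrative numbers only (arbitrary units).
value, u, bias, u_bias = corrected_result(
    crm_readings=[10.2, 10.4, 10.3], certified=10.0, u_certified=0.05,
    sample_value=5.0, u_random=0.10)
print(f"bias = {bias:.2f}, corrected = {value:.2f}, u = {u:.3f}")
```

Even when the measured CRM mean agrees with the certified value, `u_bias` stays above zero, which is the additive counterpart of the remark that a correction factor of unity still carries an uncertainty.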
Accred Qual Assur (1998) 3:101-105
© Springer-Verlag 1998

S. L. R. Ellison
V. J. Barwick

Estimating measurement uncertainty: reconciliation using a cause and effect approach

Abstract A strategy is presented for applying existing data and planning necessary additional experiments for uncertainty estimation. The strategy has two stages: identifying and structuring the input effects, followed by an explicit reconciliation stage to assess the degree to which information available meets the requirement and thus identify factors requiring further study. A graphical approach to identifying and structuring the input effects on a measurement result is presented. The methodology promotes consistent identification of important effects, and permits effective application of prior data with minimal risk of duplication or omission. The results of applying the methodology are discussed, with particular reference to the use of planned recovery and precision studies.

Key words Measurement uncertainty · Validation · Reconciliation · Cause and effect analysis

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29-30 September 1997

S. L. R. Ellison (✉) · V. J. Barwick
Laboratory of the Government Chemist,
Queens Road, Teddington TW11 0LY, UK

Introduction

The approach to the estimation of measurement uncertainty described in the ISO Guide to the expression of uncertainty in measurement (GUM) [1] and the EURACHEM interpretation for analytical measurement [2] relies on a quantitative model of the measurement system, typically embodied in a mathematical equation including all relevant factors. The GUM principles differ substantially from the methodology currently used in analytical chemistry for estimating uncertainty [3, 4]. Current practice in establishing confidence and intercomparability relies on the determination of overall method performance parameters, such as linearity, extraction recovery, reproducibility and other precision measures. These are obtained during method development and interlaboratory study [5-7], or by in-house validation protocols, with no formal requirement for a full mathematical model. Whilst there is commonality between the formal processes involved [8], implying that a reconciliation between the two is possible in principle, there are significant difficulties in applying the GUM approach generally in analytical chemistry [4]. In particular, it is common to find that the largest contributions to uncertainty arise from the least predictable effects, such as matrix effects on extraction or response, sampling operations, and interferences. Uncertainties associated with these effects can only be determined by experiment. However, the variation observed includes contributions from some, but not all, other sources of variation, risking "double counting" when other contributions are studied separately. The result, when using these and other data to inform GUM-compliant estimates of uncertainty, is substantial difficulty in reconciling the available data with the information required.

In this paper, we describe and illustrate a structured methodology applied in our laboratory to overcome these difficulties, and present results obtained using the methodology. It will be argued that application of the approach can lead to a full reconciliation of validation studies with the GUM approach, and the advantages and disadvantages of the methodology will be considered. Finally, some uncertainty estimates obtained using the methodology are presented, and the relative magnitudes of different contributions are considered.

Principles of approach

The strategy has two stages:
1. Identifying and structuring the effects on a result. In practice, we effect the necessary structured analysis using a cause and effect diagram (sometimes known as an Ishikawa or "fishbone" diagram) [9].
2. Reconciliation. The reconciliation stage assesses the degree to which information available meets the requirement and thus identifies factors requiring further study.
The approach is intended to generate an estimate of overall uncertainty, not a detailed quantification of all components.

Cause and effect analysis

The principles of constructing a cause and effect diagram are described fully elsewhere [9]. The procedure employed in our laboratory is as follows:
1. Write the complete equation for the result. The parameters in the equation form the main branches of the diagram. (We have found it is almost always necessary to add a main branch representing a nominal correction for overall bias, usually as recovery, and accordingly do so at this stage.)
2. Consider each step of the method and add any further factors to the diagram, working outwards from the main effects. Examples include environmental and matrix effects.
3. For each branch, add contributory factors until effects become sufficiently remote, that is, until effects on the result are negligible.
4. Resolve duplications and re-arrange to clarify contributions and group related causes. We have found it convenient to group precision terms at this stage on a separate precision branch.
Note that the procedure parallels the EURACHEM guide's sequence of preliminary operations very closely; specification of the measurand (step 1), identification of sources of uncertainty (steps 2 and 3) and grouping of related effects where possible (step 4) are explicitly suggested [2].
The final stage of the cause and effect analysis requires further elucidation. Duplications arise naturally in detailing contributions separately for every input parameter. For example, a run-to-run variability element is always present, at least nominally, for any influence factor; these effects contribute to any overall variance observed for the method as a whole and should not be added in separately if already so accounted for. Similarly, it is common to find the same instrument used to weigh materials, leading to over-counting of its calibration uncertainties. These considerations lead to the following additional rules for refinement of the diagram (though they apply equally well to any structured list of effects):
1. Cancelling effects: remove both. For example, in a weight by difference, two weights are determined, both subject to the balance "zero bias". The zero bias will cancel out of the weight by difference, and can be removed from the branches corresponding to the separate weighings.
2. Similar effect, same time: combine into a single input. For example, run-to-run variation on many inputs can be combined into an overall run-to-run precision "branch". Some caution is required; specifically, variability in operations carried out individually for every determination can be combined, whereas variability in operations carried out on complete batches (such as instrument calibration) will only be observable in between-batch measures of precision.
3. Different instances: re-label. It is common to find similarly named effects which actually refer to different instances of similar measurements. These must be clearly distinguished before proceeding.
The procedure is illustrated by reference to a simplified direct density measurement. We take the case of direct determination of the density d(EtOH) of ethanol by weighing a known volume $V$ in a suitable volumetric vessel of tare weight $M_{\mathrm{tare}}$ and gross weight including ethanol $M_{\mathrm{gross}}$. The density is calculated from

$d(\mathrm{EtOH}) = (M_{\mathrm{gross}} - M_{\mathrm{tare}})/V$

For clarity, only three effects will be considered: equipment calibration, temperature, and the precision of each determination. Figures 1-3 illustrate the process graphically.
A cause and effect diagram consists of a hierarchical structure culminating in a single outcome. For our purpose, this outcome is a particular analytical result ["d(EtOH)" in Fig. 1]. The "branches" leading to the outcome are the contributory effects, which include both the results of particular intermediate measurements and other factors, such as environmental or matrix effects. Each branch may in turn have further contributory effects. These "effects" comprise all factors

affecting the result, whether variable or constant; uncertainties in any of these effects will clearly contribute to uncertainty in the result.
Figure 1 shows a possible diagram obtained directly from application of steps 1-3. The main branches are the parameters in the equation, and effects on each are represented by subsidiary branches. Note that there are two "temperature" effects, three "precision" effects and three "calibration" effects. Figure 2 shows precision and temperature effects each grouped together following the second rule (same effect/time); temperature may be treated as a single effect on density, while the individual variations in each determination contribute to variation observed in replication of the entire method. The calibration bias on the two weighings cancels, and can be removed (Fig. 3) following the first refinement rule (cancellation). Finally, the remaining "calibration" branches would need to be distinguished as two (different) contributions owing to possible non-linearity of balance response, together with the calibration uncertainty associated with the volumetric determination.

[Figs. 1-3: cause and effect diagrams for d(EtOH) with branches M(gross), M(tare), Volume, Temperature, Precision and Calibration (Lin. = Linearity); annotation in Fig. 3: "Same balance: bias cancels"]

Figs. 1-3 Stages in refinement of cause and effect diagram. Fig. 1 Initial diagram. Fig. 2 Combination of similar effects. Fig. 3 Cancellation

This form of analysis does not lead to uniquely structured lists. In the present example, temperature may be seen as either a direct effect on the density to be measured, or as an effect on the measured mass of material contained in a density bottle; either could form the initial structure. In practice this does not affect the utility of the method. Provided that all significant effects appear once, somewhere in the list, the overall methodology remains effective.
Once the cause-and-effect analysis is complete, it may be appropriate to return to the original equation for the result and add any new terms (such as temperature) to the equation. However, the reconciliation which follows will often show that additional terms are adequately accounted for; we therefore find it preferable to first conduct the next stage of the analysis.

Reconciliation

Following elucidation of the effects and parameters influencing the results, a review is conducted to determine qualitatively whether a given factor is duly accounted for by either existing data or experiments planned. The fundamental assumption underlying this review is that an effect varied representatively during the course of a series of observations needs no further study. In this context, "representatively" means that the influence parameter has demonstrably taken a distribution of values appropriate to the uncertainty in the parameter in question. For continuous parameters, this may be a permitted range or stated uncertainty; for factors such as sample matrix, this range corresponds to the variety of types permitted or encountered in normal use of the method. The assumption is justified as follows.
The ISO approach calculates a standard uncertainty $u(y)$ in $y(x_i, x_j, \ldots)$ from contributions $u(y_i) = u(x_i) \cdot \partial y/\partial x_i$ (with additional terms if necessary). Each value of $u(x_i)$ characterises a dispersion associated with the value $x_i$. The sensitivity coefficient $\partial y/\partial x_i$ may be determined by differentiation (analytically or numerically), or by experiment. Consider an increment $\Delta x_i$ in $x_i$. This will clearly lead to a change $\Delta y_i$ in the result given by

$\Delta y_i = \Delta x_i \cdot \partial y/\partial x_i$   (1)

Given the appropriate distribution $f(\Delta x_i)$ of values of $\Delta x_i$ with dispersion characterised by standard uncertainty $u(x_i)$, the corresponding distribution $g(\Delta y_i)$ of $\Delta y_i$ will be characterised by $u(y_i)$. This is essentially the basis of the ISO approach [1]. It follows that in order to demonstrate that a particular contribution to overall uncertainty is adequately incorporated into an observed dispersion of results, it is sufficient to demonstrate that the distribution of values taken by the influence parameter in the particular experiment is representative of $f(\Delta x_i)$. [Strictly, $u(x_i)$ could characterise many possible distributions and not all will yield the same value of $u(y_i)$ for all functions $y(x_i, x_j, \ldots)$. It is assumed here that either $f(\Delta x_i)$ is the particular distribution appropriate to the problem, when $g(\Delta y_i)$ necessarily generates the correct value of $u(y_i)$, or that $y(x_i, x_j, \ldots)$ satisfies the assumptions justifying the first order approximation of Ref. [1], in which case any distribution $f(\Delta x_i)$ characterised by $u(x_i)$ will generate $u(y_i)$].
Following these arguments, it is normally straightforward to decide whether a given parameter is sufficiently covered by a given set of data or planned experiment. Where a parameter is already so accounted for, the fact is noted. The parameters which are not accounted for become the subject of further study, either through planned experimentation, or by locating appropriate standing data, such as calibration certificates or manufacturing specifications. The resulting contributions, obtained from a mixture of whole method studies, standing data and any additional studies on single effects, can then be combined according to ISO GUM principles.
An illustrative example of a reconciled cause and effect study is shown in Fig. 4, which shows a partial diagram (excluding long-term precision contributions and secondary effects on recovery) for an internally standardised GC determination of cholesterol in oils and fats. The result, cholesterol concentration $C_{ch}$ in mg/100 g of material, is given by

$C_{ch} = \dfrac{A_C \times R_f \times IS}{A_B \times m} \times \dfrac{1}{R} \times 100$   (2)

where $A_C$ is the peak area of the cholesterol, $A_B$ is the peak area of the betulin internal standard, $R_f$ the response factor of cholesterol with respect to betulin (usually assumed to be 1.00), $IS$ the weight of the betulin internal standard (mg), and $m$ the weight of the sample (g). In addition, a nominal correction ($1/R$) for recovery is included; $R$ may be 1.0, though there is invariably an associated uncertainty. If a recovery study including a representative range of matrices and levels of analyte is conducted, and it includes several separate preparations of standards, the dispersion of the recovery results will incorporate uncertainty contributions from all the effects marked with a tick. For example, all run-to-run precision elements will be included, as will variation in standard preparation; matrix and concentration effects on recovery will be similarly accounted for. Effects marked with a cross are unlikely to vary sufficiently, or at all, during a single study; examples include most of the calibration factors. The overall uncertainty can in principle be calculated from the dispersion of recoveries found in the experiment combined with contributions determined for the remaining terms. Due care is, of course, necessary to check for homoscedasticity before pooling data.

[Fig. 4: cause and effect diagram leading to "Cholesterol"; legible labels include "Recovery", Int Std, Repeatability, Volume, Temperature and Internal Standard weight; legend: tick = covered by experiment, cross = not covered by experiment (see text)]

Fig. 4 Partial cause and effect diagram for cholesterol determination. See text for explanation

Results

We have found that the methodology is readily applied by analysts. It is intuitive, readily understood and, though different analysts may start with differing views, leads to consistent identification of major effects. It is particularly valuable in identifying factors for variation during validation studies, and for identifying the need for additional studies when whole method performance figures are available. The chief disadvantage is that, in focusing largely on whole method studies, only the overall uncertainty is estimated; individual sources of uncertainty are not necessarily quantified directly (though the methodology is equally applicable to formal parameter-by-parameter studies). However, the structured list of effects provides a valuable aid to planning when such additional information is required for method development. Some results of applying this methodology are summarised in Fig. 5, showing the relative magnitudes of contributions from overall precision and recovery uncertainties u(precision) and u(recovery), before combination. "Other" represents the remaining combined contributions. That is, the pie charts show the relative magnitudes of u(precision), u(recovery) and $\sqrt{\sum u(y_i)^2}$, with $u(y_i)$ excluding u(precision) and u(recovery). It is clear that, as expected, most are dominated by the "whole method" contributions, suggesting that studies of overall method performance, together with specific additional factors,

Fig. 5 Contributions to combined standard uncertainty. Charts show the relative sizes of uncertainties associated with overall precision, bias, and other effects (combined). See text for details
should provide adequate estimates of uncertainty for many practical purposes.

Conclusions

We have presented a strategy capable of providing a structured analysis of effects operating on test results and reconciling experimental and other data with the information requirements of the GUM approach. The initial analysis technique is simple, visual, readily understood by analysts and encourages comprehensive identification of major influences on the measurement. The reconciliation approach is justified by comparison with the ISO GUM principles, and it is shown that the two approaches are equivalent given representative experimental studies. The procedure permits effective use of any type of analytical data, provided only that the ranges of influence parameters involved in obtaining the data can be established with reasonable confidence. Use of whole method performance data can obscure the magnitude of individual effects, which may be counter-productive in method optimisation. However, if an overall estimate is all that is required, it is a considerable advantage to avoid laborious study of many effects.

Acknowledgement Production of this paper was supported under contract with the Department of Trade and Industry as part of the National Measurement System Valid Analytical Measurement Programme.
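The first-order propagation used in the reconciliation argument, and the cancellation rule for weighing by difference, can be illustrated with the ethanol density example. This is a rough sketch only: the masses, volume and standard uncertainties are hypothetical values chosen for illustration, the sensitivity coefficients are obtained by numerical differentiation, and the helper `sensitivity` is not part of the paper.

```python
import math

def density(m_gross, m_tare, volume):
    """d(EtOH) = (M_gross - M_tare) / V."""
    return (m_gross - m_tare) / volume

def sensitivity(f, args, i, rel_step=1e-6):
    """Central-difference estimate of the partial derivative of f w.r.t. args[i]."""
    h = abs(args[i]) * rel_step or rel_step
    up, down = list(args), list(args)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

# Hypothetical inputs: gross mass (g), tare mass (g), volume (mL).
args = [104.450, 25.000, 100.00]
u_x = [0.002, 0.002, 0.05]          # hypothetical standard uncertainties

# u(y_i) = u(x_i) * dy/dx_i for each input, combined in quadrature.
contributions = [u * sensitivity(density, args, i) for i, u in enumerate(u_x)]
u_d = math.sqrt(sum(c ** 2 for c in contributions))

# Cancellation rule: a zero bias common to both weighings drops out.
zero_bias = 0.010                   # hypothetical balance zero bias (g)
d_plain = density(*args)
d_biased = density(args[0] + zero_bias, args[1] + zero_bias, args[2])

print(f"d = {d_plain:.4f} g/mL, u(d) = {u_d:.5f} g/mL")
print(f"shift caused by common zero bias: {d_biased - d_plain:.1e} g/mL")
```

In this sketch the volume term dominates the combined uncertainty, and the common zero bias changes the density only at the level of floating-point rounding, which is why it can be removed from both weighing branches of the diagram.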

References

1. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
2. EURACHEM (1995) Guide: Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist, London
3. Analytical Methods Committee (1995) Analyst 120:2303
4. Ellison SLR (1997) In: Ciarlini P, Cox MG, Pavese F, Richter D (eds) Advanced mathematical tools in metrology III. World Science, Singapore, pp 56-67
5. Horwitz W (1988) Pure Appl Chem 60:855-864
6. AOAC (1989) Recommendation. J Assoc Off Anal Chem 72:694-704
7. ISO 5725:1994 (1995) Accuracy (trueness and precision) of measurement methods and results. ISO, Geneva
8. Ellison SLR, Williams A, Accred Qual Assur (in press)
9. ISO 9004-4:1993 (1993) Total quality management, part 2. Guidelines for quality improvement. ISO, Geneva
Accred Qual Assur (1998) 3:6-10

Stephen L.R. Ellison
Alex Williams

Measurement uncertainty and its implications for collaborative study method validation and method performance parameters

Abstract ISO principles of measurement uncertainty estimation are compared with protocols for method development and validation by collaborative trial and concomitant "top-down" estimation of uncertainty. It is shown that there is substantial commonality between the two procedures. In particular, both require a careful consideration and study of the main effects on the result. Most of the information required to evaluate measurement uncertainty is therefore gathered during the method development and validation process. However, the information is not generally published in sufficient detail at present; recommendations are accordingly made for future reporting of the data.

S.L.R. Ellison (✉)
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

A. Williams
19 Hamesmoor Way, Mytchett, Camberley, Surrey, GU16 61G, UK

Introduction

One of the fundamental principles of valid, cost-effective analytical measurement is that methodology should demonstrably fit its intended purpose [1]. Technical fitness for purpose is usually interpreted in terms of the required "accuracy". To provide reasonable assurance of fitness for purpose, therefore, the analyst needs to demonstrate that the chosen method can be correctly implemented and, before reporting the result, needs to be in a position to evaluate its uncertainty against the confidence required.

Principles for evaluating and reporting measurement uncertainty are set out in the 'Guide to the expression of uncertainty in measurement' (GUM) published by ISO [2]. EURACHEM has also produced a document "Quantification of uncertainty in analytical measurement" [3], which applies the principles in this ISO Guide to analytical measurements. A summary has been published [4]. In implementing these principles, however, it is important to consider whether existing practice in analytical chemistry, based on collaborative trial [5-7], provides the information required. In this paper, we compare existing method validation guidelines with published principles of measurement uncertainty estimation, and consider the extent to which method development and validation studies can provide the data required for uncertainty estimation according to GUM principles.

Measurement uncertainty

There will always be an uncertainty about the correctness of a stated result. Even when all the known or suspected components of error have been evaluated and the appropriate corrections applied, there will be uncertainty on these corrections and there will be an uncertainty arising from random variations in end results.

The formal definition of "Uncertainty of Measurement" given by the GUM is "A parameter, associated with the result of a measurement, that characterises the dispersion of the values that could reasonably be attributed to the measurand. Note (1): The parameter may be, for example, a standard deviation (or a given multiple of it) or the half width of an interval having a stated level of confidence."

For most purposes in analytical chemistry, the "measurand" is the concentration of a particular species. Thus, the uncertainty gives a quantitative indication of the range of the values that could reasonably be attributed to the concentration of the analyte and enables a judgement to be made as to whether the result is fit for its intended purpose.

Uncertainty estimation according to GUM principles is based on the identification and quantification of the effects of influence parameters, and requires an understanding of the measurement process, the factors influencing the result and the uncertainties associated with those factors. These factors include corrections for duly quantified bias. This understanding is developed through experimental and theoretical investigation, while the quantitative estimates of relevant uncertainties are established either by observation or prior information (see below).

Method validation

For most regulatory applications, the method chosen will have been subjected to preliminary method development studies and a collaborative study, both carried out according to standard protocols. This process, and subsequent acceptance, forms the 'validation' of the method. For example, the AOAC/IUPAC protocol [5, 6] provides guidelines for both method development and collaborative study. Typically, method development forms an iterative process of performance evaluation and refinement, using increasingly powerful tests as development progresses, and culminating in collaborative study. On the basis of the results of these studies, standard methods are accepted and put into use by appropriate review or standardisation bodies. Since the studies undertaken form a substantial investigation of the performance of the method with respect to trueness, precision and sensitivity to small changes and influence effects, it is reasonable to expect some commonality with the process of uncertainty estimation.

Comparison of measurement uncertainty and method validation procedures

The evaluation of uncertainty requires a detailed examination of the measurement procedure. The steps involved are shown in Fig. 1. This procedure involves very similar steps to those recommended in the AOAC/IUPAC protocol [5, 6] for method development and validation, shown in Fig. 2. In both cases the same processes are involved: step 1 details the measurement procedure, step 2 identifies the critical parameters that influence the result, step 3 determines, either by experiment or by calculation, the effect of changes in each of these parameters on the final result, and step 4 their combined effect.

The AOAC/IUPAC protocol recommends that steps 2, 3 and 4 be carried out within a single laboratory, to optimise the method, before starting the collaborative trial. Tables 1 and 2 give a comparison of this part of the protocol [6] with an extract from corresponding parts of the EURACHEM Guide [3]. The two procedures are very similar. Section 1.3.2 of the method vali-
[Fig. 1 and Fig. 2 not reproduced]

dation protocol is concerned with the identification of the critical parameters and the quantification of the effect on the final result of variations in these parameters; the experimental procedures (a) and (c) suggested are closely similar to experimental methodology for evaluating the uncertainty. Though the AOAC/IUPAC approach aims initially to test for significance of change of result within specified ranges of input parameters, this should normally be followed by closer study of the actual rate of change in order to decide how closely a parameter need be controlled. The rate of change is exactly what is required to estimate the relevant uncertainty contribution by GUM principles. The remainder of the sections in the extract from the protocol give guidance on the factors that need to be considered; these correspond very closely to the sources of uncertainty identified in the EURACHEM Guide. The data from method development studies required by existing method validation protocols should therefore provide much of the information required to evaluate the uncertainty from consideration of the main factors influencing the result.

Table 1 Method development and uncertainty estimation

Method validation (1)
  1.3.2 Alternative approaches to optimisation
  (a) Conduct formal ruggedness testing for identification and control of critical variables.
  (b) Use Deming simplex optimisation to identify critical steps.
  (c) Conduct trials by changing one variable at a time.

Uncertainty estimation
  Having identified the possible sources, the next step is to make an approximate assessment of size of the contribution from each source, expressed as a standard deviation. Each of these separate contributions is called an uncertainty component.
  Some of these components can be estimated from a series of repeated observations, by calculating the familiar statistically estimated standard deviation, or by means of subsidiary experiments which are carried out to assess the size of the component. For example, the effect of temperature can be investigated by making measurements at different temperatures. This experimental determination is referred to in the ISO Guide as "Type A evaluation".

(1) Reprinted from The Journal of AOAC INTERNATIONAL (1989) 72(4):694-704. Copyright 1989, by AOAC INTERNATIONAL, Inc.

The possibility of relying on the results of a collaborative study to quantify the uncertainty has been considered [8], following from a general model of uncertainties arising from contributions associated with method bias, individual laboratory bias, and within- and between-batch variations. Collaborative trial is expected to randomise most of these contributions, with the exception of method bias. The latter would be addressed via combination of the uncertainties associated with a reference material or materials to which results are traceable with the statistical uncertainty associated with any estimation of bias using a finite number of observations. Note that the necessary investigation and reporting of bias and associated statistical uncertainty (i.e. excluding reference material uncertainty) are now recommended in existing collaborative study standards [7]. Where the method bias and its uncertainty are small, the overall uncertainty estimate is expected to be represented by the reproducibility standard deviation.

The approach has been referred to as a "top-down" view. The authors concluded that such an approach would be feasible given certain conditions, but noted that demonstrating that the estimate was valid for a particular laboratory required appropriate internal quality control and assurance. Clearly, the controls required would relate particularly to the principal factors affecting the result. In terms of ISO principles, this requirement corresponds to control of the main contributions to uncertainty; in method development and validation terms, the requirement is that factors found to be significant in robustness testing are controlled within limits set, while factors not found individually significant remain within tolerable ranges. In either case, where the control limits on the main contributing factors, together with their influence on the result, are known to an individual laboratory, the laboratory can both check that its performance is represented by that observed in the collaborative trial and straightforwardly provide an estimate of uncertainty following ISO principles.

The step-by-step approach recommended in the ISO Guide and the "top down" approach have been seen as alternative and substantially different ways of evaluating uncertainty, but the comparison between method development protocols and ISO approach above shows that they are more similar than appears at first sight. In particular, both require a careful consideration and study of the main effects on the result to obtain robust results accounting properly for each contribution to overall uncertainty. However, the top down approach relies on that study being carried out during method development; to make use of the data in ISO GUM estimations, the detailed data from the study must be available.

Availability of validation data

Unfortunately, the necessary data are seldom readily available to users of analytical methods. The results of the ruggedness studies and the within-laboratory op-

Table 2 Method performance and measurement uncertainty estimation. Note that the text is paraphrased for brevity and the numbers in parentheses refer to corresponding items in the EURACHEM guide (column 2)

Method validation protocol (1)
  1.4 Develop within-laboratory attributes of the optimised method (Some items can be omitted; others can be combined.)
  1.4.1 Determine [instrument] calibration function ... to determine useful measurement range of method. (8, 9)
  1.4.2 Determine analytical function (response vs concentration in matrix ...). (9)
  1.4.3 Test for interference (specificity):
    (a) Test effects of impurities ... and other components expected ... (5)
    (b) Test non-specific effects of matrices. (3)
    (c) Test effects of transformation products ... (3)
  1.4.4 Conduct bias (systematic error) testing by measuring recoveries ... (Not necessary when method itself defines the property or component.) (3, 10, 11)
  1.4.5 Develop performance specifications ... and suitability tests ... to ensure satisfactory performance of critical steps ... (8)
  1.4.6 Conduct precision testing ... [including] ... both between-run (between-batch) and within-run (within-batch) variability. (4, 6, 7, 8, 12)
  1.4.7 Delineate the range of applicability to the matrices or commodities of interest. (1)
  1.4.8 Compare the results of the application of the method with existing tested methods intended for the same purposes, if other methods are available.
  1.4.9 If any of the preliminary estimates of the relevant performance of these characteristics are unacceptable, revise the method to improve them, and retest as necessary.
  1.4.10 Have method tried by analyst not involved in its development. Revise method to handle questions raised and problems encountered.

EURACHEM guide
  The evaluation of uncertainty requires a detailed examination of the measurement procedure. The first step is to identify possible sources of uncertainty. Typical sources are:
  1. Incomplete definition of the measurand (for example, failing to specify the exact form of the analyte being determined).
  2. Sampling - the sample measured may not represent the defined measurand.
  3. Incomplete extraction and/or pre-concentration of the measurand, contamination of the measurement sample, interferences and matrix effects.
  4. Inadequate knowledge of the effects of environmental conditions on the measurement procedure or imperfect measurement of environmental conditions.
  5. Cross-contamination or contamination of reagents or blanks.
  6. Personal bias in reading analogue instruments.
  7. Uncertainty of weights and volumetric equipment.
  8. Instrument resolution or discrimination threshold.
  9. Values assigned to measurement standards and reference materials.
  10. Values of constants and other parameters obtained from external sources and used in the data reduction algorithm.
  11. Approximations and assumptions incorporated in the measurement method and procedure.
  12. Variations in repeated observations of the measurand under apparently identical conditions.

(1) Reprinted from The Journal of AOAC INTERNATIONAL (1989) 72(4):694-704. Copyright 1989, by AOAC INTERNATIONAL, Inc.

timisation of the method are, perhaps owing to their strong association with the development process rather than end use of the method, rarely published in sufficient detail for them to be utilised in the evaluation of uncertainty. Further, the range of critical parameter values actually used by participants is not available, leading to the possibility that the effect of permitted variations in materials and the critical parameters will not be fully reflected in the reproducibility data. Finally, bias information collected prior to collaborative study has rarely been reported in detail (though overall bias investigation is now included in ISO 5725 [7]), and the full uncertainty on the bias is very rarely evaluated; it is often overlooked that, even when investigation of bias indicates that the bias is not significant, there will be an uncertainty associated with taking the bias to be zero [9], and it remains important to report the uncertainty associated with the reference material value.

Recommendations

The results of the ruggedness testing and bias evaluation should be published in full. This report should identify the critical parameters, including the materials within the scope of the method, and detail the effect of variations in these on the final result. It should also include the values and relevant uncertainties associated with bias estimations, including both statistical and reference material uncertainties. Since it is a requirement of the validation procedure that this information should be available before carrying out the collaborative study, publishing it would add little to the cost of validating the method and would provide valuable information for future users of the method.

In addition, the actual ranges of the critical parameters utilised in the trial should be collated and included in the report so that it is possible to determine their effect on the reproducibility. These parameters will have been recorded by the participating laboratories, who normally provide reports to trial co-ordinators; it should therefore be possible to include them in the final report.

Of course there will frequently be additional sources of uncertainty that have to be examined by individual laboratories, but providing this information from the validation study would considerably reduce the work involved.

Acknowledgements The preparation of this paper was supported under contract with the Department of Trade and Industry as part of the National Measurement System Valid Analytical Measurement (VAM) Programme [10].

References

1. Sargent M (1995) Anal Proc 32:201-202
2. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva, Switzerland, ISBN 92-67-10188-9
3. EURACHEM (1995) Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist, London. ISBN 0-948926-08-2
4. Williams A (1996) Accred Qual Assur 1:14-17
5. Horwitz W (1995) Pure Appl Chem 67:331-343
6. AOAC recommendation (1989) J Assoc Off Anal Chem 72:694-704
7. ISO (1994) ISO 5725:1994 Precision of test methods. ISO, Geneva
8. Analytical Methods Committee (1995) Analyst 120:2303-2308
9. Ellison SLR, Williams A (1996) In: Parkany M (ed) The use of recovery factors in trace analysis. Royal Society of Chemistry, London
10. Fleming J (1995) Anal Proc 32:31-32
Accred Qual Assur (1997) 2:180-185
© Springer-Verlag 1997

Ilya Kuselman · Avinoam Shenhar

Uncertainty in chemical analysis and validation of the analytical method: acid value determination in oils

Abstract Quantifying uncertainty in chemical analysis, according to EURACHEM document (1995), is based on known relationships between parameters of the analytical procedure and corresponding results of the analysis. This deterministic concept is different from the cybernetic approach to analytical method validation, where the whole analytical procedure is a "black box". In the latter case, analytical results only are the basis for statistical characterization of the method without any direct relationship with intermediate measurement results like weighings, volumes, instrument readings, or other parameters like molecular masses. This difference requires the harmonization of parameters to be validated and to be included in the uncertainty calculation. As an example, results of the uncertainty calculation and validation are discussed for a new method of acid value determination in oils by pH measurement without titration.

Key words Uncertainty of measurements · Analytical method validation · Acid value determination · Oils · pH measurement

I. Kuselman (corresponding author) · A. Shenhar
The National Physical Laboratory of Israel, Danciger A Building, Givat Ram, Jerusalem 91904, Israel
Tel.: +972-2-0530534; Fax: +972-2-0520797; e-mail: freddy@vms.huji.ac.il

Introduction

The application to chemistry of the ISO Guide to the Expression of Uncertainty in Measurement [1] was issued by EURACHEM [2] in 1995. The EURACHEM document preserved the basic deterministic concept of the Guide [1] that the relationship between intermediate measurement results or parameters and the value of the measurand can be simple or complex, but that each value is determined by a combination of these parameters only. For example, for the measurement of a concentration $C_e = m_e/V_e$, where $m_e$ is mass and $V_e$ is volume, the intermediate parameters are $m_e$ and $V_e$. Therefore, for the calculation of the uncertainty in $C_e$ it is enough to know the uncertainties in $m_e$ and $V_e$. It is recommended to break down more complex relationships into a combination of simpler ones. In general, the uncertainty estimation process according to [2] is simple and includes (1) specification, i.e. a statement of what is being measured, including the relationship between the measurand and the parameters, (2) identification of uncertainty sources for each parameter in this relationship, (3) quantifying uncertainty components associated with each potential source of uncertainty identified, and (4) calculation of total uncertainty as a combination of the quantified uncertainty components [3].

Uncertainty components can be estimated from relevant previous information (for example, the tolerance of volumetric glassware as provided by the manufacturer's catalogue or a calibration certificate), from special experimental work (test of some parameters), or by using the judgement of an analyst based on his experience [2]. Thus, total uncertainty evaluation is possible without any analytical experiment, i.e. from theoretical analysis with "pen and paper" only. Naturally, the result of this evaluation depends on the model of the measurement defined by the specification of the analytical procedure, identified uncertainty sources, and "degree of belief" in the judgement of the analyst and others as in all theoretical results in science.

The principles of the analytical methods validation [4-7], in contrast to those described above, are based on the cybernetic approach, which considers the whole analytical procedure as a "black box". In this case, only the results of analysis can serve as the data for statistical characterization of the analytical method without any direct relationship with intermediate measurement results such as weighings, volumes, instrument readings, or other parameters such as molecular masses. Moreover, from the Horwitz function [8] it follows that the standard deviation of analytical results arising from random errors is practically independent of the specification of the analytical method.

The characteristics of the method used for the validation (validation parameters) such as repeatability, reproducibility and accuracy are certainly correlated with the uncertainty of the analytical results. In particular, the combined uncertainty arising from random effects cannot be less than the repeatability [2].

Obviously, the use of the uncertainty calculation of ref. [2] for the definition of the quality of analytical data should be harmonized with the concepts and practices of the method validation as well as quality control, proficiency testing, and certification of reference materials [9].

In the present paper, the results of the uncertainty calculation and validation are discussed for a new method of acid value determination in oils by pH measurement without titration developed in our laboratory [10].

The calculation of the uncertainty in the method by pH measurement

In this method [10], an oil sample is introduced into the reagent consisting of triethanolamine dissolved in a mixture of water and isopropanol. First, a conditional pH ($pH_1$) in the reagent-oil system is measured (1) and then a second pH ($pH_2$) is measured after the addition of standard acid (for example, HCl) to the system. Acid value is calculated according to the following formula:

$$AV = \frac{M_{KOH}\, C_{st}\, V_{st}}{(10^{\Delta pH} - 1)\, m} \quad \text{(mg KOH/g oil)} \qquad (1)$$

where $M_{KOH}$ is the molecular mass of KOH (g/mol), $C_{st}$ is the concentration of the standard acid (mol/L), $V_{st}$ is the volume of the standard acid added (mL), $\Delta pH = pH_1 - pH_2$, and $m$ is the mass of the oil sample (g).

(1) Because the measurements of pH are performed with an aqueous reference electrode calibrated by aqueous buffer solutions, the results of measurements are conditional [10].

The main AV uncertainty components discussed below are presented in Table 1.

Table 1 The main AV uncertainty components in the new method [10] of AV determination in oils by pH measurement without titration

Symbol of source   Value          Uncertainty component
                                  Standard deviation   Relative standard deviation
C_st               0.5 M          0.00022 M            0.00045
m                  0.1-40 g       0.000087 g           0.00087-0.0000022
V_st               0.05-0.4 mL    0.0005-0.004 mL      0.01
ΔpH                0.25-0.35      0.014                0.056-0.040
M_KOH              56.1056        0.0023               0.000041

Preparation of the standard acid (HCl) solution

A Titrisol (Merck, Germany) solution of HCl containing $m_{HCl} = 18.230$ g HCl is used to prepare $C_{st} = 0.5$ M HCl of volume $V = 1000$ mL. The volumetric flask used for the solution preparation has the volume 1000 ± 0.4 mL at 20 °C (DIN A, Superior, Germany). The appropriate standard deviation of the calibrated volume (a rectangular distribution [1, 2]) is $0.4/\sqrt{3} = 0.23$ mL. Since the difference between the actual temperature and the flask calibration temperature is ±3 °C (with 95% confidence), at volume coefficient of water expansion $2.1 \times 10^{-4}$/°C, the possible volume variation is $1000 \times 3 \times 2.1 \times 10^{-4} = 0.63$ mL, and the corresponding standard deviation is $0.63/1.96 = 0.32$ mL. The standard deviation of the flask filling is less than 1/3 of the standard deviations for calibration and temperature variations (mentioned above) and is thus negligible. Combining the two contributions of the uncertainty $u(V)$ we have $u(V)/V = \sqrt{0.23^2 + 0.32^2}/1000 = 0.00039$.

The concentration of HCl is $C_{st} = m_{HCl}/(M_{HCl} V)$, where $M_{HCl}$ is the molecular mass of HCl. The manufacturer of the HCl solution indicates a possible deviation of its titer of 0.02%/°C. Taking a possible temperature variation in the manufacturer's laboratory of ±2 °C (with 95% confidence), the standard uncertainty of $m_{HCl}$ is $u(m_{HCl}) = 18.230 \times 0.02 \times 2/(100 \times 1.96) = 0.004$ g, and $u(m_{HCl})/m_{HCl} = 0.00022$.

The standard uncertainty of the molecular mass of HCl, according to IUPAC atomic masses and rectangular distribution [2], is $u(M_{HCl}) = 0.000043$.

Since $u(M_{HCl})/M_{HCl}$ is negligible in comparison with $u(V)/V$ and $u(m_{HCl})/m_{HCl}$, the relative standard uncertainty is $u(C_{st})/C_{st} = \sqrt{0.00039^2 + 0.00022^2} = 0.00045$.

Weighing and transfer of an oil sample to the reagent

The final mass of an oil sample is the difference in mass between a beaker with the sample and the empty beaker (after transfer of oil to the reagent). In the range up to 50 g, by analogy with ref. [2], $u(m) = 0.000087$ g. Since for different AVs the recommended sample mass is from 0.1 to 40 g, $u(m)/m$ values are from 0.00087 to 0.0000022.

Measurement of pH1

After mixing the system "oil-reagent", $pH_1$ is measured with a pH meter pHM 95 (Radiometer, Denmark), and the standard uncertainty of pH reading $u(pH) = 0.01$.

Addition of HCl to the "oil-reagent" system

A recommended standard addition for samples with different AV is 0.05-0.4 mL of 0.5 M HCl; this volume should be negligible in comparison with the volume of the reagent, ~50 mL. For transfer of the acid to the "oil-reagent" system, a mechanical hand pipette (Gilson, France) was used with a relative standard uncertainty of $u(V_{st})/V_{st} = 0.01$ according to the manufacturer's information.

Measurement of pH2 and calculation of ΔpH

After mixing the system with HCl added, $pH_2$ is measured under the same conditions as those for $pH_1$ (both measurements performed within 2-3 min) with the same uncertainty of pH reading. The expanded uncertainty of the pH measurements can reach [11] 0.05 or even [12] 0.1, but in our case the standard uncertainty of the difference between the two measurements is caused only by repeatability factors. Thus, only the uncertainty of reading is important, and $u(\Delta pH) = \sqrt{2} \times u(pH) = 0.014$, which is less than would be expected from [11] and [12].

Calculation of acid value

The acid value calculation is performed using Eq. 1. In this equation, all sources of uncertainty were described by us with the exception of $M_{KOH}$. The relative standard uncertainty of the molecular mass of KOH, according to IUPAC data and rectangular distribution, is $u(M_{KOH})/M_{KOH} = 0.000041$, which is less than 1/3 of the standard uncertainty of any component in Eq. 1, for example $u(C_{st})/C_{st} = 0.00045$. In their turn, $u(C_{st})/C_{st}$ and $u(m)/m$ are negligible in comparison with the relative standard uncertainty in the standard acid addition $u(V_{st})/V_{st} = 0.01$. The latter is also a negligible component of the uncertainty of AV determination, since pH measurement is the dominant source of $u(AV)/AV$ (see Table 1). Therefore, after the logarithmic differentiation of Eq. 1 only the following remains valid:

$$\frac{u(AV)}{AV} = \frac{u(10^{\Delta pH} - 1)}{10^{\Delta pH} - 1} = \frac{10^{\Delta pH} \times 2.30 \times u(\Delta pH)}{10^{\Delta pH} - 1} \qquad (2)$$

Therefore

$$\frac{u(AV)}{AV} = \frac{0.032}{1 - 1/10^{\Delta pH}} \qquad (3)$$

From the relationship of Eq. 3 illustrated in Fig. 1 for the ΔpH range 0.1-1.0, it is clear that ΔpH < 0.2 leads to an essential increase in the AV uncertainty. At ΔpH > 0.4, the amount of standard acid added may exceed 3 times the sum of the free fatty acids in the oil sample. This acid addition may cause (1) $pH_2$ to deviate from the linear range of pH versus AV or (2) a significant change in the concentration of the free form of triethanolamine in the reagent, which is inadmissible [10]. Therefore the recommended ΔpH range is 0.25-0.35, and corresponding values of $u(AV)/AV$ are 0.07-0.06. The expanded uncertainty $U(AV)/AV = k\,u(AV)/AV$ is 0.14-0.12, coverage factor $k$ being 2. This uncertainty is higher than in ref. [13], where some additional simplifications were made.

Note, the interference of atmospheric CO2 (due to the reaction with triethanolamine) is not taken into account.

Since there are no general criteria for evaluation of expanded uncertainty values, it is worthwhile to compare the values obtained with corresponding ones for the standard titrimetric method of AV determination in oils [14].

Fig. 1 Dependence of AV uncertainty on ΔpH
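The dependence plotted in Fig. 1 can be tabulated directly from Eq. 3. The following is a minimal sketch, assuming u(pH) = 0.01 for each reading as stated in the text; the function names are illustrative and not taken from the paper. Because ln 10 and the square root of 2 are kept unrounded here, the values agree with the quoted 0.07-0.06 only after rounding.

```python
import math

U_PH_READING = 0.01  # standard uncertainty of a single pH reading (from the text)


def u_delta_ph(u_ph=U_PH_READING):
    """Standard uncertainty of the difference of two independent pH readings."""
    return math.sqrt(2) * u_ph


def relative_u_av(delta_ph, u_dph=None):
    """Relative standard uncertainty of AV according to Eq. 3.

    Only the pH contribution is kept, as in the text; ln(10) is the
    factor 2.30 that appears in Eq. 2.
    """
    if delta_ph <= 0:
        raise ValueError("delta_ph must be positive")
    if u_dph is None:
        u_dph = u_delta_ph()
    return math.log(10) * u_dph / (1 - 10 ** (-delta_ph))


K = 2  # coverage factor used in the text

print("dpH    u(AV)/AV   U(AV)/AV")
for dph in (0.10, 0.20, 0.25, 0.35, 0.40, 1.00):
    u = relative_u_av(dph)
    print(f"{dph:4.2f}   {u:8.3f}   {K * u:8.3f}")
```

Running the loop shows the steep rise of the relative uncertainty below ΔpH of about 0.2, which is the reason given in the text for the lower limit of the recommended range.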

The calculation of the uncertainty in the titrimetric method

The standard titrimetric method [14] consists of dissolution of an oil sample in a mixed solvent (diethyl ether + ethanol) and subsequent titration of the free fatty acids contained in the sample against ethanolic potassium hydroxide solution in the presence of phenolphthalein. The acid value is

$AV_t = M_{KOH} V_{KOH} C_{KOH} / m$   (4)

where $V_{KOH}$ is the volume (mL) of potassium hydroxide solution used, $C_{KOH}$ is the concentration (mol/L) of this solution, $M_{KOH}$ is the molecular mass of KOH, and $m$ is the mass (g) of the oil sample. By analogy with the uncertainty calculation for titrimetry in ref. [2], the following enlarged steps of the analysis are examined (the corresponding main uncertainty components are given in Table 2).

Determination of $C_{KOH}$

The exact concentration of the ethanolic potassium hydroxide solution (0.1 or 0.5 mol/L [14]) is established before its use by titration against the standard HCl solution. Therefore, $C_{KOH} = C_{st} V'_{st} / V'_{KOH}$, where $V'_{st}$ is the volume (mL) of the standard HCl solution used for titration of the volume $V'_{KOH}$ (mL) of the KOH solution.

As shown above, $u(C_{st})/C_{st} = 0.00045$.

For transfer of an aliquot of the KOH solution to the titration vessel, a glass pipette of volume 5 ± 0.01 mL (Bein Z. M., Israel) is used. Taking a possible temperature variation of ±3 °C with 95% confidence and repeatability of filling the pipette (standard deviation) 0.0033 mL, one can calculate $u(V'_{KOH})/V'_{KOH} = 0.0015$.

The titration is accomplished using a 5-mL microburette graduated in 0.01-mL divisions (Bein Z. M., Israel; calibration accuracy of ±0.01 mL), as the burette recommended in the standard [14], having a capacity of 10 mL and graduated in 0.1-mL divisions, is not suitable for a low AV. The possible temperature variation is the same as that mentioned above, the standard deviation of filling is 0.0033 mL, and the standard deviation of end point detection arising due to the drop size of the burette (0.017 mL) is 0.0098 mL. Thus, the maximum value of $u(V'_{st})/V'_{st}$ may be 0.013 if $C_{KOH} \approx 0.1$ mol/L, and the corresponding $V'_{st} \approx 1$ mL.

The uncertainties $u(C_{st})/C_{st}$ and $u(V'_{KOH})/V'_{KOH}$ are negligible in comparison to $u(V'_{st})/V'_{st}$; therefore $u(C_{KOH})/C_{KOH} = u(V'_{st})/V'_{st} = 0.013$.

AV determination

The standard [14] recommends the use in Eq. 4 of the rounded KOH molecular mass 56.1 instead of the complete value $M_{KOH} = 56.10564$. Hence, in this case $u(M_{KOH})/M_{KOH} = 0.00564/(\sqrt{3} \times 56.1) = 0.00006$.

For the free fatty acids titration against KOH, we used the 5-mL burette described earlier; therefore $u(V_{KOH})/V_{KOH} = u(V'_{st})/V'_{st} = u(C_{KOH})/C_{KOH} = 0.013$.

The uncertainty of oil sample weighing permitted by the standard [14] can be calculated from the table of accuracy versus mass $m$. The maximum value of the ratio (accuracy/mass) allowed in [14] for $m = 2.5$ g is 0.01/2.5 = 0.004. Using a rectangular distribution, we have $u(m)/m = 0.004/\sqrt{3} = 0.0023$.

It is clear that the uncertainties of the molecular mass of KOH and of oil sample weighing are negligible here, just as in the previous method based on pH measurement. After logarithmic differentiation of Eq. 4 only $u(V_{KOH})/V_{KOH}$ and $u(C_{KOH})/C_{KOH}$ remain significant (see Table 2), and $u(AV_t)/AV_t = 0.018$. The expanded uncertainty is $U(AV_t)/AV_t = k \, u(AV_t)/AV_t = 0.04$, using coverage factor $k = 2$.

Note that the detection of the end point of the titration is a dominant source of uncertainty. If, for example, the commercial burette used has a drop size of 0.043 mL, the expanded uncertainty will increase to 0.07. Moreover, the color of the oils and the possible change in the indicator behavior near the end point in the oil-solvent mixture (in comparison to water) are not taken into consideration. The same relates also to the influence of atmospheric CO2 on $C_{KOH}$.

If we accept that the uncertainty in the standard titrimetric method (0.04) is one third of that in the new method by pH measurement (0.12-0.14), the titrimetric method can be used as a source of "true" values for the validation of the new method.
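The budget for the titrimetric method can be retraced numerically. The following is a minimal sketch, assuming that all components are independent and combine in quadrature as relative standard uncertainties; the function and variable names are illustrative and are not taken from the paper.

```python
import math


def combine_relative(*components):
    """Combine independent relative standard uncertainties in quadrature."""
    return math.sqrt(sum(c ** 2 for c in components))


# Relative standard uncertainties quoted in the text.
u_c_st = 0.00045       # standard HCl concentration, u(Cst)/Cst
u_v_pipette = 0.0015   # 5-mL pipette aliquot of KOH, u(V'KOH)/V'KOH
u_v_burette = 0.013    # burette volume near 1 mL, u(V'st)/V'st

# Standardization of the KOH solution: CKOH = Cst * V'st / V'KOH.
u_c_koh = combine_relative(u_c_st, u_v_pipette, u_v_burette)

# Acid value with only the two dominant terms kept, as in the text.
u_av_dominant = combine_relative(0.013, 0.013)

# Same budget with the weighing and molecular-mass terms retained.
u_m = 0.004 / math.sqrt(3)                 # u(m)/m
u_mw = 0.00564 / (math.sqrt(3) * 56.1)     # u(MKOH)/MKOH
u_av_full = combine_relative(u_c_koh, 0.013, u_m, u_mw)

k = 2  # coverage factor
print(f"u(CKOH)/CKOH           = {u_c_koh:.4f}")
print(f"u(AVt)/AVt (dominant)  = {u_av_dominant:.4f}")
print(f"u(AVt)/AVt (all terms) = {u_av_full:.4f}")
print(f"U(AVt)/AVt, k = 2      = {k * u_av_dominant:.2f}")
```

Keeping the two small terms changes the combined value only in the fourth decimal place, which agrees with their being treated as negligible.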

Table 2 The main AV uncertainty components in the standard titrimetric method [14] of AV determination in oils

Symbol of source   Value    Standard deviation   Relative standard deviation
C_KOH              0.1 M    0.0013 M             0.013
V_KOH              1 mL     0.013 mL             0.013
m                  2.5 g    0.0057 g             0.0023
M_KOH              56.1     0.0033               0.00006
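The two smaller entries of Table 2 follow from treating a stated tolerance as the half-width of a rectangular distribution. A short sketch of that conversion, with an illustrative helper name that does not appear in the paper:

```python
import math


def rectangular_std(half_width):
    """Standard uncertainty of a rectangular distribution of given half-width."""
    return half_width / math.sqrt(3)


# Weighing: allowed accuracy/mass ratio of 0.01 g on 2.5 g.
u_m_rel = rectangular_std(0.01 / 2.5)
u_m_abs = u_m_rel * 2.5

# Molecular mass: rounding 56.10564 to 56.1.
u_mw_abs = rectangular_std(56.10564 - 56.1)
u_mw_rel = u_mw_abs / 56.1

# The end-point value of 0.0098 mL quoted in the text is numerically
# consistent with the same rule applied to the 0.017 mL drop size.
u_drop = rectangular_std(0.017)

print(f"u(m)/m = {u_m_rel:.4f}, u(m) = {u_m_abs:.4f} g")
print(f"u(MKOH) = {u_mw_abs:.4f}, u(MKOH)/MKOH = {u_mw_rel:.5f}")
print(f"u(end point) = {u_drop:.4f} mL")
```

The absolute weighing entry comes out marginally above the tabulated 0.0057 g, which corresponds to multiplying the already rounded 0.0023 by 2.5 g.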
Uncertainty in chemical analysis and validation of the analytical method: acid value determination in oils 129

Validation of the method by pH measurement

For the validation, five kinds of vegetable oils with three different AV values were used (Table 3). Oils with the two higher AV values were prepared by adding known amounts of oleic acid to the initially purchased oils (with minimal AV). These oils were analyzed by the method being validated: four replicates for each sample daily, during a period of five days [15]. Results of the AV determination by the standard titration method (average from ten replicates) were used as "true" or assigned [9] values. Thus, the whole experiment consisted of 5 × 3 × 4 × 5 = 300 AV determinations by the validated method and 5 × 3 × 10 = 150 determinations by the standard method. Corresponding statistical data are given in Table 3.

Table 3 Evaluation of some validation parameters for the new method [10] of AV determination in oils by pH measurement without titration

Oil         "True" AV   S1      S2      Bias     t0.95·S2
Sunflower   24.8        0.022   0.016   -0.010   0.045
            1.51        0.025   0.008   -0.021   0.023
            0.055       0.042   0.041   -0.053   0.112
Soya        24.9        0.024   0.014   -0.019   0.039
            1.60        0.025   0.017   -0.032   0.047
            0.107       0.036   0.042   -0.082   0.117
Maize       23.4        0.014   0.012   -0.006   0.034
            1.57        0.016   0.011   -0.016   [illegible]
            0.096       0.047   0.014   -0.022   0.038
Canola      22.4        0.023   0.012   -0.023   0.033
            1.57        0.029   0.009   -0.010   0.025
            0.063       0.026   0.034   -0.066   0.090
Olive       23.7        0.013   0.003   -0.004   0.009
            6.62        0.019   0.004   -0.008   0.012
            0.579       0.016   0.016   -0.029   0.045

Average values of the relative standard deviation $S_1$ of replicates (within a day - repeatability) and values of the daily relative standard deviation $S_2$ (within a week - reproducibility or intermediate precision [7]) for all the samples satisfy Horwitz's criterion [4]. The relative bias of the average result by pH measurement for each oil with respect to the corresponding "true" value is less than $t_{0.95} S_2$ ($t_{0.95}$ is Student's coefficient at the 95% level of confidence), and consequently satisfies Student's criterion.

Discussion

Comparing the combined uncertainty arising from random effects [$u(AV)/AV = 0.06$-$0.07$] with the $S_1$ values, one can see that it is not less than the repeatability, as is required in ref. [2]. Also, from Table 3 it follows that for all oil samples $S_2 < u(AV)/AV$, i.e. the combined uncertainty is not less than the reproducibility (intermediate precision) either.

Bias values characterizing the accuracy [4] of the method (its trueness [16]) are less than the expanded uncertainty $U(AV)/AV = 0.12$-$0.14$ even for minimal AV, where the bias values are naturally the highest.

The uncertainty of the AV determination evaluated from the data collected in Table 3 by the scheme proposed in ref. [9] (as a root of the sum of the variances caused by $S_1$, $S_2$ and the bias) is also less than $U(AV)/AV$, the maximum value obtained being 0.10. This value may be higher when the evaluation is complete, since at present we still have no estimation of interlaboratory deviations in the AV determination, i.e. reproducibility of the method.

As shown above by statistical criteria, the values of the validation parameters can be accepted as satisfactory. Their comparison with the results of the uncertainty calculation according to ref. [2] allows us only to be sure that assumptions made during this calculation (see, for example, the notes) were admissible.

On the other hand, the comparison of the uncertainties of the new method by pH measurement with the standard titrimetric method clarified the possibility of using the latter as a source of "true" values in the validation process. Moreover, although judgement on the acceptability of the analytical method is based today on the final validation report [7], the uncertainty quantified by ref. [2] may be useful at an earlier stage, before experiments are carried out.

For our example, the advantages of the new method based on pH measurement are simplicity, rapidity, low-cost instruments, and suitability for automation [10]. It can be applied for on-line quality control of oils and regulation of the extraction (from oil seeds) and refining processes. Its expanded uncertainty is satisfactory for practical purposes because there are no two species or grades of oil with AV values differing by less than 12-14%.

Thus, the quantification of uncertainty by ref. [2] is a suitable instrument for planning or forecasting method applications. Therefore, relationships between the uncertainty and validation parameters should be analyzed in all possible aspects.

Acknowledgements The authors express their gratitude to Prof. Ya. I. Tur'yan, Prof. E. Schoenberger and Dr. O. Yu. Berezin for helpful discussions.
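The two checks used in the validation and discussion (Student's criterion on the bias, and the root-sum-of-squares estimate by the scheme of ref. [9]) can be retraced from Table 3. The sketch below is illustrative only: it uses a subset of rows whose printed values are clearly legible, and the function names are assumptions rather than anything defined in the paper.

```python
import math

# (oil, "true" AV, S1, S2, bias, t0.95*S2): a subset of Table 3 rows.
ROWS = [
    ("Soya", 24.9, 0.024, 0.014, -0.019, 0.039),
    ("Soya", 0.107, 0.036, 0.042, -0.082, 0.117),
    ("Maize", 23.4, 0.014, 0.012, -0.006, 0.034),
    ("Maize", 0.096, 0.047, 0.014, -0.022, 0.038),
    ("Canola", 1.57, 0.029, 0.009, -0.010, 0.025),
    ("Canola", 0.063, 0.026, 0.034, -0.066, 0.090),
]


def passes_student(bias, t_s2):
    """Student's criterion as stated in the text: |bias| < t0.95 * S2."""
    return abs(bias) < t_s2


def uncertainty_from_validation(s1, s2, bias):
    """Root of the sum of the variances caused by S1, S2 and the bias."""
    return math.sqrt(s1 ** 2 + s2 ** 2 + bias ** 2)


all_pass = all(passes_student(bias, t_s2) for *_, bias, t_s2 in ROWS)
estimates = [uncertainty_from_validation(s1, s2, bias)
             for _, _, s1, s2, bias, _ in ROWS]
largest = max(estimates)

print("Student's criterion satisfied for all rows:", all_pass)
print(f"largest validation-based estimate: {largest:.3f}")
```

For these rows the largest estimate occurs for the soya oil with the lowest AV and stays below the expanded uncertainty of 0.12-0.14.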

References

1. ISO (1993) Guide to the expression of uncertainty in measurement, 1st edn. ISBN 92-67-10188-9, Geneva
2. EURACHEM (1995) Quantifying uncertainty in analytical measurement, 1st edn. ISBN 0-948926-08-2, Teddington
3. Williams A (1996) Accred Qual Assur 1:14-17
4. AOAC (1993) AOAC Peer-verified methods program. Manual on policies and procedures, Gaithersburg
5. Accreditation for Chemical Laboratories. Guidance on the interpretation of the EN 45000 series of Standards and ISO/IEC Guide 25 (1993) WELAC Guidance Document No. WGD2/EURACHEM Guidance Document No. 1, 1st edn., Teddington
6. Hokanson GC (1994) Pharmaceut Technol 18:118-130; 92-100
7. Green JM (1996) Anal Chem 68:305A-309A
8. Boyer KW, Horwitz W, Albert R (1985) Anal Chem 57:454-459
9. Analytical Methods Committee (1995) Analyst:2303-2308
10. Tur'yan Ya I, Berezin OYu, Kuselman I, Shenhar A (1996) J Amer Oil Chem Soc 73:295-301
11. Danish Standard DS 287 (1978) Vandundersogelse pH (Water Analysis, pH), 2nd edn
12. Jensen H, Nielsen L (1994) Uncertainty of pH Measurements. Report of Danish Institute of Fundamental Metrology DFM-94-R24 on Nordtest-project No. 1194-94, Lyngby, Denmark
13. Tur'yan Ya I, Ruvinsky OE, Sharudina SYa (1991) J Anal Chem (in Russian):917-925
14. International Standard ISO 660 (1983) Animal and Vegetable Fats and Oils - Determination of Acid Value and of Acidity, 1st edn., Switzerland
15. Berezin OYu, Kogan L, Tur'yan Ya I, Kuselman I, Shenhar A (1996) The Proceedings of the Eleventh International Conference of the Israel Society for Quality, Nov. 19-21, Jerusalem, pp 530-538
16. Pocklington WD (1991) In: Rossell JB, Pritchard JLR (eds) Analysis of Oilseeds, Fats and Fatty Foods, Elsevier, London, pp 1-38
Accred Qual Assur (2002) 7: 182-188
DOI 10.1007/s00769-002-0447-1

© Springer-Verlag 2002

Paolo de Zorzi
Maria Belli
Sabrina Barbizzi
Sandro Menegon
Andrea Deluisa

A practical approach to assessment of sampling uncertainty

Presented at EUROLAB/EURACHEM Workshop "Sampling", 5-6 November 2001, Lisbon, Portugal

P. de Zorzi (✉) · M. Belli · S. Barbizzi
Agenzia Nazionale per la Protezione dell'Ambiente (ANPA) - Unità Interdipartimentale di Metrologia Ambientale, Via Vitaliano Brancati 48, 00144 Rome, Italy
e-mail: dezorzi@anpa.it
Tel.: +39-06-5007-2086/2952
Fax: +39-06-5007-2313

S. Menegon · A. Deluisa
Ente Regionale per lo Sviluppo Agricolo del Friuli Venezia-Giulia (ERSA), Via Sabbatini 5, 33050 Pozzuolo del Friuli (UD), Italy

Abstract The paper reports the approach followed in the SOILSAMP project, funded by the National Environmental Protection Agency (ANPA) of Italy. SOILSAMP is aimed at assessing uncertainties associated with soil sampling in agricultural, semi-natural, urban, and industrial environments. The uncertainty assessment is based on a bottom-up approach, according to the Guide to the Expression of Uncertainty in Measurement published by the International Organization for Standardization (ISO). A designated agricultural area, which has been characterized in terms of elemental spatial distribution, will be used in future as a reference site for soil sampling intercomparison exercises.

Keywords Soil sampling · Uncertainty · Reference sampling · Intercomparison · Sampling device

Introduction

Over the past few years, a large effort has been made to improve analytical techniques, laboratory practices and procedures, and reduce sources of uncertainty during laboratory operations. Measurement uncertainty is usually well characterized, understood and controlled by laboratory quality assurance and quality control procedures [1]. However, uncertainty associated with sampling and sample preparation has not yet been fully taken into consideration and the principles to assess uncertainty associated with this phase of environmental monitoring are rarely applied. ISO/IEC 17025 reports that sampling is a factor to be considered as a contributor to the total uncertainty of measurement [2].

Thompson states that: "there is an understandable lack of enthusiasm for rousing the sleeping dogs of sampling when there is a fair chance of being severely bitten" [3]. This statement appears justifiable because there is:

- An attitude of overlooking sampling, which is still circulating in the scientific community.
- A lack of specific guidance to quantify uncertainty associated with sampling; measurement uncertainty is generally considered only after the sample has been received in the laboratory for analysis.
- A difficulty in managing and transferring concepts linked to sampling uncertainty to practical problems (i.e. the assessment of the effect of the application of sewage sludge on agricultural land, or the classification of contaminated land).

Nevertheless, a few exercises and studies on soil sampling uncertainty evaluation have recently been carried

out. Studies on soil sampling have mainly addressed precision and bias [4-11]. Sampling intercomparison exercises have confirmed the contribution of different sampling protocols and devices [12] to the variability of the final analytical data. However, the best way of evaluating the combined measurement uncertainty, which includes sampling, is still under discussion. In the case of soil, there is no doubt that an assessment of the contribution of sampling operations on the overall measurement uncertainty is necessary to completely understand the meaning of the analytical results [13, 14].

On the basis of these considerations, the National Environmental Protection Agency of Italy (ANPA) has funded a project for the "Assessment of the uncertainty associated with the soil sampling in agricultural, semi natural, urban and contaminated environments (SOILSAMP)". The project covers a three-year period from 2001 to 2003 and involves collaboration with an Expert Advisory Group (EAG) composed of experts from national and international institutions. The following Institutions are represented in the SOILSAMP EAG:

- National Environmental Protection Agency - ANPA (Italy)
- International Union of Pure and Applied Chemistry - IUPAC (United States)
- International Union of Radioecology - IUR (Belgium)
- Netherlands Energy Research Foundation - ECN (The Netherlands)
- Ente Italiano Nazionale di Unificazione - UNI (Italy)
- Università Cattolica del Sacro Cuore di Piacenza, "Istituto di Chimica Agraria ed Ambientale - ICAA" (Italy)
- Università di Pisa, Area della Ricerca CNR "Istituto di Chimica del Terreno" (Italy)
- Università di Perugia, "Dipartimento di Scienze Agro-ambientali e della Produzione Vegetale - DiSAProV" (Italy)
- University of Barcelona, "Dipartimento Química Analítica" (Spain)
- University of Utrecht, Faculty of Geographical Science, "Utrecht Centre for Environment and Landscape Dynamic" (The Netherlands)
- Regional Environmental Protection Agencies - ARPA within the framework of the Centro Tematico Nazionale - Suoli e Siti Contaminati, CTN-SSC (Italy)
- Ente Regionale per lo Sviluppo Agricolo del Friuli Venezia-Giulia - ERSA (Italy)
- Dr. Herbert Muntau (Germany).

SOILSAMP is aimed at: i) the assessment of uncertainties associated with soil sampling in different environments, based on trace element concentration measurement in soil; ii) the characterization, in terms of trace element spatial variability, of a site to be qualified as a reference site for national and international intercomparison exercises.

This paper reports the methodological approach and a description of the first experimental activities performed in the framework of SOILSAMP, including a few preliminary considerations.

Methodological approach

The evaluation of uncertainty associated with sampling activities is based on a methodological approach including the identification of the different sources of uncertainty attributable to sampling procedures, the characterization of the sampling site (reference sampling) in terms of trace element spatial distribution and the intercomparison exercise.

Identification of uncertainty sources

The combined uncertainty of analytical results u(r) includes uncertainties associated with sampling u(s), sample reduction u(rd) and analysis u(a). In the following equation (Eq. 1) the relationship between the above reported uncertainties (combined standard uncertainty) is given:

$u(r) = \sqrt{u(s)^2 + u(rd)^2 + u(a)^2}$   (1)

The principles of the EURACHEM/CITAC Guide [15] indicate several steps to assess the uncertainty associated with an analytical process: a) specification of the measurand, b) identification of the uncertainty sources, c) quantification of the uncertainty components, d) calculation of the combined uncertainty.

The EURACHEM approach requires a clear definition of the measurand and a quantitative expression of the relations existing between the value of the measurand and the parameters affecting its value. The parameters have to be identified; they can be other measurands, quantities not measurable, or constants.

The first phase of the SOILSAMP project has been devoted to the identification of the significant sources of uncertainty linked to soil sampling. To this end, a cause-effects diagram has been used. The diagram (sometimes called fish-bone) easily shows the parameters considered and how they relate to each other. The fish-bone permits visualization of the different sources of uncertainty avoiding over-counting.

Characterization of the sampling sites (reference sampling)

Reference sampling is aimed at the characterization of the sampling site in terms of element spatial distribution:

it allows assessment of the element concentrations at any point of a field with known uncertainty.

In order to be used as a reference sampling site, the site first has to be characterized for long- and short-range spatial variation of trace element concentrations in the soil. The long-range spatial variation is assessed by subdividing the sample site into sub-areas of the same size. The same number of single soil samples is collected from each sub-area. The samples are then pooled to give a composite sample. The comparison of trace element concentrations between the composite soil samples allows evaluation of the long-range spatial variation.

The short-range spatial variation of trace element concentration in the soil is assessed by comparing the analytical results obtained from single soil samples collected from randomly selected sub-areas. The number of sub-areas to be considered for single sampling depends on the expected spatial variability of the trace elements considered.

The selection of the reference site must fulfil some minimal requirements, such as representative size, heterogeneity, easy access, and a suitable trace element gradient within the site.

Intercomparison sampling exercise

The intercomparison exercise is intended to assess the uncertainty component attributable to different sampling devices.

The trace element concentrations in soil samples collected at different point locations differ as a result of spatial variation, effects of soil sampling, sample reduction, and laboratory analyses. The final aim of the project is not the evaluation of spatial variability of trace elements, but the assessment of the contribution of sampling to the uncertainty associated with the analytical data. To this end, the spatial variation must be accounted for, and subsequently eliminated. The regionalized variable theory [16, 17] assumes that samples collected at locations close to each other are, on average, more similar than samples collected further away from each other. Accordingly, the spatial variation of an attribute is assumed to be the sum of three components: a) a structural component, having a constant mean or trend, b) a spatially correlated component, and c) an uncorrelated random noise. The spatially correlated and noise terms are encapsulated in an experimental variogram, plotting the experimental semi-variance as a function of sampling distance. The experimental semi-variance is estimated from the sample data and its value at zero distance is called the nugget. Theoretically, the semi-variance should be zero at zero distance, but short-distance variation and other sources of uncertainty produce a positive value of the semi-variance.

The nugget is an estimation of the spatially uncorrelated noise component mentioned above, including variance due to sampling, measurement and other unexplained spatially uncorrelated sources of variance.

Trace element determination

Trace element measurement in all samples is carried out using instrumental neutron activation analysis (INAA). This technique achieves high precision levels and requires little or no sample processing prior to analysis. This analytical technique also eliminates uncertainty associated with sample processing [18-21]. To rule out variability caused by different analytical laboratories, a single laboratory, following a predefined analytical protocol, performs all the analyses.

Field sampling exercise

The SOILSAMP project foresees the evaluation of the sampling uncertainty in four different environments: agricultural, semi-natural, urban and contaminated sites.

The agricultural site (10,000 m²) has a regular shape and is characterized by the presence of three sub-areas with different gravel content. These two conditions comply with the pre-requisites of representative size and of a structural heterogeneity of the soil. The site is a research field belonging to a public scientific institution which is easily accessible at any time, and where any accidental or unauthorized use can be prevented (i.e. spreading of unknown substances, transit of vehicles such as tractors). The present and past land use of this site are known.

Considering that, generally, agricultural fields are not characterized by high spatial variability of trace elements, a spot-wise addition of fertilizer was performed, to produce a well-marked analyte gradient within the test site. The fertilizer containing 46% P2O5 was added manually to two triangular-shaped areas, of about 50 m² each. The quantity of fertilizer was sufficient to increase the concentration of phosphate in the first 5 cm of the top soil by about one order of magnitude.

The agricultural test site has been divided into 10 × 10 m sub-areas. Figures 1 and 2 report the grid sampling points selected in the sampling exercises.

To assess the long-range spatial variation of trace elements in each sub-area, samples were taken using an Edelman auger (20 cm length, 7 cm diameter) at a 2 m distance from each other, after removing any surface vegetation, resulting in 25 soil samples. These samples were pooled and processed to give one composite sample. The sampling device was cleaned after sampling each sub-area.

To assess the short-range spatial variation of trace elements, on the hypothesis that the spatial variability of trace elements is comparable between the different sub-areas, only 2 sub-areas (one where P2O5 had been added)

Fig. 1 Scheme of reference sampling. The site (10,000 m²) was divided into 100 squares of 10 × 10 m. 100 composite samples, each obtained by pooling 25 increments of the same square, were collected. 50 single samples were also collected (25 samples per 2 squares). Legend: composite samples; single samples; phosphate added. Scale bar in meters.

were sampled again. The resulting 25 samples per sub-area will be analyzed separately to explore the within-sub-area variability.

Three different devices, commonly used for sampling in agricultural fields, were used for the intercomparison exercise: an Edelman auger (20 cm length, 7 cm diameter), a gouge auger (20 cm length, 3 cm diameter) and a shovel. The devices were cleaned after each sampling.

Sample preparation

The samples were weighed (wet weight) and the data recorded. The samples were then stored in cartons before being dried in a fan oven at 35-40 °C for several days (to constant weight). They were then disaggregated with a wooden pestle and passed through an automatic, rotating stainless steel sieve (2 mm mesh). The fraction above

Fig. 2 Representation of the intercomparison exercise comparing different sampling devices (Edelman auger and shovel). Legend: auger samples; shovel samples; phosphate added. Scale bar in meters.

2 mm was removed and was not considered in the analytical phases.

The samples sieved at 2 mm were reduced in order to obtain the laboratory samples. The reduction phase was carried out to obtain samples representative of the soil collected but having a reduced size to make them more manageable in the laboratory. The reduction phase began with coning and quartering of the sample sieved at 2 mm and ended with the reduction by a riffle divider.

Preliminary considerations

By applying the EURACHEM principles a general cause-effect diagram (fish-bone) of the sampling phase

Fig. 3 Preliminary cause-effect diagram for the sampling phase based on the EURACHEM/CITAC approach

Fig. 4 Aggregated cause-effect diagram for the sampling phase in the agricultural SOILSAMP experimental design. The dashed-line (sample reduction) and the dotted-line (other sampling uncertainty sources) represent the sampling uncertainty components that can be quantified as separate blocks

can be established. Figure 3 reports all the potential sources of uncertainty in soil sampling.

It is necessary to point out that not all the sources of uncertainty have to be considered in all experimental activities involving soil sampling. The relative contribution of the sources of uncertainty is dependent on the type of sampling and on the type of the analyte considered. The contribution of sampling strategy is higher in the case of the evaluation of "hot spots" or in the case of the assessment of element distribution in contaminated sites. This aspect is not relevant in the case of the determination of the mean value of an analyte in an agricultural field. Sample type (disturbed/undisturbed) and 3D-spatial variation give an important contribution in the determination of vertical distribution of an analyte along the soil profile. Sample stability and sample handling have a high relative contribution for the determination of volatile elements. Environmental conditions, like moisture content of the soil and temperature, can influence the depth of soil sampled (in wet conditions, the layer sampled with an auger or corer samplers can increase due to compression of the soil). The influence of this source is higher in unmanaged soil (semi-natural ecosystems) than in managed soil (agricultural field).

The above reported considerations indicate that the sources of uncertainty associated with sampling are strongly dependent on the ecosystem investigated, sampling objectives and analytes studied. Figure 4 reports the cause-effect diagram for the assessment of the superficial distribution of trace elements in agricultural soil. In the frame of the SOILSAMP project, the influence of the operator is ruled out by selecting only one operator for sampling. Environmental conditions are ruled out as well, because all sampling activity is carried out with similar temperature and moisture content in soil.

Another aspect that has to be considered is that in some cases it is extremely difficult to quantify each single uncertainty independently. In this case it is more useful to select the uncertainty sources that can be evaluated as a "block". In the agricultural SOILSAMP experimental

design, some aggregated uncertainties were defined as reported in Fig. 4. The influence of particle size, sampling device, sampling strategy, sample handling, sample container, and part of sample preparation are included in the first block, while the critical phases of cone quartering and riffling (reduction of the sample) in sample preparation are considered separately.

Each step of the sample reduction phase has its own quantifiable uncertainty and it is possible to quantify the uncertainty of the sample reduction as a "block". This uncertainty will be quantified experimentally as the standard deviation after several repetitions of the reduction phase in three different samples. To quantify the contribution of the uncertainty linked with the reduction phase, 137Cs and uranium series radionuclides activity concentrations will be determined. These radionuclides have been selected as appropriate elements, because it is possible to carry out their determinations both on the entire sample before reduction and on the different fractions resulting from reduction, without any treatment before radionuclide analysis. In addition, 137Cs and uranium series radionuclides show similar environmental behavior to many other trace elements in soil. 137Cs and uranium series radionuclides activity concentrations will be determined in at least 10 replicates by gamma-spectrometry.

Acknowledgements The authors would like to thank all the participants of the SOILSAMP external advisory group for their scientific contribution during the development of the activities. Special thanks go to Dr. Luisa Stellato, consultant ANPA, for georeferencing the sampling points. Moreover, we are grateful to Valter Coletti, ERSA - Ente Regionale per lo Sviluppo Agricolo del Friuli Venezia-Giulia, for support and technical assistance during the field activity.
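The "block" treatment of the reduction phase and its role in Eq. 1 can be illustrated with a short sketch. All numbers below are hypothetical placeholders, not SOILSAMP results, and the function name is an assumption; the sketch only shows a replicate standard deviation being used as u(rd) and combined in quadrature with u(s) and u(a).

```python
import math
import statistics


def combined_uncertainty(u_s, u_rd, u_a):
    """Eq. 1: u(r) = sqrt(u(s)^2 + u(rd)^2 + u(a)^2)."""
    return math.sqrt(u_s ** 2 + u_rd ** 2 + u_a ** 2)


# HYPOTHETICAL activity concentrations (Bq/kg) measured on fractions
# obtained from ten repetitions of the reduction phase.
replicates = [20.1, 19.6, 20.4, 19.9, 20.3, 19.8, 20.0, 20.2, 19.7, 20.0]

# Reduction-phase block expressed as a relative standard deviation.
u_rd = statistics.stdev(replicates) / statistics.mean(replicates)

# HYPOTHETICAL relative uncertainties for sampling and analysis.
u_s = 0.10
u_a = 0.03

u_r = combined_uncertainty(u_s, u_rd, u_a)
print(f"u(rd) = {u_rd:.4f}")
print(f"u(r)  = {u_r:.4f}")
```

With placeholder values of this kind the sampling term dominates the sum, which is the situation Eq. 1 is meant to make visible; in practice the replicate spread also contains the analytical repeatability, which would have to be separated out.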

References

1. ISO (1993) Guide to the expression of uncertainty in measurement. International Organization for Standardization (ISO), Geneva
2. ISO/IEC 17025:1999 (1999) General requirements for the competence of testing and calibration laboratories. International Organization for Standardization (ISO), Geneva
3. Thompson M (1999) J Environ Monit 1:19-21
4. ISO 3534-1 (1993) Statistics, vocabulary and symbols - Part 1. Probability and general statistical terms. International Organization for Standardization, Geneva
5. Thompson M, Ramsey MH (1995) Analyst 120:261-270
6. Ramsey MH, Argyraki A, Thompson M (1995) Analyst 120:1353-1356
7. Ramsey MH, Argyraki A (1997) Sci Total Environ 198:243-257
8. Ramsey MH (1997) Analyst 122:1255-1260
9. Ramsey MH (1998) J Anal At Spectrom 13:97-104
10. Ramsey MH, Squire S, Gardner MJ (1999) Analyst 124:1701-1706
11. Squire S, Ramsey MH, Gardner MJ (2000) Analyst 125:139-145
12. Belli M, de Zorzi P, Menegon S, Sansone U (2000) In the Proceedings of XXXI National Congress of Radioprotection, 20-22 September 2000, Ancona (Italy), pp 97-105
13. Ramsey MH, Thompson M, Hale M (1992) J Geochem Explor 44:23-36
14. Muntau H, Rehnert A, Desaules A, Wagner G, Theocharopoulos S, Quevauviller P (2001) Sci Total Environ 264:27-49
15. EURACHEM-CITAC Guide (2000) Quantifying uncertainty in analytical measurement, 2nd edn. EURACHEM
16. Isaaks EH, Srivastava RM (1989) An introduction to applied geostatistics. Oxford University Press, Oxford, UK
17. Mulla DJ, McBratney AB (2000) Soil spatial variability. In: Sumner ME (ed) Handbook of soil science. CRC Press, Boca Raton, FL
18. Smodis B (1992) Vestn Slov Kem Drus 39(4):503-519
19. Smodis B, Jacimovic R, Jovanovic S, Stegnar P (1990) Biol Trace Elem Res 26:43-51
20. Smodis B, Jacimovic R, Medin G, Jovanovic S (1993) J Radioanal Nucl Chem Artic 169(1):177-185
21. Svetina M, Smodis B, Jeran Z, Jacimovic R (1996) J Radioanal Nucl Chem Artic 204:45-55
Accred Qual Assur (2002) 7: 106-110
DOI 10.1007/s00769-001-0420-4

© Springer-Verlag 2002

Zhengzhi Un Quality assurance for the analytical data


Li Lin
of micro elements in food

Abstract The micro element con- Keywords Quality assurance·


tent of food is an important quality Analytical data· Micro elements·
index due to the action of these ele- Food testing
ments on human health. In this arti-
cle, we discuss how to ensure the re-
liability of analytical data on micro
elements in order to truly represent
the condition of food. Sampling,
treatment of the analytical sample,
selection of the analytical method,
standard solution, and certified refer-
Z. Hu, L. Liu (~) ence material, blank test, calibration
Chinese National Center for Food Quality of the instrument and equipment, ap-
Supervision & Testing plication of the quality control chart,
32 Xiaoyun Road, Chaoyang District,
Beijing 100027, China
assessment of the final analytical re-
Tel.: +86-10-64645551 sult, and quality assurance system
Fax:+86-10-64625604 are briefly described.

Introduction

Micro elements may be divided into three classes according to their function and action on human health: essential elements, non-essential elements, and harmful elements. Although an essential element is a necessary mineral nutrient for the human body, it becomes harmful when the amount exceeds a definite limit. For example, the physiological requirement of selenium is 40 µg day⁻¹ and poisoning results if the Se intake exceeds 800 µg day⁻¹. Some elements, such as Pb and Hg, are harmful even in very small quantities due to accumulation in the human body. The effect of some elements changes depending on the oxidation state or form present; for example, chromium (III) is an essential element while chromium (VI) is a carcinogen. The toxicity of organomercury compounds is different from that of inorganic mercury; methyl mercury is the most toxic of these because it is the most readily absorbed in the human digestive tract [1].
Food containing many kinds of micro element is a primary source of mineral nutrients for the human body; however, contaminated food is often a source of harmful elements. The micro elements contained in food are an important quality index due to their effect on human health. For example, the tolerance limits (mg kg⁻¹) of cadmium in food are less than 0.2 in rice, 0.1 in wheat flour, fish, and meat, and 0.05 in egg and vegetables; the tolerance limits of mercury in food are less than 0.01 in potatoes and milk, 0.02 in grain, 0.05 in meat and egg, and 0.3 in fish (0.2 for methyl mercury). These limits are laid down in food standards and must be controlled rigorously [2].
Product quality control depends on analytical data, and we must therefore ensure the accuracy and reliability of such data so that they represent the true condition of the product. This condition is essential if the sample determination result is to ensure the control of product quality, the prevention of product contamination, and the protection of human health.

In this paper, we discuss how to ensure the quality of the analytical data of micro elements in food against problems which can introduce error.

Sampling

Our laboratory performed the following two parts of the task: (i) the mandatory inspection (including quality supervisory inspection, production license inspection, and product attestation inspection assigned by the government) and (ii) the entrusted inspection (including common sample analysis and arbitration analysis).
Samples used for the mandatory inspection were randomly sampled from the qualifying products within their guarantee dates at the producer's factory storehouse or from market shelves. Samples used for the entrusted inspection were delivered to our laboratory in person by the sampling person. A sufficient amount of the sample was selected by a suitable method according to the provisions of the relevant standard.
Only the determination value of each ingredient in the sample was used to determine whether the product was up to standard or not; we did not deduce whether the whole product lot was up to standard or not.

Treatment of the analytical sample

An appropriate sample treatment method was selected before analysis according to the rules of the relevant standard in order to reduce the sampling error and misrepresentation.
Common liquids, powders, and small pellet foods were homogenized by simply shaking; solid food, especially non-uniform solid food, was first broken into pieces and then mixed to achieve sufficient homogeneity. Canned foods were poured into a blender and thoroughly mixed since the element content is quite different at the can center and at surfaces in contact with the can wall.
Most of the foods examined were multi-component organisms. For an accurate determination, the micro elements must be freed from organic matter before analysis. Decomposition procedures for the organic matter and extraction procedures for the inorganic elements were generally applied. For all procedures applied, the steps necessary to avoid contamination or loss of the elements determined during the sample treatment were always taken.

1. Steps to avoid contamination:
(a) Clean the air in the laboratory by filtration to avoid analytical sample contamination by elements in the floating dust.
(b) Soak the glass vessel in dilute acid and then wash it with deionized water to avoid contamination by elements adsorbed on the vessel walls.

2. Steps to avoid loss:
(a) Keep the ashing temperature controlled and below 500 °C to avoid the loss of volatile elements (e.g., Cd and Pb) when the dry ashing method is used during sample treatment. Ashing aids used in dry ashing may promote the decomposition of the organic matter and ash solubilization, and also help avoid the loss of the elements determined because of the solute produced. For example, Mg(NO₃)₂ has been applied as an ashing aid to avoid the loss of Se, As, Cd, and Pb in fish, milk, and fruit juice samples when these samples have been treated with the high temperature ashing method. In the determination of volatile halogens, NaOH has been used to fix fluorine, and Mg(NO₃)₂ has been used as an ashing aid; the alkaline dry ashing method may be used for the destruction of organic matter and the liberation of fluorine without any loss. ZnSO₄ has been used as an ashing aid in the iodine determination; the sample may then be ashed at 550-600 °C without any iodine loss.
(b) When the wet digestion method is used for food sample treatment, the appropriate digestant must be selected for the sample and the element being determined. For example, the oxidizing HNO₃-H₂SO₄ mixture must be used as the digestant for arsenic determination in food containing high amounts of salt in order to ensure that the arsenic present is all arsenic (V); otherwise, the arsenic (III) may be lost as the volatile AsCl₃ (b.p. = 130 °C). When HNO₃-H₂SO₄ is used in canned food digestion, the acid-insoluble meta-stannic acid is produced and adsorbed on the inner wall of the Kjeldahl flask, so that the tin is then lost. In this case, 4 mol l⁻¹ NaOH solution must be added and the flask gently heated with swirling until the meta-stannic acid is fully dissolved in the sample solution. However, if H₂SO₄-H₂O₂ is used in this digestion, the aforementioned trouble may be avoided. H₂SO₄ should not be used in digestions for the determination of trace lead in samples containing large amounts of calcium because the insoluble CaSO₄ produced at the end of the digestion causes the adsorption loss of lead.
(c) The original oxidation state (OS) of the elements must be retained when the different OSs of the elements are to be determined individually. For example, in the simultaneous determination of total chromium and chromium (VI) in food by atomic absorption spectrophotometry (AAS), the sample must be treated with 10% aqueous tetramethyl ammonium hydroxide in an ultrasonic water bath at 60±2 °C until all solid matter is dissolved; all the chromium ions are extracted into the alkaline solution without any change in OS. For the determination of total iron and iron (II) in

infant food by AAS, the sample is treated with an acid-extraction procedure using hydrochloric acid and ultrasonic vibration under nitrogen flow; all iron is extracted into the acid solution without any change in OS.
(d) When the hydride generation method is applied to the determination of total arsenic in food, all of the organic and inorganic arsenic-containing components in the sample are completely converted into arsenic (V) by digestion with HNO₃-H₂SO₄; the reduction of As⁵⁺ to arsine by borohydride is very slow and the As⁵⁺ must therefore be completely pre-reduced to As³⁺ by potassium iodide-ascorbic acid (KI-VC). However, when graphite furnace AAS is used for the total arsenic determination, the arsenic-containing components in the sample solution must be completely oxidized to As⁵⁺, since the atomization temperature of As³⁺ is very different from that of As⁵⁺. When a sodium diethyldithiocarbamate-methyl isobutyl ketone (DDTC-MIBK) system is applied to the determination of chromium concentration, the Cr³⁺ must be completely oxidized to Cr⁶⁺ to ensure an accurate analytical result because Cr³⁺ is not readily chelated by DDTC.

Selection of the analytical method

Many analytical methods can be applied to the determination of micro elements in food. Suitable analytical methods should have the following features:

1. The uncertainty of the method should be minimized (good precision). In general, the relative standard deviation of the method should be lower than 5%.
2. The sensitivity and detection limits of the method should meet the needs of the standard (high sensitivity and low detection limit). In general, the detection limit of the method should be lower than the permitted content in the sample (provided in the product standard) by at least one order of magnitude.
3. A fair agreement between the true content and the content observed by the method is sufficient (good accuracy).
4. The method used for investigation is different from the reference method and should have a better precision than the method generally used for the determination of the parameters.

In general, we selected the suitable analytical method according to the element, content, and matrix in the determined sample in order to ensure the accuracy and reliability of the analytical result.
Our principle for method selection is that the standard method should always be used where possible. Under normal conditions, the existing national standard, international standard, and trade standard have been selected for the mandatory inspection. The national standard is preferred for the entrusted inspection and arbitration inspection. If it is not available the normal standard and then the trade standard or contract standard will be used. When no standard method was available for a certain sample, a reliable method published or developed by us was selected; however, such methods must first pass an assessment. Furthermore, it should be emphasized that we must carefully consider the following factors when the standard method is applied to certain element determinations in practical samples, because the standard method has been worked out for many kinds of food:

1. We must consider how to remove the various interferences in the determination of trace elements in food by AAS, according to the elements determined and their content:
(a) Prepare the standard solution in the same composition as the sample or apply a standard addition method in order to remove physical interference from the differences in viscosity, surface tension, and vapor pressure.
(b) Suitable chelation-extraction or ion-exchange methods should be applied to collect the determined element in order to separate off the high amounts of interfering inorganic salts or extract out the micro elements.
(c) Make use of the characteristic gaseous state of the hydride (at normal atmospheric temperature and pressure); it can therefore be decomposed at lower temperatures. As, Sn, Bi, Pb, Se, Sb, Te, and Ge may therefore be readily separated from their mother solutions at normal temperature and pressure.
(d) When alkali metals and some of the alkaline-earth metals are determined, a readily ionizable element (another alkali metal) must be added to the analytical solution in order to increase the free electron concentration in the flame, thereby effectively controlling or removing the effect of the ionization interference.
(e) The chemical interference can be removed using the temperature effect, the gaseous state of the flame, addition of a release agent, protective agent, flux agent, or organic solvent etc., or by pre-separating off the interfering matter.
(f) The molecular absorption interference can be removed by adjustment of the zero point, by background correction with a continuum light source, or by the Zeeman effect.
(g) When graphite furnace AAS is used for the determination of trace elements in food, the matrix effect is more serious. A matrix improver must be used for the removal of the matrix interference. For example, phosphoric acid must be added to the sample solution for the determination of trace lead; the ash temperature may be increased to 900-1000 °C and the matrix interference for the lead determination is therefore removed. In the determination of arsenic, Mg(NO₃)₂ and Ni(NO₃)₂ are added as a matrix improver, increasing the ash temperature to 1100 °C; thus the interference of the anions and cations that coexist in the sample solution is removed, allowing the detection of concentrations as low as 6 ng g⁻¹.

Standard solution and certified reference materials

1. Standard solutions

A standard solution of very reliable quality is a necessity for quantitative analysis. In our laboratory, the certified reference reagents used as stock solutions for the micro element analysis were prepared by American Fisher Scientific Company or the China National Research Center for CRM.
Each working standard solution was prepared by diluting the stock solution with deionized water or dilute acid using a calibrated burette and volumetric flask to a known concentration before use. The containers for the storage of the standard solutions were soaked with acid and cleaned thoroughly with deionized water prior to use in order to avoid contamination. The standard working solutions for ultra-trace element analysis were stored in Teflon containers to avoid adsorption loss and contamination by elements dissolving from the container. The storage conditions of the standard working solutions were rigorously maintained. For example, the standard tin working solution was prepared before use with dilute acid (0.1 mol l⁻¹ HCl, HNO₃, or H₂SO₄) and may be stored for several months in glass, polypropylene, polyvinyl chloride, polycarbonate, and Teflon containers. If prepared with water, obvious losses occurred when the standard working solution was stored for as little as one day.
In our laboratory, the standard solutions were prepared by two people, calibrated against each other, and then checked against newly purchased standards in order to eliminate the risk of error.

2. Certified reference material

The certified reference material (CRM) is used as a quality assurance sample for the assessment of the analytical methods and results. The CRMs used for element analysis in our laboratory were prepared by the American National Bureau of Standards, MBH Analytical Limited, and the China National Research Center for CRM. Examples are oyster tissue (SRM 1566), bovine liver (SRM 1577a; 308F185), wheat flour (SRM 1567; GBW08503), rice flour (SRM 1568), milk powder (SRM 1549; 304 F063; GBW08509), cabbage (GBW08504), mussel (GBW08571), prawn (GBW08572) and pork (GBW08552).

Blank test

The blank test is a yardstick for checking whether the reagents and methods used in the analysis meet the requirements of trace analysis. A blank test must be carried out for each set of samples, using high-purity reagents and water purified by re-distillation, ion-exchange, or sub-boiling distillation in order to reduce the reagent blank value to a sufficiently low level. For example, a nitric acid solution containing large amounts of chromium should not be used in the sample digestion for chromium determination in food; the H₂SO₄-H₂O₂ digestion method is however suitable. High pressure digestion (performed in a sealed container) is applied to the analysis of food because only small amounts of reagent are used in the sample digestion, thus its blank value is lower than that of the other methods. If a microwave heater is used in this method, the effect may be even better.

Calibration of the instrument and equipment

All of the instruments and equipment in our laboratory, including atomic absorption spectrophotometer, UV-spectrophotometer, balance, thermometer, pressure gauge, vacuum meter, and high capacity glass containers, were calibrated at regular intervals, and operative inspections were performed every day before use to ensure a good operative state, accuracy, and reliability.
The calibration of the instruments is carried out every year by the legal measurement department (China National Institute of Metrology, China National Research Center for CRMs) according to the national rules for calibration. The verified values may be traced to the national measurement standards. Operative inspections were carried out by analytical personnel prior to each use.
For instruments without calibration rules, calibration was carried out by comparison with a similar instrument so as to make the verified value comparable.
Any instrument or equipment found to be in error was not used further, and the data for samples recently analyzed with it were re-inspected.
Our laboratory had one set of small capacity glass containers which had been calibrated by the legal measurement department. Other glass containers were self-calibrated by personnel who had passed the special training at the legal measurement department, using the above verified containers as the measurement standard in order to ensure that their quantitative values could be traced back to the national measurement standard.
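The two numerical criteria given under "Selection of the analytical method" (relative standard deviation below 5%, detection limit at least one order of magnitude below the permitted content) can be checked mechanically. The following is only a minimal sketch: the function names and the replicate values are hypothetical, and only the 5% threshold, the factor of ten, and the cadmium-in-rice limit of 0.2 mg kg⁻¹ are taken from the text.

```python
from statistics import mean, stdev


def relative_std_dev_percent(replicates):
    """Relative standard deviation (RSD) of replicate results, in percent."""
    return 100.0 * stdev(replicates) / mean(replicates)


def method_is_suitable(replicates, detection_limit, permitted_content,
                       max_rsd_percent=5.0, lod_factor=10.0):
    """Apply the two numerical criteria from the method-selection list.

    1. RSD of replicate determinations below max_rsd_percent.
    2. Detection limit at least lod_factor times lower than the
       permitted content given in the product standard.
    """
    rsd_ok = relative_std_dev_percent(replicates) < max_rsd_percent
    lod_ok = detection_limit <= permitted_content / lod_factor
    return rsd_ok and lod_ok


# Hypothetical replicate determinations of cadmium in rice (mg/kg).
cd_replicates = [0.150, 0.155, 0.148, 0.152, 0.151]

# 0.2 mg/kg is the tolerance limit for cadmium in rice quoted in the
# Introduction; the detection limits tried here are assumed values.
suitable = method_is_suitable(cd_replicates, detection_limit=0.01,
                              permitted_content=0.2)
too_insensitive = method_is_suitable(cd_replicates, detection_limit=0.05,
                                     permitted_content=0.2)
print(suitable, too_insensitive)
```

Accuracy (criterion 3) is not covered by this sketch; in the paper it is assessed separately with certified reference materials.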

Application of the quality control chart [3]

The quality control chart for the single measurement of micro element determination in food has been applied in our laboratory. The analytical results can therefore be plotted directly in order to discover and correct problems during the inspection, which helps to keep the determination data accurate.
Preparation of the quality control chart: a suitable reference material was selected as a quality control sample. The certified value (A) of the reference material was taken as the center line, and the certified value plus and minus two times the standard deviation of the analytical method (A±2σ) as the upper and lower control limits, respectively. The determined values and the dates were taken as the vertical and horizontal coordinates, respectively, and a quality control chart of the single measurement was drawn for every element.
These quality control charts were always available in our laboratory, so that the determination result of the quality control sample could be plotted on its graph in order to control the analytical quality at any time.
The analytical quality control test for the assessment of trueness of the analytical result was carried out with each of the sample determinations; at least one CRM was analyzed with each set of practical samples. If the determination result of the quality control sample fell within the upper and lower control limits, it showed that the determination process for this sample set was in a state of statistical control; these results were therefore valid. If the determination result of the CRM fell outside the control limits, it showed that the determination process was out of control. The results obtained since the previous in-control check were then invalid, and the reason for the deviation was investigated and corrected.

Assessment of the final analytical result [4]

The important analytical results (including arbitration analytical results, specific sample analytical results, and determinations by non-standard methods) obtained in our laboratory were verified by the following techniques:

1. A CRM with a similar matrix to the sample was selected and analyzed using the same procedure together with the sample. If the analytical result of the CRM fell within the range of the certified value, it showed that the sample analytical result was accurate and reliable.
2. When no suitable CRM could be used, the sample was determined by classical methods or other methods based on a different principle. Statistical analysis was then carried out on the means obtained by the two methods. If there was no obvious difference between them, we considered the sample analytical result to be accurate and valid.
3. The analytical result of the unknown sample is also corrected by the "recovery" of the certified value in the analysis of the CRM.
4. We joined comparison tests with national or international laboratories at irregular intervals and obtained good results. This indicated that our analytical results were accurate and reliable. For example, in 1998, we joined a proficiency test among 108 international laboratories in the Asia Pacific region for the determination of As, Cd, Pb, Hg, and Zn content in a canned fish sample labeled APLAC T009. The between-laboratory Z-scores ranged from -0.67 to 0.98 and the within-laboratory Z-scores from -0.24 to 0.73. This performance was satisfactory.

Quality Assurance System [5]

Our laboratory has been examined and accredited by the China National Accreditation Committee for Laboratory (CNACL). We have set up a quality assurance system to ensure the accuracy and reliability of the inspection data in six fields (environmental condition, instrument and equipment, personnel quality, inspection process, quality appeal treatment, and accident treatment). It has been continuously revised and progressively standardized through the management review (including internal quality audits and review) each year.
Our laboratory is also subject to one re-examination every five years and one selective examination every year by the CNACL.
In our laboratory, the CRM is used as a blind sample in order to allow examination of personnel. Our personnel are all sufficiently competent and qualified.

References
1. Hu Zhengzhi (1996) China Encyclopedia of Chemical Industry. Chemical Industry Publisher, Beijing, 10:57-138
2. China National Standard, GB2762-94 and GB4810-94
3. Pan Xiurong (1993) Introduction to quality assurance of analysis and inspection. Scientific and Technical Information Network for Standard Material, Beijing, pp 1-63
4. Wang Shuchun (1991) Mathematical statistics and quality control for food analysis. China People's Hygienic Publisher, pp 24-49
5. Quality Manual (1998) China National Center of Food Quality Supervision and Testing
Accred Qual Assur (1998) 3:227-230
© Springer-Verlag 1998

Manfred Golze

Customers' needs in relation to uncertainty and uncertainty budgets

Abstract The general requirement of Quality Management standards to include in test reports a statement of the uncertainty of the results reflects the fact that a test result is rather useless without a knowledge of its accuracy. After an outline of the basic concepts of uncertainty, the need for uncertainty statements is illustrated for different ranges of applications.

Key words Uncertainty · Uncertainty statement

Presented at: EUROLAB Workshop on Confidence in Testing - Customers Needs, Copenhagen, 11 September 1997

M. Golze (✉)
Federal Institute for Materials Research and Testing (BAM),
Unter den Eichen 87,
D-12205 Berlin, Germany
Tel.: +49-30-8104 1943
Fax: +49-30-8104 3717
e-mail: manfred.golze@bam.de

Introduction

The European standard EN 45001 [1] and its international counterpart ISO/IEC Guide 25 [2] both require that a laboratory include in its calibration or test reports "a statement of the estimated uncertainty of the calibration or test result (where relevant)" [2]. This requirement reflects the fact that a measurement or test result is rather useless without a knowledge of its accuracy. Therefore the standards authorities are clearly taking a position which represents the interests of customers in their dealings with laboratories.
In this paper, I mainly want to illustrate this statement on the basis of examples, but it is first necessary to outline, at least roughly, the basic concepts of uncertainty.

Uncertainty of measurements and tests - basic concepts

The true value of an unambiguously defined measurand (i.e. the specific quantity subject to measurement) would be obtained by a perfect measurement. But, in a real measurement, there always exists an error, i.e. an unknown difference between the measurement result and the true value. Such an error consists of two kinds of elements: random and systematic components.
The random components arise from unpredictable or stochastic variations of influence quantities. They cause the variations in repeated measurements and often lead to the well-known normal distribution of measurement results. If no systematic component is present one would obtain the true value as the mean of an infinite number of replicates.
A systematic component means that the centre of the distribution of measurement results is shifted away from the true value because of, e.g., an incorrect calibration (see Fig. 1), i.e. the results are biased. To some extent, systematic effects are known and the results can therefore be adequately corrected.

Fig. 1 Distribution of results with random and systematic error components

But, even after correction for known systematic effects, a measurement result is still only an estimate of the true value and should be accompanied by a statement of its uncertainty. In [3], uncertainty (of measurement) is defined as:
"A parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand."
Thus, uncertainty arises from random effects and unknown systematic error components but also from imperfect correction of known systematic errors, as shown in Fig. 2 [4].

Fig. 2 The different error components and their influence on a measurement or test result and its uncertainty (according to [4])

According to the Guide to the expression of uncertainty in measurement (GUM) [3], the measurement uncertainty should be estimated by making up an uncertainty budget which takes all relevant components into account. While in the field of metrology the necessary tools for uncertainty evaluations are well established, further development is required in the field of testing, particularly in the case of qualitative results.

The need for uncertainty statements

It becomes clear from the above that an uncertainty statement is only an estimate and is less accurate than the result itself. Nevertheless, knowledge of the uncertainty is essential for the assessment of the reliability and conclusiveness of the result, i.e. its quality.

Comparison of different results

Uncertainties are the criteria for the decision whether two test results obtained on identical items are compatible or different. Often such decisions have to be taken in customer/supplier relations and can influence essentially the price of a commodity. In a routine case there is no need to estimate the uncertainty of each individual analysis, but both parties can agree on a particular analytical procedure and on an uncertainty which can be attributed to it.
An example is the delivery of ferromolybdenum, a binary alloy with a high molybdenum content, which is used as an additive for the production of molybdenum-containing steels. Because the content of molybdenum is essential for the price of a batch, usually both buyer and supplier perform an analysis according to an agreed gravimetric procedure. By agreement, each party can ask for an arbitration analysis if the difference exceeds a previously fixed value. For example, if the result of the supplier's laboratory is w(Mo) = 72.58 wt%, that of the buyer's laboratory is w(Mo) = 72.06 wt%, and the maximum difference accepted by both is 0.3 wt%, our institute could be asked to perform an independent analysis. Our finding might be w(Mo) = 72.16 wt%, which would then agree reasonably well with that of the buyer.

Control of tolerances or detection of deviations

In industrial quality assurance it is a common task to assess whether manufactured parts comply with specified tolerances. Equally it might be necessary to detect a specific deviation with high reliability. In both cases one needs test methods with an adequate capability. For instance in the former case the so-called "golden rule of metrology" requires that the uncertainty u of the test method be less than 1/10 of the tolerance interval T [4]: u ≤ T/10. As far as this rule applies, one can neglect the uncertainty of an individual measurement because the spread in the final results mainly reflects the variability of the manufactured parts.

Often geometrical tolerances of manufactured parts are controlled by use of calliper gauges. According to a German standard [5], the highest permissible error of such a calliper gauge should be u = 0.03 mm. Thus the measurable tolerance using this gauge should be T ≥ 0.3 mm. If the tolerance is below this value one should use a more accurate device or measuring procedure.

Compliance with limiting values

The important task of control of whether products or samples comply with limiting values defined by regulations or laws for reasons of health, safety or environmental protection is related to the control of tolerances. It is only mentioned here for the sake of completeness and is dealt with in detail in [6]. Again the uncertainty has to be taken into account when making decisions.

Laboratory quality control

Reference materials are widely used in chemical analysis to establish traceability and to control analytical procedures e.g. by use of control charts. For this purpose two uncertainties should be known:
- The uncertainty of the certified reference value
- The uncertainty of the analytical procedure to be checked.
Usually the latter is dominant.
As an example (Fig. 3), a control chart is shown set up by a BAM laboratory for the analysis of aluminium in steel by spark emission spectroscopy. The reference material used is the EURONORM-CRM No. 194-1 and the inner limits depicted in Fig. 3 give the confidence interval of the certified value. The total interval reflects the uncertainty of the analytical procedure estimated as ±2s (s = standard deviation). As can be seen, the procedure was out of control in early 1996 and had to be readjusted.

Uncertainty of tests caused by sampling

Often the uncertainty associated with a test is not mainly caused by the measurement process itself but by the sampling procedure performed beforehand. Therefore it is important to include this component into the uncertainty budget, which otherwise would be misleading.
In connection with sampling, two questions arise:
- The representativeness of the sample
- The deduction of a result for the whole batch based on the sample result.
The first question causes severe problems e.g. in the field of environmental analysis, but cannot be treated here.
The second problem, which is of importance e.g. in the fields of industrial quality control and market surveillance, can be treated appropriately by statistical means [7, 8]. An application is the market surveillance with regard to the so-called e-mark. This mark on prepacked food and consumer goods is intended to assure the consumer that the actual contents of the prepack conform with the nominal contents within certain limits. For example, a German regulation [9] applied to a lot of 10 000 packages of butter with a nominal weight of 250 g each stipulates the sampling instruction 125-7/8, which means that a sample of 125 packages is randomly taken and each pack has to be weighed. If x gives the number of items in the sample with m_i < 241 g (the
Control-Chart ZRM 194-1 - Aluminium ~ZRM+2s
minimum permissible weight), then the criteria for ac-
0,100 . - - - - - - - - - - - - - - - - - - i ~ ZRM-2s ceptance or rejection of the whole batch are x:s; 7 and
!--ZRM x?:.8, respectively. The probability of acceptance P a as

--------.H--i =~:~-CII2
0,095
j •••••• ZRM+CI/2 a function of the quality level of the lot can be derived
from the so-called operational characteristic of this spe-
0,090 cific sampling instruction and is given in Table 1 for
some selected values. It can be seen that a batch con-
~ 0,085 . taining 8% packages with a weight less than 241 g will
:i
;:
0,080 .

Table 1 Probability of acceptance P a as function of percentage of


0,075 defective items in the lot (selected values). p, percentage of defec-
tive items; P a , probability of acceptance of the lot
0,070
co p [%] P a [%]
'<t '<t <0 <0
"
C') ll) ll)
O'l O'l O'l O'l O'l
~ ~ ~ ~
c;:; Oi ;0 N
co
0 0 0
~
0
0
~ 0 ~
<0
0 ;; Date
2 99.62
4 H7.09
Fig.3 Control chart for the analysis of aluminium in steel by H 20.90
spark emission spectroscopy. The EURONORM-CRM No. 194-1 12 1.33
is used as reference standard
146 M. Golze

still be accepted with a probability of approx. 20%. This example may demonstrate the limited
resolving power of such a sampling instruction, and should be kept in mind.
In the case of the e-mark, according to a second requirement, a batch has also to be rejected
if the mean weight of the sample items is less than the nominal weight with a significance
level of 99%.

Concluding remarks

It is the aim of this paper to demonstrate that uncertainty statements are essential for the
users of measurement and test results when they assess these results and have to take
decisions based on them. However, for this purpose it is often sufficient to know the generic
uncertainty of the type of test performed instead of the uncertainty of the particular
result. But we are faced with some problems.

Confusion of the customers, hesitation of the laboratories

Laboratory practitioners know from their contacts with the customers that often the latter
are not familiar with the concepts of uncertainty and are rather confused when they are
confronted with uncertainty statements.
On the other hand many laboratories fear that a comprehensive and honest statement of
uncertainty might affect their reputation and competitiveness.

Concerted actions of the testing community, accreditation and standardization bodies

These problems cannot be solved by individual laboratories. Instead, the testing community as
a whole is asked to co-operate with accreditation and standardization bodies aiming at:
- Evaluation of the (generic) uncertainty of measurement and test procedures
- Inclusion of uncertainty characteristics (e.g. repeatability, reproducibility) in testing
  standards
- Education of the customers.

Provision of the necessary funds

It is the task of national and European authorities to provide the necessary funds for these
concerted actions
- Because by this means the testing infrastructure can be improved
- Because authorities are also customers of testing laboratories and important administrative
  and political decisions are based on their results.
Organizations like EURACHEM, EUROLAB and NORDTEST can help to initiate and stimulate these
co-operative processes.

Acknowledgements The author would like to thank colleagues from EUROLAB, NORDTEST and BAM for
fruitful discussions. In particular the contributions of Nazmir Presser, Rolf Oberhauser,
Siegfried Noack and Thomas Goedecke are gratefully acknowledged.
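Returning to the sampling instruction 125-7/8 discussed above, the acceptance probabilities of Table 1 can be approximated with a few lines of Python. This is only a sketch under a stated assumption: the number of underweight packs in the sample is modelled as binomial, which treats the lot of 10 000 packs as large compared with the sample of 125 (the exact model for a finite lot would be hypergeometric). The function name and its parameters are illustrative and do not come from the regulation or the cited standards.

```python
from math import comb


def acceptance_probability(p_defective, n=125, c=7):
    """Probability that a lot is accepted under an n-c/(c+1) sampling instruction.

    Assumes the number x of defective items in the sample follows a
    binomial distribution with parameters n and p_defective; the lot is
    accepted when x <= c.
    """
    q = 1.0 - p_defective
    return sum(comb(n, k) * p_defective**k * q ** (n - k) for k in range(c + 1))


if __name__ == "__main__":
    # Quality levels listed in Table 1 (percentage of defective items).
    for percent in (2, 4, 8, 12):
        p_a = acceptance_probability(percent / 100)
        print(f"p = {percent:2d} %   P_a = {100 * p_a:6.2f} %")
```

Under this assumption the computed values come out close to those in Table 1, for example roughly 21% acceptance for a lot with 8% underweight packs, which illustrates the limited resolving power mentioned in the text.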

References

1. EN 45001 (1989) General criteria for the operation of testing laboratories, Brussels
2. ISO/IEC Guide 25 (1990) General requirements for the competence of calibration and testing
   laboratories, 3rd edn. Geneva
3. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in
   measurement, 1st edn
4. Hernla M (1996) Qualität und Zuverlässigkeit 41:1156-1162
5. DIN 862, Meßschieber - Anforderungen, Prüfungen, 1988
6. Christensen JM, Holst E (1998) Accred Qual Assur
7. ISO 2859-1 (1989) Sampling procedures for inspection by attributes; sampling plans indexed
   by acceptable quality level (AQL) for lot-by-lot inspection. Geneva
8. ISO 3951 (1989) Sampling procedures and charts for inspection by variables for percent
   nonconforming. Geneva
9. Bundesgesetzblatt Part I (1981) Verordnung über Fertigpackungen
   (Fertigpackungsverordnung) of 18.12.1981: 1585-1620, last change: Bundesgesetzblatt Part I
   (1989): 1557-1567
Accred Qual Assur (1998) 3:237-241
© Springer-Verlag 1998

Rouvim Kadis

Evaluating uncertainty in analytical measurements: the pursuit of correctness

Abstract Simple in principle, the evaluation of uncertainty, especially in chemical analysis,
is not a routine task and needs great care to be correct. This can be seen, particularly,
from an examination of the EURACHEM Guide, Quantifying Uncertainty in Analytical Measurement
(1995), which is the most important document on the subject. The examination reveals, in the
author's opinion, a shortage of correctness in some principal details of the uncertainty
estimation process as presented in worked examples in the Guide, and the author has therefore
formulated some "in pursuit of correctness" rules for estimating uncertainty. The rules and
respective comments are concerned with the following items: (1) choosing an appropriate
distribution function in type B evaluation of uncertainty, (2) the necessity for
consideration of separate contributions to the combined uncertainty, and (3) taking account
of actual influence factors in the uncertainty estimation process. Furthermore, the problem
of estimation of conditional versus overall uncertainty is touched upon in connection with
comparative trials where only internal consistency of results is required.

Key words Chemical analysis · Measurement uncertainty · Estimation process

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin,
29-30 September 1997

R. L. Kadis (✉)
D. I. Mendeleyev Institute for Metrology (VNIIM), 19 Moskovsky pr., 198005 St. Petersburg,
Russia
Fax: +7-812 327-97-76
e-mail: hai@onti.vniim.spb.su

Introduction

The term "measurement uncertainty", in common use for the characterization of physical
measurements, has so far been a little difficult to adapt to the requirements of chemical
analysis. However, introducing the concept into this field of (amount-of-substance,
analytical, chemical) measurements is quite natural. A number of papers [1-4] published on
the topic have considered the important issues of using and evaluating uncertainty in the
context of analytical data quality. This paper, however, focuses on the things that determine
the quality of the uncertainty estimates themselves.
The current state of implementation of the measurement uncertainty concept in analytical
practice may be compared with the design of a unique building at the stage where a consensus
is reached on the key issues, but there remain some details which cannot be considered as
fixed, as they require further correction. The same can be said about the procedures in
evaluating uncertainties as they are presented in the EURACHEM Guide on uncertainty in
analytical measurement [5]. The document is undoubtedly the most important contribution to
the development of the concept as applied to analytical chemistry problems. However, some of
the practical directions in the uncertainty estimation process which are indicated in worked
examples
[5, Appendix A] do not seem entirely correct and call for comment. This is due to some
oversimplified or incorrect insights into the subject, which may result in unrealistic
estimation. Regardless how much the "error" in an estimate of the uncertainty may be, these
issues are of principal importance in view of the educational significance of the Guide.
Therefore, these "trifles" are worth drawing attention to in order to formulate some (obvious
enough) rules, so that uncertainty estimates can be as correct as possible. These "in pursuit
of correctness" rules are given below, with reference to the respective examples of the
Guide.

Three "in pursuit of correctness" rules in estimating uncertainty

Rule 1: The choice of an appropriate distribution function (in type B evaluation of
uncertainty) should be made on the basis of all the available information on the quantity at
issue.
An estimate of standard uncertainty is often made from bounds a_- and a_+ within which values
of the quantity in question X are expected to lie. [The range a_- to a_+ is commonly
symmetric with respect to the best estimate of X and has half-width a = (a_+ - a_-)/2.] It is
a fairly frequent task in the practice of measurement data evaluation, as assigning maximum
bounds (based on objective knowledge or personal judgment) is often the only thing to do. One
may simply divide the value of a above by an appropriate conversion factor depending on what
kind of probability distribution is assumed. It is essential here that all the relevant
information available be used. The three typical cases of what may be known are as follows
(see Fig. 1):
1. A (statistically) estimated confidence interval having a stated confidence level
2. An expected value and assigned maximum bounds about it
3. Assigned maximum bounds only.
Unless otherwise stated, it is quite natural to assume in case 1 that a normal (Gauss)
distribution was used to calculate the interval and recover the standard uncertainty by using
a suitable quantile of the distribution. (The quantile is taken equal to 2.0 for 95%
confidence level.) In contrast, extremely little information about the quantity in question
is available in case 3, and all one can do is to model it by symmetric uniform (rectangular)
distribution. Then, the expected value of the quantity is the midpoint of the range and the
conversion factor is equal to √3. Case 2 occurs where additional information such as an
expected value allows us to regard values of the quantity near this value as being more
likely than values near the bounds. This situation differs from that in case 3. It is because
of this that item F.2.3.3 of the ISO Guide to the Expression of Uncertainty in Measurement
[6] recommends in such instances the adoption of a triangular distribution as a compromise
between the two extremes, normal and rectangular distributions. The conversion factor is
equal to √6 in this case, and the standard uncertainty obtained proves to be about 30%
smaller than that obtained using the rectangular distribution model. It can be said that
increasing uncertainty in going from case 2 to case 3 is in a sense a "payment" for our
ignorance about the distribution of possible values of the quantity between the bounds.
Fig. 1 Type B evaluation of standard uncertainty for a quantity X given as: (1) an estimated
confidence interval, (2) an expected value and assigned maximum bounds about it, and (3)
assigned maximum bounds only

Case 1 (interval from -a to a about X, P = 0.95): u(X) = a/2.0
Case 2 (interval from -a to a about X): u(X) = a/√6
Case 3 (interval from -a to a about X): u(X) = a/√3

Examples
1. Confidence interval for a weighing result: (m ± 0.1) mg (P = 0.95);
   u(m) = 0.1/2.0 = 0.05 mg (Example 1, Step 1 [5])
2. Nominal values and specification limits for volumetric glassware: for a 250-ml standard
   flask V = (250.00 ± 0.15) ml; u(V) = 0.15/√6 = 0.061 ml.
   Example 1, Step 2 [5] gives: u(V) = 0.15/√3 = 0.087 ml
3. The purity of a material as being "not less than p (%) level": 100 - p = 2a;
   u(p) = (100 - p)/(2√3)
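The three conversion factors of Rule 1 can be collected in a small helper, shown here as a minimal Python sketch. The dictionary keys and the function name are illustrative choices, not terminology from the Guide; the numbers are those of the examples in Fig. 1.

```python
from math import sqrt

# Divisors that convert a half-width a into a standard uncertainty u(X),
# one for each of the three cases of Fig. 1.
DIVISORS = {
    "normal_95": 2.0,          # case 1: confidence interval, P = 0.95
    "triangular": sqrt(6.0),   # case 2: expected value plus maximum bounds
    "rectangular": sqrt(3.0),  # case 3: maximum bounds only
}


def standard_uncertainty(half_width, distribution):
    """Return u(X) = a / divisor for the assumed distribution."""
    return half_width / DIVISORS[distribution]


# Weighing result (m ± 0.1) mg at P = 0.95
u_weighing = standard_uncertainty(0.1, "normal_95")

# 250-ml flask with tolerance ± 0.15 ml, evaluated both ways
u_flask_triangular = standard_uncertainty(0.15, "triangular")
u_flask_rectangular = standard_uncertainty(0.15, "rectangular")

# Relative reduction when the triangular model replaces the rectangular one
reduction = 1.0 - u_flask_triangular / u_flask_rectangular

if __name__ == "__main__":
    print(f"u(m) = {u_weighing:.3f} mg")
    print(f"u(V), triangular  = {u_flask_triangular:.3f} ml")
    print(f"u(V), rectangular = {u_flask_rectangular:.3f} ml")
    print(f"reduction = {100 * reduction:.0f} %")
```

The reduction equals 1 - 1/√2 whatever the half-width, which is the "about 30%" quoted in the text.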

It should be recognized that all the instances of using rectangular distributions in the
EURACHEM Guide examples fall in fact under case 2, not case 3. Such are, in particular, the
evaluations of the uncertainty concerned with volumetric glassware: a nominal capacity is
simply an expected value. Thus, all the standard uncertainties calculated by applying the
conversion factor of √3 appear to be overestimated. Although this approach based on Bayes
"principle of equal ignorance" is common practice in estimating uncertainty in metrology, we
cannot regard it as correct in all the cases of specifying measurement errors in the form of
maximum limits. Though simple and universal, this scheme comes into conflict with common
sense. Rectangular distribution is only to be assumed when nothing but the limits for
possible values of the quantity are available. For example, the uncertainty associated with
purity of a material and expressed as being "not less than the p (%) level" might be one such
case insofar as an expected (or nominal) value of purity is unknown here. Other examples of
the application of model distributions are shown in Fig. 1.

Rule 2: It is necessary to consider uncertainty components as making independent
contributions to the combined uncertainty as far as possible.
There are, however, situations in the worked examples where one of the components combined
encompasses in fact another resulting in "double counting" and hence in redundancy in the
uncertainty estimation. Evaluation of an uncertainty in volumetric measurements is a
characteristic example of this.
So, three contributions to the uncertainty are considered to be essential here:
A. Specification limits for the glassware of a given type
B. Repeatability of filling the article to the mark
C. Ambient temperature effects.
Standard volumetric glassware with specified capacity tolerances is used everywhere. It is
important, however, that the tolerance, i.e. the limit of volumetric error, is a single and
sufficient error characteristic for an article of volumetric ware of a given type for each
usage under standard conditions (see the ISO standards [7, 8]). In other words, the
difference between the actual capacity and the nominal capacity is to lie within the limits
in each case of filling the article (for instance, the volumetric flask) to the mark. To
consider variations in filling as a separate uncertainty component is therefore superfluous
where the tolerances are used. Of course, if data on repeatability of filling with water are
available, they may be accounted for. But, as a result of the repeatability experiments, one
has actually the article individually calibrated and one can replace the nominal capacity by
the estimated value of it, immediately eliminating the need for the application of specified
tolerances. Only the two contributions to the uncertainty remain in such a case, and a
substantially reduced uncertainty value is achieved in the final analysis. The cases
considered are schematically depicted in Fig. 2. [It is necessary to note that an additional
contribution to the uncertainty may arise due to a substantial difference between the
properties (such as viscosity, surface tension, and so on) of a liquid to be measured and
those of water, for instance, in the case of nonaqueous solutions. These effects are taken
into account by means of individual calibration with an appropriate calibration liquid.]

Fig. 2 Evaluation of uncertainty in volumetric measurements. Contributions to the combined
uncertainty u_c: (1) specification limits for the glassware, (2) variation in filling to the
mark, and (3) ambient temperature effects
[Three panels: General case [5]; Standard volumetric glassware; Volumetric glassware
individually calibrated]

Let us examine another situation, "Determination of organophosphorus pesticides in bread"
(Example 3 of the Guide). This is a multistage procedure consisting of several sequential
steps, beginning with homogenization and ending with a GC determination. The combined
correction F_c and the combined uncertainty u_c for the procedure as a whole are derived from
individual values of the correction factor F_i and the uncertainty u_i each relating to a
stage i as follows:

Combined correction $F_c = \prod_{i=1}^{n} F_i$

Combined uncertainty $u_c = \sqrt{\sum_{i=1}^{n} u_i^2}$

(The values may be known from a recovery experiment; if R is the recovery, then
F_i = 1/R_i.) The protocol shows all these calculations. However, this would only be true if
the components in the above formulae were strongly independent. In fact, this is not so. For
instance, the correction factor F_3 experimentally obtained at the extraction stage includes
systematic effects of all the subsequent stages, as the uncertainty u_3 incorporates all the
following uncertainties. The same applies to the values of F_5 and u_5 for the stage of
concentration of the washed extract. So, we have in fact summary estimates instead of
individual ones. To get the individual
Fig. 3 Evaluation of combined correction factor and combined uncertainty in a multistage
procedure. Components relating to a stage i (shown in circles) are either individual
estimates F_i, u_i (upper part of the table) or summary estimates F̄_i, ū_i (lower part of
the table)

Scheme of a multistage procedure
Columns: Parameter; Stage i; Procedure as a whole; Example 3 [5]

Individual estimates (upper part)
- Recovery: R_i
- Correction factor: F_i = 1/R_i; combined correction $F_c = \prod_{i=1}^{n} F_i$;
  Example 3: F_3 = 1.04, F_5 = 1.10,
  F_c = 1 x 1.00 x 1.04 x 1 x 1.10 x 0.96 x 1 x 1 x 1 = 1.10
- Uncertainty: u_i; combined uncertainty $u_c = \sqrt{\sum_{i=1}^{n} u_i^2}$

From summary to individual estimates (upright arrows)
- $F_i = \bar{F}_i / \prod_{j=i+1}^{n} F_j$; Example 3: 1.10/1.06 and 1.06/0.96
- $u_i^2 = \bar{u}_i^2 - \sum_{j=i+1}^{n} u_j^2$

Summary estimates (lower part)
- Uncertainty: ū_i; combined uncertainty
  $u_c = \sqrt{\bar{u}_i^2 + \sum_{j=1}^{i-1} u_j^2}$
- Correction factor: F̄_i; combined correction $F_c = \bar{F}_i \prod_{j=1}^{i-1} F_j$;
  Example 3: F̄_3 = 1.10, F̄_5 = 1.06, F_c = 1 x 1.00 x 1.10 = 1.10
- Recovery: R_i

correction factor F_i for such a stage one must divide the summary value F̄_i by the
correction factor(s) relating to all the following stages (the available data in Example 3
permit us to do this), and one can then calculate the combined factor for the procedure.
It is also possible to do this without finding the individual correction factors in the
calculation of the combined correction. It is sufficient to stop at the first summary factor
F̄_i (such as F_3 in Example 3) in the product of the factors. This leads the multistage
procedure to be broken up into the elements that make independent contributions to the
combined value. Figure 3 demonstrates the two possible ways of calculating: getting
individual estimates from summary estimates is depicted by upright arrows, and obtaining the
combined correction as a product is depicted by horizontal arrows. (The appropriate
procedures in calculating the combined uncertainty are also included in the table.) The
right-hand column of the table shows the relevant calculations associated with Example 3.

Rule 3: When estimating uncertainty only those influence factors are to be considered that
really affect the result of a measurement in the context of the procedure.
Let us refer again to Example 3 of the Guide. Evaluation of the uncertainty associated with
the GC measurements is based here (Table A3.6 [5]) on a wide-ranging study of GC variability
across different instruments, operators, (and times). However, in spite of the wide variation
of the factors, the usefulness of the uncertainty estimate so obtained is doubtful. Indeed,
the conditions of the GC determinations are usually such that the responses for both a sample
and a standard are registered with the same instrument, by the same analyst, and over a short
period of time. This is why these influence factors largely cancel in the result of analysis.
Therefore, accounting for these factors (with separate estimation of variabilities for GC
determination and calibration stages) seems to be based on a misunderstanding in the context
of the procedure.
Evaluation of an uncertainty associated with weighing in the same example should also be
mentioned. The two contributions to the uncertainty taken into account in this case are: a
standard deviation for "repeatability experiments" (0.03 g) and a standard deviation of the
mean of the long-term data (0.008 g). The two components are combined, giving the value
0.031 g.
Note first of all that the use of the standard deviation of the mean as a measure of long
term variability is not correct in estimating the uncertainty sought for. If the data
available have covered not 11 months, as in the example, but say 22 months, the long-term
contribution would be √2 times smaller following this way of thinking. The available monthly
check weights data (Table A3.1(3) [5]) give a standard deviation of 0.026 g, and this value
alone characterizes the long-term variability of a single weighing. At the same time, Table
A3.1(1) [5] shows repeat weighings / replicate readings results that may be very useful for
detailed examination of the precision of weighing. Application of one-way analysis of
variance according to a standard scheme [9] allows us to estimate separate components of
total weighing variation, inasmuch as the standard deviation of 0.03 g mentioned above was
evidently obtained by treating the data as one large sample, without a separation.
So, a replicate readings standard deviation is found to be 0.020 g (with the number of
degrees of freedom f being equal to 36). One can easily prove by means of the F test that the
long term standard deviation of 0.026 g (f = 10) does not differ significantly from this
estimate, even at the 10% level. This means here that

time is not a factor at all, so that a contribution due to long-term variability need not be
considered. The analysis further leads to a repeat weighings standard deviation of roughly
0.05 g (f = 11) for a single weighing. This estimate is significantly greater (F test) than
0.020 g, the standard deviation for replicate readings, and hence there is some kind of
additional source of variation in weighing in the laboratory. Whatever the source may be, it
is the value 0.05 g which should be taken as an actual contribution of weighing to the
combined standard uncertainty required.

And one more remark on the subject

One of the most important points relating to the evaluation of uncertainty is that all
relevant error sources should be taken into account, with the corresponding contributions
being combined. The estimate so obtained quantifies the overall uncertainty inherent in the
analytical procedure at issue. Apart from the meaningful reporting and interpretation of an
analytical result, the overall uncertainties are applicable, for instance, for quality
control purposes when reference materials are used.
There are, however, many cases in analytical practice in which the overall uncertainty
estimates seem inappropriate to be handled and thus unnecessary. Suppose, for instance, one
has to compare two articles of ceramic ware with respect to cadmium release according to
BS 6478 (see Example 2 of the Guide). The experiment is carried out in such a way that the
tested vessels filled with the same leaching solution are allowed to stand during the same
period of time at the same temperature (both measured with reasonable accuracy), and the two
extract solutions obtained are analyzed by AAS using the same bracketing reference solutions.
The question is whether the two samples differ from each other with respect to the test or
not, or, in terms of statistics, whether the difference between the two measurement results
is significant against the background of their own variabilities.
Appropriate conformity criteria based on Bayesian theory as well as those of the usual
statistics can be applied to the problem of comparison of two measurement results, taking
into account their uncertainties [10], regardless of the fact that the overall uncertainties,
as calculated in Example 2, would be unsuitable to solve the problem correctly with respect
to the comparative experiment. They are "excessive" for this. Clearly, a number of error
components caused in particular by deviations of actual experimental conditions from nominal
ones are the same for the two results to be compared, and the corresponding contributions
vanish in the uncertainty budget for the difference. Thus, if only an internal consistency of
results, not absolute trueness, is of interest, the influence quantities which are not
variable in the scope of such a comparative trial may be disregarded, with the overall
uncertainty being reduced to that suited to the particular conditions and referred to as a
conditional uncertainty.
The possibility of such an approach, albeit with respect to the "top-down" method of dealing
with uncertainty, was noticed in [1]. The term "conditional uncertainty" or a similar one is
likely to gain currency in analytical data treatment, since a considerable part of everyday
tasks in analytical laboratories only requires such an internal consistency of results. It
should not be regarded as a "loophole" in order to reduce an uncertainty that may otherwise
be too large. It is to be reckoned rather as an instance of applying the fitness-for-purpose
principle. The notion of fitness for purpose is apparently quite applicable to uncertainty
estimates as well as to data produced by the measurement process itself.
In conclusion, it would be relevant to cite a very true and profound passage (item 3.4.6)
from the ISO Guide [6], which was fully carried over to the EURACHEM document (item 5.4.16):
"The evaluation of uncertainty is neither a routine task nor a purely mathematical one; it
depends on detailed knowledge of the nature of the measurand and of the measurement method
and procedure used. The quality and utility of the uncertainty quoted for the result of a
measurement therefore ultimately depends on the understanding, critical analysis, and
integrity of those who contribute to the assignment of its value."
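The arithmetic behind the Example 3 remarks under Rules 2 and 3 can be retraced with a short Python sketch. It rests on assumptions that should be kept in mind: the stage numbering and factor values are taken as they appear in Fig. 3 (summary factors 1.10 at stage 3 and 1.06 at stage 5, an individual factor 0.96 at stage 6, all other stages equal to 1), and all variable names are illustrative. Critical values for the F tests are not computed here; they would have to be taken from tables or a statistics library.

```python
from math import prod, sqrt

# --- Rule 2: combined correction factor, two equivalent routes ---------------
F_SUMMARY_3 = 1.10   # summary factor at stage 3 (includes all later stages)
F_SUMMARY_5 = 1.06   # summary factor at stage 5 (includes all later stages)
F_6 = 0.96           # individual factor at stage 6; other stages are 1

# Route 1: recover individual factors by dividing out the following stages,
# then multiply over all nine stages.
F_5 = F_SUMMARY_5 / F_6
F_3 = F_SUMMARY_3 / F_SUMMARY_5   # stage 4 contributes a factor of 1
f_c_individual = prod([1.0, 1.00, F_3, 1.0, F_5, F_6, 1.0, 1.0, 1.0])

# Route 2: stop the product at the first summary factor.
f_c_summary = prod([1.0, 1.00, F_SUMMARY_3])

# --- Rule 3: weighing contributions (values in g) ----------------------------
S_REPEATABILITY = 0.03      # pooled "repeatability experiments"
S_LONG_TERM = 0.026         # monthly check weights, 11 months
S_REPLICATE = 0.020         # replicate readings, f = 36
S_REPEAT_WEIGHINGS = 0.05   # repeat weighings, f = 11

s_mean_11 = S_LONG_TERM / sqrt(11)   # standard deviation of the mean
s_mean_22 = S_LONG_TERM / sqrt(22)   # same spread, twice as many months
u_as_in_guide = sqrt(S_REPEATABILITY**2 + s_mean_11**2)

# Variance ratios to be compared with tabulated F quantiles.
f_long_term = (S_LONG_TERM / S_REPLICATE) ** 2        # f = (10, 36)
f_repeat = (S_REPEAT_WEIGHINGS / S_REPLICATE) ** 2    # f = (11, 36)

if __name__ == "__main__":
    print(f"F_3 = {F_3:.2f}, F_5 = {F_5:.2f}")
    print(f"F_c (individual) = {f_c_individual:.2f}, F_c (summary) = {f_c_summary:.2f}")
    print(f"s of mean: 11 months {s_mean_11:.4f} g, 22 months {s_mean_22:.4f} g")
    print(f"combined as in the example: {u_as_in_guide:.3f} g")
    print(f"F (long term / replicate) = {f_long_term:.2f}")
    print(f"F (repeat / replicate)    = {f_repeat:.2f}")
```

Both routes give the same combined correction, and doubling the number of months shrinks the standard deviation of the mean by √2 although the spread of a single weighing is unchanged, which is the objection raised in the text.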

References

1. Analytical Methods Committee, RSC (1995) Analyst 120:2303-2308
2. Cortez L (1995) Microchim Acta 119:323-328
3. Williams A (1996) Accred Qual Assur 1:14-17
4. Wegscheider W, Zeiler H-J, Heindl R, Mosser J (1997) Annal Chim 87:273-283
5. EURACHEM (1995) Quantifying Uncertainty in Analytical Measurement
6. ISO, IEC, OIML, BIPM (1992) Guide to the Expression of Uncertainty in Measurement. 1st edn.
   ISO (The 1993 edition in the name of the seven organizations including IFCC, IUPAC, IUPAP
   is also available)
7. ISO 384 (1979) Laboratory glassware. Principles of design and construction of volumetric
   glassware
8. ISO 4787 (1984) Laboratory glassware. Volumetric glassware. Methods for use and testing of
   capacity
9. Doerffel K (1990) Statistik in der analytischen Chemie, 5th edn, chap 8. Deutscher Verlag
   für Grundstoffindustrie, Leipzig
10. Weise K, Wöger W (1994) Meas Sci Technol 5:879-882
Accred Qual Assur (1998) 3:14-19
© Springer-Verlag 1998

Angel Rios
Miguel Valcárcel

A view of uncertainty at the bench analytical level

Abstract The problem with which analytical laboratories are confronted, after traceability of
their results has been demonstrated, is correctly estimating their uncertainty - to which
traceability is also to some extent subject. While the general principles for calculating the
uncertainty of physical measurements are applicable to chemical metrology, some refinements
are needed, especially careful selection and planning the level at which uncertainty will be
estimated by each laboratory in accordance with its capacity and required demands. Depending
on the particular decision to be made, the mechanism to be used to estimate the uncertainty
varies markedly; also, the rigour of the estimation increases with increasing stringency of
the demands. This paper describes the primary sources of uncertainty in chemical metrology
and discusses different approaches to its estimation in relation to the type of analytical
laboratory concerned. The view presented tries to be close to the bench analytical level, in
order to be practical and flexible for laboratories, although it could sometimes be
considered slightly heterodox.

Key words Uncertainty · Quality Assurance · Chemical measurements · Metrology

A. Rios (✉) · M. Valcárcel
Department of Analytical Chemistry, Faculty of Sciences, University of Córdoba,
E-14004 Córdoba, Spain

Introduction

Traceability and uncertainty have become two major paradigms for quality systems in testing
laboratories. While traceability must be demonstrated (it cannot be represented by a
mathematical figure), uncertainty has to be calculated. The ease with which these demands can
be met varies with the type of laboratory and where the metrological principles are applied
(physical, chemical or biological field). Broadly speaking, metrology in the physical field
poses no serious problems at present as regards traceability of its measurements; rather, its
chief concern is to decrease the uncertainty of measurements. By contrast, traceability is
the principal concern of biological metrology owing to a shortage of reference materials;
uncertainty at the biological laboratory is typically very high and still regarded as
secondary, as acknowledged by the Accreditation Guide for Microbiological Laboratories, which
is accepted by EAL (European cooperation for Accreditation of Laboratories). Metrology in
chemical measurements lies between these two, though possibly closer to physical metrology.
While a number of reference materials and certified reference materials are available, they
are clearly inadequate to meet the needs and frequently entail using an alternative route to
traceability (e.g. comparisons with validated methods or interlaboratory exercises). Also,
the preliminary operations of the analytical process occasionally escape the control required
to assure traceability. However difficult, traceability can be demon-

strated, supported and documented in most instances. what is measured (i.e. the analyte or measurand) but
One must concede, however, that calculating the uncer- also where it is measured (Le. samples and their ma-
tainty of the results obtained by an analytical chemical trices). Obviously, determinations of iron in rocks and
laboratory is currently an issue of great concern for la- human blood are not the same. The "tools" to be used
boratories and also, occasionally, a source of controver- in each case vary, Le. the analytical process that follows
sy between auditors themselves. It is clear that the di- sample collection differs (viz. samples are treated dif-
rect use of metrological principles in analytical labora- ferently and subjected to measuring methods and tech-
tories as they are used in the physical field is pretty use- niques that are dictated by their analytical properties).
less and produces a strong aversion on the part of the The analytical problem addressed, which demands a so-
laboratories, which see the process as artificial and far lution, is obviously not the same either [3]. As a result,
from reality. in chemical metrology the sample (as the physical mate-
This paper is aimed at clarifying the way analytical rialisation of the analytical problem) is the decisive fac-
laboratories should approach the problem and should tor. Whatever chemical measurement is to be made will
adopt a solution consistent with their role and compe- be dictated by the type of sample; also, even if the
tence. An assumption is made that there are several measurand is the same, the standard to be used and the
levels at which uncertainty can be estimated and which way measurements are to be made (viz. the analytical
make up global uncertainty (a more rigorous and valu- technique of choice) can vary markedly. The result of a
able concept). While every laboratory should aim to es- determination will be subject to a global uncertainty ar-
timate this last value, it would be foolish to ignore the ising from three distinct but closely related agents of
fact that most analytical control laboratories - those ac- the analytical process, namely: (a) the measuring in-
credited included - estimate other types of uncertainty strument, (b) the analytical method (sample treatment
that are numerically more accessible; in so doing, they included) and (c) the sampling and sub-sampling proce-
restrict the diversity and intrinsic heterogeneity of the dure. One other distinct feature of chemical metrology
samples they receive. is qualitative analysis, which also requires appropriate
standards and is absent from physical metrology.

Features of metrology in chemical measurements


Sources of uncertainty in chemical metrology
It is important to examine in some detail the essential
The three characteristic steps mentioned above com-
features of chemical metrology and its differences from
prise various sub-steps or basic activities of analytical
physical metrology, where traceability is more imme-
work that are the actual sources of uncertainty in the
diate. These features can help one better understand
measurement process. A brief description of each fol-
the difficulties involved in estimating uncertainty in
lows.
chemical metrology. The essential feature of physical
metrology is its direct association with transfer stand-
ards, which are available for every quantity measured. Uncertainty in the measuring instrument
Also, the measured quantity is always independent of
the sample or object tested (examined). Thus, a length The sources of uncertainty in the measuring instrument
measurement is always referred to the metre, for which are easy to identify and diminish by improving the
a modern definition [1] containing no "artefacts" has equipment itself or using an alternative, more precise
been given - the meter also exists as a physical entity in analytical method. Analytical instruments require
the form of various transfer standards. Whether it is maintenance, calibration (calibration of the equipment)
used to measure the length of a table, the height of a and standardisation (analytical calibration) of their re-
tree or the distance between two objects, the property sponse. Maintenance has virtually no effect on uncer-
actually measured is independent of the nature of the tainty in the absence of underlying malfunctioning. On
sample or the body of the experimental object. In this the other hand, calibration and, especially, standardisa-
type of metrology, uncertainty is a factor (feature) in- tion, can be a source of high uncertainty. Calibration is
troduced by the measuring equipment rather than by in fact a typical activity of physical metrology and is in-
the sample itself. Hence the chief concern of physical tended to ensure that the instrument will be in perfect
metrologists is to establish the uncertainty of measuring condition to make the measurements for which it has
equipment, which is associated with the result for the been designed and constructed. It is done by using
sample or object examined (whether as such or as a transfer standards and produces an uncertainty inher-
combination of several uncertainties). ent in the measured quantity (i.e. intrinsic to the equip-
The picture in chemical metrology is rather different. ment) and independent of the measured parameter.
This type of metrology is chiefly associated with analy- The calibration of an analytical balance is a good exam-
tical chemical standards [2]; also, the essence is not only ple; however, any other laboratory instrument is tied to
154 A. Rios' M. Valcarcel

this activity (e.g. the use of holmium or didymium filters to calibrate wavelengths in UV-visible spectrophotometers).

The standardisation of an instrument's response - commonly referred to as "calibration", as in "calibration curve", despite the fact that calibration is a different activity - is a purely analytical activity typical of chemical metrology. It affects analytical instruments only and defines their response to the measurand(s) to be measured. Standardisation is crucial to ensuring traceability in the results subsequently obtained. One must make several decisions and take several steps to reach this goal, namely: (a) select an appropriate standard, (b) choose a suitable standardisation method, (c) derive a mathematical relation between the analytical signal and concentration, and (d) validate the model established. There are some manuals and literature references of help in this context, particularly those with a chemometric slant [4, 5]. It is worth emphasising the significance of validating the experimental model in terms of quality. Validation of a model involves experimentally confirming that it is a correct simplification of the series of experimental points it contains in such a way that it can accurately predict future unknown values (in the samples to be analysed). Univariate linear calibration is usually done by using least-squares regression, which involves checking fulfilment of various statistical hypotheses; alternatively, residual analysis can be used to check for homoscedasticity. However, one must also determine how closely the model fits experimental points by using analysis of variance (ANOVA) in order to confirm whether a different, more precise type of fitting is needed. The validation process also involves determining the confidence region for the model, its sensitivity and its valid lower limit (represented by the limit of determination).

Validating the standardisation model allows one to ensure traceability in the results produced by inverse interpolation in the analysis of samples; however, the model will obviously be subject to an uncertainty $u_3$ that will be a function of the standard error, $s_{y/x}$; this latter characterises the standard deviation associated with the mathematical definition for the regression line. This is a standard uncertainty essentially subject to the model's random errors, which arise from variations in signal measurements (the dependent variable, $y$). Because the uncertainties in $x_i$ values are small relative to the previous ones, they can usually be neglected. In any case, calculating this type of uncertainty poses no special problem.

Uncertainty in the analytical method

Analytical methods are currently divided according to traceability into primary methods (formerly designated "absolute" and "stoichiometric" methods) and secondary methods, which involve a longer traceability chain and are commonly referred to as "relative" or "comparative" methods. One prominent part of relative analytical methods is the standardisation of the instrument's response, described in the previous section. However, a relative method also comprises other steps that are collectively designated the preliminary operations of the analytical process. In fact, these operations significantly complicate chemical metrology as they are varied and difficult to control and reproduce in a systematic manner [6]. They are thus the source of major errors not only of the random but also of the systematic type that have a decisive influence on uncertainty.

Method validation is thus a central activity in laboratory quality systems inasmuch as it assesses adherence of the laboratory to its quality policy. The validation process is closely related to representativeness of the results [7], which depends on the analytical objectives and types of sample. Table 1 shows the basic landmarks of the process. No doubt, demonstrating that the results obtained are accurate is essential proof and an unavoidable requisite. In addition to meeting other objectives, including compatibility with the sample matrix and adequate robustness for use in routine work, one must estimate the degree of uncertainty associated with the results produced by a given method.

Table 1 The process involved in validating an analytical method

1. Checking fitness to the analytical problem:
   - Choosing a suitable method (to be subsequently confirmed or rejected by validation)
2. Preliminary study:
   - Clear, detailed description
   - Checking fitness to the analytical goal via
     • Compatibility with the sample matrix (applicability to real samples)
     • Limit of detection
     • Determination range
     • Selectivity
   - Robustness tests
3. Experimentally demonstrating that the system is "under statistical control" (by means of control graphs):
   - The means of measured values should remain constant over long periods (at high and low analyte concentrations)
   - The precision should be adequate and constant
4. Demonstrating accuracy:
   - Recovery tests
   - Comparison with an independent, previously validated method
   - Comparison with CRMs
   - Interlaboratory studies
5. Compiling the SOP after the method has been validated

The primary source of uncertainty lies in the preliminary operations required to treat samples. Such operations as digestion/disaggregation of solid samples or extraction and clean-up processes (fairly frequent) introduce significant "uncontrolled" sources of error that contribute to uncertainty. Type A evaluation (viz. uncertainty that can be experimentally evaluated from the statistical distribution of the results from a series of measurements) and type B evaluation (uncertainty evaluated from assumed probability distributions based on experience or other information) uncertainties can be used to estimate the global uncertainty (as the ISO recognises) of the analytical method concerned, denoted by $u_2$. Recovery tests must be conducted very carefully if their uncertainty is to be correctly estimated. Thus, the analyte must be added in the same chemical form as it is likely to be present in the samples; also, the spiked sample must be thoroughly homogenised and additions must include variable amounts of analyte. Finally, the recoveries must be evaluated in statistical terms (usually by regression analysis).

The possibility of evaluating uncertainty under a type B approach in this analytical step is a distinct feature of chemical metrology that entails obtaining additional information (mainly about the preliminary operations involved in the method used) frequently obtained outside the laboratory (from the analytical literature or other laboratories). This type of evaluation is closely related to the variety of samples where the analyte can occur or to the fact that sometimes it is not a single analyte but a group of ill-defined individual analytes that are to be determined (e.g. bitter compounds in beer or the hydrocarbon index in waters).

Uncertainty in sampling and sub-sampling

It is widely admitted that sampling poses special problems and influences the representativeness of the results. The portions extracted from a sample for analysis should contain essentially the same information as the population or system studied as a whole. Unsurprisingly, this activity has been the subject of abundant literature [8-10]. The sampling strategy can be suited to the analytical problem addressed by using four different types of approach, viz. intuitive or judgmental, random, systematic and protocol-based. This is therefore the first decision with which one is confronted. The sampling manual describes in detail the conditions, equipment and procedures used in this step. In such a widely variable activity, it is of the utmost importance to develop clear, well-documented protocols in order to release operators from the need to improvise or undertake responsibilities beyond their qualification. Even if these cautions are exercised, there remain the enormous variability of samples and their equally variable representativeness of the problem addressed.

Equally important is sub-sampling, which involves withdrawing aliquots from previously collected samples for subjection to the analytical process at the laboratory. This step is also documented in quality schedules; however, the process is complicated by heterogeneity in most samples - particularly solid samples - and introduces appreciable variability between sub-samples that ultimately leads to significant differences between the results obtained for the same sample. No doubt, sampling and sub-sampling will demand greater interest in the future; such standards as EN 45000 and similar ones should provide a more systematic and extensive description of the minimum requirements in this respect, engage laboratories in these activities and encourage the release of sampling guides for specific fields. This will decisively increase the quality of results and connect them to the real world - from which samples ultimately come - rather than only to the portion that reaches the laboratory. Simultaneously, the bodies and institutions concerned with or responsible for quality on a national or international scale should promote efforts in the direction pointed out by Thompson and Ramsey [11] in order to develop reference sampling targets (RST), analogue sampling of an RM or CRM, collaborative trials in sampling, and the possibility to organise future proficiency testing in sampling.

The uncertainty produced by these steps, denoted by $u_1$, is undoubtedly very high and exceeds that resulting from the previous two steps. Also, its estimation is rather complex in most cases: it entails using vast amounts of information about the samples analysed and their origin to ensure a correct assessment. Obviously, as the tools noted in the previous section become available, this task will be easier and more reliable. Heterogeneity within and between samples is the origin of this problem and one more feature that clearly distinguishes chemical metrology from physical metrology. It confirms that the primary target of chemical metrology is the sample rather than the equipment used in the analytical process, which is also important but only secondarily. There is thus a highly significant underlying problem awaiting solution in order to assure quality of chemical measurements: sampling and sub-sampling. Under the influence of our fellow engineers and physicists, who deserve due credit for establishing metrological principles and starting and systematising Quality Assurance systems, we have placed too much emphasis on measuring equipment to the detriment of our true goal and primary source of variability: samples.

Estimating uncertainty

The "Guide to the Expression of Uncertainty in Measurement", published jointly by the ISO and other bodies [12], sets general rules for assessing and expressing uncertainty, and applying them to chemical metrology. Uncertainty is assigned various sources including the following: an incorrect definition of the measurand,

sampling, incomplete extraction or preconcentration of the measurand, matrix and interfering effects, carry-over during sampling or sample preparation, unknown effects of the environmental conditions on the sample, instrument bias, tolerances of weights and volumetric material, reagent purity, values assigned to standards and reference materials, calibration, etc. A document released by EURACHEM [13] refines these principles as applied to chemical measurements and establishes the steps to estimating uncertainties; it shows how to express them properly and - especially useful - provides examples of variable complexity. Although the document can be seen as very academic and not very close to the bench analytical level, it confirms the significance of the early steps of the analytical process (sampling, sub-sampling and sample treatment) in the overall process with a view to estimating uncertainty, its high contribution to global uncertainty and the difficulty involved in its evaluation. It is not an exaggeration to state that the mere reading of these examples can "frighten" laboratories, most of which are bound to feel unable to calculate their uncertainty in the proposed ways. Where does the problem lie? Is this way of managing things too demanding? The answer, as almost always, is to rationalise computations by simplifying or discarding those steps whose uncertainty is known to be small relative to other key steps with a decisive influence on global uncertainty. While this can facilitate estimating uncertainty, it does not solve the actual problem.

As is typical of quality systems, the goals of an analytical laboratory are established in its Quality Policy. Because estimating uncertainty and demonstrating traceability are two paradigms for these systems, as noted above, it seems obvious that, depending on the competence of the laboratory concerned, it should answer this preliminary question: What type of uncertainty is to be calculated? In fact, total uncertainty is the combined contribution of the three above-described sources of uncertainty (denoted by $u_1$, $u_2$ and $u_3$) and the most real and demanding of all. As a rule, $u_1 > u_2 > u_3$, but we know that $u_1$ is very difficult and laborious to estimate reliably. It is seemingly clear that these laboratories do not feel compelled to estimate $u_1$, so they oversimplify the problem; there is the question, however, as to how useful the information they provide can be. Even if the laboratory is not concerned with sampling, if the samples it receives are variegated (e.g. those to be used in determining pesticide residues in fruits of different kinds) and essentially heterogeneous, it will be in the position depicted graphically in Fig. 1.

[Fig. 1 diagram: a CRM and the real samples $S_1$, $S_2$, $S_3$, ... pass through the analytical process with replicates; the results for each sample carry uncertainties $U_{S1}$, $U_{S2}$, $U_{S3}$, ..., and the variances combine as $U_{\mathrm{total}}^2 = U_{\mathrm{sample}}^2 + U_{\mathrm{process}}^2$]

Fig. 1 General process for calculating uncertainties in chemical metrology. The total variance ($U_{\mathrm{total}}^2$) is the summation of the variances resulting from sample diversity and heterogeneity ($U_{\mathrm{sample}}^2$) and that stemming from the analytical process ($U_{\mathrm{process}}^2$). CRM certified reference material, $S_i$ sample i, U uncertainty

Figure 1 can be viewed as a general case that only excludes the variability of sampling carried out outside the laboratory. It represents the general origin of the uncertainties that affect the results delivered by the laboratory. The horizontal direction of the figure represents the specific analytical process that produces the results. If a certified reference material (CRM) or, failing this, a control sample suited to the type of analysis to be performed is available, one can not only check whether the results are traceable, but also estimate the uncertainty of the analytical process ($U_{\mathrm{process}}$) by using an appropriate number of replicates (under reproducibility conditions). This is essentially an uncertainty estimated as type A, that is, obtained with the same material (a CRM). The vertical direction of Fig. 1 represents the different real samples that reach the laboratory in various degrees of heterogeneity. The total uncertainty ($U_{\mathrm{total}}$) will be a combination of $U_{\mathrm{process}}$ and the uncertainties introduced by the variability of the samples processed ($U_{\mathrm{sample}}$), which are sometimes estimated as type B (not experimentally). Obviously $U_{\mathrm{sample}} > U_{\mathrm{process}}$, since, by definition, the CRM used to calculate $U_{\mathrm{process}}$ must be a homogeneous material - and must exhibit some difference from the various types of samples to be analysed. Therefore, these laboratories must estimate $u_1$ from $U_{\mathrm{sample}}$ (one of the principal sources of uncertainty). Estimating total uncertainty from a CRM only is inadequate, especially in control laboratories that analyse no replicates, since each result they produce is subject to the uncertainty associated with the process. Strictly speaking, another additional uncertainty term should be taken into account to calculate $U_{\mathrm{total}}$. This is the uncertainty certified for the CRM (again classified as type B by the ISO). Note that the contributions to the uncertainty calculated as $U_{\mathrm{total}}^2$ are: the uncertainty of the CRM obtained after the certification process (a wide variety of analytical methods having been used and combined); the uncertainty of the analytical process under the "ideal" situation (use of the CRM) used for its validation (by comparison, this is analogous to the variation introduced by the tolerance stated for the internal volume of a volumetric flask); and, finally, the uncertainty introduced by the variety of samples analysed (real samples, basically heterogeneous). Thus, $U_{\mathrm{total}}$ should represent the maximum uncertainty that a particular laboratory could have in its reported results.

Estimating uncertainty under these circumstances is especially complex. The experience of the laboratory concerned (or others) in the analytical process involved and the type of sample processed can be of great assistance as they may allow one to use existing data or plan experiments for different samples or parts of samples in order to derive information on the degree of variability resulting from sample heterogeneity and sample types. Interlaboratory exercises are one other valuable source of information for assigning uncertainties due to sample variability and heterogeneity. Finally, scanning the specialised literature - an essential task for laboratories wishing to sustain their competence - may also provide estimations of variability which, duly justified, can be used to estimate uncertainties in specific cases.

Conclusions

The inherent problems and peculiarities of chemical metrology entail introducing rational adaptations of the general principles established for physical measurements in estimating uncertainties. As W. Horwitz and R. Albert have recently said, "the calculation of uncertainty as recommended for physical measurements cannot be transferred readily to chemical measurements" [14], because both testing fields have entirely different error patterns and bias is difficult to identify and eradicate in chemical systems. In chemical metrology, the sample (the physical entity of the analytical problem to be solved) is the primary target. Also, the greatest source of variability is the sample (its nature, its heterogeneity, the matrix that accommodates it, the forms under which the analytes are present, etc.). Therefore, an analytical result can hardly be representative of the object from which information is to be derived unless it is accompanied by the uncertainty introduced at this early stage of the analytical process. The principal virtue of the metrological concept of uncertainty is that it explicitly encompasses aspects related to the representativeness of results that have traditionally been disregarded in much analytical work. On the other hand, the orthodox step-by-step procedure to estimate the uncertainty in chemical measurements can be replaced by an overall procedure, as presented in this paper, which is simpler, more rational and closer to the bench level. Under this approach, laboratories find it more practical and realistic to calculate their uncertainties by themselves.

Editors note

The above paper reflects the authors' point of view as well as their understanding of uncertainty. Readers are invited to comment and report points of view of other practitioners.
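
As a rough numerical illustration of the variance summation described for Fig. 1, the following Python sketch estimates the process uncertainty as a type A value from replicate results on a CRM and then combines it in quadrature with a type B estimate of the sample contribution and with the certified uncertainty of the CRM. All numbers, the function names and the choice of the replicate standard deviation as the standard uncertainty of a single result are assumptions made for illustration only; none of them comes from the paper.

```python
import math
import statistics


def type_a_standard_uncertainty(replicates):
    """Type A estimate: sample standard deviation of replicate results.

    Assumption: the laboratory reports single (unreplicated) results, so the
    standard deviation itself, not the standard error of the mean, is used.
    """
    return statistics.stdev(replicates)


def combine_in_quadrature(*components):
    """Combine independent standard uncertainties as a root sum of squares."""
    return math.sqrt(sum(u * u for u in components))


# Hypothetical replicate results for a CRM (mg/kg), reproducibility conditions.
crm_replicates = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0]

u_process = type_a_standard_uncertainty(crm_replicates)  # type A, from the CRM
u_sample = 0.60  # hypothetical type B estimate of sample heterogeneity (mg/kg)
u_crm = 0.10     # hypothetical certified standard uncertainty of the CRM (mg/kg)

u_total = combine_in_quadrature(u_sample, u_process, u_crm)

# Fraction of the total variance that is due to the samples themselves.
sample_share = u_sample ** 2 / u_total ** 2

print(f"u_process = {u_process:.3f} mg/kg")
print(f"u_total   = {u_total:.3f} mg/kg")
print(f"share of variance due to samples = {sample_share:.0%}")
```

With these assumed values the sample term accounts for most of the total variance, which is the situation the authors argue is typical of chemical measurements; a laboratory would replace the hypothetical figures with its own replicate data and justified type B estimates.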

References

1. XVII General Conference on Weights and Measurements (1983)
2. Valcarcel M, Rios A (1995) Analyst 120:2291-2297
3. Valcarcel M, Rios A (1997) Trends Anal Chem 16:385-393
4. Massart DL, Vandeginste BGM, Deming SN, Michotte Y, Kaufman L (1988) Chemometrics: a textbook. Elsevier, Amsterdam, pp 75-92
5. Miller JC, Miller JN (1993) Statistics for analytical chemistry (3rd edn), chap 5. Ellis Horwood, New York
6. Valcarcel M, Luque de Castro MD, Tena MT (1993) Anal Proc 30:276-280
7. Rios A, Valcarcel M (1994) Analyst 119:109-112
8. Crosby T, Patel Y (1995) General principles of good sampling practice. Royal Society of Chemistry, London
9. Smith R, James GV (1981) The sampling of bulk materials. Royal Society of Chemistry, London
10. Gy PM (1995) Trends Anal Chem 14:67-76
11. Thompson M, Ramsey MH (1995) Analyst 120:261-270
12. Guide to the Expression of Uncertainty in Measurement (1995) ISO, Geneva, Switzerland
13. Quantifying Uncertainty in Analytical Measurement, version 6 (1995) EURACHEM
14. Horwitz W, Albert R (1997) Analyst 122:615-617
Accred Qual Assur (1998) 3:117-121
© Springer-Verlag 1998

Michael Thompson

Uncertainty of sampling in chemical analysis

Abstract Uncertainty of sampling is the contribution from sampling errors to the combined uncertainty associated with an analytical measurement when the measurand is the concentration of the analyte in the 'target', the total bulk of material that the sample is meant to represent. Of the errors considered to contribute to uncertainty, random errors of sampling, characterised by precision, are much more accessible to investigation than those due to bias. Where an approximation to random sampling can be achieved, realistic precisions can normally be estimated. In some instances reproducibility precision is significantly greater than repeatability precision, and the contribution of between-sampler variations to sampling uncertainty must be acknowledged. However, the collaborative trial of a sampling method is an expensive and difficult exercise to execute. A system of internal quality control for routine sampling can be introduced. Fitness for purpose has been defined in terms of the required combined uncertainty of sampling and analysis.

Key words Uncertainty · Sampling · Fitness for purpose · Collaborative trial · Internal quality control

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29-30 September 1997

M. Thompson
Department of Chemistry, Birkbeck College, Gordon House, 29 Gordon Square, London WC1H 0PP, UK
Tel.: +44-171-3807409; Fax: +44-171-31)07404;
email: m.thompson@chem.bbk.ac.uk

Definitions of terms in this paper

Uncertainty of sampling: contribution to the combined uncertainty of an analytical measurement that results from the production of the laboratory sample from the sampling target.

Sampling target: mass of material that the laboratory sample is designed to represent.

Fitness for purpose: property of the result of a measurement when its associated uncertainty minimises a cost function comprising all terms that are functions of the uncertainty.

Introduction

In analytical science, measurements are not usually made on the whole amount of the material of interest (here called the 'target'), but on a much smaller amount, the sample, which is selected from the target in some manner. As a consequence, metrologists in chemistry have hitherto concentrated on the analytical measurement process in isolation. As that process involves estimating the concentration of an analyte in the laboratory sample, the uncertainty of measurement for the analyst refers to that specific measurand. For the end-user of the data, however, the measurand of interest is the concentration of the analyte in the target. Hence

the uncertainty relevant to the end-user should include the uncertainty contribution that is introduced by preparing the laboratory sample from the target.

Samples taken from the same target will usually vary in composition, from each other and from the average composition of the target, partly because of the heterogeneity of the target but also because of shortcomings in the sampling procedure such as contamination, loss of analyte or use of an incorrect sampling procedure [1]. Hence, to obtain the uncertainty ($u_t$) associated with the target measurand, the uncertainty of the 'pure analytical measurement' must be augmented by a contribution from the sampling, so that we have:

$$u_t = \sqrt{u_a^2 + u_s^2}$$

where $u_a$ is the standard uncertainty of 'pure measurement' and $u_s$ is the standard uncertainty resulting from errors in sampling. It is stressed here that sampling uncertainty characterises only those errors made during the process of producing the laboratory sample. Errors introduced during the selection and weighing of a test portion from the laboratory sample are subsumed into the measurement uncertainty.

In most sectors of analytical science sampling procedures regarded as 'best practice' or 'fit-for-purpose' have been developed. Usually, however, we have very little information on the performance of such procedures, because the validation of sampling is far less developed than that of analysis. In such cases the uncertainty of sampling usually needs to be estimated ab initio. As such estimation can present considerable practical difficulties, it is currently attempted in only a limited number of sectors of analytical practice.

Sampling errors can be quantified only after analysis of the samples, so the results of ordinary measurements carry both sampling and analytical errors. As a consequence, results used to estimate sampling uncertainty must usually be obtained from designed experiments (with replication and randomisation) for interpretation by anova (analysis of variance) methods.

Sampling precision and sampling bias

The estimation of sampling precision in the manner to be described is analogous to the estimation of precision in the validation of an analytical method. Hence we can think in terms of characterising a particular sampling strategy or protocol for use with a particular type and size of target.

If a number of samples are extracted from a target by repeated application of a sampling protocol, the variation in true concentration among the samples is characterised for a particular analyte by the sampling variance $\sigma_s^2$. A single analysis on a sample will have a variation characterised by $\sigma_s^2 + \sigma_a^2$, where $\sigma_a^2$ is the analytical variance. An experimental design for estimating $\sigma_s^2$ and $\sigma_a^2$ is shown in Fig. 1. A reasonably reliable estimate of $\sigma_s^2$ can be made by anova if $\sigma_s > 3\sigma_a$ (i.e., the analytical precision must be somewhat better than the sampling precision) and if there are a sufficient number of replicate samples (i.e., more than ten). If the mean squares between and within samples from the anova are designated MSB and MSW respectively, the estimate $\hat{\sigma}_s^2$ of the sampling variance is given by

$$\hat{\sigma}_s^2 = (\mathrm{MSB} - \mathrm{MSW})/n$$

where $n = 2$ for duplicate analyses.

[Fig. 1 diagram: a single SAMPLING TARGET yields SAMPLE 1, SAMPLE 2, ..., SAMPLE m; each sample i is analysed twice, giving ANALYSIS i·1 and ANALYSIS i·2]

Fig. 1 Design for replicate sampling and analysis of a single sampling target, for the estimation of sampling and analytical precisions

Another design for estimating $\sigma_s^2$ is shown in Fig. 2. In this design a number of distinct targets similar in composition are each sampled in duplicate and each sample analysed in duplicate. This design gives a more rugged estimate of sampling precision than the example discussed above, as it avoids reliance on a single target that might turn out to be atypical of the material as a whole. In this design the sampling precision is 'averaged' over a number of targets.

For the result of such an experiment to be useful, the replicated samples must be taken independently and at random. Obviously a single sampling protocol will be used, but its implementation must be randomised. For example, if the protocol specifies that the sam-
160 M. Thompson

ANALYSIS 1·1·1 see how bias could arise in sampling methodology. For
SAMPLE 1·1 example, the samples could be consistently contami-
ANALYSIS )·)·2 nated by the sampling tools such as containers or grind-
TARGET I ing devices (example - rock chips from a borehole con-
ANALYS)S )·2·) taminated by chromium from the drill bit). Alternative-
SAMPLE 1·2 ly, some of the analyte could be consistently lost from
ANALYSIS )·2·2 the samples because of inappropriate handing (exam-
ple - loss of elemental mercury from a rock sample dur-
ANALYSIS 2·)·) ing grinding). A sample can be biased if the sampler
SAMPLE 2·1 misunderstands the protocol (example - sampler col-
ANALYSIS 2·1·2 lects a-horizon soil contaminated with b-horizon soil).
TARGET 2
Finally a sample can be biased if the sampler is selec-
ANALYSIS 2·2·1 tive (instead of random) in the selection of increments
SAMPLE 2·2
• that form the aggregate sample (example - always pick-
ANALYSIS 2·2·2
ing up large pieces of the target materials rather than
small pieces).
• It is often possible to avoid these biases once they
are known. All too often it is very difficult to establish
• the existence of such a bias in sampling for want of (a)
a reference sampling method for comparison, or (b) the
will to carry out the comparison, or because sampling
• ANALYSIS m·)·)
precision is too large to allow the existing bias to be
SAMPLE m·1
ANALYSIS m·)·2
demonstrated at a significant level. In such cases it is
TARGET m usual to regard the sampling method as 'empirical' (by
ANALYSIS m ·2·1
analogy with the empirical analytical method, where
SAMPLE m·2 the result is dependent on the analytical method). An
ANALYSIS m·2·2 empirical sampling method would have zero bias by
definition. In the absence of any readily available infor-
Fig.2 Design for duplicate sampling and analysis of a number m mation on sampling bias it is best in most circumstances
of similar sampling targets, for the estimation of sampling and to calculate combined uncertainty estimates only from
analytical precisions precision contributions.

pIe should be made up of a number of increments tak- Internal quality control in sampling
en from the target in a specific two-dimensional pat-
tern, the replicate samples should be obtained by relo- Given estimates of Us and U a for a particular type of
cating the origin and orientation of the pattern at ran- material and a fixed sampling protocol we can consider
dom for each sample. Any deviation from randomness the application of an internal quality control method to
would tend to give rise to an underestimate of sampling ensure that the measurement system, including sam-
precision. In practice the extraction of a random sam- pling, stays in statistical control. Let us assume that
ple from particular targets may be difficult or imprac- some or all of the successive routine sampling targets
ticable. Samplers must do their best under prevailing are sampled in duplicate and each of the two samples is
circumstances. analysed once, to give two values (Xh X2) for each tar-
In all of these experiments, the set of samples col- get. For monitoring this data, a control chart, based on
lected should be analysed in a random order under re- N (0, ~ = 2( ~ + uD), could be constructed for the dif-
peatability conditions. This strategy avoids confusing ference d =x 1 - X2 between the two values. As usual the
analytical problems like drifts with genuine differences warning limits should be at ± 2u and the action limits at
between the samples. ± 3u. That would be useful as a control on sampling as
Bias in sampling is a difficult topic. Some experts on long as 0:, was the dominant precision term. An out-
sampling argue that sampling bias is not a meaningful of-control situation could arise if, for example, the tar-
concept: sampling is either 'correct' or 'incorrect' [2]. get were more heterogeneous than usual, or if an inad-
Certainly sampling bias may well be difficult to detect equate number of increments were combined to form
or estimate, because we need an alternative sampling the aggregate sample. In contrast to analytical quality
technique or protocol, regarded as a reference point, control, there exists an extra possibility, namely that
with which to compare our method under test, before the two samples could be too similar. That could arise if
we can claim that bias is present. However, it is easy to the two samples were not independent, for instance, in
Uncertainty of sampling in chemical analysis 161

an extreme case, if both laboratory samples were splits from a single aggregate sample. A possible approach to monitoring too great a similarity would be to set up additional control lines at (say) $\pm 0.5\sigma$. Four successive points within these inner lines would occur only rarely under statistical control and suggest problems with sampling.

The collaborative trial in sampling

Until recently sampling precision has been treated implicitly as if it were independent of conditions under which sampling is executed, so that repeatability conditions should suffice for its estimation. However, it is well recognised that in analytical measurement precisions measured under reproducibility conditions (one method, one material, different laboratories) are greater than repeatability precisions (one analyst, one material, one method, one instrument, short time period) by a factor of about two [3], i.e., $\sigma_R/\sigma_r \approx 2$. Consequently it is worth enquiring whether, for sampling, using the same protocol and test material, reproducibility precisions are greater than repeatability precisions to any important degree. Such a finding would have considerable implications for the correct estimation of sampling uncertainty.

The established method of considering reproducibility precisions in analysis is the collaborative trial (method performance study) [4]. Applied to sampling, the collaborative trial would require a number of samplers each to take independent duplicate samples from a target, at random, using a fixed sampling protocol [5]. If all of the samples were then analysed in duplicate, together under randomised repeatability conditions, then hierarchical anova could be used to decide whether either within-sampler or between-sampler precision had reached statistically significant levels. If that were so, the sampling precision under repeatability and reproducibility conditions could be estimated from the mean squares.

Very little experimentation has been conducted along those lines so far. The findings are suggestive, but insufficient work has been done for general conclusions. Studies on sampling contaminated land [5] showed contrasting results for different analytes. At the sites investigated the major contaminant (lead) was spatially distributed in a very heterogeneous manner. As a result, the within-sampler precision was so large (RSD ≈ 30%) that no significant between-sampler variation could be detected for this element. In contrast, for elements present at near the background levels (i.e., present because of natural processes rather than contamination and therefore not wildly heterogeneous), significant levels of between-sampler precision were found. Values of the ratio $\sigma_R/\sigma_r$ found for sampling were ≈ 2, close to the value found in analytical collaborative trials. This finding suggests that in some instances reproducibility precisions might be most appropriate for estimating sampling uncertainty. At the other extreme, if the target comprises material that is nearly homogeneous in the analyte, it may be impossible to find significant between-sampler or within-sampler variation, because they are both small relative to the analytical variation. This situation has been found in a collaborative trial in sampling wheat (unpublished data).

The lack of information in this area is hardly surprising. It is sometimes an unpopular activity, often technically difficult, and always expensive to organise a collaborative exercise in sampling. The samplers all have to travel to the sampling target(s) and, as they must work independently, they must visit the target in succession. This might delay the shipment of a large amount of a valuable commodity. Often a commodity (especially a packaged material) would be spoilt to an unacceptable degree by multiple sampling. However, there is no doubt that an optimised sampling/analytical system can maximise profits for a manufacturer (see below), and there are industrial instances known to the author where proper attention to sampling errors has resulted in a substantial net gain.

Uncertainty and fitness for purpose

Given that errors are introduced into measurements by both sampling and analysis, we need to consider two questions relating to fitness for purpose, namely: (a) given limited resources, how can we divide them optimally between expenditure on sampling and on analysis; and (b) how can we decide whether a combined uncertainty is adequate for the end-user needs (i.e., is the end-user able to make valid decisions given the uncertainty of the measurements)?

A simple answer to the former question is to consider the relative contributions of sampling and analysis to the combined uncertainty. If sampling uncertainty is the dominant term there is no point in utilising an expensive highly accurate analytical method, because the combined uncertainty will not be usefully improved. For example, if $u_a < 0.2u_s$, then $u_t < \sqrt{u_s^2 + (0.2u_s)^2} = 1.02u_s$, so that $u_t \approx u_s$ regardless of how small $u_a$ becomes. At the other 'extreme', if $u_a > 0.5u_s$, then $u_t$ is dilated to a level substantially greater than $u_s$. Therefore $u_a$ should best fall within the approximate range $0.2u_s$ to $0.5u_s$. The same argument, applied to a dominant analytical uncertainty, produces the corresponding result. These considerations, although informative in themselves, pay no attention to the relative costs of sampling and analysis as functions of the precision obtained.
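The effect of the ratio of analytical to sampling uncertainty on the combined uncertainty can be checked numerically. The following is a minimal sketch, assuming the two contributions are independent and combine in quadrature, as in the example above; the function name and the choice of ratios are illustrative only.

```python
import math


def combined_uncertainty(u_s: float, u_a: float) -> float:
    """Combine sampling (u_s) and analytical (u_a) uncertainties in quadrature."""
    return math.sqrt(u_s**2 + u_a**2)


u_s = 1.0  # sampling uncertainty taken as the unit (illustrative)
for ratio in (0.1, 0.2, 0.5, 1.0):
    u_t = combined_uncertainty(u_s, ratio * u_s)
    print(f"u_a = {ratio:.1f} u_s  ->  u_t = {u_t:.3f} u_s")
```

With $u_a = 0.2u_s$ the combined uncertainty exceeds $u_s$ by about 2%, whereas with $u_a = 0.5u_s$ the excess is about 12%, which is the sense in which the analytical contribution starts to matter.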
162 M. Thompson

To obtain a more realistic picture, including an elementary consideration of costs, we need to examine the apportionment of a fixed financial resource between sampling and analysis [6]. We first consider the cost A of procuring a sample with unit sampling uncertainty. We can achieve an uncertainty of 1/2 by collecting and thoroughly mixing four independent samples, collected using the same protocol, at a cost of 4A. The cost $L_s$ of achieving any uncertainty will therefore generally be

$$L_s = A/u_s^2.$$

The same type of consideration would apply to analysis. We could assume with reasonable confidence that the cost of analysis $L_a$ would be given by

$$L_a = B/u_a^2.$$

Both costs escalate steeply with requirements for decreasing uncertainty. To apportion the costs we need to minimise the total cost $L_t = L_s + L_a$ of the measurement operation for a particular combined uncertainty $u_t = \sqrt{u_s^2 + u_a^2}$. It can be shown [6] that the minimum is defined by

$$u_s/u_a = (A/B)^{1/4}$$

at a total cost of

$$L_t = (\sqrt{A} + \sqrt{B})^2/u_t^2 = D/u_t^2.$$

We now consider the end-user's cost function $L_e$. The exact form of this function would depend on particular circumstances, but would be almost invariably an increasing function of the combined uncertainty (i.e., the bigger the uncertainty in the measurement, the greater the likelihood of an error of judgement based on the measurement). For the sake of a simple example, we take a linear function such as

$$L_e = Q + Ru_t$$

where Q and R are constant costs. We are now in a position to define fitness for purpose in an operational manner: it is the combined uncertainty that minimises the total cost L, which is given by

$$\frac{dL}{du_t} = 0$$

where

$$L = L_e + L_t = Q + Ru_t + D/u_t^2.$$

Concrete applications of this idea have yet to be published, and it is likely that the necessary information would be difficult to obtain in many practical circumstances. However, it provides a useful conceptual framework for defining the relationship between uncertainties of sampling and analysis and fitness for purpose.

Conclusions

Uncertainty of sampling is a topic that has to date received scant attention by metrologists or analytical chemists. The difficulties of studying sampling errors are great, and the cost of such studies may be substantial. Where it has been studied, sampling uncertainty has often been found to be of considerable magnitude, sometimes much greater than the pure measurement uncertainty. Few analytical chemists or end-users of data are aware that only fit-for-purpose sampling uncertainty combined with appropriate analysis will maximise their cost-effectiveness. Analytical chemists should be willing to confront the subject: there is no doubt that it needs thorough investigation, with the appropriate investment of money.
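As a numerical illustration of the cost model in the section on fitness for purpose, the sketch below assumes the cost functions given there ($L_s = A/u_s^2$, $L_a = B/u_a^2$, $L_e = Q + Ru_t$) and independent uncertainties combining in quadrature. The function names and the values of A, B and R are hypothetical, chosen only to make the arithmetic easy to follow. Setting $dL/du_t = 0$ for $L = Q + Ru_t + D/u_t^2$ gives $u_t = (2D/R)^{1/3}$, which does not depend on Q.

```python
import math


def optimal_split(A: float, B: float, u_t: float):
    """Cheapest way to reach a combined uncertainty u_t.

    Assumes L_s = A / u_s**2, L_a = B / u_a**2 and u_t**2 = u_s**2 + u_a**2.
    Returns (u_s, u_a, total measurement cost).
    """
    root_a, root_b = math.sqrt(A), math.sqrt(B)
    var_s = u_t**2 * root_a / (root_a + root_b)  # u_s squared at the optimum
    var_a = u_t**2 * root_b / (root_a + root_b)  # u_a squared at the optimum
    cost = A / var_s + B / var_a                 # equals (sqrt(A) + sqrt(B))**2 / u_t**2
    return math.sqrt(var_s), math.sqrt(var_a), cost


def fit_for_purpose_uncertainty(A: float, B: float, R: float) -> float:
    """u_t minimising L = Q + R*u_t + D/u_t**2, with D = (sqrt(A) + sqrt(B))**2."""
    D = (math.sqrt(A) + math.sqrt(B)) ** 2
    return (2.0 * D / R) ** (1.0 / 3.0)


# Hypothetical costs: sampling is 16 times dearer than analysis per unit uncertainty.
A, B, R = 16.0, 1.0, 50.0
u_t = fit_for_purpose_uncertainty(A, B, R)
u_s, u_a, cost = optimal_split(A, B, u_t)
print(f"u_t = {u_t:.3f}, u_s = {u_s:.3f}, u_a = {u_a:.3f}, measurement cost = {cost:.2f}")
print(f"u_s/u_a = {u_s / u_a:.3f}, (A/B)**0.25 = {(A / B) ** 0.25:.3f}")
```

For these made-up inputs the optimum ratio $u_s/u_a$ is 2, so most of the permitted variance is allotted to the more expensive operation (sampling).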

References

1. Thompson M, Ramsey MH (1995) Analyst 120:261-270
2. Gy PM (1992) Sampling of heterogeneous and dynamic materials. Elsevier, Amsterdam
3. Boyer KW, Horwitz W, Albert R (1985) Anal Chem 57:454-459
4. Horwitz W (1995) Pure Appl Chem 67:331-343
5. Ramsey MH, Argyraki A, Thompson M (1995) Analyst 120:2309-2312
6. Thompson M, Fearn T (1996) Analyst 121:275-278

Accred Qual Assur (2002) 7:274-280
DOI 10.1007/s00769-002-0489-4

© Springer-Verlag 2002

Michael H. Ramsey

Appropriate rather than representative sampling, based on acceptable levels of uncertainty

Received: 28 December 2001
Accepted: 25 April 2002

Presented at EUROLAB/EURACHEM Workshop "Sampling", 5-6 November 2001, Lisbon, Portugal

M.H. Ramsey (✉)
Centre for Environmental Research, School of Chemistry, Physics and Environmental Science, University of Sussex, Falmer, Brighton BN1 9QJ, UK
e-mail: m.h.ramsey@sussex.ac.uk
Tel.: +44-1273-678085
Fax: +44-1273-677196

Abstract Appropriate sampling, that includes the estimation of measurement uncertainty, is proposed in preference to representative sampling without estimation of overall measurement quality. To fulfil this purpose the uncertainty estimate must include contribution from all sources, including the primary sampling, sample preparation and chemical analysis. It must also include contributions from systematic errors, such as sampling bias, rather than from random errors alone. Case studies are used to illustrate the feasibility of this approach and to show its advantages for improved reliability of interpretation of the measurements. Measurements with a high level of uncertainty (e.g. 50%) can be shown to be fit for some specified purposes using this approach. Once reliable estimates of the uncertainty are available, then a probabilistic interpretation of results can be made. This allows financial aspects to be considered in deciding upon what constitutes an acceptable level of uncertainty. In many practical situations "representative" sampling is never fully achieved. This approach recognises this and instead, provides reliable estimates of the uncertainty around the concentration values that imperfect appropriate sampling causes.

Keywords Representative sampling · Uncertainty of measurement

Introduction

The traditional approach to sampling is to select a "correct" sampling protocol that is assumed to give a representative sample, and to eliminate sampling bias by definition [1]. This approach has the advantage of simplicity, but can lead to unsuspected errors in estimates of concentration, caused by sources such as variations in the practical application of the protocol.

This situation in sampling has a useful analogy in the chemical analysis of test materials. There was a time when analytical chemists strove to find a "correct" analytical method, which would determine the true value of an analyte concentration. If all laboratories used this single "correct" method, it was argued, then all measurements would be true and comparable. It has now been realised by the analytical community that achieving this ideal is impossible. All analytical measurements are only estimates of concentration and are never exactly equal to the true value of concentration. The emphasis in analytical chemistry has therefore changed to estimating the uncertainty of a concentration measurement, as well as its value. This uncertainty of measurement can be defined informally as "the interval around the result of the measurement that contains the true value with high probability" [2]. With this information, all estimates of concentration can be compared, and the measurements can be interpreted with a known level of uncertainty.

This approach can usefully be extended to the primary sampling procedure. Instead of assuming that a "correct" sampling protocol will produce a representative sample, it is possible to select an appropriate sampling protocol that will give measurements with an acceptable level of uncertainty. Representivity is still an objective, but there is an explicit admission that it is never achieved perfectly, and the range within which the true

value lies is stated together with the measured value of concentration.

There is one crucial concept that must be accepted in order to make this approach applicable: the action of primary sampling must be considered as the first step in the making of a measurement of concentration. In this way the uncertainty of the measurement includes the contribution from all of the sources [3]. These include the primary sampling, the physical preparation, the laboratory sub-sampling, the chemical preparation and the chemical analysis of the sample. The word "measurement" refers to the final estimate of analyte concentration, but the phrase "measurement process" in this paper is used to denote all of these processes collectively. Estimates of measurement uncertainty that omit some of these sources, particularly the sampling, will inevitably be too small and therefore unrealistic.

A second important step is to include systematic errors (unknown or uncorrected) in the estimates of uncertainty. If the uncertainty interval is to include the true value, then its calculation cannot be restricted to just the random errors that are unrelated to the true value. In primary sampling, the random errors are traditionally well characterised and used, for example, to judge the amount of sample that is required to achieve a specified sampling precision. It is however the systematic errors (e.g. sampling bias) that can often cause unsuspected errors, and therefore both types of error need to be estimated.

If the actual uncertainty is estimated for every measurement, rather than relying on an assumption of correctness or representivity of the sampling, then the reliability of measurements (and the sampling component) will improve. Moreover, once the assumption of perfectly representative samples is set aside, it is possible to decide how close to the true value the measurements are required to be, for any particular application. There are cases where relatively high levels of uncertainty (e.g. 80%) can be shown to be appropriate for some purposes (i.e. the measurements are "fit-for-purpose") [3]. The practical limitation on the number and quality of measurements made is frequently financial, and this "appropriate sampling" approach also allows an optimal balance to be made between the quality of the measurements (from sampling and analysis), and the cost of both the measurements and the consequences of undetected measurement errors [4].

The traditional approach to quality sampling is usually linked to an equally traditional but separate approach to quality in chemical analysis. In traditional analytical quality control (AQC), various AQC materials are analysed in the same batch as the samples. If the measurements made on these materials fall within predetermined limits, then the measurements made on the samples are reported to a customer as single concentration values (e.g. in µg of analyte per g of sample). The customer who interprets these measurements often assumes that they are "correct" and does not use, or have access to, any of the information gained from the AQC materials. In the alternative "appropriate" approach, the laboratory reports an uncertainty value with each concentration value. This estimate of uncertainty is not just that arising from the chemical analysis (currently being reported in some labs), but also includes components from the primary sampling and physical preparation. This uncertainty value is estimated using information derived in part from the AQC materials, and gives the customer access to this information in a form that is useful.

This paper aims to give an overview of the research that underpins this approach to "appropriate sampling", with references given to sources of more detailed information. It will cover definitions of uncertainty, methods to estimate uncertainty that include all sources, acceptable levels of uncertainty (fitness-for-purpose), implications of uncertainty for interpretation of measurements, and conclusions, with some examples from case studies to help clarify the explanations.

Definitions of uncertainty

The formal definition of uncertainty of measurement is "A parameter associated with the result of a measurement, that characterises the dispersion of the values that could reasonably be attributed to the measurand" [5]. The meaning of this definition is not entirely clear, and depends heavily on the definition of the word "measurand", which is formally "the particular quantity subject to measurement" [6]. The previously quoted informal definition of uncertainty as "the interval around the result of the measurement that contains the true value with high probability", clearly interprets measurand as being the "true value" of the analyte concentration, and not just as the "analyte concentration" as has been implied by some sources. This interpretation has the important implication that uncorrected systematic errors should be included within the estimates of uncertainty. It differentiates uncertainty from the traditional "error bars" often quoted for analytical measurements, which are invariably based upon random errors alone, with no reference to the "true value" of the analyte concentration.

The estimation of uncertainty for analytical measurements has now been widely advocated, at least in Europe [7]. However, these estimates specifically exclude the contribution to uncertainty arising from the primary sampling, and often from the physical preparation of the sample (e.g. drying, grinding, splitting). Several studies have shown that these are often the largest sources of uncertainty in measurements [3, 8-10].

Methods to estimate uncertainty that include all sources

Methods for the estimation of uncertainty from analytical methods are well developed. They often use "bottom-up" methods, which sum all of the individual components in the uncertainty budget [7], but they can also use information from "top-down" approaches that use estimates of the total uncertainty, from an inter-laboratory trial for example, without necessarily subdividing it. The step of primary sampling is traditionally excluded from these estimates of measurement uncertainty, but recent attempts have been described to apply bottom-up methods to this aspect [10].

New methods have been devised to estimate the uncertainty of measurements, which is caused by all of the sources, including procedures used for primary sampling. Uncertainty of measurement ($u_c$) can be estimated by summing contributions from the four types of error in the methods of measurement. These include two random components (sampling precision and analytical precision) and two systematic components (sampling bias and analytical bias). The expanded uncertainty (U) is estimated in this case by the use of a coverage factor of two, to give approximately 95% confidence for a normal distribution.

Well established methods are available to estimate three of these four components (Table 1). Analytical precision can be estimated most cost-effectively using duplicate chemical analyses. Sampling precision can be estimated similarly by taking duplicated samples at points separated in space (or time) by a distance reflecting the possible ambiguity in the sampling protocol [11]. Analytical bias can be estimated using certified reference materials that have a chemical composition that is well matched to the samples.

The estimation of sampling bias is potentially much more problematic, but two methods have been described. One method requires the use of a Reference Sampling Target (RST), which is the sampling equivalent of a reference material for the estimation of bias. The RST can either be created synthetically to have a known concentration of analyte [12], or it can be a routine sampling target selected for the purpose [13]. The accepted or certified value of concentration (and its uncertainty) can either be taken from the known concentration of analyte added, for the synthetic RST [12], or established by the consensus of an inter-organisational sampling trial (IOST) [13]. The accepted value can also include a specification of the spatial distribution of the analyte, and its uncertainty [14].

The second method of estimating the contribution of sampling bias to the uncertainty of the measurement is to apply more than one sampling protocol to a sampling target, ideally with more than one sampler (i.e. the person who takes the sample). One extreme example of this approach is the inter-organisational sampling trial in which eight or more samplers take samples from the same sampling target for the same specified purpose. If all participants use the same protocol it constitutes a Collaborative Trial in Sampling (CTS) [15], but if they all select their own protocols, based on their professional judgement, it constitutes a Sampling Proficiency Test (SPT) [13]. The variability of the estimates of analyte concentration between the participants can then be used to estimate the uncertainty of the measurement procedure as a whole, as applied to a particular site. Any bias caused by the sampling of any participant then becomes part of the random error across the whole sampling trial, and hence is automatically included in the uncertainty [3].

There are therefore four methods that can be identified for the estimation of the overall uncertainty of measurement (Table 2). None of these methods use the traditional "bottom-up" approach of adding all of the separate components of uncertainty together [5]. They rely on a fundamentally "top-down" approach that aims to get the most reliable estimate of the uncertainty overall, without necessarily identifying the contributions from all of the possible sources [16, 7].

These four methods estimate the uncertainty with increasing rigour, but at increasing cost. Method 1 is the least expensive, but it does not include an estimate of sampling bias, although this can be added by the independent use of an RST. Separation of the main sources of uncertainty requires the use of analysis of variance, usually of the robust type, to allow for non-normal frequency distributions. Detailed description of these methods is given elsewhere [3]. These methods have the advantage of estimating the actual uncertainty for a particular investigation, and should in that way be more realistic than estimates made by bottom-up methods. These latter methods will need to use generalised values for the component variances, and cannot easily reflect the special contribution to variance at each site, such as those made by the local levels of analyte heterogeneity. Top-down methods are therefore particularly appropriate for estimation of uncertainty from sampling, especially for variable matrices such as those found in environmental materials. Further research will be needed to investigate the relative merits of top-down and bottom-up methods for this purpose.

Table 1 The four types of errors in methods that contribute to the uncertainty of measurements, and examples of how they might be estimated

Process    Random error (precision),   Systematic error (bias),
           estimate using:             estimate using:
Analysis   Duplicate analyses          Certified reference materials
Sampling   Duplicate samples           RST, IOST

RST, reference sampling target; IOST, inter-organisational sampling trial
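The text states that $u_c$ is obtained by summing contributions from the four types of error and that U uses a coverage factor of two, but it does not spell out the arithmetic. The sketch below is one possible reading, assuming each component has been expressed as a standard uncertainty in the same units and that the components are independent, so that they add in quadrature. The function name and all input values are hypothetical.

```python
import math


def expanded_uncertainty(s_samp, s_anal, u_bias_samp=0.0, u_bias_anal=0.0, k=2.0):
    """Return (u_c, U) from four components expressed as standard uncertainties.

    Assumption (not stated in the text): independent components are combined
    as the root of the sum of squares; U = k * u_c with coverage factor k.
    """
    u_c = math.sqrt(s_samp**2 + s_anal**2 + u_bias_samp**2 + u_bias_anal**2)
    return u_c, k * u_c


# Hypothetical values, in the units of concentration.
print(expanded_uncertainty(3.0, 4.0))                    # random components only
print(expanded_uncertainty(3.0, 4.0, u_bias_samp=12.0))  # with a sampling bias term
```

Comparing the two calls shows how an unrecognised sampling bias term can dominate an uncertainty estimate that is otherwise built from precision alone.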

Table 2 Four methods for estimating uncertainty in measurements (including that from sampling)

                                                   Uncertainty components estimated
No  Method                 Samplers  Protocols  Analytical  Analytical  Sampling   Sampling
                                                precision   bias        precision  bias
1   Duplicates+CRMs        Single    Single     Y           Y           Y          No
2   Protocols+CRMs         Single    Multiple   Y           Y           Between protocols
3   CTS+CRMs               Multiple  Single     Y           Y           Between samplers
4   SPT (+CRMs optional)   Multiple  Multiple   Y           Y           Between protocols + between samplers

CTS, collaborative trial in sampling; SPT, sampling proficiency test; CRM, certified reference material
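For Method 1 in Table 2, the separation of variance components from a balanced design (duplicate samples, each analysed in duplicate) can be sketched as follows. The text specifies robust analysis of variance; this sketch deliberately swaps in the simpler classical nested ANOVA, so it is only an approximation of the approach described and is sensitive to outlying values. The function name and the data are hypothetical.

```python
import math
from statistics import mean


def nested_anova(data):
    """Classical balanced nested ANOVA (not the robust variant used in the text).

    data[i][j][k] is the k-th analysis of the j-th sample taken at the i-th
    sampling point. Returns (between-point, sampling, analytical) variances.
    """
    m, n_s, n_a = len(data), len(data[0]), len(data[0][0])
    sample_means = [[mean(sample) for sample in point] for point in data]
    point_means = [mean(means) for means in sample_means]
    grand_mean = mean(point_means)

    ss_anal = sum((x - sample_means[i][j]) ** 2
                  for i, point in enumerate(data)
                  for j, sample in enumerate(point)
                  for x in sample)
    ss_samp = n_a * sum((sample_means[i][j] - point_means[i]) ** 2
                        for i in range(m) for j in range(n_s))
    ss_point = n_s * n_a * sum((pm - grand_mean) ** 2 for pm in point_means)

    ms_anal = ss_anal / (m * n_s * (n_a - 1))
    ms_samp = ss_samp / (m * (n_s - 1))
    ms_point = ss_point / (m - 1)

    var_anal = ms_anal
    var_samp = max(0.0, (ms_samp - ms_anal) / n_a)             # negative estimates set to 0
    var_point = max(0.0, (ms_point - ms_samp) / (n_s * n_a))
    return var_point, var_samp, var_anal


# Hypothetical concentrations: 3 points x 2 samples x 2 analyses.
data = [
    [[10.0, 12.0], [14.0, 16.0]],
    [[20.0, 22.0], [28.0, 30.0]],
    [[30.0, 32.0], [32.0, 34.0]],
]
var_point, var_samp, var_anal = nested_anova(data)
s_meas = math.sqrt(var_samp + var_anal)  # measurement = sampling + analysis
print(f"sampling variance {var_samp:.2f}, analytical variance {var_anal:.2f}, U = {2 * s_meas:.2f}")
```

A real survey would need many more duplicated points than the three used here, and the robust estimator cited in the text would normally be preferred where the frequency distribution is not normal.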

Applications of uncertainty estimation that includes contribution from sampling

Estimation of uncertainty in the measurement of lead in top soils has been reported using duplicate samples [3] (i.e. Method 1 in Table 2). The 1.8-hectare site in Derbyshire, UK, was contaminated by a lead smelter operating in the 14th-16th centuries, and demonstrates the general principle.

A regular grid of 40 sampling points at 20 m spacing was applied initially with single samples. This grid was repeated using 5-fold composite samples (i.e. 5 increments taken within 1 m² around each sampling point) in order to investigate how composite sampling would affect the measurement uncertainty. The duplicate samples were taken at around 20% of the sampling points (Table 3), at a distance of 2 m away from the original sample point. This distance represents the spatial uncertainty caused by the method of surveying employed (i.e. measuring tape) on this undulating site. Duplicated analytical measurements were then taken on both of the sample duplicates in a balanced design, and the three components of the variance separated using robust analysis of variance (ANOVA) [17], according to the model:

$$s_{total}^2 = s_{geochem}^2 + s_{samp}^2 + s_{anal}^2 \qquad (1)$$

The three components of the total variance are the analytical variance ($s_{anal}^2$), the sampling variance ($s_{samp}^2$) and the geochemical variance ($s_{geochem}^2$). The measurement variance can be considered as the sum of the sampling and analytical variance:

$$s_{meas}^2 = s_{samp}^2 + s_{anal}^2 \qquad (2)$$

The expanded uncertainty (U) can be estimated as $2s_{meas}$, using a coverage factor of 2 for 95% confidence.

The use of 5-fold composite samples reduces the overall expanded measurement uncertainty (U) by a factor of two (3742 to 1881 µg g⁻¹). This is not significantly different from the reduction of 2.2 (i.e. $\sqrt{5}$), predicted by the theory that sample mass is inversely proportional to sample variance [1]. In this case the 5-fold increase in the sample mass would be predicted to reduce the variance by a factor of 5, and hence the uncertainty by the square root of 5.

Acceptable levels of uncertainty (fitness-for-purpose)

Once estimates of the uncertainty of measurements are known, it is possible to judge whether that level of uncertainty is acceptable for a particular stated purpose. This can be used to judge whether measurements (rather than all of the measurement procedures themselves) are "fit-for-purpose" (FFP), where fitness for purpose is defined as "The property of data produced by a measurement process that enables a user of the data to make technically correct decisions for a stated purpose" [18].

Three basic types of FFP criteria have been suggested, in somewhat different contexts. The first, and widely accepted criterion, is based on the relative precision of the measurement method, usually specified within an Analytical Quality Control Scheme. A typical criterion is that the relative analytical precision should be better than 10% (at 95% confidence). AQC is normally used to check the measurement process and to check that this process step is in statistical control and comparable with the performance pertaining at the time of validation. This target performance is often set, however, so as to enable users of the data to make technically correct decisions. It can therefore be considered as a crude type of FFP criterion. The main problem with this approach is that this criterion is set by the laboratory, often without reference to the specific purpose for which the customer will use the data. It could be, for example, that a precision of 30% would be quite good enough for some of the user's purposes.

The second FFP criterion that has been suggested is that the uncertainty of the measurement (including that from the sampling) should not contribute more than 20% of the total variance for the analyte across all of the samples in a particular survey [11]. The relative contributions to the total variance can be usefully represented using pie charts (Fig. 1). In the case study in Derbyshire, it

Table 3 Estimates of uncertainty made at a site in Derbyshire using Method 1 (Table 2). The use of composite samples reduces the expanded measurement uncertainty (U) by nearly a factor of 2

Sampling design               Points      Mean Pb (x̄)  s_total    s_meas     s²_meas/s²_total  U = 2 s_meas  U = 200 s_meas/x̄
                              duplicated  (µg g⁻¹)     robust     robust     (%)               (µg g⁻¹)      (%)
                                                       (µg g⁻¹)   (µg g⁻¹)

Regular grid, single sample   7           7516         8185       1871       5.2               3742          49.8
Regular grid, comp. samples   9           6093         5600       940.5      2.7               1881          30.9

Despite the high level of relative uncertainty using single samples (U% = 50%), the variance caused by the measurements only contributes 5.2% to the overall total variance of Pb measurements across the site (s²_meas/s²_total, %). This is well within the second FFP criterion of 20% (Fig. 1)
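The derived columns of Table 3 can be checked with a short calculation. The following is a minimal sketch, assuming only the tabulated robust estimates and a coverage factor of 2; the function and variable names are illustrative and do not come from the original study.

```python
def expanded_uncertainty(s_meas, mean, s_total, k=2.0):
    """Return (U, U as % of the mean, measurement share of total variance in %)."""
    U = k * s_meas                                # expanded uncertainty, coverage factor k
    U_rel = 100.0 * U / mean                      # relative expanded uncertainty (%)
    share = 100.0 * s_meas ** 2 / s_total ** 2    # s2_meas / s2_total (%)
    return U, U_rel, share

# Robust estimates taken from Table 3 (all in ug/g)
designs = {
    "single": dict(s_meas=1871.0, mean=7516.0, s_total=8185.0),
    "composite": dict(s_meas=940.5, mean=6093.0, s_total=5600.0),
}

results = {name: expanded_uncertainty(**values) for name, values in designs.items()}

for name, (U, U_rel, share) in results.items():
    # Second FFP criterion: measurement variance below 20% of total variance
    print(f"{name}: U = {U:.0f} ug/g, U% = {U_rel:.1f}, "
          f"variance share = {share:.1f}%, FFP met: {share < 20.0}")
```

With the rounded table entries, the variance share for the composite design evaluates to about 2.8% rather than the tabulated 2.7%; the difference presumably reflects rounding of the tabulated estimates. Both designs fall well below the 20% limit.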

was possible to optimize the sampling protocol by use of this criterion. The use of 5-fold composite samples in place of single non-composite samples was shown to reduce the variance contributed by the sampling protocol by a factor of approximately ..n. However, this improvement was shown to be unnecessary as the variance contributed by the measurements overall (5.2%) was already well below the limit of 20% of the total variance, even when using the non-composite samples [3]. This judgement that the measurements were fit-for-purpose was reached despite the fact that the value of the measurement uncertainty was 50% (relative to concentration value, at 95% confidence), using the protocol with single samples. This level of uncertainty is much higher than is often considered acceptable. These samples are not "representative" in the usually accepted sense of the word, but they have been shown to be "appropriate", in providing measurements that are fit-for-purpose.

[Figure: pie charts of the geochemical, sampling and analytical contributions to total variance; geochemical share 94.76% in a and 97.24% in b; FFP criterion marked as s_meas < 20%]
Fig. 1a, b Contributions towards total variance from the sampling, the analysis and their combination, the measurement. The contribution from the sampling is reduced by the use of composite samples b compared with single samples a. This reduction is however not required, as the contribution to the measurement variance using single samples a is already less than the fitness-for-purpose criterion (FFP) of 20% of total variance

The third FFP criterion that has been suggested incorporates a balance between uncertainty and financial loss [4]. The optimal value of uncertainty is identified as that which incurs the minimum value for the expectation of loss. This loss includes both the cost of making the measurement and also the loss that may arise due to incorrect decisions made on the basis of the uncertain measurements. In an initial application of these ideas to a site investigation in West London [19], the actual measurement uncertainty of 110 µg g⁻¹ resulted in an expectation of financial loss of around £9000 per sampling point (Fig. 2). When this uncertainty was optimised to a value of 55 µg g⁻¹ the expectation of loss was reduced by a factor of thirty to around £300. A second stage of the optimisation allocates expenditure in an optimal way between the sampling and the chemical analysis, based on their respective contributions to the overall uncertainty. A recent application of this approach to the sampling of food has shown that a substantial reallocation of expenditure to the sampling process (+300%) gave a substantial reduction in the overall uncertainty (-31%) and a consequent saving of £428 per batch [20].

[Figure: expectation of loss plotted against uncertainty (estimated as s_meas, µg g⁻¹); vertical scale 0 to 10000, horizontal scale 0 to 150]
Fig. 2 Optimisation of measurement uncertainty against cost, demonstrated for a metal contaminated site in West London. It shows the economic loss function with a clear minimum cost at the optimal uncertainty (estimated by s_meas), which is 30x lower than the expectation of loss at the actual uncertainty of the measurements

Implications of uncertainty for interpretation of measurements

One of the main advantages of reporting realistic estimates of uncertainty together with measurements of concentration is that end-users of the analyses can consider

the implications of the uncertainty [7]. One example that illustrates this is in the classification of contaminated land [3]. The traditional deterministic approach is to compare the measured concentration values with an appropriate regulatory threshold value (Fig. 3a). Any sampling point that has a reported concentration value below the threshold is classified as uncontaminated, and those above as contaminated. This approach ignores the presence of uncertainty, but once it is known a probabilistic approach can be taken (Fig. 3b). If the measured concentration is below the threshold but the uncertainty interval extends above it, the point is classified as "possibly contaminated" rather than "uncontaminated". Similarly, if the measured concentration is above the threshold but the uncertainty interval extends below it, the point is classified as "probably contaminated" rather than "contaminated".

[Figure: (a) Deterministic classification: concentration (C) compared with threshold (T), giving the classes uncontaminated, uncontaminated, contaminated, contaminated. (b) Probabilistic classification: the interval from C-U to C+U compared with T, giving the classes uncontaminated, possibly contaminated, probably contaminated, contaminated]
Fig. 3a, b Comparison between a deterministic, and b probabilistic classification of contaminated land, to show the effect of using estimates of uncertainty to improve the interpretation of measurements (derived from general case described previously [7])

In one application of this probabilistic approach to a disused landfill site in West London, the effect of the uncertainty was to totally change the interpretation of the extent of the lead contamination at the site [3]. From the 100 samples taken from the site in a regular grid pattern, the deterministic interpretation showed only eight samples, scattered across the site, to be over the appropriate threshold value (500 µg Pb g⁻¹ soil). However, the uncertainty of the measurements at the site was estimated as 83.6%, using Method 1 (Table 2). This large value is due primarily to the high degree of small-scale heterogeneity of lead distribution at the site. After taking this into account, 91% of the sampling points were classified as either possibly, probably or definitely contaminated (at 95% confidence). Only 9% of the site was classified as uncontaminated, which is only marginally greater than the proportion of that which would be expected by chance at this confidence interval. The interpretation of the site has changed therefore, from "basically uncontaminated with a few patches of contamination" using the deterministic method, to "probably all contaminated" with the probabilistic approach.
More sophisticated interpretations of contaminated land use a risk assessment approach, but this can also be made more reliable by allowing for the uncertainty in the raw measurements. The uncertainties in the measurements, both in the investigation and in the construction of the risk assessment model, can be propagated through the calculation to give more realistic uncertainty values for the calculated exposure or risk.

Conclusions

Appropriate sampling is potentially much more reliable and more cost-effective than representative sampling. The assumption that a "correct" protocol will give representative samples does not give either the rationale for estimation of sampling bias, or the flexibility to vary the quality of the sampling depending on the proposed objective. An "appropriate" protocol can be selected to give an acceptable level of uncertainty that can allow for consideration of financial constraints (e.g. the potential financial consequences of errors) or logistical constraints (such as local conditions, time limitations). All "appropriate" protocols do require the estimation of uncertainty to be incorporated into their design. The simplest method, using duplicate samples at a small proportion of sampling points, is adequate for most purposes. More elaborate methods to estimate uncertainty will only be required where the consequences of unsuspected uncertainty are large. These estimates of uncertainty are used initially to judge the fitness-for-purpose of the measurements. They are also very useful, however, for improving the reliability of the interpretation of the measurements (e.g. in risk assessment or hazard classification). These techniques also provide ways to assess the performance of sampling protocols (using collaborative trials in sampling) and to assess and improve performance of samplers (using sampling proficiency tests). Sampling is never perfect; it is better therefore to measure the uncertainty that an "appropriate" sampling protocol generates, rather than to assume the perfect application of a "correct" protocol.
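The four-way probabilistic classification of Fig. 3b, discussed earlier in this paper, can be written as a short decision rule. The following is a minimal sketch, assuming that the expanded uncertainty is relative and symmetric (U = u_rel × C) and that values exactly on the threshold count as "above"; neither convention is stated in the text, and the function name is illustrative.

```python
def classify(concentration, threshold, u_rel):
    """Classify one measurement against a threshold, allowing for uncertainty.

    u_rel is the relative expanded uncertainty as a fraction (0.836 for 83.6%).
    Assumption: the uncertainty interval is concentration * (1 -/+ u_rel).
    """
    U = u_rel * concentration
    lower, upper = concentration - U, concentration + U
    if concentration < threshold:
        # Below the threshold: only "uncontaminated" if the whole interval is below
        return "uncontaminated" if upper <= threshold else "possibly contaminated"
    # At or above the threshold: only "contaminated" if the whole interval is above
    return "contaminated" if lower >= threshold else "probably contaminated"

# Threshold and relative uncertainty quoted for the West London site;
# the concentrations are hypothetical illustration values (ug Pb per g soil).
for c in (100.0, 300.0, 600.0, 4000.0):
    print(c, classify(c, threshold=500.0, u_rel=0.836))
```

Under these assumptions, any reading between roughly 272 and 3049 µg g⁻¹ falls into one of the two intermediate classes, which illustrates why so few points at that site could be classified as uncontaminated.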

References
1. Gy P (1979) Sampling of particulate materials - theory and practice. Elsevier, Amsterdam
2. Thompson M (1995) Analyst 120:117N
3. Ramsey MH, Argyraki A (1997) Science of the Total Environment 198:243
4. Thompson M, Fearn T (1996) Analyst 121:275
5. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
6. ISO (1995) VIM: 1995 Vocabulary of metrology. Part 1. Basic and general terms (international). International Organisation for Standardisation (Geneva) [and published by the British Standards Institution (London) as PD6461:Part 1:1995, 59 pp]
7. CITAC (2000) Quantifying uncertainty in analytical measurement. Eurachem
8. Rios A, Valcarcel M (1998) Accred Qual Assur 3:14
9. BCR Report (1998), EUR 18405 EN Metrology in chemistry and biology: a practical approach. Office for Official Publications of the European Communities, Luxembourg
10. de Zorzi P, Belli M, Barbizzi S, Menegon S, Deliusa A (2002) Accred Qual Assur this issue
11. Ramsey MH, Thompson M, Hale M (1992) Journal of Geochemical Exploration 44:23
12. Ramsey MH, Squire S, Gardner MJ (1999) Analyst 124:1701
13. Argyraki A, Ramsey MH, Thompson M (1995) Analyst 120:2799
14. Squire S, Ramsey MH, Gardner MJ, Lister D (2000) Analyst 125:2026
15. Ramsey MH, Argyraki A, Thompson M (1995) Analyst 120:2309
16. Analytical Methods Committee (1995) Analyst 120:2303
17. Analytical Methods Committee (1989) Analyst 114:1699
18. Thompson M, Ramsey MH (1995) Analyst 120:261
19. Hulls J (1998) Optimising sampling and analytical strategies for assessing contaminated land. MSc thesis, Imperial College, London
20. Ramsey MH, Lyn JA, Wood R (2001) Analyst 126:1777

Accred Qual Assur (2001) 6:368-371

John R. Cowles
Simon Daily
Stephen L.R. Ellison
William A. Hardcastle
Carole Williams

Experimental sensitivity analysis applied to sample preparation uncertainties: are ruggedness tests enough for measurement uncertainty estimates?

Abstract It has been suggested that typical ruggedness tests might lead directly to uncertainty estimates. This assertion is tested using simple experimental studies of uncertainties associated with sample grinding and oven-drying operations. The results are used to predict the outcome of typical ruggedness tests on the same systems. It is concluded that uncertainty estimation from ruggedness tests is appropriate only where a strong effect can be observed. Since current practice in ruggedness testing is predisposed to confirming insignificance, typical ruggedness tests are not likely to lead to reliable uncertainty estimates; instead, lack of statistical significance in ruggedness tests is better interpreted as reason to leave an effect out of the uncertainty budget. Only where the ruggedness study is modified in order to achieve statistically significant change is it useful for uncertainty estimation.

Keywords Measurement uncertainty · Ruggedness tests · Sample pre-treatment · Moisture · Grinding

J.R. Cowles · S. Daily · S.L.R. Ellison (✉) · W.A. Hardcastle · C. Williams
LGC (Teddington) Ltd., Teddington, England TW11 0LY, UK
e-mail: slre@lgc.co.uk
Tel.: +44-181-943 7000
Fax: +44-181-943 2767

Introduction

There is a general trend in modern analytical science towards providing quantitative estimates of the reliability of measurements. Analysts are increasingly coming under pressure from accreditation bodies to present such information in the form of a statement of measurement uncertainty conforming to International Organisation for Standardization (ISO) recommendations [1,2].
The effects of pre-treatment operations have been identified by EURACHEM as important considerations in measurement uncertainty estimation [2]. The general approach to experimental measurement uncertainty estimation is to vary experimental conditions and estimate the sensitivity of the result to the conditions [3-5]. This is sometimes referred to as 'sensitivity analysis'. Its close relationship to ruggedness testing as described, for example, by Youden and Steiner [6] has suggested the possibility of using ruggedness tests as the basis for measurement uncertainty estimates [2-5]. Since ruggedness and related tests are likely to be the most detailed experimental information to hand in most chemical testing laboratories, due to the practical problems of resourcing larger studies, their practical utility and interpretation for uncertainty estimation is an important issue.
In this paper, two methods of pre-treatment, viz. oven drying and grinding, are examined using a simple sensitivity analysis approach to estimating uncertainty contributions in sample pre-treatment. The substantial literature on the effects of varying drying conditions commonly shows that quantitation of the analyte is sensitive to the drying temperature employed and to the nature of both the analyte and its matrix; temperature uncertainties associated with drying accordingly provide an example of uncertainty estimation for a readily detectable effect. The literature on sample grinding and milling shows more variable results. Some authors have found no effect on the final result; others reported effects of varying magnitude. The milling study was therefore expected to test the utility of sensitivity analysis for a variable or ill-defined effect.
It is important to bear in mind that under normal circumstances, ruggedness tests are typically designed as screening experiments, restricted to very short checks, using a nominal level of a method control parameter (such as operating temperature or grinding time) and one or at most two alternate levels corresponding to the expected (or permitted) variation in the parameter. The studies here are significantly larger, in order to provide more information against which to assess the likely efficacy of routine ruggedness studies in uncertainty estimation.

Experimental

For the determination of moisture, two different feeds were analysed in triplicate using the oven-loss method specified in appropriate United Kingdom legislation [7]. A drying time of 3 h was used for all test portions. For the study of temperature dependence, three 5 g portions of each sample were dried separately at each temperature (that is, in separate drying runs), with the oven held within ±1 °C of the target temperature for the whole of the heating period. The weight loss for each 5 °C increment in temperature from 85 °C to 115 °C was determined for each sample.
In the grinding/milling experiments, a legislative pre-treatment (grinding) method [7] was used. The method is intended to reduce materials to <1 mm particle size in three or fewer grinding cycles, which allows variation upward from an experimentally determined minimum cycle time. Approximately 1200 g of one feed was thoroughly mixed in a tumbler mixer overnight and divided into five equal parts, which were ground using cycle times of 8, 10, 12, 15 and 20 s, respectively. Ground samples were analysed for dimetridazole by extraction with dichloromethane and clean-up using a Sep-Pak silica cartridge (Waters Corporation), followed by reverse phase high performance liquid chromatography (HPLC) with UV detection. The solvent employed was acetonitrile with ammonium acetate buffer.

Results and discussion

Case study 1 - Uncertainties from oven temperatures in moisture determination

Moisture content is usually determined using a calculation of the form:

$$ l = \frac{m_w - m_d}{m_w} $$

where l is the fractional loss in mass (usually expressed as a percentage), $m_w$ the mass before drying and $m_d$ the mass after drying.
The variability of the method is controlled by restricting the drying temperature to a narrow range, typically ±1 °C. This figure represents an uncertainty in temperature. Results obtained by methods of this type are accordingly subject to an uncertainty component due to the allowable variation in temperature. In this study, a series of experiments was carried out on two samples of pelleted animal feed. The drying temperature was varied over a range around the target temperature of 100 °C and the effect on the weight loss noted. The range chosen, 85 °C to 115 °C, is substantially larger than the permissible range, to permit both an investigation of the linearity or otherwise of the effect and a sound sensitivity analysis.
The results, which are typical for moisture determination, are shown in Fig. 1. Both curves are approximately linear in the range 90-105 °C, but depart from linearity at the extremes. At 85 °C, the mean weight loss is lower than might be expected from the trend at the higher temperatures. At higher extremes, factors such as progressive oxidation or thermal degradation lead to different directions of departure from linearity.

[Figure: mean weight loss (vertical scale 7.0 to 10.0) plotted against drying temperature (Temp °C, 80 to 120)]
Fig. 1 Mean weight loss vs drying temperature. Weight loss with temperature for two samples (● and ○ respectively)

To estimate the uncertainty associated with weight loss using the Guide to the Expression of Uncertainty in Measurement (GUM) principles [1], the gradient dl/dT at the nominal temperature (100 °C) is multiplied by the temperature uncertainty u(T). Here, linear regression applied to the linear temperature range in each case produces a gradient dl/dT = 0.030 % m/m °C⁻¹ for sample 1 and dl/dT = 0.019 % m/m °C⁻¹ for sample 2. The temperature uncertainty u(T) is 0.577 °C, estimated from the permitted variation of ±1 °C taken as the limits of a rectangular distribution. Calibration uncertainties in the thermometer used are under 0.1 °C, so can be neglected by comparison. This gives u(l) = 0.577 × 0.030 = 0.017 % m/m for sample 1 and u(l) = 0.577 × 0.019 = 0.010 % m/m for sample 2. Comparing this with a repeatability estimate of 0.05 % m/m obtained from the replicate data by analysis of variance (ANOVA), it is clear that the uncertainty estimates associated with temperature are just on the margin of practical significance compared to the repeatability estimate (assuming an uncertainty is practically significant if greater than a fifth of the largest component - here, the precision is the largest component so far found). Under these circumstances, therefore, there is some evidence that the uncertainty arising from a permitted temperature variation of 1 °C could be practically significant compared to the repeatability precision. In the context of normal use, of course, these are both small uncertainties; for most practical purposes, therefore, the temperature-related uncertainty can be considered sufficiently small. Further, note that the study is carried out under the most precise conditions available; the uncertainties found will almost certainly prove negligible compared, for example, to between-run variation.
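The temperature contribution estimated above can be reproduced in a few lines. The following is a minimal sketch, assuming the gradients and repeatability quoted in the text, a rectangular distribution for the ±1 °C limits, and the one-fifth rule of thumb for practical significance; the function names are illustrative only.

```python
import math

def rectangular_u(half_width):
    """Standard uncertainty of a rectangular distribution with the given half-width."""
    return half_width / math.sqrt(3)

def sensitivity_u(gradient, u_x):
    """First-order (GUM) contribution: |dy/dx| multiplied by u(x)."""
    return abs(gradient) * u_x

u_T = rectangular_u(1.0)                              # permitted variation of +/-1 degC
gradients = {"sample 1": 0.030, "sample 2": 0.019}    # dl/dT in % m/m per degC
repeatability = 0.05                                  # % m/m, from ANOVA of replicates

u_l = {name: sensitivity_u(g, u_T) for name, g in gradients.items()}

for name, value in u_l.items():
    # Rule of thumb from the text: significant if above one fifth of the largest component
    print(f"{name}: u(l) = {value:.3f} % m/m, "
          f"practically significant: {value > repeatability / 5}")
```

With the rounded gradient of 0.019, the sample 2 contribution evaluates to about 0.011 % m/m, slightly above the 0.010 quoted in the text; both values sit close to the one-fifth margin of 0.010 % m/m.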
Returning to the principal aim of the study, it is useful to consider whether a typical ruggedness test, operating at one or both extremes of the permitted range, would have given comparable results in the present case. Given the largest calculated sensitivity of 0.030 % m/m °C⁻¹, the expected variations in l across a 1 or 2 °C range are 0.030 % m/m and 0.060 % m/m, respectively. Clearly, neither would reliably lead to statistically significant effects with the present repeatability precision unless a prohibitively large number of replicates were undertaken in the ruggedness test; uncertainty estimates would accordingly be extremely variable. For small to modest effects, then, uncertainty estimation from sensitivity experiments requires substantially wider variation than the 'expected range'. Further, if the 'expected range' is based on control limits intended to render an effect insignificant - as in most standard methods - it is generally to be expected that the change in influence quantity will not provide useful uncertainty estimates.

Case study 2 - Milling and particle size effects on HPLC determination of dimetridazole

Another common requirement is for the determination of analytes in samples of agglomerated materials such as soils or animal feeds. A common preparative method is to grind or mill the material so that it passes through a sieve of specified aperture. The grinding time in such cases is not generally specified, leaving open the possibility of variation in grinding time and particle size. Both constitute potential sources of uncertainty. Longer grinding times may affect the results through greater production of fines (that is, a particle size effect) or by thermal degradation or loss of the analyte. Grinding/milling therefore constitutes a possible source of uncertainty. In this study, a series of experiments with different grinding times was carried out on a sample of medicated pelleted animal feed. A rough particle size distribution was also estimated using different sieve sizes. The feed contained approximately 100 mg kg⁻¹ of dimetridazole (a coccidiostat) and the experiments were designed to assess the effect of grinding on the subsequent determination of this compound.
A plot of the variation of observed concentration of dimetridazole with grinding time is shown in Fig. 2. Although there appears to be a decrease in the concentration at longer grinding times, the plot does not suggest strong linearity, and ANOVA does not show this effect to be statistically significant (P = 0.45, CL = 95%). It remains possible to provide an estimate of uncertainty using linear regression and the first-order GUM expression [1]; in this case, we obtain a linear regression result of y = -0.45t + 112.41, where y is analyte concentration in mg kg⁻¹, and t the cycle time (in seconds). For an uncertainty u(t) of 1 s in grinding time (based on the practical difficulties of controlling grinding time more closely), the uncertainty in analyte concentration would be ±0.45 mg kg⁻¹. In comparison, the precision of the analytical method for dimetridazole, at the analyte level encountered for the whole sample (ca. 105 mg kg⁻¹), was 3.9 mg kg⁻¹. Reassuringly, this very rough uncertainty estimate confirms the insignificance implied by the ANOVA result.
Considering the possible outcome of a ruggedness test aimed at establishing the effect of grinding time across a range of, for example, 3-5 s about the nominal time of 10 s (much smaller variation would be impractical), again we find that such a test would almost certainly fail to find a significant effect. In this instance, however, it would have been entirely correct to ignore the effect in comparison with observed precision.

[Figure: analyte concentration (vertical scale 100.0 to 115.0) plotted against grinding cycle time (s), 5 to 21]
Fig. 2 Effect of grinding time on analyte concentration. Duplicate analyses of samples ground for different grinding cycle times (the two plotting symbols are replicates 1 and 2, respectively)

Conclusions

The experiments were designed with the aim of using sensitivity values, generated by linear regression, to obtain estimates of the contribution to overall measurement uncertainty from two methods of sample pre-treatment. By comparison, the minimal range typical of current practice in ruggedness testing would not be expected to give useful or reliable uncertainty estimates in either case. This is consistent with recent studies on derivatisation effects [8] which show that as modelling coefficients (such as gradients) become statistically insignificant, uncertainty estimates become progressively more unreliable and can be misleadingly large, even though the average remains negligible. However, the lack of statistical significance predicted for such minimal studies was generally consistent with the finding that the uncertainties were small compared to repeatability precision.
This has important implications for the use of data from simple ruggedness studies in uncertainty estimation. Given a typical ruggedness study, properly designed to confirm insignificance of an expected range for an influence quantity, the sensitivity estimates will generally be too unreliable for useful uncertainty estimation even though the study is a valid check on the potential influence quantities' effects. It follows that typical ruggedness studies are not generally appropriate sources of data for reliable uncertainty estimation. However, where no significant effect is found in such a study, sensitivity coefficients obtained from a study redesigned to assure a significant change (for example by increasing the influence quantity range well beyond that expected) will generally only confirm practical insignificance of the effect. It is clearly more sensible to use a recorded lack of statistical significance in typical ruggedness tests as justification for omitting an effect from the measurement model and associated uncertainty budget (which is, in fact, the traditional statistical view), than to attempt to construct an unreliable estimate for a practically insignificant effect.

Acknowledgements Production of this paper was supported under contract with the Department of Trade and Industry as part of the National Measurement System Valid Analytical Measurement Programme.

References
1. ISO-GUM (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva, Switzerland; ISBN 92-67-10188-9
2. EURACHEM (1995) Quantifying uncertainty in analytical measurement. EURACHEM, London; ISBN 0-948926-08-2. Second edition now available at: http://www.vtt.fi/ket/eurachem/publications.htm
3. Ellison SLR, Williams A (1998) Accred Qual Assur 3:6-10
4. Barwick VJ, Ellison SLR (2000) Accred Qual Assur 5:47-53
5. Barwick VJ, Ellison SLR, Rafferty MJQ, Gill RS (2000) Accred Qual Assur 5:104-113
6. Youden WJ, Steiner EH (1975) Statistical manual of the AOAC. Association of Official Analytical Chemists International (AOAC), Arlington, Va., USA
7. The Feeding Stuffs (Sampling and Analysis) Regulations (1999). Statutory Instrument No 1633. Her Majesty's Stationery Office (HMSO), London
8. Ellison SLR, Burns M, Holcombe DG (2001) Analyst 126:199-210

Accred Qual Assur (1998) 3: 462-467
© Springer-Verlag 1998

Adriaan M.H. van der Veen
A.J.M. Broos
Anton Alink

Relationship between performance characteristics obtained from an interlaboratory study programme and combined measurement uncertainty: a case study

Abstract In the interlaboratory study programme "ILS Coal Characterisation", eight interlaboratory studies were organised based on the ISO standards for coal analysis. The use of blind samples in each round allows comparability of measurement results between rounds to be assessed. Based on the results, it could be demonstrated that the vast majority of the measurement results of the laboratories were traceable to results obtained in previous rounds of this programme. The hypothesis has been formulated that the combined standard uncertainty obtained from an interlaboratory study is equal to the reproducibility standard deviation. Whether the reproducibility can be used as the basis for the certification depends on whether the interlaboratory study includes all effects to be taken into account for establishing an uncertainty statement.

Key words Interlaboratory study · Traceability · Comparability · Reference material

A.M.H. van der Veen (✉) · A.J.M. Broos · A. Alink
Nederlands Meetinstituut, P.O. Box 654, 2600 AR Delft, The Netherlands
e-mail: AvanderVeen@NMi.nl
Tel.: +31-15-2691733
Fax: +31-15-2612971

Introduction

Over a period of 3 years, an interlaboratory study programme was organised that aimed to supply the coal community with a range of suitable reference materials that can be used as measurement standards in a wide variety of experiments and common analyses. The programme was entitled "ILS Coal Characterisation" and consisted of eight interlaboratory studies. Apart from the objective of supplying measurement standards, it was also aimed to investigate several metrological aspects, such as establishing traceability of measurement results throughout the programme. Summarising, the interlaboratory study programme aimed to [1]:
1. Establish a series of well-characterised coal samples (reference materials) in order to support coal research
2. Investigate the statistical parameters of coal analysis
3. Study the influencing factors such as sample preparation, subsampling and statistics on measurement results.

The interlaboratory studies were evaluated and reported during the programme [2-9]. After completion of the programme, a report was published that covers the evaluation of the interlaboratory study programme as a whole [1]. This second evaluation cycle aimed to establish links between interlaboratory studies and thus establish performance characteristics for selected methods common to coal analyses throughout the programme. These performance characteristics are believed to be better established than those from a single interlaboratory study obtained with the procedure given in ISO 5725:1994, parts 1 and 2 [10, 11].

The main question remaining after the validation of the data from the complete programme is how the values obtained for repeatability standard deviation and reproducibility standard deviation compare to the
Relationship between performance characteristics obtained from an interlaboratory study programme 175

(combined) measurement uncertainty at that level. In metrology, the VIM [12] and GUM [13] are the basis for evaluating uncertainty in measurement. However, the implementation of the principles of the GUM is far from straightforward for matrix materials, especially when parameters are defined by the measurement process. As a result, an interpretation of uncertainty analysis of this kind of parameter/matrix combination is required that explains the experimental results and is in agreement with the basic principles of the GUM [13]. A problem that was already addressed in a paper by Van der Veen and Alink [14] is that it is impossible to quantify several sources of uncertainty when dealing with matrix materials.

Set-up of the programme

The objectives of the programme have been described in the previous section. With respect to the acceptance of laboratories as participants, no specific requirements were set other than that these laboratories should be involved in the analysis of coal in support of trade. This requirement implies that the laboratories are involved in one-to-one comparisons between coal buyer and coal seller. The implementation of a quality assurance system (QAS) was, however, a requirement. An accreditation was not asked for.

For the characterisation of coal there are two series of written standards: ASTM and ISO. In this interlaboratory study programme the ISO standards were requested. The laboratories were allowed to use their own methods if the results of these methods are comparable to those obtained with the ISO method. It was the responsibility of the laboratory to verify whether its method is comparable to the ISO method.

Several methods, such as the determination of the ash content, define the parameter: ash is the result of a chemical conversion of coal. Its formation (and as a result, the ash content) highly depends on the conditions under which the coal is combusted. This fact has some consequences. The first consequence is that it is generally not possible to determine the parameter with independent methods. As a result, the best realisation of the parameter depends on how closely the written standard is followed by the participants and how much freedom is still left in the measurement method. Traceability of the parameter is limited to the measurement results being traceable to the written standard.

The evaluation protocol was merely based on a combination of ISO 5725-1:1994 [10], ISO 5725-2:1994 [11], ISO Guide 43¹ [15], and ISO Guide 35:1989 [16]. Outliers and/or stragglers [11] were identified by computing a Z-score [15] based on the mean and the standard deviation of the laboratory averages. The criterion was that this "Z" should not exceed a value of 2. The criterion was developed based on requirements set by all parties involved in coal production, trade, and consumption. For all samples involved in the programme, the performance characteristics were computed after removal of stragglers and outliers. A database of values of grand mean, repeatability standard deviation, and reproducibility standard deviation resulted.

In each interlaboratory study, a blind sample was used, except in round I [2]. The link between the results of this interlaboratory study and the results of the other rounds in the programme was established by using the materials of ILS Coal Characterisation I in later rounds [1]. Figure 1 shows the principle of establishing these traceability links in an interlaboratory study programme. The use of a blind sample enables the evaluation of whether the results of a round are comparable to those of other interlaboratory studies due to the fact that each laboratory was requested to perform the measurement of all samples of the suite as independent measurements under repeatability conditions. This way of implementing comparability also enables the assessment of traceability of measurement results to the written standard [12]. All laboratories involved use internationally accepted certified reference materials as a part of their QAS. So, from that point a traceability link is established between these reference materials and the results of the interlaboratory study programme.

Fig. 1 Framework for assessing traceability of results in the interlaboratory study programme. [Diagram: four successive sample sets (A, B, C), (A, D, E), (D, F, G) and (G, H, J), each distributed to the participating laboratories; samples shared between sets link the rounds.]

¹ During the interlaboratory study programme the most recent draft of this ISO Guide was used
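The Z-score screening of laboratory averages described above can be sketched as follows. This is only an illustration: the function names, the single-pass screening, the use of the sample standard deviation and the ash values are assumptions and are not taken from the programme's evaluation reports.

```python
from statistics import mean, stdev

def z_scores(lab_means):
    """Z-scores of laboratory averages relative to their own mean and standard deviation."""
    m = mean(lab_means)
    s = stdev(lab_means)  # sample standard deviation (assumed choice)
    return [(x - m) / s for x in lab_means]

def screen(lab_means, limit=2.0):
    """Split laboratory averages into accepted and flagged values using |Z| <= limit."""
    zs = z_scores(lab_means)
    accepted = [x for x, z in zip(lab_means, zs) if abs(z) <= limit]
    flagged = [x for x, z in zip(lab_means, zs) if abs(z) > limit]
    return accepted, flagged

# Hypothetical laboratory averages for ash content (weight-%, dry basis)
ash = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.9, 10.1, 12.5]
accepted, flagged = screen(ash)
print("flagged:", flagged)
```

With these made-up values only the last laboratory exceeds the limit of 2. Whether the screening was repeated after removing flagged results is not stated in the text, so the sketch applies it once.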

Results

Based on the evaluation of the results of the blind samples, it may be concluded that, generally speaking, there is a good agreement between results obtained in two interlaboratory studies on a single batch of samples [1, 3-9]. A good agreement was also obtained when comparing the results of two different batches of samples, even if one batch was prepared to analysis grade (< 200 µm top size), whereas the other was prepared to 3 or 10 mm top size.

These results make it possible to establish fairly homogeneous performance characteristics for the coal analyses involved in the programme [1]. They also lead to a hypothesis that needs further investigation: the values for the repeatability and reproducibility standard deviations are estimators for the expectation value of the combined measurement uncertainty.

This hypothesis is supported by several facts:
1. If in two independent interlaboratory studies the same values for the grand mean, the repeatability, and the reproducibility are obtained, then it may be assumed that these values are characteristic of the variability of the data that can be obtained in such an exercise.
2. Linking the results, grouped per parameter and per interlaboratory study, leads to performance characteristics that show a regular behaviour that can be described by a mathematical model.
3. An attempt to determine the combined measurement uncertainty of each step separately would lead to a serious overestimation of the combined measurement uncertainty of the complete measurement process.

The third fact needs some explanation. As indicated, in some interlaboratory studies different batches of the same coal were used. These batches differed not only in preparation route, but they also differed in grain size. Usually, sample preparation and extra subsampling steps were required to obtain suitable material for performing the measurements. The results of rounds III and IV show that no significant difference is found between the samples of analysis grade and of samples that needed further treatment.

Figure 2 provides an outline of the set-up in rounds I and III on the Gottelborn coal. In ILS Coal Characterisation III, samples of 10 mm top size and about 1 kg were distributed, whereas in round I samples of analysis grade were distributed. In Fig. 1 the ash analysis (ISO 1171) has been taken as an example, but any other coal parameter is determined in the same manner. The laboratories are denoted by A..D in ILS I, and A'..D' in ILS III, denoting that there is no relationship between A and A' etc. The measurement chains in round III are more different than those in ILS Coal Characterisation I, which might lead to greater differences in measurement uncertainty.

It was expected that including aspects of sample preparation and subsampling (other than subsampling of a sample of analysis grade) would result in an increase in both the repeatability and reproducibility standard deviations. As already stated, this was not the case. This implies that the combined measurement uncertainty for both groups of measurement chains is expected to show no significant difference either.

This observation seems to clearly contradict the commonly accepted "rule" that the preparation and subsampling steps contribute considerably to the combined measurement uncertainty. When assessing the quality of coal sampling, subsampling, and sample preparation, usually the ash content is used [17, 18].

Fig. 2 Outline of measurement chains within laboratories for the determination of ash. [Diagram with two panels: "Gottelborn Coal, ILS Coal Characterisation I" and "Gottelborn Coal, ILS Coal Characterisation III". Each panel shows numbered samples (marked < 200 µm or < 10 mm) passing through the steps crushing, splitting and milling where required, followed by subsampling and ashing in laboratories A to D (round I) and A' to D' (round III).]
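For readers who want to see how repeatability and reproducibility standard deviations of the kind compared in this section arise from raw data, the following sketch uses the usual one-way analysis of variance for a balanced design (every laboratory reporting the same number of replicates). The function name and the numbers are hypothetical; the programme's own evaluation followed ISO 5725-2 and may differ in detail.

```python
from math import sqrt
from statistics import mean, variance

def precision(results_by_lab):
    """Grand mean, repeatability SD (s_r) and reproducibility SD (s_R) for a balanced design."""
    n = len(results_by_lab[0])
    if any(len(r) != n for r in results_by_lab):
        raise ValueError("balanced design assumed: equal replicates per laboratory")
    lab_means = [mean(r) for r in results_by_lab]
    # Repeatability variance: average of the within-laboratory variances
    var_r = mean([variance(r) for r in results_by_lab])
    # Between-laboratory variance; negative estimates are set to zero
    var_L = max(variance(lab_means) - var_r / n, 0.0)
    return mean(lab_means), sqrt(var_r), sqrt(var_r + var_L)

# Hypothetical duplicate ash results (weight-%) from three laboratories
labs = [[10.0, 10.2], [10.4, 10.6], [9.8, 10.0]]
grand_mean, s_r, s_R = precision(labs)
print(f"mean={grand_mean:.3f}  s_r={s_r:.3f}  s_R={s_R:.3f}")
```

By construction s_R is never smaller than s_r, because the reproducibility variance is the repeatability variance plus the between-laboratory component.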



Uncertainty analysis

Starting with the working hypothesis, an uncertainty analysis can be carried out. The basic problem with the approach of the "Guide to the expression of uncertainty in measurement" (GUM) is that many sources of uncertainty are difficult, not to say impossible, to quantify in evaluating measurement results from matrix materials. There must also be a relationship between the performance characteristics obtained from the interlaboratory study programme and the combined standard uncertainty that can be obtained by applying the GUM directly.

In an interlaboratory study programme the measurement chain starts with the arrival of one or more samples to be analysed. These samples may or may not undergo further treatment prior to the measurement. The principles of this part of the measurement chain, as well as the role that can be played by (certified) reference materials, have been the subject of a previous paper [19]. In the interlaboratory study programme the role of the (C)RM has been taken over by the blind sample. The structure of the interlaboratory study programme met the requirements set by ISO Guide 35 [16], so that in principle the blind samples would be suitable to serve as (at least) reference materials.

The basic expression for the combined measurement uncertainty after splitting up the measurement chain reads as follows

$$ u_{combined}^2 = u_{crushing}^2 + u_{subdividing}^2 + u_{measurement}^2 \qquad (1) $$

where it should be realised that each of the terms on the right-hand side of this equation consists of one or more contributions from various sources. This approach is in agreement with the rule in statistics that variances of parts of a process can be added in order to obtain the total variance of the whole process [20-23].

The dominant factor in terms of uncertainty budgets is the heterogeneity of the material. Unfortunately, the heterogeneity of the material affects all terms on the right-hand side of Eq. 1, and as a result these contributions are heavily correlated. A crushing step for instance increases the heterogeneity on the level of particles, but decreases heterogeneity on the level of, say, a few grams of powder. So, from the point of view of evaluating uncertainty, it is no use to make an attempt to quantify the contribution of heterogeneity. A better approach is to select the opposite way, starting with investigating a measurement chain and - by experiment - breaking it up into smaller parts. The measurement will, however, always be part of the uncertainty evaluation, as was shown in a previous paper [14].

During the processing of the material in a measurement chain, the critical property (say, for instance, ash content) undergoes (from a statistical perspective) changes. In principle, there are two changes: the expectation value μ changes, or the variance σ² changes. Both changes can happen simultaneously. Procedures such as milling may well change the probability density function of the content on the level of particles. Changes in this distribution function are sometimes wanted (reduction of the combined measurement uncertainty by increasing the total number of particles (crushing/milling)), sometimes unwanted. An example of the latter is the loss of volatiles and moisture during milling of coal.

The evaluation model to be developed should therefore (1) avoid "double counting" of sources of uncertainty and (2) comply with the additivity rule of uncertainties as expressed in Eq. 1. This has been done by working with a reference term, i.e. the uncertainty is expressed in terms of the uncertainty of the measurement, followed by several correction terms that may account for extra budgets due to other steps in the measurement chain.

The statistical fundamentals read as follows. Let the random variable X denote the content of the critical component (in this example: ash content). The expectation of X is given by

$$ E(X) = \mu \qquad (2) $$

If, during the process represented by the measurement chain, changes in this expectation take place, then it is said that the method is biased. For the determination of the ash content, it is very hard to find out whether the measurement method is biased, as the parameter is determined by the method. Ash is as such not present in coal; the precursor of ash is the mineral matter in coal. During combustion, this mineral matter is converted into ash, a chemical process. The ash composition is a function of the temperature at which the ash formation takes place. As a result, the ash content (expressed as weight-% of the coal on a dry basis) is also a function of the temperature.

The requested method (ISO 1171) requires a constant temperature of 815 °C. Insufficient control of this temperature, or a deviation of the sample temperature in the oven may lead to a change in expectation value, and thus in a bias of the measurement method. A convenient way of expressing the expectation value of the critical content could be

$$ \mu = E(X) + E(B_{crushing}) + E(B_{subsampling}) + E(B_{measurement}) \qquad (3) $$

where B denotes the bias of the given step. Equation 3 expresses the expectation value in a sum of random variables, where the first term denotes the expectation value of the method, followed by several correction terms. In the ideal case each of these terms has the expectation value 0 (expected bias = 0). Now a match must be found between (3) and (1). The expression for the variance of μ reads as

$$ \sigma^2 = Var(X) + Var(B_{crushing}) + Var(B_{subsampling}) + Var(B_{measurement}) \qquad (4) $$

where Var() denotes variance. Equation 4 provides a mathematical model for expressing the variance of the measurement in terms of the variance of the measurand X and the variances of the bias terms. This equation is only valid if the biases of the steps involved in the measurement chain are not correlated. Otherwise, covariance terms should be introduced [13, 23] in Eq. 4. It is unlikely that the bias terms are truly uncorrelated, as a bias is defined by a systematic difference between the expectation value of the measurand before and after a specific treatment. However, it is always possible to modify Eq. 4 in such a way that it can be expressed as a set of independent variables.

However, even without assuming that the bias terms are uncorrelated, Eq. 4 provides an explanation for the insignificant difference in the performance characteristics in the interlaboratory studies as shown in Fig. 1. As all laboratories maintain the operational conditions in steps such as crushing, milling, subdividing and analysis by means of a QAS, this will eventually result in comparable measurement results. Maintenance of the operation conditions during sample preparation and analysis steps will also minimise the variance of the biases associated with these steps in the measurement chain. That is, the terms Var(B_i) in Eq. 4 are minimised by a detailed description of the processing of the sampled material. If the QAS is successful, it may be expected that the contributions of the Var(B_i) terms in Eq. 4 will be small in comparison with the value of the term Var(X).

There are still two terms to be interpreted: E(X) and Var(X). It is well known that the concept of a true value is not very useful. The value E(X) is the expectation of the measurand, given the matrix and given the measurement method. The same holds for Var(X). Both parameters account for the performance characteristics of the test method involved. So, the fundament of the model given in Eq. 3 complies with the GUM. Moreover, it does not contain parameters that are inaccessible in practice. The concept of expectations also complies with the GUM, as it forms the basis for statistics. The concept of uncertainty is very closely related to the standard deviation [13].

Similarly, the QAS will also aim to reduce E(B_crushing) and E(B_subdividing) (and other bias terms) to values close to zero. Likewise, the values for Var(B_crushing) and Var(B_subdividing) will be minimised by maintaining the procedure as well as is feasible. In a separate paper [14] the determination of these terms has been discussed in more detail.

Finally, the combined measurement uncertainty of the measurement chain can also be calculated from the experimental biases and variances by the procedures as given in the GUM. Thus, with each of the terms known in Eqs. 3 and 4, the combined measurement uncertainty can be calculated. Whether the resulting combined standard uncertainty equals the reproducibility standard deviation depends on the answer to the question whether all sources of uncertainty have been included in the interlaboratory study. If this is the case, both processes should lead to the same result. If not, the reproducibility standard deviation will be lower than the combined standard uncertainty that would be obtained by identifying and quantifying all sources of uncertainty.

Conclusions

The concepts of the GUM are also valid for solid-state materials. With a careful interpretation of the statistical concepts of the standard for the organisation of interlaboratory studies, ISO 5725:1994 [10, 11, 24-26] can be brought into agreement with the concepts of the GUM.

A measurement chain is best evaluated when taking the consensus/certified value of a reference material as a reference term and expressing all other terms in the chain in the form of bias terms. The sum of the reference term and the bias terms defines the expectation value of the critical content typical for the laboratory, which cannot be better than the reference term (critical content and stated uncertainty of the reference material). The expectation value of the critical content is independent of a possible correlation of the bias terms.

The expression of the measurement chain in a reference term in combination with several bias terms enables the evaluation of the effectiveness of the reduction of the bias (and its variance).

An exact evaluation of the combined measurement uncertainty requires (at least) knowledge about the relationship between the critical content X and any of the bias terms B. The functional relationships between the bias terms may be left out in a first approximation, as it may be expected that successful implementation of a QAS will minimise the variance of these terms, and as a result will lead to a very low covariance value between any of the bias terms when compared to the variance of the critical content, Var(X).

The evaluation of the functional relationship between X and B is problematic for matrix materials (i.e. coal) due to "matrix effects", but can be well established for synthetic, more homogeneous systems.

Acknowledgements The European Coal and Steel Community (ECSC) is acknowledged for its financial support of this work done under contract number ECSC 7220/EC-036, "Preparation and characterisation of coal samples and maceral concentrates for studies on gasification and combustion reactivity of coals in combined cycle processes". The participants in the interlaboratory study programme are thanked for their work and their expression of interest during the project.
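As a numerical illustration of the reference-term model of Eqs. 3 and 4 in the uncertainty analysis, the sketch below adds the expected biases to E(X) and the bias variances to Var(X). It assumes uncorrelated bias terms, so no covariance terms appear; the function name, the dictionary layout and all numbers are hypothetical.

```python
from math import sqrt

def combine(mean_x, var_x, biases):
    """Expectation (Eq. 3) and standard deviation (square root of Eq. 4).

    biases maps a step name to (expected bias, variance of the bias).
    The bias terms are assumed to be uncorrelated.
    """
    mu = mean_x + sum(e for e, _ in biases.values())
    sigma2 = var_x + sum(v for _, v in biases.values())
    return mu, sqrt(sigma2)

# Hypothetical ash content (weight-%) with small, well-controlled bias terms
steps = {
    "crushing": (0.0, 0.0025),
    "subsampling": (0.01, 0.0025),
    "measurement": (-0.01, 0.0),
}
mu, sigma = combine(10.0, 0.04, steps)
print(f"mu={mu:.3f}  sigma={sigma:.4f}")
```

In this made-up case Var(X) = 0.04 dominates the sum, which mirrors the argument that a successful QAS keeps the Var(B_i) contributions small compared with Var(X).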

References

1. Veen AMH van der, Broos AJM (1996) Preparation and characterisation of coal samples and maceral concentrates for studies on gasification and combustion reactivity of coals in combined cycle processes. Draft final report, ECSC 7220/EC-036, Eygelshoven
2. Veen AMH van der (1994) ILS Coal Characterisation I, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
3. Veen AMH van der, Broos AJM (1994) ILS Coal Characterisation II, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
4. Veen AMH van der, Broos AJM (1995) ILS Coal Characterisation III, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
5. Veen AMH van der, Broos AJM (1995) ILS Coal Characterisation IV, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
6. Veen AMH van der, Broos AJM (1995) ILS Coal Characterisation V, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
7. Veen AMH van der, Broos AJM (1996) ILS Coal Characterisation VI, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
8. Veen AMH van der, Broos AJM (1996) ILS Coal Characterisation VII, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
9. Veen AMH van der, Broos AJM (1996) ILS Coal Characterisation VIII, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
10. International Organization for Standardization (1994) ISO 5725-1:1994 Accuracy (trueness and precision) of measurement methods and results, part 1. General principles and definitions. Statistical methods for quality control, vol 2, pp 9-29
11. International Organization for Standardization (1994) ISO 5725-2:1994 Accuracy (trueness and precision) of measurement methods and results, part 2. Basic method for the determination of repeatability and reproducibility of a standard measurement method. Statistical methods for quality control, vol 2, pp 30-74
12. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) International vocabulary of basic and general terms in metrology, 2nd edn. ISO, Geneva
13. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement, 1st edn. ISO, Geneva
14. Veen AMH van der, Alink A (1998) Accred Qual Assur 3:20-26
15. International Organization for Standardization (1996) ISO/IEC Guide 43-1: voting draft. Proficiency testing by interlaboratory comparisons, part 1. Development and operation of proficiency testing schemes
16. International Organization for Standardization (1989) ISO Guide 35:1989 - Certification of reference materials - general and statistical principles, 2nd edn. ISO, Geneva
17. International Organization for Standardization (1975) ISO 1988 Hard coal - sampling. ISO, Geneva
18. International Organization for Standardization (1997) ISO 1171 Solid mineral fuels - determination of ash. ISO, Geneva
19. Veen AMH van der, Alink A, Verkuil D, Lecq B van der (1996) Accred Qual Assur 1:207-212, 250
20. International Organization for Standardization (1994) ISO 3534-1:1993 Statistics - vocabulary and symbols, part 1. Probability and general statistical terms. Statistical methods for quality control, vol 1, pp 9-57
21. International Organization for Standardization (1994) ISO 3534-2:1993 Statistics - vocabulary and symbols, part 2. Statistical quality control. Statistical methods for quality control, vol 1, pp 58-92
22. International Organization for Standardization (1994) ISO 3534-3:1993 Statistics - vocabulary and symbols, part 3. Design of experiments. Statistical methods for quality control, vol 1, pp 93-134
23. DeGroot MH (1989) Probability and statistics, 2nd edn. Addison-Wesley
24. International Organization for Standardization (1994) ISO 5725-3:1994 Accuracy (trueness and precision) of measurement methods and results, part 3. Intermediate measures of the precision of a standard measurement method. Statistical methods for quality control, pp 75-104
25. International Organization for Standardization (1994) ISO 5725-4:1994 Accuracy (trueness and precision) of measurement methods and results, part 4. Basic methods for the determination of the trueness of a standard measurement method. Statistical methods for quality control, pp 105-130
26. International Organization for Standardization (1994) ISO 5725-6:1994 Accuracy (trueness and precision) of measurement methods and results, part 6. Use in practice of accuracy values. Statistical methods for quality control, pp 131-176
Accred Qual Assur (2000) 5:47-53
© Springer-Verlag 2000

Vicki J. Barwick · Stephen L.R. Ellison

The evaluation of measurement uncertainty from method validation studies
Part 1: Description of a laboratory protocol

Abstract A protocol has been developed illustrating the link between validation experiments, such as precision, trueness and ruggedness testing, and measurement uncertainty evaluation. By planning validation experiments with uncertainty estimation in mind, uncertainty budgets can be obtained from validation data with little additional effort. The main stages in the uncertainty estimation process are described, and the use of trueness and ruggedness studies is discussed in detail. The practical application of the protocol will be illustrated in Part 2, with reference to a method for the determination of three markers (CI solvent red 24, quinizarin and CI solvent yellow 124) in fuel oil samples.

Key words Measurement uncertainty · Method validation · Precision · Trueness · Ruggedness testing

V.J. Barwick (✉) · S.L.R. Ellison
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK
e-mail: vjb@lgc.co.uk
Tel.: +44-20-89437421
Fax: +44-20-89432767

Introduction

In recent years, the subject of the evaluation of measurement uncertainty in analytical chemistry has generated a significant level of interest and discussion. It is generally acknowledged that the fitness for purpose of an analytical result cannot be assessed without some estimate of the measurement uncertainty to compare with the confidence required. The Guide to the Expression of Uncertainty in Measurement (GUM) published by ISO [1] establishes general rules for evaluating and expressing uncertainty for a wide range of measurements. The guide was interpreted for analytical chemistry by EURACHEM in 1995 [2]. The approach described in the GUM requires the identification of all possible sources of uncertainty associated with the procedure; the estimation of their magnitude from either experimental or published data; and the combination of these individual uncertainties to give standard and expanded uncertainties for the procedure as a whole. Some applications of this approach to analytical chemistry have been published [3, 4]. However, the GUM principles are significantly different from the methods currently used in analytical chemistry for estimating uncertainty [5-8] which generally make use of "whole method" performance parameters, such as precision and recovery, obtained during in-house method validation studies or during method development and collaborative study [9-11]. We have previously described a strategy for reconciling the information requirements of formal (i.e. GUM) measurement uncertainty principles with the data generated from method validation studies [12-14]. The approach involves a detailed analysis of the factors influencing the result using cause and effect analysis [15]. This results in a structured list of the possible sources of uncertainty associated with the method. The list is then simplified and reconciled with existing experimental and other data. We now report the application of this approach in the form of a protocol for the estimation of measurement uncertainty from validation studies [16]. This paper outlines the key stages in the

protocol and discusses the use of data from trueness and ruggedness studies in detail. The practical application of the protocol will be described in Part 2 with reference to a high performance liquid chromatography (HPLC) procedure for the determination of markers in road fuel [17].

Principles of approach

The stages in the uncertainty estimation process are illustrated in Fig. 1. An outline of the procedure discussed in the protocol is presented in Fig. 2. The first stage of the procedure is the identification of sources of uncertainty for the method. Once the sources of uncertainty have been identified they require evaluation. The main tools for doing this are precision, trueness (or bias) and ruggedness studies. The aim is to account for as many sources of uncertainty as possible during the precision and trueness studies. Any remaining sources of uncertainty are then evaluated either from existing data (e.g. calibration certificates, published data, previous studies, etc.) or via ruggedness studies. Note that it may not be necessary to evaluate every source of uncertainty in detail, if the analyst has evidence to suggest that some are insignificant. Indeed, the EURACHEM Guide states that unless there are a large number of them, uncertainty components which are less than one-third of the largest need not be evaluated in detail. Finally, the individual uncertainty components for the method are combined to give standard and expanded uncertainties for the method as a whole. The use of data from trueness and ruggedness studies in uncertainty estimation is discussed in more detail below.

Fig. 1 Flow chart summarising the uncertainty estimation process. [Flow chart steps: identify sources of uncertainty; plan and carry out precision study; plan and carry out trueness study; identify additional sources of uncertainty and evaluate; combine individual uncertainty estimates to give standard and expanded uncertainties for the method; report and document the uncertainty.]

Trueness studies

In developing the protocol, the trueness of a method was considered in terms of recovery, i.e. the ratio of the observed value to the expected value. The evaluation of uncertainties associated with recovery is discussed in detail elsewhere [18, 19]. In general, the recovery, R, for a particular sample is considered as comprising three components:
- Rm is an estimate of the mean method recovery obtained from, for example, the analysis of a CRM or a spiked sample. The uncertainty in Rm is composed of the uncertainty in the reference value (e.g. the uncertainty in the certified value of a reference material) and the uncertainty in the observed value (e.g. the standard deviation of the mean of replicate analyses).
- Rs is a correction factor to take account of differences in the recovery for a particular sample compared to the recovery observed for the material used to estimate Rm.
- Rrep is a correction factor to take account of the fact that a spiked sample may behave differently to a real sample with incurred analyte.

These three elements are combined multiplicatively to give an estimate of the recovery for a particular sample, R, and its uncertainty, u(R):

$$ R = R_m \times R_s \times R_{rep} \qquad (1) $$

$$ u(R) = R \times \sqrt{\left(\frac{u(R_m)}{R_m}\right)^2 + \left(\frac{u(R_s)}{R_s}\right)^2 + \left(\frac{u(R_{rep})}{R_{rep}}\right)^2} \qquad (2) $$

Rm and u(Rm) are calculated using Eq. (3) and Eq. (4):

$$ R_m = \frac{\bar{C}_{obs}}{C_{RM}} \qquad (3) $$

$$ u(R_m) = R_m \times \sqrt{\left(\frac{s_{obs}}{\bar{C}_{obs}}\right)^2 + \left(\frac{u(C_{RM})}{C_{RM}}\right)^2} \qquad (4) $$

where C̄obs is the mean of the replicate analyses of the reference material (e.g. CRM or spiked sample), sobs is the standard deviation of the mean of the results, CRM is the concentration of the reference material and u(CRM) is the standard uncertainty in the concentration of the reference material. To determine the contribu-
182 V.J. BaIWick· S.L.R. Ellison

Fig. 2 Flow chart illustrating


the stages in the method vali-
dation/measurement uncer-
tainty protocol

Refine the list by resolving duplication


and grouping related terms

Carry out replicate analyses on Yes No


Does the method scope Carry out replicate analyses on
samples representative of
cover a range of analyte a representative sample
concentrations and/or matrices
across a number of batches
covered by method scope ~nc~~~::;nd/or

~
Remove sources of uncertainty covered
f-----tJ by precision experiments from list 14-------1

Yes No Calculate Rm from replicate


Calculate Rm from Representativ analyses of a representative
replicate analyses in a CRM available? spiked sample in a single
single batch batch or other method

tion of Rm to the combined uncertainty for the method uncertainty associated with Rm must be increased to
as a whole, the estimate is compared with 1, using an take account of this uncorrected bias. The relevant
equation of the form: equation is:

(5) -
u(Rm) -_V(1-Rm)2
I
- k - +u (-
Rm) 2 . (6)

To determine whether Rm is significantly different A special case arises when an empirical method is
from 1, the calculated value of t is compared with the being studied. In such cases, the method defines the
coverage factor, k = 2, which will be used to calculate measurand (e.g. dietary fibre, extractable cadmium
the expanded uncertainty [19]. A t value greater than 2 from ceramics). The method is considered to define the
suggests that Rm is significantly different from 1. How- true value and is, by definition, unbiased. The presump-
ever, if in the normal application of the method, no cor- tion is that Rm is equal to 1 and that the only uncertain-
rection is made to take account of the fact that the ty is that associated with the laboratory's particular ap-
method recovery is significantly different from 1, the plication of the method. In some cases, a reference ma-
The evaluation of measurement uncertainty from method validation studies. Part I: Description of a laboratory protocol 183

1
Fig. 2 Continued

Calculate Rs and
u(Rsl
~ cover a range of an~lyte
~
concentrations and/or
matrices?

'r~'
//~
stUd~
'>
Spiking Yes
used to estimate
Rm? /

~
Combine all recovery uncertainties to
give Rand u(RI 1(-------'

Remove sources of uncertainty


covered by trueness study from list

Identify sources of uncertainty not


covered by precision/trueness studies

Calculate uncertainty yeS~xisting d~ta'


No
available? ' - -_ _-,--_ _-.J

~/ ...
"'
<
~para~" >
have a significant
Yes Use data from ruggedness
study/ carry out additional

~~:::/
experiments

No

Combine preCision, recovery and


additional uncertainties to give
combined standard uncertainty
184 V.J. Barwick' S.L.R. Ellison

terial certified for use with the method may be availa- cially designed experimental studies. One efficient
ble. Where this is so, a bias study can be carried out method of experimental study is ruggedness testing,
and the results treated as discussed above. If there is no discussed below.
relevant reference material, it is not possible to esti-
mate the uncertainty associated with the laboratory
bias. There will still be uncertainties associated with Ruggedness studies
bias, but they will be associated with possible bias in
the temperatures, masses, etc. used to define the meth- Ruggedness tests are a useful way of investigating si-
od. In such cases it will normally be necessary to con- multaneously the effect of several experimental param-
sider these individually. eters on method performance. The experiments are
Where the method scope covers a range of sample based on the ruggedness testing procedure described in
matrices and/or analyte concentrations, an additional the Statistical Manual of the AOAC [20]. Such experi-
uncertainty term Rs is required to take account of dif- ments result in an observed difference, D Xi , for each pa-
ferences in the recovery of a particular sample type, rameter studied which represents the change in result
compared to the material used to estimate Rm. This can due to varying that parameter. The parameters are
be evaluated by analysing a representative range of tested for significance using a Student's t-test of the
spiked samples, covering typical matrices and analyte form [21]:
concentrations, in replicate. The mean recovery for
each sample type is calculated. Rs is normally assumed t= ynxD' i
(7)
to be equal to 1. However, there will be an uncertainty
-y2xs '
associated with this assumption, which appears in the where s is the estimate of the method precision, n is the
spread of mean recoveries observed for the different number of experiments carried out at each level for
spiked samples. The uncertainty, u(R,.), is therefore each parameter (n =4 for a seven-parameter Plackett-
taken as the standard deviation of the mean recoveries Burman experimental design), and DXi is the difference
for each sample type. calculated for parameter Xi' The values of t calculated
When a spiked sample, rather than a matrix refer- using Eq. (7) are compared with the appropriate critical
ence material, has been used to estimate Rm it may be values of t at 95% confidence. Note that the degrees of
necessary to consider Rrep and its uncertainty. In gener- freedom for terit relate to the degrees of freedom for the
al, Rrep is assumed to equal 1, indicating that the recov- precision estimate used in the calculation of t. For pa-
ery observed for a spiked sample is truly representative rameters identified as having no significant effect on
of that for the incurred analyte. The uncertainty, the method performance, the uncertainty in the final
u(Rrep), is a measure of the uncertainty associated with result y due to parameter Xi, u(y(xJ), is calculated using
that assumption. In some cases it can be argued that a Eq. (8):
spike is a good representation of a real sample, for ex-
ample in liquid samples where the analyte is simply dis- (y (Xi )) -_ -y2 X terit X S Oreal
(8)
solved in the matrix; u(Rrep) can therefore be assumed
U
Vn X 1.96 X--,
Owst
to be small. In other cases there may be reason to be- where Oreal is the change in the parameter which would
lieve that a spiked sample is not a perfect model for a be expected when the method is operating under con-
test sample and u(Rrep) may be a significant source of trol in routine use and Olest is the change in the param-
uncertainty. The evaluation of u(Rrep) is discussed in eter that was specified in the ruggedness study. In other
more detail elsewhere [18]. words, the uncertainty estimate is based on the 95%
confidence interval, converted to a standard deviation
by dividing by 1.96 [1,2]. The orca/Otest term is required
Evaluation of other sources of uncertainty to take account of the fact that the change in a parame-
ter used in the ruggedness test may be greater than that
An uncertainty evaluation must consider the full range observed during normal operation of the method. For
of variability likely to be encountered during applica- parameters identified as having a significant effect on
tion of the method. This includes parameters relating to the method performance, a first estimate of the uncer-
the sample (analyte concentration, sample matrix) as tainty can be calculated as follows:
well as experimental parameters associated with the
method (e.g. temperature, extraction time, equipment (9)
settings, etc.). Sources of uncertainty not adequately c= Observed change in result (10)
covered by the precision and trueness studies require
I Change in parameter '
separate evaluation. There are three main sources of
information: calibration certificates and manufacturers' where u(xJ is the uncertainty in the parameter and Ci is
specifications, data published in the literature and spe- the sensitivity coefficient.
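The ruggedness calculations in Eqs. (7) to (10) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the magnitude of the observed difference is used in the t-test (an assumption, since only the size of the effect matters for significance), and the precision estimate, critical t value and the δreal/δtest ratio in the last lines are invented for illustration. The sample volume figures are those reported for CI solvent red 24 in Table 1a of Part 2.

```python
import math


def t_statistic(d_xi, s, n):
    """Eq. (7): test statistic for the observed difference D_xi."""
    # Magnitude of D_xi is used (assumption): the sign does not affect significance.
    return math.sqrt(n) * abs(d_xi) / (math.sqrt(2) * s)


def u_not_significant(s, n, t_crit, delta_real, delta_test):
    """Eq. (8): uncertainty for a parameter with no significant effect."""
    limit_95 = math.sqrt(2) * t_crit * s / math.sqrt(n)  # 95% limit on D_xi
    # Convert to a standard deviation and scale to the routine parameter range.
    return limit_95 / 1.96 * (delta_real / delta_test)


def u_significant(d_xi, delta_test, u_x):
    """Eqs. (9)-(10): sensitivity coefficient c_i and u(y(x_i)) = u(x_i) * c_i."""
    c_i = abs(d_xi) / delta_test  # change in result per unit change in parameter
    return c_i, c_i * u_x


# Sample volume, CI solvent red 24 (Table 1a of Part 2):
# D = -0.353 mg/l for a 2 ml change (10 ml to 12 ml), u(x) = 0.04 ml.
c_i, u_y = u_significant(d_xi=-0.353, delta_test=2.0, u_x=0.04)
print(f"c_i = {c_i:.4f} mg/l per ml, u(y(x_i)) = {u_y:.5f} mg/l")

# Hypothetical inputs (not from the paper) for the t-test and Eq. (8).
S, N, T_CRIT = 0.05, 4, 2.2
print("significant:", t_statistic(-0.353, S, N) > T_CRIT)
print("u, no significant effect:",
      u_not_significant(S, N, T_CRIT, delta_real=0.2, delta_test=4.0))
```

Passing the critical t value in as an argument keeps the sketch free of external dependencies; in practice it would be looked up for the degrees of freedom of the precision estimate, as the text notes.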
The estimates obtained by applying Eqs. 8-10 are intended to give a first estimate of the measurement uncertainty associated with a particular parameter. If such estimates of the uncertainty are found to be a significant contribution to the overall uncertainty for the method, further study of the effect of the parameters is advised, to establish the true relationship between changes in the parameter and the result of the method. However, if the uncertainties are found to be small compared to other uncertainty components (i.e. the uncertainties associated with precision and trueness) then no further study is required.

Fig. 3 Contributions to the measurement uncertainty for the determination of CI solvent red 24, quinizarin and CI solvent yellow 124 in fuel oil. Bar chart with one bar per marker for each contribution; horizontal axis: uncertainty contribution as RSD (0 to 0.1). Contributions shown: recovery, u(R); brand of cartridge; sample volume; rate of elution; volume of hexane wash; concentration of butan-1-ol/hexane; volume of butan-1-ol/hexane; evaporation temperature; flow rate; injection volume; column temperature; detector wavelength A; detector wavelength B

Calculation of combined measurement uncertainty for the method

The individual sources of uncertainty, evaluated through the precision, trueness, ruggedness and other studies, are combined to give an estimate of the standard uncertainty for the method as a whole. Uncertainty contributions identified as being proportional to analyte concentration are combined using Eq. (11):

$$\frac{u(y)}{y} = \sqrt{\left(\frac{u(p)}{p}\right)^2 + \left(\frac{u(q)}{q}\right)^2 + \left(\frac{u(r)}{r}\right)^2 + \ldots} \qquad (11)$$

where the result y is affected by parameters p, q, r ... which each have uncertainties u(p), u(q), u(r) .... Uncertainty contributions identified as being independent of analyte concentration are combined using Eq. (12):
$$u(y)' = \sqrt{u(p)^2 + u(q)^2 + u(r)^2 + \ldots} \qquad (12)$$

The combined uncertainty in the result at a concentration y' is calculated as follows:

$$u(y') = \sqrt{\left(u(y)'\right)^2 + \left(y' \times \frac{u(y)}{y}\right)^2} \qquad (13)$$

Discussion and conclusions

We have developed a protocol which describes how data generated from experimental studies commonly undertaken for method validation purposes can be used in measurement uncertainty evaluation. The main experimental studies required are for the evaluation of precision and trueness. These should be planned so as to cover as many of the possible sources of uncertainty identified for the method as possible. Any remaining sources are considered separately. If there is evidence to suggest that they will be small compared to the uncertainties associated with precision and trueness, then no further study is required. However, for uncertainty components where no prior information is available, further experimental study will be required. One useful approach is ruggedness testing, which allows the evaluation of a number of sources of uncertainty simultaneously. It should be noted that ruggedness testing really only gives a first estimate of uncertainty contributions. Further study is recommended to refine the estimates for any sources of uncertainty which appear to be a significant contribution to the total uncertainty.

The main disadvantage of this approach is that it may not readily reveal the main sources of uncertainty for a particular method. In previous studies we have typically found the uncertainty budget to be dominated by the precision and trueness terms [14]. In such cases, if the combined uncertainty for the method is too large, indicating that the method requires improvement, further study may be required to identify the stages in the method which contribute most to the uncertainty. However, the approach detailed here will allow the analyst to obtain, relatively quickly, a sound estimate of measurement uncertainty, with minimum experimental work beyond that required for method validation.

We have applied this protocol to the evaluation of the measurement uncertainty for a method for the determination of three markers (CI solvent red 24, CI solvent yellow 124 and quinizarin (1,4-dihydroxyanthraquinone)) in road fuel. The method requires the extraction of the markers from the sample matrix by solid phase extraction, followed by quantification by HPLC with diode array detection. The uncertainty evaluation involved four experimental studies which were also required as part of the method validation. The studies were precision, trueness (evaluated via the analysis of spiked samples) and ruggedness tests of the extraction and HPLC stages. The experiments and uncertainty calculations are described in detail in Part 2. A summary of the uncertainty budget for the method is presented in Fig. 3.

Acknowledgements The work described in this paper was supported under contract with the Department of Trade and Industry as part of the United Kingdom National Measurement System Valid Analytical Measurement (VAM) Programme.
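As a worked illustration of the combination step in Eqs. (11) to (13), the following Python sketch combines concentration-proportional and concentration-independent contributions and evaluates the uncertainty at a chosen concentration. It assumes that Eq. (12) is the root sum of squares of the concentration-independent standard uncertainties; the function names and all numerical inputs are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def combine_relative(rel_uncertainties):
    """Eq. (11): combined relative uncertainty u(y)/y from terms u(p)/p, u(q)/q, ..."""
    return math.sqrt(sum(r * r for r in rel_uncertainties))


def combine_absolute(abs_uncertainties):
    """Eq. (12), assumed form: combined concentration-independent uncertainty u(y)'."""
    return math.sqrt(sum(u * u for u in abs_uncertainties))


def uncertainty_at(y_prime, rel_combined, abs_combined):
    """Eq. (13): standard uncertainty of a result at concentration y'."""
    return math.sqrt(abs_combined ** 2 + (y_prime * rel_combined) ** 2)


# Hypothetical budget: two proportional terms (as RSDs) and two fixed terms (mg/l).
rel = combine_relative([0.03, 0.04])
fixed = combine_absolute([0.006, 0.008])
u_std = uncertainty_at(y_prime=2.0, rel_combined=rel, abs_combined=fixed)

# Expanded uncertainty with the coverage factor k = 2 used in the protocol.
print(f"u(y)/y = {rel:.3f}, u(y)' = {fixed:.3f} mg/l")
print(f"u(y') = {u_std:.4f} mg/l, U = {2 * u_std:.4f} mg/l")
```

Keeping the two kinds of contribution separate until the final step lets the same budget be reused at any concentration within the method scope.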

References
1. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
2. EURACHEM (1995) Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist (LGC), London
3. Pueyo M, Obiols J, Vilalta E (1996) Anal Commun 33:205-208
4. Williams A (1993) Anal Proc 30:248-250
5. Analytical Methods Committee (1995) Analyst 120:2303-2308
6. Ellison SLR (1997) In: Ciarlini P, Cox MG, Pavese F, Richter D (eds) Advanced mathematical tools in metrology III. World Scientific, Singapore
7. Ellison SLR, Williams A (1998) Accred Qual Assur 3:6-10
8. Ríos A, Valcárcel M (1998) Accred Qual Assur 3:14-29
9. IUPAC (1988) Pure Appl Chem 60:885
10. AOAC (1989) J Assoc Off Anal Chem 72:694-704
11. ISO 5725:1994 (1994) Accuracy (trueness and precision) of measurement methods and results. ISO, Geneva
12. Ellison SLR, Barwick VJ (1998) Accred Qual Assur 3:101-105
13. Ellison SLR, Barwick VJ (1998) Analyst 123:1387-1392
14. Barwick VJ, Ellison SLR (1998) Anal Commun 35:377-383
15. ISO 9004-4:1993 (1993) Total quality management Part 2: Guidelines for quality improvement. ISO, Geneva
16. Barwick VJ, Ellison SLR (1999) Protocol for uncertainty evaluation from validation data. VAM Technical Report No. LGC/VAM/1998/088, available on LGC website at www.lgc.co.uk
17. Barwick VJ, Ellison SLR, Rafferty MJQ, Gill RS (1999) Accred Qual Assur
18. Barwick VJ, Ellison SLR (1999) Analyst 124:981-990
19. Ellison SLR, Williams A (1996) In: Parkany M (ed) The use of recovery factors in trace analysis. Royal Society of Chemistry, Cambridge
20. Youden WJ, Steiner EH (1975) Statistical manual of the Association of Official Analytical Chemists. Association of Official Analytical Chemists (AOAC), Arlington, Va.
21. Vander Heyden Y, Luypaert K, Hartmann C, Massart DL, Hoogmartens J, De Beer J (1995) Anal Chim Acta 312:245-262
Accred Qual Assur (2000) 5:104-113
© Springer-Verlag 2000

Vicki J. Barwick
Stephen L.R. Ellison
Mark J.Q. Rafferty
Rattanjit S. Gill

The evaluation of measurement uncertainty from method validation studies
Part 2: The practical application of a laboratory protocol

Abstract A protocol has been developed illustrating the link between validation experiments and measurement uncertainty evaluation. The application of the protocol is illustrated with reference to a method for the determination of three markers (CI solvent red 24, quinizarin and CI solvent yellow 124) in fuel oil samples. The method requires the extraction of the markers from the sample matrix by solid phase extraction followed by quantification by high performance liquid chromatography (HPLC) with diode array detection. The uncertainties for the determination of the markers were evaluated using data from precision and trueness studies using representative sample matrices spiked at a range of concentrations, and from ruggedness studies of the extraction and HPLC stages.

Key words Measurement uncertainty · Method validation · Precision · Trueness · Ruggedness · High performance liquid chromatography

V.J. Barwick (✉) · S.L.R. Ellison · M.J.Q. Rafferty · R.S. Gill
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK
e-mail: vjb@lgc.co.uk, Tel.: +44-20-8943 7421, Fax: +44-20-8943 2767

Introduction

In Part 1 [1] we described a protocol for the evaluation of measurement uncertainty from validation studies such as precision, trueness and ruggedness testing. In this paper we illustrate the application of the protocol to a method developed for the determination of the dyes CI solvent red 24 and CI solvent yellow 124, and the chemical marker quinizarin (1,4-dihydroxyanthraquinone) in road fuel. The analysis of road fuel samples suspected of containing rebated kerosene or rebated gas oil is required as the use of rebated fuels as road fuels or extenders to road fuels is illegal. To prevent illegal use of rebated fuels, HM Customs and Excise require them to be marked. This is achieved by adding solvent red 24, solvent yellow 124 and quinizarin to the fuel. A method for the quantitation of the markers was developed in this laboratory [2]. Over a period of time the method had been adapted to improve its performance and now required re-validation and an uncertainty estimate. This paper describes the experiments undertaken and shows how the data were used in the calculation of the measurement uncertainty.

Experimental

Outline of procedure for the determination of CI solvent red 24, CI solvent yellow 124 and quinizarin in fuel samples

Extraction procedure

The sample (10 ml) was transferred by automatic pipette to a solid phase extraction cartridge containing 500 mg silica. The cartridge was drained under vacuum until the silica bed appeared dry. The cartridge was then washed under vacuum with 10 ml hexane to remove residual oil. The markers were eluted from the cartridge under gravity with 10 ml butan-1-ol in hexane (10% v/v). The eluent was collected in a glass specimen vial and evaporated to dryness by heating to 50 °C under an air stream. The residue was dissolved in acetonitrile (2.5 ml) and the resulting solution placed in an ultrasonic bath for 5 min. The solution was then passed through a 0.45 µm filter prior to analysis by high performance liquid chromatography (HPLC).

HPLC conditions

The samples (50 µl) were analysed on a Hewlett Packard 1050 DAD system upgraded with a 1090 DAD optical bench. The column was a Luna 5 µm phenyl-hexyl, 250 mm × 4.6 mm, maintained at 30 °C. The flow rate was 1 ml min⁻¹ using a gradient elution of acetonitrile and water as follows:

Time (min)      |  0 |  3 |  4 |  5 |  9 | 10 | 20 | 21 | 23
% Water         | 40 | 40 | 30 | 10 | 10 |  2 |  2 | 40 | 40
% Acetonitrile  | 60 | 60 | 70 | 90 | 90 | 98 | 98 | 60 | 60

Calibration was by means of a single standard in acetonitrile containing CI solvent red 24 and CI solvent yellow 124 at a concentration of approximately 20 mg l⁻¹ and quinizarin at a concentration of approximately 10 mg l⁻¹. CI solvent red 24 and quinizarin were quantified using data (peak areas) recorded on detector channel B (500 nm), whilst CI solvent yellow 124 was quantified using data recorded on detector channel A (475 nm). The concentration of the analyte, C in mg l⁻¹, was calculated using Eq. (1):

$$C = \frac{A_S \times V_F \times C_{STD}}{A_{STD} \times V_S} \qquad (1)$$

where AS is the peak area recorded for the sample solution, ASTD is the peak area recorded for the standard solution, VF is the final volume of the sample solution (ml), VS is the volume of the sample taken for analysis (ml) and CSTD is the concentration of the standard solution (mg l⁻¹).

Experiments planned for validation and uncertainty estimation

A cause and effect diagram [3-5] illustrating the main parameters controlling the result of the analysis is presented in Fig. 1. Note that uncertainties associated with sampling are outside the scope of this study, as the uncertainty was required for the sample as received in the laboratory. The uncertainty contribution from sub-sampling the laboratory sample is represented by the "inhomogeneity" branch in Fig. 1. Initially, two sets of experiments were planned: a precision study and a trueness study. These were planned so as to cover as many sources of uncertainty as possible. Parameters not adequately covered by these experiments (i.e. not varied representatively) were evaluated separately using ruggedness tests or existing published data. Whilst these studies are required for the method validation process, it should be noted that they do not form a complete validation study [6].

Fig. 1 Cause and effect diagram illustrating sources of uncertainty for the method for the determination of markers in fuel oil. Main branches recoverable from the diagram: sample peak area (AS), working standard concentration (CSTD), recovery (R), working standard peak area (ASTD), sample volume (VS) and precision (P). Key: T temperature effects; C flask/pipette calibration; Bc balance calibration; L balance linearity
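The single-point calibration of Eq. (1) in the HPLC conditions above can be sketched as a short Python function. This is only an illustration: the function name and the peak areas are hypothetical, while the volumes (2.5 ml final volume, 10 ml sample) and the approximate 20 mg l⁻¹ standard concentration are taken from the procedure described in the text.

```python
def marker_concentration(a_s, a_std, v_f, v_s, c_std):
    """Eq. (1): analyte concentration C (mg/l) from a single-standard calibration.

    a_s, a_std : peak areas for the sample and standard solutions
    v_f        : final volume of the sample solution (ml)
    v_s        : volume of sample taken for analysis (ml)
    c_std      : concentration of the standard solution (mg/l)
    """
    return (a_s * v_f * c_std) / (a_std * v_s)


# Hypothetical peak areas; volumes and standard concentration as in the procedure.
c = marker_concentration(a_s=400.0, a_std=1000.0, v_f=2.5, v_s=10.0, c_std=20.0)
print(f"C = {c:.2f} mg/l")
```

Because a 10 ml sample is concentrated into 2.5 ml of acetonitrile, the ratio of final volume to sample volume (0.25) scales the concentration measured in the extract back to the original fuel.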
Precision experiments

Samples of 3 unmarked fuel oils (A-C) were fortified with CI solvent red 24 at concentrations of 0.041, 1.02, 2.03, 3.05 and 4.06 mg l⁻¹; quinizarin at concentrations of 0.040, 0.498, 0.996, 1.49 and 1.99 mg l⁻¹; and CI solvent yellow 124 at concentrations of 0.040, 1.20, 2.40, 3.99 and 4.99 mg l⁻¹ to give a total of 15 fortified samples. Oil B was a diesel oil, representing a sample of typical viscosity. Oil A was a kerosene and oil C was a lubricating oil. These oils are respectively less viscous and more viscous than oil B.

Initially, 12 sub-samples of oil B with a concentration of 2.03 mg l⁻¹ CI solvent red 24, 0.996 mg l⁻¹ quinizarin and 2.40 mg l⁻¹ CI solvent yellow 124 were analysed. The extraction stage was carried out in two batches of six on consecutive days. The markers in all 12 sub-samples were quantified in a single HPLC run, with the order of the analysis randomised. This study was followed by the analysis, in duplicate, of all 15 samples. The sample extracts were analysed in three separate HPLC runs such that the duplicates for each sample were in different runs. For each HPLC run a new standard and a fresh batch of mobile phase was prepared.

In addition, the results obtained from the replicate analysis of a sample of BP diesel, prepared for the trueness study (see below), were used in the estimate of uncertainty associated with method precision.

Trueness experiments

No suitable CRM was available for the evaluation of recovery. The study therefore employed representative samples of fuel oil spiked with the markers at the required concentrations. To obtain an estimate of Rm and its uncertainty, a 2-l sample of unmarked BP diesel was spiked with standards in toluene containing CI solvent red 24, quinizarin and CI solvent yellow 124 at concentrations of 0.996 mg ml⁻¹, 1.02 mg ml⁻¹ and 1.97 mg ml⁻¹, respectively, to give concentrations in the diesel of 4.06 mg l⁻¹, 1.99 mg l⁻¹ and 4.99 mg l⁻¹, respectively. A series of 48 aliquots of this sample were analysed in 3 batches of 16. The estimate of Rs and its uncertainty, u(Rs), was calculated from these results plus the results from the analysis of the samples used in the precision study.

Evaluation of other sources of uncertainty: Ruggedness test

The effects of parameters associated with the extraction/clean-up stages and the HPLC quantification stage were studied in separate experiments. The parameters studied for the extraction/clean-up stage of the method and the levels chosen are shown in Table 1a. The ruggedness test was applied to the matrix B (diesel oil) sample containing 2.03 mg l⁻¹ CI solvent red 24, 0.996 mg l⁻¹ quinizarin and 2.40 mg l⁻¹ CI solvent yellow 124 used in the precision study. The eight experiments were carried out over a short period of time and the resulting sample extracts were analysed in a single HPLC run. The HPLC parameters investigated and the levels chosen are given in Table 1b. For this set of experiments a single extract of the matrix B (diesel oil) sample, obtained under normal method conditions, was used. The extract and a standard were run under each set of conditions required by the ruggedness test. The effect of variations in the parameters was monitored by calculating the concentration of the markers observed under each set of parameters, using the appropriate standard.

Results and uncertainty calculations

Precision study

The results from the precision studies are summarised in Table 2. Estimates for the standard deviation for a single result were obtained from the results of the duplicate analyses of the 15 samples, by taking the standard deviation of the differences between the pairs and dividing by √2. Estimates of the relative standard deviations were obtained by treating the normalised differences in the same way [7]. The results from the analysis of the BP diesel sample represented three batches of 16 replicate analyses. An estimate of the total precision (i.e. within and between batch variation) was obtained via ANOVA [8]. The precision estimates cover different sources of variability in the method. The estimates obtained from the duplicate samples and the BP oil sample cover batch to batch variability in the extraction and HPLC stages of the method (including the preparation of new standards and mobile phase). The estimate obtained from matrix B does not cover batch to batch variability in the HPLC procedure as all the replicates were analysed in a single HPLC run. The precision studies also cover the uncertainty associated with sample inhomogeneity as they involved the analysis of a number of sub-samples taken from the bulk.

CI solvent red 24

No significant difference was observed (F-tests, 95% confidence) between the three estimates obtained for the relative standard deviation (0.0323, 0.0289 and 0.0414). However, the test was borderline and across the range studied (0.04 mg l⁻¹ to 4 mg l⁻¹) the method
Table 1 Results from the ruggedness testing of the procedure for the determination of CI solvent red 24, quinizarin and CI solvent yellow 124 in fuel oil. Dxi and u(y(xi)) are in mg l⁻¹

a Ruggedness testing of the extraction/clean-up procedure

Parameter | Values | δreal/δtest | u(xi) | Red 24 Dxi | Red 24 ci | Red 24 u(y(xi)) | Quinizarin Dxi | Quinizarin ci | Quinizarin u(y(xi)) | Yellow 124 Dxi | Yellow 124 ci | Yellow 124 u(y(xi))
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Brand of silica cartridges | A Varian; a Waters | 1/1 (a) | | 0.00750* | - | 0.0493 | -0.00750* | - | 0.0174 | -0.00250* | - | 0.0199
Sample volume | B 10 ml; b 12 ml | | 0.04 ml | -0.353 | 0.176 | 0.00705 | -0.180 | 0.090 | 0.00360 | -0.423 | 0.212 | 0.00845
Rate of elution of oil with hexane | C vacuum; c gravity | 1/10 (a) | | 0.0275* | | 0.00493 | 0.070 | | 0.0070 (a) | -0.020* | | 0.00199
Volume of hexane wash | D 12 ml; d 8 ml | | 0.04 ml | 0.213 | 0.0531 | 0.00213 | 0.176 | 0.0444 | 0.00177 | 0.225 | 0.0563 | 0.00225
Concentration of butan-1-ol/hexane | E 12%; e 8% | 0.2% (v/v)/4% (v/v) | | -0.0425* | | 0.00247 | 0.0175* | | 0.000868 | -0.010* | | 0.0010
Volume of 10% butan-1-ol/hexane | F 12 ml; f 8 ml | 0.08 ml/4 ml | 0.04 ml | 0.0625* | | 0.000986 | -0.0050* | | 0.000347 | 0.080 | 0.020 | 0.00080
Evaporation temperature | G 50 °C; g 80 °C | 10 °C/30 °C | 2.89 °C | -0.0275* | | 0.0164 | -0.0425 | 0.00142 | 0.00409 | 0.00750* | - | 0.00663

(a) See text for explanation
* No significant effect at the 95% confidence level

b Ruggedness testing of the HPLC procedure

Parameter | Values | δreal/δtest | u(xi) | Red 24 Dxi | Red 24 ci | Red 24 u(y(xi)) | Quinizarin Dxi | Quinizarin ci | Quinizarin u(y(xi)) | Yellow 124 Dxi | Yellow 124 ci | Yellow 124 u(y(xi))
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Type of acetonitrile in mobile phase | A' Far-UV grade; a' HPLC grade | (a) | | -0.0748 | (a) | (a) | -0.101 | (a) | (a) | -0.0228* | (a) | (a)
Flow rate | B' 0.8 ml min⁻¹; b' 1.2 ml min⁻¹ | | 0.00173 ml min⁻¹ | -0.124 | 0.309 | 0.000535 | 0.0283 | 0.0707 | 0.000122 | -0.0465 | 0.116 | 0.00020
Injection volume | C' 40 µl; c' 60 µl | | 0.75 µl | 0.115 | 0.00576 | 0.00432 | -0.0406 | 0.00203 | 0.00152 | 0.0284 | 0.00142 | 0.00107
Column temperature | D' 25 °C; d' 35 °C | 2 °C/10 °C | 1 °C | -0.130 | 0.0130 | 0.00752 | 0.0201 | 0.00201 | 0.00116 | -0.0233* | | 0.00282
Detector wavelength (A) | E' 465 nm; e' 485 nm | 4 nm/20 nm | 1.15 nm | 0.104 | 0.00520 | 0.00598 | 0.0239 | 0.00120 | 0.00138 | -0.0161* | | 0.00282
Degassing of mobile phase | F' Degassed; f' Not degassed | (a) | | 0.108 | (a) | (a) | 0.0641 | (a) | (a) | 0.00907* | (a) | (a)
Detector wavelength (B) | G' 490 nm; g' 510 nm | 4 nm/20 nm | 1.15 nm | 0.105 | 0.00525 | 0.00604 | -0.0112* | | 0.00154 | 0.0198* | | 0.00282

(a) See text for explanation
* No significant effect at the 95% confidence level
The evaluation of measurement uncertainty from method validation studies. Part 2 191

Table 2 Summary of data used in the estimation of u(P) the method is more variable across different matrices
and analyte concentrations for quinizarin than for the
Analyte/Matrix n Mean Standard Relative
(mg 1-1) deviation standard other markers. The uncertainty associated with the pre-
(mg 1-1) deviation cision was taken as the estimate of the relative standard
deviation obtained from the duplicate results, 0.0788.
CI solvent red 24 This estimate should ensure that the uncertainty is not
Matrix B 12 1.92 0.0021 0.0323
BP diesel 48" 3.88 0.112 0.0289
underestimated for any given matrix or concentration
Matrices A-C 15 b OJ)370 0.0414 (although it may result in an overestimate in some
Quinizarin cases).
Matrix B 11 0.913 0.0210 0.0230
BP diesel 48" 1.89 0.0250 0.0136
Matrices A-C 15 b 0.0470 0.0788
CI solvent yellow U4 CI solvent yellow 124
Matrix B        12     2.35    0.0251    0.0107
BP diesel       48ᵃ    4.99    0.0618    0.0124
Matrices A-C    15ᵇ            0.0247    0.0464

ᵃ Standard deviation and relative standard deviation estimated from ANOVA of 3 sets of 16 replicates (see text)
ᵇ Standard deviation and relative standard deviation estimated from duplicate results (15 sets) for a range of concentrations and matrices (see text)

precision was approximately proportional to analyte concentration. It was decided to use the estimate of 0.0414 as the uncertainty associated with precision, u(P), to avoid underestimating the precision for any given sample. This estimate was obtained from the analysis of different matrices and concentrations and is therefore likely to be more representative of the precision across the method scope.

Quinizarin

The estimates of the standard deviation and relative standard deviation were not comparable. In particular, the estimates obtained from the duplicate results were significantly different from the other estimates (F-tests, 95% confidence). There were no obvious patterns in the data so no particular matrix and/or concentration could be identified as being the cause of the variability. There was therefore no justification for removing any data and restricting the coverage of the uncertainty estimate, as in the case of CI solvent yellow 124 (see below). The results of the precision studies indicate that

There was no significant difference between the estimates of the relative standard deviation obtained for samples at concentrations of 2.4 mg l⁻¹ and 4.99 mg l⁻¹. However, the estimate obtained from the duplicate analyses was significantly greater than the other estimates. Inspection of that data revealed that the normalised differences observed for the samples at a concentration of 0.04 mg l⁻¹ were substantially larger than those observed at the other concentrations. Removing these data points gave a revised estimate of the relative standard deviation of 0.00903. This was in agreement with the other estimates obtained (F-tests, 95% confidence). The three estimates were therefore pooled to give a single estimate of the relative standard deviation of 0.0114. At present, the uncertainty estimate cannot be applied to samples with concentrations below 1.2 mg l⁻¹. Further study would be required to investigate in more detail the precision at these low levels.

Trueness study

Evaluation of Rm and u(Rm)

The results are summarised in Table 3. In each case Rm was calculated using Eq. (2):

$$R_m = \frac{\bar{C}_{obs}}{C_{RM}} \qquad (2)$$

where C̄obs is the mean of the replicate analyses of the spiked sample and CRM is the concentration of the
Table 3 Results from the replicate analysis of a diesel oil spiked with CI solvent red 24, quinizarin and CI solvent yellow 124

Analyte                  Target concentration,    Mean,               Standard deviation of the mean,
                         Cspike (mg l⁻¹)          C̄obs (mg l⁻¹)       sobs (mg l⁻¹)ᵃ
CI solvent red 24        4.06                     3.88                0.0360
Quinizarin               1.99                     1.89                0.00370
CI solvent yellow 124    4.99                     4.99                0.0167

ᵃ Estimated from ANOVA of 3 groups of 16 replicates according to ISO 5725:1994 [9]
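The recovery figures quoted for the three markers can be checked against Table 3 with a few lines of Python. This is a minimal sketch, not the authors' calculation: the forms used for Eqs. (2), (3) and (5) are assumptions chosen because they reproduce the reported values to within rounding, the coverage factor k = 2 in the bias term is likewise assumed, and the names `TABLE_3` and `recovery_summary` are hypothetical.

```python
import math

# Values transcribed from Table 3 and the accompanying text (all in mg/l).
# u_c is the standard uncertainty of the spiked concentration.
TABLE_3 = {
    "CI solvent red 24": {"c_spike": 4.06, "c_obs": 3.88, "s_obs": 0.0360, "u_c": 0.05},
    "Quinizarin": {"c_spike": 1.99, "c_obs": 1.89, "s_obs": 0.00370, "u_c": 0.025},
    "CI solvent yellow 124": {"c_spike": 4.99, "c_obs": 4.99, "s_obs": 0.0167, "u_c": 0.062},
}


def recovery_summary(c_spike, c_obs, s_obs, u_c, k=2.0):
    """Mean recovery, its standard uncertainty and the significance test."""
    # Assumed form of Eq. (2): ratio of observed mean to spiked concentration.
    r_m = c_obs / c_spike
    # Assumed form of Eq. (3): relative terms combined in quadrature;
    # s_obs is already the standard deviation of the mean.
    u_rm = r_m * math.sqrt((s_obs / c_obs) ** 2 + (u_c / c_spike) ** 2)
    # Eq. (4): test whether the recovery differs significantly from 1.
    t = abs(1.0 - r_m) / u_rm
    significant = t > 2.0
    # Assumed form of Eq. (5): enlarge u(Rm) when a significant bias
    # is left uncorrected; otherwise keep u(Rm) unchanged.
    if significant:
        u_used = math.sqrt(((1.0 - r_m) / k) ** 2 + u_rm ** 2)
    else:
        u_used = u_rm
    return {"r_m": r_m, "u_rm": u_rm, "t": t,
            "significant": significant, "u_used": u_used}


if __name__ == "__main__":
    for analyte, row in TABLE_3.items():
        s = recovery_summary(**row)
        print(f"{analyte}: Rm={s['r_m']:.3f}  u(Rm)={s['u_rm']:.4f}  "
              f"t={s['t']:.2f}  u(Rm) used={s['u_used']:.4f}")
```

Because the script starts from the rounded means in Table 3, the results agree with the quoted values only to about the third decimal place.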

spiked sample. The uncertainty, u(Rm), was calculated using Eq. (3):

$$u(R_m) = R_m \times \sqrt{\left(\frac{s_{obs}}{\bar{C}_{obs}}\right)^2 + \left(\frac{u(C_{RM})}{C_{RM}}\right)^2} \qquad (3)$$

where u(CRM) is the standard uncertainty in the concentration of the spiked sample. The standard deviation of the mean of the results, sobs, was estimated from ANOVA of the data according to Part 4 of ISO 5725:1994 [9].

Using information on the purity of the material used to prepare the spiked sample, and the accuracy and precision of the volumetric glassware and analytical balance used, the uncertainty in the concentration of CI solvent red 24 in the sample, u(CRM), was estimated as 0.05 mg l⁻¹ (detailed information on the estimation of uncertainties of this type is given in Ref. [7]). The uncertainties associated with the concentration of quinizarin and CI solvent yellow 124 were estimated as 0.025 mg l⁻¹ and 0.062 mg l⁻¹, respectively. The relevant values are:

CI solvent red 24: Rm = 0.957, u(Rm) = 0.0148
Quinizarin: Rm = 0.949, u(Rm) = 0.0121
CI solvent yellow 124: Rm = 1.00, u(Rm) = 0.0129

Applying Eq. (4):

$$t = \frac{|1 - R_m|}{u(R_m)} \qquad (4)$$

indicated that the estimates of Rm obtained for CI solvent red 24 and quinizarin were significantly different from 1.0 (t > 2) [7, 10]. During routine use of the method, the results reported for test samples will not be corrected for incomplete recovery of the analyte. Equation (5) was therefore used to calculate an increased uncertainty for Rm to take account of the uncorrected bias:

$$u(R_m)' = \sqrt{\left(\frac{1 - R_m}{k}\right)^2 + u(R_m)^2} \qquad (5)$$

where k is the coverage factor (k = 2) used to calculate the expanded uncertainty. u(Rm)' was calculated as 0.0262 for CI solvent red 24 and 0.0283 for quinizarin. The significance test for CI solvent yellow 124 indicated that Rm was not significantly different from 1.0. The uncertainty associated with Rm is the value of u(Rm) calculated above (i.e. 0.0129).

u(Rs) is the standard deviation of the mean recoveries obtained for the samples analysed in the precision studies and the BP diesel sample used in the study of Rm. This gave estimates of u(Rs) of 0.0322 for CI solvent red 24, 0.0932 for quinizarin and 0.0138 for CI solvent yellow 124. The estimate for CI solvent yellow 124 only includes concentrations above 1.2 mg l⁻¹, for the reason discussed in the section on precision.

Calculation of R and u(R)

The recovery, R, for a particular test sample and the corresponding uncertainty, u(R), is calculated using Eqs. (6) and (7):

$$R = R_m \times R_s \times R_{rep} \qquad (6)$$

$$u(R) = R \times \sqrt{\left(\frac{u(R_m)}{R_m}\right)^2 + \left(\frac{u(R_s)}{R_s}\right)^2 + \left(\frac{u(R_{rep})}{R_{rep}}\right)^2} \qquad (7)$$

In this study a spiked sample can be considered a reasonable representation of test samples of marked fuel oils. There is therefore no need to correct the estimates of Rm and u(Rm) by including the Rrep and u(Rrep) terms. Both Rm and Rs are assumed to be equal to 1. R is therefore also equal to 1. Combining the estimates of u(Rm) and u(Rs), the uncertainty u(R) was calculated as 0.0415 for CI solvent red 24, 0.0974 for quinizarin and 0.0187 for CI solvent yellow 124.

Ruggedness test of extraction/clean-up procedure

The results from the ruggedness study of the extraction/clean-up procedure are presented in Table 1a. The precision of the method for the analysis of the sample used in the ruggedness study had been estimated previously as 0.0621 mg l⁻¹ (ν = 11) for CI solvent red 24, 0.0216 mg l⁻¹ (ν = 10) for quinizarin and 0.0251 mg l⁻¹ (ν = 11) for CI solvent yellow 124. Parameters were tested for significance using Eq. (8):

$$t = \frac{\sqrt{n} \times D_{x_i}}{\sqrt{2} \times s} \qquad (8)$$

where s is the estimate of the method precision, n is the number of experiments carried out at each level for each parameter (n = 4 for a seven-parameter Plackett-Burman experimental design), and Dxi is the difference calculated for parameter xi [1, 11]. The degrees of freedom for tcrit relate to the degrees of freedom for the precision estimate used in the calculation of t.

The parameters identified as having no significant effect on method performance, at the 95% confidence level, are highlighted in Table 1a. For these parameters the uncertainty in the final result was calculated using Eq. (9):

$$u(y(x_i)) = \frac{\sqrt{2} \times t_{crit} \times s}{\sqrt{n} \times 1.96} \times \frac{\delta_{real}}{\delta_{test}} \qquad (9)$$

where δreal is the change in the parameter which would be expected when the method is operating under control in routine use and δtest is the change in the parameter that was specified in the ruggedness study. The estimates of δreal are given in Table 1a. For parameter A, brand of silica cartridge, the conditions of the test (i.e. changing between two brands of cartridge) were considered representative of normal operation of the method. δreal is therefore equal to δtest. The effect of the rate of elution of oil by hexane from the cartridge was investigated by comparing the elution under a vacuum with elution under gravity. In routine analyses, the oil will be eluted under vacuum. Variations in the vacuum applied from one extraction to another will affect the rate of elution of the oil and the amount of oil eluted. However, the effect of variations in the vacuum will be small compared to the effect of having no vacuum present. It can therefore be assumed that variations in the observed concentration of the markers, due to variability in the vacuum, will be small compared to the differences observed in the ruggedness test. As a first estimate, the effect of variation in the vacuum during routine application of the method was estimated as one-tenth of that observed during the ruggedness study. This indicated that the parameter was not a significant contribution to the overall uncertainty for CI solvent red 24 and CI solvent yellow 124, so no further study was required. The estimates of δreal for the concentration and volume of butan-1-ol in hexane used to elute the column were based on the manufacturers' specifications and typical precision data for the volumetric flasks and pipettes used to prepare and deliver the solution.

For the parameters identified as having a significant effect on method performance, the uncertainty was calculated using Eqs. (10) and (11):

$$u(y(x_i)) = u(x_i) \times c_i \qquad (10)$$

$$c_i = \frac{\text{Observed change in result}}{\text{Change in parameter}} \qquad (11)$$

The estimates of the uncertainty in each parameter, u(xi), are given in Table 1a. The uncertainties associated with the sample volume, volume of hexane wash and volume of the 10% butan-1-ol/hexane solution were again based on the manufacturers' specifications and typical precision data for the volumetric flasks and pipettes used to prepare and deliver the solutions. The uncertainty in the evaporation temperature was based on the assumption that the temperature could be controlled to ±5 °C. This was taken as a rectangular distribution and converted to a standard uncertainty by dividing by √3 [7]. As discussed previously, the effect on the final result of variations in the vacuum when eluting the oil from the cartridge with hexane was estimated as one-tenth that observed in the ruggedness test.

The effects of all the parameters were considered to be proportional to the analyte concentration. The uncertainties were therefore converted to relative standard deviations by dividing by the mean of the results obtained from previous analyses of the sample under normal method conditions (see results for Matrix B in Table 2).

Ruggedness test of the HPLC procedure

The results from the ruggedness study of the HPLC procedure, and the values of δreal and u(xi) used in the uncertainty calculations, are presented in Table 1b. Replicate analyses of a standard solution of the three markers gave the following estimates of the precision of the HPLC system at the concentration of the sample used in the study: CI solvent red 24, s = 0.0363 mg l⁻¹ (n = 69); quinizarin, s = 0.0107 mg l⁻¹ (n = 69); CI solvent yellow 124, s = 0.0196 mg l⁻¹ (n = 69). Parameters were tested for significance, at 95% confidence, using Eq. (8). The uncertainties for parameters identified as having no significant effect on the method performance were calculated using Eq. (9). Based on information from manufacturers' specifications for HPLC systems, the uncertainty associated with the column temperature was estimated as ±1 °C, giving an estimate of δreal of 2 °C. Again, based on manufacturers' specifications for DAD detectors, the uncertainty associated with the detector wavelengths was estimated as ±2 nm, giving a δreal value of 4 nm.

The uncertainties due to significant parameters were estimated using Eqs. (10) and (11). Information in the literature suggests that a typical variation in flow rate is ±0.3% [12]. The uncertainty in the flow rate was therefore estimated as 0.00173 ml min⁻¹, assuming a rectangular distribution. Data in the literature gave 1.5% as a typical coefficient of variation for the volume delivered by an autosampler [13]. The uncertainty associated with the injection volume of 50 µl was therefore estimated as 0.75 µl.

Two remaining parameters merit further discussion: the type of acetonitrile used in the mobile phase and whether or not the mobile phase was degassed. The method was developed using HPLC grade acetonitrile. The ruggedness test indicated that changing to far-UV grade results in a lower recovery for all three analytes. The method protocol should therefore specify that for routine use, HPLC grade acetonitrile must be used. The ruggedness test also indicated that not degassing the mobile phase causes a reduction in recovery. The method was developed using degassed mobile phase, and the method protocol will specify that this must be the case during future use of the method. As these two parameters are being controlled in the method protocol, uncertainty terms have not been included.

The effects of all the parameters were considered to be proportional to the analyte concentration. The uncertainties were therefore converted to relative standard deviations by dividing by the mean of results obtained from previous analyses of the sample under normal method conditions (see results for Matrix B in Table 2).

Other sources of uncertainty

The precision and trueness studies were designed to cover as many of the sources of uncertainty as possible (see Fig. 1), for example, by analysing different sample matrices and concentration levels, and by preparing new standards and HPLC mobile phase for each batch of analyses. Parameters which were not adequately varied during these experiments, such as the extraction and HPLC conditions, were investigated in the ruggedness tests. There are, however, a small number of parameters which were not covered by the above experiments. These generally related to the calibration of pipettes and balances used in the preparation of the standards and samples. For example, during this study the same pipettes were used in the preparation of all the working standards. Although the precision associated with the operation of the pipette is included in the overall precision estimate, the effect of the accuracy of the pipettes has not been included in the uncertainty budget so far. A pipette used to prepare the standard may typically deliver 0.03 ml above its nominal value. In the future a different pipette, or the same pipette

Fig. 2 Contributions as relative standard deviations (RSDs) to the measurement uncertainty for the determination of CI solvent red 24, quinizarin and CI solvent yellow 124 in fuel oil. [Bar chart with one bar per analyte for each contribution; horizontal axis: uncertainty contribution as RSD, 0 to 0.1. Contributions plotted: precision, u(P); sample volume; rate of elution; volume of hexane wash; concentration of butan-1-ol/hexane; volume of butan-1-ol/hexane; evaporation temperature; flow rate; injection volume; column temperature; detector wavelength A; detector wavelength B.]

after re-calibration, may deliver 0.02 ml below the nominal value. Since this possible variation is not already included in the uncertainty budget it should be considered separately. However, previous experience [14] has shown us that uncertainties associated with the calibration of volumetric glassware and analytical balances are generally small compared to other sources of uncertainty such as overall precision and recovery. Additional uncertainty estimates for these parameters have not therefore been included in the uncertainty budgets.

Calculation of measurement uncertainty

The contributions to the uncertainty budget for each of the analytes are illustrated in Fig. 2. In all cases the sources of uncertainty were considered to be proportional to analyte concentration. Using Eq. (12):

(12)

the uncertainty in the final result, u(y), was calculated as 0.065 for CI solvent red 24, 0.13 for quinizarin and 0.024 for CI solvent yellow 124, all expressed as relative standard deviations. The expanded uncertainties, calculated using a coverage factor of k = 2 which gives a confidence level of approximately 95%, are 0.13, 0.26 and 0.048 for CI solvent red 24, quinizarin and CI solvent yellow 124, respectively.

Discussion

In the case of CI solvent red 24 and CI solvent yellow 124, the significant contributions to the uncertainty budget arose from overall precision and recovery, and the brand of the solid phase extraction cartridge used. If a reduction in the overall uncertainty of the method was required, useful approaches would be to specify a particular brand of cartridge in the method protocol, or to adopt matrix specific recovery corrections for test samples.

The combined uncertainty for quinizarin, which is significantly greater than that calculated for the other markers, is dominated by the precision and recovery terms. The results of the precision study indicated variable method performance across different matrices and analyte concentrations. The uncertainty, u(Rs), associated with the variation in recovery from sample to sample was the major contribution to the recovery uncertainty, u(R). This was due to the fact that the recoveries obtained for matrix B were generally higher than those obtained for matrices A and C. However, in this study, a single uncertainty estimate for all the matrices and analyte concentrations studied was required. It was therefore necessary to use "worst case" estimates of the uncertainties for precision and recovery to adequately cover all sample types. If this estimate was found to be unsatisfactory for future applications of the method, separate budgets could be calculated for individual matrices and concentration ranges.

Conclusions

We have developed a protocol which describes how data generated from experimental studies commonly undertaken for method validation purposes can be used in measurement uncertainty evaluation. This paper has illustrated the application of the protocol. In the example described, the uncertainty estimate for three analytes in different oil matrices was evaluated from three experimental studies, namely precision, recovery and ruggedness. These studies were required as part of the method validation, but planning the studies with uncertainty evaluation in mind allowed an uncertainty estimate to be calculated with little extra effort. A number of areas were identified where additional experimental work may be required to refine the estimates. However, the necessary data could be generated by carrying out additional analyses alongside routine test samples. Again this would minimise the amount of laboratory effort required.

For methods which are already in routine use there may be historical validation data available which could be used, in the same way as illustrated here, to generate an uncertainty estimate. If no such data are available, the case study gives an indication of the type of experimental studies required. Again, with careful planning, it is often possible to undertake the studies alongside routine test samples.

Acknowledgments The work described in this paper was supported under contract with the Department of Trade and Industry as part of the United Kingdom National Measurement System Valid Analytical Measurement (VAM) Programme.
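The arithmetic used throughout the uncertainty budget (converting rectangular limits to standard uncertainties, combining relative uncertainties and applying the coverage factor) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' procedure: it assumes that independent relative contributions are combined as a root sum of squares, the nominal flow rate of 1 ml min⁻¹ is an assumption made so that ±0.3% reproduces the quoted 0.00173 ml min⁻¹, and all function names are hypothetical.

```python
import math


def rectangular_to_standard(half_width):
    """Standard uncertainty of a rectangular distribution of +/- half_width."""
    return half_width / math.sqrt(3)


def combine(relative_uncertainties):
    """Root-sum-of-squares combination of independent relative uncertainties."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))


def expand(u, k=2.0):
    """Expanded uncertainty for coverage factor k (k = 2, about 95 %)."""
    return k * u


# Assumption: nominal flow rate of 1 ml/min (not stated in the text).
NOMINAL_FLOW_ML_MIN = 1.0

u_flow = rectangular_to_standard(0.003 * NOMINAL_FLOW_ML_MIN)  # +/-0.3 % of flow
u_evap_temp = rectangular_to_standard(5.0)                     # +/-5 degC
u_injection = 0.015 * 50.0                                     # 1.5 % CV of 50 ul

# Recovery uncertainty from the quoted u(Rm)' and u(Rs), with Rm = Rs = 1.
u_r_red24 = combine([0.0262, 0.0322])
u_r_quinizarin = combine([0.0283, 0.0932])

# Expanded uncertainties from the reported combined values u(y).
REPORTED_U_Y = {
    "CI solvent red 24": 0.065,
    "Quinizarin": 0.13,
    "CI solvent yellow 124": 0.024,
}
expanded = {name: expand(u) for name, u in REPORTED_U_Y.items()}

if __name__ == "__main__":
    print(f"u(flow rate) = {u_flow:.5f} ml/min")
    print(f"u(evaporation temperature) = {u_evap_temp:.2f} degC")
    print(f"u(injection volume) = {u_injection:.2f} ul")
    print(f"u(R): red 24 = {u_r_red24:.4f}, quinizarin = {u_r_quinizarin:.4f}")
    for name, value in expanded.items():
        print(f"U({name}) = {value:.3f}")
```

The full combined values u(y) cannot be recomputed here because the individual ruggedness contributions are given only in Tables 1a and 1b, so the sketch starts from the reported u(y) when applying k = 2.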

References

1. Barwick VJ, Ellison SLR (1999) Accred Qual Assur (in press)
2. May EM, Hunt DC, Holcombe DG (1986) Analyst 111:993-995
3. Ellison SLR, Barwick VJ (1998) Accred Qual Assur 3:101-105
4. Ellison SLR, Barwick VJ (1998) Analyst 123:1387-1392
5. ISO 9004-4:1993 (1993) Total quality management Part 2. Guidelines for quality improvement. ISO, Geneva, Switzerland
6. EURACHEM (1998) The fitness for purpose of analytical methods, a laboratory guide to method validation and related topics. Laboratory of the Government Chemist, London
7. EURACHEM (1995) Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist, London
8. Farrant TJ (1997) Practical statistics for the analytical scientist: a bench guide. Royal Society of Chemistry, Cambridge
9. ISO 5725:1994 (1994) Accuracy (trueness and precision) of measurement methods and results. ISO, Geneva, Switzerland
10. Ellison SLR, Williams A (1996) In: Parkany M (ed) The use of recovery factors in trace analysis. Royal Society of Chemistry, Cambridge
11. Youden WJ, Steiner EH (1975) Statistical manual of the association of official analytical chemists. Association of Official Analytical Chemists, Arlington, Va
12. Brown PR, Hartwick RA (eds) (1989) High performance liquid chromatography. Wiley, New York
13. Dolan JW (1997) LC-GC International 10:418-422
14. Barwick VJ, Ellison SLR (1998) Anal Comm 35:377-383
Accred Qual Assur (1998) 3:412-415
© Springer-Verlag 1998

Stephan Küppers

Is the estimation of measurement uncertainty a viable alternative to validation?

Presented at: Analytica Conference 98, Symposium 2: "Uncertainty budgets in chemical measurements", Munich, 21-24 April 1998

S. Küppers
Schering AG, In-Process-Control, Müllerstrasse 170-178, D-13342 Berlin, Germany
Tel.: +49-30-468-1-7819
Fax: +49-30-468-9-7819
e-mail: stephan.kueppers@schering.de

Abstract Two examples of the use of measurement uncertainty in a development environment are presented and compared to the use of validation. It is concluded that measurement uncertainty is a good alternative to validation for chemical processes in the development stage. Some advantages of measurement uncertainty are described. The major advantages are that the estimations of measurement uncertainty are very efficient, and can be performed before analysis of the samples. The results of measurement uncertainty influence the type of analysis employed in the development process, and the measurement design can be adjusted to the need of the process.

Key words Chemical analysis · Development process · Measurement uncertainty · Validation · Practical examples

Introduction

The "know how" in quality assessment has grown over the last few years. Quality management systems are improving and validation procedures are being optimized. But the problem that validation procedures are time consuming still remains. Often validations are performed too late, especially in a development environment where the advance from one development step to the next is based on a decision resulting from the previous step. Therefore, valid analytical data is needed directly after each method development.

The estimation of measurement uncertainty is an alternative to validation if a quantitative result is used for the assessment of the development process. In this case only one parameter of the validation is needed. The precision of the analysis is sometimes taken as a quality figure instead of a validation. But precision cannot replace validation because in this case the environment, i.e. the sample preparation, is neglected and precision is only a representative of one contribution to the uncertainty of measurement: in high performance liquid chromatography (HPLC) usually the uncertainty contribution of the sampler.

To illustrate the point, two examples taken from a process research environment in a pharmaceutical company are presented. It is demonstrated that measurement uncertainty has practical advantages compared to validation.

Examples of process development

The first example presents a typical process from our research environment, where a chemical synthesis is transferred from development to production. In this situation validation of the chemical process is performed throughout, including variations of process parameters for intermediates that are not isolated. For these intermediates a pre-selected analytical method is normally

available, usually a method that has been employed for some time. Because no validation report is requested by regulatory authorities, no formal validation is performed.

An example of a typical validation scheme for a chemical synthesis in process research is given in Fig. 1. In this example the process research chemist sets up a number of experiments. The most important question presented by the chemist to the analytical department is: Is the variability of the analytical process small compared to the variation performed in the process validation? [1, 2]

Fig. 1 The synthesis and variations performed in a process research validation shown schematically. HV stands for performed according to the manufacturing formula. The example shown illustrates the first step of the chemical synthesis. Circles show the variations of the two parameters tested in the validation (Liel and TOBO are internal names for chemicals used in the synthesis, eq stands for equivalent). [Scheme: starting material, 1. step, 2. step, 3. step, final step, final product.]

The original manufacturing formula (HV) and five variations are performed in the first step of the synthesis. Six samples are analysed. The results of these six analyses are used to assess the validation of this process step. In this case validation of the analytical method is a prerequisite for any decision that is made about the validity of the process. This information is needed before the process research chemist can start variations of the process, otherwise it is possible that the data received cannot be assessed. The difficulty of assessing the data of the process validation results from the fact that the data is influenced by the analytical method and the uncertainty of the chemical process. If the uncertainty of the analytical method is larger than or in the same range as the variations of the chemical process, assessment of the data is not possible.

To assess uncertainty, a type B estimation of uncertainty is performed. After analysis of the samples the type B estimation can be verified by a type A estimation analogous to the concept presented by Henrion et al. [3]. In our laboratory, a type B estimation can be performed with confidence because we have performed this type of analysis (an HPLC method with external standard calibration) about 50 times before. A control sample is included in every analysis and the results are plotted on a control chart. The control chart, representing the total measurement uncertainty for the analytical method, is then reviewed for the estimation of uncertainty. The standard deviation for the assay of a control sample for 50 analyses was 1.51%. An example of the way in which a control sample can be used for measurement uncertainty is presented in detail in [1].

The typical uncertainty components for this type of analysis are (Table 1):
- inhomogeneity of the sample and sampling
- weighing of reference materials (about 50 mg)
- weighing of samples (about 50 mg)
- uncertainty of the instrumentation (sampler uncertainty)
- evaluation uncertainty (uncertainty caused by integration)
- uncertainty of reference materials.

Table 1 First step in the estimation of the measurement uncertainty of high performance liquid chromatography (HPLC) analysis as a method used for the assessment of a chemical process

Uncertainty component                                                  Min      Max
Inhomogeneity of the sample and inhomogeneity of sampling              <0.2%    0.5%
Weighing of the reference materials or the sample (about 50 mg)        <0.1%    0.1%
Uncertainty of the instrumentation (sampler uncertainty) and
  reference material(s) depending on the number of injections          0.5%     1.5%
Evaluation uncertainty (uncertainty caused by integration)
Uncertainty of the reference material(s)                               <0.1%    0.1%

The case presented here is simple because all six samples can be analysed as one set of samples in one analytical run. For a single run the estimation of uncertainty can be reduced to the calibration of the method with an in-house reference material and the uncertainty of the analysis itself. In this example inhomogeneity of the sample, the evaluation uncertainty and the uncertainty of the reference material (i.e. inhomogeneity) as given in Table 1 can be neglected because the results of only one analysis are compared. The weighing uncertainties can be neglected because they are small compared to the sampler uncertainty.

If the measurement uncertainty is estimated using the parameters in Table 1 but for a longer time period

as for example covered in the control chart, the inhomogeneity of the sample has to be included in the estimation. Using the mean values between the min and max column in Table 1 the uncertainty estimated by uncertainty propagation is 1.46%, which is close to the value found in the control chart. In the case presented here the estimation of measurement uncertainty can be performed in only two steps as shown in Table 2. Calibration and analysis of samples represent the major uncertainties and combined they provide the complete uncertainty of our experiment; in this case the uncertainty of the HPLC sampler. The influence of the HPLC sampler is known; therefore, there are two options to perform the analysis.

Table 2 Results of the estimation of the measurement uncertainty for HPLC analysis

                                                              Option 1ᵃ    Option 2ᵇ
Calibration                                                   1%           1%
Analysis of samples                                           1.5%         1%
Total uncertainty (received from uncertainty propagation)     1.8%         1.4%

ᵃ Option 1: two weights of the calibration standard with six injections for each of them, two weights of the sample with two injections for each weight
ᵇ Option 2: two weights of the sample with four injections for each weight

Together with the customer it was decided to perform the analysis according to option 2. The samples were analysed in one analytical run. The result is shown in Table 3. The result of the estimation compares well with the "found" results. The found uncertainty is smaller than the estimation. Assessment of the results of the validation of the manufacturing formula becomes easier from the customer's point of view because the customer is able to decide if a variation of his result is related to his process or to the uncertainty of the analytical method. Additionally, the influence of the individual contributions to uncertainty becomes smaller because of the uncertainty propagation. Therefore, the difference between the estimated and found uncertainty becomes smaller with an increasing number of parameters that influence uncertainty.

Table 3 Comparison of the results of the estimation and the analysis (type B estimation compared to type A estimation)

                                                      Estimation    Found (one example)
Calibration                                           1%            0.8%
Analysis of samples                                   1%            0.8%
Total uncertainty (by uncertainty propagation)        1.4%          1.14%

The estimation of uncertainty replaces a full validation of the analytical method. It generates the necessary information at the right time. The statistical information received from the analysis can be used for the interpretation of the data and finally the analysis is designed to the customer's needs. In this case measurement uncertainty is a good alternative to validation.

The second example illustrates the determination of water content, which is an important characteristic for chemical substances and is needed in many chemical reactions. It is usually determined by Karl Fischer (KF) titration. The water content determined in our laboratory ranges from <0.1% to about 30%. It is widely known that KF water titrations may be influenced by the sample and, depending on the range, some other parameters may significantly affect the uncertainty.

Because the concentration of a reagent is often determined on the basis of the water content of the reaction mixture, uncertainty information for the water determination is needed. The problem in a development environment is that various synthesis routes are tested. If salts are used in a chemical reaction it is usual that chemists test different counter ions for the optimization of the synthesis. However, different salts are rarely tested in the analytical department. One of the problems of the variation of counter ions is that the hygroscopicity of the salts is often different.

Two independent steps have to be followed:
1. Substance dependent influences have to be observed. In most cases the chemical structure of the component is known and therefore serious mistakes can be avoided. Titration software and various KF reagents have to be available and standard operation procedures have to be established.
2. The individual uncertainty has to be considered. Therefore a preliminary specification has to be set and a type B estimation of uncertainty can be used to show if there is any problem arising from the data and the specification limit [4].

The first step is a general task using the appropriate equipment and has been established in our laboratory. However, the second part needs to be discussed in detail.

Suppose there is a chemical reaction with reagent A where:

A + B → C    (1)

The alternative reaction for A may also be with water to D:

A + H2O → D    (2)

The reaction with water is often faster than with B. Because of the low molecular weight of water, 0.5% (w/w) of water in B may be 10 mol %. Therefore an excess of at least 10 mol % of A might be needed to complete the reaction. The water content is determined in the

Table 4 Estimation of the measurement uncertainty for the titration of water (example performed manually)

Results from the titrations (% (w/w) from the weight of the sample)           0.20%; 0.23%    0.215%
                                                                                              +0.0150%
Minimum uncertainty of 1.1%                                                   1.1%            +0.00236%
Hygroscopicity (estimated on the basis of experience with amine compounds)    5%              +0.00621%
Uncertainty of the titer                                                      1%              +0.00124%
Reaction of the sample with the solvent                                       5%              +0.00621%
Result reported including uncertainty                                                         0.25%

The factors for the three contributions mentioned above are estimated on the basis of experience. The calculation is performed using a computer program [5]. This makes the decision easy and fast. In this case the type A estimation on the basis of the results and the type B estimation of the influence factors are combined. An example is given in Table 4 in a compressed form.

The alternative would be an experimental validation. In this case the uncertainty estimation has proven to be a very useful alternative to validation, although, on the basis of experience, the estimate of hygroscopicity is difficult and may lead to incorrect values.

Conclusions
analytical department. For example the value deter-
mined by KF titration is 0.22% (w/w) from two measur- Method validation is a process used to confirm that an
ements with 0.2% (w/w) and 0.23% (w/w) as the indi- analytical procedure employed for a specific test is suit-
vidual values. The question that has to be asked is: Is able for the intended use. The examples above show
0.22% of the weight always smaller than 0.25% (w/w)? that the estimation of measurement uncertainty is a vi-
What we need is the measurement uncertainty added to able alternative to validation. The estimation of meas-
the "found" value. If this value is smaller than the limit urement uncertainty can be used to confirm that an
(0.25%) the pre-calculated amount of the reagents can analytical procedure is suitable for the intended use. If
be used. the estimation of measurement uncertainty is used to-
The model set up consists of two terms: gether with validations both the uncertainty estimation
and the validation have their own place in a develop-
Y=X*(l+ U)
mental environment. The major advantages of meas-
where X is the mean value of the measurements plus urement uncertainty are that it is fast and efficient.
the standard uncertainty. U is the sum of the various Normally, if the analytical method is understood by the
influence parameters on measurement uncertainty: laboratory very similar results are found for the estima-
- hygroscopicity tion of uncertainty and for the classical variation of crit-
- uncertainty of the titer ical parameters, namely, validation. The decision on
- reaction of the KF solvent with the sample how to perform a validation should be made on a case
- a minimum uncertainty constant of 1.1% (taken to case basis depending on experience.
from the control chart of the standard reference ma-
terial) covering balance uncertainties, the influence Acknowledgements Fruitful scientific discussions with Dr. P.
of water in the atmosphere and the instrument un- Blaszkiewicz, Schering AG, Berlin and Dr. W. Hasselbarth,
certainty with the detection of the end of titration. BAM, Berlin are gratefully acknowledged.
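The figures of Table 4 in the water titration example can be cross-checked with a short calculation. The sketch below is only an illustration and is not the computer program cited as [5]. It assumes that the duplicate results are treated as a type A estimate (standard deviation of the mean), that the 1.1% control-chart value is used directly as a relative standard deviation, that the three experience-based factors are half-widths of rectangular distributions (divided by the square root of 3), and that the contributions are combined as a root sum of squares with a coverage factor of 2. These assumptions are not stated in the text; they are chosen because they reproduce the tabulated contributions.

```python
import math

# Duplicate Karl Fischer results from Table 4, in % (w/w).
titrations = [0.20, 0.23]
n = len(titrations)
mean = sum(titrations) / n

# Type A: standard uncertainty of the mean of the duplicates.
s = math.sqrt(sum((x - mean) ** 2 for x in titrations) / (n - 1))
u_type_a = s / math.sqrt(n)

# Type B contributions, expressed in % (w/w).
# Assumption: the control-chart value is already a standard deviation.
u_minimum = mean * 0.011
# Assumption: experience-based factors are rectangular half-widths.
u_hygroscopicity = mean * 0.05 / math.sqrt(3)
u_titer = mean * 0.01 / math.sqrt(3)
u_solvent = mean * 0.05 / math.sqrt(3)

contributions = [u_type_a, u_minimum, u_hygroscopicity, u_titer, u_solvent]
u_combined = math.sqrt(sum(u ** 2 for u in contributions))

k = 2  # assumed coverage factor
upper_value = mean + k * u_combined

print(f"mean = {mean:.3f} %, combined u = {u_combined:.4f} %")
print(f"value including uncertainty = {upper_value:.2f} %")
```

Under these assumptions the value including uncertainty rounds to 0.25% (w/w), the figure reported in the last row of Table 4.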

References

1. Küppers S (1997) Accred Qual Assur 2:30-35
2. Küppers S (1997) Accred Qual Assur 2:338-341
3. Henrion A, Dube G, Richter W (1997) Fres J Anal Chem 358:506-508
4. Renger B (1997) Pharm Tech Europe 9:36-44
5. Evaluation of uncertainty (1997) R. Metrodata GmbH, Grenzach-Wyhlen, Germany
Accred Qual Assur (1998) 3:155-160
© Springer-Verlag 1998

Ricardo J. N. Bettencourt da Silva
M. Filomena G. F. C. Camões
João Seabra e Barros

Validation of the uncertainty evaluation for the determination of metals in solid samples by atomic spectrometry

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29-30 September 1997

R. J. N. Bettencourt da Silva (✉) · M. F. G. F. C. Camões
CECUL, Faculdade de Ciências da Universidade de Lisboa, P-1700 Lisbon, Portugal

J. Seabra e Barros
Instituto Nacional de Engenharia e Tecnologia Industrial, Estrada do Paço do Lumiar, P-1699 Lisbon Codex, Portugal

Abstract Every analytical result should be expressed with some indication of its quality. The uncertainty as defined by Eurachem ("parameter associated with the result of a measurement that characterises the dispersion of the values that could reasonably be attributed to the, ..., quantity subjected to measurement") is a good tool to accomplish this goal in quantitative analysis. Eurachem has produced a guide to the estimation of the uncertainty attached to an analytical result. Indeed, the estimation of the total uncertainty by using uncertainty propagation laws is components-dependent. The estimation of some of those components is based on subjective criteria. The identification of the uncertainty sources and of their importance, for the same method, can vary from analyst to analyst. It is important to develop tools which will support each choice and approximation. In this work, the comparison of an estimated uncertainty with an experimentally assessed one, through a variance test, is performed. This approach is applied to the determination by atomic absorption of manganese in digested samples of lettuce leaves. The total uncertainty estimation is calculated assuming 100% digestion efficiency with negligible uncertainty. This assumption was tested.

Key words Uncertainty · Validation · Quality control · Solid samples · Atomic spectrometry

Introduction

The presentation of an analytical result must be accompanied by some indication of the data quality. This information is essential for the interpretation of the analytical result. The comparison of two results cannot be performed without knowledge of their quality. Eurachem [1] defined uncertainty as the "parameter associated with the result of a measurement that characterises the dispersion of the values that could reasonably be attributed to the, ..., quantity subjected to measurement" and presented it as a tool to describe that quality.
The Eurachem guide for "Quantifying uncertainty in analytical measurement", which is based on the application of the ISO guide [2] to the chemical problem, was followed. ISO aims at the estimation of uncertainty in the most exact possible manner, in order to avoid excess of confidence in overestimated results. The application of these guides turns out to be a powerful tool. The exact estimation of uncertainties is important for the detection of small trends in analytical data. The time and effort used in such estimations can avoid
many further doubts concerning observation of legal limits and protects the user of the analytical data from financial losses. The use of uncertainty instead of less informative percentage criteria brings considerable benefits to the daily quality control.
Despite the analyst's experience, some analytical steps like sampling and recovery are particularly difficult to estimate. Mechanisms should be developed to support certain choices or approximations. The comparison of an estimated uncertainty with the experimentally assessed one can be of help.
In this work the Eurachem guide [1] was used for the estimation of uncertainties involved in the determination by electrothermic atomic absorption spectrometry (EAAS) of manganese in digested lettuce leaves. The total uncertainty estimation was calculated assuming a 100% digestion efficiency with negligible uncertainty. The experimental precision was compared with an estimated one for the purpose of validation of the proposed method of evaluation. After this validation the uncertainty estimation was used in an accuracy test and in routine analysis with the support of a spreadsheet programme.

The uncertainty estimation process

The uncertainty estimation can be divided into four steps [1]: (1) specification, (2) identification of uncertainty sources, (3) quantification of uncertainty components, and (4) total uncertainty estimation.

Specification

A dry-base content determination method is proposed, the sample moisture determination being done in parallel. Figure 1 represents the different steps of the analysis. The analytical procedure was developed for laboratory samples. Sampling uncertainties were not considered.
The dry-base correction factor, $f_{corr.}$, is calculated from the weights of the vial (z), vial plus non-dried sample (x) and vial plus dry sample (y)

$$f_{corr.} = 1 - \frac{x - y}{x - z} \qquad (1)$$

The sample metal content, M, is obtained from the interpolated concentration in the calibration curve, $C_{inter.}$, the mass of the diluted digested sample, a, and the dilution factor, $f_{dil.}$ (digested sample volume times dilution ratio).

$$M = \frac{C_{inter.} \times f_{dil.}}{a} \qquad (2)$$

The dry-base content, D, is obtained by application of the correction factor, $f_{corr.}$, to the metal content, M.

$$D = f_{corr.} \, M \qquad (3)$$

Fig. 1 Proposed method for dry-base metal content determination in lettuce leaves: (i) the dry base correction factor determination; (ii) metal content in sample quantification

Identification of uncertainty sources

The uncertainty associated with the determination of $f_{corr.}$ is estimated from the combination of the three involved weighing steps, Fig. 1a.
The uncertainty associated with the sample metal content is estimated from the weighing, dilution and interpolation sources (Fig. 1b). The model used for the calculation of the contribution from the interpolation source assumes negligible standards preparation uncertainty when compared with the instrumental random oscillation [5, 10].

Quantification of the uncertainty components

The quantification of the uncertainty is divided into equally treated operations:

Gravimetric operations

The weighing operations are present in the dry-base correction factor (three) and in the sample metal content (one). Two contributions for the associated uncertainty, $\sigma_{Weighing}$, were studied:
1. Uncertainty associated with the repeatability of the weighing operations, $\sigma^{Balance}_{Repeat.}$, is obtained directly from the standard deviation of successive weighing operations. The corresponding degrees of freedom are the number of replicates minus 1.
2. Uncertainty associated with the balance calibration, $\sigma^{Balance}_{Calib.}$, defined by

$$\sigma^{Balance}_{Calib.} = \frac{2 \times Tolerance}{\sqrt{12}} \qquad (4)$$

where the Tolerance is obtained from the balance calibration certificate.
The Eurachem guide suggests that when the uncertainty components are described by a confidence interval, $\alpha \pm \beta$, without information on degrees of freedom, the associated uncertainty is $2\beta/\sqrt{12}$, which represents the uncertainty of a $2\beta$ amplitude rectangular distribution. These uncertainties are designated type B. The number of degrees of freedom associated with the $\sigma^{Balance}_{Calib.}$ type B estimation, $\nu^{Balance}_{Calib.}$, is approximately [1] equal to

$$\nu^{Balance}_{Calib.} \approx \frac{1}{2} \left[ \frac{\sigma^{Balance}_{Calib.}}{m^{Balance}_{Calib.}} \right]^{-2} \qquad (5)$$

where $m^{Balance}_{Calib.}$ is the mass associated with the balance calibration tolerance.
The two uncertainties are then combined

$$\sigma_{Weighing} = \sqrt{\left(\sigma^{Balance}_{Calib.}\right)^2 + \left(\sigma^{Balance}_{Repeat.}\right)^2} \qquad (6)$$

The corresponding degrees of freedom are calculated by the Welch-Satterthwaite equation [1-3]. When the pairs (uncertainty, degrees of freedom) $(\sigma_a, \nu_a)$, $(\sigma_b, \nu_b)$, $(\sigma_c, \nu_c)$, $(\sigma_d, \nu_d)$, ... for the quantities a, b, c, d, in a function V = f(a, b, c, d, ...) are taken into account, then the effective number of degrees of freedom associated with V, $\nu_V$, is

$$\nu_V = \frac{\sigma_V^4}{\sum_{i = a, b, c, d, \ldots} \left(\frac{\partial V}{\partial i}\right)^4 \frac{\sigma_i^4}{\nu_i}} \qquad (7)$$

The calculation of the uncertainty and of the degrees of freedom associated with the sample weight is by the direct application of Eqs. 4-7. The calculations of the dry-base correction factor are more elaborate.
3. Uncertainty associated with the dry base factor
The dry base correction factor is a function of three weighing operations (Eq. 1). To estimate the uncertainty, $\sigma_{f_{corr.}}$, associated with $f_{corr.}$, the general equation (Eq. 8) was used [1]

$$\sigma_{f_{corr.}} = \sqrt{\left(\frac{\partial f_{corr.}}{\partial x}\right)^2 \sigma_x^2 + \left(\frac{\partial f_{corr.}}{\partial y}\right)^2 \sigma_y^2 + \left(\frac{\partial f_{corr.}}{\partial z}\right)^2 \sigma_z^2} \qquad (8)$$

It is therefore

$$\sigma_{f_{corr.}} = \sqrt{\left(\frac{y - z}{(x - z)^2}\right)^2 \sigma_x^2 + \left(\frac{1}{x - z}\right)^2 \sigma_y^2 + \left(\frac{x - y}{(x - z)^2}\right)^2 \sigma_z^2} \qquad (9)$$

The values of $\sigma_x$, $\sigma_y$ and $\sigma_z$ are then calculated as described in the section "Gravimetric operations" above. The number of degrees of freedom is calculated by the Welch-Satterthwaite equation (Eq. 7). The application of a spreadsheet program available in the literature simplifies this task [3]. However, the classical approach is more flexible for different experimental configurations or for one or more dilution steps, and is also easily automated.

Volumetric operations

The uncertainties associated with the volumetric operations were calculated from the combination of two [(1) and (2) below] or three [(1), (2) and (3) below] components:
1. Uncertainty associated with volume calibrations, $\sigma^{Vol.}_{Calib.}$

$$\sigma^{Vol.}_{Calib.} = \frac{2 \times Tolerance}{\sqrt{12}} \qquad (10)$$

where the information on this tolerance is normally available with the instrument in the form: volumetric instrument volume ± tolerance. This type B uncertainty estimation has the same treatment as the one reported in Eq. 5 for the degrees of freedom

$$\nu^{Vol.}_{Calib.} \approx \frac{1}{2} \left[ \frac{\sigma^{Vol.}_{Calib.}}{V} \right]^{-2} \qquad (11)$$

where $\nu^{Vol.}_{Calib.}$ is the number of degrees of freedom associated with $\sigma^{Vol.}_{Calib.}$ for a certain volume V.
2. Uncertainty associated with volume repeatability tests, $\sigma^{Vol.}_{Repeat.}$
The $\sigma^{Vol.}_{Repeat.}$ and the corresponding degrees of freedom, $\nu^{Vol.}_{Repeat.}$, are also extracted directly from the repeatability tests. Such tests consist of successive weighings of water volumes measured by the instrument. The observed standard deviation is a function of the analyst's expertise.
3. Uncertainty associated with the use of volumetric equipment at a temperature different from that of calibration, $\sigma^{Vol.}_{Temp.}$
This third component corrects for errors associated with the use of 20 °C calibrated material in 20 ± 3 °C solutions. When two consecutive volumetric operations are performed at the same temperature, as is the case in dilution stages, they become self-corrected for this effect.
The glass instrument expansion coefficient is much smaller than that of the solution. For this reason we
have only calculated the latter. For a temperature oscillation of $\Delta T = \pm 3$ K with a 95% confidence level and for a volumetric expansion coefficient of pure water of $2.1 \times 10^{-4}$ °C$^{-1}$ (our solutions can be treated as pure water because of their low concentrations), the 95% volume confidence interval becomes $V \pm V \times 3 \times 2.1 \times 10^{-4}$. Dividing the expanded uncertainty by the Student t value, $t(\infty, 95\%) = 1.96$, we obtain the temperature effect component uncertainty

$$\sigma^{Vol.}_{Temp.} = \frac{V \times 3 \times 2.1 \times 10^{-4}}{1.96} \qquad (12)$$

The number of degrees of freedom due to the temperature effect can also be estimated as for $\nu^{Vol.}_{Calib.}$ (Eq. 11), substituting $\sigma^{Vol.}_{Calib.}$ by $\sigma^{Vol.}_{Temp.}$.
These components are then combined to calculate the volume uncertainty, $\sigma_{Vol.}$

$$\sigma_{Vol.} = \sqrt{\left(\sigma^{Vol.}_{Calib.}\right)^2 + \left(\sigma^{Vol.}_{Repeat.}\right)^2 + \left(\sigma^{Vol.}_{Temp.}\right)^2} \qquad (13)$$

The number of degrees of freedom associated with $\sigma_{Vol.}$ can also be calculated by the Welch-Satterthwaite equation.
4. Uncertainty associated with the dilution factor
Our analytical method has three volumetric steps that can be combined as a dilution factor, $f_{dil.}$, whose uncertainty, $\sigma_{f_{dil.}}$, can easily be estimated by:

$$\frac{\sigma_{f_{dil.}}}{f_{dil.}} = \sqrt{\left(\frac{\sigma^{DSV}_{Vol.}}{V_{DSV}}\right)^2 + \left(\frac{\sigma^{P}_{Vol.}}{V_{P}}\right)^2 + \left(\frac{\sigma^{V}_{Vol.}}{V_{V}}\right)^2} \qquad (14)$$

where DSV, P and V stand respectively for digested solution volume, dilution operation pipette and dilution operation vial; $\sigma_{Vol.}$ and V represent respectively each corresponding volumetric uncertainty and volume. As in the other cases, the degrees of freedom were calculated by the Welch-Satterthwaite equation.

Sample signal interpolation from a calibration curve

The mathematical model used to describe our calibration curve was validated by the Penninckx et al. [4] method. At this stage we proved the good fitting properties of the unweighted linear model to our calibration curve. With this treatment we aimed not only at the accuracy but also at the estimation of more realistic sample signal interpolation uncertainties. These uncertainties were obtained by the application of an ISO international standard [5].
The instrument was calibrated with four standards (0-2-4-6 µg/L for Mn) with three measurement replicates each [4]. Samples and control standard (4 µg/L for Mn) were also measured three times. The control standard was analysed for calibration curve quality control (see "Quality control").

Total uncertainty estimation

The total uncertainty estimation, $\sigma_T$, is a function of the dry-base correction factor uncertainty, $\sigma_{f_{corr.}}$, of the uncertainty associated with the analysis sample weighing operation, $\sigma^{Sample}_{Weighing}$, of the dilution factor, $\sigma_{f_{dil.}}$, and of the instrumental calibration interpolated uncertainty, $\sigma_{C_{inter.}}$. These four quantities combine their uncertainties in the equation

$$\frac{\sigma_T}{D} = \sqrt{\left(\frac{\sigma^{Sample}_{Weighing}}{a}\right)^2 + \left(\frac{\sigma_{f_{corr.}}}{f_{corr.}}\right)^2 + \left(\frac{\sigma_{f_{dil.}}}{f_{dil.}}\right)^2 + \left(\frac{\sigma_{C_{inter.}}}{C_{inter.}}\right)^2} \qquad (15)$$

where D represents the dry-base sample metal content and a has the same meaning as in Eq. 2. The other quantities have already been described.
The expanded uncertainty can then be estimated after the calculation of the effective number of degrees of freedom, df (Eq. 7). Therefore the coverage factor used was the Student t defined for that number and a 95% confidence level, t(df, 95%). The estimated confidence interval is defined by

$$D \pm \sigma_T \cdot t(df, 95\%) \qquad (16)$$

Quality control

Ideally, the readings of the instruments for each sample and for each standard should be random [6-7]. Normally, the instrument software separates the calibration from the sample reading. Although this allows an immediate calculation, it can produce gross errors if the operator does not verify the drift of the instrument response. For this reason, the calibration curves should be tested from time to time by reading a well-known control standard. This standard can also be prepared from a mother solution different from that of the calibration standards, for stability and preparation checking.
Normally, the laboratories use fixed and inflexible criteria for this control. They define a limit to the percentage difference between the expected and the obtained value, and in low precision techniques they are obliged to increase this value. Assuming the uncertainty associated with the control standard preparation to be negligible when compared to the instrumental uncertainty, the case-to-case interpolation uncertainties can be used as a criterion fitted to each case. If the observed confidence interval does not include the expected value, there is reason to think that the system is not under control. The instrumental deviation from control can be used as a guide for instrumental checking or as a warning of the inadequacy of the chosen mathematical model for the calibration.
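The propagation steps of Eqs. 1-16 lend themselves to automation. The following sketch is one possible arrangement, not the authors' spreadsheet. All numerical inputs (weights, tolerances, repeatabilities, volumes, interpolated concentration and its uncertainty) are hypothetical values chosen for illustration; only the constants 3 K, 2.1 × 10⁻⁴ per °C and 1.96 come from the text. Applying the temperature term only to the digested solution volume is also an assumption, based on the remark that dilution stages are self-corrected.

```python
import math

def rectangular_sigma(tolerance):
    """Eqs. 4 and 10: standard uncertainty of a +/- tolerance interval."""
    return 2.0 * tolerance / math.sqrt(12.0)

def type_b_dof(sigma, quantity):
    """Eqs. 5 and 11: approximate degrees of freedom of a type B estimate."""
    return 0.5 * (sigma / quantity) ** -2

def combine(*sigmas):
    """Eqs. 6 and 13: root sum of squares of independent contributions."""
    return math.sqrt(sum(s ** 2 for s in sigmas))

def welch_satterthwaite(pairs):
    """Eq. 7 for (contribution, dof) pairs; each contribution is assumed
    to be already multiplied by its sensitivity coefficient."""
    total_variance = sum(s ** 2 for s, _ in pairs)
    return total_variance ** 2 / sum(s ** 4 / nu for s, nu in pairs)

def f_corr_with_sigma(x, y, z, sx, sy, sz):
    """Eqs. 1 and 9: dry-base correction factor and its uncertainty."""
    f = 1.0 - (x - y) / (x - z)
    dx = (y - z) / (x - z) ** 2  # magnitude of the partial derivative in x
    dy = 1.0 / (x - z)
    dz = (x - y) / (x - z) ** 2
    return f, combine(dx * sx, dy * sy, dz * sz)

def temperature_sigma(volume, delta_t=3.0, expansion=2.1e-4, t_value=1.96):
    """Eq. 12: temperature effect on a volume."""
    return volume * delta_t * expansion / t_value

def relative_rss(pairs):
    """Eqs. 14 and 15: root sum of squares of relative uncertainties."""
    return math.sqrt(sum((s / v) ** 2 for s, v in pairs))

# --- Hypothetical inputs (not from the paper) ---
a = 0.2                                   # sample mass, g
s_cal = rectangular_sigma(0.0002)         # balance tolerance 0.0002 g
s_rep, n_rep = 0.0001, 10                 # weighing repeatability, replicates
s_w = combine(s_cal, s_rep)               # Eq. 6
nu_w = welch_satterthwaite([(s_cal, type_b_dof(s_cal, a)), (s_rep, n_rep - 1)])

# Vial z, vial + wet sample x, vial + dry sample y (g).
f_corr, s_f = f_corr_with_sigma(12.0, 10.2, 10.0, s_w, s_w, s_w)

# Volumes in mL: (volume, tolerance, repeatability, include temperature term).
volumes = [(25.0, 0.04, 0.02, True), (1.0, 0.007, 0.003, False), (10.0, 0.025, 0.01, False)]
vol_pairs = []
for v, tol, rep, with_temp in volumes:
    s_temp = temperature_sigma(v) if with_temp else 0.0
    vol_pairs.append((combine(rectangular_sigma(tol), rep, s_temp), v))
rel_dil = relative_rss(vol_pairs)         # Eq. 14

c_inter, s_c = 4.0, 0.2                   # interpolated concentration, ug/L
rel_total = relative_rss([(s_w, a), (s_f, f_corr), (rel_dil, 1.0), (s_c, c_inter)])  # Eq. 15

print(f"f_corr = {f_corr:.3f}, weighing dof = {nu_w:.0f}")
print(f"relative expanded uncertainty (k = 1.96): {100 * 1.96 * rel_total:.1f} %")
```

With inputs of this kind the interpolation term usually dominates the budget, which is consistent with the paper's use of the case-to-case interpolation uncertainty as the quality control criterion.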

Validation of the uncertainty estimation

Method validation is the process of demonstrating the ability of a method to produce reliable results [8]. An analytical result should be expressed along with a confidence interval and a confidence level. The confidence interval can be described by a mean value and an interval width. Therefore the validation depends on the reliability of the confidence interval width estimation. The accuracy test can be performed exactly only after that step.
The statistical equivalence between the estimated and the observed values can be used to confirm that quality. The F-test [10] is a good tool for comparing (non-expanded) uncertainties.

Application of uncertainty validation schemes

The proposed uncertainty validation method was applied to the dry-base determination of manganese in digested lettuce leaves by electrothermic atomic absorption spectrometry. The proposed quality control scheme was also applied.
The 200-mg samples were digested with nitric acid in a microwave-irradiated closed system [11]. The instrumental determination was performed in a GBC atomic spectrometer with D2 lamp background correction. A Pd/Mg mixture [12] was used as chemical modifier. The dry-base correction factor was calculated by a parallel assay. The samples were dried in an oven at 60 °C under atmospheric pressure, and the CRM (certified reference material - NIST 1570a) was treated as specified by NIST [13].

Repeatability test

The estimated uncertainties were compared with the experimental ones by an F-test for the 95% confidence level [10]. Figure 2 represents the obtained experimental values associated with the estimated expanded uncertainty (95% confidence level). The coverage factor used was 1.96 for the average effective number of degrees of freedom, df, of 57500. The Eurachem [1] proposal of a coverage factor of 2 is adequate for this case.
The replicates 1 and 5 (REP1, REP5) are consecutive single outliers (Grubbs test) for a 95% confidence level [9]. Therefore, they have not been used for the experimental uncertainty calculation. The two uncertainties are statistically equivalent for the test used (experimental uncertainty: 0.82 mg/kg for 9 df; estimated uncertainty: 0.73 mg/kg for 57500 df) at the 95% confidence level.

Fig. 2 Repeatability test. The confidence intervals are represented by the average value plus the estimated expanded uncertainty for a 95% confidence level

Accuracy test

The accuracy test was performed with spinach leaves (NIST 1570a) because of their claimed similarity with lettuce leaves in terms of proteins, carbohydrates, fibre and inorganic matter content [14]. The validated uncertainty estimation was used for the comparison of obtained values with certified ones (Fig. 3).
The loss of precision in EAAS with the time of use of furnace is taken into account in the case-to-case interpolation uncertainty calculation. The accuracy is retained with a larger confidence interval.
The overlapping of these intervals indicates that there is no reason to think that our method lacks accuracy. Our analytical method can be considered to perform as badly or as well as the NIST methods. This assumption seems sufficiently valid to consider the method validated.

Fig. 3 Accuracy test over spinach leaves NIST CRM. The obtained values (NIST 1, 2 and 3) were associated with a 95% confidence level expanded uncertainty. The certified value is also presented for the same confidence level
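The variance comparison reported in the repeatability test can be retraced from the two quoted uncertainties. The sketch below is a rough check rather than the authors' calculation: it assumes a one-sided test with the larger variance in the numerator, and it approximates the critical value for (9, 57500) degrees of freedom by the limiting case of an infinite denominator, where it equals the tabulated 95% chi-square quantile for 9 degrees of freedom (16.919) divided by 9.

```python
# Uncertainties quoted in the repeatability test (mg/kg) and their degrees of freedom.
s_experimental, df_experimental = 0.82, 9
s_estimated, df_estimated = 0.73, 57500

# F statistic: ratio of variances, larger variance on top.
f_ratio = (s_experimental / s_estimated) ** 2

# Assumption: with 57500 denominator degrees of freedom the critical value is
# taken as chi-square(0.95, 9) / 9, using the tabulated quantile 16.919.
f_critical = 16.919 / df_experimental

equivalent = f_ratio < f_critical
print(f"F = {f_ratio:.3f}, critical value ~ {f_critical:.3f}, equivalent: {equivalent}")
```

Under these assumptions the ratio stays well below the critical value, in line with the statement that the two uncertainties are statistically equivalent.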

Conclusions

The assumption of 100% efficient digestion with negligible uncertainty is valid for the total uncertainty estimation of the presented example. This uncertainty estimation proved to be a valuable criterion for method validation and quality control, which can be tested by a very simple procedure. The easy routine use of an exact treatment in a spreadsheet program can be useful for the more demanding situations. Nevertheless, further approximations can be easily tested.

Acknowledgements Thanks are due to JNICT for financial support, to CONTROLAB LDA for the instrumental facilities that made this work possible, as well as to the teams of INETI and CONTROLAB LDA for their support.
References

1. Eurachem (1995) Quantifying uncertainty in analytical measurement, version 6
2. ISO (1993) Guide to the expression of uncertainty in measurement, Switzerland
3. Kragten J (1994) Analyst 119:2161-2165
4. Penninckx W, Hartmann C, Massart DL, Smeyers-Verbeke J (1996) J Anal At Spectrom 11:237-246
5. ISO International Standard 8466-1 (1990) Water quality - calibration and evaluation of analytical methods and estimation of performance characteristics - Part 1: Statistical evaluation of performance characteristics, Geneva
6. Analytical Methods Committee (1994) Analyst 119:2363-2366
7. Staats G (1995) Fresenius J Anal Chem 352:413-419
8. Taylor JK (1983) Anal Chem 55:600A-608A
9. Grubbs FE, Beck G (1972) Technometrics 14:847-854
10. Miller JC, Miller JN (1988) Statistics for analytical chemistry (2nd edn). Wiley, UK
11. Deaker M, Maher W (1995) J Anal At Spectrom 10:423-431
12. Soares ME, Bastos ML, Carvalho F, Ferreira M (1995) At Spectrosc 4:149-153
13. NIST (1994) Certificate of Analysis SRM 1570a Trace elements in spinach leaves
14. Penninckx W, Smeyers-Verbeke J, Vankeerberghen P, Massart DL (1996) Anal Chem 68:481-489
Accred Qual Assur (2000) 5:495-498
© Springer-Verlag 2000

Hans Malissa
Wolfgang Riepe

Statistical evaluation of uncertainty for rapid tests with discrete readings - examination of wastes and soils

H. Malissa · W. Riepe (✉)
University of Salzburg, Institute of Chemistry and Biochemistry, Hellbrunnerstrasse 34, 5020 Salzburg, Austria
e-mail: wriepe@natur.sbg.ac.at

Abstract In the course of the colorimetric determination of analytes using a procedure with discrete readings the measurement uncertainty cannot be calculated in the normally practiced manner. The basic principle of the analytical method used is a stepwise and non-equidistant reading. Based on the fact that half a step can be estimated, a calculation of the measurement uncertainty for the 95% confidence level is possible; this is needed to allow a reliable decision of whether a critical value is exceeded or not.

Keywords Colorimetric · Rapid tests · Discrete reading · Statistical evaluation

Introduction

Analytical rapid tests attain increasing importance for the prompt and reliable characterization of soil and waste samples. Advantages are simple and convenient performance, almost immediate delivery of results, and low costs [1]. Thus it is possible to characterize and classify samples [2] directly on-site in the field.
Various situations can be realized which demand an instantaneous result rather than an accurate one because immediate decisions need to be taken, e.g., a rapid survey of contaminations caused by an accident, instant classification of wastes upon delivery at the disposal area, or preliminary analytical assessment of a suspected landfill site. However, in order to evaluate the analytical results obtained with rapid tests, typical quality criteria like measurement uncertainty, limit of detection, and limit of quantitation must be known or must be calculated, respectively, as is required for any other analytical method.
Any of these situations - we may call it the "site" to be assessed - renders one or several groups of samples with similar composition, which we designate as "sample families". For example, a batch of galvanic sludge is a site from which a sample family of m samples is taken, which are each analyzed with n parallel determinations. In contrast, the samples taken from a site of an accident can be divided into two sample families, one comprising samples from the obvious contaminated area and the other from the surroundings. However, it should be noted that in waste and soil analysis the classification of samples into sample families is very often a matter of discretion.

Principle of discrete readings

For a number of elements (Table 1), commercial colorimetric test-sets for water analysis [3] were modified and extended to be applicable for measurements in soil and waste eluates [4]. In all these tests the analytical results were obtained by visual comparison of the sample color with that of a series of permanent color standards specifically designed for that particular test and mounted on a disc comparator. Each color standard represents a defined concentration. A typical test set for one element contains five to ten concentration steps. These concentration intervals are normally not equidistant (Table 1). In practice it will be found that the color intensity of the unknown lies between two successive color standards. An interpretation may be difficult in these cases but it is reasonable that the analyst will be able to read with some certainty a value halfway between two color
standards (half-value reading). The exact procedure for the rapid tests is described in [5].

Table 1 Working ranges of the test sets (concentrations in mg/L)

Color standard #   Cu     Ni     Zn    Cr     As    Cd     Pb
1                  0      0      0     0      0     0      0
2                  0.3    0.5    0.1   0.1    0.1   0.01   0.3
3                  0.6    1.0    0.2   0.2    0.5   0.03   0.6
4                  1.0    1.5    0.3   0.35   1.0   0.05   1.0
5                  1.5    2.0    0.4   0.6    1.7   0.07   -
6                  2.0    3.0    0.5   1.0    3.0   0.1    -
7                  3.0    4.0    0.7   1.8    -     0.3    -
8                  5.0    6.0    1.0   3.0    -     0.5    -
9                  7.0    8.0    2.0   6.0    -     1.0    -
10                 10.0   10.0   5.0   10.0   -     -      -

Requirements for rapid tests

Typically, the aim of any analytical characterization of wastes and contaminated soils is to answer the question whether a specific critical value is exceeded or not. This may have several consequences: for instance a waste is classified as hazardous waste and must be disposed of accordingly, or a contaminated area must be cleaned up by decontamination procedures.
A critical value (cv) is definitely exceeded if the analytical result ($\bar{c}$) together with its confidence interval ($\Delta c$) lies above that critical value.
The total error of an analytical result is given by three terms: uncertainty of the sampling process, of sample preparation, and of the analytical determination. Uncertainty must be expressed in terms of the variance, which is equal to the square of the standard deviation s, for the propagation.

$$s^2_{total} = s^2_{samp.} + s^2_{prep.} + s^2_{anal.}$$

If we take a numerical example from the domain of inhomogeneous samples like solid wastes and soils the sampling error is predominant by far: we assume the relative standard deviation due to the sampling error as being 50%, that due to the sample preparation 10%, and that due to the analytical determination 5%. This gives:

$$s_{total} = \sqrt{50^2 + 10^2 + 5^2} = \sqrt{2625} = 51\%$$

It is obvious from this very simple calculation that in practice the total error is in fact determined by the sampling process and only insignificantly by the analytical operation. Only if the analytical error is of the same order as the sampling error will its contribution to total error be noticeable. If we examine very heterogeneous materials like soils and wastes the sampling error is predominant. This fact has consequences for the analytical determination: It is not at all necessary to employ a very precise analytical method which is in general costly, complex, and time-consuming but rather a rapid method which is less precise but can be performed readily. These rapid tests allow us to generate a greater number of analytical data points in obviously less time, which provides for a more representative declaration of any inhomogeneous material.

Statistical requirements

Some important parameters and relations shall be defined:

Critical value (cv)

The decision whether a measured analytical result, given as a mean concentration $\bar{c}$, is definitely (95% significance in our study) below the critical value, can be expressed as: $\bar{c} + \Delta c < cv$, where $\bar{c}$ = mean value and $\Delta c$ = confidence interval.

Confidence interval

The confidence interval ($\Delta c$) is given by

$$\Delta c = \frac{t(P, f) \times s}{\sqrt{n}} \qquad (1)$$

where n = number of parallel measurements obtained from one sample, t = Student's factor, s = standard deviation of c, and P = level of significance.
The term parallel measurements comprises the entire analytical procedure including sampling and sample preparation.

Degree of freedom

The number of degrees of freedom (n-1) for one sample denotes the number of control measurements performed on that sample, which are supposed to confirm the first result. If measurements are performed on a number of similar samples (sample family), this number m is accounted for in the calculation of f. Introducing j as the sample number within a sample family ranging between 1 and m and on the condition that the number of parallel determinations $n_j$ is equal for each sample, the number of degrees of freedom is:

$$f = m \times (n_j - 1)$$

where m = number of samples in a sample family, f = number of degrees of freedom, and $n_j$ = number of parallel measurements obtained from sample j.

Range

The spread of a smaller number of results of a measurement series (fewer than 10) is characterized preferably by the range, which is the difference between maximum and minimum values [6]:

$$R = c_{max} - c_{min}$$

The condition stated above allows us to define an average range for the m samples of a sample family:

$$\bar{R} = \frac{1}{m} \sum_{j=1}^{m} R_j$$

Estimated value for the standard deviation s calculated from the range R

An approximation of the standard deviation can be calculated using factors $d(n_j)$ tabulated as a function of m and $n_j$ [7].

$$s = \frac{\bar{R}}{d(n_j)}$$

In the next step this is introduced into Eq. 1 for the confidence interval:

$$\Delta c = \frac{t(P; f) \times \bar{R}}{d(n_j) \times \sqrt{n}} \qquad (2)$$

Then, all variables of Eq. 2, which are not dependent on the quantity to be measured c itself, are combined into a factor F:

$$F = \frac{t(P; f)}{d(n_j) \times \sqrt{n}} \qquad (3)$$

Fig. 1 shows the results of a calculation of F (Eq. 3) versus the number of samples in a sample family. n = 2 or 4 means duplicate or quadruplicate determinations, respectively, from each sample in that family as well as from any forthcoming single sample, which can be assigned to that family.
On the other hand the influence of n is considerable. However, in the practice of waste and soil analysis hardly more than duplicate measurements will be made. Therefore a factor F = 2 will be quite adequate for an established method.

Confidence interval of the mean for procedures with discrete readings

At this point the problem of discrete readings is introduced: in colorimetric procedures using color standards on a comparator wheel, the operator selects that specific standard window which fits the photometric density of the sample as closely as possible. Thus, it will occur in many cases that both readings of a duplicate measurement are identical although the concentrations are different. No variation of measured values, and hence

Fig. 1 Dependence of the factor F for multiplying the range of number of samples and parallel determinations. Vertical axis: F (0 to 7); horizontal axis: m (number of samples in a sample family, 2 to 20); curves for $n_j = 2$ and $n_j = 4$.
210 H. Malissa· W. Riepe

no range R, is observed in these cases because the color standard sequence is too coarse. The question is, how can we still estimate a realistic figure for the confidence interval even in these situations.
An example may illustrate the relevance: two samples (m = 2) are taken from a batch of galvanic sludge (site) and both are measured in duplicate (n = 2). The graduation of the standard window may lead to four identical readings. In this case R, and hence the confidence interval would be zero, although it is well known that a certain degree of inhomogeneity simply exists. Therefore we must derive a figure from the measurement process which is, in fact, the range that could be just recognized and introduce it instead of R.
It can be recognized from Table 1 that the levels $\Delta c$ in the concentration scales are not equidistant. When taking the actual readings it will not be easy for the operator to assign intermediate concentrations by interpolation. However, it can be reasonably expected that he is able to set an imagined limit of photometric density, which will be situated halfway between two concentration levels and may be able to allocate the actual sample density to the higher or lower level, respectively.
An upper and lower limit for each concentration reading $c_i$ may be defined:

$$\text{Upper limit: } \frac{c_{(i+1)} + c_i}{2}, \qquad \text{Lower limit: } \frac{c_i + c_{(i-1)}}{2},$$

where $c_i$ = concentration reading for window i, $c_{(i+1)}$ = concentration reading for window i+1, and $c_{(i-1)}$ = concentration reading for window i-1. All color densities within the interval between these limits are assigned to one and the same concentration $c_i$. This is equivalent to the statement that the distance between both limits can be regarded as a range $R_1$ of a method, for which a range cannot be derived from the measurement values themselves:

$$R_1 = \frac{c_{(i+1)} - c_{(i-1)}}{2}$$

This is introduced into Eq. 2 instead of R to give the confidence interval for a colorimetric procedure with discrete readings:

$$\Delta c = \frac{t(P;f)}{d(n_j) \times \sqrt{n}} \times \frac{c_{(i+1)} - c_{(i-1)}}{2} \qquad (4)$$

With this expression a confidence interval for a sample family is obtained, which is dependent on the number of samples and determinations used to characterize the object as well as on the level of significance selected. Only those critical values that are different from the mean concentration by more than that confidence interval can be recognized as unambiguously different. It must be emphasized that this is a conversion of the uncertainty of the reading into a concentration uncertainty and does not necessarily reflect a real inhomogeneity.

Conclusions

When doing parallel analysis the first term in Eq. 4 is about 2 (for P = 95% and three and more samples) as can be seen from Fig. 1.
Therefrom follows the confidence interval for the mean value:

$$\Delta c = c_{(i+1)} - c_{(i-1)}$$

This means that the difference between the adjacent higher and lower values of the reading step added to the measured concentration $\bar{c}$ is the test figure, which must be compared with the critical value.
The critical value is not exceeded (with 95% level of significance) when the following condition is satisfied:

$$\bar{c} + (c_{(i+1)} - c_{(i-1)}) < \text{critical value}$$

This easy procedure allows us to use rapid tests with a chosen level of significance as a valuable analytical tool. If the analytical uncertainty is estimated as described above, it is found that $\Delta c$ has about the same size as the analytical result $\bar{c}$ itself. A reliable decision (95% confidence level) that a critical value is not exceeded can be made only if the result $\bar{c}$ is not greater than half of the critical value itself.
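The decision rule for discrete readings (Eq. 4 with the factor F of Eq. 3 taken as about 2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the concentration scale `levels` are hypothetical and are not taken from the paper or its Table 1, and the factor is passed in as a plain number rather than computed from t(P;f) and d(n_j).

```python
def discrete_reading_interval(levels, i, factor=2.0):
    """Confidence interval for the reading levels[i] following Eq. 4.

    levels: concentrations printed on the colour standard windows, ascending.
    factor: the factor F of Eq. 3 (about 2 for duplicate determinations).
    """
    if not 0 < i < len(levels) - 1:
        raise ValueError("the reading needs a lower and a higher neighbour")
    # R1: half the distance between the adjacent higher and lower levels
    reading_range = (levels[i + 1] - levels[i - 1]) / 2.0
    return factor * reading_range


def below_critical_value(levels, i, critical_value, factor=2.0):
    """True if reading plus confidence interval stays below the critical value."""
    return levels[i] + discrete_reading_interval(levels, i, factor) < critical_value


# Hypothetical, non-equidistant comparator scale in mg/l (illustrative only)
levels = [0.0, 0.5, 1.0, 2.0, 5.0]
print(discrete_reading_interval(levels, 2))   # interval for the 1.0 mg/l window
print(below_critical_value(levels, 2, 3.0))   # compare against a limit of 3.0 mg/l
```

With F = 2 the interval reduces to the full distance between the two neighbouring windows, which is why a coarse scale near the critical value quickly makes a reading inconclusive.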

References

1. Unger-Heumann M (1990) Strategy of Analytical Test Kits. Fresenius J Anal Chem 354: 803-800
2. Valcarcel M, Cardenas S, Gallego M (1999) Sample Screening Systems in Analytical Chemistry. Trends in Analytical Chemistry, vol 18, no 11, pp 085-094
3. Merck (1974) Untersuchungen von Wasser Darmstadt. 9. Auflage
4. Götzl A, Malissa H, Riepe W (1997) Analytische Schnellerkennungsmethoden: Bewertung abzulagender Abfälle und Kontrolle von Deponien. UWSF - Z Umweltchem Ökotox 9:245-248
5. Merck (1994) Applications
6. Doerffel K (1990) Statistik in der analytischen Chemie. Dt Verlag der Grundstoffind, Leipzig, p 28
7. Doerffel K (1990) Statistik in der analytischen Chemie. Dt Verlag der Grundstoffind, Leipzig, p 82
Accred Qual Assur (1998) 3: 122-126
© Springer-Verlag 1998

Gino Stringari
Ivo Pancheri
Frank Moller
Osvaldo Failla

Influence of two grinding methods on the uncertainty of determinations of heavy metals in atomic absorption spectrometry/electrothermal atomisation of plant samples

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29-30 September 1997

G. Stringari (✉) · I. Pancheri
Agrarian Institute, I-38010 San Michele all'Adige, Trento, Italy
Tel.: + 39-461-61525X;
Fax: + 39-461-650X72;
e-mail: Gino.Stringari@ismaa.it

F. Moller · O. Failla
Faculty of Agriculture, University of Milan, Milan, Italy

Abstract Chemical analyses of trace elements are affected by relatively high analytical errors due to the different steps of the laboratory procedures: samples grinding, mineralisation and instrumental measurements. In the present communication, the influence of the grinding phase on the global uncertainty of Pb, Cd, Ni and Cr determinations in plant samples by the classical method of atomic absorption spectrometry/electrothermal atomisation (AAS-ETA) after dry ashing is quantified. Two grinding machines, a planetary mill with balls and jars of agate versus a stainless steel grinder were compared by analysing leaf samples of cucumber, strawberry, kiwivines, apple trees and grapevines from agricultural experimental plots under controlled conditions. Variance components due to the difference between grinding methods and experimental plots were estimated. Further, the simultaneous effects of the grinding methods on all considered metals have been evaluated by analysis of variance. With the stainless steel grinder, on average, higher levels of the considered heavy metals were obtained (up to 67% of the mean values). On average, the increments were similar for metals contained in steel (Ni and Cr) and those not contained (Pb and Cd). The true causes of these differences need further investigation to determine whether the higher metal detection is due to possible contamination, to a different grinding quality or to other reasons. Finally, the grinding methods did not seem to affect the combined uncertainty of the analyses.

Key words Atomic absorption spectrometry/electrothermal atomisation · Grinding machines · Trace elements · Uncertainty · Variance components

Introduction

Chemical analyses of trace elements still present many problems of uncertainty despite the progress in analytical techniques and instrumental performance. The analyses of trace elements in plant tissues are no exception, although the difficulties associated with these matrices are generally lower than those associated with other organic or mineral matrices.
Several interlaboratory studies [1] indicate that in spite of the standardisation of the procedures [2] data dispersion is considerable and difficult to explain. The same confidence intervals reported for the analysis of the certified reference materials highlight that the uncertainty associated with trace elements is generally higher

than those associated with macro- and microelements [3].
The possible causes of variability are present in all the analytical steps, which, in atomic absorption spectrometry, can be narrowed to the following three: sample preparation, mineralisation and instrumental measurement. In the recent past, to achieve satisfactory precision or reproducibility, the errors due to the instrumental techniques and/or the matrix mineralisation were investigated [4, 5, 6]. The quality control of these two steps is indeed simplified by the availability of reference materials.
Our attention was centred on sample grinding, an important step in sample preparation upon which sample homogeneity and possible contamination depend.
The object of this communication is to present the results obtained comparing the effects of two grinding devices on analyses for Pb, Cd, Ni and Cr determined in routine procedures by atomic absorption spectrometry with a graphite furnace.

Materials and methods

Chemical analysis

Two grinding machines, a planetary ball mill (pbm) (PM 4000-Retsch) with grinding jars and balls of agate versus a rotor-speed mill (Pulverisette 14-Fritsch) stainless steel grinder (ssg), were compared, by analysing leaf samples of cucumber (Cucumis sativus L.), strawberry (Fragaria x ananassa Duch.), kiwivines (Actinidia deliciosa Liang et Ferg.), apple trees (Malus pumila Mill.) and grapevines (Vitis vinifera L.) from agricultural experimental plots under controlled conditions consisting in mulching treatments (field treatments) with composts of different origins.
Leaf samples were dried in a dust-free forced draft oven at 70 °C overnight, then coarsely ground by hand, and two 25-g portions were taken for each grinding method (Table 1). Sample mineralisation was carried out according to the procedure recommended by the CII (Comité Inter-Instituts d'étude des techniques analytiques) [3] in a platinum capsule. The ash was treated with HNO3.
The instrumental measurement was performed on a Varian spectrometer equipped with a graphite tube atomiser and programmable autosampler (Spectra AA-400 Zeeman) with the parameters reported in Table 2.

Statistical analysis

Quality control

An experiment was performed $a_x$ times with two determinations each time, i.e. a total number of $2a_x = N_x$ determinations. Following Stringari et al. [7], the ANOVA SS due to the time, $SSA_x$, and error, $SSE_x$, the mean square errors $MSA_x$ have been determined, followed by the test ratio:

$$F = MSA_x / MSE_x$$

the variance components:

$$s_{e_x}^2 = MSE_x \quad \text{and} \quad s_{a_x}^2 = (MSA_x - s_{e_x}^2)/N_x$$

and the reproducibility variance which is the square uncertainty for the measurand:

$$u^2(x) = s_{R_x}^2 = \begin{cases} s_{e_x}^2 + s_{a_x}^2 & \text{for heterogeneous means, i.e. } F \text{ significant} \\ s_{e_x}^2 & \text{for homogeneous means.} \end{cases}$$

The control limits for the mean values are given by

$$M_x \pm k\, s_{R_x} \sqrt{1/2 - 1/(2N_x)}$$

Table 1 Sample number and grinding parameters with planetary ball mill

Species       Number of samples   Tissue   No. of balls   Grinding time (min.)   Speed (rpm)
Cucumber      25                  Leaf     10             10                     300
Strawberry    25                  Leaf     12             40                     300
Kiwivines     10                  Leaf     12             30                     300
Apple trees   24                  Leaf     11             20                     300
Grapevines    32                  Leaf     12             40                     300

Table 2 Main parameters of the instrumental measurement performed on a Varian spectrometer equipped with a graphite tube atomiser and programmable autosampler (Spectra AA-400 Zeeman)

Elements (columns): Pb, Cd, Ni, Cr
Tube: Positions graphites tubes without platform
Wavelength (nm): 283.3, 228.8, 232.0, 357.9
Ashing (°C): 1150, 475, 975, 1100
Modifier of matrix: NH4H2PO4, Mg(NO3)2, + Salts of Pd
Atomisation (°C): 2100, 2500, 2550
Measurement mode: Peak height, Peak area
Calibration mode: Standard additions, Calibration curve

with k = 2, or 2.6, depending on the significance level (5% or 1%).
The upper control limit for the ranges $|x_{i2} - x_{i1}|$ = 1/2 standard deviation is $k s_e$.
Mean values or ranges which lie outside the above control limits are discarded.

Analysis of variance

Two different approaches to analysing the data have been followed, both based on the same linear model

$$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}; \quad i = 1, \ldots, a; \; j = 1, \ldots, n_i; \; \sum n_i = n \qquad (1)$$

The testing of hypotheses and the estimation of variance components are based on the usual analysis of variance.

Source                                 SQ, SS              df      MS     F             E
Total                                  SQT                 n
$\varepsilon_{ij} \mid \mu, \alpha_i$   SSE = SQT - SQA     n - a   MSE                  $\sigma_e^2$
$\alpha_i \mid \mu$                     SSA = SQA - SQM     a - 1   MSA    $F(\alpha)$   $\sigma_a^2$
$\mu$                                   SQM                 1       MSM    $F(\mu)$      $\sigma_m^2$

where: $SQT = \sum\sum y_{ij}^2$; $SQA = \sum n_i \bar{y}_i^2$; $SQM = n \bar{y}^2$

Analysis based on the single grinding method

First the two grinding methods are considered separately putting:

$$y_{ij} = \begin{cases} x_{ij1} & \text{for the grinding method pbm} \\ x_{ij2} & \text{for the grinding method ssg} \end{cases} \quad i = 1, \ldots, a; \; j = 1, \ldots, n_i; \; \sum n_i = n$$

The parameters of the model (Eq. 1) are
$\mu$: the overall mean of the two grinding methods
$\alpha_i$: the a main effects due to treatments.
The hypothesis to test is

$$H_0: \alpha_i = 0, \; \forall i \qquad (2)$$

The estimates of the above parameters with the "usual" restrictions, are
$m = \bar{y}$, the general mean, for $\mu$
$a_i = m_i - m$, the difference between treatment and general mean, for $\alpha_i$
The hypothesis testing proceeds along the lines of the flow diagram represented in Fig. 1. If the hypothesis of Eq. 2 is refused, the variance components $\sigma_e^2$ and $\sigma_a^2$ are estimated by

$$s_e^2 = MSE; \quad s_a^2 = (MSA - s_e^2)/n_a, \quad \text{with } n_a = (n - n_m)/(a - 1)$$

leading to a combined uncertainty of $u(y) = \sqrt{s_a^2 + s_e^2}$.

Fig. 1 Statistical analysis based on the single grinding method. Flow diagram for hypothesis testing. *: significant; ns: not significant

Alternatively the model reduces to $y_{ij} = \mu + \varepsilon_{ij}$, with estimation of the error variance by

$$s_e^2 = (SSE + SSA)/(n - 1)$$

and an uncertainty $u(y) = s_e$.

Analysis based on the differences of the grinding methods

The second approach of the analysis is based on the difference and on the relative differences

$$y_{ij} = \begin{cases} x_{ij1} - x_{ij2} \\ 2\,(x_{ij1} - x_{ij2})/(x_{ij1} + x_{ij2}) \end{cases} \quad i = 1, \ldots, a; \; j = 1, \ldots, n_i; \; \sum n_i = n$$

The parameters of the linear model (Eq. 1) have now the following meaning:
$\mu$, the overall mean difference between the two grinding methods;
$\alpha_i$, the a differences due to the treatments.
The variance components are determined according to the acceptance or refusal of the following hypotheses

Fig. 2 Statistical analysis based on the difference of the grinding method. Flow diagram for hypothesis testing. *: significant; ns: not significant

$$H_0: \alpha_i = 0, \; \forall i \qquad (3)$$

$$H_0: \mu = 0 \qquad (4)$$

The estimates of the above parameters are obtained as in the former case. The hypothesis testing proceeds along the lines of the flow chart represented in Fig. 2.
Thus it is possible to estimate the variance components and hence for the first three models the combined uncertainty and for the last the simple uncertainty u(y).

Final models and variance components

$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$:
$s_e^2 = MSE$
$s_a^2 = (MSA - s_e^2)/n_a$; $n_a = (n - n_m)/(a - 1)$
$s_m^2 = (MSM - n_m s_a^2 - s_e^2)/n$; $n_m = \sum n_i^2 / n$
$u(y) = \sqrt{s_e^2 + s_a^2 + s_m^2}$

$y_{ij} = \mu + \varepsilon_{ij}$:
$s_e^2 = (SSE + SSA)/(n - 1)$
$s_m^2 = (SQM - s_e^2)/n$
$u(y) = \sqrt{s_e^2 + s_m^2}$

$s_t^2 = \sum\sum y_{ij}^2 / n$
$u(y) = s_t$

Reproducibility

Following the procedure shown in Stringari et al. [7], q repetitions of the determinations are considered. Thus the model of Eq. 1 generalises to

$$y_{ijk} = \mu + \alpha_i + \varepsilon_{ij} + \delta_{ijk} \qquad (5)$$

and the variance components of the model without $\delta_{ijk}$ are multiplied by q.
Further to all mean SS expected values, the component of the reproducibility variance, $\sigma_r^2$, is added, thus leaving the estimates of the variance components $s_a^2$ and $s_m^2$ unchanged, while $s_e^2$ is now diminished by $s_r^2/q$, where $s_r^2$ is the estimate of the reproducibility variance.
If the experiment is not designed for the model of Eq. 5 but the estimate of $s_r^2$ is obtained by a separate experiment, this independent estimate is substituted in the above formulas, and also the denominators of the F-test are increased by the same value, thus obtaining new (approximate) F-tests.
Following Stringari et al. [7], these estimates of the reproducibility ($s_r$) were obtained (in ppm): Pb 0.0822, Cd 0.0067, Ni 0.0456, Cr 0.0628.

Results

The results of the above analyses are reported in Figs. 3 and 4. The horizontal length of the bars indicates the average ppm content for all elements and species for each grinding method.
The total horizontal length of each bar has been divided into portions proportional to the variation coefficients based on the variance components, to which the reproducibility variance has been added.

Lead

Ssg grinding methods gave higher mean values in all the plant species (from 7% to 52%), and these differences were significant for cucumber and apple trees leaves. Moreover, it allowed us to highlight significant effects due to the field treatments in four of the five species, while with pbm methods there were significant effects only for apple trees and grapevines.
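The variance-component estimates for the full model of Eq. 1 ($s_e^2 = MSE$, $s_a^2 = (MSA - s_e^2)/n_a$ with $n_a = (n - n_m)/(a - 1)$ and $n_m = \sum n_i^2/n$) can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the authors' procedure: the function name and the data are hypothetical, the F-test that decides whether the treatment component is retained is omitted, and a negative estimate of $s_a^2$ is simply set to zero.

```python
import math


def variance_components(groups):
    """One-way ANOVA variance components for possibly unbalanced groups.

    groups: list of lists, one list of determinations per treatment.
    Returns (s2_e, s2_a, u) with u = sqrt(s2_a + s2_e).
    """
    a = len(groups)                          # number of treatments
    n = sum(len(g) for g in groups)          # total number of determinations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # SSA = SQA - SQM and SSE = SQT - SQA, written as sums of squared deviations
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    msa = ssa / (a - 1)
    mse = sse / (n - a)

    n_m = sum(len(g) ** 2 for g in groups) / n
    n_a = (n - n_m) / (a - 1)

    s2_e = mse
    s2_a = max((msa - mse) / n_a, 0.0)       # assumption: clamp negative estimates
    return s2_e, s2_a, math.sqrt(s2_a + s2_e)


# Hypothetical duplicate determinations (ppm) for three field treatments
data = [[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]]
s2_e, s2_a, u = variance_components(data)
print(s2_e, s2_a, u)
```

For balanced duplicates $n_a$ equals 2, so the sketch reduces to the familiar (MSA - MSE)/2 for the treatment component.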

Fig. 3 Means and components of squared combined uncertainty of lead and cadmium as affected by plant matrices and grinding methods. Panels: Lead (ppm); Cadmium (ppm). Legend: bar segments show repeatability variance, error variance and treatment variance; pbm: planetary mill with balls and jars of agate; ssg: stainless steel grinder; a marker indicates a significant difference between pbm and ssg grinding

Fig. 4 Means and components of squared combined uncertainty of nickel and chromium as affected by plant matrices and grinding methods. Panels: Nickel (ppm); Chromium (ppm). Rows: cucumber, strawberry, kiwivines, apple trees and grapevines, each with pbm and ssg. Legend: bar segments show repeatability variance, error variance and treatment variance; pbm: planetary mill with balls and jars of agate; ssg: stainless steel grinder; a marker indicates a significant difference between pbm and ssg grinding

Cadmium

In general, this metal was associated with a high combined uncertainty due to the low analytical values, very close to the analytical limits. With the exception of kiwivines, ssg grinding methods gave higher mean values, with significantly higher mean values in apple trees (from 12% to 67%), and highlighted a significant effect of the field treatments for apple trees.

Nickel

For this metal also, the ssg method gave higher values for all the species (from 1% to 63%). The differences were significant for strawberry and apple trees. The ssg method made it possible to detect the field treatment effects in cucumber and apple trees.

Chromium

In three species out of five, the ssg method still determined higher values (from 7.5% to 28%), while for the other two the differences were below 1%. In cucumber the difference was significant.
This metal also showed a high combined uncertainty, which in this case probably was not due to low levels but to the greater difficulties for the instrumental measurement of this element.

Conclusion

With the stainless steel grinder, on average higher levels of the considered heavy metals were obtained (up to 67% of the mean values). On average, the increments were similar for metals contained in steel (Ni and Cr) and those not contained (Pb and Cd). The true causes of these differences need further investigation to determine whether the higher metal detection is due to possible contamination, to a different grinding quality or to other reasons. Further, the stainless steel grinder permitted us to detect more significant effects due to the field treatments. The grinding methods did not seem to affect the combined uncertainty of the analyses.

References

1. CII-Comité Inter-Instituts d'étude des techniques analytiques (1993-1997) Compte Rendu de 67e, 68e, 69e, 70e, 71e, 72e Réunion
2. Martin-Prevel P, Gagnard J, Gautier P (1984) In: Martin-Prevel P, Gagnard J, Gautier P (eds) Plant analysis. Lavoisier, New York
3. BCR Catalogue. BCR Reference Materials. Community Bureau of Reference (BCR) Commission of European Communities, Brussels
4. Slavin W (1984) Graphite furnace AAS a source book. Perkin-Elmer, Ridgefield, Conn
5. Hoenig M, de Kersabiec AM (1990) L'atomisation électrothermique en spectrométrie d'absorption atomique. Masson, Paris
6. Hoenig M, Baeten H, Vanhentenryk S (in press) Anal Chim Acta
7. Stringari G, Moller F, Ceschini A, Failla O (1996) Comm Soil Sci Plant Anal 27, 5-8:1403-1416
Accred Qual Assur (1998) 3: 328-334
© Springer-Verlag 1998

Mirella Buzoianu

Measurement uncertainty and its meaning in legal metrology of environmental chemistry and public health

M. Buzoianu (✉)
National Institute of Metrology, Sos. Vitan-Bârzesti No. 11, 75669 Bucharest, Romania
Tel.: +40-1-6344030
Fax: +40-1-330 15 33

Abstract The need for reliability of measurements supporting legal decisions in environmental policy or medical diagnosis and treatment is well known and widely accepted. This prerequisite can be met only by ensuring that legal measurements are accurate and traceable to national or international standards. Consequently, an outline of the organizational structure of the Romanian National Institute of Metrology (INM) for ensuring uniformity, consistency and accuracy of all measurements including legal measurements performed in chemical laboratories is presented. Since reliable measurements can only be accomplished within an appropriate traceability chain, the experience of the INM in identification and evaluation of measurement uncertainty in legal activities concerning the environment and health is reviewed. Practical examples of measurement uncertainty evaluation in spectrophotometric determination of five analytes, commonly determined in environmental and clinical chemistry are described. The implications of measurement uncertainty for interpretation of regulatory compliance are discussed.

Key words Measurement uncertainty · Analytical chemistry · Environment · Clinical chemistry

Introduction

Traditional metrological activities in Romania have concentrated on legal chemical measurements performed in trade, environmental chemistry and public health, in conjunction with the implementation of quality assurance system in these fields. In this paper only measurements performed in laboratories from the environment and public health sectors are considered.
The main problem faced by analysts is whether or not they have the methodology to provide a result of the required accuracy and precision. However, after carrying out an analysis, it is very unusual for analysts to give any indication of the measurement uncertainty, or information on the traceability of the reported results. This means that the user of the measurement information is unable to make any judgment on the confidence to be placed in it, nor is it possible to compare, in a rational way, the results of independent analyses of the same sample. Therefore, concepts of measurement uncertainty and traceability are continuously developing in legal activities.
In this framework, the experience of the National Institute of Metrology (INM) on the evaluation of measurement uncertainty of spectrophotometric analyses performed in legal activities, as well as some results on comparability studies using certified reference materials (CRMs) are presented.
Many important decisions are based on the results of spectrophotometric analysis. The extent to which the

quality of these results (i.e. measurement uncertainty) is reflected in regulatory compliance against limits is also discussed.

Outline of metrological assurance of legal measurements

In accordance with the Romanian Law of Metrology (issued in 1992), all measurements performed in production and testing of pharmaceuticals, in trade or in the fields of health, safety and environmental chemistry should be traceable to national or international standards, by the proper use of legal instruments, reference materials (RMs), and adequate methods of measurements. Consequently, the necessary metrological activities for legal measurements are:
1. the assurance of the legality of all instruments used by pattern tests and initial or periodical verification;
2. the development of RMs required by legal metrological norms;
3. the assessment of measurement uncertainty and the achievement of traceability.
In this respect, all instruments used in legal activities are subject to pattern approval of each model and any variants of that model. The performances of these instruments are evaluated and verified, using legal metrological norm (NML) methods and appropriate CRMs. Note that various types of CRMs developed, recognized and accepted for use for spectrophotometric systems are presented in Ref. [1]. Metrological assurance of uniformity and traceability of measurements in legal activities is coordinated and supervised by the Romanian Bureau of Legal Metrology (BRML), and carried out by the INM, 14 area-organized metrological inspectorates (IIJM) and a number of accredited metrological laboratories.
Founded in 1951, the INM's mission is to ensure a valid scientific background for uniformity, consistency and accuracy of all measurements in Romania, regardless of their field of application. The main activities of the INM are shown in Fig. 1.
Measurement uncertainty and traceability are very important for regulatory compliance against limits, when a good reliability of the analytic results and/or monitoring of toxic pollutants is needed. Therefore, much is being done by the INM to improve matters in the specific legal metrology of environmental chemistry and public health. Also, for comparability purposes, the INM organized several inter-laboratory studies using appropriate CRMs (single or multielement). The results on comparability of concentration measurements in legal activities are used to support future metrological activities related to NML procedures and metrological training.

Fig. 1 Activities of the Romanian National Institute of Metrology for ensuring uniformity, consistency and accuracy of all measurements in Romania. Boxes around the central box NATIONAL INSTITUTE of METROLOGY:
COOPERATIONS: IRS, OPC, RELAR; Departments, other governmental agencies; Professional associations; Research institutes, higher education
INTERNATIONAL CONNECTIONS: Participation in BIPM, COOMET, EAL (WECC), EUROMET, IMEKO activities; Bilateral activities; European Programs
PUBLICATIONS, EDUCATION: Metrological norms, procedures; Periodical review METROLOGIE; Scientific and technical papers, books, brochures; Courses, seminars, workshops
INDUSTRIAL METROLOGY: Calibration service; High accuracy measurements; Participation in the auditing of calibration laboratories; Technical assistance, consultancy; Manufacturing of instruments and reference materials
SCIENTIFIC METROLOGY: Research & development in the field of metrology; Realization/improvement of the national standards; Characterization, comparison of standards; Development of measurement / calibration methods/procedures/instrumentation
LEGAL METROLOGY: Pattern tests; Metrological verifications; Participation in metrological expert appraisals, evaluations, review of documentation and other activities

Experience of the INM on evaluation of measurement uncertainty in legal metrology

Detailed evaluation of the measurement uncertainty is carried out as common practice by the INM on the realization of base and derived units. The techniques used rely on assessing and determining the correction for each cause, and building up an uncertainty budget.
Among the standardized methods currently performed in chemical laboratories, spectrophotometric ones are routinely used to determine the concentration of analytes using a variety of equipment, starting from discontinuously wide-band instruments to automated devices with a narrow band detection range. For a general photometric system, illustrated in Fig. 2, the evaluation of measurement uncertainty starts with the identification of the sources of errors and uncertainty components. Measurement uncertainties due to sampling ($u_S$), sample preparation ($u_P$), the photometric system ($u_M$), calibration of the system ($u_R$), the RMs/CRMs used for calibration ($u_{RM}$) and the data treatment ($u_{DA}$) are shown in Fig. 2. But there are many other possible sources of uncertainty in spectrophotometric measurements. Among them, inadequate knowledge of the effects of environmental conditions on the measurement, finite resolution or discrimination threshold and inexact values of measurement standards are some typical examples. Sources of uncertainties occurring in spectrophotometric analysis are presented in Ref. [2].

Fig. 2 Schematic diagram of an analytical photometric system (output of the spectrometric measurement: result $c \pm U_c$)

Using RMs and experimental quantification [3], identified uncertainty components are evaluated as either Type A or Type B standard uncertainties. Thus, for well-characterized measurements under statistical control the uncertainty of the input quantities determined from independent repeated observations is estimated as the experimental standard deviation (Type A standard uncertainty).
For an estimate of an input quantity that has not been obtained from repeated observations, the associated estimated variance or standard uncertainty is evaluated using all available information and its possible variability (Type B standard uncertainty). Each standard uncertainty involved in the spectrophotometric measurement is then combined in an 'adequate mathematical manner' to give the combined standard uncertainty [$u_c(c)$], which characterizes the dispersion of the values that can reasonably be attributed to the considered concentration.
The additional measure of uncertainty providing an interval of confidence, the expanded uncertainty, is obtained by multiplying the combined standard uncertainty by a coverage factor k (for legal metrology applications k is usually taken as 2).
The above-mentioned 'adequate mathematical manner' takes into consideration the function describing the concentration, of a general form:

$$c = c_{\mathrm{rec}} \cdot f \cdot V \cdot K \quad \text{(in environmental analyses)} \qquad (1)$$

where: $c_{\mathrm{rec}}$ is the recalculated concentration from the calibration curve, f is the dilution factor, V is the sample volume and K is a proportionality factor; or

$$c = c_{\mathrm{standard}} \cdot A_{\mathrm{sample}} / A_{\mathrm{standard}} \quad \text{(in clinical analyses)} \qquad (2)$$

where: $c_{\mathrm{standard}}$ is the concentration of the standard reference solution used for comparison, $A_{\mathrm{sample}}$ is the absorbance measured for the sample and $A_{\mathrm{standard}}$ the absorbance measured for the standard reference solution.
Also, note that Eqs. (1) and (2) reflect two different ways of calibrating the spectrophotometric system: (a) by measurement of a CRM, and (b) by measurement of a pure standard of the analyte used to calibrate just the spectrometric comparator. For traceability purposes these situations introduce the following points: the spectrophotometer should be calibrated in a traceable manner, and RMs used for its calibration should assure the traceability to SI units.

Examples of evaluation of measurement uncertainty in environmental analyses

Two examples of measurements frequently performed in environmental analyses to determine cadmium and phosphates in waste water will be discussed.
In accordance with SR ISO 5961 'Quality of water: determination of cadmium by flame atomic absorption spectrometry (FAAS)', synthetic water containing (0.500 ± 0.015) mg/l Cd was prepared under well-controlled conditions and measured on five spectrometers (type AAS 1, instrument 5; type AAS 3, instruments 1 and 2, and type AAS 30, instruments 3 and 4). Note that the performance of each instrument was tested
against single element CRMs (code 13.01), as indicated in NML 9-02-94 'Atomic absorption spectrometers for water pollution measurements'. A summary of the parameters of each calibration curve is presented in Table 1. An estimated standard uncertainty was evaluated starting from the linear calibration of the instrument (uncertainty of regression, residual standard deviation and uncertainty of calibration curve included) [4]. Then a standard uncertainty was determined, combining standard deviation of repeated measurements, correction of the calibration curve and the uncertainty of RMs. The results of the evaluation of these uncertainties are also presented in Table 1. A good agreement between the two standard uncertainties is observed. Limit ratios of 0.73 and 2.47 were calculated from the determined uncertainty and u_RM.
The concentration of phosphates in waste water was determined according to a national standard STAS 10064 'Surface and waste waters: determination of phosphates' by measuring the absorbance of the blue colour of a reduced phosphomolybdate complex. Several types of molecular absorption spectrophotometers (SPECORD M40, instrument 1; DR 2000, instrument 4 and CADAS 100, instrument 5), and photometers of different bandwidth were used. Instruments 2 and 3 were specialized for water measurements (AQUANAL type). The unknown sample of (0.250 ± 0.010) mg/l was prepared under well-controlled conditions. The measurement conditions and evaluation of measurement uncertainty are presented in Table 2. Starting from the experimental steps involved in each measurement method used, an estimated measurement uncertainty was calculated as the square sum of partial uncertainties for volume and absorbance measurements, preparation of the calibration standards and the calibration curve [5].
By statistical analysis of the results obtained on control RMs or CRMs, an observed measurement uncertainty was evaluated (taking into account repeated measurements, correction of the calibration curve, the calibration curve and the uncertainty of the RMs). Quite good agreement was found between the two values of measurement uncertainty obtained from the two different approaches. Furthermore, the experimental standard deviation of the mean value of concentration was determined using the analysis of variance of individual random effects according to [3]. Experimental variances of individual values, mean values and within parallel measurements for cadmium are

Table 1 Results on evaluation of the measurement uncertainty on cadmium determination in waste water

                                                          Instrument 1     Instrument 2     Instrument 3     Instrument 4     Instrument 5
Correlation coefficient, r                                0.9995           0.9999           0.9995           0.9990           0.9995
Intercept of the regression line, a                       -0.0044          -0.0014          0.0060           0.0038           -0.0003
Slope of the regression line, b                           0.1416           0.0694           0.2596           0.1433           0.1049
Standard deviation, So                                    (0.0013)         (0.0007)         (0.0041)         (0.0023)         (0.0021)
Standard deviation of residuals, So                       0.005            0.003            0.009            0.005            0.003
Number of calibration points, N                           4                4                4                4                5
Number of replicate measurements, n                       3                3                3                3                3
Mean of all the absorbance values, A, in the calibration  0.296            0.146            0.217            0.145            0.088
Mean value of the absorbance measured on the sample, A(cx) 0.067           0.032            0.133            0.075            0.049
Predicted concentration, cx (mg/l)                        0.502            0.481            0.491            0.497            0.470
Standard uncertainty estimated, rel                       0.021            0.033            0.026            0.027            0.021
Standard uncertainty determined, rel                      0.011            0.037            0.032            0.028            0.034
Standard uncertainty determined (mg/l)                    0.014            0.018            0.016            0.014            0.016
Confidence interval (mg/l)                                0.488 ... 0.516  0.463 ... 0.499  0.475 ... 0.507  0.483 ... 0.511  0.446 ... 0.486
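The predicted concentrations in Table 1 can be cross-checked by inverting each linear calibration line. The following is a minimal sketch, assuming the calibration model A = a + b·c, so that c = (A − a)/b; the function and variable names are illustrative and do not come from the paper. Because the tabulated absorbances are rounded to three decimals, agreement with the printed concentrations is only expected to within a few thousandths of a mg/l.

```python
# Inverse prediction from a linear calibration line A = a + b*c (assumed model).
# Values are the intercept a, slope b and mean sample absorbance A(cx) from Table 1.
calibration = {
    1: (-0.0044, 0.1416, 0.067),
    2: (-0.0014, 0.0694, 0.032),
    3: (0.0060, 0.2596, 0.133),
    4: (0.0038, 0.1433, 0.075),
    5: (-0.0003, 0.1049, 0.049),
}


def predict_concentration(absorbance, intercept, slope):
    """Return the concentration (mg/l) that corresponds to a measured absorbance."""
    return (absorbance - intercept) / slope


predicted = {
    instrument: predict_concentration(a_sample, a, b)
    for instrument, (a, b, a_sample) in calibration.items()
}

for instrument, c in predicted.items():
    print(f"Instrument {instrument}: {c:.3f} mg/l")
```

The sketch covers only the central estimate; the uncertainty of the prediction additionally needs the residual standard deviation and the spread of the calibration concentrations, which Table 1 does not list in full.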

Table 2 Results on evaluation of measurement uncertainty of phosphate concentration in waste water

Measurement method: O-phosphate reacts with ammonium molybdate in acidic medium to produce a phosphomolybdate complex. This complex is then reduced to an intense molybdenum blue colour

Instrument                          1                  2                      3                  4                  5                  6
Steps considered:
1. sampling (ml)                    50                 10                     50                 25                 2                  50
2. methods of measurement           STAS(a) 10064      As indicated by the    STAS(a) 10064      As indicated by    As indicated by    STAS 10064
                                                       manufacturer                              the manufacturer   the manufacturer
3. volume of reagent (ml)           10                 6 drops R1,            10                 -                  0.2                10
                                                       2 drops R2
4. final volume (ml)                100                10                     100                25                 2.2                100
5. calibration curve
   r                                0.9989             0.9988                 0.9988             A·0.5722           Linear             0.9989
   a                                0.0168             0.0018                 0.0863             -                  -0.129 (k)         0.0933
   b                                0.8528             0.2589                 1.2746             -                  1.423 (F)          1.2086
   So                               0.0450             0.0267                 0.0660             -                  -                  0.0610
6. measurement conditions:
   λ (nm)                           700                635                    650                890                890                660
   time (min)                       30                 5                      30                 2                  10                 30
   path length (mm)                 10                 15                     10                 23.5               25                 25
(ΔE/E) time                         0.016              0.000                  -0.04              +0.03              -0.01              -0.04
(ΔE/E) temp 25°C                    0                  0                      0                  0                  0                  0
Accuracy of the method, rel         0.05               0.05                   0.05               0.01               0.04               0.05
Mathematical equation               (Ax - a)·f_sampl/b (Ax - a)·f_sampl/b     (Ax - a)·f_sampl/b 0.5722·Ax          F·(Ax - k)         (Ax - a)·f_sampl/b
Validation of the instrument with:
   neutral filters                  Yes                Yes                    Yes                No                 No                 No
   CRM 1 mg/l PO4                   0.984              1.03                   1.05               1.05               0.96               1.01
Absorbance measured on sample       0.246              0.069                  0.424              0.472              0.275              0.392
Concentration of the sample (mg/l)  0.246              0.275                  0.265              0.270              0.262              0.247
Estimated standard uncertainty, rel 0.051              0.065                  0.059              0.032              0.022              0.046
Determined standard uncertainty, rel 0.047             0.051                  0.060              0.030              0.022              0.044

(a) STAS: National standard STAS 10064 'Surface and waste waters: determination of phosphates'

shown in Table 3. Since F_calc (1.89) < F_cr(3, 10, 0.95) (3.71), there is no statistical significance between instrument effects at the 5% level of significance. Also, note the agreement between the standard deviation of the mean values around the known value of RM (0.020 mg/l), and the uncertainty assigned to the RMs used (0.015 mg/l).
The difference between the extreme values of the concentration corresponding to the confidence level reported for the CRMs was of 13.3% in the case of cadmium and of 22% for phosphates.

Practical considerations of measurement uncertainty in photometric clinical analyses

Methods most commonly used in clinical laboratories using photometric systems, as well as methods to evaluate measurement uncertainty, have been fully described in [1] and [6]. But the concept of measurement uncertainty is still poorly understood in clinical laboratories.
In clinical chemistry, activity has concentrated so far on the evaluation of measurement uncertainty of six analyses (Na, K, Ca, Mg, urea and glucose), following the international ISO Guide [3]. The main components of uncertainty considered for three typical examples of

Table 3 Summary of cadmium concentration values obtained with different instruments

                               Instrument 1   Instrument 2   Instrument 3   Instrument 4   Instrument 5
Mean value, C (mg/l)           0.502          0.481          0.491          0.497          0.470
Standard deviation (mg/l)      0.014          0.018          0.016          0.014          0.016
Overall mean (mg/l)            0.488

Experimental variance:
  of the individual values (around the overall mean)
  within the parallel measurements
  of mean values (around the overall mean)
  of the overall mean around the known value of RM
  of mean values around the known value of RM
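The comparison of instrument effects by analysis of variance can be sketched from the summary statistics of Table 3. This is a minimal illustration that rests on two assumptions not stated in the source: the tabulated standard deviations are treated as within-instrument standard deviations, and each instrument contributes n = 3 replicates (the number of replicate measurements in Table 1). Since the individual readings and the variance entries are not available here, the sketch is not expected to reproduce the reported F_calc = 1.89 exactly; it only shows the structure of the test. The critical value 3.71 is the one quoted in the text.

```python
# One-way ANOVA F statistic computed from per-instrument summary statistics.
means = [0.502, 0.481, 0.491, 0.497, 0.470]   # mean concentration per instrument (mg/l)
sds = [0.014, 0.018, 0.016, 0.014, 0.016]     # assumed within-instrument standard deviations (mg/l)
n_replicates = 3                              # assumed equal number of replicates per instrument
F_CRITICAL = 3.71                             # critical value quoted in the text


def f_statistic(group_means, group_sds, n):
    """Return the between/within mean-square ratio for equal-sized groups."""
    k = len(group_means)
    grand_mean = sum(group_means) / k
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    # With equal group sizes the pooled within-group variance is the mean variance.
    ms_within = sum(s ** 2 for s in group_sds) / k
    return ms_between / ms_within


f_calc = f_statistic(means, sds, n_replicates)
instrument_effect_significant = f_calc > F_CRITICAL
print(f"F_calc = {f_calc:.2f}, significant: {instrument_effect_significant}")
```

Under these assumptions the statistic stays below the critical value, which is the same qualitative conclusion as in the text.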

end-point determination, and their values are indicated in Table 4. A relative measurement uncertainty of 0.058 has been obtained for glucose determination, 0.128 for urea and 0.025 for calcium. Note that the uncertainties of the CRMs used are indicated in parentheses in the table. The ratio between the uncertainty of CRMs and the measurement uncertainty evaluated for the above described analyses varies from 1.03 to 2.78, which is acceptable agreement with the typically recommended value of 3.

Measurement uncertainty meaning in legal metrology

Measurement uncertainty is significant when interpreting an analytical result of a toxic substance concentration lying within particular limits. Unfortunately few legal limits are set with allowance for uncertainty. Several studies of comparability performed in the national area showed a quite large spread of the results obtained in legal activities. For instance spreads of 35% for Cd and Zn, and 25% for Cu and Cr in waste water have been reported [7]. Also, in clinical laboratories under routine conditions, the spread was lower than 4.9% for Na, 19% for K, 26.1% for Ca, 18.6% for Mg and 15.6% for glucose, asymmetrically distributed around the assigned values [1]. Most outliers were obtained in the absence of a reliable uncertainty budget and insufficient quality assurance procedures. Nevertheless, limit results do not necessarily mean a higher measurement uncertainty. For instance, seven photometric systems of different photometric accuracy were used to determine nitrite

Table 4 Uncertainty components for three examples of end-point determination

                                                                                              Relative measurement uncertainty
Uncertainty components          Evaluation of the individual component                       Glucose    Magnesium   Urea
                                                                                              (0.056)    (0.023)     (0.060)
Due to photometric system       As Type A standard uncertainty (run-to-run variation) and    0.018      0.017       0.060
                                Type B standard uncertainty (certificate of calibration)
Due to CRM                      As Type B standard uncertainty (certificate of CRM)          0.026      0.025       0.022
Due to volume of the pipette    As Type A standard uncertainty (run-to-run variation) and    0.002      0.002       0.002
                                Type B standard uncertainty (manufacturer's specification)
Due to calibration              As described in [2]                                          0.002      0.011       0.002
Combined uncertainty            Square sum of individual standard uncertainties of above     0.029      0.032       0.064
                                components
Overall uncertainty, k=2                                                                     0.058      0.064       0.128
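The last two rows of Table 4 follow from the component rows by the usual combination rule. The sketch below assumes uncorrelated components combined as the root of the sum of squares, followed by multiplication with the coverage factor k = 2; the dictionary layout and function names are illustrative only. Results computed from the rounded components need not match every printed entry to the last digit.

```python
import math

# Relative standard uncertainties from Table 4, in the order:
# photometric system, CRM, pipette volume, calibration.
components = {
    "glucose": [0.018, 0.026, 0.002, 0.002],
    "magnesium": [0.017, 0.025, 0.002, 0.011],
    "urea": [0.060, 0.022, 0.002, 0.002],
}


def combined_uncertainty(standard_uncertainties):
    """Root-sum-of-squares combination, assuming uncorrelated components."""
    return math.sqrt(sum(u ** 2 for u in standard_uncertainties))


def expanded_uncertainty(standard_uncertainties, k=2.0):
    """Expanded uncertainty: combined standard uncertainty times coverage factor k."""
    return k * combined_uncertainty(standard_uncertainties)


results = {
    analyte: (combined_uncertainty(u), expanded_uncertainty(u))
    for analyte, u in components.items()
}

for analyte, (u_c, u_exp) in results.items():
    print(f"{analyte}: combined {u_c:.3f}, expanded (k=2) {u_exp:.3f}")
```

In each column the largest one or two components dominate the result, so halving the pipette contribution, for example, would leave the combined value essentially unchanged.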

Fig. 3 Comparison of standard measurement uncertainties (rel) evaluated for nitrite, iron and glucose determination, using different photometric systems (bar chart; horizontal axis: Instruments 1 to 7; vertical axis: relative uncertainty from 0 to 0.1; series: photometer uncertainty, NO2 determination, Fe determination, glucose determination)

and iron in water, and glucose in human serum. In each case a standard measurement uncertainty was evaluated as described above; the results are illustrated in Fig. 3 (light-grey columns). The left-hand column in each group shows the photometric uncertainty, evaluated from the manufacturer's specifications. Note that for Fe determination using instruments of the same photometric accuracy, the standard measurement uncertainty (rel) varied from 0.060 to 0.120.
In addition, note how important the confidence interval from Table 2 is when judging the compliance with limits. Measurement results from instruments 1, 3 and 5 need individual consideration if the limit is set with some allowance for measurement uncertainty.
Measurement uncertainty also has a major influence on the traceability chains related to legal spectrophotometric measurements. In such situations both the spectrophotometers and CRMs should be traceable, i.e. they need to be calibrated in a proper manner and with an adequate uncertainty. In this respect the ratio between the uncertainty of upper and lower measurements is very important. For physical standards used to calibrate photometric systems in legal metrology the ratio of 3 is most commonly followed. For concentration calibrations this ratio usually does not exceed 1 or 2.

Conclusions

This paper has examined the importance and legal implications of measurement uncertainty statements in environmental chemistry and in the public health sector.
It is now accepted that the quality of an analytical result relies on the uncertainty of the quoted value, evaluated mainly from the calibration and reproducibility of the measurement system, and from the uncertainty of calibration standards. But evaluation of the overall uncertainty follows a complex procedure, which is influenced by the skill of the analyst.

References

1. Buzoianu M (1998) Fresenius J Anal Chem 360:479-485
2. Buzoianu M, Aboul-Enein H-Y (1997) Accred Qual Assur 2:11-17
3. Guide to the expression of uncertainty in measurements, ISO (1993), Geneva
4. ISO 8466 (1990) 1 Qualite de l'eau - Etalonage et evaluation des methodes d'analyse et estimation des caracteres de performance. Evaluation statistique de la fonction lineaire d'etalonage. ISO, Geneva
5. EURACHEM Guide: Quantifying uncertainty in analytical measurement, 1st edn (1995) Laboratory of the Government Chemist, London
6. Buzoianu M, Aboul-Enein H-Y (1997) Accred Qual Assur 2:186-192
7. Duta S, Buzoianu M (1996) Comparability of spectrophotometric measurement results in the Romanian Institute of Metrology. Proceedings of Central European Conference on Reference Materials, CERM '96', Slovakia
Accred Qual Assur (2001) 6:160-163
© Springer-Verlag 2001

Adriaan M.H. van der Veen

Uncertainty evaluation in proficiency testing: state-of-the-art, challenges, and perspectives

Presented at the EURACHEM/EQUALM Workshop "Proficiency Testing in Analytical Chemistry, Microbiology and Laboratory Medicine", 24-26 September 2000, Borås, Sweden

A.M.H. van der Veen
Nederlands Meetinstituut,
Schoemakerstraat 97,
2628 VK Delft,
The Netherlands
e-mail: avdveen@nmi.nl
Tel.: +31-15-269-1733
Fax: +31-15-261-2971

Abstract The evaluation of measurement uncertainty, and that of uncertainty statements of participating laboratories, will be a challenge to be met in the coming years. The publication of ISO 17025 has led to the situation that testing laboratories should, to a certain extent, meet the same requirements regarding measurement uncertainty and traceability. As a consequence, proficiency test organizers should deal with the issues of measurement uncertainty and traceability as well. Two common statistical models used in proficiency testing are revisited to explore the options to include the evaluation of the measurement uncertainty of the PTRV (proficiency test reference value). Furthermore, the use of this PTRV and its uncertainty estimate for assessing the uncertainty statements of the participants for the two models will be discussed. It is concluded that in analogy to Key Comparisons it is feasible to implement proficiency tests in such a way that the new requirements can be met.

Keywords Proficiency testing · Measurement uncertainty · Reference value · Consensus value · Assessment of laboratories

Introduction

The current practice in proficiency testing differs considerably from the practice in comparisons in the calibration area. This is not caused by differences between calibration and testing; it finds its origin in the fact that most test results are at best accompanied by an indication of their repeatability, whereas calibration results come with an uncertainty statement. In view of the new ISO 17025 [1], this difference will disappear, leaving a task for the proficiency testing providers in redesigning their services. Uncertainty calculations will play a dominant role at all levels of proficiency testing in the near future. The laboratories are required to express their uncertainty, and the organizer of the proficiency tests will be required to evaluate the uncertainty statements delivered.
This paper aims to set the frame for the newly designed proficiency tests. Furthermore, it will compare the proposed new practices with classical proficiency testing as it is carried out today. An important aspect of this comparison is to see how these developments affect the assessment of the participants' results. Obviously, if a laboratory performs well today, it should also do so tomorrow. This holds only for the reported value; as the laboratory has to deliver an uncertainty statement there is still the option of performing unsatisfactorily on this part.
The "Guide to the expression of uncertainty in measurement" (GUM) [2] provides the framework for doing uncertainty calculations. It does not distinguish between physics, chemistry, or biology; neither does it between calibration and testing. This observation is very important, as it allows the option of using well designed approaches of key comparisons, or other comparisons in the calibration area. The nature of the problems in designing a proficiency test does not differ from that in the calibration area. Problems like obtaining a reference value, expressing its uncertainty, and dealing with covariances and correlations are all the same.

Basic considerations for evaluating measurement uncertainty

The basis for proficiency testing is described in ISO Guide 43-1: 1997 [3]. One of the tools necessary to assess the performance of the participating laboratories is an assigned value, which is used as reference point. In this paper, the abbreviation PTRV (proficiency test reference value) will be used for this purpose. Classically, there are two ways to obtain a PTRV:

1. By prior measurement ("reference value")
2. From the participants' results ("consensus value")

Irrespective of the model chosen, the GUM [2] provides a framework for the evaluation of the measurement uncertainty with respect to the PTRV. From a fundamental point of view, there is no difference between the two ways of obtaining a PTRV. A practical example of working out the establishment of a PTRV using prior measurement is given elsewhere [4]. Although the process is not uncomplicated, the estimation of measurement uncertainty is certainly well feasible.
When working with a consensus value, the philosophy is not different: the GUM can be implemented straightforwardly, as soon as the establishment of the consensus value is defined appropriately. There are however some practical difficulties to be overcome, which have mainly to do with the quality of the participants' data. It should be noted first that the quality of the PTRV is directly dependent on the quality of the participants' data. This will be reflected by the uncertainty of the PTRV as well. A further problem is the presence of suspicious results (e.g., outliers). It is not acceptable in a proficiency test to work without some policy to treat outliers.
In this paper, the establishment of a PTRV through consensus among participants will be revisited. There are a few different cases to be considered, in fact:

1. Results with credible uncertainty statements
2. Results with non-credible uncertainty statements
3. Results without uncertainty statements

PTRV through consensus

Establishment of a PTRV through consensus is more complicated than through prior measurement. The reason for this is that it is more difficult to develop a set of assumptions and assertions that is in compliance with the data obtained, and a sufficient basis on which to develop an algorithm at the same time. The days are gone when all data from all participants could be thrown into a big "hat" and the consensus value would automatically come out. Building consensus values is probably one of the most complex tasks to be carried out by the organizer.
The other mainstream design, in fact with a PTRV based on prior measurement, is always easier to implement. The understanding of what is going on during the establishment of reference values is usually better than in the case of consensus values: consensus values are often used in cases too complex to be handled by reference values. This is often a result of a lack of understanding, in terms of modeling, of the measurement problem. Properties of the sample, matrix effects, extraction/destruction yields, etc. all contribute greatly to this lack of understanding. All these aspects, which may greatly influence the measurement results and therefore also their uncertainty, may lead to the conclusion that working with a consensus value is inevitable. So, this lack of understanding has more to do with the state-of-the-art in measurement science than with the skills of the team operating the proficiency test.
The topic of correlation between measurement results is a very critical one, and it is gaining more and more interest. The assumption of IID data (independent, identically distributed data) is easily made, but difficult to verify, and in most cases highly critical. If data are not IID, most of the statistics known do not work. Often, the problem is not so much in the distribution, it is more in the (in)dependence. Dependent data can already be observed in cases where all laboratories use the same pure substances for their calibration, for instance. This happens, for example, in PAH-analysis, where there is only one series of certified pure substances available. Obviously, the purity data of these substances cannot be treated as being independent.
Both in testing and in calibration, correlation of data plays an important role. Ignoring the fact that data are correlated leads to wrong uncertainty estimates. The worst part of the message is that it is not even known whether this leads to over- or underestimation problems. As a result, it will just not work to ignore correlations. A safe practice is to drop the assumption of independence, and to work from there. It does make life somewhat more complicated under certain circumstances, but underestimation problems will be avoided.

Case of credible uncertainty statements

The first important case to be considered is the case of credible uncertainty statements. The development of a procedure for the calculation of the consensus value does not differ from an approach suggested for evaluating key comparisons [5] which has also been demonstrated to work for the certification of reference materials [6]. In a recent paper by this author, an implementation of this recipe has been given for the case of reference materials. A disadvantage of the method is that a full description of all measurement models is required. This is - apart from

the considerable extra effort - undesirable for another reason: it is far away from the present philosophy of proficiency testing as it violates the principle to work under "normal conditions".
The crux in designing an evaluation method is in the treatment of the data from the laboratories, in relation to the issue of correlations between results. In principle, for each laboratory pair in the proficiency test, the covariance should be computed. To the full extent, this has been established elsewhere [5, 6]. Here, a simpler method will be proposed. The task for the statistician responsible for the evaluation of the proficiency test is to make a fair estimate of the degree of correlation between two laboratory results. In order to make such an estimation, the organizer should have some insight in the methods, chemicals, and standards used, etc. In most proficiency tests, such information is obtained through an inquiry and/or regular participant-organizer communication.
Instead of requesting all measurement models from all laboratories to be reported like in the case of reference materials [6], the statistician should make a conservative estimate of the (possible) degree of correlation of results. This conservative value should flow into the evaluation method as proposed for the reference materials, and the calculation can be started. Using the methodology of looking at the degrees of equivalence [5, 6], the unsatisfactory results can be removed and the consensus value can be established. Then, with the consensus value after removal of unsatisfactory results, the results of the laboratories can be assessed.

Case of non-credible uncertainty statements

This case cannot be compared with the case of credible uncertainty statements. The problem is that the organizer of the proficiency test gets a lot of information, but the value of this information is to a certain degree questionable. Obviously, the judgment as to whether information is credible or not is something that must be decided from case to case, but always beforehand. If, during a proficiency test, it appears that the wrong decision has been taken, then it is not an easy task to do a repair: the danger of violating other assumptions is great. Furthermore, it leaves the participants in doubt about the outcome of the proficiency test, something to be avoided at all cost.
If the uncertainty statements are not credible, it is better to refrain from using the uncertainty information at all for the establishment of the consensus value. It is better practice to use some kind of approximation, like for instance the following formula:

$$u^2(m) = \frac{s^2}{p} + \sum_{i=1}^{L} u_{i,\mathrm{other}}^2 \qquad (1)$$

where the last term reflects those uncertainty sources other than those randomized in the proficiency test. These uncertainties are considered to be more or less the same for all participants. The standard deviation s is just the standard deviation of the laboratory means, whereas m is the mean of these laboratory means. p denotes the number of laboratories. Further treatment of data can take place as usual, including outlier/straggler testing and/or removal if considered appropriate. It should be noted that the larger the proficiency test (p), the smaller the first term in the expression for the uncertainty, so the more important the second term becomes. This is a serious disadvantage of the approach, and cannot be solved easily, due to apparent problems in the uncertainty estimation.
The method can obviously also be carried out with robust estimation techniques, like for instance the use of the median and the (normalized) median of absolute deviations, MADe. The procedure remains the same, and usually the results from robust estimation techniques do not differ significantly from those after an evaluation using classical statistical techniques [7, 8].
The evaluation of the performance of the laboratories can now take place as in the case of the credible uncertainty statements, as the uncertainty of the consensus value is now available, and so are all uncertainty statements from the laboratories.

No uncertainty information available

In several cases it may still be impossible to come up with an uncertainty statement. This is probably the worst situation, as the customer of the laboratory does not have any indication about the reliability of the reported data. In the absence of uncertainty data it is obviously impossible to work with anything else than the reported laboratory averages. It still leaves the organizer of the proficiency test with the task of estimating the uncertainty of the consensus value. Typically, one could proceed as follows. The uncertainty at the level of a laboratory can be computed from

$$u^2(y) = s^2 + \sum_{i=1}^{L} u_{i,\mathrm{other}}^2 \qquad (2)$$

where all symbols have the same meaning as in the previous case. The major difference is that the division by p has vanished. This is a necessity, as only the reported value of the laboratory (y) can be assessed (there is no uncertainty information).
In this case, the well known Z-score can still be used:

$$Z = \frac{m - y}{u(y)} \qquad (3)$$

to assess the performance of the laboratories. The estimation of the uncertainty of a "typical" laboratory is a real burden, as the organizer must find ways to come up with an uncertainty statement in a complete lack of information. This situation should be avoided, or circumvented by working with fixed limits in the performance characteristics. This is a completely different philosophy, and outside the scope of this paper.

Role of homogeneity and stability of PTMs

Similarly to the uncertainty of the property values of (certified) reference materials, the uncertainty of the property values of PTMs (proficiency test materials) should also include the between-bottle homogeneity [9] and short- and long-term stability [10]. It should be noted that (1) the stability of the material is only of concern as long as the comparison is ongoing and (2) short-term stability might impose even greater problems than in the case of CRMs. This is due to the fact that PTMs are often more like "real-world" samples, in a sense that the measures taken to improve stability are less severe than for several groups of CRMs. The inclusion of these uncertainty components in the uncertainty of the PTM is analogous to the uncertainty model established for reference materials and is described elsewhere [6, 11].

Conclusions

In conclusion, it is demonstrated that practical approaches are at hand to run proficiency tests in the testing area in the same way as comparisons in the calibration area. The nature of the two comparisons is exactly the same: the problems of credible uncertainty statements as well as that of correlated variables also exist in both cases. The outcome of the restyled proficiency test must not differ from the classical approach, provided that the same assumptions are used and that they are "translated" correctly in the model.
Uncertainty calculations in the testing area are no longer completely different from those in the calibration area. There are differences, and both areas have their specific problems. There is a big task ahead for proficiency testing organizers in adapting to the new situation, but they can borrow a lot from existing techniques made available in comparisons in the calibration area. It will probably bring the science of experimental measurement and the science of uncertainty evaluation more closely and more consistently together, which will improve the learning cycle in proficiency testing considerably. It will give a boost to the understanding of how measurement systems behave, and this will allow for more direct and better heading actions if method improvement is necessary.
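The approximations proposed in this paper for a consensus value without credible uncertainty statements, that is the uncertainty of the consensus mean, the uncertainty at the level of a single laboratory, and the Z-score, can be sketched in a few lines. The laboratory means and the single "other" uncertainty component below are invented purely for illustration, and the function names are not from the paper; outlier treatment and robust estimators (median, MADe) are left out.

```python
import math
import statistics

# Hypothetical laboratory means and one hypothetical non-randomized uncertainty
# source (for example a shared calibrant); neither comes from the paper.
lab_means = [10.1, 9.8, 10.3, 10.0, 9.9]
u_other = [0.05]


def consensus_value(means, other):
    """Return (m, s, u(m)) with u^2(m) = s^2/p + sum of u_i,other^2."""
    p = len(means)
    m = statistics.mean(means)
    s = statistics.stdev(means)  # standard deviation of the laboratory means
    u_m = math.sqrt(s ** 2 / p + sum(u ** 2 for u in other))
    return m, s, u_m


def laboratory_uncertainty(s, other):
    """Return u(y) with u^2(y) = s^2 + sum of u_i,other^2 (no division by p)."""
    return math.sqrt(s ** 2 + sum(u ** 2 for u in other))


def z_scores(means, m, u_y):
    """Return Z = (m - y) / u(y) for every laboratory mean y."""
    return [(m - y) / u_y for y in means]


m, s, u_m = consensus_value(lab_means, u_other)
u_y = laboratory_uncertainty(s, u_other)
scores = z_scores(lab_means, m, u_y)

print(f"m = {m:.3f}, u(m) = {u_m:.4f}, u(y) = {u_y:.4f}")
print("Z-scores:", [round(z, 2) for z in scores])
```

As the number of laboratories p grows, s²/p shrinks while the "other" term stays fixed, which is the effect the paper describes as a disadvantage of this approach.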

References
1. ISO (1999) International Organization for Standardization ISO 17025: General requirements for the competence of testing and calibration laboratories. ISO Geneva
2. ISO (1995) BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML: Guide to the expression of uncertainty in measurement, 1st edn, 2nd corrected print. ISO Geneva
3. ISO (1997) International Organization for Standardization: ISO/IEC Guide 43-1: 1997: Proficiency testing by interlaboratory comparisons - Part 1: Development and operation of proficiency testing schemes. ISO Geneva
4. Van der Veen AMH, Horvat M, Milacic R, Buacr T, Repinc U, Scancar J, Jacimovic R (2001) Operation of a proficiency test of trace elements in sewage sludge with reference values. Accred Qual Assur (submitted for publication)
5. Nielsen L (1999) Evaluation of measurement intercomparisons by the method of least squares. DFM Rep 99-R39, presented at the EUROMET workshop on uncertainty calculations in key comparisons, Teddington, Nov 1999
6. Van der Veen AMH (2000) Determination of the certified value of a reference material appreciating the uncertainty statements obtained in the collaborative study. Presented at AMCTM 2000, Monte de Caparica, May 2000
7. Van der Veen AMH, Broos AJM (1996) Preparation and characterisation of coal samples and maceral concentrates for studies on gasification and combustion reactivity of coals in combined cycle processes. Draft Final Rep ECSC 7220/EC-036, Eygelshoven, NL
8. Cox MG (1999) A discussion of approaches for determining a reference value in the analysis of key-comparison data. NPL Rep CISE 42/99, Teddington, UK
9. Van der Veen AMH, Linsinger TPJ, Pauwels J (2001) Uncertainty calculations in the certification of reference materials. 2. Homogeneity study. Accred Qual Assur 6:26-30
10. Van der Veen AMH, Linsinger TPJ, Lamberty A, Pauwels J (2001) Uncertainty calculations in the certification of reference materials. 3. Stability study. Accred Qual Assur (in press)
11. Van der Veen AMH, Linsinger TPJ, Schimmel H, Lamberty A, Pauwels J (2001) Uncertainty calculations in the certification of reference materials. 4. Characterisation and certification. Accred Qual Assur (in press)
Accred Qual Assur (1998) 3: 69-78
© Springer-Verlag 1998

Michel Gerboles
Elias Diaz
Alberto Noriega-Guerra

Uncertainty calculation and implementation of the static volumetric method for the preparation of NO and SO2 standard gas mixtures

M. Gerboles (✉) · A. Noriega-Guerra
European Reference Laboratory of Air Pollution (ERLAP), Commission of the European Communities, Joint Research Centre, I-21020 Ispra, Italy
Tel.: +39-332-785652; Fax: +39-332-78…
e-mail: michel.gerboles@jrc.it

E. Diaz
Beca de Ampliacion de Estudios del F.I.S., Instituto de Salud Carlos III, Madrid, Spain

Abstract The European Reference Laboratory of Air Pollution implements the static volumetric method for the preparation of nitrogen monoxide and sulphur dioxide reference standard gas mixtures. According to the new ISO guide for the expression of uncertainty, the uncertainty of these standards is up to 0.8% for nitrogen monoxide in the range 100 to 600 ppbv, and up to 0.4% for sulphur dioxide in the range 200 to 400 ppbv. The values presented in the present paper suggest that there is a 95% probability of the true value lying within the interval specified. To attain such low uncertainty values, the standard procedure for the implementation of the static volumetric method must be rigorously followed, and instruments must be carefully maintained.

Key words Uncertainty calculation · Calibration · Static volumetric method · Air pollution monitoring

Glossary of symbols
$C_1$ Volume concentration of the diluted standard gas mixture
$C_0$ Volume concentration of the pure component ($\approx 1$)
p Pressure of the injected pure component (NO or SO2)
v Volume of the injected pure component (NO or SO2)
P Pressure in the vessel
V Volume of the vessel
$u_{C_0}$ Standard uncertainty of type B error of the purity of the gas
$u_{v1}$ Standard uncertainty of type A error of the volume of the syringe (balance repeatability)
$u_{v2}$ Standard uncertainty of type B error of the volume of the syringe (balance linearity)
$u_{v3}$ Standard uncertainty of type B error of the volume of the syringe (pure gas diffusion)
$u_{p1}$ Standard uncertainty of type A error due to the pressure sensor of the transferred component
$u_{p2}$ Standard uncertainty of type B error due to over-pressure of the transferred component
$u_V$ Standard uncertainty of type B error of determination of the volume of the vessel
$u_{P1}$ Standard uncertainty of type A error of the pressure sensor in the vessel
$u_{P2}$ Standard uncertainty of type B error of the pressure sensor in the vessel
$u_{P3}$ Standard uncertainty of type B error of the pressure in the vessel due to the lack of temperature equilibrium
$u_c$ Combined standard uncertainty on the volume concentration of the diluted standard gas mixture
k Coverage factor (k = 2)
U Expanded standard uncertainty on the volume concentration of the standard gas mixture

Introduction

The monitoring of atmospheric sulphur dioxide and nitrogen dioxide is regulated by the European directives 80/779/CEE [1] and 85/203/CEE [2]. In these, the tetrachloromercurate (TCM)/pararosaniline [3] method is proposed as the reference method for sulphur dioxide determination, and chemiluminescence [4] as the reference method for the determination of nitrogen oxides. However, technological progress has meant that, throughout the European air pollution monitoring network, UV fluorescence [5] has virtually replaced TCM for the analysis of atmospheric sulphur dioxide.
The European Reference Laboratory of Air Pollution (ERLAP) is the reference laboratory for atmospheric pollution serving the European Commission. One of its duties is to maintain European standards for the calibration of NO2 and SO2 methods of analysis. ERLAP has chosen the permeation method [6] evaluated by gravimetry to produce reference standards for both NO2 and SO2 methods, but uses the static volumetric method [7] to cross check the permeation method.
This paper describes ERLAP's implementation of the static volumetric method and deals with the uncertainty of standards generated by this method. Practical examples of the calculation of uncertainty are given.

Principle of the method

General principles

At atmospheric pressure p and room temperature, a known volume v of the pure component to be analysed ($C_0 \approx 1$) is transferred with a syringe to a large borosilicate vessel of known volume V filled with a selected carrier gas. The vessel is then filled with the selected carrier gas to pressure P, which is usually about 1.5 atm to facilitate use of the mixture. The mixture can be used once the temperature has returned to ambient. Under these conditions, the volume concentration of the component $C_1$ is practically equal to the molar fraction concentration and can be expressed by the formula:

$$C_1 = C_0 \, \frac{p \, v}{P \, V} \qquad (1)$$

Implementation at the ERLAP laboratory

The static volumetric method is described in detail in the Guidelines of the VDI 3490 Blatt 14 [8], and it has been successfully tested for over 20 years at the UBA Pilot Station of the Federal Environmental Agency of Germany. The static volumetric system implemented by ERLAP was devised and developed at the UBA Pilot Station and has been tested at the ERLAP laboratory for more than 3 years. The ERLAP laboratory uses this method for the preparation of SO2 and NO standard calibration gas mixtures. The static volumetric system used for this purpose is shown in Fig. 1.

Fig. 1 Static volumetric system (labelled components: pressure gauge, temperature gauge, dilution gas, fan, borosilicate vessel (110 l), venting, vacuum pump, reference analyser, pure gas NO or SO2)

Mode of operation

Borosilicate glass mixing vessel

Experiments at the UBA Pilot Station have clearly shown that, for components such as SO2 and NO, wall effects from borosilicate glass were negligible when preparing concentrations of 100 ppbv and above in dry carrier gases [9]. The volume of the vessel was determined as 0.11184 m3 ± 0.1% by a replicated process of filling with water and deriving volume from weight of

liquid contained within the vessel. To check for leaks in the vessel, pressure stability was periodically determined at 10^-2 mbar and 1600 mbar.

Fan

A fan mounted inside the vessel to help mix the pure gas and carrier gas comprised a stainless steel propeller mounted on a stainless steel axis. The fan was powered by an electric motor.

Vacuum pump

This was an oil-free Alcatel molecular pump model DRYTEL 100C, capable of creating a residual pressure in the vessel of less than 10^-2 mbar.

Septum and other external connections

The septum was a silicone rubber disk. Other external valves were made of stainless steel or polytetrafluoroethylene (PTFE).

Temperature

Temperature within the vessel was measured with a PT100 stainless steel probe connected to a Thesto Therm 9000 process unit. The overall uncertainty of the system was better than 0.1 K.

Pressure

The pressure sensor was a Druck model DPI 510, with a precision of 0.025% and an accuracy of 0.04%.

Pure gas

The pure gases used for the preparation of NO and SO2 standard mixtures were manufactured by Messer Griesheim, with a purity better than 99.5% for NO and 99.98% for SO2. NO with purity better than 99.8% may be available in the future.

Carrier gas

For the preparation of NO standards, cylinders of chromatography-grade N2 (NO free) were used. For the preparation of SO2 standards, zero air produced by the ERLAP zero air generator (SO2 free) was used.

Syringes

Pure gases were injected into the vessel with 100-, 50- and 25-µl Hamilton series 1800 syringes. Special ERLAP mechanisms were used to improve the repeatability of the filling process. The volume of each syringe was determined by filling with water and deriving volume gravimetrically (as described above for the borosilicate glass mixing vessel).

Procedure for the preparation of a mixture of NO with nitrogen

Cleaning the vessel

Check that the contents of the vessel are at ambient pressure, and, if not, open the outlet valve to allow the pressure to equilibrate. Start the molecular pump and open the pump valve sufficiently to establish a pressure inside the vessel of about 10^-2 mbar.

Filling the vessel with carrier gas

Open the reducing valve on the carrier gas (N2) followed by the dilution valve until the pressure inside the vessel reaches about 1050 mbar. Open the outlet valve to allow the chamber to return to ambient pressure and then close it. Wait 10 min for the temperature and pressure to stabilise, and then open the outlet valve to allow excess N2 to escape. Record the values of temperature $T_i$ and pressure p.

Filling the syringe with pure gas

This involves the use of a small stainless steel container (the septum chamber, see Fig. 2), the integrity of which is maintained by two manual stainless steel valves. One valve connects directly to the reducing valve of the pure NO cylinder and the other to a vacuum pump.
Turn on the vacuum pump after first checking that all valves between the pump and the NO cylinder are closed. Introduce an empty syringe into the chamber (through a septum similar to that in the mixing vessel) and carry out the following sequence to ensure that the syringe is filled with pure NO.
(a) Close both septum chamber valves, the reducing valve and the NO cylinder.
(b) Open the septum chamber valves for 20 s to "clean" the septum chamber and tubes.

(c) Close both septum chamber valves and open the NO cylinder.
(d) Close the NO cylinder and open the reducing valve until a pressure of 2 bar is established (read on indicator II of the reducing valve).
(e) Close the reducing valve and open the first septum chamber valve to send pure NO into the septum chamber. Close the first septum chamber valve.
(f) Fill the syringe with pure NO and drain the syringe. Repeat three times.
(g) Fill the syringe with pure NO and open the second septum chamber valve to clean the septum chamber and syringe. Drain the syringe and close the second septum chamber valve.
(h) Repeat steps c, f, g.
(i) Repeat steps c, d and e.
(j) Slowly fill the syringe completely with pure NO and wait for 1 min before removing it from the septum chamber.
(k) Switch off the vacuum pump after first opening its venting valve (to avoid oil entering the tubing).
During these operations, it is important not to touch the glass of the syringe or the septum chamber to ensure that these remain at ambient temperature.

Fig. 2 Syringe filling system

Injecting the pure NO into the mixing vessel

Take the syringe filled with pure NO out of the septum chamber, and, 10 s after adjusting it to the required volume (by way of the ERLAP mechanism), slowly inject the pure NO into the mixing vessel. Once the syringe is empty, quickly remove it (to avoid possible loss of pure NO along the surface of the syringe needle) and turn on the fan for 2 min to aid mixing.

Increasing pressure in the mixing vessel

Open the reducing valve on the cylinder of carrier gas (N2), and the dilution valve to establish a pressure of about 1550 mbar (read on the Druck model 510 instrument). Wait 10 min for the temperature in the mixing vessel to stabilise at the initial value $T_i$ (read on the Thesto Therm 9000 instrument) and at this point record the value of pressure P.

Calculating the theoretical value

The theoretical value is calculated using Eq. 1.

Procedures for SO2

Apart from the use of zero air as carrier gas instead of N2, the procedure for the preparation of SO2 standard gas mixtures is exactly the same as for the preparation of NO standard gas mixtures.

Correction for the level of purity of the pure gases

The pure gases (NO and SO2) are supplied by Messer Griesheim Italia. The NO cylinder is of the Nitric Oxide 2.5 F1S type with a certified purity ≥ 99.5%. The SO2 cylinder is of the Sulphur Dioxide 3.8 F1S type with a certified purity ≥ 99.98%. It was decided to apply a correction factor to the reference value of the NO standards of -0.25%, with lower and upper limits of -0.5 and 0%. For SO2 standards, the correction factor was -0.01%, with lower and upper limits of -0.02 and 0%. The purity certified by the manufacturer was verified by FT-IR spectrometry.

The new ISO method for calculating uncertainty

In general, an analytical measurement provides only an estimation of the value of a determinand, and must be accompanied by a quantitative statement of the uncertainty attached to the estimate.
In 1993, ISO published a new guide to the expression of uncertainty [10], in which each component contributing to the uncertainty of a measurement is allotted an estimated uncertainty, termed standard uncertainty ($u_i$), equal to the positive square root of the estimated variance $u_i^2$.
The uncertainty associated with a measurement generally consists of several components, which may be grouped into two categories:
A. Those which are evaluated by statistical methods
B. Those which are evaluated by other means.

Type A evaluation of standard uncertainty

Type A evaluation of uncertainty may be based on any valid statistical method for treating data. For example:
- Calculating the standard deviation of the mean of a series of independent observations
- Using the method of least squares to fit a curve
- Carrying out an analysis of variance (ANOVA) to quantify random effects.
As an example of Type A evaluation, consider an input quantity $X_i$ whose value is estimated from n independent observations $X_{i,k}$ obtained under identical conditions of measurement. In this case, the estimated standard deviation of the mean $\bar{X}_i$ is the positive square root of:

$$s^2(\bar{X}_i) = \frac{1}{n(n-1)} \sum_{k=1}^{n} \left(X_{i,k} - \bar{X}_i\right)^2 \qquad (3)$$

Type B evaluation of uncertainty

Type B evaluation of standard uncertainty is usually based on scientific judgement and may make use of all available relevant information, including:
- Previously measured data
- Available information concerning the behaviour and properties of materials and instruments involved
- Manufacturer's specifications
- Calibration data and other available information.
As an example of Type B evaluation, consider an input quantity $X_i$ whose value is estimated from an assumed rectangular probability distribution with lower limit $a_-$ and upper limit $a_+$. In this case the input estimate is usually expressed by:

$$x_i = (a_+ + a_-)/2 \qquad (4)$$

and the standard uncertainty associated with $x_i$ is:

$$u(x_i) = a/\sqrt{3}, \quad \text{where } a = (a_+ - a_-)/2 \qquad (5)$$

Combined standard uncertainty

The combined standard uncertainty of a measured value ($u_c$) is assumed to correspond to the estimated standard deviation of the result. In the case of non-correlated components, it is derived by combining individual standard uncertainties $u_i$, which may arise either from type A or type B evaluations. The method is often referred to as the law of propagation of uncertainty, and is expressed as:

$$u_c^2(y) = \sum_{i=1}^{N} \left(\frac{\partial f}{\partial x_i}\right)^2 u^2(x_i) \qquad (6)$$

The partial derivatives are referred to as sensitivity coefficients.
It is assumed that corrections have been applied to compensate for each systematic effect which significantly influences the measured value, and that every effort has been made to identify such effects.

Expanded uncertainty

What is often required is an expression of uncertainty to define the limits associated with a measured value y within which the value Y is confidently believed to lie. The measure of uncertainty intended to meet this requirement is termed the expanded uncertainty, suggested symbol U, and is obtained by multiplying $u_c(y)$ by a coverage factor, suggested symbol k. Thus $U = k\,u_c(y)$ and it is confidently believed that $y - U \le Y \le y + U$, which is commonly written $Y = y \pm U$. In general, the value of the coverage factor k is chosen on the basis of the desired level of confidence to be associated with the interval defined by $U = k\,u_c$. Typically k is in the range 2 to 3. When the normal distribution applies, $U = 2\,u_c$ defines an interval having a level of confidence of approximately 95%, which is consistent with current international practice.

Uncertainty budget

Purity of the gases $C_0$

As described in "Correction for the level of purity of the pure gases", for NO, the lower and upper limits of the purity correction are 99.5-100%. The probability that the purity lies in this interval is 100% (rectangular distribution). The best estimate of the standard uncertainty of the quantity is then the positive square root of:

$$u_{C_0}^2 = \frac{(a_+ - a_-)^2}{12} = \frac{(1 - 0.995)^2}{12} = 2.083 \times 10^{-6} \qquad (7)$$

For SO2, the lower and upper limits of the purity correction are 99.8-100%. The best estimate of the standard uncertainty of the quantity is then the positive square root of:

$$u_{C_0}^2 = \frac{(a_+ - a_-)^2}{12} = \frac{(1 - 0.998)^2}{12} = 3.333 \times 10^{-7} \qquad (8)$$

Volume of the syringe v

The volume of each syringe (with ERLAP mechanism) was determined by filling with water and deriving the volume from the weight of liquid (measured with a

Mettler AT201 balance) contained in the syringe (corrected for water density of 0.998 g/cm3 at 22 °C). Several replicate volume measurements were carried out in a hysteresis cycle, and the results are shown in Appendix 1. Variations in the measured volume reflected variability in the filling and emptying processes and in the performance of the balance. The standard deviation of the various estimates of v is the square root of $u_{v1}^2$ and is given in Table 1.

Table 1 Measured volume of the syringes. Average and standard deviation of 15 replicate measurements, balance linearity deviation $u_{v2}^2$ and diffusion deviation through the needle $u_{v3}^2$

Syringe    Average (l)      u_v1² (l²)       u_v2² (l²)       u_v3² (l²)
1 (NO)     24.71 × 10^-6    2.19 × 10^-15    1.33 × 10^-16    5.09 × 10^-17
2 (NO)     39.86 × 10^-6    2.08 × 10^-14    1.33 × 10^-16    1.32 × 10^-16
3 (NO)     78.43 × 10^-6    8.91 × 10^-15    1.33 × 10^-16    5.13 × 10^-16
4 (NO)     99.23 × 10^-6    5.11 × 10^-15    1.33 × 10^-16    8.21 × 10^-16
5 (SO2)    39.65 × 10^-6    2.16 × 10^-15    1.33 × 10^-16    1.31 × 10^-16
6 (SO2)    49.16 × 10^-6    2.44 × 10^-15    1.33 × 10^-16    2.01 × 10^-16
7 (SO2)    69.29 × 10^-6    2.36 × 10^-15    1.33 × 10^-16    4.00 × 10^-16

The balance was also subject to a linearity deviation from the true value, evaluated by the manufacturer as being ± 0.02 mg (0.02 µl) in the range 0-5 g. Assuming a rectangular distribution, the best estimate of the standard uncertainty of v is the square root of $u_{v2}^2$, given in Table 1.
The syringe is filled with pure gas (NO or SO2) at a pressure of about 2 bar. When the pure gas is injected into the glass vessel, it must be returned to the ambient pressure without having reacted or undergone any transformation. In fact, the absorption of NO and SO2 on the glass walls of the syringe has never been observed and is not likely to produce a relevant reduction of the injected volume. No reaction or transformation of the pure gases has been evidenced so far. However, diffusion of the pure gas out of the syringe through the syringe needle is observed after the transient period needed for the syringe pressure to adapt to the ambient pressure. The pure gas may leave the syringe chamber by diffusion through the needle before injection into the glass vessel. This diffusion has been investigated in Appendix 2 and Fig. 3. With a time interval of 10 s (+ 15 s of tolerance) before injection, the pure gas represents only 99.9-100% of the syringe volume that is injected into the glass vessel. Assuming a rectangular distribution, the best estimate of the standard uncertainty of v is the square root of $u_{v3}^2$, given in Table 1.
Obviously, there are differences between injecting a gas with the syringe and injecting a liquid, but good estimates of volume are possible provided the precautions outlined in "Procedure for the preparation of a mixture" are followed (especially with respect to the duration of injection, see Appendix 2).

Pressure of the pure gas p

It is important that pure gas is injected at room temperature following the procedure described in "Mode of operation" and paying special attention to the precautions relating to pressure.
Room pressure was measured with a barometer manufactured by Lambrecht Klimatologische Messtechnik (Göttingen) model 00.06040.100000, the specified uncertainty $u_{p1}$ of which is ± 0.25 mbar. Pressure is influenced by the time interval between extracting the syringe from the septum chamber and injecting into the vessel. Appendix 2 shows the results of SO2 measurements made with different time intervals between withdrawal of the syringe from the septum chamber and injection of pure SO2 into the mixing vessel. The results show that there were few differences between injections after 5- or 15-s intervals, although a transient over-pressure of 0-0.2% cannot be ruled out. Assuming a rectangular distribution, the best estimate of the standard uncertainty of p is the square root of $u_{p2}^2$ (in mbar²):

$$u_{p2}^2 = \frac{(4)^2}{12} = 1.333 \qquad (9)$$

Volume of the vessel V

Total volume was determined by filling with water as described in "Mode of operation". Assuming a rectangular distribution, the best estimate of the standard uncertainty of V is the square root of $u_V^2$ (in l²) with:

$$u_V^2 = \frac{(a_+ - a_-)^2}{12} = \frac{(111.95 - 111.73)^2}{12} = 4.03 \times 10^{-3} \qquad (10)$$

The 4-cm wide borosilicate glass vessel is not expected to undergo increases in volume at internal pressure up to 1.7 bar.

Pressure in the vessel P

Pressure was determined as described in "Mode of operation". For pressures up to 1500 mbar, the manufacturer claimed a precision of 0.375 mbar, although the precision of the digital display might be thought to be at least one digit (i.e. 1 mbar). Assuming rectangular distributions, the best estimate of the standard uncertainty of P is the square root of the sum of $u_{P1}^2$ and $u_{P2}^2$ (in mbar²):

$$u_{P1}^2 = \frac{(a_+ - a_-)^2}{12} = \frac{(0.75)^2}{12} = 0.047 \qquad (11)$$

$$u_{P2}^2 = \frac{(a_+ - a_-)^2}{12} = \frac{(2)^2}{12} = 0.333 \qquad (12)$$
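The Type B terms of this budget all apply the same rectangular-distribution rule, variance = (a+ - a-)²/12. The short Python sketch below recomputes a few of them as a plausibility check. The helper name `rectangular_variance` is hypothetical, and the limits passed to it are read from the text (purity 99.5-100%, vessel volume 111.73-111.95 l, sensor precision ± 0.375 mbar, balance linearity ± 0.02 µl); it is an illustration, not the authors' own calculation tool.

```python
def rectangular_variance(lower, upper):
    """Variance of a rectangular (uniform) distribution on [lower, upper]."""
    return (upper - lower) ** 2 / 12


# Limits taken from the text; variable names are illustrative only.
u2_purity_no = rectangular_variance(0.995, 1.0)        # Eq. 7, dimensionless
u2_vessel = rectangular_variance(111.73, 111.95)       # Eq. 10, in l^2
u2_sensor = rectangular_variance(-0.375, 0.375)        # Eq. 11, in mbar^2
u2_balance = rectangular_variance(-0.02e-6, 0.02e-6)   # balance linearity, in l^2

for label, value in [
    ("purity of NO", u2_purity_no),
    ("vessel volume", u2_vessel),
    ("vessel pressure sensor", u2_sensor),
    ("balance linearity", u2_balance),
]:
    print(f"{label}: {value:.3e}")
```

The printed values can be compared with the figures quoted in Eqs. 7, 10 and 11 and with the balance linearity column of Table 1.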

If room temperature were not reached after dilution had established a pressure of 1.5 atm, P would not be correctly estimated, as temperatures exceeding ambient by more than 0.5 °C have been observed to produce deviations of ± 1 mbar in the final pressure P. Assuming rectangular distributions, the best estimate of the standard uncertainty of P is the square root of (in mbar²):

$$u_{P3}^2 = \frac{(a_+ - a_-)^2}{12} = \frac{(2)^2}{12} = 0.333 \qquad (13)$$

Uncertainty calculation

Tables 2 and 3 below show four NO calculations and three SO2 calculations according to Eq. 1 based on experimental data involving different operators.
The uncertainty calculation uses Eq. 2. For non-correlated variables, the combined standard uncertainty is:

$$u_c^2 = \left(\frac{p v}{P V}\right)^2 u_{C_0}^2 + \left(\frac{C_0 p}{P V}\right)^2 u_v^2 + \left(\frac{C_0 v}{P V}\right)^2 u_p^2 + \left(\frac{C_0 p v}{V^2 P}\right)^2 u_V^2 + \left(\frac{C_0 p v}{P^2 V}\right)^2 u_P^2$$

with $u_v^2 = u_{v1}^2 + u_{v2}^2 + u_{v3}^2$, $u_p^2 = u_{p1}^2 + u_{p2}^2$ and $u_P^2 = u_{P1}^2 + u_{P2}^2 + u_{P3}^2$.

For our examples, the calculations are summarised in Table 4.

Table 2 Examples of calculation of NO

Experiment   C0       v (µl)   V (l)    p (mbar)   P (mbar)   C1 (ppbv)
1            99.75%   24.76    111.84   993        1589       138
2            99.75%   39.93    111.84   1000       1548       230
3            99.75%   78.59    111.84   996        1508       463
4            99.75%   99.43    111.84   995        1459       605

Table 3 Examples of calculation of SO2

Experiment   C0       v (µl)   V (l)    p (mbar)   P (mbar)   C1 (ppbv)
1            99.99%   39.73    111.84   995        1558       227
2            99.99%   49.26    111.84   992        1595       274
3            99.99%   69.43    111.84   999        1557       398
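The concentrations in Tables 2 and 3 can be recomputed with a few lines of Python. The sketch below assumes that Eq. 1 has the form C1 = C0·p·v/(P·V), which is consistent with the sensitivity coefficients listed in Table 4, and that the vessel volume is 111.84 l (0.11184 m3). The function name and the dictionary layout are illustrative choices, not part of the original procedure.

```python
V_VESSEL_L = 111.84  # vessel volume in litres (0.11184 m3)


def diluted_concentration_ppbv(c0, v_ul, p_mbar, big_p_mbar, vessel_l=V_VESSEL_L):
    """Eq. 1 (assumed form): C1 = C0 * p * v / (P * V), returned in ppbv."""
    v_l = v_ul * 1e-6  # syringe volume: microlitres -> litres
    return c0 * p_mbar * v_l / (big_p_mbar * vessel_l) * 1e9


# (C0, v in microlitres, p in mbar, P in mbar) as listed in Tables 2 and 3
experiments = {
    "NO 1": (0.9975, 24.76, 993, 1589),
    "NO 2": (0.9975, 39.93, 1000, 1548),
    "NO 3": (0.9975, 78.59, 996, 1508),
    "NO 4": (0.9975, 99.43, 995, 1459),
    "SO2 1": (0.9999, 39.73, 995, 1558),
    "SO2 2": (0.9999, 49.26, 992, 1595),
    "SO2 3": (0.9999, 69.43, 999, 1557),
}

results = {
    name: diluted_concentration_ppbv(c0, v, p, big_p)
    for name, (c0, v, p, big_p) in experiments.items()
}

for name, c1 in results.items():
    print(f"{name}: {c1:.1f} ppbv")
```

Rounded to the nearest ppbv, the results agree with the C1 column of the two tables.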

Table 4 Uncertainty calculation: components $(\partial f/\partial x_i)^2 u_i^2$ of the combined variance

Component                                           NO syringe 1   NO syringe 2   NO syringe 3   NO syringe 4   SO2 syringe 1  SO2 syringe 2  SO2 syringe 3
C1 (ppbv)                                           138            230            463            605            227            274            398
Gas purity, (pv/PV)² u_C0²                          4.0×10^-20     1.1×10^-19     4.5×10^-19     7.6×10^-19     1.7×10^-20     2.5×10^-20     5.3×10^-20
Syringe volume, (C0 p/PV)² u_v²                     7.4×10^-20     7.0×10^-19     3.3×10^-19     2.2×10^-19     7.9×10^-20     8.6×10^-20     9.5×10^-20
  Repetition of measurements (u_v1²)                6.8×10^-20     6.9×10^-19     3.1×10^-19     1.9×10^-19     7.0×10^-20     7.5×10^-20     7.8×10^-20
  Linearity of the balance (u_v2²)                  4.1×10^-21     4.4×10^-21     4.6×10^-21     4.9×10^-21     4.3×10^-21     4.1×10^-21     4.4×10^-21
  Diffusion of pure gas (u_v3²)                     1.6×10^-21     4.4×10^-21     1.8×10^-20     3.0×10^-20     4.3×10^-21     6.2×10^-21     1.3×10^-20
Syringe pressure, (C0 v/PV)² u_p²                   2.7×10^-20     7.4×10^-20     3.0×10^-19     5.2×10^-19     7.3×10^-20     1.1×10^-19     2.2×10^-19
  Barometer repeatability (u_p1²)                   1.2×10^-21     3.3×10^-21     1.4×10^-20     2.3×10^-20     3.2×10^-21     4.8×10^-21     9.9×10^-21
  Over-pressure in the syringe (u_p2²)              2.6×10^-20     7.1×10^-20     2.9×10^-19     4.9×10^-19     6.9×10^-20     1.0×10^-19     2.1×10^-19
Vessel volume, (C0 pv/V²P)² u_V²                    6.1×10^-21     1.7×10^-20     6.9×10^-20     1.2×10^-19     1.7×10^-20     2.4×10^-20     5.1×10^-20
Vessel pressure, (C0 pv/P²V)² u_P²                  5.4×10^-21     1.6×10^-20     6.7×10^-20     1.2×10^-19     1.5×10^-20     2.1×10^-20     4.7×10^-20
  Random effect of the sensor (u_P1²)               3.5×10^-22     1.0×10^-21     4.4×10^-21     8.1×10^-21     1.0×10^-21     1.4×10^-21     3.1×10^-21
  Systematic effect of the sensor (u_P2²)           2.5×10^-21     7.3×10^-21     3.1×10^-20     5.7×10^-20     7.0×10^-21     9.8×10^-21     2.2×10^-20
  Δ pressure due to the room temperature (u_P3²)    2.5×10^-21     7.3×10^-21     3.1×10^-20     5.7×10^-20     7.0×10^-21     9.8×10^-21     2.2×10^-20
Combined standard uncertainty u_c                   3.9×10^-10     9.6×10^-10     1.1×10^-9      1.3×10^-9      4.5×10^-10     5.1×10^-10     6.8×10^-10
Expanded uncertainty U = 2 u_c (ppbv)               0.8            1.9            2.2            2.6            0.9            1.0            1.3
Expanded uncertainty U (%)                          0.56%          0.83%          0.47%          0.43%          0.39%          0.37%          0.34%
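As an illustration of how one column of Table 4 is built, the following Python sketch reworks the budget for NO syringe 1. It assumes Eq. 1 in the form C1 = C0·p·v/(P·V), so that every sensitivity coefficient reduces to C1 divided by the corresponding input. It also assumes that the barometer specification of ± 0.25 mbar is used directly as a standard uncertainty (variance 0.0625 mbar²), a choice made here only because it reproduces the tabulated barometer contribution. All variable names are hypothetical.

```python
import math

# Input estimates for NO syringe 1 (Table 2, experiment 1)
c0 = 0.9975        # purity-corrected concentration of the pure gas
v = 24.76e-6       # syringe volume in litres
p = 993.0          # pressure of the injected gas in mbar
big_v = 111.84     # vessel volume in litres
big_p = 1589.0     # final vessel pressure in mbar

# Variances of the inputs, taken from Eqs. 7-13 and Table 1
variances = {
    "C0": 2.083e-6,
    "v": 2.19e-15 + 1.33e-16 + 5.09e-17,   # l^2: repeatability + linearity + diffusion
    "p": 0.25 ** 2 + 1.333,                # mbar^2: barometer (assumed) + over-pressure
    "V": 4.03e-3,                          # l^2
    "P": 0.047 + 0.333 + 0.333,            # mbar^2: random + systematic + temperature
}

c1 = c0 * p * v / (big_p * big_v)

# For a pure product/quotient model, |dC1/dx| = C1 / x for every input x
inputs = {"C0": c0, "v": v, "p": p, "V": big_v, "P": big_p}
contributions = {
    name: (c1 / inputs[name]) ** 2 * variances[name] for name in inputs
}

u_c = math.sqrt(sum(contributions.values()))   # combined standard uncertainty
expanded = 2 * u_c                             # coverage factor k = 2
expanded_ppbv = expanded * 1e9
expanded_percent = 100 * expanded / c1

for name, value in contributions.items():
    print(f"{name}: {value:.2e}")
print(f"u_c = {u_c:.2e}, U = {expanded_ppbv:.2f} ppbv ({expanded_percent:.2f} %)")
```

Under these assumptions the sketch gives a combined standard uncertainty close to 3.9 × 10^-10 and an expanded uncertainty of roughly 0.8 ppbv, with the syringe volume as the largest single contribution, in line with the first column of Table 4.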

Conclusions

The uncertainty associated with the reference value of NO and SO2 standards derived by ERLAP using the static volumetric method has been evaluated. The value was calculated to be about ± 0.5% with 95% probability of the true value lying within this interval. The major component contributing to uncertainty was the determination of syringe volume, the most important aspects of which were the filling and transferring of gas. This was the case except at high concentrations, when NO gas purity became the most important component of uncertainty.

Table 5 Syringe volume (all values in µl)

NO syringe 1   NO syringe 2   NO syringe 2   NO syringe 3   NO syringe 4   SO2 syringe 1   SO2 syringe 2   SO2 syringe 3
24.71          39.9           40.0           78.49          99.37          39.72           49.22           69.47
24.80          39.6           39.9           78.40          99.30          39.81           49.28           69.52
24.69          39.6           40.0           78.48          99.50          39.67           49.31           69.42
24.80          39.9           40.0           78.60          99.50          39.76           49.24           69.45
24.77          40.0           40.0           78.58          99.45          39.72           49.20           69.44
24.76          39.9           40.1           78.65          99.43          39.66           49.37           69.36
24.74          39.9           39.9           78.69          99.47          39.74           49.23           69.37
24.76          39.8           40.1           78.63          99.44          39.74           49.32           69.39
24.85          39.9           40.0           78.52          99.33          39.68           49.22           69.43
24.73          39.9           39.9           78.56          99.47          39.72           49.27           69.37
24.73          40.2           39.7           78.60          99.55          39.69           49.28           69.47
24.74          39.9                          78.58          99.48          39.81           49.24           69.47
24.74          40.1                          78.65          99.46          39.69           49.27           69.46
24.71          40.0                          78.79          99.36          39.76           49.25           69.44
24.84          40.1                          78.63          99.36          39.76           49.18           69.36
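The replicate weighings in Table 5 are the basis of the Type A term of the syringe volume. As a hedged illustration, the Python sketch below computes the mean and the sample variance of the 15 values listed for NO syringe 1 with the standard `statistics` module. The variable names are illustrative, and the conversion to l² is only there to allow comparison with Table 1.

```python
import statistics

# NO syringe 1, 15 replicate volumes in microlitres (Table 5)
no_syringe_1_ul = [
    24.71, 24.80, 24.69, 24.80, 24.77, 24.76, 24.74, 24.76,
    24.85, 24.73, 24.73, 24.74, 24.74, 24.71, 24.84,
]

n = len(no_syringe_1_ul)
mean_ul = statistics.mean(no_syringe_1_ul)
variance_ul2 = statistics.variance(no_syringe_1_ul)  # sample variance, divisor n - 1
variance_l2 = variance_ul2 * 1e-12                   # (1 microlitre)^2 = 1e-12 l^2
variance_of_mean_l2 = variance_l2 / n                # variance of the mean, as in Eq. 3

print(f"mean = {mean_ul:.3f} microlitres")
print(f"sample variance = {variance_l2:.3e} l^2")
print(f"variance of the mean = {variance_of_mean_l2:.3e} l^2")
```

The sample variance comes out at about 2.19 × 10^-15 l², which suggests that the repeatability term for syringe 1 in Table 1 is the variance of a single measurement rather than the variance of the mean.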

Table 6 Static volumetric experiments with various time intervals before injection

Date      Experiment   Time interval before   p (initial)   P (final)   Theoretical value   SO2 (AF 21M)   (B/A) - 1
                       injection (s)          (mbar)        (mbar)      (A) (ppbv)          (B) (ppbv)     (%)
3/12/96   2            5                      1001          1525        458.6               458.7          0.03
3/12/96   3            5                      1000          1523        458.5               458.7          0.03
3/12/96   4            30                     999           1530        455.4               453.2          -0.49
3/12/96   5            30                     999           1538        453.2               452.2          -0.20
3/12/96   6            30                     999           1530        455.7               455.9          0.05
3/12/96   7            5                      1000          1532        455.5               456.8          0.28
3/12/96   8            30                     1000          1535        454.7               455.9          0.28
4/12/96   9            5                      1002          1529        457.8               456.3          -0.34
4/12/96   10           60                     1002          1530        457.8               454.4          -0.74
4/12/96   11           5                      1002          1570        445.6               446.2          0.14
4/12/96   12           60                     1002          1532        456.3               452.6          -0.81
4/12/96   13           5                      1002          1532        456.3               457.2          0.19
4/12/96   14           60                     1002          1530        456.9               456.3          -0.14
4/12/96   15           60                     1002          1581        442.2               441.6          -0.12
5/12/96   16           5                      1004          1528        458.7               460.8          0.46
5/12/96   17           120                    1004          1537        455.9               449.0          -1.51
5/12/96   18           5                      1003          1528        458.0               457.2          -0.17
5/12/96   19           120                    1001          1521        459.8               448.1          -2.54
5/12/96   20           120                    1001          1528        457.2               447.2          -2.19
5/12/96   21           5                      1001          1526        457.6               456.3          -0.30
6/12/96   22           5                      1004          1532        457.2               456.3          -0.20
6/12/96   23           15                     1004          1573        445.4               446.3          0.18
6/12/96   24           15                     1005          1526        459.8               459.0          -0.17
6/12/96   25           5                      1004          1536        456.9               457.2          0.05
6/12/96   26           15                     1004          1519        461.0               461.8          0.17
6/12/96   27           5                      1004          1525        459.3               460.0          0.13
6/12/96   28           15                     1005          1525        459.8               460.0          0.03
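One way to read Table 6 is to average the relative deviation (B/A) - 1 for each time interval before injection. The Python sketch below does this for the 27 experiments listed. The grouping by interval and all names are illustrative choices and are not taken from the original evaluation.

```python
from collections import defaultdict

# (time interval before injection in s, theoretical value A in ppbv,
#  analyser reading B in ppbv), experiments 2 to 28 of Table 6
rows = [
    (5, 458.6, 458.7), (5, 458.5, 458.7), (30, 455.4, 453.2),
    (30, 453.2, 452.2), (30, 455.7, 455.9), (5, 455.5, 456.8),
    (30, 454.7, 455.9), (5, 457.8, 456.3), (60, 457.8, 454.4),
    (5, 445.6, 446.2), (60, 456.3, 452.6), (5, 456.3, 457.2),
    (60, 456.9, 456.3), (60, 442.2, 441.6), (5, 458.7, 460.8),
    (120, 455.9, 449.0), (5, 458.0, 457.2), (120, 459.8, 448.1),
    (120, 457.2, 447.2), (5, 457.6, 456.3), (5, 457.2, 456.3),
    (15, 445.4, 446.3), (15, 459.8, 459.0), (5, 456.9, 457.2),
    (15, 461.0, 461.8), (5, 459.3, 460.0), (15, 459.8, 460.0),
]

# Collect the relative deviations (in %) for each time interval
deviations = defaultdict(list)
for interval_s, theoretical, measured in rows:
    deviations[interval_s].append(100 * (measured / theoretical - 1))

mean_deviation = {
    interval_s: sum(values) / len(values)
    for interval_s, values in sorted(deviations.items())
}

for interval_s, mean_percent in mean_deviation.items():
    count = len(deviations[interval_s])
    print(f"{interval_s:>3} s: mean deviation {mean_percent:+.2f} % ({count} runs)")
```

With these data the mean deviation stays close to zero for the 5-s injections and becomes clearly negative for the 120-s injections, which is the loss by diffusion discussed in Appendix 2.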

To attain such high levels of confidence on the reference value of NO and SO2 standards, the following should be taken into account:
- The complex procedure for manipulating the dosing syringe must be carefully followed.
- Values specified in the pure gas manufacturer's certificate must be periodically verified.
- The balance used to weigh the syringe and the pressure sensor serving the mixing vessel must be well maintained to ensure accurate and precise measurements. Traceability certificates for these instruments must be available.
- Room temperature must remain constant between injection and dilution with carrier gas.
The uncertainty associated with the reference value of NO2 (obtained by ERLAP using the permeation method) has been evaluated previously [11] as about 1% with 95% confidence limits. This is slightly greater than that of an NO standard prepared using the static volumetric method.

Appendix 1: Volumes dispensed with the syringe

The volumes of the syringes (with ERLAP mechanism) were determined by filling with water and deriving volume from the weight of liquid (measured with a Mettler AT201 balance) contained in the syringe (corrected for water density of 0.998 g/cm3 at 22 °C). The volumes are reported in Table 5.

Appendix 2: SO2 measurements plotted against time interval before injection

In the present experiments, a UV fluorescence analyser manufactured by Environnement SA model AF21M was used. A Hamilton syringe series 1800 with a needle series n° 80451 was used and a volume of 78 µl (v) was injected for all experiments.
Table 6 shows a series of experiments carried out to determine the maximum tolerable time delay before injection (see "Procedure for the preparation of a mixture"). This time interval depends on the diffusion rate of the SO2 through the syringe needle and must be established for the methodology to be viable. The results are presented in Fig. 3, and show that the performance of the methodology is not compromised provided the interval is kept below 30 s.

Fig. 3 Losses of pure SO2 by diffusion through the needle of the syringe (horizontal axis from 0 to 120, vertical axis from -4.0% to 1.0%)

References

1. Directive du Conseil du 15 juillet 1980 concernant des valeurs limites et des valeurs guides de qualité atmosphérique pour l'anhydride sulfureux et les particules en suspension. Journal officiel des Communautés
2. Directive du Conseil du 7 mars 1985 concernant les normes de qualité de l'air pour le dioxyde d'azote (85/203/CEE). Journal officiel des Communautés
3. Norme internationale ISO 6767 (F) (1990) Air ambiant - Détermination de la concentration en masse du dioxyde de soufre - Méthode au tétrachloromercurate (TCM) et à la pararosaniline
4. Norme internationale ISO/DIS 7996 (F)/TC 146 (1984) Qualité de l'air - Détermination des oxydes d'azote dans l'air ambiant - Méthode par chimiluminescence
5. Norme internationale ISO/CD 10498 (F)/TC 146 (1984) Air ambiant - Dosage de soufre - Méthode par fluorescence dans l'ultraviolet
6. Norme internationale ISO 6349 (F) (1979) Analyse des gaz - Préparation des mélanges de gaz pour étalonnage - Méthode par perméation
7. Norme internationale ISO 6144 (F) (1981) Analyse des gaz - Préparation des mélanges de gaz pour étalonnage - Méthode volumétrique statique
8. Verein Deutscher Ingenieure, "Messen von Gasen, Prüfgase - Herstellen von Prüfgasen nach der Volumetrisch-Statischen Methode unter Verwendung von Glasbehältern", VDI 3490 Blatt 14, November 1985
9. Rudolf W, "Implementation of the Static Injection Method", EEC contract n° 4108-90-10-ED ISP D
10. Guide to the expression of uncertainty in measurement, ISBN 92-67-10188-9, copyright International Organisation for Standardisation, printed in Switzerland
11. Gerboles M, Manalis N, De Saeger E, Payrissat M (1996) Report EUR 16432 EN "Study of the long term stability of NO2 Permeation Sources and the efficiency of Gravimetry in determining their permeation rate"
Accred Qual Assur (2000) 5:280-284
© Springer-Verlag 2000

Daniela Kruh

Assessment of uncertainty in calibration of a gas mass flowmeter

Abstract A primary calibration system was set up at Rafael some years ago, based on volumetric flow rate. The primary standard measures volumetric flow by means of the volume change of a dual piston over a specific time interval. This system serves to calibrate secondary standards of the thermal mass flowmeter type. Calibration procedures were prepared and validated. The paper describes the tests and calibration procedure conducted for the uncertainty assessment, the different components contributing to the measurement uncertainties, and the formulas involved with volumetric flow rates and with thermal mass flowmeters.

Key words Volumetric gas flow rate · Mass flowmeter · Calibration · Uncertainty

D. Kruh (✉)
Rafael Calibration Laboratories,
P.O. Box 2250, 31021 Haifa, Israel
e-mail: danielak@rafael.co.il
Tel.: +972-4-8794494
Fax: +972-4-8794218

Introduction

Flowmeters are widely used in analytical chemistry. Examples of chemical methods which use flowmeters are gas chromatography, flow injection analysis, gas analysis, and environmental quality monitoring.

There are a number of methods available for performing calibration, which may be categorized as primary or secondary (transfer) techniques.

Primary calibration system

In a primary system the measurement is based on fundamental units such as length, mass, temperature, and time. The basic flow input for the instrument to be calibrated is determined through measurement of time and gas volume. Our laboratory is equipped with a volumetric system, the MKS Califlow A150 (Andover, Mass., USA), based on two pistons covering the range from 1 standard cubic centimeter per minute (SCCM) to 50 standard liters per minute (SLPM). The distance the piston travels over a time interval is measured. Because the cross-sectional area of the piston is fixed, area times length equals the volume of the cylinder in which the gas is collected. This volume divided by the filling time yields the volume flow rate. Correcting for the absolute pressure and temperature of the gas gives the mass flow rate.

The primary system uses the ideal gas law, Eq. 1, as the basis for determining mass flow rates:

$$P V = n R T \qquad (1)$$

where:
P = absolute pressure (mmHg),
V = volume of gas (l),
n = moles of gas (mol),
R = universal gas constant (62.364 mmHg·l/(K·mol)),
T = absolute temperature (K).

Assuming constant temperature and constant pressure, the flow rate is obtained by measuring the change in volume over time. Methods for measuring the volume displaced by a gas are based on Eq. 2:

$$Q = \frac{V}{t_2 - t_1} \qquad (2)$$

where:
Q = the volumetric flow rate (l/min),
(t2 - t1) = the elapsed time (min).

Generally, one uses Qs, the volumetric flow rate referred to standard conditions of temperature and pressure (STP), which are given as 0 °C and 760 mm of mercury. The subscript "s" denotes standard conditions.

$$Q_s = \frac{P}{P_s} \cdot \frac{T_s}{T} \cdot \frac{V}{t_2 - t_1} \qquad (3)$$
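Eq. 3 follows from Eq. 1 in one step, which is sketched here for completeness. The only assumption is that the amount of gas n is the same at the measurement conditions (P, T, V) and at standard conditions (Ps, Ts, Vs); the symbol Vs, the volume the collected gas would occupy at standard conditions, is introduced here for illustration and does not appear in the original text.

```latex
\begin{align*}
\frac{P V}{T} &= n R = \frac{P_s V_s}{T_s}
  \quad\Longrightarrow\quad
  V_s = V \cdot \frac{P}{P_s} \cdot \frac{T_s}{T}, \\
Q_s &= \frac{V_s}{t_2 - t_1}
     = \frac{P}{P_s} \cdot \frac{T_s}{T} \cdot \frac{V}{t_2 - t_1}.
\end{align*}
```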
Since allowances must be made for changes in temperature and pressure, the parameters of the perfect gas law must apply. The precise calculation of the volumetric flow rate through a piston meter of this type is based on Eq. 4:

$$Q_s = \frac{C \cdot T_s \cdot P_m}{K \cdot t \cdot T_m \cdot P_s} \qquad (4)$$

where:
Qs = the gas flow rate corrected to STP (l/min);
Ts = standard temperature: 273.15 K;
Ps = standard pressure: 760 mmHg;
Tm = the temperature of the test gas in the cylinder (K);
Pm = the pressure of the test gas in the cylinder (barometric pressure + cylinder pressure) (mmHg);
C = total number of counts accumulated by a shaft encoder synchronized to a crystal clock (counts);
t = total time to accumulate C counts (min);
K = number of counts per liter (calculated for each cylinder at the calibration time).

Secondary standards

As secondary standards our laboratory uses thermal mass flowmeters and controllers in the range of 100 SCCM to 500 SLPM.

Mass flowmeters use the thermal properties of a gas to measure flow rate directly [1]. Mass flow rates are determined by measuring the heat required to maintain an elevated temperature profile along a laminar flow sensor tube. For a specific flowmeter range and gas species, flow is proportional to the voltage necessary to maintain a constant temperature profile. The sensor in a mass flowmeter is a long, thin stainless steel tube, often called a capillary tube because of its shape (see Fig. 1).

Fig. 1 Schematic structure of a typical thermal mass flowmeter

Coils wrapped around the midpoint of the capillary tube serve two functions: first as heaters and second as temperature sensors. Since the resistance of the coils varies with temperature, they function as temperature detectors, or resistance temperature detectors (RTDs), which measure the temperature of the gas. The heaters create a known temperature profile along the sensor tube and then maintain the profile during gas flow by means of an autobalancing bridge circuit. As gas flows through the sensor, the gas flow convects heat and the temperature difference is converted into a flow reading.

Thermal mass flowmeters have the capability of giving accurate measurements over a fairly wide range of temperature and pressure without the need to enter pressure or temperature corrections into the calculations. This is because the units are calibrated with reference to standard conditions and are stable from about 15 to 32 °C and from about atmospheric pressure to about 30 psi for the large-volume flowmeters (about 25-50 SLPM). Thermal mass flowmeters are always calibrated for a particular type of gas. All manufacturers list gas conversion factors in their manuals for corrections to be made when measuring different gases. These figures are mostly computed theoretically from the densities, specific heats, and atomic weights of the gases.

Calibration methods

Standard thermal mass flowmeters up to 50 SLPM are calibrated by comparison to the primary system, the Califlow A150; those used for higher ranges are sent abroad for calibration.

The secondary thermal mass flowmeters are used to calibrate all kinds of flow devices in accordance with the required range and uncertainty.

A calibration procedure was written by our calibration laboratory, which describes in detail how the two types of calibrations mentioned above are performed.

Uncertainty calculation

The uncertainty evaluation was done according to the recommendations in the ISO Guide for the Expression of Measurement Uncertainty [2]. This procedure was performed on both previously mentioned calibration methods. According to the Guide, we followed the steps described below.
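Before the uncertainty steps, a minimal Python sketch of the piston-prover relation of Eq. 4 may help fix the units. The function name, the argument names, and the example numbers are illustrative assumptions and are not taken from the paper or from the Califlow documentation.

```python
T_STD_K = 273.15       # standard temperature Ts (0 °C), as defined under Eq. 4
P_STD_MMHG = 760.0     # standard pressure Ps, as defined under Eq. 4


def piston_flow_at_stp(counts, counts_per_liter, time_min, temp_k, pressure_mmhg):
    """Eq. 4: Qs = C*Ts*Pm / (K*t*Tm*Ps), returned in standard l/min."""
    if time_min <= 0 or counts_per_liter <= 0 or temp_k <= 0:
        raise ValueError("time, counts per liter and temperature must be positive")
    return (counts * T_STD_K * pressure_mmhg) / (
        counts_per_liter * time_min * temp_k * P_STD_MMHG
    )


# Hypothetical run: 10 000 encoder counts at 1000 counts/l, collected in 1 min,
# with the gas in the cylinder at 22 °C (295.15 K) and 760 mmHg.
q_s = piston_flow_at_stp(10_000, 1_000, 1.0, 295.15, 760.0)
print(f"Qs = {q_s:.3f} SLPM")
```

In this example the swept volume C/K is 10 l; because the gas is warmer than 0 °C, the flow referred to standard conditions comes out below 10 l/min.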
Calibration vs. Califlow A150

Determination of the measurement model

In our case the physical model is described by Eq. 5:

(5)

Sensitivity coefficients calculation

In order to determine the weight of each uncertainty component, the partial derivatives are calculated for each of them, resulting in 1 for each of them.

Identification of standard uncertainty components

The various components of the uncertainty and their contributions, as required by the ISO Guide for the Expression of Uncertainty [2], appear in Table 1. The drift, temperature measurement and repeatability measurements are the most influential factors.

The sources of uncertainty have to be determined by experiment, or by using figures that are widely accepted, and their weighted contribution has to be considered by means of their sensitivity coefficients.

The product of these two values yields the weighted uncertainty; the weighted uncertainties are then combined (as the root sum of squares, in accordance with the Guide [2]) to give the combined uncertainty. The expanded uncertainty at the 95% confidence level is obtained by multiplying the combined uncertainty by a coverage factor, k, which depends on the effective number of degrees of freedom, ν_eff, determined by the Welch-Satterthwaite formula [2].

In our case most of the components are of Type B and each repeatability test consisted of ten measurements. The computed effective number of degrees of freedom is large enough that k = 2.

The components of the uncertainty computations have been divided into four groups:
1) Reference standard uncertainty:
- Volumetric factors, such as the cylinder area, the length of the stroke, etc.
- Thermal and pressure influence on the measurements and system
- Measuring instrument resolutions and variations during the calibration
- Electronic uncertainty, mainly caused by the encoder and time measurement
2) Unit under test (UUT) uncertainty: measuring instrument calibration uncertainty; the instrument in our case was a Wavetek 1271 DMM.
3) Repeatability measurements, which evaluate the closeness of sequential measurement results of the same parameter under the same experimental conditions.
4) Drift of the readings. The drift was evaluated from the deviations between the readings obtained in different runs, of ten measurements each, at the same flow.

Expanded uncertainties for calibration vs. Califlow A150 of additional flowmeters are presented in Table 2, using the same method as described above.
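For reference, the quantities named in this section can be written out as below. These are the general expressions of the Guide [2] for uncorrelated input quantities, not a reconstruction of Eq. 5; the symbols c_i (sensitivity coefficient), u(x_i) (standard uncertainty) and ν_i (degrees of freedom of component i) are notation assumed here for illustration.

```latex
\begin{align*}
u_c &= \sqrt{\sum_i \bigl(c_i\,u(x_i)\bigr)^2}, \\
\nu_{\mathrm{eff}} &= \frac{u_c^{4}}{\sum_i \dfrac{\bigl(c_i\,u(x_i)\bigr)^4}{\nu_i}}, \\
U &= k\,u_c .
\end{align*}
```

Components with a very large ν_i (commonly assumed for Type B evaluations) contribute almost nothing to the denominator of the second expression, which is why a budget dominated by Type B terms tends to give a large ν_eff and a coverage factor close to 2.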

Table 1 Uncertainty budget for the calibration of a 10 standard liters per minute (SLPM) mass flowmeter vs. the Califlow
(u(x_i): standard uncertainty; c_i: sensitivity coefficient; u(x_i)*c_i: weighted uncertainty)

Source of uncertainty                                       u(x_i) (%)  c_i u(x_i)*c_i (%)

Reference standard uncertainty
Volume measurement collective uncertainty (Type B)          0.0076000   1   0.0076000
Barometric pressure repeatability (Type A)                  0.0009671   1   0.0009671
Barometric pressure resolution (Type B)                     0.0037984   1   0.0037984
Density temperature effect (Type B)                         0.0115470   1   0.0115470
Time reading (Type B)                                       0.0002887   1   0.0002887
Temperature measuring RTD calibration (Type B)              0.0307000   1   0.0307000
Temperature reading resolution (Type B)                     0.0577350   1   0.0577350
Encoder calibration uncertainty (Type B)                    0.0138000   1   0.0138000
Encoder resolution (Type B)                                 0.0048113   1   0.0048113
Nitrogen purity uncertainty (Type B)                        0.0028752   1   0.0028752

System effects uncertainty
Mass flowmeter calibration repeatability (Type A)           0.0770000   1   0.0770000
Voltmeter calibration uncertainty (Type B)                  0.0003600   1   0.0003600
Voltmeter manufacturer uncertainty per year (Type B)        0.0000500   1   0.0000500
Voltmeter reading repeatability (Type A)                    0.0230940   1   0.0230940
Calibrated mass flowmeter zeroing repeatability (Type A)    0.0115470   1   0.0115470
Drift of readings during different runs (Type B)            0.1443376   1   0.1443376

Combined uncertainty of the calibration process (u_c)       0.2% F.S.
Expanded uncertainty of the calibration process (U)         0.4% F.S.
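The budget in Table 1 can be checked numerically. The sketch below assumes uncorrelated components combined as a root sum of squares, 9 degrees of freedom for each Type A component (ten measurements per repeatability test, as stated in the text) and infinite degrees of freedom for Type B components. The last two points, the shortened labels, and all names in the code are assumptions made for illustration, not statements from the paper.

```python
import math

# (label, standard uncertainty in % F.S., evaluation type), values from Table 1.
# All sensitivity coefficients are 1, so weighted = standard uncertainty.
BUDGET = [
    ("Volume measurement", 0.0076000, "B"),
    ("Barometric pressure repeatability", 0.0009671, "A"),
    ("Barometric pressure resolution", 0.0037984, "B"),
    ("Density temperature effect", 0.0115470, "B"),
    ("Time reading", 0.0002887, "B"),
    ("RTD calibration", 0.0307000, "B"),
    ("Temperature reading resolution", 0.0577350, "B"),
    ("Encoder calibration", 0.0138000, "B"),
    ("Encoder resolution", 0.0048113, "B"),
    ("Nitrogen purity", 0.0028752, "B"),
    ("Flowmeter calibration repeatability", 0.0770000, "A"),
    ("Voltmeter calibration", 0.0003600, "B"),
    ("Voltmeter manufacturer spec per year", 0.0000500, "B"),
    ("Voltmeter reading repeatability", 0.0230940, "A"),
    ("Flowmeter zeroing repeatability", 0.0115470, "A"),
    ("Drift between runs", 0.1443376, "B"),
]


def combined_uncertainty(budget):
    """Root sum of squares of the weighted uncertainties."""
    return math.sqrt(sum(u ** 2 for _, u, _ in budget))


def effective_dof(budget, dof_type_a=9):
    """Welch-Satterthwaite estimate; Type B terms are treated as having
    infinite degrees of freedom and so drop out of the denominator."""
    u_c = combined_uncertainty(budget)
    denom = sum(u ** 4 / dof_type_a for _, u, kind in budget if kind == "A")
    return math.inf if denom == 0 else u_c ** 4 / denom


u_c = combined_uncertainty(BUDGET)
nu_eff = effective_dof(BUDGET)
U = 2 * u_c  # coverage factor k = 2, as used in the text

print(f"u_c = {u_c:.4f} % F.S., nu_eff = {nu_eff:.0f}, U = {U:.2f} % F.S.")
```

Under these assumptions the combined uncertainty comes out at about 0.18% F.S. and the expanded uncertainty at about 0.36% F.S., consistent with the rounded 0.2% and 0.4% in Table 1 and with the 10 SLPM entry of Table 2. The effective degrees of freedom are well above 100, which supports the choice of k = 2.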
Table 2 Calculated expanded uncertainties for calibrated flowmeters vs. Califlow A150

Description of flowmeter    Calculated expanded uncertainty (U) (%)
0.5 SLPM                    0.35
5 SLPM                      0.33
10 SLPM                     0.36
30 SLPM                     0.34

Calibration vs. a secondary standard mass flowmeter

In this kind of calibration a precise mass flowmeter previously calibrated by the Califlow A150 serves as a secondary standard. Using the same method for determining the measurement uncertainty as described in the section on Calibration vs. Califlow A150, we proceeded with the following steps:

Determination of the measurement model

In this case the physical model is described by Eq. 6:

(6)

where:
Qm = the gas flow rate measured by the tested flowmeter (SLPM),
QR = the gas flow rate measured by the reference flowmeter (SLPM).

Sensitivity coefficients calculation

Sensitivity coefficients for Qm and QR were calculated and resulted in 1 for each of them.

Identification of standard uncertainty components

Standard uncertainty components are described in Table 3. Both flowmeters were assumed to be calibrated with the same type of gas and were exposed to the same temperature and pressure conditions.

Expanded uncertainties for calibration of flowmeters are presented in Table 4, using the same method as described above. In this case, the reference mass flowmeter, the drift and the repeatability measurements contribute most to the uncertainty.

Table 3 Calibration uncertainty budget for 10 SLPM flowmeter vs. secondary standard flowmeter
(u(x_i): standard uncertainty; c_i: sensitivity coefficient; u(x_i)*c_i: weighted uncertainty)

Source of uncertainty                                       u(x_i) (%)  c_i u(x_i)*c_i (%)

Reference standards uncertainty
Reference mass flowmeter uncertainty (Type B)               0.179257    1   0.179257
Voltmeter calibration uncertainty (Type B)                  0.000360    1   0.000360
Voltmeter manufacturer uncertainty per year (Type B)        0.000050    1   0.000050
Voltmeter reading repeatability (Type A)                    0.023094    1   0.023094
Reference mass flowmeter zeroing repeatability (Type A)     0.011547    1   0.011547

System effects uncertainty
Voltmeter calibration uncertainty (Type B)                  0.000360    1   0.000360
Voltmeter manufacturer uncertainty per year (Type B)        0.000050    1   0.000050
Voltmeter reading repeatability (Type A)                    0.023094    1   0.023094
Calibrated mass flowmeter zeroing repeatability (Type A)    0.011547    1   0.011547
Drift of readings during different runs (Type B)            0.144338    1   0.144338
Mass flowmeter calibration repeatability (Type A)           0.057000    1   0.057000

Combined uncertainty of the calibration process (u_c)       0.24% F.S.
Expanded uncertainty of the calibration process (U)         0.5% F.S.

Table 4 Calculated expanded uncertainties for calibrated flowmeters vs. flowmeters

Description of flowmeter    Calculated expanded uncertainty (U) (%)
0.1 SLPM vs. 0.5 SLPM       0.47
5 SLPM vs. 5 SLPM           0.46
10 SLPM vs. 10 SLPM         0.48
10 SLPM vs. 30 SLPM         0.46

Summary and conclusions

Descriptions are given of the Rafael Calibration Laboratory's facilities for calibrating flowmeters. Methods of operation are given together with the measurement uncertainties obtained using the "ISO Guide for the Expression of Uncertainty" recommendations.

The first method uses a primary standard of the piston prover type, the Califlow A150. The uncertainty estimation in this case is based on the manufacturer's uncertainties, experienced judgment, and uncertainty propagation techniques. Typical values of uncertainty using this method are around 0.3% F.S., with the drift and repeatability measurements as the most significant contributors to the total budget. The most significant contributor to the Califlow A150 uncertainty is the temperature measurement uncertainty, followed by that of the encoder and the volume.

The second method is based on a comparison of the UUT to a secondary standard calibrated by the first method. The uncertainty budget in this case is determined equally by the drift and the reference standard used for the calibration. Typical values obtained in that case are around 0.5%.
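The figure of about 0.5% for the second method can be reproduced from the Table 3 budget. The arithmetic below is a sketch that assumes the components are uncorrelated and combined as a root sum of squares with k = 2; the voltmeter and zeroing terms appear twice because they are listed once for the reference flowmeter and once for the calibrated one.

```latex
\begin{align*}
u_c &= \sqrt{0.179257^2
      + 2\,(0.000360^2 + 0.000050^2 + 0.023094^2 + 0.011547^2)
      + 0.144338^2 + 0.057000^2} \\
    &\approx \sqrt{0.05755} \approx 0.24\ \%\ \text{F.S.}, \\
U &= 2\,u_c \approx 0.48\ \%\ \text{F.S.}
\end{align*}
```

The first term, 0.179257, is the combined uncertainty of the Table 1 budget carried over as the reference flowmeter uncertainty. Together with the drift term it accounts for most of the sum, which matches the statement that these two contributions determine the budget.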

References

1. Hinkle LD, Marino CF (1990) Towards understanding the fundamental mechanism and properties of the thermal mass flow controller. MKS Instruments, Andover, Mass., USA
2. ISO (1995) Guide to the expression of uncertainty in measurement. ISO, Geneva, Switzerland
Accred Qual Assur (1998) 3:231-236
© Springer-Verlag 1998

Paul Willetts · Roger Wood

Measurement uncertainty - a reliable concept in food analysis and for the use of recovery data?

Abstract Steps which are taken to implement the concept of measurement uncertainty in analytical chemical laboratories should take full account of existing internationally agreed protocols for analytical quality assurance and reflect the needs of particular analytical sectors. For the food sector this may mean that for official purposes the use of the term measurement uncertainty is replaced by the term measurement reliability and that a quantitative estimation of this is made based on existing collaborative trial data. In many analytical sectors, the differing strategies currently followed for the determination and use of recovery information are an important cause of the non-comparability of analytical results. Guidelines which are being prepared for the estimation and use of recovery information in analytical measurement may provide a more unified approach which includes measurement uncertainty as a key concept in the use of recovery data.

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29-30 September 1997

P. Willetts · R. Wood (✉)
Food Labelling and Standards Division,
Ministry of Agriculture, Fisheries and Food,
CSL Food Science Laboratory,
Norwich Research Park, Colney,
Norwich NR4 7UQ, UK
Tel.: +44-1603-259350
Fax: +44-1603-501123
e-mail: r.wood@fscii.maff.gov.uk

Introduction

Recent years have seen the issue of the quality and reliability of data become of paramount importance in all analytical sectors. In order to address this matter, analysts from the different analytical sectors have worked together under the sponsorship of ISO, IUPAC and AOAC INTERNATIONAL to produce International Harmonised Protocols on the subjects of the collaborative testing of analytical methods [1], proficiency testing [2] and the use of internal quality control in analytical chemistry laboratories [3]. In addition, the use of certified reference materials is increasingly being advocated with respect to the traceability of analytical data [4], and laboratory accreditation schemes are being widely implemented.

Each of these components of analytical quality assurance concerns a different aspect of data reliability, namely the external testing of methods and laboratory performance, internal data quality, trueness and the auditing of procedures and records. With the exception of the latter, which is administratively based, in each case reliability is limited by either, or in many cases both, systematic or random experimental 'inaccuracies', quantities now being embraced by the concept of measurement uncertainty.

Requirements and initiatives in the food sector

In introducing this concept to the analytical chemical community there is a need to ensure that steps taken to implement measurement uncertainty are made in the context of the existing protocols and strategies in analytical quality assurance. Moreover, the needs and actions of particular analytical sectors must also be recognised. In the food sector a number of initiatives have
been advanced recently which affect specifically the issue of data quality, and therefore reliability, in this area of analysis. Firstly, in the EU, there is a tendency in the food analysis sector to not prescribe specific methods of analysis but to adopt a "criteria of methods approach" whereby analysts may use the method of their choice provided it meets certain prescribed quality criteria. This flexibility of approach, to take advantage of the developments of new techniques and procedures as they occur in analytical chemistry, clearly has consequences for the comparability and measurement uncertainty of reported data. Secondly, there is a requirement in the food sector, as set out in EC Directive 93/99, that methods of analysis for food control purposes should wherever possible be formally validated by collaborative trial [5]. Thirdly, there have been discussions on measurement uncertainty within the Codex Committee on Methods of Analysis and Sampling. The Report of the March 1997 Session of that Committee states that with regard to measurement uncertainty [6]:

1. The Committee will develop for Codex purposes an appropriate alternative term for measurement uncertainty, e.g. measurement reliability.
2. The precision of a method may be estimated through a method-performance study, or where this information is not available, through the use of internal quality control and method validation.
3. Consideration should be given as to whether it is necessary to undertake an additional formal evaluation of a method of analysis using the ISO approach [7] in addition to using information obtained through a collaborative trial.
4. Governments should advise accreditation agencies that for national and Codex purposes the measurement uncertainty result need not be calculated using the ISO approach [7] providing the laboratory is complying with the appropriate Codex principles.

Discussions are on-going in Codex. However, if these proposals are accepted, it is likely that the term 'measurement reliability' rather than measurement uncertainty will be adopted and that estimates of this will be made from collaborative trial data if such data are available. In a recent study, carried out in the UK, which compared 'top-down' (collaborative trial) and 'bottom-up' (ISO) approaches to the estimation of measurement uncertainty, it was concluded that for comparable matrix/analyte combinations these approaches gave not dissimilar results in the limited number of cases studied [8]. It should be noted that, in recognising the importance of the concept of measurement uncertainty in underpinning the reliability of analytical data, the Codex recommendations and discussions are in accordance with statements on uncertainty in ISO Guide 25 and EN 45001, which require accreditation agencies to ensure that measurement uncertainty estimations are carried out as part of the accreditation process [9].

Recovery and analyte losses

One aspect of analytical chemistry where, for all analytical sectors including the food sector, current practice continues to have important consequences in terms of the non-comparability and uncertainty or reliability of reported data, is that of the use of recovery information. This arises because of the different strategies for dealing with recovery assessment and the effect these may have on the variability of the analytical results reported.

Recovery studies are an essential component of quality assurance systems in analytical measurement. Their use, particularly in the trace analyte area, to assess the efficiency of the removal of the measurand from the sample matrix and its transfer prior to detection is widely quoted in the scientific literature. Although they thus provide an important indication of the reliability of these steps in the measurement process, there generally has been no consistent approach to the way in which recovery information is derived and used in analytical data. In particular, in the case of recovery factors calculated and applied to analytical data to correct for displacement or bias, the absence of accepted strategies for the determination and use of these factors has meant that it frequently has been difficult to make comparisons between analytical results produced in different laboratories or verify the suitability of that data for the intended purpose. This is particularly marked in the case of complex matrices, such as foodstuffs, where the difficulties of completely extracting the analyte are most pronounced. Quite commonly in such procedures a substantial proportion of the analyte remains in the matrix after extraction, so that the transfer is incomplete, and the subsequent measurement is lower than the true concentration in the original test material. If no compensation for these losses is made, then markedly discrepant results may be obtained by different laboratories. Even greater discrepancies are likely to arise if some laboratories compensate for losses and others do not. These considerations are especially important in legislative/enforcement situations where for instance the difference between applying or not applying a recovery factor to correct for the incomplete removal of the analyte may mean respectively that a legislative limit is exceeded or that a result is in compliance with the limit.
Recovery correction factors

Thus, where an estimate of the true concentration is required, there is a compelling case for including a compensation for losses in the calculation of the reported analytical result, provided that the correction factor can be estimated reliably. In the case of an empirical method, where the measurand is defined in terms of the method used and no attempt is being made to estimate the amount of analyte actually present in the sample matrix, the question whether or not a correction is applied is a matter for the definition of the empirical method.

The four most common approaches which typically have been taken by analysts in respect of the application of recovery factors are shown in Table 1.

Table 1 Typical approaches relating to the application of recovery factors

a  The reporting of an analytical result without correcting for bias by the application of a recovery factor, no accompanying statement being given of the level of recovery achieved
b  The reporting of an analytical result without correcting for bias by the application of a recovery factor, together with a statement of the level of recovery achieved
c  The reporting of an analytical result corrected for bias by the application of a recovery factor, without an accompanying statement of the level of recovery
d  The reporting of an analytical result corrected for bias by the application of a recovery factor, together with a statement of the level of recovery achieved

Reference materials and spiking experiments

Quite apart from the variation which can arise from laboratories adopting different practices in respect of whether a correction factor is applied or is not applied to an analytical result, a further aspect which can hinder data comparison is the fact that 'recovery' information may be derived either from the inclusion of reference materials or the use of spiked samples.

In the case of reference materials, the analyte is usually integrated or incorporated into the matrix, whereas in the case of spiked samples the analyte is merely added to the matrix. Potentially different information relating to the behaviour of the native analyte to be measured may be derived from each type of recovery measurement. Moreover, the regularity and pattern of use of these recovery materials may affect the recovery information produced. In the case of spiking, for example, the different ways in which the recovery factor may be determined include those shown in Table 2.

Table 2 Examples of ways in which the recovery factor may be determined with spiking

a  Basing a recovery correction factor on the recovery of the analyte from a spiked sample in the batch
b  Basing a recovery correction factor on the mean value obtained for the recovery of the analyte spiked into a sample in each of a number of batches
c  Basing a recovery correction factor on the recovery of a chemically similar internal standard added to the test material
d  Basing a recovery correction factor on the recovery of an isotopic form of the analyte added as internal standard to the test material

Each of these approaches differs in the representativeness it provides of the actual extraction of the analyte itself, the basis of the representation being different in each case. While it is generally agreed that, of these four alternatives, the use of an isotopic internal standard is the preferred approach since the recovery of the auxiliary analyte equates most closely to being 'fully equivalent' to that of the target analyte, this option is often not possible. As a consequence one of the other alternatives is often followed in spiking experiments.

When a reference material is used rather than spiking, then it will be included at a different position in the batch to the test material itself. In this respect the use of a reference material is akin to options a or b for spiking (see Table 2).

Consideration of these different strategies has led analytical chemists to recognise the desirability of using a more uniform approach when dealing with the topic of recovery measurements in order to facilitate the comparability of data.

Guidelines for using recovery information

Following the circulation to a broad cross section of the analytical community world-wide of a questionnaire on the determination and use of recovery measurements in 1995, background information was obtained which enabled further consideration to be given to the role of recovery studies in chemical analysis [10]. The main questions addressed the issues shown in Table 3.

As expected, the differing answers given to the questions posed revealed considerable variation in the ways in which analysts deal with recovery measurements. In particular, the question on measurement uncertainty itself produced more differences than any of the other questions, perhaps suggesting a lack of appreciation of either the need for or the means of calculating this value. The findings of this survey were presented at the
Table 3 Outline of questions included in the recovery factors questionnaire

Question number    Question
1                  Meaning of recovery
2.1                Purpose/use of recovery measurements
2.2                Reporting results
3                  Recovery frequency in time
4                  Recovery frequency in batch
5                  Recovery level
6                  Acceptable recovery levels
7                  Assessment of acceptable recovery
9                  Multi-analyte determinations
10                 Procedure for the determination of recovery
11                 Matrices used
12                 Recovery of analyte and internal standard
13                 Spiking procedure
14                 Blank material
15                 Spiking level
16                 Spiking concentration
17                 Carrier solvent for analyte
20                 Recovery solution
21                 Time of sample preparation
23                 Precision
24                 Measurement uncertainty

Table 4 A summary of guidelines for the use of recovery information

1. A distinction is recognised between:
- surrogate recovery (recovery of a pure compound or element specifically added to the test portion or test material as a spike - sometimes called "marginal recovery")
- the recovery of native analyte incorporated into the test material by natural processes and manufacturing procedures - sometimes called "incurred analyte".
2. It is recognised that there is a dual role for recovery determinations in analytical measurement, that is, (a) for quality control purposes and (b) for deriving values for recovery factors. In the latter application, more extensive and detailed data are required.
3. Variable practice in handling recovery information is an important cause of the non-equivalence of data. To mitigate its effects, in general, results should be corrected for recovery, unless there are overriding reasons for not doing so. Such reasons would include the situation where a limit (statutory or contractual) has been established using uncorrected data, or where recoveries are close to unity.
4. It is of over-riding importance that (a) all data, when reported, should be clearly identified as to whether or not a recovery correction has been applied and (b) if a recovery correction has been applied, the amount of the correction and the method by which it was derived should be included with the report. This will promote direct comparability of data sets. Thus, in all situations, correction functions should be established based on appropriate statistical considerations, documented, archived and available to the client.
5. Recovery values should always be established as part of method validation, whether or not recoveries are reported or results are corrected, so that measured values can be converted to corrected values and vice versa.
Symposium on Harmonisation of Quality Assurance 6. When the use of a recovery factor is justified, the method of
Systems in Chemical Analysis held in Orlando, USA, in calculation should be given in the method.
1996. From the deliberations of that meeting, harmon- 7. IQC control charts for recovery should be established during
ised guidelines for the use of recovery information in method validation and used in all routine analysis. Runs giv-
analytical measurement are being prepared under the ing recovery values outside the control range should be con-
sponsorship of IUPAC, ISO and AOAC INTERNA- sidered for re-analysis in the context of acceptable variation,
or the results should be reported as semi-quantitative.
TIONAL [11]. The guidelines, the main points of which X. Uncertainty is a key concept in formulating an approach to
are summarised in Table 4, refer to uncertainty as being the estimation and use of recovery information. Although
a key concept in formulating an approach to the esti- there are substantive practical points in the estimation of un-
mation and use of recovery information. certainty that remain to be settled, the principle of uncertain-
ty is an invaluable tool in conceptual ising recovery issues.

Uncertainty and recovery correction

Although the estimation of uncertainty in recovery has yet to be studied in detail, the guidelines list some sources of the uncertainty in measured recovery (Table 5) and include a treatment which considers the uncertainty estimation in cases of incomplete recovery where either a correction is or is not applied to an analytical result [12]. In this treatment, the difference in the measured recovery (R) from the value of unity, representing total recovery, is compared to the uncertainty in the determination of R.

Table 5 Sources of uncertainty in recovery estimation

1  Repeatability of the recovery experiment
2  Uncertainties in reference material values
3  Uncertainties in added spike quantity
4  Poor representation of native analyte by the added spike
5  Poor or restricted match between experimental matrix and the full range of sample matrices encountered
6  Effect of analyte/spike level on recovery and imperfect match of spike or reference material analyte level and analyte level in samples

The comparison is made using a significance test to assess whether |R - 1| is greater than the uncertainty (u_R) in the determination of R, at some level of confidence. The significance test takes the form

|R - 1|/u_R > t: R differs significantly from 1
|R - 1|/u_R ≤ t: R does not differ significantly from 1

where t is a critical value based either on a 'coverage factor' allowing for practical significance or, where the test is entirely statistical, t(α/2, n-1), being the relevant value of Student's t for a level of confidence 1-α.
Following this assessment, for a situation where incomplete recovery is achieved, four cases can be distinguished, chiefly differentiated by the use made of the recovery R.
(a) R is not significantly different from 1. No correction is applied.
(b) R is significantly different from 1 and a correction for R is applied.
(c) R is significantly different from 1 but, for operational reasons, no correction for R is applied.
(d) An empirical method is in use. R is arbitrarily regarded as unity and u_R as zero. (Although there is obviously some variation in recovery in repeated or reproduced results, that variation is subsumed in the directly estimated precision of the method.)
In the first case, where R is not significantly different from 1, the recovery can be viewed as being equal to unity, no correction being applied. There is still an uncertainty, u_R, about the recovery that contributes to the overall uncertainty of the analytical result.
In the cases where R is significantly different from 1, the loss of analyte occurring in the analytical procedure is taken into account, and two uncertainties need to be considered separately. First, there are the uncertainties associated only with the determination, namely those due to gravimetric, volumetric, instrumental, and calibration errors. That relative uncertainty u_x/x will be low unless the concentration of the analyte is close to the detection limit. Second, there is the uncertainty u_R on the estimated recovery R. Here the relative uncertainty u_R/R is likely to be somewhat greater. If the raw result is corrected for recovery, we have x_corr = x/R (i.e., the correction factor is 1/R). The relative uncertainty on x_corr is given by

$$\frac{u_{corr}}{x_{corr}} = \sqrt{\left(\frac{u_x}{x}\right)^2 + \left(\frac{u_R}{R}\right)^2}$$

which is necessarily greater than u_x/x and may be considerably greater. Hence correction for recovery seems at first sight to degrade, perhaps substantially, the reliability of the measurement.
It is stated that such a perception is incorrect. Only if the method is regarded as empirical, and this has drawbacks in relation to comparability as already discussed, is u_x the appropriate uncertainty. If the method were taken as rational, and the bias due to loss of analyte were not corrected, a realistic estimate u_x would have to include a term describing the bias. Hence u_x/x would be at least comparable with, and may be even greater than, u_corr/x_corr.
These approaches to the estimation of the uncertainty of a recovery are necessarily tentative. Nevertheless, the following important principles of relevance to the conduct of recovery experiments are demonstrated.
(a) The recovery and its standard uncertainty may both depend on the concentration of the analyte. This may entail studies at several concentration levels.
(b) The main recovery study should involve the whole range of matrices that are included in the category for which the method is being validated. If the category is strict (e.g., bovine liver) a number of different specimens of that type should be studied so as to represent variations likely to be encountered in practice (e.g., sex, age, breed, time of storage etc.). Probably a minimum of ten diverse matrices are required for recovery estimation. The standard deviation of the recovery over these matrices is taken as the main part of the standard uncertainty of the recovery.
(c) If there are grounds to suspect that a proportion of the native analyte is not extracted, then a recovery estimated by a surrogate will be biased. That bias should be estimated and included in the uncertainty budget.
(d) If a method is used outside the matrix scope of its validation, there is a matrix mismatch between the recovery experiments at validation time and the test material at analysis time. This could result in extra uncertainty in the recovery value. There may be problems in estimating this extra uncertainty. It would probably be preferable to estimate the recovery in the new matrix, and its uncertainty, in a separate experiment.
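The significance test on the recovery and the correction x_corr = x/R described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the default critical value t = 2 (a coverage-factor style choice) and the numerical inputs are assumptions made here, not values from the guidelines, and the relative uncertainties are assumed to combine in quadrature as is usual for a quotient of independent quantities.

```python
import math

def recovery_is_significant(R, u_R, t_crit=2.0):
    """True when |R - 1| / u_R exceeds the critical value t.

    t_crit = 2.0 is an assumed coverage-factor style threshold; a Student's t
    value could be passed instead when the test is purely statistical.
    """
    return abs(R - 1.0) / u_R > t_crit

def correct_for_recovery(x, u_x, R, u_R):
    """Return (x_corr, u_corr) for x_corr = x / R.

    Relative uncertainties of x and R are combined in quadrature,
    assuming they are independent.
    """
    x_corr = x / R
    rel_u = math.sqrt((u_x / x) ** 2 + (u_R / R) ** 2)
    return x_corr, rel_u * x_corr

# Hypothetical numbers, for illustration only.
x, u_x = 8.0, 0.2      # raw result and its standard uncertainty
R, u_R = 0.80, 0.04    # measured recovery and its standard uncertainty

significant = recovery_is_significant(R, u_R)
x_corr, u_corr = correct_for_recovery(x, u_x, R, u_R)
print(significant, round(x_corr, 2), round(u_corr, 3))
```

With these assumed inputs the relative uncertainty rises from 2.5% on the raw result to about 5.6% on the corrected result, which illustrates why correction appears at first sight to degrade the measurement.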

References

1. Horwitz W (1988) Pure Appl Chem 60:855-864
2. Thompson M, Wood R (1993) Pure Appl Chem 65:2123-2144 [Also published in J AOAC Int (1993) 76:926-940]
3. Thompson M, Wood R (1995) Pure Appl Chem 67:649-666
4. Thompson M (1996) Analyst 121:285-288
5. Official Journal of the European Communities (1993) L290/14 Council Directive 93/99/EEC
6. Codex Alimentarius Commission (1997) Codex Committee on Methods of Analysis and Sampling 21st Session
7. ISO (1993) Guide to the Expression of Uncertainty in Measurement, Geneva
8. Brereton P, Anderson S, Willetts P, Ellison S, Barwick V, Thompson M, Wood R (1997) CSL Report FD 90/103
9. Draft - ISO/IEC Guide 25 (1996) General requirements for the competence of testing and calibration laboratories
10. Willetts P, Anderson S, Wood R (1998) CSL Report FD 97/65
11. International Union of Pure and Applied Chemistry (1997) Draft Harmonised Guidelines for the use of Recovery Information in Analytical Measurement
12. Ellison SLR, Williams A (1996) In: Parkany M (ed) Proceedings of the Seventh International Symposium on the Harmonisation of Quality Assurance Systems in Chemical Analysis. Royal Society of Chemistry, London
Accred Qual Assur (1998) 3:127-130
© Springer-Verlag 1998

Andre Henrion

In- and off-laboratory sources of uncertainty in the use of a serum standard reference material as a means of accuracy control in cholesterol determination

Abstract Repeated subsampling or a hierarchical design of experiments combined with an analysis of variance (ANOVA) is demonstrated to be a useful tool in the determination of uncertainty components in amount-of-substance measurements. With the reference material of human serum as investigated here for total cholesterol, besides several in-laboratory sources of uncertainty, a vial-to-vial effect which can be regarded as an off-laboratory source was found to be significant. This knowledge might be essential when the material is used for calibration and for the self-assessment of a laboratory.

Key words Experimental design · Analysis of variance (ANOVA) · Determination of uncertainties · Human serum · Cholesterol · Standard reference material

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29-30 September 1997

A. Henrion (✉)
Physikalisch-Technische Bundesanstalt, Bundesallee 100, D-38116 Braunschweig, Germany
Tel.: +495315923321; Fax: +495315923015; e-mail: andre.henrion@ptb.de

Introduction

For an amount-of-substance measurement to be regarded as under control, it is necessary to be able to clearly state its uncertainty. Under many circumstances, knowledge of particular uncertainty components to be assigned to particular steps of the whole procedure would also be of valuable assistance. This would allow future work to be focussed on the points requiring improvement. A carefully designed series of repeated measurements along with the evaluation of the results by analysis of variance (ANOVA) offers a powerful tool for obtaining this information. This will be demonstrated by the example of the determination of total cholesterol in a human serum reference material.

Material and method

The material investigated was the NIST Standard Reference Material 1952a [1]. This is a freeze-dried serum certified for its concentration of cholesterol.
The amount-of-substance measurement would comprise the following steps:
• Reconstitution of the serum by addition of water
• Separation of the analyte from the matrix (saponification of cholesterol fatty acid esters, extraction into hexane)
• Derivatization of the cholesterol (for preparation of gas chromatography)
• Quantification by gas chromatography/mass spectrometry (GC/MS)
Details of these manipulations are given elsewhere [2].
An experiment was set up according to the strategy of repeated subsampling, also called a nested or hierarchical design. This is sketched in Fig. 1. Two vials had been drawn by the manufacturer as samples out of each of three different serum pools. After reconstitution, the contents of each of the vials were subdivided into three aliquots. Then, after addition of an internal standard (spike), the cholesterol was separated from the matrix and extracted for each aliquot independently of the others. Subsamples of the extracts were derivatized on three different occasions to obtain three solutions ready for analysis, each of which was finally subjected to three different GC/MS runs.
It should be noted that it would also have been possible to carry out this survey with two subsamples on every tier, instead of three. This often might be more advantageous because of the cost reduction involved, though, of course, degrees of freedom in the ANOVA would be sacrificed.
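The trade-off between cost and degrees of freedom in the nested design described above can be illustrated with a short Python sketch. The helper name and its argument layout are assumptions introduced here for illustration; the subsample counts (three pools, two vials per pool, then three aliquots, three derivatizations and three GC/MS runs) are those given in the text.

```python
def nested_degrees_of_freedom(n_pools, per_tier):
    """Degrees of freedom for each tier of a balanced nested design.

    per_tier lists the number of subsamples drawn on each tier below the
    pools, e.g. [vials, aliquots, derivatizations, GC/MS runs].
    Returns (list of degrees of freedom, total number of measurements).
    """
    dfs = []
    groups = n_pools
    for n in per_tier:
        dfs.append(groups * (n - 1))  # each group contributes n - 1
        groups *= n                   # number of groups on the next tier
    return dfs, groups

# Design as described in the text.
df_three, runs_three = nested_degrees_of_freedom(3, [2, 3, 3, 3])
# Cheaper alternative with two subsamples on every tier.
df_two, runs_two = nested_degrees_of_freedom(3, [2, 2, 2, 2])

print(df_three, runs_three)
print(df_two, runs_two)
```

Under these assumptions the design with three subsamples per tier needs 162 GC/MS runs, against 48 for the design with two, at the price of fewer degrees of freedom on the lower tiers.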
Fig. 1 Sampling scheme and sources of variation attributable to the tiers
[Figure: nested sampling tree for pools A, B and C. Sources of variation and CV per tier: heterogeneity of material, filling into vials, freeze-drying and reconstitution, 1.00%; separation from matrix and extraction of cholesterol, 0.76%; derivatization, 0.35%; instrument reproducibility, 0.48%. The figure also marks which sources lie with the supplier of the material and which with the laboratory of the user.]

For particulars of experimental designs and ANOVA, the reader not familiar with them is referred to the multitude of textbooks on this subject, e.g. Anderson and Bancroft [3].

Results

The direct results of the ANOVA are compiled in Table 1. Since all of the F ratios exceed the tabulated critical value corresponding to 5% risk of error, all of the mean squares can reasonably be assumed to represent significant contributions to the total sum of squares apart from the sole GC/MS sum of squares. Otherwise, the model would have to be recalculated, omitting the insignificant source(s).

Table 1 ANOVA of the amount-of-cholesterol data

Source of variation   DF(a)   SS(b)    MS(c)    F ratio
Inter vials           3       99.067   33.022   147.595
Inter extr.           12      70.068   5.839    26.098
Inter deriv.          36      21.296   0.592    2.644
Inter GC/MS runs      108     24.164   0.224

(a) Degrees of freedom
(b) Sum of squares
(c) Mean square

The mean squares given in Table 1 do not yet represent the variances attributable to the individual sources, as each of these is actually a weighted sum of contributions of all sources below it in the hierarchy. This becomes evident from the calculated expected mean squares derived from theory for this nested model [3]. These are given in Table 2. The coefficients reflect the numbers of subsamples drawn on each of the tiers and would change when other sampling schemes were used.
Equating the mean squares in Table 1 with the corresponding expected values furnishes the sought-for estimates of variances of the individual sources. Their square roots, which can be interpreted in terms of uncertainties caused by them, are compiled in the third column of Table 2. These figures at the same time represent the coefficients of variation (CV), since, in this example, prior to the ANOVA, all data had been standardized to yield mean values of 100 for each of the pools.
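Equating observed and expected mean squares, as described above, amounts to a small back-substitution from the lowest tier upwards. The Python sketch below is illustrative only: it assumes the mean squares of Table 1 read as 33.022, 5.839, 0.592 and 0.224, and the coefficients 3, 3·3 and 3·3·3 that follow from three subsamples on each tier below the vials. The variable names are introduced here and are not the author's.

```python
import math

# Mean squares as read from Table 1 (data standardized to a pool mean of 100).
ms = {"vials": 33.022, "extr": 5.839, "deriv": 0.592, "gcms": 0.224}

# Back-substitution through the expected mean squares of the nested model:
#   E(MS_gcms)  = var_gcms
#   E(MS_deriv) = var_gcms + 3 var_deriv
#   E(MS_extr)  = var_gcms + 3 var_deriv + 9 var_extr
#   E(MS_vials) = var_gcms + 3 var_deriv + 9 var_extr + 27 var_vials
var = {
    "gcms": ms["gcms"],
    "deriv": (ms["deriv"] - ms["gcms"]) / 3,
    "extr": (ms["extr"] - ms["deriv"]) / 9,
    "vials": (ms["vials"] - ms["extr"]) / 27,
}

# Because the data were scaled to a mean of 100, standard deviations are CVs in %.
cv = {source: math.sqrt(v) for source, v in var.items()}
cv_total = math.sqrt(sum(var.values()))
cv_in_lab = math.sqrt(var["extr"] + var["deriv"] + var["gcms"])

for source, value in cv.items():
    print(f"{source}: {value:.2f} %")
print(f"total: {cv_total:.2f} %, in-laboratory: {cv_in_lab:.2f} %")
```

Small differences from the tabulated CVs can be expected because the inputs here are the rounded mean squares.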
Table 2 Expected mean squares and derived coefficients of variation for the amount-of-cholesterol data (σ: standard deviations of the individual sources)

Source of variation   E(MS)(a)                                                       CV(b) in %
Inter vials           σ²(GC/MS) + 3 σ²(derivs.) + 3·3 σ²(extr.) + 3·3·3 σ²(vials)    1.00
Inter extr.           σ²(GC/MS) + 3 σ²(derivs.) + 3·3 σ²(extr.)                      0.76
Inter deriv.          σ²(GC/MS) + 3 σ²(derivs.)                                      0.35
Inter GC/MS runs      σ²(GC/MS)                                                      0.48

(a) Expectation of mean square
(b) Coefficient of variation

A comparison of the mean concentrations found for the pools with the data stated in the manufacturer's certificate is given in Table 3.

Table 3 Cholesterol concentrations [in µmol/g (dry mass)] certified and found for SRM 1952a

Pool   Certified(a)   Found   Diff. in %
A      41.66          41.77   +0.25
B      63.85          64.76   +1.42
C      86.72          85.50   -1.40

(a) Calc. from the data given in the certificate

Discussion

The CVs (Table 2) can be combined to obtain an estimate of the CV of a single measurement:

$$CV = \left(CV_{vials}^2 + CV_{extr.}^2 + CV_{deriv.}^2 + CV_{GC/MS}^2\right)^{1/2} = 1.39\%$$

This is characteristic of the one-time drawing of a random vial out of a pool, separating from the matrix and extracting the cholesterol, derivatizing it and finally detecting the concentration by GC/MS. It is, of course, equivalent to the CV calculated with the whole measurement repeated several times in a straightforward way (i.e. formally with only one element on each of the tiers).
Knowledge of the individual variance contributions allows a further interpretation. For illustration, see the right-hand part of Fig. 1. Instrument reproducibility (GC/MS) is not a critical source of uncertainty, as possibly could have been assumed, and neither is the derivatization step. Among the manipulations in the laboratory of the user of the material, the separation from the matrix and the extraction into solvent might be reviewed for improvement. The in-laboratory CVs can be combined to form a joint CV of about 1.0%.
In addition to this, a vial-to-vial effect is observed. It is significant though it was characterized with only three degrees of freedom (see Table 1). The corresponding CV is about 1.0%. It can be discussed in terms of heterogeneity of the material, reproducibilities of filling into vials, freeze-drying and reconstitution prior to use. The author believes the last source to be negligible in magnitude. If so, the vial-to-vial CV can be regarded as a plain off-laboratory contribution to the overall CV.
Knowledge of the off-laboratory CV will be important if the material is intended to be used for self-assessment or evaluation of the performance of a laboratory to be accredited. Then

$$U_{vials} = \frac{t_{p,df} \cdot CV_{vials}}{\sqrt{n_{vials}}}$$

U_vials being the expanded uncertainty [4] attributable to the off-laboratory source and t_p,df the percentage points of the t distribution. This, for instance, is about 9% if only two vials are available as in the present case, but would not have been more than 1% if six vials out of each pool had been under investigation.
Considering that

$$U_{total}^2 = \frac{(t_{p,df} \cdot CV_{vials})^2}{n_{vials}} + \frac{(t_{p,df} \cdot CV_{in\text{-}lab.})^2}{n_{in\text{-}lab.}}$$

one can derive the minimum number of vials that would be needed if a laboratory was to be tested for its capability of determining the concentration for the pool with a given uncertainty U_in-lab.
In the example discussed here, U_vials, as already mentioned, is estimated to be as high as 9%. However, the mean concentrations of cholesterol found for the pools happen to be pretty close to those certified (see Table 3). Therefore, at this level of uncertainty, there is no reason to suspect that the results are biased.

Conclusions

A carefully designed survey combined with ANOVA is a powerful tool for providing the experimenter with knowledge of particular components of the total variance. In many instances, as in the example presented here, no other way of detecting them is conceivable. The variance components furnish valuable information as to what steps of the whole procedure need to be checked for possible improvement. These at the same time would be the steps which perhaps would require further subsampling and averaging of the results in order to keep the uncertainty of the final result low.
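The dependence of the off-laboratory expanded uncertainty on the number of vials, discussed above, can be sketched numerically. This is a hedged illustration: it assumes that the expanded uncertainty is t · CV / sqrt(n), that the degrees of freedom are n - 1, and that a two-sided 95% level is meant. The t values are standard table entries and the function name is an assumption introduced here.

```python
import math

# Two-sided 95 % points of Student's t for the degrees of freedom used below
# (standard table values).
T_95 = {1: 12.706, 5: 2.571}

def expanded_uncertainty(cv, n, t):
    """Expanded uncertainty (same units as cv) of a mean over n units."""
    return t * cv / math.sqrt(n)

cv_vials = 1.0  # vial-to-vial CV in %, from the ANOVA

u_two_vials = expanded_uncertainty(cv_vials, 2, T_95[1])  # df = 2 - 1
u_six_vials = expanded_uncertainty(cv_vials, 6, T_95[5])  # df = 6 - 1

print(f"two vials: {u_two_vials:.1f} %")
print(f"six vials: {u_six_vials:.2f} %")
```

Under these assumptions the large value for two vials is driven almost entirely by the t factor for a single degree of freedom, rather than by the vial-to-vial CV itself.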
References

1. Certificate of Analysis for SRM 1952a, National Institute of Standards & Technology, Gaithersburg, MD 20899, January 8, 1990
2. Henrion A, Dube G, Richter W (1997) Fresenius J Anal Chem 358:506-508
3. Anderson RL, Bancroft TA (1952) Statistical Theory in Research. McGraw-Hill Book Company, New York
4. Guide to the Expression of Uncertainty in Measurement (1st edn) (1993) International Organization for Standardization, ISBN 92-67-10188-9
Accred Qual Assur (1999) 4:124-128
© Springer-Verlag 1999

Ilya Kuselman
Felix Sherman

Assessment of limits of detection and quantitation using calculation of uncertainty in a new method for water determination

Abstract An approach to the assessment of the limit of detection and the limit of quantitation using uncertainty calculation is discussed. The approach is based on the known evaluation of the limits of detection and quantitation as concentrations of the analyte equal to three and ten standard deviations of the blank response, respectively. It is shown that these values can be calculated as the analyte concentrations, for which relative expanded uncertainty achieves 66% and 20% of possible results of the analyte determination, correspondingly. For example, the calculation is performed for the validation of a new method for water determination in the presence of ene-diols or thiols, developed for analysis of chemical products, drugs or other materials which are unsuitable for direct Karl Fischer titration. A good conformity between calculated values and experimental validation data is observed.

Key words Limit of detection · Limit of quantitation · Uncertainty of measurements · Analytical method validation · Water determination · Karl Fischer titration

I. Kuselman (✉) · F. Sherman
The National Physical Laboratory of Israel, Givat Ram, Jerusalem 91904, Israel
Tel.: +972-2-6536-534
Fax: +972-2-6520-797
e-mail: kuselman@netvision.net.il

Introduction

Validation of analytical methods used for the enforcement of regulations, in particular in the pharmaceutical industry, has become obligatory in the last few years [1]. Obviously, different methods should be validated using different validation parameters. For example, Category I methods for the quantitation of major components of bulk drug substances or active ingredients in finished pharmaceutical products do not require evaluation of the limit of detection (LOD) and the limit of quantitation (LOQ). In contrast, validation of methods from Category II for the determination of impurities in bulk drug substances or degradation compounds in finished products is incomplete without these parameters [2, pp. 1982-1984]. Experimental design for LOD and LOQ assessment is complicated and expensive [3].
On the other hand, quantifying uncertainty in analytical measurement is not included in the list of validation parameters [1, 2] but is required during laboratory accreditation according to the guides of the International Organization for Standardization (ISO) [4], CITAC [5], EURACHEM [6] and other regulations. Uncertainty values and values of LOD and LOQ describing the same analytical method are interdependent [7-9]. When this interdependence is described in an analytical (mathematical) form, the design of the experiment for validation can be based on the uncertainty calculation and prediction of the parameters.
For example, Karl Fischer titration of water in solid, liquid or gaseous samples is a routine analytical method used today for quality control of many products. The method is relevant to ISO 9001-9003, Good Manufacturing Practice, Good Laboratory Practice, and Food and Drug Administration (FDA) guidelines [10]. Therefore, different aspects of the measurement uncertainty in this method were studied in a number of publications [10-13]. In particular, the work concerning the analysis of the uncertainty budget for Ca(OH)2 and Mg(OH)2 determination in CaO and MgO is interesting [14, 15]. In this method the Karl Fischer titration of water produced by the analytes at high temperatures is the second, final step of the determination.
One of the most important sources of uncertainty is the presence of materials in the sample (components of the matrix) other than water that react with the Karl Fischer reagent (KFR). If the sample includes such components in significant concentrations, direct Karl Fischer titration will be impossible. This is a problem, for example, for ene-diols (such as ascorbic acid and its preparations) and thiols which are used in the pharmaceutical, perfumery, food and other industries.
We have developed a new method for simultaneous determination of water and ene-diols or thiols in samples unsuitable for direct Karl Fischer titration [16]. The method is based on the consecutive titration first of ene-diol or thiol against a novel reagent [17] and then of water against a conventional KFR in the same test portion and the same cell for electrometric location of the end-point in both titrations. For ene-diol or thiol the method is classified as a Category I method and for water a Category II method. Therefore LOD and LOQ evaluation is only necessary for water.
In the present paper, the dependence of LOD and LOQ on uncertainty in analytical measurement is discussed and used in the design of an experiment for the assessment of LOD and LOQ in the new method for water determination.

Dependence of LOD and LOQ on uncertainty in analytical measurement

LOD is the lowest concentration of an analyte in a sample that can be detected, but not necessarily quantified, by the analytical method [2, p. 1983]. Wegscheider has shown [9] that defining this concentration C as corresponding to three standard deviations of the blank response means that the relative standard measurement uncertainty (relative standard deviation in the concentration domain) is u(C)/C = 33% at C = LOD. Hence, using the coverage factor 2, the relative expanded uncertainty at C = LOD is U(C)/C = 2u(C)/C = 66% and vice versa:

$$LOD = (100/66)\, U(C) = 1.5\, U(C), \tag{1}$$

where U(C) is the expanded uncertainty (absolute value).
The analogous dependence is also valid for LOQ. By definition, LOQ is the lowest concentration of an analyte in a sample which can be determined by the analytical method with acceptable precision and accuracy; LOQ is usually equal to ten standard deviations of the blank response [2, p. 1983]. In other words, the relative standard measurement uncertainty is u(C)/C = 10% at C = LOQ. Therefore, the relative expanded uncertainty with the coverage factor 2 at the LOQ is U(C)/C = 20% and

$$LOQ = (100/20)\, U(C) = 5\, U(C). \tag{2}$$

Since uncertainty in analytical measurement can be calculated with "pen and paper only" [18], Eqs. (1) and (2) also allow one to calculate or predict LOD and LOQ before an experiment, as is shown below for water determination in the presence of ene-diols or thiols.

Uncertainty in water determination

The analytical procedure begins with the titration of the sample test portion against the novel reagent [17]. This reagent includes iodine in non-aqueous solvents, which oxidizes ene-diol or thiol to diketones or sulphide derivatives, respectively, which do not interfere with the next titration of water by KFR.
After the first titration (assay) the total water content in the flask consists of the original amount of water in the test portion and that introduced with the novel reagent during titration. This total water content is titrated against KFR (second titration).
The original water content in the sample (C_w, % mass) is calculated from the equation:

$$C_w = \left[V_{KFR} - (F \times V_{r-a})\right] \times T_{KFR} \times 100 / m, \tag{3}$$

where m is the mass of the sample test portion in mg; V_KFR is the volume of the KFR spent for titration of the solution formed after the first titration in ml; T_KFR is the titre of the KFR in mg H2O/ml. Then

$$T_{KFR} = m^{T}_{H_2O} / V^{T}_{KFR}, \tag{4}$$

where m^T_H2O is the mass of water test portion used for determination of the KFR titre in mg; V^T_KFR is the volume of the KFR spent for the titration of m^T_H2O in ml; V_r-a is the volume of the reagent spent for titration of the sample test portion in ml, and F is the factor that corresponds to the volume of the KFR in ml spent for titration of water traces in 1 ml of the novel reagent. The F value is calculated from two consecutive titrations of one and the same dry SnCl2 sample by the novel reagent and then by the KFR [16]:

$$F = V^{0}_{KFR} / V^{0}_{r}, \tag{5}$$

where V^0_r is the volume of the novel reagent spent for the first SnCl2 titration in ml, and V^0_KFR is the volume
in ml of the KFR spent for the second titration of the solution formed after the first titration.
Masses of the sample test portion (m ≈ 50 mg) and even of the water test portion for determination of the KFR titre (m^T_H2O ≈ 5 mg) are measured with negligible uncertainty (for example, with Mettler AT 201 balance it is 0.015 mg) in comparison to the uncertainty of volumes [6, 8]. Therefore, it is desirable to transform Eqs. (3-5) in the following way:

$$C_w = \left[V_{KFR} - \left(V_{r-a} \times V^{0}_{KFR} / V^{0}_{r}\right)\right] \times \left[\left(m^{T}_{H_2O} \times 100 / m\right) / V^{T}_{KFR}\right], \tag{6}$$

where (m^T_H2O × 100/m) = K is a negligible source of uncertainty (designated as K for convenience).
The standard combined uncertainty of the water determination can be evaluated by the partial differentiation of Eq. (6) taking into account the value K:

$$u(C_w) = \Big\{ \left(K / V^{T}_{KFR}\right)^2 \times \Big[ \left(u(V_{KFR}) / V_{KFR}\right)^2 + \left(V_{r-a} / V^{0}_{r}\right)^2 \times \left(u(V^{0}_{KFR}) / V^{0}_{KFR}\right)^2 + \left(V^{0}_{KFR} / V^{0}_{r}\right)^2 \times \left(u(V_{r-a}) / V_{r-a}\right)^2 + \left(V_{r-a} \times V^{0}_{KFR} / (V^{0}_{r})^2\right)^2 \times \left(u(V^{0}_{r}) / V^{0}_{r}\right)^2 \Big] + \Big[ \left(V_{KFR} - \left(V_{r-a} \times V^{0}_{KFR} / V^{0}_{r}\right)\right) \times K / (V^{T}_{KFR})^2 \Big]^2 \times \left(u(V^{T}_{KFR}) / V^{T}_{KFR}\right)^2 \Big\}^{1/2}. \tag{7}$$

All the volumes were measured with a 10-ml burette graduated in 0.02 ml divisions (Bein Z.M., Israel). The manufacturer specifies a calibration accuracy of ±0.02 ml, which can be converted using rectangular distribution to a standard deviation u_c(V) = 0.02/√3 = 0.012 ml. The standard deviation of the burette filling obtained was equal to u_r(V) = 0.013 ml. Since the volumes are spent for titrations, they also depend on the standard deviation of end-point detection. Bipotentiometric location of the end-point by the direct dead-stop technique is very precise [19]. Therefore, in both titrations the main deviation in end-point detection arises due to the drop size of the burette. In our case the drop is 0.013 ml and the corresponding standard deviation is u_d(V) = 0.013/√3 = 0.0075 ml. The temperature uncertainty source is negligible here, therefore the standard uncertainty of each volume spent for titration using the burettes described above is u(V) = [(u_c(V))² + (u_r(V))² + (u_d(V))²]^1/2 = 0.019 ml. Volumes V_KFR, V_r-a, V^0_r and V^T_KFR are equal to approximately 5 ml, while the volume V^0_KFR is 1 ml. Therefore the relative standard deviations are different: u(V_KFR)/V_KFR = u(V_r-a)/V_r-a = u(V^0_r)/V^0_r = u(V^T_KFR)/V^T_KFR = 0.0038, whereas u(V^0_KFR)/V^0_KFR = 0.019. By substituting these values into Eq. (7) one can calculate the combined uncertainty u(C_w) = 0.037. The expanded combined uncertainty in the water determination with the coverage factor 2 is U(C_w) = 0.074% mass, and its corresponding relative value is U(C_w)/C_w = 0.074/C_w or 7.4/C_w % rel. Calculated values of the relative combined uncertainty for water concentrations in the range 0.1-1.0% mass are shown in Fig. 1.

LOD and LOQ prediction and design of the experiment

According to Eqs. (1) and (2) and the calculated value U(C_w), the predicted values of LOD and LOQ for the water determination are as follows:

$$LOD_{wp} = 1.5\, U(C_w) = 1.5 \times 0.074 = 0.11\%\ \text{mass} \tag{8}$$

and

$$LOQ_{wp} = 5\, U(C_w) = 5 \times 0.074 = 0.37\%\ \text{mass}. \tag{9}$$

Corresponding values of the relative combined uncertainty (66% rel. and 20% rel.) are indicated in Fig. 1 by empty circles.
Based on this prediction, the experiment for evaluation of LOD and LOQ was designed to analyse a sample with a water concentration close to LOD_wp and two samples with water concentrations, one a little less and one a little more than LOQ_wp. A purchased ascorbic acid powder containing 0.15% mass of water, purchased α-monothioglycerol with 0.24% mass of water and a fortified sample of α-monothioglycerol with 0.53% mass of water were used for this purpose.

Experimental data

The "true" values of water concentration, C_tr, in these samples, shown in Table 1, were obtained as a difference between two titrations of two independent test

Fig. 1 Dependence of relative expanded uncertainty in water determination (% rel., vertical axis, 0-100) on the water concentration in a sample (C_w, % mass, horizontal axis, 0.0-1.0). Empty circles correspond to predicted LOD_wp and LOQ_wp, full circles and bars to experimental data
Assessment oflimits of detection and quantitation using calculation of uncertainty in a new method for water determination 255

Table 1 Results of the water Sample Cav ,


Ct " S" Sh,
determinations % mass % mass parts of 1 parts of 1

Ascorbic acid (purchased) 0.15 0.18 0.630 (J.167


a-monothioglycerol (purchased) 0.24 0.25 0.215 0.042
a-monothioglycerol (fortified) 0.53 0.55 0.064 0.036

portions (each with ten replicates). First, by the Karl clear also that the model of uncertainty given in Eq. (7)
Fischer method and then by the pharmacopoeial meth- is not absolutely complete, but predicted parameters of
od for the determination of ascorbic acid [2, p. 131] or the method correspond rather well to the experimental
of a-monothioglycerol [2, p. 2271]. The average of 20 data. Maybe U(Cw)/Cw values calculated for concentra-
replicate water determinations by the new method, Cay; tions close to LOQwp and above, are excessively pes-
the corresponding relative standard deviation of repli- simistic (high). However, it is obvious that the experi-
cates, So and relative bias Sh = (Cay - CIT)/Cay are shown mental data (even obtained from 20 replicates) have a
in Table l. wide uncertainty range: sample statistical values can be
The relative standard uncertainty of the replicates significantly different from the population ones.
ucxp (Cw)/Cw= lOO(ST 2+ Sh 2) 112 is shown (% reI.) for val- Because the whole study was performed within the
ues Cw= CIT in Fig. 1 by full circles. Relative expanded framework of the new method validation according to
uncertainty in the uncertainty ucxp(Cw)/Cw by [18] for the AOAC Peer-Verified Methods Program [1] which
the level of confidence 0.95 and 20-1 =19 degrees of defines LOD and LOQ as experimental values, finally
freedom is U cxp =2.09x100/(2x19)1/2=39% reI., LOD w=0.2% mass and LOQw=0.5% mass were ac-
where 2.09 is the corresponding two-tailed percentile of cepted. These values are sufficient for the purposes of
the Student's t-test distribution. The U cxp values (39% the method. It has been adopted as an AOAC Peer-
reI. of ucxp (Cw)/Cw) are shown in Fig. 1 by the bars to Verified Method with the assigned number PVM
the full circles. 1: 1998.

Discussion Conclusions
From Fig. lone can see, that the calculated relative un- The approach to the assessment of the limits of detec-
certainty U(Cw)/Cw based on Eq. (7), being a hyperbo- tion and quantitation using uncertainty calculation can
la, depends on each 0.01 % mass of water concentration be helpful for the prediction of the former and experi-
at Cw close to LODwp' At Cw<LODwp the values of mental design.
U(Cw)/C w quickly tend to infinity. On the other hand, The calculation performed for the new method of
for Cw>LOQwp the relative uncertainty asymptotically water determination in samples which are unsuitable
draws nearer to zero. for direct Karl Fischer titration is in good conformity
Note that another definition of LOD and LOQ (for with the experimental validation data.
example, with requirements to both Type I and Type II
errors in decisions [20-22]) leads to another correlation Acknowledgements The authors thank Prof. E. Schoenberger
between these parameters and the uncertainty. It is for helpful discussions.
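The burette volume budget and the comparison of predicted and experimental relative uncertainties reported in this study can be retraced with a short sketch. It uses only numbers quoted in the text and in Table 1; the variable and function names are illustrative assumptions, and the absolute expanded uncertainty of 0.074% mass is taken from the text rather than recomputed from Eq. (7), because the value of K is not given here.

```python
import math

# Burette contributions in ml, as quoted in the text
u_cal = 0.02 / math.sqrt(3)    # calibration accuracy +/-0.02 ml, rectangular distribution
u_fill = 0.013                 # standard deviation of burette filling
u_drop = 0.013 / math.sqrt(3)  # drop size 0.013 ml, rectangular distribution

# Standard uncertainty of one titration volume (quadrature sum)
u_v = math.sqrt(u_cal**2 + u_fill**2 + u_drop**2)

# Relative standard uncertainties for the ~5 ml volumes and the 1 ml volume
rel_5ml = u_v / 5.0
rel_1ml = u_v / 1.0

# Predicted relative expanded uncertainty (k = 2), % rel.
U_ABS = 0.074  # % mass, value quoted in the text


def predicted_rel_expanded(c_w):
    """Predicted U(C_w)/C_w in % rel. for a water concentration c_w in % mass."""
    return 100.0 * U_ABS / c_w


def experimental_rel(s_r, s_b):
    """Experimental u_exp(C_w)/C_w in % rel. from relative SD and relative bias."""
    return 100.0 * math.sqrt(s_r**2 + s_b**2)


# Table 1 rows: sample -> (C_tr, S_r, S_b)
table1 = {
    "ascorbic acid (purchased)": (0.15, 0.630, 0.167),
    "alpha-monothioglycerol (purchased)": (0.24, 0.215, 0.042),
    "alpha-monothioglycerol (fortified)": (0.53, 0.064, 0.036),
}

if __name__ == "__main__":
    print(f"u(V) = {u_v:.3f} ml")
    for name, (c_tr, s_r, s_b) in table1.items():
        print(name, round(predicted_rel_expanded(c_tr), 1), round(experimental_rel(s_r, s_b), 1))
```

The sketch only reproduces the arithmetic; it does not replace the full propagation through Eq. (7).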

References
1. Lauwaars M (1998) Accred Qual Assur 3:32-35
2. USP 23 (1995) The United States Pharmacopeia (USP). The National Formulary. United States Pharmacopeial Convention Inc, Md., USA
3. Kuselman I, Shenhar A (1995) Anal Chim Acta 306:301-305
4. ISO/IEC Guide 25 (1990) General requirements for the competence of calibration and testing laboratories, 3rd edn. International Organization for Standardization (ISO), Geneva
5. CITAC Guide 1 (1995) International guide to quality in analytical chemistry: An aid to accreditation, 1st edn. Teddington, UK
6. EURACHEM (1995) Quantifying uncertainty in analytical measurement, 1st edn. EURACHEM, Teddington, UK
7. Ellison SLS, Williams A (1998) Accred Qual Assur 3:6-10
8. Kuselman I, Shenhar A (1997) Accred Qual Assur 2:180-185
9. Wegscheider W (1997) The Proceedings of the 2nd EURACHEM Workshop "Measurement Uncertainty in Chemical Analysis. Current Practice and Future Directions", 29-30 September, Berlin. EURACHEM
10. Dietrich A (1994) American Lab 26(5):33-39
11. Mitchell J Jr, Smith DM (1980) Aquametry, Part III (the Karl Fischer reagent). A treatise on methods for the determination of water. Wiley-Interscience, N.Y., USA
12. Margolis SA (1995) Anal Chem 67:4239-4246
13. Margolis SA (1997) Anal Chem 69:4864-4871
14. Zeiler HJ, Heindl R, Wegscheider W (1990) Veitsch-Radex Rundschau 2:48-55
15. Wegscheider W, Zeiler HJ, Heindl R, Mosser J (1997) Ann Chim 87:273-283
16. Sherman F, Kuselman I, Shenhar A (1996) Talanta 43:1035-1042
17. Sherman F, Kuselman I, Shenhar A (1998) Reagent for determining water and ene-diols or thiols. USA Patent No. 5,750,404, 12.05.98
18. Kuselman I (1998) Accred Qual Assur 3:131-133
19. Cedergren A (1996) Anal Chem 68:3679-3681
20. American Public Health Association, American Water Works Association, Water Environment Federation (1995) Standard methods for the examination of water and wastewater, 19th edn. American Public Health Association, Washington, USA, pp 1-10, 1-11
21. Kaus R (1998) Accred Qual Assur 3:150-154
22. Vogelgesang J, Hädrich J (1998) Accred Qual Assur 3:242-255
Accred Qual Assur (2002) 7:115-120
DOI 10.1007/s00769-002-0442-6

© Springer-Verlag 2002

Yunqiao Li
Guanghui Tian
Naijie Shi
Xiaohua Lu

Study of the uncertainty in gravimetric analysis of the Ba ion

Abstract The determination of barium by the gravimetric method, in which a precipitate of BaSO4 was formed and weighed, coupled with instrumental measurement of trace constituents, was studied. The analyte remaining in the filtrate and washes, mechanical loss, and contaminants in the precipitate are the main factors influencing the uncertainty. A series of condition tests was carried out to reduce the effect of these factors, and the optimum test conditions were found. The determination was carried out with a strictly defined operational procedure. The trace amounts of barium in the filtrate, washes and mechanical loss were determined by ICP-AES; the chloride occluded in the precipitate was determined by ion chromatography (IC) and calculated as BaCl2 and barium, and sodium was determined by FAAS and calculated as Na2SO4. The average mass of barium in the filtrate contributes about 0.06% relative to that of the total barium, in washes about 0.09%, mechanical loss about 0.06%, contaminants of BaCl2 about 0.08% and Na2SO4 about 0.05%. All the trace constituents were determined and corrected on a sample-by-sample basis. Sources of uncertainty were assessed thoroughly. The uncertainty of this combined gravimetric-instrumental method was improved remarkably compared with that of the gravimetric method alone. The expanded uncertainty (k = 2) is 0.08%.

Keywords Gravimetry · Uncertainty · Barium · Traceability

Y. Li (✉) · G. Tian · N. Shi · X. Lu
National Research Center for Certified Reference Material, No.18 Bei San Huan Dong Lu, Chaoyang Qu, Beijing, 100013, P. R. China
e-mail: nrccrm@public3.bta.net.cn
Tel.: +86-10-64228404
Fax: +86-10-64228404

Introduction

Gravimetric analysis is a classical method of chemical determination that has been developed over more than one hundred years [1]. The precipitation method is of the greatest importance because it is often more or less specific for the constituent being determined and is generally applicable, especially for the determination of alkali metals, alkaline earth metals, sulfate, phosphate and so on. The analyte is precipitated as a very slightly soluble compound, separated from the solution, ignited and weighed, and the content of the constituent is then found. The results can be traced to SI units without any reference materials. This method is traditionally used for the measurement of "percentage-level" concentrations and the results are often unsurpassed. Previous research on the method addressed its repeatability. The uncertainty of the method is not obvious, and the main influencing factors on uncertainty and their magnitudes are not clear. Thomas W. Vetter et al. [2] used "instrumental-enhanced" gravimetric analysis for the determination of sulfate. Sulfate in the filtrate, contaminants in the precipitate, and volatilized sulfate were quantified with instrumental methods. The mechanical loss of sulfate was estimated. The accuracy of the method was increased and the expanded uncertainty (k = 2) of the method was 0.16%.

The correction factors

Many variables that influence the contamination of the precipitate and the loss of barium were identified in our experiments through systematic research and are summarized in Table 1. A variable that decreases the magnitude of a factor is considered favorable and its effect is indicated with a minus (-). A variable that increases the magnitude of a factor is indicated with a plus (+). The relative magnitude of the effect is not considered in this table. The influence of these variables is explained below.

Table 1 Effect of variables on factors of uncertainty

Variable                                Factor
                                        Ba2+ in filtrate   Ba2+ in washes   Mechanical loss of Ba2+   BaCl2 occlusion   Na2SO4 occlusion

Regular precipitation nd +
Reverse precipitation ++ nd ++
Increase excess of precipitant nd nd nd +
Increase adding rate of precipitant -s -s nd -s +
Increase concentration of precipitant -s + nd + ++
Increase acidity of solution + nd -s
Increase concentration of analyte -s nd + +
Increase standing time nd
Increase standing temperature -s nd nd nd nd
Increase volume of washes +
Ignition condition nd nd
Increase coexisting ion nd nd nd nd +

-: factor decreases, +: factor increases, nd: no data available, -s: factor decreases not distinctly, - -: factor not applicable

1. Barium in the filtrate: Because of the solubility of the precipitate, there is always a trace amount of barium in the filtrate. A relatively large amount of barium was in the filtrate when reverse precipitation was used. The amount of barium in the filtrate increases with the acidity of the precipitating solution. To reduce barium in the filtrate, regular precipitation was used with a proper excess amount and adding rate of precipitant, so that the common-ion effect is dominant. When the precipitating solution is diluted and has the proper acidity and temperature, and the precipitate is left standing in the filtrate for a time, the small and imperfect crystals grow into larger and more perfect crystals and the solubility of the precipitate decreases. Another important variable is the filtrate volume: for a given condition the solubility of the precipitate is almost constant, so as the filtrate volume decreases, the loss of barium in the filtrate is reduced.
2. Barium in the washes: There is always a trace amount of barium dissolved in the washes when the precipitate is washed. Increasing the concentration of the precipitant and the volume of washes increases the amount of barium in the washes. In order to decrease precipitate contamination, thorough rinsing of the precipitate is necessary. Increasing the adding rate of precipitant, the acidity of the precipitating solution, the concentration of analyte and the standing time of the precipitate decreases the amount of barium in the washes.
3. Mechanical loss of barium: Some of the microcrystals of the precipitate still adhere to the beaker wall, stirring rod and policeman when the precipitate is transferred onto the filter. The amount of the mechanical loss depends upon the transferring and washing processes. It can be dissolved in a hot HCl solution and detected.
4. BaCl2 and Na2SO4 occlusion: It has been found that all chloride in the precipitate was present as BaCl2 and the sodium as Na2SO4 [2]. The variables are sometimes contradictory for the contamination by BaCl2 and Na2SO4. With reverse precipitation, the occlusion of BaCl2 decreased but that of Na2SO4 increased markedly. Increasing the excess amount and the adding rate of precipitant, or the content of coexisting cations and anions, aggravated the occlusion of Na2SO4 but caused almost no change for BaCl2. Diluting the solution, prolonging the standing time of the precipitate, and increasing the volume of washes minimized the occlusion of both Na2SO4 and BaCl2.

Experimental

1. Sample preparation

Analytical-reagent grade BaCl2 was weighed and dissolved in 5% HCl; 20 kg of barium solution was produced with a nominal concentration of 20 mg·g⁻¹. The solution was mixed and packed in 20 mL ampoules. Some of them were taken for analysis.

2. General procedure

The optimal test conditions shown in Table 2 were used in the following experiment. A hot solution of Na2SO4 was added to a pre-weighed hot acidified solution of BaCl2. The precipitated BaSO4 was left overnight on a water-bath at about 94 °C and then filtered and washed. The precipitate and filter were charred, then ignited to constant mass (the difference between two ignitions was less than 0.03 mg) at 800 °C in a platinum crucible. The masses of the
precipitate were blank-corrected. The crucibles had been pre-ignited to constant mass. All the weights were corrected for buoyancy. The barium in the filtrate, washes and mechanical loss was determined by ICP-AES. The precipitate was fused with K2CO3 at 910 °C, extracted with hot water and acidified with HNO3. Some aliquots of the extract were used for determining chloride by IC, calculated as BaCl2 and barium; others were used for determining sodium by AAS, calculated as Na2SO4. A schematic diagram of the procedure is shown in Fig. 1 and the gravimetric factors used for calculation are listed in Table 3.

3. Apparatus

All the instruments and the significant experimental parameters are listed in Table 4.

4. Analytical reagents

Na2SO4, HNO3, HCl and K2CO3 are all of guaranteed reagent grade; BaCl2 is analytical-reagent grade. All the important impurities in the reagents were measured. The sodium in K2CO3 is 28 µg·g⁻¹ and chloride is 6 µg·g⁻¹. The Na2SO4 solution was prepared at a concentration of 5%, filtered, and left for several weeks (to promote the formation of larger crystals when used to precipitate BaSO4) in a 1000 mL high-pressure polyethylene bottle. The barium, sodium and chloride calibration solutions were made up from GBW(E)080243, GBW(E)080260 and GBW(E)080269, NRCCRM standard solutions.

Fig. 1 Gravimetric determination of barium coupled with instrumental measurement

Table 2 Optimal test conditions for the gravimetric analysis of barium sulfate

Precipitation mode                      Regular
Excess amount of precipitant            2.5 times
Adding rate of precipitant              10 mL of precipitant is poured at 4 minute intervals
Concentration of precipitant            0.5% (10 mL 5% Na2SO4 diluted by water to 100 mL)
Acidity of precipitation solution       pH 1.8
Concentration of analyte                0.67 mg·g⁻¹ (about 11 g sample solution diluted by water to 300 mL)
Standing time of precipitate            20 h
Precipitating temperature               90-95 °C
Standing temperature of precipitate     94 °C
Volume of washes                        150 mL
Igniting condition of precipitate       800 °C
Fusing condition of precipitate         1.000 g K2CO3, 910 °C, 35 min
Extracting condition of melts           Hot water, several times, acidified with HNO3, total volume 100 mL

Table 3 Gravimetric factors used for calculation

Correction factor                           Symbol   Conversion      Coefficient   Calculation
Mass of sample (g)                          m1
Mass of precipitate weighed (mg)            m2
Mass of Ba2+ in filtrate (mg)               m3       Ba2+ → Ba2+     1
Mass of Ba2+ in washes (mg)                 m4       Ba2+ → Ba2+     1
Mechanical loss of Ba2+ (mg)                m5       Ba2+ → Ba2+     1
Mass of BaCl2 in precipitate (mg)           m6       Cl- → BaCl2     2.9368        IC Cl- × 2.9368
Equivalent mass of Ba2+ in BaCl2 (mg)       m7       Cl- → Ba2+      1.9368        IC Cl- × 1.9368
Mass of Na2SO4 in precipitate (mg)          m8       Na+ → Na2SO4    3.0893        AAS Na+ × 3.0893
Table 4 Instruments and parameters used in the determination

Method    Constituent   Condition                  Parameter
ICP-AES   Barium        Wavelength                 455.403 nm
                        ICP power                  1.0 kW
                        Plasma flow                12 L·min⁻¹
                        Nebulizer flow             0.7 L·min⁻¹
                        Auxiliary flow             1.0 L·min⁻¹
                        Observation height         7 mm
IC        Chloride      Eluent                     25 mmol·L⁻¹ NaHCO3 / 25 mmol·L⁻¹ Na2CO3
                        Flow rate                  1.5 mL·min⁻¹
                        Injection volume           50 µL
                        Detector                   Conductivity
                        Column                     SA4 anion-exchange column
FAAS      Sodium        Wavelength                 589.0 nm
                        Slit                       0.4 nm
                        Oxidant gas flow rate      air: 1.6 kg·min⁻¹
                        Reducing gas flow rate     C2H2: 0.25 kg·min⁻¹
                        Burner height              7.5 mm
                        Background correction      Zeeman mode

Results and discussion

Results

The total concentration of barium in solution is calculated as follows:

$$
C_{Ba^{2+}}\ (\mathrm{mg \cdot g^{-1}}) = \frac{\left[(m_2 - m_6 - m_8) \times 0.5883993 + m_3 + m_4 + m_5 + m_7\right] \times 10^{-3}}{m_1} \tag{1}
$$

The factors used for calculation are given in Table 3. The results for the gravimetric determination coupled with instrumental determination are listed in Table 5. All corrections were applied on a sample-by-sample basis. The expanded uncertainty of the final result was calculated according to the GUM [3].

Gravimetric determination

Using the gravimetric method alone, the concentration of Ba2+ in solution is 19.2680 mg·g⁻¹ and the relative standard deviation is 0.089%. These results are listed in Table 5.

Loss of barium and contamination

Loss of barium in filtrate. The mass of barium in the filtrate is about 0.02%-0.09% relative to that of the total barium. This quantity of barium should be added to the total barium. If it is not corrected, it may lead to a negative error of about 0.1% under this condition.

Loss of barium in washes. The mass of barium in washes was about 0.09%, except in one sample in which it was up to 0.3% due to leaking, relative to that of the total barium. This quantity of barium should be added to that of the total barium. If it is not corrected, it may lead to a negative error of about 0.09% under this condition. It can also be seen that the effect of leaking or other factors can be minimized or eliminated by using correction.

Mechanical loss of barium. The mass of the mechanical loss of barium was about 0.03%-0.08% relative to that of the total barium. This quantity of barium should be added to the total barium. If it is not corrected, it may lead to a negative error of about 0.03%-0.08% under this condition.

Occlusion of BaCl2. The mass of BaCl2 occluded was about 0.04%-0.1% relative to that of BaSO4, or 0.02%-0.05% relative to that of the total barium. This quantity of BaCl2 should be subtracted from the mass of BaSO4 and the relevant mass of barium should be added to that of the total barium. If it is not subtracted, it may lead to a positive error of about 0.02%-0.05% under this condition.

Occlusion of Na2SO4. The mass of Na2SO4 occluded was about 0.03%-0.06% relative to that of BaSO4, or 0.02%-0.04% relative to that of the total barium. This quantity of Na2SO4 should be subtracted from the mass of BaSO4. If it is not subtracted, it may lead to a positive error of about 0.02%-0.04% under this condition.

Total loss of barium. The total loss of barium includes the loss in filtrate and washes, the mechanical loss and the barium contained in the occluded BaCl2. The mass of the total loss of barium was about 0.2%-0.3% relative to that of the total barium. This quantity of barium should be added to that of the total barium. Otherwise, it may lead to a negative error of about 0.2%-0.3% under this condition.
[Table 5: results of the gravimetric determination coupled with instrumental measurement; the tabulated values are not recoverable]

Total occlusion. The total occlusion includes Na2SO4 and BaCl2. The mass of the total occlusion was about 0.06%-0.15% relative to that of BaSO4, or 0.04%-0.1% relative to that of total barium. This quantity of occlusion should be subtracted from the mass of BaSO4. Otherwise, it may lead to a positive error of about 0.04%-0.1% under this condition.

The data listed above show that the effect of each individual factor is of the order of parts per ten thousand compared to the final result; its magnitude is to some extent in tune with the relative standard deviation of the "gravimetric-instrumental" method. The quantity of total loss of barium is more than that of total occlusion, so if the correction is not made, the results will be on the low side.

Gravimetric-instrumental determination

Using gravimetric-instrumental determination, the concentration of Ba2+ in solution is 19.3010 mg·g⁻¹ and a relative standard deviation of 0.024% was obtained. The results are listed in Table 5 and were calculated with Eq. (1). All the corrections were made on a sample-by-sample basis.

The overall repeatability of the low-precision instrumental measurement coupled with high-precision gravimetric analysis (relative standard deviation, 0.024%) is three times better than the repeatability of the gravimetric analysis alone, and the value (19.3010 mg·g⁻¹) is greater than the gravimetric value (19.2680 mg·g⁻¹). This indicates that an improvement has been made by the coupled instrumental analysis. In addition, a determination made without the instrumental correction would be negatively biased by 0.2%.

Assessment of uncertainty

Instrumental measurement

Barium in the filtrate measured by ICP-AES. There was a lot of NaCl in the filtrate, so the effect of the matrix is very high. The barium was measured by means of standard addition. The concentration of barium in the filtrate was about 0.2 mg·L⁻¹, and the repeatability of ICP-AES was less than 2.5%. The combined standard uncertainty was 3.5% and the mass of barium in the filtrate was about 100 µg, so the estimated relative standard uncertainty of the correction for the barium in the filtrate was 1.8×10⁻⁵.

Barium in washes measured by ICP-AES. The barium in washes was determined by ICP-AES with the matrix-matched calibration curve method. The concentration of barium in washes was about 1.5 mg·L⁻¹, and the repeatability was 1.5%. The combined standard uncertainty was 2% and the mass of barium in washes was about 180 µg, so the estimated relative standard uncertainty of the correction for the barium in washes was 1.7×10⁻⁵.

Mechanical loss of barium measured by ICP-AES. The barium was determined by ICP-AES with the matrix-matched calibration curve method. The concentration of barium in solution was about 2 mg·L⁻¹, and the repeatability was 1.5%. The combined standard uncertainty was 2% and the mass of mechanical loss of barium was about 100 µg, so the estimated relative standard uncertainty of the correction for mechanical loss of barium was 1.1×10⁻⁵.

Chloride measured by IC. There is a lot of KNO3 in the extract, so the effect of the matrix is high. The chloride is measured by means of standard addition. The concentration of chloride in solution is about 0.2 mg·L⁻¹, and the repeatability of IC is about 5%. The combined standard uncertainty was 10%. The mass of BaCl2 (converted from chloride) in the extract was about 250 µg, so the estimated relative standard uncertainty of the correction for BaCl2 and barium from occlusion was 1.1×10⁻⁴.

Sodium measured by FAAS. There was a lot of KNO3 in the extract, so the effect of the matrix is high. The sodium is measured with a matrix-matched calibration curve method. The concentration of sodium in solution is about 0.3 mg·L⁻¹, and the repeatability of FAAS was about 3%. The combined standard uncertainty was 5% and the mass of Na2SO4 (converted from sodium) in the extract was about 160 µg, so the estimated relative standard uncertainty of the correction for occluded Na2SO4 was 2.3×10⁻⁵.

Assessment of the combined uncertainty

Uncertainty assessment of Type A. The source of Type A uncertainty is the relative standard deviation of the result by the gravimetric-instrumental method, u1 = 2.4×10⁻⁴.

Uncertainty assessment of Type B. The sources of Type B uncertainty are as follows:

(1) weighing uncertainty:
    a standard uncertainty for sample weighing u2 = 2×10⁻⁶
    b standard uncertainty for precipitate weighing u3 = 8×10⁻⁶
(2) standard uncertainty for correction of barium loss in filtrate u4 = 1.8×10⁻⁵
(3) standard uncertainty for correction of barium loss in washes u5 = 1.7×10⁻⁵
(4) standard uncertainty for correction of mechanical loss of barium:
    a standard uncertainty for the measurement of mechanical loss of barium u6 = 1.1×10⁻⁵
    b standard uncertainty caused by the incomplete extraction of the mechanical loss of barium u7 = 1×10⁻⁵ (the extraction rate of barium from the beaker, stir rod and policeman was about 98%)
(5) standard uncertainty for correction of occluded BaCl2:
    a standard uncertainty for the measurement of occluded BaCl2 u8 = 1.1×10⁻⁴
    b standard uncertainty caused by the incomplete extraction of occluded BaCl2 in the precipitate u9 = 1.5×10⁻⁵
(6) standard uncertainty for correction of occluded Na2SO4:
    a standard uncertainty for the measurement of occluded Na2SO4 u10 = 2.3×10⁻⁵
    b standard uncertainty caused by the incomplete extraction of occluded Na2SO4 in the precipitate u11 = 1.0×10⁻⁵
(7) standard uncertainty of the atomic weights u12 = 1.9×10⁻⁴
(8) standard uncertainty of the buoyancy correction u13 = 5.8×10⁻⁶

The combined standard uncertainty is 4×10⁻⁴.
The expanded uncertainty (k = 2) is 8×10⁻⁴.

Conclusion

The combination of a classical gravimetric determination with instrumental techniques was used to determine the concentration of barium in solution. Corrections were made to the classical gravimetric method for the loss of barium in the filtrate, washes and the mechanical loss, and for the contaminants BaCl2 and Na2SO4. Instrumental methods (ICP-AES, IC, FAAS) were used to quantify the loss of barium and the contaminants in the precipitate. The sources of the uncertainties have been assessed thoroughly, and the values were obtained. The uncertainty of the combined method has been improved remarkably and the expanded uncertainty (k = 2) is 0.08%.
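The uncertainty budget above can be combined in a few lines. The sketch assumes that the thirteen relative standard uncertainties are independent and are combined in quadrature, as is usual under the GUM; the paper does not spell out its combination rule, and the dictionary keys are illustrative labels. The quadrature sum comes out near 3.3×10⁻⁴, which is consistent with the reported 4×10⁻⁴ if that figure was rounded up.

```python
import math

# Relative standard uncertainties u1..u13 as listed in the text
budget = {
    "u1 repeatability (Type A)": 2.4e-4,
    "u2 sample weighing": 2e-6,
    "u3 precipitate weighing": 8e-6,
    "u4 Ba in filtrate": 1.8e-5,
    "u5 Ba in washes": 1.7e-5,
    "u6 mechanical loss, measurement": 1.1e-5,
    "u7 mechanical loss, extraction": 1e-5,
    "u8 BaCl2 occlusion, measurement": 1.1e-4,
    "u9 BaCl2 occlusion, extraction": 1.5e-5,
    "u10 Na2SO4 occlusion, measurement": 2.3e-5,
    "u11 Na2SO4 occlusion, extraction": 1.0e-5,
    "u12 atomic weights": 1.9e-4,
    "u13 buoyancy": 5.8e-6,
}

# Combined standard uncertainty, assuming independent contributions
u_c = math.sqrt(sum(u * u for u in budget.values()))
# Expanded uncertainty with coverage factor k = 2
U = 2 * u_c

# Largest contributors, ranked by their share of the variance
ranked = sorted(budget.items(), key=lambda kv: kv[1], reverse=True)

# Relative bias of the gravimetric-only result against the corrected result, in %
bias_percent = 100.0 * (19.2680 - 19.3010) / 19.3010

if __name__ == "__main__":
    print(f"u_c = {u_c:.2e}, U(k=2) = {U:.2e}")
    for name, u in ranked[:3]:
        print(f"{name}: {100 * u * u / u_c**2:.0f}% of variance")
    print(f"bias without corrections = {bias_percent:.2f}%")
```

Ranking the contributions shows that repeatability, atomic weights and the BaCl2 occlusion measurement dominate; the remaining terms together change the combined value by less than 1%.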

References
1. Kolthoff IM, Sandell EB: Textbook of Quantitative Inorganic Analysis, 3rd edn.
2. Vetter TW (1995) Analyst 120:2025-2030
3. GUM (1995) Guide to the expression of uncertainty in measurement (issued by ISO, IEC, BIPM, IFCC, IUPAC and OIML)
Accred Qual Assur (2000) 5:100-103
© Springer-Verlag 2000

Ilya Kuselman

Assessment of permissible ranges for results of pH-metric acid number determinations using uncertainty calculation

Presented at: EURACHEM Workshop on Efficient Methodology for the Evaluation of Uncertainty in Analytical Chemistry, Helsinki, Finland, 14-15 June 1999

Abstract An approach to assess the permissible ranges for results of replicate determinations using uncertainty calculation is discussed. The approach is based on the known range distribution for normalized "range/standard deviation" values, which is equivalent to the distribution of the range for normalized results of replicate determinations having an average of 0 and a standard deviation of 1. It is shown that the permissible ranges can be assessed using tabulated percentiles of this distribution and calculated values of the determination (analysis) standard uncertainty. When the standard uncertainty calculation is performed before the analytical method validation, the permissible ranges can be predicted. As an example, the range is predicted for a new pH-metric method for acid number determination without titration in petroleum oils (basic, white and transformer). The results of the prediction are in good conformity with the experimental data.

Key words Range · Prediction · Uncertainty of measurements · Analytical method validation · Acid number determination · pH-metry

I. Kuselman
The National Physical Laboratory of Israel (INPL), Givat Ram, Jerusalem 91904, Israel
e-mail: kuselman@netvision.net.il
Tel.: +972-2-6536534
Fax: +972-2-6520797

Introduction

According to the definition [1], a range $W_n$ is the difference between the highest and lowest values in data consisting of n values. During the use of an analytical method, a permissible range $R_{n,s}$ of results of n replicate determinations $x_1, x_2, \ldots, x_n$ obtained under the same conditions should be known (repeatability level [2]), so that each $W_n \le R_{n,s}$. For example, if n = 2, the value $R_{2,s}$ is the maximal acceptable difference of the duplicate results $x_1$ and $x_2$. A norm $R_{n,d}$ of the difference between two or more results of the analysis of the same sample in the same laboratory but under different conditions (days, analysts, instruments, etc.) and a norm $R_{n,l}$ for n results of different laboratories are also necessary for quality control of the analysis. These are known as the intermediate precision level and reproducibility level, respectively [2].

Usually these parameters are estimated from the data of the method validation and its collaborative study. However, even before validation, an assessment (prediction) of $R_{n,s}$, $R_{n,d}$ and $R_{n,l}$ values can be helpful in deciding whether the method is "fit-for-purpose". Predicted $R_{n,s}$, $R_{n,d}$ and $R_{n,l}$ values are also expedient for the design of the experiment for method validation and diagnosis of outliers during the experiment. Such predictions can be performed using calculation of the analysis uncertainty "with pen and paper", as was done for the assessment of limits of detection and quantitation [3].

In the present paper, the range prediction is discussed and used for statistical analysis of the data obtained during validation of a new method for acid number (AN) determination without titration in petroleum oils which has been developed in our laboratory (INPL) [4].

Assessment of a range using analysis uncertainty

If combined standard uncertainty $u_c$ is the standard deviation quantified using intra-laboratory components for analysis under the same conditions [5], predicted $R_{n,s} = Q_P u_c$, where $Q_P$ is the critical value (limit) of the ratio "range/standard deviation" at the level of confidence P. In essence, $Q_P$ is the limit of the range for normalized values x which have an average of 0 and a standard deviation of 1. The limits are calculated from the function of the range distribution [6]:

$$P\{w_n \le R_n\} = n \int_{-\infty}^{+\infty} [P(x+R_n) - P(x)]^{n-1}\, dP(x). \qquad (1)$$

$Q_P$ values are tabulated for normally distributed x, for example in [6, 7]. Figure 1 gives an overview of the values which are dependent on n for different P. One can see that at P = 0.95 for n = 2 the value $Q_{0.95} = 2.77$ and for n = 3 it is $Q_{0.95} = 3.31$. Therefore, a range of duplicates should be no more than $R_{2,s} = 2.77 u_c$, and a range of three replicates no more than $R_{3,s} = 3.31 u_c$.

Since the ratio between intra-laboratory and inter-laboratory standard deviations of analytical results is approximately 0.67 [8], the analogous limit for a range of results obtained in different laboratories is $R_{n,l} = Q_P u_c/0.67$. Therefore, the difference between two such results should be no more than $R_{2,l} = 4.13 u_c$, and the range for three results no more than $R_{3,l} = 4.94 u_c$ at P = 0.95.

The ratio of intra-laboratory standard deviation for the repeatability level and the corresponding standard deviation for the intermediate precision level is an intermediate between 1 (repeatability level) and 0.67 (reproducibility level). It can be accepted as approximately (1 + 0.67)/2 = 0.83. In this case $R_{n,d} = Q_P u_c/0.83$, and the difference between the two results should be no more than $R_{2,d} = 3.34 u_c$, while the range for three results should be no more than $R_{3,d} = 3.99 u_c$ at P = 0.95.

If a range is assessed for values $X_1, X_2, \ldots, X_n$, which are the averages from n replicates, the permissible value of the range should be divided by $\sqrt{n}$. The reason is that the standard deviation of the average is $\sqrt{n}$ times less than the standard deviation of the replicate.

Uncertainty in AN determination

The method is proposed for determination of AN < 0.1 mg KOH/g oil in such oils as white, transformer and basic oils. The AN is an important characteristic of a petroleum oil's quality because the conductive and corrosive properties, and several other properties of the oil are dependent on AN. The method is based on rapid and complete extraction of acids from an oil test portion into a special reagent and measurement of the conditional pH in the "oil-reagent" mixture before ($pH_1'$) and after ($pH_2'$) standard acid addition. As a standard addition, a solution of hydrochloric acid is used.

The calculation of AN is carried out according to the following formula:

$$AN = 56.11\, N_{st} V_{st} / [m (10^{\Delta pH} - 1)], \qquad (2)$$

where 56.11 is the molecular mass of KOH, $N_{st}$ and $V_{st}$ are the concentration (M) and volume (ml) of the added HCl standard solution, m is the mass of the oil test portion (g) and $\Delta pH = pH_1' - pH_2'$.

The standard uncertainty of the AN determination can be evaluated according to [9] as

$$u(AN) = 0.032\, AN / (1 - 1/10^{\Delta pH}). \qquad (3)$$

So, for values $\Delta pH$ = 0.25-0.40 recommended in the method [4] the standard uncertainty u(AN) is about 0.06 AN.
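The limits $Q_P$ and the ranges derived from them can be checked numerically. The sketch below is only an illustration, not the tabulation used in [6, 7]: it integrates Eq. (1) for standard-normal values with a plain trapezoid rule, solves for $Q_P$ by bisection, and applies the ratios 0.83 and 0.67 and the $\sqrt{n}$ rule for averages. The function names, the integration limits and the step count are assumptions chosen for this example.

```python
import math


def norm_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Probability density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def range_cdf(r, n, lo=-8.0, hi=8.0, steps=2000):
    """P{w_n <= r} from Eq. (1) for n standard-normal values (trapezoid rule)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        f = (norm_cdf(x + r) - norm_cdf(x)) ** (n - 1) * norm_pdf(x)
        total += f if 0 < k < steps else 0.5 * f
    return n * h * total


def q_limit(n, p=0.95):
    """Critical value Q_P of the ratio range/standard deviation, by bisection."""
    lo, hi = 0.0, 10.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if range_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def permissible_range(n, u_c, level="s", n_avg=1, p=0.95):
    """Predicted range for n results, each an average of n_avg replicates.

    level: "s" repeatability, "d" intermediate precision, "l" reproducibility.
    """
    ratio = {"s": 1.0, "d": 0.83, "l": 0.67}[level]
    return q_limit(n, p) * u_c / (ratio * math.sqrt(n_avg))


def u_an(an, delta_ph):
    """Standard uncertainty of AN according to Eq. (3)."""
    return 0.032 * an / (1.0 - 1.0 / 10.0 ** delta_ph)


if __name__ == "__main__":
    for n in (2, 3):
        print(n, round(q_limit(n), 2))  # about 2.77 and 3.31 at P = 0.95
    print(round(u_an(1.0, 0.25), 3), round(u_an(1.0, 0.40), 3))
```

With $u_c = 1$, `permissible_range(2, 1.0, "d")` and `permissible_range(2, 1.0, "l")` give values close to the factors 3.34 and 4.13 quoted above.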

Fig. 1 Dependence of limits $Q_P$ on the number of replicates n at different levels of confidence P. Curve 1 corresponds to P = 0.90, 2 to P = 0.95, and 3 to P = 0.99

Design of the experiment for method validation and prediction of ranges

The experiment for the method validation was designed in order to obtain four replicate results of AN determination for each sample daily, over 5 days, in two laboratories (Lab 1 - INPL, as the method originator, and Lab 2 - Bio-Lab Ltd., Israel, as an independent laboratory [8]). Samples of purchased and fortified white, transformer and basic oils were used. The design allowed the assessment of the method precision for the levels of repeatability (intra-laboratory, within a day), intermediate precision (intra-laboratory, between days) and reproducibility (inter-laboratory).

The predicted permissible range at the level of repeatability for four replicates is $R_{4,s} = 3.63\, u(AN) = 0.22\, AN$.

At the level of intermediate precision for five replicates, each an average from four daily results, the permissible range is $R_{5,d} = 3.86\, u(AN)/(0.83\sqrt{4}) = 0.14\, AN$.

For the reproducibility level, the predicted permissible range for laboratory results, each an average from 4 × 5 = 20 replicates, is $R_{2,l} = 2.77\, u(AN)/(0.67\sqrt{20}) = 0.06\, AN$.

Results and discussion

The results of the experiment are shown in Table 1, where $X_i = \sum_{j=1}^{4} x_{ij}/4$ is the daily average result of AN determination (from 4 replicates, j = 1, 2, ..., 4), and $W_i$ is the range of replicates $x_{ij}$ in the i-th day (i = 1, 2, ..., 5).

Table 2 shows the ranges $W_5$ between the highest and lowest daily average values $X_i$, the total laboratory average results $X_{avg} = \sum_{i=1}^{5} \sum_{j=1}^{4} x_{ij}/20$ and the difference $W_2$ between them, with the predicted permissible ranges: 1) for four replicates during a day - $R_{4,s}$; 2) for five daily average values - $R_{5,d}$ and 3) for two total average results - $R_{2,l}$.
Comparing $W_i$ values with their norms $R_{4,s}$ one can
see that all $W_i \le R_{4,s}$, so the method repeatability is satis-

Table 1 Results of the experiment for method validation (results in mg KOH/g oil)

Oil sample                   Parameter   Lab 1 (INPL), day i                      Lab 2 (Bio-Lab Ltd.), day i
                                         1       2       3       4       5        1       2       3       4       5
Basic oil, purchased         X_i         0.0060  0.0062  0.0060  0.0061  0.0058   0.0062  0.0061  0.0062  0.0060  0.0066
                             W_i         0.0004  0.0010  0.0009  0.0003  0.0008   0.0006  0.0008  0.0007  0.0004  0.0003
Basic oil, fortified         X_i         0.053   0.055   0.051   0.052   0.052    0.055   0.050   0.051   0.050   0.052
                             W_i         0.003   0.003   0.002   0.005   0.001    0.003   0.008   0.005   0.006   0.005
Transformer oil, purchased   X_i         0.0023  0.0022  0.0023  0.0023  0.0023   0.0021  0.0021  0.0021  0.0022  0.0022
                             W_i         0.0001  0.0002  0.0001  0.0001  0.0001   0.0002  0.0000  0.0001  0.0001  0.0001
Transformer oil, fortified   X_i         0.052   0.051   0.052   0.051   0.051    0.049   0.052   0.052   0.051   0.052
                             W_i         0.001   0.003   0.001   0.003   0.004    0.000   0.000   0.001   0.001   0.001
White oil, purchased         X_i         0.0021  0.0022  0.0021  0.0021  0.0022   0.0020  0.0021  0.0019  0.0019  0.0020
                             W_i         0.0003  0.0002  0.0002  0.0002  0.0001   0.0002  0.0002  0.0000  0.0000  0.0000
White oil, fortified         X_i         0.050   0.052   0.052   0.051   0.050    0.051   0.050   0.050   0.049   0.049
                             W_i         0.003   0.001   0.002   0.00?   0.003    0.002   0.004   0.004   0.002   0.002

Table 2 Results of the range prediction and statistical analysis of the experimental data using Horwitz's norms

Range calculations (R4,s, W5, R5,d, X_avg, W2, R2,l) are in mg KOH/g oil; C is in parts of 1; RSD values and their Horwitz norms are in %.

Oil sample                   Lab   R4,s     W5       R5,d     X_avg    W2       R2,l     C           RSD1   RSD1N   RSD2   RSD2N   RSD3   RSD3N
Basic oil, purchased         1     0.0014   0.0004   0.0009   0.0060   0.0002   0.0004   1.1·10^-5   4.29   7.47    1.96   4.50    2.32   2.49
                             2              0.0006            0.0062                                 3.40           2.91
Basic oil, fortified         1     0.011    0.004    0.007    0.053    0.001    0.003    9.4·10^-5   2.24   5.40    2.77   3.25    1.35   1.80
                             2              0.005             0.052                                  4.28           3.66
Transformer oil, purchased   1     0.0005   0.0001   0.0003   0.0023   0.0001   0.0001   4.0·10^-6   2.33   8.71    2.22   5.25    3.14   2.90
                             2              0.0001            0.0022                                 2.01           1.05
Transformer oil, fortified   1     0.011    0.001    0.007    0.051    0.000    0.003    9.1·10^-5   1.77   5.43    0.90   3.27    0.00   1.81
                             2              0.003             0.051                                  0.83           2.17
White oil, purchased         1     0.0004   0.0001   0.0003   0.0021   0.0001   0.0001   3.7·10^-6   3.47   8.83    2.70   5.32    3.45   2.94
                             2              0.0002            0.0020                                 1.67           4.24
White oil, fortified         1     0.011    0.002    0.007    0.051    0.001    0.003    9.0·10^-5   1.35   5.43    1.65   3.27    1.40   1.81
                             2              0.002             0.050                                  2.16           1.66

factory. The intermediate precision is also satisfactory, since $W_5 \le R_{5,d}$ for all oils in both laboratories. The similar assessment of the method reproducibility by the condition $W_2 \le R_{2,l}$ is positive too.

The same results can be obtained by statistical analysis of the experimental data using comparison of the relative standard deviations (RSDs) of $x_{ij}$, $X_i$ and $X_{avg}$ with the corresponding empirical Horwitz's norms [8, 10-12]. For this purpose the following parameters are calculated and shown in Table 2:

1. Values $X_{avg}$ (mg KOH/g oil) expressed as concentrations of naphthenic acid in decimal fractions: $C = X_{avg} (100/56.11)/1000$, where 100 and 56.11 are the molecular masses of naphthenic acid and KOH, respectively, and 1000 is the factor for transformation of mg to g.
2. RSDs of $x_{ij}$ averaged for 5 days (the i-th values were homogeneous): $RSD_1 = 100 \{\sum_{i=1}^{5} \sum_{j=1}^{4} [(x_{ij} - X_i)^2 / X_i^2] / [5(4-1)]\}^{1/2}$, %.
3. Norms for $RSD_1$ by Horwitz: $RSD_{1N} = 2^{(1 - 0.5 \log C)} \times 0.67$, %.
4. RSD of $X_i$: $RSD_2 = 100 [\sum_{i=1}^{5} (X_i - X_{avg})^2 / (5-1)]^{1/2} / X_{avg}$, %.
5. Norms for $RSD_2$ derived from $RSD_1$: $RSD_{2N} = RSD_{1N}/(0.83\sqrt{4})$, %.
6. RSD of $X_{avg}$: $RSD_3 = 100\sqrt{2}\,[(X_{avg1} - X_{avg2})/(X_{avg1} + X_{avg2})]$, %, where numbers 1 and 2, as additional indices for $X_{avg}$, denote Lab 1 and Lab 2, respectively.
7. Norms for $RSD_3$ by Horwitz: $RSD_{3N} = RSD_{1N}/(0.67\sqrt{20})$, %.

All the RSD values are less than their norms, except inter-laboratory $RSD_3$ for purchased transformer and white oils. These samples have the lowest AN and are, therefore, more difficult to analyse. However, it should be noted, the sample standard deviations $RSD_3$ have 2(20-1)-1 = 37 degrees of freedom while the number of degrees of freedom of the norms $RSD_{3N}$ can be accepted as infinity. So, correct comparison of the standard deviations with their norms should be based on $\chi^2$ or Fisher's criteria [13]. For example, by Fisher's criterion for transformer oil $F = RSD_3^2/RSD_{3N}^2 = 3.14^2/2.90^2 = 1.17$. It is less than the critical value $F_{0.95}\{37, \infty\} = 1.54$ at the level of confidence 0.95. Therefore, the population value of $RSD_3$ for this oil is no more than the Horwitz's norm. The same is true also for white oil, since the corresponding $F = 3.45^2/2.94^2 = 1.38 < F_{0.95}\{37, \infty\} = 1.54$.

From the values above it can be seen that the range prediction and uncertainty calculation (on which the prediction is based) are adequate and in good conformity not only with the experimental data for the method validation, but also with the database used by Horwitz for calculation of his norms.

Conclusions

The approach to evaluate the permissible ranges for results of replicate determinations using uncertainty calculation can be helpful for prediction of ranges and statistical analysis of the data obtained during validation of the analytical (chemical) method.

The range prediction performed for the new method of pH-metric AN determination without titration in petroleum oils is in good conformity with the experimental validation data.

Acknowledgements The author thanks Professor E. Schoenberger, Professor Ya. Tur'yan and Dr. E. Strochkova for helpful discussions.
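The Horwitz norms and the Fisher comparison described in the Results and discussion can be reproduced with a few lines of code. The following is a minimal sketch under the formulas listed for Table 2; the function names are hypothetical, and the critical value 1.54 is simply the one quoted in the text, not computed here.

```python
import math


def concentration_fraction(x_avg):
    """C: acid number X_avg (mg KOH/g oil) as a decimal fraction of naphthenic acid."""
    return x_avg * (100.0 / 56.11) / 1000.0


def horwitz_norms(c):
    """Return (RSD_1N, RSD_2N, RSD_3N) in % for concentration fraction c."""
    rsd_1n = 2.0 ** (1.0 - 0.5 * math.log10(c)) * 0.67
    rsd_2n = rsd_1n / (0.83 * math.sqrt(4))   # five daily averages of 4 replicates
    rsd_3n = rsd_1n / (0.67 * math.sqrt(20))  # two laboratory averages of 20 replicates
    return rsd_1n, rsd_2n, rsd_3n


def rsd_3(x_avg_1, x_avg_2):
    """Inter-laboratory RSD (%) of two laboratory averages."""
    return 100.0 * math.sqrt(2.0) * abs(x_avg_1 - x_avg_2) / (x_avg_1 + x_avg_2)


def fisher_ratio(rsd, norm):
    """F = RSD^2 / norm^2, to be compared with a tabulated critical value."""
    return (rsd / norm) ** 2


if __name__ == "__main__":
    # Purchased basic oil: the rounded C = 1.1e-5 from Table 2.
    print([round(v, 2) for v in horwitz_norms(1.1e-5)])
    # Purchased transformer oil: laboratory averages 0.0023 and 0.0022 mg KOH/g oil.
    f = fisher_ratio(rsd_3(0.0023, 0.0022), 2.90)
    print(round(f, 2), f < 1.54)  # 1.54 is the critical value quoted in the text
```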

References

1. Havlicek LL, Crain RD (1988) Practical statistics for the physical sciences. American Chemical Society, Washington, D.C.
2. United States Pharmacopeia. USP 23 (1995) US Pharmacopeial Convention, Inc., Rockville, Md., pp 1982-1984
3. Kuselman I, Sherman F (1999) Accred Qual Assur 4:124-128
4. Tur'yan YI, Strochkova E, Berezin OY, Kuselman I, Shenhar A (1998) Talanta 47:53-58
5. EURACHEM (1995) Quantifying uncertainty in analytical measurement, 1st edn. EURACHEM, p Hi
6. Owen DB (1962) Handbook of statistical tables. Addison-Wesley, Reading, Mass., pp 138-139
7. Dixon WJ, Massey FJ Jr (1969) Introduction to statistical analysis, 3rd edn. International Student Edition, New York, Table A-8b
8. AOAC Peer-Verified Methods Program (1993) Manual on policies and procedures. Association of Official Analytical Chemists International, Arlington, p 9
9. Kuselman I, Shenhar A (1997) Accred Qual Assur 2:180-185
10. Horwitz W, Albert R (1987) Anal Proc 24:49-55
11. Thompson M, Fearn T (1996) Analyst 121:275-278
12. King B (1999) Accred Qual Assur 4:27-30
13. Miller JC, Miller JN (1993) Statistics for analytical chemistry, 3rd edn. Ellis Horwood, Bodmin, England
Accred Qual Assur (2002) 7: 13-18
© Springer-Verlag 2002

Ilya Kuselman
Elena Kardash-Strochkova
Yakov I. Tur'yan

Uncertainty and other metrological parameters of peroxide value determination in vegetable oils

Abstract Measurement uncertainty in the proposed redox-potentiometric methods for peroxide value (PV) determination in vegetable oils is evaluated in comparison with uncertainty in the standard methods. The methods determine all peroxides in oils, in terms of milliequivalents per kg of sample (meq/kg), that oxidize potassium iodide (KI) under the conditions of the test. The standard methods are based on KI oxidation by the oil test portion and volumetric titration of the liberated iodine, while the proposed methods use redox-potentiometric iodine determination without titration. As far as fresh refined oils have PV ≤ 0.5 meq/kg, the limit of detection (LOD) and limit of quantitation (LOQ) of the methods are important. An approach to assess the LOD and LOQ using uncertainty calculation was applied. It is shown how important the influence of the solvent purity is on the values of LOD and LOQ.

Key words Peroxide value · Vegetable oils · Uncertainty of measurements · Limit of detection · Limit of quantitation

I. Kuselman (✉) · E. Kardash-Strochkova · Y.I. Tur'yan
The National Physical Laboratory of Israel (INPL), Givat Ram, Jerusalem 91904, Israel
e-mail: kuselman@netvision.net.il

Introduction

If standard analytical methods are time and labor consuming, new methods are developed instead of them. The aims of the development should be achieved with the condition that the metrological parameters of a new method are "fit for purpose" [1]. The choice of these parameters and their assessment affect the final result of the development. As a universal parameter, the measurement uncertainty can be applied. The others, such as limit of detection (LOD) and limit of quantitation (LOQ), can be expressed through the uncertainty values [2].

In the present paper the measurement uncertainty in the proposed redox-potentiometric methods for peroxide value (PV) determination in vegetable oils, developed by us [3], is evaluated in comparison with the uncertainty in the standard methods [4, 5].

PV is an important characteristic of the oil quality and appears as an indicator of the lipid oxidation and oil properties deterioration [6]. The methods [3, 4, 5] determine all peroxides in oils, in terms of milliequivalents per kg of sample (meq/kg), that oxidize potassium iodide (KI) under the conditions of the test. The standard methods are based on KI oxidation by the oil test portion and volumetric titration of the liberated iodine, while the proposed methods use redox-potentiometric iodine determination without titration. The major drawback of the standard methods is that the titrimetric determination of low levels of PV is complicated and requires a certain experience of the analyst. Such PV levels (less than 0.5 meq/kg) should be guaranteed, for example, in fresh refined oils [7]. The advantage of the proposed methods is their simplicity for automation [3].

The standard methods are highly empirical, and any variation in the test procedure may lead to erratic results. The difference between standard methods consists only in the kind of solution (acetic acid-chloroform or acetic acid-isooctane) used for the oil dissolution: commercial chloroform is more pure and ensures

low blank values, while isooctane is less toxic. Therefore, stages of the KI oxidation in all the discussed methods are the same, but in the new methods they are combined with the iodine redox-potentiometric measurements in the same electrochemical cell. So, the main point is that the uncertainty of iodine determinations by volumetric titration and the uncertainty of iodine redox-potentiometric measurements without titration should be compared.

To compare the iodine measurement uncertainties, they are assessed by identification of the uncertainty sources, quantification of uncertainty components and calculation of combined uncertainties according to the EURACHEM/CITAC Guide [8] as values $u_c(y(x_1, x_2, \ldots, x_n)) = [\sum (u(y, x_i))^2]^{1/2}$, where $y(x_1, x_2, \ldots, x_n)$ is a function of parameters $x_1, x_2, \ldots, x_n$ and $u(y, x_i)$ is the uncertainty in y arising from the uncertainty in $x_i$, i = 1, 2, ..., n.

Uncertainty of measurement results by the proposed methods

After completing the reaction of the KI oxidation by hydroperoxides contained in the oil test portion the equilibrium $I_2 + I^- \rightleftharpoons I_3^-$ is established at the KI excess. Thus, redox-potential $E_1$ caused by the electrochemical reversible couple $I_3^- + 2e^- \rightleftharpoons 3I^-$ is measured in the aqueous phase with the Pt indicator and Ag/AgCl, 3 mol/l KCl, 3 mol/l KNO3 reference electrodes. After $E_1$ measurement the standard addition of the iodine aqueous solution is introduced into the cell and potential $E_2$ is measured (for more details see [3]).

PV of the test portion is calculated using the following equation:

$$PV_t = (1000/m)[(C_{st} \times V_{st})/(10^{\Delta E/S} - 1)], \text{ meq/kg}, \qquad (1)$$

where m is the mass of the oil test portion [g]; $C_{st}$ is the iodine concentration in the standard solutions for addition, expressed in gram-equivalents per liter [N]; $V_{st}$ is the volume of the iodine standard addition [ml]; $\Delta E = E_1 - E_2$ is the difference of the potentials [mV]; S is an electrochemical parameter equal to 2.303 RT/2F [mV], where R is the universal gas constant (8.314 J/(K × mol)), T is the temperature [K], and F is the Faraday constant (96485 coulombs).

To obtain PV of the tested oil, $PV_t$ should be corrected for the blank (organic solvent-water system without oil). Blank value $PV_0$ is calculated by the same formula as $PV_t$, for the same mass value m. Finally, PV of the oil is

$$PV = PV_t - PV_0. \qquad (2)$$

The main PV uncertainty components following from Eqs. (1) and (2) are shown in Table 1. Only some details of their evaluation are noted here.

The final mass m of an oil test portion is the difference in masses between a beaker with the test portion and the empty beaker (after oil transfer to the solvent). These masses are weighed using the balance (Mettler AE 163, Switzerland) with reading 0.0001 g and calibration expanded uncertainty of ±0.0002 g at the level of confidence 0.95 and coverage factor 2 (normal distribution) in the range up to 100 g.

Uncertainty in iodine concentration $C_{st}$ in the standard solutions is calculated taking into account the manufacturer information on possible deviation of the iodine titer (0.02%/°C) in a Titrisol ampoule (Merck, Germany), as well as the information on the volume uncertainty for volumetric flasks according to DIN, Class A, used for the solutions preparation, and possible temperature variation in the laboratory (in limits of 20±2 °C).

A recommended standard addition for oil samples with different expected PV is 0.1-1 ml of 0.01 N, 0.1 N, or 0.5 N iodine solutions: this volume should be negligible in comparison with the volume of the aqueous phase (70-110 ml). For transfer of the addition to the "oil-organic solvent-water" system, a mechanical hand pipette is used (Gilson, France, calibrated at INPL based on the gravimetric method [9]).

$E_1$ and $E_2$ are measured under the same conditions, both within 2-3 min, by the same instrument (a pH/ion-meter PHM 95, Radiometer, France). The expanded measurement uncertainty is ±(0.2+0.0005E) mV according to the Radiometer's information. At the normal distribution it corresponds to the standard uncertainty u(E) = 0.1+0.00025E mV. Since the E measurement range

Table 1 Values and uncertainty components in the proposed methods

Symbol of the source   Description                                Value x           Standard uncertainty u(x)   Relative standard uncertainty u(x)/x
m                      Mass of the oil test portion               5 g               0.00024 g                   0.00005
C_st                   I2 concentration in a standard addition    0.01-0.5 N        0.000007-0.00017 N          0.0007-0.00034
V_st                   Volume of the standard addition            0.1-1 ml          0.0002-0.0016 ml            0.0025-0.0016
ΔE                     E_1 - E_2                                  7-13 mV           0.3 mV                      0.043-0.023
S                      2.303 × RT/2F                              28.9-29.3 mV      0.2 mV                      0.007
PV                     Peroxide value                             0.03-100 meq/kg   0.003-4.7 meq/kg            0.10-0.047
Fig. 1 Dependence of the relative standard uncertainty $u(PV_t)/PV_t$ on the measured difference of potentials ΔE [mV]. The optimal ΔE value is shown by the dotted line. Wavy lines show the range of recommended ΔE values

Fig. 2 The expanded uncertainty U(PV) [meq/kg] as function of the ratio r = PV/PV_0. Line 1 corresponds to PV_0 = 0.06 meq/kg, line 2 to PV_0 = 0.2 meq/kg, line 3 to PV_0 = 0.5 meq/kg. LOD and LOQ for PV_0 = 0.5 meq/kg are shown by the dotted and wavy lines, correspondingly
recommended in the method [3] is 282-333 mV, u(E) = 0.2 mV. So, the standard uncertainty of the difference between such two E measurements is

$$u(\Delta E) = \sqrt{2} \times u(E) = 0.3 \text{ mV}.$$

The main parameter influencing S value is the temperature. Its variations in the laboratory in the range 291-295 K (18-22 °C) lead to the S changes from 28.87 to 29.27 mV.

As one can see from Table 1, ΔE measurement is the dominant source of the uncertainty in results of PV determination calculated by Eq. (1). So, the relative standard uncertainty of such a result $u(PV_t)/PV_t$ is calculated by the logarithmic partial differentiation of the function (1) concerning ΔE:

$$u(PV_t)/PV_t = [2.303 \times 10^{\Delta E/S} \times u(\Delta E)/S]/(10^{\Delta E/S} - 1). \qquad (3)$$

Taking into account u(ΔE) = 0.3 mV and S = 28.87-29.27 mV, Eq. (3) can be simplified:

$$u(PV_t)/PV_t = 0.024/(1 - 1/10^{\Delta E/S}). \qquad (4)$$

The dependence of $u(PV_t)/PV_t$ on ΔE in the range 1-21 mV is shown in Fig. 1. From this dependence it follows that ΔE < 7 mV leads to an essential increase in the PV uncertainty (ΔE = 7 mV and corresponding $u(PV_t)/PV_t$ are shown in Fig. 1 by a wavy line). To achieve value ΔE > 13 mV the amount of iodine added with the standard solution may exceed three times the iodine amount formed by the reaction of KI oxidation by hydroperoxides contained in the oil test portion (ΔE = 13 mV and corresponding $u(PV_t)/PV_t$ are shown in Fig. 1 by the wavy line also). So, the optimal ΔE is 9 mV leading to $u(PV_t)/PV_t$ = 0.047, which are shown in Fig. 1 by a dotted line. Moreover, all the range ΔE = 7-13 mV can be recommended for practical use as far as it covers uncertainty values close to the optimal one: $u(PV_t)/PV_t$ = 0.047±0.010. The same is correct for the blanks.

Uncertainty of the final result of PV determination in the tested oil calculated by Eq. (2) is $u(PV) = [(u(PV_t))^2 + (u(PV_0))^2]^{1/2}$. At the optimal ΔE, assuring $u(PV_t)/PV_t = u(PV_0)/PV_0 = 0.047$, normal distribution and coverage factor 2, the expanded uncertainty U(PV) is

$$U(PV) = 2 \times (0.047 [PV_t^2 + PV_0^2]^{1/2}) = 0.094 [PV_t^2 + PV_0^2]^{1/2}, \text{ meq/kg}. \qquad (5)$$

U(PV) as function of the ratio PV/PV_0 = r is shown in Fig. 2 in the range r = 1-3 at PV_0 equal to 0.06, 0.20, and 0.50 meq/kg (lines 1-3, correspondingly). From Eq. (5) the relative expanded uncertainty is the following:

$$U(PV)/PV = 0.094 (r^2 + 1)^{1/2}/(r + 1). \qquad (6)$$

When PV_0 is negligible, $PV = PV_t$, $r + 1 \approx r$, $r^2 + 1 \approx r^2$ and the relative expanded uncertainty is U(PV)/PV = 0.094.

LOD and LOQ prediction

Defining LOD as PV corresponding to the three standard deviations (standard uncertainties) of the blank response or its 1.5 expanded uncertainties [2], the following prediction can be obtained at the optimal ΔE:

$$LOD = 1.5\, U(PV_0) = 0.14\, PV_0, \text{ meq/kg oil}. \qquad (7)$$

In this case $PV_t$ according to Eq. (2) should be equal to 1.14 PV_0, i.e., r = 1.14 and U(PV)/PV = 0.067 by Eq. (6).
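The dependence of the relative uncertainty on ΔE and the blank-based limit can be explored numerically. The sketch below is an illustration under the equations given above, not the authors' software: it evaluates S from the temperature, $PV_t$ by Eq. (1), the relative standard uncertainty by Eq. (3), the expanded uncertainty by Eq. (5), and the limits as 1.5 and 5 expanded uncertainties of the blank. All function names and default arguments are assumptions made for this example.

```python
import math

R_GAS = 8.314      # universal gas constant, J/(K*mol)
FARADAY = 96485.0  # Faraday constant, C/mol


def slope_mv(t_kelvin):
    """Electrochemical parameter S = 2.303*R*T/(2*F), converted to mV."""
    return 1000.0 * 2.303 * R_GAS * t_kelvin / (2.0 * FARADAY)


def pv_test_portion(m_g, c_st_n, v_st_ml, delta_e_mv, t_kelvin=293.0):
    """PV_t in meq/kg by Eq. (1) for a standard addition of iodine."""
    s = slope_mv(t_kelvin)
    return (1000.0 / m_g) * (c_st_n * v_st_ml) / (10.0 ** (delta_e_mv / s) - 1.0)


def rel_u_pv_t(delta_e_mv, t_kelvin=293.0, u_delta_e_mv=0.3):
    """Relative standard uncertainty u(PV_t)/PV_t by Eq. (3), Delta E term only."""
    s = slope_mv(t_kelvin)
    return (2.303 * u_delta_e_mv / s) / (1.0 - 10.0 ** (-delta_e_mv / s))


def expanded_u_pv(pv_t, pv_0, rel_u=0.047, k=2.0):
    """Expanded uncertainty U(PV) in meq/kg by Eq. (5)."""
    return k * rel_u * math.hypot(pv_t, pv_0)


def lod_loq(pv_0, rel_u=0.047, k=2.0):
    """LOD and LOQ as 1.5 and 5 expanded uncertainties of the blank PV_0."""
    u_blank = k * rel_u * pv_0
    return 1.5 * u_blank, 5.0 * u_blank


if __name__ == "__main__":
    print(round(slope_mv(291.0), 2), round(slope_mv(295.0), 2))
    for delta_e in (7.0, 9.0, 13.0):
        print(delta_e, round(rel_u_pv_t(delta_e), 3))
    print(lod_loq(0.50))
```

For a blank of 0.50 meq/kg, `lod_loq(0.50)` returns values close to 0.07 and 0.235 meq/kg, in line with the coefficient 0.14 of Eq. (7) and the factor of five expanded uncertainties used for the LOQ.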

Table 2 Values and uncertainty components in the standard methods

Symbol of the source   Description                                     Value x          Standard uncertainty u(x)   Relative standard uncertainty u(x)/x
m                      Mass of the oil test portion                    5 g              0.00024 g                   0.00005
C_th                   Thiosulfate concentration                       0.01-0.1 N       0.0000070-0.000034 N        0.00070-0.00034
V_th - V_th-0          Volume of thiosulfate spent for oil titration   0.1-4 ml         0.014 ml                    0.14-0.003
PV                     Peroxide value                                  0.2-100 meq/kg   0.02-0.56 meq/kg            0.10-0.0056

LOQ can be predicted as PV corresponding to the ten standard uncertainties of the blank response or its five expanded uncertainties:

$$LOQ = 5\, U(PV_0) = 0.47\, PV_0, \text{ meq/kg oil}. \qquad (8)$$

It means $PV_t$ = 1.47 PV_0 or r = 1.47 and again, practically as for LOD, U(PV)/PV = 0.068.

Values r and U(PV) appropriate to LOD and LOQ are shown in Fig. 2 for the case of PV_0 = 0.50 meq/kg, for example, by the dotted and wavy lines, correspondingly. The limit values here are LOD = 0.14 × 0.50 = 0.07 meq/kg and LOQ = 0.47 × 0.50 = 0.23 meq/kg. For more pure solvents, i.e., for blanks with lower PV_0, LOD and LOQ are lower also.

It is clear from Eqs. (5), (6), (7), and (8) and Fig. 2 how the uncertainty of the result of PV determination depends on the solvent purity (blank PV_0), especially for fresh refined oils having PV ≤ 0.5 meq/kg.

Uncertainty of measurement results by the standard methods

According to the standard methods [4, 5], iodine, liberated after completing the reaction of the KI oxidation by hydroperoxides contained in the oil test portion, is titrated against sodium thiosulfate. The final result is calculated as usual in titrimetry:

$$PV = (1000/m) \times (V_{th} - V_{th-0}) \times C_{th}, \text{ meq/kg}, \qquad (9)$$

where $V_{th}$ and $V_{th-0}$ are the volumes of thiosulfate solution spent for titration of the test solution and of the blank, correspondingly [ml]; $C_{th}$ is the thiosulfate concentration [N].

The main PV uncertainty components in this way are shown in Table 2. Some details for their evaluation are discussed below in the same manner as previously for the new methods.

The final mass of an oil test portion (m = 5 g) is determined as described already: it does not depend on the method for PV determination.

Uncertainty in thiosulfate concentration $C_{th}$ in solutions prepared from a Titrisol ampoule (Merck, Germany) is calculated by analogy with the uncertainty in iodine concentration $C_{st}$ in solutions also prepared from such ampoules.

Volume of the thiosulfate solution spent for titration is the more complicated parameter. A 2-ml microburette (Bein Z.M., Israel) with 0.01-ml divisions and a drop size reduced to 0.008 ml was used for this titration [3]. The manufacturer specifies a calibration accuracy of ±0.01 ml. The standard deviation of the burette filling obtained was equal to the standard deviation of calibration - 0.006 ml. Since the volumes $V_{th}$ and $V_{th-0}$ are spent for titrations, they depend also on the standard deviation of end-point detection. Using "potato starch for iodometry" at the end of titration, as recommended in the standard [5], which produces a deep blue color in the presence of the iodonium ion, location of the end-point (when the blue color just disappears) is precise. It requires analytical experience, especially at low iodine concentrations (low PV), but for an experienced analyst the main deviation in the end-point detection arises due to the drop size of the burette. For the described burette the corresponding standard deviation is $0.008/\sqrt{3} = 0.005$ ml. The temperature uncertainty source is negligible here, and therefore the standard uncertainty of each volume spent for titration using such a burette is

$$u(V_{th}) = u(V_{th-0}) = (0.006^2 + 0.006^2 + 0.005^2)^{1/2} \approx 0.01 \text{ ml}.$$

Note, the use of a 50-ml burette, DIN, Class A (Duran, Germany) with 0.1-ml divisions, calibration accuracy ±0.05 ml and a drop size 0.05 ml, recommended in the standard [10], leads to the standard uncertainty $u(V_{th}) = u(V_{th-0}) = 0.05$ ml. So, such a burette is not suitable for fresh refined oils with PV ≤ 0.5 meq/kg, as far as the volume of the 0.01 N thiosulfate solution spent for titration in this case is $V_{th}$ ≤ 0.25 ml.

From comparison of the main components of PV uncertainty in Table 2, one can see that volume components are dominant. So, assuring normal distribution and coverage factor 2, the relative expanded uncertainty of the titration result is

$$U(PV)/PV = 2 \times [(u(V_{th}))^2 + (u(V_{th-0}))^2]^{1/2}/(V_{th} - V_{th-0}) = 0.028/(V_{th} - V_{th-0}). \qquad (10)$$

Dependence of U(PV)/PV on the volume spent for titration is shown in Fig. 3 in the range of ($V_{th} - V_{th-0}$) up to

Table 3 Results of PV_0 determination, meq/kg

Solvents          Proposed technique            Standard titration
                  Mean    Standard deviation    Mean    Standard deviation
Chloroform GR     0.063   0.002                 0.037   0.006
Isooctane SP      0.084   0.007                 0.080   0.010
Isooctane GR-p    0.114   0.002                 0.130   0.010
Isooctane GR      0.30    0.01                  0.26    0.02
Isooctane TR      0.45    0.02                  0.44    0.03

0.25 ml. It is clear that greater blank response (greater $V_{th-0}$) leads here to greater relative uncertainty U(PV)/PV, as in the proposed methods.

The volume ($V_{th} - V_{th-0}$) for oils with PV > 4 meq/kg at recommended m = 5 g and $C_{th}$ = 0.1 N is more than 0.2 ml. In this PV range the expanded uncertainty is U(PV) = 0.56 meq/kg according to Eqs. (9) and (10), and U(PV)/PV < 0.14. For oils with PV ≤ 4 meq/kg and $C_{th}$ = 0.01 N the volume is ($V_{th} - V_{th-0}$) ≤ 2 ml, U(PV) = 0.056 meq/kg, and U(PV)/PV ≥ 0.014. The uncertainties for fresh refined oils with PV ≤ 0.5 meq/kg are U(PV)/PV ≥ 0.11. For example, for PV = 0.2 meq/kg and ($V_{th} - V_{th-0}$) = 0.1 ml it is U(PV)/PV = 0.28, shown in Fig. 3 by the dotted line. So, uncertainties of PV determination by the standard methods are close to or less than the uncertainties by the proposed methods for oils with PV > 0.5 meq/kg. However, for fresh refined oils with PV ≤ 0.5 meq/kg the proposed methods are better, if the criterion is the uncertainty.

Fig. 3 Dependence of the expanded relative uncertainty U(PV)/PV on the volume spent for titration $V_{th} - V_{th-0}$ [ml]. The case of PV = 0.2 meq/kg is shown by the dotted line

LOD and LOQ prediction

For a blank the volume $V_{th-0}$ is relevant only in Eq. (10), so $U(PV_0)/PV_0 = 0.02/V_{th-0}$. The same holds in Eq. (9): at $C_{th}$ = 0.01 N the blank peroxide value is $PV_0 = 2 V_{th-0}$ meq/kg. Therefore, for the standard methods the limits are

$$LOD = 1.5\, U(PV_0) = 0.03\, PV_0/V_{th-0} = 0.06 \text{ meq/kg} \qquad (11)$$

and

$$LOQ = 5\, U(PV_0) = 0.10\, PV_0/V_{th-0} = 0.20 \text{ meq/kg}. \qquad (12)$$

Note, the titrimetric LOD and LOQ do not finally depend on the solvent purity, i.e., on the blank peroxide value PV_0, as the redox-potentiometric ones do. However, PV_0 for both standard and proposed methods are the minimal detectable peroxide values and their comparison can be helpful for understanding the possibilities of the methods.

Solvent analysis

Five kinds of solvents were analyzed. Results of the analysis - means from three replicates and corresponding standard deviations - are shown in Table 3. The solvents listed in the table are chloroform GR purchased from Baker (Holland); isooctane GR, isooctane for organic trace analysis (TR) and isooctane for spectroscopy (SP) purchased from Merck (Germany). Isooctane signed as GR-p is isooctane GR purified by us using ion-exchange resin with active -SO3 groups ("Amberlyst 15" from BDH, England).

The results of the analysis by proposed and standard methods are close enough (they differ by no more than one or two standard deviations). The only exception is chloroform: U(PV_0)/PV_0 = 0.5 for the standard method in this case (volume of 0.01 N thiosulfate solution spent for titration is 0.02 ml, i.e., 3 drops in all). Anyway, chloroform is the purest of all available solvents.

Taking into account the chloroform PV_0 obtained by the proposed method, minimal LOD and LOQ can be calculated for this method: LOD = 0.14 × 0.063 = 0.01 meq/kg and LOQ = 0.47 × 0.063 = 0.03 meq/kg.

Analysis of oils

To compare the methods at PV levels higher than PV_0, five kinds of oils were analyzed using the purified isooc-
272 I. Kuselman . E. Kardash-Strochkova . Y. I. Tur'yan

Table 4 Results of PV determination in oils

Oil Proposed technique Standard ti trati on Fisher's ratio F Student's ratio t PVr/PV s'
PV p' meq/kg Sp' meq/kg PV s ' meq/kg Ss. meq/kg

Canol a I 0.45 0.02 0.44 0.01 1.38 1.12 102.3


Soya 0.42 0.01 0.43 0.01 2.76 1.54 97.7
Sunflower 2.38 0.02 2.41 0.02 0.76 2.07 98.8
Canol a 2 7.45 0.14 7.49 0.15 0.84 0.42 99.5
Olive 35.3 1.6 36.0 1.0 2.58 0.81 98.1
Maize 69.8 2.1 68.9 1.9 1.25 0.74 101.3

tane: canol a, soya, sunflower, olive, and maize. Table 4 though were stored in refrigerator, their PV have increased
shows the average results PV p and PV s obtained by the after some months. Therefore, results of PV determinations
proposed and standard methods from n=5 replicates for shown in Table 4 are higher than those in [3].
each sample; the standard deviations for these replicates - Since LOD and LOQ should be determined experimen-
Sp and Ss' respectively; Fisher's ratio F=Sp2/Ss2; Student's tally [11], only LOQ=0.2 meq/kg predicted for the stan-
ratio t=IPVs-PV pl/[(S/+Sp2)15]O.S; and PV jPVs' %. dard methods is approved based on the work [3] data for
The critical value for the F-ratio is 6.39 at the 95% fresh refined canola. Test of other predictions requires an
level of confidence and the number of the degrees of additional experiment with very good refined fresh oils.
freedom n-l=4, For the t-ratio the critical value is 2.31
at the 95% level of confidence and the number of the de-
grees of freedom 2(n-l )=8. Conclusions
From the comparison of the F-data with the critical
value it follows that the differences between repeatabili- 1. Metrological parameters of the new methods for re-
ty of the results obtained by the standard titration and by dox-potentiometric PV determination in vegetable
the proposed method are insignificant (all F values are oils are fit for purposes (similar to demonstrated by
less than 6.39), i.e., repeatability of the proposed method the standard methods).
is sufficient. 2. Uncertainties of results of PV determination by the
The accuracy of these techniques is approximately the proposed methods are close to those by the standard
same, since the deviations of the average PV p results ob- methods or worse of them for oils with PV signifi-
tained by the proposed method from the average results cantly more than 0.5 meq/kg. However, for fresh re-
obtained by the standard titration PV s are insignificant in fined oils with PV~0.5 meq/kg the proposed methods
comparison with the repeatability deviations (all t values are better than standards ones, if the criterion is the
are less than 2.3\), i.e., accuracy of the proposed tech- measurement uncertainty.
nique is sufficient. Average ratio PV/PV s is 99.6%. 3. For proposed methods the LOD=O.Ol meq/kg and
Similar results were obtained in our previous work with LOQ=0.03 meq/kg are predicted, while for the stan-
the same oils dissolved in chloroform [3]. However, in ex- dard methods there are LOD=0.06 meq/kg and
periment described in [3] the oils were more fresh, and LOQ=0.20 meq/kg.
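The F- and t-comparisons described for Table 4 can be sketched in a few lines. This is a minimal illustration, not the authors' calculation: the function and variable names are assumptions, the critical values are simply those quoted in the text, and the inputs are the rounded values printed in Table 4, so recomputed ratios differ slightly from the printed ones (which were presumably obtained from unrounded data).

```python
import math

# Critical values quoted in the text (95 % level of confidence).
F_CRIT = 6.39  # degrees of freedom n - 1 = 4
T_CRIT = 2.31  # degrees of freedom 2(n - 1) = 8


def fisher_ratio(s_p, s_s):
    """F = S_p^2 / S_s^2, as defined in the text."""
    return s_p ** 2 / s_s ** 2


def student_ratio(pv_p, s_p, pv_s, s_s, n=5):
    """t = |PV_s - PV_p| / sqrt((S_s^2 + S_p^2) / n)."""
    return abs(pv_s - pv_p) / math.sqrt((s_s ** 2 + s_p ** 2) / n)


# Recompute one row (Canola 2) from the rounded table entries.
f_canola2 = fisher_ratio(0.14, 0.15)
t_canola2 = student_ratio(7.45, 0.14, 7.49, 0.15)

# (F, t, PV_p/PV_s in %) exactly as printed in Table 4.
reported = {
    "Canola 1": (1.38, 1.12, 102.3),
    "Soya": (2.76, 1.54, 97.7),
    "Sunflower": (0.76, 2.07, 98.8),
    "Canola 2": (0.84, 0.42, 99.5),
    "Olive": (2.58, 0.81, 98.1),
    "Maize": (1.25, 0.74, 101.3),
}

all_f_insignificant = all(f < F_CRIT for f, _, _ in reported.values())
all_t_insignificant = all(t < T_CRIT for _, t, _ in reported.values())
mean_ratio = sum(r for _, _, r in reported.values()) / len(reported)
```

For the Canola 2 row the recomputed ratios come out close to, but not identical with, the printed 0.84 and 0.42, which is the expected effect of working from rounded standard deviations.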

References
1. EURACHEM (1998) EURACHEM Guide. The fitness for purpose of analytical methods. A laboratory guide to method validation and related topics, 1st edn. Teddington, UK
2. Kuselman I, Sherman F (1999) Accred Qual Assur 4:124-128
3. Kardash-Strochkova E, Tur'yan Ya, Kuselman I (2001) Talanta 54:411-416
4. AOCS (1996) AOCS official methods, vol II, method Cd 8-53: Peroxide value. Acetic acid-chloroform method. AOCS, Champaign, IL, USA
5. AOCS (1996) AOCS official methods, vol II, method Cd 8b-90: Peroxide value. Acetic acid-isooctane method. AOCS, Champaign, IL, USA
6. Finne G, Ikins WG, Williams J Jr, Welborn JL (1998) Inside Lab Manage 2:24-26
7. Israel Standard No 216 (1994) Edible vegetable oils. Tel Aviv, Israel
8. EURACHEM/CITAC (2000) EURACHEM/CITAC Guide. Quantifying uncertainty in analytical measurement, 2nd edn. Teddington, UK
9. ISO (1999) ISO/DIS 8655-6 Draft. Piston-operated volumetric apparatus. Part 6: Gravimetric test methods. Geneva, Switzerland
10. AOCS (1996) AOCS official methods, vol II, method Ja 8-87: Peroxide value. AOCS, Champaign, IL, USA
11. AOAC (1997) AOAC peer-verified methods program. Manual on policies and procedures. AOAC, Gaithersburg, MD
Accred Qual Assur (1999) 4:504-510
© Springer-Verlag 1999

Thomas Anglov
Inge M. Petersen
Jesper Kristiansen

Uncertainty of nitrogen determination by the Kjeldahl method

Abstract The uncertainty of the Kjeldahl method for determination of nitrogen in insulin was evaluated according to the procedure described in the Guide to the Expression of Uncertainty in Measurement. The relative standard uncertainty of the method was found to be 0.19%, compared to the relative intermediate precision experimentally found to be 0.085%. The uncertainty components were organized in Tables, which allowed an easy overview and evaluation. The largest contribution to the uncertainty came from volumetric equipment. Systematic uncertainty budgets such as the design presented here facilitate the uncertainty evaluation process and make it easier to compare uncertainty evaluations performed by different analysts.

Key words Uncertainty budget · Uncertainty evaluation · Uncertainty component · Traceability · Reference standards

J. Kristiansen (✉)
The National Institute of Occupational Health, Lersø Parkallé 105, DK-2100 Copenhagen, Denmark
e-mail: jkr@ami.dk
Tel.: +45-39-165200
Fax: +45-39-165201

T. Anglov, I.M. Petersen
Department of Metrology, Novo Nordisk A/S, Krogshøjvej 51, DK-2880 Bagsværd, Denmark

Introduction

Determination of nitrogen content plays a key role in assigning values to insulin reference materials. The reference materials in question serve as reference standards when measuring insulin in drug products. Thus, the Kjeldahl nitrogen determination is a crucial link in the traceability chain. The uncertainty budget for the Kjeldahl method published in this paper has three objectives: firstly, to estimate the uncertainty of results obtained by the Kjeldahl method; secondly, to identify steps in the analytical procedure that may be targets for improvement; and thirdly, to contribute a generally applicable procedure for evaluating an uncertainty budget for a chemical analytical method to the current literature. The need for such schemes or procedures is urgent as accredited laboratories in the near future will be required to state their uncertainty of measurement [1]. The uncertainty budget published in this paper is based on existing guidelines [2, 3].

Methods

Kjeldahl method

The nitrogen content of dry insulin was determined by the Kjeldahl method, which consists of three steps: digestion, distillation and titration. In brief, approximately 50 mg of the sample was weighed accurately and transferred to a digestion test tube. Concentrated sulphuric acid was added and the mixture was heated to 390°C for 4 h (digestion). The tube with the digested sample was placed in the Kjeldahl apparatus, sodium hydroxide and steam were added, and ammonia was distilled off. The ammonia-rich steam was condensed in the receiving flask (distillation). The content of ammonia in the receiving flask was determined by end-point titration to pH 4.5 with 0.1 mol/l hydrochloric acid (titration). The normality of the acid (N_HCl, in mol/l) was determined by titration of tris-(hydroxymethyl)-aminomethane (Tris). Blanks consisted of empty digestion tubes treated as samples. Samples were measured in duplicate and blanks in triplicate.
Let a (mg) denote the amount of sample. If b (ml) and c (ml) denote the volume of hydrochloric acid used for titration of the blanks and the sample, respectively, then the relative nitrogen content of the sample (N_total) can be calculated as:

$$N_{total} = \frac{14.01\ \mathrm{g/mol} \cdot (c - b) \cdot N_{HCl}}{a} \qquad (1)$$

The method was validated using two insulin drugs. The relative standard deviation under intermediate precision conditions was found to be 0.085%.

Uncertainty budgets

The uncertainty of N_total was estimated by combining standard uncertainties of N_HCl, a, b and c. Uncertainties were combined by using the rule of "error propagation" [3, 4]. In general, when a result of measurement (y) is determined from other quantities, the relationship between y and the values of these quantities (input estimates) can be expressed by a function, f [2]:

$$y = f(x_1, x_2, \ldots, x_i, \ldots, x_N) \qquad (2)$$

where x_1 ... x_i ... x_N represent N input estimates. The uncertainty of y (u(y)) is related to the uncertainty of the input estimates by the equation:

$$u(y)^2 = \sum_{i=1}^{N} \left(\frac{\partial f}{\partial x_i}\right)^2 u(x_i)^2 \qquad (3)$$

where u(x_i)^2 is the standard uncertainty variance and u(x_i) the standard uncertainty of the uncertainty component number i, and where ∂f/∂x_i is the partial derivative of the function, f, with respect to the uncertainty component x_i. The partial derivative is often called the sensitivity coefficient because it describes how the measurement result varies with changes in the value of the input estimates [2]. It should be noted that Eq. (3) is an approximation that is valid if there are no correlated input estimates [2]. The result of measurement of the Kjeldahl method is N_total as given by Eq. (1), and the expression for N_total is the function f in Eq. (2). Thus, the following expression for the uncertainty of N_total is obtained:

$$\left(\frac{u(N_{total})}{N_{total}}\right)^2 = \left(\frac{u(c)}{\sqrt{2}\,(c-b)}\right)^2 + \left(\frac{u(b)}{\sqrt{3}\,(c-b)}\right)^2 + \left(\frac{u(N_{HCl})}{N_{HCl}}\right)^2 + \left(\frac{u(a)}{\sqrt{2}\,a}\right)^2 \qquad (4a)$$

The factors 1/√2 and 1/√3 account for the number of replicate determinations (2 and 3, respectively). The expression may be rearranged in order to emphasize the sensitivity coefficient of each uncertainty component:

$$u(N_{total})^2 = \left(\frac{N_{total}}{\sqrt{2}\,(c-b)}\right)^2 u(c)^2 + \left(\frac{N_{total}}{\sqrt{3}\,(c-b)}\right)^2 u(b)^2 + \left(\frac{N_{total}}{\sqrt{2}\,a}\right)^2 u(a)^2 + \left(\frac{N_{total}}{N_{HCl}}\right)^2 u(N_{HCl})^2 \qquad (4b)$$

The standard uncertainties u(N_HCl), u(a), u(b) and u(c) are themselves composed of various uncertainty components. These uncertainties were likewise obtained from uncertainty budgets (Appendix B1-4).

Relative uncertainty variance contributions

In this paper the relative uncertainty variance contributions are used to illustrate the relative significance of different uncertainty components [5]. The relative contribution (r_i) of an uncertainty component x_i to the combined uncertainty is defined here as the standard uncertainty variance of the component multiplied by its squared sensitivity coefficient and divided by the combined standard uncertainty variance [5]:

$$r_i = \frac{\left(\frac{\partial f}{\partial x_i}\right)^2 u(x_i)^2}{u(y)^2} \qquad (5)$$

where x_i = N_HCl, a, b and c. It follows from Eqs. (3) and (5) that Σ r_i = 1 (assuming the absence of correlated input estimates).

Results

Uncertainty of N_total

The values of N_HCl, a, b, c and N_total as well as the respective standard uncertainties u(N_HCl), u(a), u(b), u(c) and u(N_total) and the corresponding sensitivity coefficients are given in Table 1. The combined standard uncertainty u(N_total) was estimated to be 0.00023, corresponding to a relative standard uncertainty of 0.19%. Using a coverage factor k = 2 the result of the measurement should be reported as:

N_total ± U(N_total) = 0.1233 (±0.00046)

where U(N_total) is the expanded standard uncertainty. Details on the uncertainty budgets for u(N_HCl), u(a), u(b) and u(c) are given in Appendix B.
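As a worked check of Eqs. (1) and (4b), the following sketch recomputes N_total, its combined standard uncertainty and the expanded uncertainty from the values listed in Table 1. The function and variable names are illustrative assumptions; only the numbers and the two formulas are taken from the text.

```python
import math

MOLAR_MASS_N = 14.01  # g/mol, the factor k in Eq. (1)


def n_total(a_mg, b_ml, c_ml, n_hcl):
    """Relative nitrogen content according to Eq. (1)."""
    return MOLAR_MASS_N * (c_ml - b_ml) * n_hcl / a_mg


def u_n_total(a, b, c, n_hcl, u_a, u_b, u_c, u_n_hcl):
    """Combined standard uncertainty according to Eq. (4b)."""
    nt = n_total(a, b, c, n_hcl)
    variance = (
        (nt / (math.sqrt(2) * (c - b)) * u_c) ** 2    # sample titration, duplicate
        + (nt / (math.sqrt(3) * (c - b)) * u_b) ** 2  # blank titration, triplicate
        + (nt / (math.sqrt(2) * a) * u_a) ** 2        # sample weighing, duplicate
        + (nt / n_hcl * u_n_hcl) ** 2                 # normality of the acid
    )
    return math.sqrt(variance)


# Values and standard uncertainties from Table 1.
nt = n_total(a_mg=50, b_ml=0.1, c_ml=4.5, n_hcl=0.1)
u = u_n_total(50, 0.1, 4.5, 0.1, u_a=0.059, u_b=4.0e-3, u_c=7.2e-3, u_n_hcl=1.1e-4)
U = 2 * u                       # expanded uncertainty, coverage factor k = 2
relative_percent = 100 * u / nt

print(f"N_total = {nt:.5f}, u = {u:.5f}, U = {U:.5f}, relative = {relative_percent:.2f} %")
```

Rounded to the precision used in the paper, this sketch gives the same figures as the text: 0.12329 for N_total, 0.00023 for u(N_total), 0.00046 for U(N_total) and 0.19% for the relative standard uncertainty.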

Table 1 Values of uncertainty components, their standard uncertainty and sensitivity coefficient

Component | Symbol | Value | Standard uncertainty | Sensitivity coefficient | Reference
Normality of hydrochloric acid | N_HCl | 0.1 mol/l | 1.1×10^-4 mol/l | N_total/N_HCl = 1.23 l/mol | Appendix B1
Amount of sample | a | 50 mg | 0.059 mg | N_total/(√2·a) = 1.74×10^-3 mg^-1 | Appendix B2
Volume of titrant used on blank sample | b | 0.1 ml | 4.0×10^-3 ml | N_total/(√3·(c-b)) = 1.62×10^-2 ml^-1 | Appendix B3
Volume of titrant used on the sample | c | 4.5 ml | 7.2×10^-3 ml | N_total/(√2·(c-b)) = 1.98×10^-2 ml^-1 | Appendix B4
Result of measurement | N_total | 0.12329 | 0.00023 | | Eq. (4b)

Relative contributions to the uncertainty

The relative contributions (r_i) from u(N_HCl), u(a), u(b) and u(c) to the combined standard uncertainty variance u(N_total)^2 are shown in Fig. 1. The two largest contributions come from u(N_HCl) and u(c). Both contribute 35-38% to the combined standard uncertainty variance. The uncertainty budget for u(N_HCl) (Appendix B1) shows that the uncertainty of this component is composed mainly of uncertainty contributions from the temperature of the titrant, the weighing of the Tris base and the volume of the titrant. The primary contributions to u(c) (see Appendix B4) come from the uncertainty of digesting the sample (which consists of the uncertainty of the amount of sample transferred to the digestion tube) and from the uncertainty of the volume of the titrant.
In Fig. 2 the relative contributions to the uncertainty variance are summarized for various types of analytical steps and equipment used in the Kjeldahl method. The contribution from volumetric equipment (Fig. 2) includes the contributions from uncertainty of the volume of the titrant and from uncertainty of the temperature of the titrant (Appendix B1, 3 and 4). Weighing (Fig. 2) includes first and second weighing of both Tris base and the sample (Appendix B1 and 4). Digestion (Fig. 2) includes all uncertainty components denoted in digestion in Appendix B3 and 4. Lastly, Tris-purity (Fig. 2) denotes the uncertainty component associated with the purity of the Tris base evaluated in Appendix B1. From Fig. 2 it can be seen that the largest uncertainty contribution comes from the use of volumetric equipment, i.e. burettes used for titration of hydrochloric acid and for titration of samples and blanks. This uncertainty component is composed of the precision of titration and of the uncertainty of the temperature. The latter factor influences the volume of titrant used for titration. The second largest contribution comes from weighing (of Tris base and of the sample).

Fig. 1 Relative contributions from N_HCl, a, b and c to the combined standard uncertainty variance u(N_total)^2

Fig. 2 Relative contributions from volumetric equipment, weighing, digestion and the purity of the Tris base to the combined standard uncertainty variance
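The relative variance contributions plotted in Fig. 1 can be approximated directly from Table 1 using the definition in Eq. (5). This is a sketch under the assumption that the rounded sensitivity coefficients and standard uncertainties printed in Table 1 are used as inputs; the dictionary layout and names are illustrative only.

```python
# (sensitivity coefficient, standard uncertainty) as printed in Table 1.
table1 = {
    "N_HCl": (1.23, 1.1e-4),    # l/mol, mol/l
    "a": (1.74e-3, 0.059),      # 1/mg, mg
    "b": (1.62e-2, 4.0e-3),     # 1/ml, ml
    "c": (1.98e-2, 7.2e-3),     # 1/ml, ml
}

# Variance contribution of each component: (sensitivity * u)^2.
variances = {name: (coeff * u) ** 2 for name, (coeff, u) in table1.items()}
combined_variance = sum(variances.values())

# Relative contributions r_i according to Eq. (5); they sum to 1 by construction.
r = {name: var / combined_variance for name, var in variances.items()}

for name, share in sorted(r.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:6s} {100 * share:5.1f} %")
```

With these rounded inputs the two largest shares belong to c and N_HCl, in line with the text, followed by a and then b; the exact percentages depend on the rounding of the table entries.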

Discussion

The uncertainty budget presented in Table 1 indicates that a typical result obtained by the Kjeldahl method (as described in this paper) will have a relative standard uncertainty of around 0.19%, which is approximately twice as large as the experimentally derived relative standard deviation estimated under intermediate precision conditions. This is not surprising as some of the uncertainty components included in the uncertainty budget are difficult to evaluate experimentally, e.g. purity of the Tris base, accuracy of the pH meter, etc.
The uncertainty evaluated for the Kjeldahl method should be propagated to the uncertainty of the insulin reference standards as well as to the analytical methods that are calibrated with these standards. Failure to consider the uncertainty of the reference standards or of the analytical method, or to base the uncertainty on less encompassing estimates (e.g. intermediate precision), may lead to an excess number of nonconforming results when evaluating the fulfilment of acceptance criteria for insulin drug products.
The present uncertainty budget demonstrates that the Kjeldahl method may be improved. The most promising target for improvement is the titration. According to the analysis of the uncertainty contributions, this can be accomplished by improving the temperature control during titration or by improving the accuracy of the volume of titrant. The latter goal may be achieved by using more accurate titration equipment, or, at least partly, by increasing the number of replicate determinations. Increasing the number of replicates will improve the precision, but will not influence the trueness of the volumetric equipment, nor does it compensate for long-term fluctuations that may affect the equipment.
The uncertainty budget in this study was organized to make it easy to review the evaluation of individual uncertainty components. The uncertainty budget is organized in three evaluation steps: In the first step, the uncertainty is evaluated based on the input estimates that are used directly in the calculation of N_total (i.e. N_HCl, a, b and c in Eqs. 1 and 4b) (results in Table 1). In the second step, the uncertainty of each of the input estimates (i.e. N_HCl, a, b and c) is evaluated and calculated. Correction factors are applied to correct for effects such as the influence of temperature on the volume, inaccuracy of the pH meter, etc. (equations and results in Appendix B). Where necessary, a third step of uncertainty evaluation was applied. An example is air bubbles in the burette, water in the Tris base and other effects that are evaluated and included in the uncertainty of q_contr (Appendix B1). Generally, evaluations in the third step should only be carried out if the uncertainty components contribute significantly to the combined uncertainty.
It was not found necessary to consider the correlation between input estimates because the contribution from covariances was judged to be minor in the uncertainty budget presented above. Examples are c and b in Eq. (1), which are determined using the same burette. This means that uncertainties of c and b originating from, e.g. long-term variations of volume (delivered by the burette), wear of the burette, etc. are positively correlated. In other words, the uncertainty of the difference (c - b) tends to be unaffected by these effects. The same reasoning can be used for m_1 and m_2, which are determined for calculation of a (Appendix B2).
The structure of this uncertainty budget for the Kjeldahl method may be used as a paradigm for other analytical methods. However, it must be understood that an uncertainty budget is designed for a specific method, and is therefore only valid for a set of specified analytical conditions (including the analytical method, type of sample and its concentration level). For highly automated (i.e. computerized) analytical methods, the result of measurement is often calculated by specialized software, and this makes the interdependency of various analytical steps less obvious. For such systems other approaches for making uncertainty budgets must be applied [5, 6].

Conclusions

A structured evaluation of uncertainty components was applied for the Kjeldahl method for nitrogen determination. The design offers a systematic approach for making uncertainty budgets for analytical chemical methods, and it may facilitate comparison of uncertainty budgets made by different analysts or laboratories. The evaluation of the contribution from individual or grouped uncertainty components made it possible to suggest specific improvements of the analytical method.

Appendix A: Definitions, abbreviations and equations

1. Definitions and abbreviations

Terms and definitions used in accordance with GUM and VIM [2, 7].

a: mg sample used in the measurement
b: ml titrant (hydrochloric acid) used on the blank sample (average of three determinations)
c: ml titrant (hydrochloric acid) used on the sample
k: Correction factor 14.01 g/mol containing the atomic weight of nitrogen
MPE: Maximum permissible error

Table B1 Estimation of the uncertainty of N_HCl

No. | Input quantity | Symbol (x_i) | Value of input estimate | Stated or evaluated uncertainty | Type (A/B) and type of distribution | Standard uncertainty u(x_i) | Sensitivity coefficient ∂f/∂x_i | Contribution to the standard uncertainty of the output estimate u(x_i)·∂f/∂x_i | Reference
1 | Various contributions^a | q_contr | | 1×10^-4 | B, rectangular | (1×10^-4)/√3 = 5.8×10^-5 | N_HCl/q_contr = 0.1 mol/l | 5.8×10^-6 mol/l | Own estimate
2 | MPE of the pH-meter | q_pH | | ≈0^b | | | | | Instrument specification, own estimate
3 | Purity of Tris | q_pur | 0.9989 (99.89%) | 0.0005^c | B, rectangular | 0.0005/√3 = 2.9×10^-4 | N_HCl/q_pur = 0.1 mol/l | 2.9×10^-5 mol/l | Supplier certificate
4 | First weighing of Tris base | m_1 | 100 mg | 0.1 mg | B, rectangular | 0.1 mg/√3 = 5.8×10^-2 mg | N_HCl/(m_1-m_2) = 1.02×10^-3 mol/(mg·l) | 5.9×10^-5 mol/l | Instrument specification
5 | Second weighing of Tris base | m_2 | 2 mg | 0.02 mg | B, rectangular | 0.02 mg/√3 = 1.2×10^-2 mg | N_HCl/(m_2-m_1) = 1.02×10^-3 mol/(mg·l) | 1.2×10^-5 mol/l | Instrument specification
6 | Volume of titrant | v | 8.06 ml | 0.004 ml^d | A, normal | 0.004 ml | N_HCl/v = 0.012 mol/(l·ml) | 5.0×10^-5 mol/l | Instrument qualification report
7 | Temperature of titrant | q_temp | | 0.0012^e | B, rectangular | 0.0012/√3 = 6.9×10^-4 | N_HCl/q_temp = 0.1 mol/l | 6.9×10^-5 mol/l | Own estimate. Laboratory temp. 15-25°C
8 | Molar mass of Tris base | M_Tris | 121.14 g/mol | ≈0 g/mol^f | | | | | Own estimate

u(N_HCl) = √[(5.8×10^-6)^2 + (2.9×10^-5)^2 + (5.9×10^-5)^2 + (1.2×10^-5)^2 + (5.0×10^-5)^2 + (6.9×10^-5)^2] = 1.1×10^-4 mol/l

a Contributions from: the water content of the silica gel; lack of complete drying of the Tris base; variation in the time of drying; variation in the time of dissolution of the Tris base; the hygroscopy of the Tris base; air bubbles in the burette. Combined uncertainty estimated to be 0.01 mg Tris base or 0.01% of approximately 100 mg Tris base
b The pH-meter specifications stipulate an MPE of 0.02 pH units. The form of the titration curve indicates that an uncertainty of 0.02 pH corresponds to a negligible uncertainty in the volume of titrant. Thus, the value of this uncertainty component is not included in the uncertainty budget
c The stipulated purity of (hydroxymethyl)-aminomethane (Tris) is 99.89% with an uncertainty of 0.05%
d A standard deviation of 0.004 ml was found experimentally
e Water density at 15°C is 0.99913 g/ml, at 20°C 0.99823 g/ml and at 25°C 0.99707 g/ml. It is assumed that the temperature in the laboratory (and the temperature of the acid) on average is 20°C, but in the worst case may vary ±5°C. The largest change in water density from 20 to 25°C is 0.00116 g/ml, or 0.12%
f The uncertainty of the molar mass is assumed to give a negligible contribution to the combined uncertainty
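The arithmetic behind Table B1 can be retraced with a short sketch: each type B half-width is converted to a standard uncertainty by dividing by √3 (rectangular distribution), multiplied by its sensitivity coefficient, and the contributions are combined as a root sum of squares. The helper name `rectangular` and the dictionary keys are assumptions made for illustration; the numerical inputs are those of Table B1 and its footnote e.

```python
import math


def rectangular(half_width):
    """Standard uncertainty of a rectangular distribution with the given half-width."""
    return half_width / math.sqrt(3)


N_HCL = 0.1                    # mol/l
M1, M2, V = 100.0, 2.0, 8.06   # mg, mg, ml (values from Table B1)

# Contribution of each input = standard uncertainty * sensitivity coefficient (mol/l).
contributions = {
    "q_contr": rectangular(1e-4) * N_HCL,
    "q_pur": rectangular(0.0005) * N_HCL,
    "m1": rectangular(0.1) * N_HCL / (M1 - M2),
    "m2": rectangular(0.02) * N_HCL / (M1 - M2),
    "v": 0.004 * N_HCL / V,    # type A, already a standard deviation
    "q_temp": rectangular(0.0012) * N_HCL,
}

u_n_hcl = math.sqrt(sum(c ** 2 for c in contributions.values()))

# Footnote e: relative change of water density between 20 and 25 degrees C.
relative_density_change = (0.99823 - 0.99707) / 0.99823

print(f"u(N_HCl) = {u_n_hcl:.2e} mol/l")
```

The result agrees with the 1.1×10^-4 mol/l stated in the table, and the largest single contribution in this sketch is the titrant temperature, followed by the first weighing of the Tris base and the titrant volume.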

Table B2 Estimation of the uncertainty of a

No. | Input quantity | Symbol (x_i) | Value of input estimate | Stated or evaluated uncertainty | Type (A/B) and type of distribution | Standard uncertainty u(x_i) | Contribution to the standard uncertainty of the output estimate u(x_i)·∂f/∂x_i | Reference
1 | First weighing of sample | m_1 | 52 mg | 0.1 mg | B, rectangular | 0.1 mg/√3 | 5.8×10^-2 mg | Instrument specifications
2 | Second weighing of sample | m_2 | 2 mg | 0.02 mg | B, rectangular | 0.02 mg/√3 | 1.2×10^-2 mg | Instrument specifications

u(a) = √[(5.8×10^-2)^2 + (1.2×10^-2)^2] = 0.059 mg



Table B3 Estimation of the uncertainty of b

No. | Input quantity | Symbol (x_i) | Value of input estimate | Stated or evaluated uncertainty | Type (A/B) and type of distribution | Standard uncertainty u(x_i) | Sensitivity coefficient ∂f/∂x_i | Contribution to the standard uncertainty of the output estimate u(x_i)·∂f/∂x_i | Reference
1 | Digestion^a | q_digest | N/A | N/A | N/A | N/A | N/A | | Digestion not performed
2 | Distillation^b | q_distil | | | | | | | Own estimation
3 | Volume of titrant | v | 0.1 ml | 0.004 ml | A, normal | 0.004 ml^c | b/v = 1 | 4×10^-3 ml | Instrument qualification report
4 | Temperature of titrant | q_temp | | 1.2×10^-3 ^d | B, rectangular | | b/q_temp = 0.1 ml | 6.9×10^-5 ml | Own estimation. Laboratory temp. 15-25°C
5 | MPE of the pH meter^e | q_pH | | | | | | | Instrument specifications. Own judgement

a No Kjeldahl mixture or acid was used. The digestion was not performed
b Addition of water and NaOH; distillation time; leaks; incomplete transmission of NH3 to the receiving vessel; temperature of the vapour. All components are assumed to be negligible
c A standard deviation of 0.004 ml was found experimentally
d Water density at 15°C is 0.99913 g/ml, at 20°C 0.99823 g/ml and at 25°C 0.99707 g/ml. It is assumed that the temperature in the laboratory (and the temperature of the acid) on average is 20°C, but in the worst case may vary ±5°C. The largest change in water density from 20 to 25°C is 0.00116 g/ml, or 0.12%
e The pH-meter specifications stipulate an MPE of 0.02 pH units. The form of the titration curve indicates that an uncertainty of 0.02 pH corresponds to a negligible uncertainty in the volume of titrant. Thus, the value of this uncertainty component is not included in the uncertainty budget

Table B4 Estimation of the uncertainty of c

No. | Input quantity | Symbol (x_i) | Value of input estimate | Stated or evaluated uncertainty | Type (A/B) and type of distribution | Standard uncertainty u(x_i) | Sensitivity coefficient ∂f/∂x_i | Contribution to the standard uncertainty of the output estimate u(x_i)·∂f/∂x_i | Reference
1 | Digestion^a | q_digest | | 0.002 | B, rectangular | 0.002/√3 = 1.2×10^-3 | c/q_digest = 4.5 ml | 5.2×10^-3 ml | Own estimation
2 | Distillation^b | q_distil | | | | | | | Own estimation
3 | Volume of titrant | v | 4.5 ml | 0.004 ml | A, normal | 0.004 ml^c | c/v = 1 | 4.0×10^-3 ml | Instrument qualification report
4 | Temperature of titrant | q_temp | | 0.0012 | B, rectangular | 0.0012/√3 = 6.9×10^-4 | c/q_temp = 4.5 ml | 3.0×10^-3 ml | Own estimation. Laboratory temp. 15-25°C
5 | MPE of the pH meter | q_pH | | ≈0 ml^d | | | | | Instrument specifications. Own judgment

u(c) = √[(5.2×10^-3)^2 + (4×10^-3)^2 + (3.0×10^-3)^2] = 7.22×10^-3 ml

a The uncertainty components amount of catalyst, amount of Kjeldahl mixture and H2SO4, block temperature, digestion time and boiling are assumed not to contribute significantly to the uncertainty. Transfer of sample to the digestion vessel is assumed to contribute 0.1 mg or 0.2%
b Addition of water and NaOH; distillation time; leaks; incomplete transmission of NH3 to the receiving vessel; temperature of the vapour. All components are assumed to be negligible
c A standard deviation of 0.004 ml was found experimentally
d The pH-meter specifications stipulate an MPE of 0.02 pH units. The form of the titration curve indicates that an uncertainty of 0.02 pH corresponds to a negligible uncertainty in the volume of titrant. Thus, the value of this uncertainty component is not included in the uncertainty budget
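Tables B3 and B4 use the same relative uncertainties for the correction factors, but the resulting absolute contributions scale with the titrant volume (0.1 ml for the blank, 4.5 ml for the sample). The sketch below illustrates this; the function name `volume_uncertainty` and its parameters are assumptions, and because it carries unrounded intermediate values it gives about 7.3×10^-3 ml for u(c) rather than the printed 7.22×10^-3 ml, which uses the rounded temperature term 3.0×10^-3 ml.

```python
import math


def volume_uncertainty(volume_ml, s_titration_ml, rel_half_widths):
    """Combine the titration standard deviation with rectangular relative
    uncertainties of the correction factors, each scaled by the volume."""
    terms = [s_titration_ml]
    terms += [volume_ml * hw / math.sqrt(3) for hw in rel_half_widths]
    return math.sqrt(sum(t ** 2 for t in terms))


# Blank (Table B3): only the titrant temperature factor is considered.
u_b = volume_uncertainty(0.1, 0.004, rel_half_widths=[0.0012])

# Sample (Table B4): digestion (sample transfer) and titrant temperature.
u_c = volume_uncertainty(4.5, 0.004, rel_half_widths=[0.002, 0.0012])

print(f"u(b) = {u_b:.4f} ml, u(c) = {u_c:.4f} ml")
```

For the blank, the temperature term is negligible next to the titration precision, whereas for the sample the digestion and temperature terms together exceed it, which matches the ranking of contributions in Table B4.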

N: Number of quantities or uncertainty components
N_HCl: The normality of the hydrochloric acid
N_total: Content of nitrogen in the sample (average of 2 samples)
q_i: Correction factors
u(x_i): Standard uncertainty of the input estimate x_i
U(y): Expanded uncertainty of y
u(y): The combined standard uncertainty of y
x_i: Input estimates
y: Output estimate (the result of a measurement)
∂f/∂x_i: Sensitivity coefficient for x_i

2. Equation for the standard uncertainty

In addition to Eq. (3) the following equations evaluating the standard uncertainty were applied: If the input estimates x_i in Eq. (2) are related to y only by multiplications and divisions, Eq. (3) can be simplified to [2, 4]:

$$\left(\frac{u(y)}{y}\right)^2 = \left(\frac{u(x_1)}{x_1}\right)^2 + \left(\frac{u(x_2)}{x_2}\right)^2 + \ldots + \left(\frac{u(x_N)}{x_N}\right)^2 \qquad (A1)$$

with sensitivity coefficients given by:

$$\frac{\partial f}{\partial x_i} = \frac{y}{x_i} \qquad (A2)$$

If the input estimates x_i are related to y by additions and subtractions, Eq. (3) becomes:

$$u(y)^2 = u(x_1)^2 + u(x_2)^2 + \ldots + u(x_N)^2 \qquad (A3)$$

with all sensitivity coefficients equal to 1.

Appendix B

1. Estimation of the uncertainty of N_HCl

N_HCl is calculated from the amount of Tris base used, the molar mass of Tris, the volume of titrant, and correction factors for the water content of Tris, the accuracy of the pH meter, the temperature of the titrant, and the purity of Tris (Table B1):

2. Estimation of the uncertainty of a

The input estimate a is obtained from weighing the sample (m_2) and sample cup alone (m_1), i.e. a = f(x_i) = m_2 - m_1. Thus, the standard uncertainty of a is given by the equation u_a^2 = u_m1^2 + u_m2^2 (Table B2).

3. Estimation of the uncertainty of b

In evaluating the uncertainty of b it was assumed that the volume of titrant used for titration of a blank depends on factors associated with digestion, distillation, pH measurement and the assessment of the volume of titrant used. The combination of uncertainty components was based on the following relation between the volume (b) and the various factors:

where v is the estimated volume of titrant read from the burette and all the correction factors (q_i) are equal to 1 (Table B3).

4. Estimation of the uncertainty of c

The uncertainty components and equations used are the same as for b (Appendix B3), i.e.

where v is the estimated volume of titrant read from the burette and the correction factors (q_i) are all equal to 1 (Table B4).
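The sensitivity coefficients in Eq. (3) and Eq. (A2) can be cross-checked numerically. The sketch below estimates the partial derivatives of Eq. (1) by central finite differences and compares them with the analytical forms; the helper `sensitivity`, its `rel_step` parameter and the argument names are assumptions introduced for illustration. The replicate factors 1/√2 and 1/√3 of Eq. (4b) are deliberately left out, since they describe averaging over replicates rather than the derivative itself.

```python
def n_total(a, b, c, n_hcl):
    """Eq. (1): relative nitrogen content (a in mg, b and c in ml, n_hcl in mol/l)."""
    return 14.01 * (c - b) * n_hcl / a


def sensitivity(f, args, name, rel_step=1e-6):
    """Central finite-difference estimate of the partial derivative of f
    with respect to the input called `name`."""
    x = args[name]
    h = abs(x) * rel_step
    upper = dict(args, **{name: x + h})
    lower = dict(args, **{name: x - h})
    return (f(**upper) - f(**lower)) / (2 * h)


inputs = {"a": 50.0, "b": 0.1, "c": 4.5, "n_hcl": 0.1}
y = n_total(**inputs)
numeric = {name: sensitivity(n_total, inputs, name) for name in inputs}

# Analytical partial derivatives of Eq. (1) for comparison.
analytic = {
    "a": -y / inputs["a"],
    "b": -y / (inputs["c"] - inputs["b"]),
    "c": y / (inputs["c"] - inputs["b"]),
    "n_hcl": y / inputs["n_hcl"],
}

for name in inputs:
    print(f"{name:6s} numeric = {numeric[name]: .6e}  analytic = {analytic[name]: .6e}")
```

Only the magnitudes of the coefficients matter in Eq. (3), because each one is squared; the signs are kept here simply to show which inputs raise and which lower the result.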

References

1. International Standard ISO/IEC 17025 (1998) General requirements for the competence of testing and calibration laboratories (Draft). International Organization for Standardization, Geneva
2. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement. International Organization for Standardization, Geneva
3. EURACHEM (1995) Quantifying uncertainty in analytical measurement, 1st edn. EURACHEM
4. Miller JC, Miller JN (1993) Statistics for analytical chemistry, 3rd edn. Ellis Horwood, New York
5. Kristiansen J, Christensen JM, Nielsen JL (1996) Mikrochim Acta 123:241-249
6. Hansen AM, Kristiansen J, Nielsen JL, Byrialsen K, Christensen JM (1990) Talanta 50:367-379
7. ISO (1993) International vocabulary of basic and general terms in metrology (VIM), 2nd edn. International Organization for Standardization, Geneva
John Fleming Glossary of analytical terms*
Bernd Neidhart
Christoph Tausch
Wolfhard Wegscheider
John Fleming Wolfuard Wegscheider
LGC, Queens Road, Teddington, Montanuniversitat Leoben,
Middlesex TW 11 OLY, UK Franz-Josef-Strasse IS,
A-S700 Leoben, Austria
Bernd Neidhart, Christoph Tausch
Philipps-Universitat Marburg,
Hans-Meerwein-Strasse.
D-35032 Marburg, Germany

* EURACHEM Education and Training Working Group

Introduction

Analytical data play a vital role in our daily lives, with increasing influence on both economy and ecology. The harmonisation of the European market - including the Eastern European countries - and the opening of the international borders for trade and communication have led to serious problems with terminology in analytical chemistry. We can identify the three main reasons that have caused this situation. These can be classified as "linguistics", "semantics", and "acceptance".

Frequent translations of a term through a chain of languages, and the use of terms by non-native speakers, may lead to a misuse of terms followed by grave misunderstandings. In addition, the co-existence of different meanings of terms due to their independent definition by national and international bodies or authorities, together with recommendations given by international organisations like IUPAC, leads to problems of semantics and confusion resulting in reduced acceptance.

A strategy on terminology

During the last 5 years, the EURACHEM Education and Training Working Group (E&TWG) has analysed this situation and has developed a strategy which is expected to resolve the dilemma. The first, and most important, step in this concept is to provide a forum which initiates and enables international discussions among experts in the field. The catalyst for these discussions will be a dictionary-like "glossary of terms" which will be published as a series in this journal. Each term in the glossary is provided with a definition (taken from the highest international level, if possible ISO) followed by a scientific description of the meaning of the definition and one or more examples explaining its practical use. In addition, translations of the term into other European languages are given. This structure will facilitate translation of the glossary into other languages, and errors will be minimised if not excluded. The translation will be performed by the E&TWG members, who are experts in the field and native speakers of the respective language, and will finally be published in a suitable national journal.

Feedback will be sought at both national and international levels to enable a dynamic development of the glossary at the highest scientific and linguistic levels possible. This might also include the deletion of existing and the creation of new words, if, in the latter case, the scientific definition and meaning has no linguistic equivalent in a given language. Let us take as an example the term traceability, which by definition describes a way to achieve quality (accuracy, comparability) in chemical measurements. The equivalent in German would be Rückführbarkeit, but the term Rückverfolgbarkeit is used as the respective DIN Standard, the linguistic meaning of which is "follow the way (track) back". Consequently, the term Rückverfolgbarkeit is part of providing assurance of quality and not of creating quality. Unfortunately, there is no English word for Rückverfolgbarkeit. There are two ways of solving this problem: one is to create a new English word and the other to introduce the German word into the English language.

We are willing to "grasp the nettle" and open the debate on this issue by proposing the term trackability to cover this concept.

Discussion forum

It is proposed that the EURACHEM E&TWG should be the catalyst which will promote a wider debate of the issues raised by this glossary of terms. All analytical scientists
are urged to contribute to the debate and work towards a consensus on the usage of the key terms covered by the glossary. This debate can be pursued either by corresponding with the editor of this journal or by sending an email message to jwf@lgc.co.uk for consideration by the working group.

Repeatability

Wiederholpräzision (D, A, CH); Répétabilité (F, B); Repetibilidad (E); Επαναληψιμότητα (GR); Ripetibilità (I); Herhaalbaarheid (NL); Powtarzalność (PL); Toistettavuus (SF); Ismételhetőség (H); (RUS); Repetibilidade (P)

Definition

Precision under repeatability conditions. 1

Description

Repeatability is the closeness of the agreement between the results of independent measurements of the same analyte carried out subject to all of the following conditions: the same method of measurement, the same observer, the same measuring instrument, the same location, the same conditions of use, repetition over a short period of time. 2

Independent measurements are made on distinct subsamples of a test material. If possible, at least 8 measurements should be performed.

Repeatability is a characteristic of a method, not of a result.

Example

Successive measurements under the above conditions gave eight single results from which a standard deviation is calculated. The standard deviation multiplied by 2.8 gives the repeatability at 95% confidence level.

Suppose that an analyst uses a method for which the repeatability has been established as 2 mg/mL. If, in a real case, the same analyst reported results of a measurement repeated over a short time interval as 50 and 56 mg/mL, there would be a question over the validity of these results, as they are very unlikely to have differed by 6 mg/mL as a result of random variability.

1 ISO 3534-1 (1993)
2 International vocabulary of basic and general terms in metrology, 1993 (BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML); ISO Central Secretariat, 1 rue de Varembé, CH-1211 Geneva 20

Reproducibility

Vergleichpräzision (D, A, CH); Reproductibilité (F, B); Reproducibilidad (E); Αναπαραγωγιμότητα (GR); Riproducibilità (I); Reproduceerbaarheid (NL); Odtwarzalność (PL); Uusittavuus (SF); Reprodukálhatóság (H); (RUS); Reprodutibilidade (P)

Definition

Precision under reproducibility conditions. 1

Description

Reproducibility is the closeness of the agreement between the results of measurements of the same analyte in distinct subsamples of a test material, where the individual measurements are carried out changing conditions such as: observer, measuring instrument, location, conditions of use, time, but applying the same method. 2

Example

In a laboratory intercomparison, samples (e.g. a surface water) were sent to a number of laboratories for determination of e.g. nitrite. Each laboratory reports its results as single values. The standard deviation from all accepted individual results multiplied by 2.8 gives the reproducibility at 95% confidence level.

Suppose that the reproducibility of a method has been determined to be x. If two of the laboratories in a real case reported results for subsamples of the same sample which differed by more than x, there would be a question concerning the quality of performance.

Methods which have a large reproducibility may not be suitable for making valid comparisons in a given real situation. In this case either the method must be improved or another method with a smaller reproducibility must be applied.

1 ISO 3534-1 (1993)
2 International vocabulary of basic and general terms in metrology, 1993 (BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML); ISO Central Secretariat, 1 rue de Varembé, CH-1211 Geneva 20

Traceability

Rückführbarkeit (D, A, CH); Traçabilité (F, B); Trazabilidad (E); Ιχνηλάτηση (GR); Riferibilità (I); Herleidbaarheid (NL); Rastreabilidade (P); Jäljitettävyys (SF); Visszavezethetőség (H); Zgodność (PL); (RUS)

Definition

The property of a result of measurement whereby it can be related to appropriate standards, generally international or national standards, through an unbroken chain of comparisons. 1
Description

For each analytical measurement, it should be possible to relate the result of the measurement back to an appropriate national or international measurement standard through an unbroken chain of comparisons. For measurement of weight, this would be the kilogram standard in Paris, or for amount of substance it should be the SI unit, the mole. If calibrated by an accredited body, the balance is an instrument which can provide measures of weight which are traceable to national measurement standards. Instruments for chemical analysis must be calibrated by the use of certified reference materials, or other suitable reference materials.

Example

Determination of lead in water by atomic absorption spectrometry (AAS): The AAS instrument has to be calibrated using reference solutions made up by dissolving known amounts (balance) of a certified reference material (CRM) or a pure substance such as Pb(NO3)2 in a defined volume of pure water; in the latter case the pure substance has to be compared with a CRM. A calibration graph which covers the concentration range of the analyte in the sample should be prepared.

For more complicated analyses, which might involve extraction and other analytical procedures, the traceability of the result of a measurement can be established by subjecting a certified reference material - with similar composition to the unknown - to the same analytical procedures. If, for example, the measurement standard used has not been compared with a CRM of the same type, the chain of comparison is broken.

1 ISO 3534-1 (1993)

Trackability

Rückverfolgbarkeit (D, A, CH); Relacionabilidad (E); Sporbarhet (NOR)

Definition

The property of a result of a measurement whereby the result can be uniquely related to the sample.

Description

Each step of an analytical method has to be documented in a way that the result of a measurement can be linked unambiguously to the sample to which it refers.

Example

All samples must be uniquely labelled. All operations performed on a sample must be recorded in a notebook or computer system. Chromatograms, spectra and other instrumental outputs must be labelled with the sample identification.

[Figure: Track of a result through the steps Sampling, Storage, Preparation, Separation, Determination, Calculation, Result]

Uncertainty of measurement

Meßunsicherheit (D, A, CH); Incertitude de mesure (F, B); Incertidumbre de la medida (E); Αβεβαιότητα της μέτρησης (GR); Meetonzekerheid (NL); Incerteza da medida (P); Mérési bizonytalanság (H); Incertezza di misura (I); Mittauksen epävarmuus (SF)

Definition

Parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand. 1

Description

Uncertainty sets the limits within which a result is regarded as accurate, i.e. precise and true. Uncertainty of measurement comprises, in general, many components. Some of these components may be evaluated from the statistical distribution of the results of series of measurements and can be characterized by experimental standard deviations. The other components, which can also be characterized by standard deviations, are evaluated from assumed probability distributions based on experience or other information. 2

Example

Overall uncertainty can be estimated by identifying all factors which contribute to the uncertainty. Their contributions are estimated as standard deviations, either from repeated observations (for random components), or from other sources of information (for systematic components). The combined standard uncertainty is calculated by combining the variances of the uncertainty components, and is expressed as a standard deviation. The combined standard uncertainty is multiplied by a coverage factor of 2 to give a 95% level of confidence (approximately).
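The combination of variances described in the uncertainty example can be written compactly. This is only a sketch, assuming that the components are independent and that each is already expressed as a standard deviation in the units of the result. The symbols u_i (individual components), u_c (combined standard uncertainty), k (coverage factor) and U (expanded uncertainty) are notation introduced here for illustration, not taken from the glossary.

```latex
u_c = \sqrt{\sum_{i=1}^{n} u_i^{2}}, \qquad U = k \, u_c \quad \text{with } k = 2
```

Under these assumptions a result y would be reported as y ± U.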
The Uncertainty Estimation Process

Write down a clear statement of what is being measured and the relationship between it and the parameters on which it depends.
Identify Uncertainty Sources: List sources of uncertainty for each part of the process or each parameter.
Quantify Uncertainty Components: Estimate the size of each uncertainty. At this stage, approximate values suffice; significant values can be refined in subsequent stages.
Convert to Standard Deviations: Express each component as a standard deviation.
Calculate the Combined Uncertainty: Combine the uncertainty components, either using a spreadsheet method or algebraically. Identify significant components.
Re-evaluate the

The uncertainty for the determination of e.g. atrazine in water consists of the combination of several components of uncertainty, such as the uncertainty of the true content of the atrazine standard, uncertainty from dilution of this standard, uncertainty regarding the loss of atrazine in sampling and storage prior to analysis, as well as that associated with the preconcentration step after correction for recovery. The result would be expressed as:
1.02 ± 0.13 mg/L

1 International vocabulary of basic and general terms in metrology, 1993 (BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML); ISO Central Secretariat, 1 rue de Varembé, CH-1211 Geneva 20
2 Quantifying uncertainty in analytical measurement, EURACHEM, Queens Road, Teddington, Middlesex TW11 0LY, UK
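The atrazine example can be followed through numerically. The sketch below is illustrative only: the glossary gives no values for the individual components, so every number in `components` is a hypothetical relative standard uncertainty, chosen here so that the total matches the quoted result. It also assumes that the components are independent and act multiplicatively on the result, so that relative uncertainties can be combined in quadrature. The function names and the rectangular-distribution conversion (half-width divided by the square root of 3) are assumptions of this sketch, not something the text specifies.

```python
import math


def rectangular_to_standard(half_width):
    """Convert a +/- half-width limit, assumed to follow a rectangular
    distribution, into a standard uncertainty."""
    return half_width / math.sqrt(3)


def combined_relative_uncertainty(rel_components):
    """Combine independent relative standard uncertainties in quadrature."""
    return math.sqrt(sum(u ** 2 for u in rel_components.values()))


# HYPOTHETICAL relative standard uncertainties (not from the glossary)
components = {
    # content of the standard stated as +/- 2 %, assumed rectangular
    "standard_content": rectangular_to_standard(0.02),
    "dilution": 0.010,
    "sampling_storage_loss": 0.040,
    "preconcentration_recovery": 0.045,
}

result = 1.02  # mg/L, the value quoted in the example
k = 2          # coverage factor for approximately 95 % confidence

u_c = result * combined_relative_uncertainty(components)  # combined standard uncertainty
U = k * u_c                                               # expanded uncertainty

print(f"{result:.2f} ± {U:.2f} mg/L")  # prints: 1.02 ± 0.13 mg/L
```

With these made-up inputs, the sampling/storage and preconcentration terms dominate the total, which is the kind of observation the "Identify significant components" step is meant to surface.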
