
PREDICTION OF SOIL CORROSIVITY USING LINEAR POLARIZATION

by EUGENIA KALANTZIS

Department of Civil Engineering and Applied Mechanics McGill University, Montreal May 1997

A THESIS SUBMITTED TO THE FACULTY OF GRADUATE STUDIES AND RESEARCH IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ENGINEERING

© Eugenia Kalantzis, 1997

National Library of Canada, Acquisitions and Bibliographic Services
395 Wellington Street, Ottawa ON K1A 0N4, Canada

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.


The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

ABSTRACT

This report presents the results of a study on the benefit of chloride ion testing in the prediction of soil corrosivity, which is determined using the method of linear polarization. Existing industry standards such as AWWA C105 and PACE 82-3 are currently being used to evaluate the corrosivity of soils. These standards consist of various tests, whose results permit the calculation of a corrosivity index. The following tests are suggested: pH, oxidation-reduction potential, sulfide ion content, resistivity, drainage ability, soil type, and moisture content. Up to this point, no standards have incorporated chloride ion testing into the testing procedure, even though the effect of chloride ions on the corrosion rate is well documented. It is the goal of this project to determine whether there is enough evidence to suggest that chloride ion content be introduced into existing standards.

In total, 153 soils were tested following the AWWA C105 and PACE 82-3 standards, as well as for the chloride ion content. Of these, 75 soils were tested using linear polarization to determine the "true" corrosivity of the soils. The analysis results showed that the information provided by the chloride ion content was not significant enough to suggest that this variable be added to the existing grids. This is due to the fact that soil resistivity, which is a required test in both standards, accounts for the presence of chloride ions. However, it should be noted that the chloride ion content is a better predictor of corrosivity than soil resistivity, and it is suggested that chloride ion content be tested whenever possible.

RÉSUMÉ

This report presents the results of a study on the need to determine the chloride content when evaluating the corrosivity of soils, the latter being obtained by the method of linear polarization. At present, the standards used by industry are based on evaluation grids that permit the calculation of a corrosivity index, e.g. the AWWA C105 and PACE 82-3 grids. Each evaluation grid is built from a series of tests, notably soil type, pH, redox potential, sulfide content, resistivity, drainage, and soil moisture. To date, no standard incorporates chloride content in its evaluation grid, even though the effect of chlorides on the corrosion rate is well documented. The objective of this study is therefore to determine whether chloride content should be introduced into the evaluation grids. In total, 153 specimens of various soils were tested according to the AWWA C105 and PACE 82-3 grids, as well as for chloride content. Of these, 75 were tested by the method of linear polarization, and the corrosivity of these soils was thus formally determined. After analysis of the test results, it was determined that the additional information provided by the chloride content was not significant enough to suggest that this parameter be incorporated into the existing grids. This is because soil resistivity, a measurement already included in both standards, indirectly represents the chloride content. On the other hand, it is important to note that chloride content is a variable that predicts soil corrosivity better than resistivity. It is nevertheless strongly suggested that chloride content be evaluated and studied where possible.


TABLE OF CONTENTS

ABSTRACT
RÉSUMÉ
LIST OF FIGURES
LIST OF TABLES
NOTATION AND ABBREVIATIONS
ACKNOWLEDGMENTS

1. INTRODUCTION

2. CORROSION AND CORROSION CONTROL
   2.1 Principles of Electrochemical Corrosion
       2.1.1 Necessary Elements for Corrosion
       2.1.2 Physical Forms of Corrosion
             2.1.2.1 Uniform Attack
             2.1.2.2 Galvanic Attack
             2.1.2.3 Crevice Corrosion
             2.1.2.4 Pitting Corrosion
             2.1.2.5 Erosion Corrosion
             2.1.2.6 Selective Leaching
             2.1.2.7 Stress Corrosion
       2.1.3 Why Do Metals Corrode?
       2.1.4 Determining the Rate of Corrosion
       2.1.5 The Exchange Current Density, i_0
       2.1.6 Determination of i_corr and φ_corr
             2.1.6.1 Activation Polarization
             2.1.6.2 Concentration Polarization
       2.1.7 Effect of Varying Parameters Using Polarization Diagrams
             2.1.7.1 Concentration PO2 and H+
             2.1.7.2 O2 Solubility
             2.1.7.3 Multiple Corrodents
             2.1.7.4 Galvanic Attack
             2.1.7.5 Passivity
             2.1.7.6 Chloride Content
   2.2 Measuring Corrosion Rates
       2.2.1 Tafel Extrapolation
       2.2.2 Linear Polarization
   2.3 Soil Corrosion and Its Effects on Underground Infrastructure
       2.3.1 Differential Aeration Cells
       2.3.2 Galvanic Attack
       2.3.3 Selective Leaching
       2.3.4 Stress-Corrosion Cracking
   2.4 Standards for Determining Corrosivity of Soils
       2.4.1 AWWA C105
       2.4.2 PACE 82-3

3. PROCEDURES AND APPARATUS
   3.1 Soil Samples
   3.2 Soil Type
   3.3 Drainage Ability/Moisture Content
   3.4 pH
   3.5 Oxidation-Reduction Potential (Redox Potential)
   3.6 Resistivity, ρ
   3.7 Sulfide Content
   3.8 Chloride Concentration
       3.8.1 Necessary Equipment
       3.8.2 Sample Preparation
       3.8.3 Electrode Preparation
       3.8.4 Preparation of Calibrating Solutions
       3.8.5 Calibration of Electrode
       3.8.6 Calibration Curve and Equation
       3.8.7 Testing Soil Samples
       3.8.8 Determination of Concentration of Chloride Ions of Soil
   3.9 Linear Polarization
       3.9.1 Necessary Equipment
       3.9.2 Trial Runs and Reproducibility of Results
       3.9.3 Sample Preparation
       3.9.4 Preparation of the Working Electrode
       3.9.5 Polarization of the Steel Specimen
   3.10 Calculating the Corrosivity Indices According to AWWA and PACE

4. ANALYSIS OF EXPERIMENTAL RESULTS AND DISCUSSION
   4.1 Analysis of Preliminary Data
       4.1.1 Data Exploration
       4.1.2 Transformation of Variables
       4.1.3 Regression of the Individual Variables
       4.1.4 Correlation Matrix
       4.1.5 RSQUARE Results
       4.1.6 Categorical Variables
       4.1.7 Variables Retained For Further Analysis
   4.2 Consideration of Chlorides in Predicting the Corrosion Rate
       4.2.1 Determining Significance
       4.2.2 The Effect of Removing Outliers
   4.3 Power Analysis

5. CONCLUSIONS AND RECOMMENDATIONS
   5.1 Summary of Results
   5.2 Recommendation for Future Work

APPENDIX A: DERIVATION OF POTENTIAL EQUATIONS
   A.1 Equation for φ_Zn
   A.2 Equation for φ_Cu
   A.3 Equation for φ+

APPENDIX B: TESTING FOR CHLORIDE ION CONCENTRATION
   B.1 Creating a Concentration vs. Potential Curve

APPENDIX C: TRIALS FOR REPRODUCIBILITY

APPENDIX D: PRINCIPLES OF REGRESSION ANALYSIS
   D.1 Data Exploration
   D.2 Simple Linear Regression Analysis
   D.3 Data Transformations
   D.4 Multiple Variable Regression
   D.5 Categorical Variables
   D.6 Outliers
   D.7 Variable Selection
   D.8 Model Validation
   D.9 Power
   D.10 The SAS Statistical Package

LIST OF FIGURES

The Two Stages of Crevice Corrosion
Pitting Corrosion
Erosion Corrosion
Microstructure of Gray Cast Iron
Intercrystalline Crack
Transcrystalline Crack
Stable and Unstable Positions
Schematic of a Cu/Zn Battery
φ vs. log I for Cu/Zn Battery
Polarization Diagram for Corrosion in Acidic Solution
Polarization Diagram for Corrosion in Neutral Aerated Water
Dependence of i_corr on the Value of i_0
Activation Polarization Diagram
Concentration Polarization Diagram
Distribution of H+ in Time
Effect of Varying PO2
Variation of O2 Solubility with NaCl Concentration
Effect of Multiple Corrodents
Galvanic Attack
Polarization Diagram of a Metal Exhibiting Passivity
Variation of i_a and i_c with Potential φ
Tafel Curve
Schematic of Setup for Tafel Test
Tafel Curve Obtained by Varying φ
Tafel Regions
Obtaining i_corr from the Tafel Curve
Linear Polarization Curve
Components of the Working Electrode
Typical Tafel Plot
Typical Linear Polarization Curve
Spreadsheet for Quick Calculation of Corrosivity Indices
SAS Output: Univariate Procedure using pHdir
SAS Output: Univariate Procedure using pHsat
SAS Output: Univariate Procedure using Reddir
SAS Output: Univariate Procedure using Reddir, with Extreme Values Removed
SAS Output: Univariate Procedure using Redsat
SAS Output: Univariate Procedure using Redsat, with Extreme Values Removed
SAS Output: Univariate Procedure using Resdir
SAS Output: Univariate Procedure using Ressat
SAS Output: Univariate Procedure using Chloride
SAS Output: Univariate Procedure using CorrRate
SAS Output: Univariate Procedure using LChl
SAS Output: Univariate Procedure using LResdir
SAS Output: Univariate Procedure using LRessat
SAS Output: Univariate Procedure using LCorr
SAS Output: Univariate Procedure using pHdir Residual
SAS Output: pHdir Residual vs. Predicted Value of CorrRate
SAS Output: Univariate Procedure using pHsat Residual
SAS Output: pHsat Residual vs. Predicted Value of CorrRate
SAS Output: Univariate Procedure using Reddir Residual
SAS Output: Reddir Residual vs. Predicted Value of CorrRate
SAS Output: Univariate Procedure using Redsat Residual
SAS Output: Redsat Residual vs. Predicted Value of CorrRate
SAS Output: Univariate Procedure using LResdir Residual
SAS Output: LResdir Residual vs. Predicted Value of CorrRate
SAS Output: Univariate Procedure using LRessat Residual
SAS Output: LRessat Residual vs. Predicted Value of CorrRate
SAS Output: Univariate Procedure using LChl Residual
SAS Output: LChl Residual vs. Predicted Value of CorrRate
SAS Output: Correlation Matrix
SAS Output: Correlation Matrix for Clay Samples
SAS Output: Correlation Matrix for Sand Samples
SAS Output: Correlation Matrix for Sand/Clay Samples
Potentials Obtained from Calibrating Solutions: Series 1
Calibration Curve for Series 1
Trial No. 1: Tafel Results
Trial No. 1: Linear Polarization Results
Trial No. 2: Tafel Results
Trial No. 2: Linear Polarization Results
Stem and Leaf Diagram
Stem and Leaf Diagram and Boxplot
Normal Probability Plot
Y vs. X Plot
Example of an Insignificant Predictor
ANOVA Table
Normal Distribution
Normally Distributed Y Values
Y Values Not Distributed Normally
Ideal Y vs. X Distribution
Non-Linear Relationship Between X and Y
Venn Diagrams for 1 and 2 Independent Variables
Effect of Outliers on R²

LIST OF TABLES

Electromotive Series
Galvanic Series for Seawater
Typical ρ Values
Soil Type Results
Drainage Ability Results
Moisture Content Results
pH-direct Results
pH-saturated Results
Redox-direct Results
Redox-saturated Results
ρ-saturated Results
ρ-direct Results
Sulfide Content Results Using Iodine Solution
Sulfide Content Results Using HCl and Lead Acetate Paper
Preparation of Calibrating Solutions
Chloride Ion Concentrations
Values Specified for Tafel Test
Values Specified for Linear Polarization Test
Results Obtained for Soil Sample No. 96
Corrosion Rates
Corrosion Indices According to AWWA
Corrosion Rates According to PACE
Values of LChl
Values of LResdir
Values of LRessat
Values of LCorr
Residual Characteristics: pHdir
Residual Characteristics: pHsat
Residual Characteristics: Reddir
Residual Characteristics: Redsat
Residual Characteristics: LResdir
Residual Characteristics: LRessat
Residual Characteristics: LChl
Possible 1, 2, 3, and 4-Variable Models
Possible Models with Corresponding SSE and DOF Values
Critical Values for F
Information about Possible Models
Correlation Matrix
Dummy Variables for SoilType: 2 Variable Case
Dummy Variables for SoilType: 3 Variable Case
Relationship Between α, β, and Power
Relationship Between n and Power
Values of L for α = 0.5

NOTATION AND ABBREVIATIONS

α             Type I error
β             Type II error
β0            Point of intersection of a line with y = 0
β1            Slope of a line
η             Overvoltage
i             Current density
i_a           Current density at the anode
i_c           Current density at the cathode
i_corr        Corrosion current density
i_0           Exchange current density
μ             Mean
ρ             Resistivity
Σ             Summation
σ             Standard deviation
φ             Potential
              Nernst potential
              Standard Nernst potential
φ_corr        Corrosion potential
              Oxidation potential
              Reduction potential
              Benchmark model in a significance test
              Model being tested in a significance test
Ω             Ohm
°C            Degrees Celsius
a             Activity
A             Amperes
ANOVA         Analysis of Variance
a_ox          Activity of species being oxidized
a_red         Activity of species being reduced
atm           Atmosphere
AWWA          American Water Works Association
C             Coulomb
cal           Calories
CD            Cook's distance
Chloride      Variable representing chloride content of soil
cm            Centimeter
CorrRate      Variable representing the corrosion rate of a metal in a soil
Cp            Mallow's number
dec           Decade
deg           Degrees
DOF           Degrees of freedom
Drainage      Variable representing the drainage ability of a soil
              Correlation coefficient
e             Residual
e-            Electron
eq            Equivalent
F             Faraday's Constant
              Effect size
I             Current
I_corr        Corrosion current
J             Joule
K             Kelvin
k             Number of variables in the R-model
              Mass transfer coefficient
kg            Kilogram
L             Liters
LChl          Variable representing the logarithm of the chloride concentration of a soil
LCorr         Variable representing the logarithm of the corrosion rate of a metal in a soil
LResdir       Variable representing the logarithm of the resistivity of a soil, when measured as received in the laboratory
LRessat       Variable representing the logarithm of the resistivity of a soil, when measured after saturation with distilled water
M             Metal
mm            Millimeter
M^n+          Metal ion
Moisture      Variable representing the moisture content of a soil
mV            Millivolt
N             Number of observations
N             Solution normality
P             Number of variables in a model
pHdir         Variable representing the pH of a soil, when measured as received in the laboratory
pHsat         Variable representing the pH of a soil, when measured after saturation with distilled water
PO2           Partial pressure of oxygen
ppm           Parts per million
PRESS         Predicted residual sum of squares
PROC          Procedure statement in SAS
R²adjusted    Adjusted partial correlation
Reddir        Variable representing the reduction potential of a soil, when measured as received in the laboratory
Redox         Oxidation-reduction
Redsat        Variable representing the reduction potential of a soil, when measured after saturation with distilled water
Resdir        Variable representing the resistivity of a soil, when measured as received in the laboratory
Ressat        Variable representing the resistivity of a soil, when measured after saturation with distilled water
Rp            Polarization resistance
Sand/Clay     A soil composed of a mixture of sand and clay particles
SAS           Statistical Analysis System
SSE           Error sum of squares
SulfHCl       Variable representing the result of the sulfide content test using HCl and lead acetate paper
SulfI         Variable representing the result of the sulfide content test using the iodine solution
T             Temperature
V             Volt
VIF           Variance Inflation Factor
wt. %         Weight percentage
X̄             Mean of X
X_d           Dummy variable
X_i           Independent variable
Y             Dependent variable
Ȳ             Mean of Y
yr            Year

ACKNOWLEDGMENTS
First and foremost, I would like to express my gratitude to my supervisor, Prof. Saeed M. Mina, whose unending guidance and encouragement proved invaluable in the successful realization of this research program.

I am also deeply indebted to Mr. Nourrediie Kadourn of COREXCO, Montreal, for suggesting the research topic, and for devoting considerable attention to its progress. Furthermore, I am very grateful to Mr. Gérard Benchtrit and to COPEXCO, Montreal, for the unrestricted access to equipment and materials, without which this project would not have been possible. Finally, I would like to thank my family and friends for their support and encouragement. The research project was supported by the Natural Sciences and Engineering Research Council's PGS-A Scholarship held by the author.


CHAPTER 1: INTRODUCTION
The corrosion of underground infrastructure is a very widespread problem. Structures such as water mains, natural gas pipelines, and gasoline storage containers are only some of the many structures affected by soil corrosion all around the world. When a natural gas pipeline or a gasoline storage container fails, there is a high danger of fire and subsequent explosion. Furthermore, the environmental damage caused by such failures is often devastating and irreparable. Failure of water mains can be equally disruptive, as Canadians depend on drinking water for domestic, industrial and fire fighting purposes. The physical integrity of the water distribution system is an essential component for the health and economic well being of Canadians. Every year, $200 million are spent on renewing iron water mains in Canada. The majority of the problems occur on water mains made up of cast or ductile iron, which account for 70% of the water mains. The fundamental cause of the deterioration of the pipes is soil corrosion [1]. There is therefore a great need to determine the causes of soil corrosion, and to establish a quick and easy method of evaluating the corrosivity of soils.

There has been much research done in the field of corrosion and, in particular, soil corrosion. Certain standards are now in use by the industry to determine the extent to which a soil is considered corrosive. Standards such as that of the American Water Works Association (AWWA C105) and PACE 82-3 are widely used to determine whether or not a metal subjected to a given soil will suffer deterioration. In all of these standards, certain soil characteristics are measured and a standard grid allows the technician to calculate a corrosivity index for the soil. The term grid refers to the established method of calculating the corrosivity index, and is composed of the test results in combination with the appropriate points allocated to each.

However, none of the standards take into account the chloride ion content of the soil. It has been argued that chloride ion content is measured indirectly through the measurement of the soil resistivity, which is incorporated in some form in all the grids. This variable accounts for the total ion content responsible for the conductive nature of the soil. However, chloride ions have a dual role in the corrosion process. They not only promote corrosion because they are conductive by nature, but they also inhibit passivity of the metal, i.e. they inhibit the formation of an oxide layer on the metal surface which protects the metal from corrosion [2]. For this reason, it is suspected that the measurement of chloride ion concentration will permit prediction of the corrosivity of a soil more accurately than is possible without the knowledge of this parameter.

It is the main goal of this research program to determine whether the chloride ion concentration can provide information that the variables already being tested in the standards do not provide. If the answer is affirmative, then this variable can be recommended for incorporation into the existing grids, or a new grid can be created to adequately account for the soil chloride ion content.

The linear polarization test (an accelerated electrochemical test which can be used to evaluate the corrosion rate) will be used to determine the soil corrosivity. This method has been used extensively in the examination of steel corrosion in reinforced concrete [3,4,5,6], and has recently been used in the investigation of soil corrosion [7]. In this project, the variable obtained using the method of linear polarization is considered the "true" corrosion rate of the pipe in the given soil, and it will be compared with the other soil characteristics. The following soil characteristics are measured: soil type, drainage ability, pH, oxidation-reduction potential, sulfide content, resistivity, and chloride ion content. The above variables are analyzed using the statistical package SAS, and the relationship between the soil characteristics and the "true" corrosion rate will enable the analyst to determine the extent to which each soil characteristic predicts the actual corrosion rate [8].

The objectives of this project are the following:

• To study the method of linear polarization (applications and limitations), and to determine the extent to which it can be used in the field of soil corrosion.
• To become familiar with the AWWA and PACE standards for soil testing, and to outline the limitations and advantages of each standard.
• To study the relationship between the soil characteristics and the corrosion rate in the soil, and to determine which variables play the most important role in the corrosion process. What is the role of the variables which are expected to be the most influential? What is the importance of the chloride ion content of the soil in predicting the corrosion rate?
• To determine whether the chloride ion concentration provides information that the variables already being tested in the standards do not provide and, if so, to suggest that this variable be incorporated into the existing grids or that a new grid be created to include this variable.

The report is divided into two main sections: the measurement of the soil characteristics, and the analysis of the collected data using SAS. Chapter 2 introduces the basic phenomena underlying the corrosion process, and provides the background information essential to understand the variables being studied and their role in the corrosion process (Chapter 2: Corrosion and Corrosion Testing). In Chapter 3: Procedures and Apparatus, the methods and equipment used to measure the various soil characteristics are presented. The statistical analysis of the experimental data obtained in Chapter 3 is presented in Chapter 4: Analysis of Experimental Results, and the results are discussed. Finally, conclusions and recommendations for future work are made in Chapter 5: Conclusions and Recommendations.

CHAPTER 2: CORROSION AND CORROSION TESTING

To fully understand the factors that contribute to corrosion in a particular environment, a thorough knowledge of the various corrosion mechanisms is essential. A sound knowledge of the basic principles will allow the corrosion engineer to predict the aggressiveness of a given environment, to alter the environment to decrease its corrosivity to a particular material, to protect the materials from corrosion, or to choose materials which will not be affected by the existing aggressive environment. The basics of electrochemistry with respect to corrosion of metals in aqueous media are briefly reviewed, along with the information deemed essential to understanding the variables selected for this study, and their role in the corrosion process. The second section of this chapter introduces the reader to the method of linear polarization, and examines the principles underlying the determination of the corrosion rate. The following section introduces the causes and effects of soil corrosion, and the final section discusses the AWWA and PACE standards currently being used by the industry to determine the corrosivity of soils.

2.1 Principles of Electrochemical Corrosion

2.1.1 Necessary Elements for Corrosion

Corrosion can take various forms, and can occur under different circumstances. However, there are certain constants in all corrosion processes. Four elements must be present for corrosion to occur: an anode, a cathode, an electrical conductor, and an ionic conductor [2,7,9]. The anode consists of a metal (Fe, Cu, etc.) which is oxidized in the presence of an oxidizing agent, or corrodent. The metal, denoted by M, undergoes the following reaction (oxidation of metal M; anodic reaction):

$$\mathrm{M} \rightarrow \mathrm{M}^{n+} + n\,e^{-} \qquad (2.1)$$

It is the anode that undergoes damage. The metal M dissolves, releasing ions ($\mathrm{M}^{n+}$) and n electrons. Some examples of metal oxidations are:
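Two familiar instances of reaction (2.1) are written out below for illustration. The choice of metals is an assumption made here; both half-reactions agree with the electrode reactions listed for iron and aluminum in Table 2.1.

```latex
\begin{align*}
\mathrm{Fe} &\rightarrow \mathrm{Fe}^{2+} + 2\,e^{-} \\
\mathrm{Al} &\rightarrow \mathrm{Al}^{3+} + 3\,e^{-}
\end{align*}
```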

The cathode can consist of a metal, or a solution rich in oxygen or hydrogen ions. While the anode is undergoing oxidation, the cathode is undergoing reduction. During reduction, the cathode or corrodent consumes the electrons released by the oxidation of the metal. The two corrodents of major importance are the acidic solution and neutral aerated water (e.g. rainwater or seawater) [2,7,9]. The reduction equations are as follows:

Acid solution (reduction of H+):

$$2\,\mathrm{H}^{+} + 2\,e^{-} \rightarrow \mathrm{H}_{2} \qquad (2.3a)$$

Neutral aerated water (reduction of O2):

$$\tfrac{1}{2}\,\mathrm{O}_{2} + \mathrm{H}_{2}\mathrm{O} + 2\,e^{-} \rightarrow 2\,\mathrm{OH}^{-} \qquad (2.3b)$$

The complete corrosion equation is obtained by combining the equation of the oxidation of metal M with one of the above reduction equations. In an acidic environment, the complete equation becomes:
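As a reconstruction, adding reaction (2.1) to reaction (2.3a) so that the electrons cancel gives the form below. A divalent metal (n = 2) is assumed here, in line with the Fe(OH)2 example used later in this section.

```latex
\[
\mathrm{M} + 2\,\mathrm{H}^{+} \rightarrow \mathrm{M}^{2+} + \mathrm{H}_{2}
\]
```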

A product of this reaction is hydrogen gas, which can often cause problems such as hydrogen blistering or hydrogen embrittlement of metals [2,9].

In neutral aerated water, the complete reaction is as follows:
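As a reconstruction, adding reaction (2.1) for a divalent metal (n = 2, an assumption consistent with the 1/2 O2 term and the Fe(OH)2 example discussed in this section) to reaction (2.3b) gives:

```latex
\[
\mathrm{M} + \tfrac{1}{2}\,\mathrm{O}_{2} + \mathrm{H}_{2}\mathrm{O} \rightarrow \mathrm{M}^{2+} + 2\,\mathrm{OH}^{-}
\]
```

Each side carries one metal atom, two oxygen atoms, two hydrogen atoms, and zero net charge, so the equation is balanced.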

The term 1/2 O2 in the above equation refers to the dissolved oxygen present in the water. Furthermore, the products of the above reaction often combine to form a precipitate:
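For a divalent metal ion (an assumption consistent with the Fe(OH)2 example in this section), the precipitation step can be written as the following reconstruction:

```latex
\[
\mathrm{M}^{2+} + 2\,\mathrm{OH}^{-} \rightarrow \mathrm{M(OH)}_{2}
\]
```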

If the metal M represents iron (Fe), then Fe(OH)2, or rust, is precipitated when oxygen is the corrodent.

Another element essential for corrosion to occur is an electrical conductor, which allows electrons to move from the anode, where they are released, to the cathode, where they are consumed [2,7,9]. If this movement of electrons could not proceed, the reduction reaction would stop. Furthermore, the anode would become negatively charged due to the presence of the electrons released, and this disequilibrium would stop any further oxidation and release of electrons [2,7,9].

When a single piece of metal is the site of both the anodic and cathodic reactions, or when the two sites are located on separate pieces of metal which are in physical and electrical contact with one another, the metal itself is the electrical conductor. However, if the two sites are found on separate pieces of metal, then any metal wire connecting the two will act as the electrical conductor through which the electrons will move [2,7,9].

The last essential element in the corrosion process is an ionic conductor, or electrolyte. The electrolyte, which is the aqueous solution in contact with both the anode and the cathode, allows the movement of ions from the anode to the cathode, thus ensuring electrical neutrality and allowing the corrosion process to continue [2,7,9].

In summary, the four essential elements of the corrosion process are the anode, the cathode, the electrical conductor, and the ionic conductor. The anode is the site where damage occurs as the metal is oxidized and electrons and ions are released. The electrons travel from the anode to the cathode via the electrical conductor, which is usually the metal itself, or a metal wire connecting the two sites. The cathode is the site where electrons are consumed while oxygen or hydrogen is reduced. As the ions move from the anode to the cathode via an ionic conductor, which is an aqueous solution simultaneously in contact with the anode and the cathode, electrical neutrality is established. Corrosion cannot occur unless all four of these elements are present.

2.1.2 Physical Forms of Corrosion

Corrosion can take various forms. The most common forms of corrosion are the following [2,9]:

• Uniform attack
• Galvanic attack
• Crevice corrosion
• Pitting corrosion
• Erosion corrosion
• Selective leaching
• Stress corrosion

2.1.2.1 Uniform Attack

Uniform attack is the most common form of corrosion, making up 80-90% of the cases in practice [2,9]. It is normally characterized by a reaction which proceeds uniformly over the entire surface of the metal. All points on the surface corrode at a similar rate because every point acts alternately as an anode and a cathode. No single fixed point acts as the anode, so there is no single fixed point of deterioration. This form of corrosion is the easiest to predict, and can be prevented or slowed down most easily.

2.1.2.2 Galvanic Attack

Galvanic attack occurs when two different metals are placed in electrical contact in a corrosive environment [2,9,10]. If the two metals are not in contact with one another, each corrodes at its own rate. However, when they are placed in electrical contact, the more anodic of the two metals suffers accelerated corrosion (anodic reaction) while the corrosion rate of the more cathodic metal decreases.

In order to determine which of the two metals will corrode, the electromotive series, which is an ordered list of elements accompanied by their reduction potentials, is consulted. Table 2.1 is a reproduction of this series.
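The lookup described above can be sketched in a few lines of Python. The sketch uses a handful of standard potentials taken from Table 2.1; the function names `anodic_member` and `driving_potential` and the dictionary layout are illustrative assumptions, and the result is only a first estimate because the electromotive series ignores alloying and surface films.

```python
# Standard reduction potentials (volts, 25 °C) for a few metals from Table 2.1.
STANDARD_POTENTIAL_V = {
    "Cu": 0.337,   # Cu2+ + 2e- = Cu
    "Pb": -0.126,  # Pb2+ + 2e- = Pb
    "Sn": -0.136,  # Sn2+ + 2e- = Sn
    "Ni": -0.250,  # Ni2+ + 2e- = Ni
    "Fe": -0.440,  # Fe2+ + 2e- = Fe
    "Zn": -0.763,  # Zn2+ + 2e- = Zn
    "Mg": -2.37,   # Mg2+ + 2e- = Mg
}


def anodic_member(metal_a: str, metal_b: str) -> str:
    """Return the metal of the pair expected to act as the anode.

    The metal with the lower (more negative) standard reduction
    potential is the more anodic one and suffers accelerated corrosion.
    """
    pot_a = STANDARD_POTENTIAL_V[metal_a]
    pot_b = STANDARD_POTENTIAL_V[metal_b]
    return metal_a if pot_a < pot_b else metal_b


def driving_potential(metal_a: str, metal_b: str) -> float:
    """Absolute difference between the two standard potentials, in volts."""
    return abs(STANDARD_POTENTIAL_V[metal_a] - STANDARD_POTENTIAL_V[metal_b])


if __name__ == "__main__":
    print(anodic_member("Fe", "Cu"))                # Fe
    print(round(driving_potential("Fe", "Cu"), 3))  # 0.777
```

For an iron-copper couple, iron has the lower potential and is therefore the member expected to corrode.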
Electrode Reaction           Standard Potential φ° (in volts) at 25 °C
Au3+ + 3e- = Au               1.50
Pt2+ + 2e- = Pt               ca. 1.2
Pd2+ + 2e- = Pd               0.987
Hg2+ + 2e- = Hg               0.854
Ag+ + e- = Ag                 0.800
Hg2(2+) + 2e- = 2Hg           0.789
Cu+ + e- = Cu                 0.521
Cu2+ + 2e- = Cu               0.337
2H+ + 2e- = H2                0.000
Pb2+ + 2e- = Pb              -0.126
Sn2+ + 2e- = Sn              -0.136
Mo3+ + 3e- = Mo              ca. -0.2
Ni2+ + 2e- = Ni              -0.250
Co2+ + 2e- = Co              -0.277
Tl+ + e- = Tl                -0.336
In3+ + 3e- = In              -0.342
Cd2+ + 2e- = Cd              -0.403
Fe2+ + 2e- = Fe              -0.440
Ga3+ + 3e- = Ga              -0.53
Cr3+ + 3e- = Cr              -0.74
Cr2+ + 2e- = Cr              -0.91
Zn2+ + 2e- = Zn              -0.763
Nb3+ + 3e- = Nb              ca. -1.1
Mn2+ + 2e- = Mn              -1.18
Zr4+ + 4e- = Zr              -1.53
Ti2+ + 2e- = Ti              -1.63
Al3+ + 3e- = Al              -1.66
Hf4+ + 4e- = Hf              -1.70
U3+ + 3e- = U                -1.80
Be2+ + 2e- = Be              -1.85
Mg2+ + 2e- = Mg              -2.37
Na+ + e- = Na                -2.71

Table 2.1 Electromotive Series

A shortcoming of the electromotive series is that it fails to take into account any alloying, or the effect of the protective films which form in various environments. A more practical alternative to the electromotive series is the galvanic series, which is specific to a given environment. Table 2.2 indicates the galvanic series for seawater.

Active (Read down)

Magnesium
Magnesium alloys
Zinc
Aluminum 5052H
Aluminum 3004
Aluminum 3003
Aluminum 1100
Aluminum 6053T
Alclad
Cadmium
Aluminum 2017T
Aluminum 2024T
Mild steel
Wrought iron
Cast iron
Ni-Resist
13% Chromium stainless steel, type 410 (active)
50-50 lead-tin solder
18-8 stainless steel, type 305 (active)
18-8, 3% Mo stainless steel, type 316 (active)
Lead
Tin
Muntz metal
Manganese bronze
Naval brass
Nickel (active)
76% Ni-16% Cr-7% Fe (Inconel 600) (active)
Yellow brass
Aluminum bronze
Red brass
Copper
Silicon bronze
5% Zn-20% Ni, Bal. Cu (Ambrac)
70% Cu-30% Ni
88% Cu-2% Zn-10% Sn (composition G bronze)
88% Cu-3% Zn-6.5% Sn-1.5% Pb (composition M bronze)
Nickel (passive)
76% Ni-16% Cr-7% Fe (Inconel 600) (passive)
70% Ni-30% Cu (Monel)
Titanium
18-8 stainless steel, type 305 (passive)
18-8, 3% Mo stainless steel, type 316 (passive)

Noble (Read up)

Table 2.2 Galvanic Series for Seawater [2]
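Because the galvanic series is an ordered list rather than a table of potentials, comparing two metals reduces to comparing their positions in the list. The following sketch assumes a shortened excerpt of Table 2.2, ordered from the active end to the noble end; the names `GALVANIC_SERIES_SEAWATER` and `more_active` are hypothetical and serve only to illustrate the idea.

```python
# Shortened excerpt of Table 2.2, ordered from the active end to the noble end.
GALVANIC_SERIES_SEAWATER = [
    "Magnesium",
    "Zinc",
    "Cadmium",
    "Mild steel",
    "Cast iron",
    "Lead",
    "Tin",
    "Copper",
    "Titanium",
]


def more_active(metal_a: str, metal_b: str) -> str:
    """Return whichever metal sits closer to the active end of the series.

    In a galvanic couple in seawater, this is the metal expected to act
    as the anode and to suffer accelerated corrosion.
    """
    rank = {name: index for index, name in enumerate(GALVANIC_SERIES_SEAWATER)}
    return metal_a if rank[metal_a] < rank[metal_b] else metal_b


if __name__ == "__main__":
    print(more_active("Cast iron", "Copper"))  # Cast iron
```

A full implementation would need to distinguish the active and passive states listed separately in Table 2.2 for nickel and the stainless steels.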

2.1.2.3 Crevice Corrosion

Crevice corrosion is highly localized, and reflects the site at which it occurs. As the name implies, corrosion occurs at crevices (openings of about 1 mm), or at points of contact between two surfaces [2,9]. The opening is sufficient to allow the corrodent to enter, but not large enough to allow the corrodent to flow. Corrosion occurs in two stages, which are illustrated in Figures 2.1a and 2.1b.

Figure 2.1 The Two Stages of Crevice Corrosion

In stage 1, uniform attack occurs in the crevice. However, after some time the stagnant water is depleted of the dissolved oxygen, and stage 2 begins. Within the crevice, the reduction reaction cannot proceed because the dissolved oxygen is depleted. However, the oxidation of the metal continues. The electrons released in the crevice travel through the metal to a site outside the crevice where dissolved oxygen is present. The result is that the crevice continuously acts as the anode and suffers corrosion, while the remaining metal acts as the cathode and suffers no further damage. The danger associated with this form of corrosion is that it is unpredictable, and that the damage proceeds undetected because its location is well hidden. Furthermore, the rate at which the crevice metal deteriorates is quite high when the crevice area is small with respect to the surface area in contact with the corrodent. This occurs because the crevice metal (anode) must produce electrons at a rate to satisfy the demand of the entire cathodic area.
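The area effect described above follows from a charge balance: the total anodic current must equal the total cathodic current. The sketch below assumes a uniform cathodic current density over the cathode; the function name and the numerical values are hypothetical.

```python
def anode_current_density(cathode_current_density, cathode_area, anode_area):
    """Current density the anode must supply so that the total anodic
    current equals the total cathodic current (charge balance)."""
    total_cathodic_current = cathode_current_density * cathode_area
    return total_cathodic_current / anode_area


# Hypothetical case: a 1 cm^2 crevice feeding a 100 cm^2 cathodic surface.
i_crevice = anode_current_density(
    cathode_current_density=1.0e-5,  # A/cm^2, assumed
    cathode_area=100.0,              # cm^2, assumed
    anode_area=1.0,                  # cm^2, assumed
)
print(i_crevice)
```

With these assumed values the crevice carries a current density one hundred times that of the cathode, which is why a small crevice deteriorates quickly.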

2.1.2.4 Pitting Corrosion

Pitting corrosion is a highly localized form of corrosion. It generally starts on horizontal surfaces which can hold water under gravity, and at a surface discontinuity (scratch or dent), and grows downward. As in crevice corrosion, two local sites are involved [2,9,11]. The stagnant water within the pit is depleted of oxygen, and the tip of the pit becomes the anodic site. Electrons move through the metal to the surface of the metal which is in contact with aerated water (cathodic site), enabling the reduction reaction to proceed.

Pitting is one of the most destructive forms of corrosion. It can cause equipment to fail because of perforation and it can be extremely dangerous when it occurs on vessels whose contents are under pressure. Furthermore, it can be difficult to detect because the corrosion products often cover the pits, which continue to grow undetected. Figure 2.2 illustrates schematically a metal undergoing pitting corrosion.

Figure 2.2 Pitting Corrosion

2.1.2.5 Erosion Corrosion

Erosion corrosion is normally associated with moving slurries [2,9]. Solids in the slurry erode (or scrape off) the protective oxide layers which form on metal surfaces. These protective surface films provide metals such as aluminum, lead, and stainless steel with their ability to resist corrosive environments. Corrosion occurs in the areas where the protective layer has been scraped off. The exposed metal is anodic to the metal protected by the surface film and, therefore, suffers corrosion as shown in Figure 2.3. This form of corrosion is usually accompanied by surface striations, i.e. grooves following a distinct direction.

Figure 2.3 Erosion Corrosion (labels: anodic sites, metal surface)

2.1.2.6 Selective Leaching

Selective leaching is the removal of one element from a solid alloy. It occurs when an alloy is composed of two elements far apart from one another in the electrochemical series. The more anodic of the two metals will be the anode and will suffer accelerated corrosion, leaving behind the more cathodic metal [2,9]. An example of a metal subject to selective leaching is brass, which is made up of copper and zinc. Zinc, the more anodic metal, is "leached out" and the resulting material is a porous copper matrix.

Another example of selective leaching is the well known phenomenon of graphitization of gray cast iron. Gray cast iron is composed of a network of graphite within a matrix of iron or steel. Figure 2.4 shows the microstructure of gray cast iron. The graphite is in the form of flakes connected in such a way that the material is able to hold its shape as the iron dissolves [2,9,12,13]. This dissolution occurs because graphite is cathodic to iron and a galvanic cell develops. Iron dissolves, leaving behind a porous mass consisting of graphite, voids and rust, which can be easily cut with a knife. In contrast, the graphite in ductile or malleable irons is in the shape of nodules or spheres, and a porous matrix cannot form. As such, these materials are not subject to graphitization.

Figure 2.4 Microstructure of Gray Cast Iron
2.1.2.7 Stress Corrosion

Stress corrosion is the result of the combined effect of a weak applied or residual tensile stress and a weak corrodent [2,9]. Each of these two components alone would not be problematic, but together they accelerate the rate of corrosion. It has been observed that, in most cases, no corrosion would occur when a metal subjected to a weak corrodent is not subjected simultaneously to a tensile stress. Stresses from 5-70% of the yield stress are sufficient to cause severe damage [2,9]. Another point of interest is that the corrodent is metal specific, i.e. not all corrodents will affect all metals. For example, a weak chloride solution will cause severe damage to stainless steels, but will not affect plain carbon steel at all. In addition, a weak nitrate solution will damage plain carbon steels, but will not affect stainless steel at all [2,9].

Like pitting corrosion, the crack starts at a surface pit or scratch, and moves downward. The crack follows an anodic path. One example of an anodic path is that of zinc in brasses, which are alloys of zinc and copper. An anodic path can also be created when an element of an alloy precipitates at either the grain boundary or within the grain itself, leaving one of the two areas anodic to the other. When the grain boundary is anodic to the grain, the crack is said to be intercrystalline [2,9]. When the grain itself is anodic to the boundary, the crack is said to be transcrystalline [2,9]. Figures 2.5a and 2.5b illustrate the difference between the two types of cracks.

When cracks begin to form, the reduced cross-sectional areas are unable to withstand the design loads. Furthermore, solid corrosion products which often accompany the corrosion process cause additional stresses by their expansive nature. As the cracks grow under the combined action of corrosion and stress, the tensile stress in the uncracked section grows exponentially and can lead to sudden unexpected failures [2,9].

Figure 2.5a Intercrystalline Crack

Figure 2.5b Transcrystalline Crack

2.1.3 Why do metals corrode?

The electrochemical series indicates whether a metal is more anodic compared to another, and it can provide the potential $\phi$ of a reaction. But what does this potential represent and why does a metal corrode in the first place? Corrosion of a metal occurs because of the element's tendency to attain the natural state, which is the ionic form. The metallic form of most elements is unstable, and there is a potential for these metals to be oxidized:

$$\mathrm{M}\ (\text{unstable}) \;\xrightarrow{\ \text{oxidation}\ }\; \mathrm{M}^{n+} + n\,e^-\ (\text{natural state, stable ore}) \qquad (2.1)$$

This potential can be compared to the potential energy of a sphere when held at an elevated position [9]. As seen in Figure 2.6, at position 1 the sphere's equilibrium is unstable and it possesses potential energy. Some of this energy will be used up as the ball moves to position 2, a point of lower potential energy. This is the spontaneous direction for this particular system. Movement from position 2 to position 1 would not occur spontaneously in nature. Energy from an external source must be provided for such a movement to occur.

Figure 2.6 Stable and Unstable Positions

Similarly, electrochemical reactions are accompanied by a potential $\phi$, indicating the potential for the reaction to proceed spontaneously. It must be noted that the potential $\phi$ is not an absolute value, but a relative one. Potentials of reactions are always measured with respect to a standard. The standard most often used is the reduction of hydrogen ions:

$$2\,\mathrm{H}^+ + 2\,e^- \rightarrow \mathrm{H}_2 \quad \text{where } \phi = 0.000\ \mathrm{V} \qquad (2.3a)$$

By convention, the value of the potential of this equation is chosen to be equal to zero volts, and the potentials of other reactions are measured against this standard. Another convention adopted is the use of reduction potentials, $\phi_{red}$, instead of oxidation potentials, $\phi_{ox}$, in tables such as those for the electrochemical and galvanic series. To obtain the value of a particular oxidation reaction, the value of $\phi_{red}$ is simply multiplied by -1. For example:

The potentials listed in Tables 2.1 and 2.2 are termed half-cell potentials, because they accompany only half of the overall reaction. A complete reaction is made up of two reaction halves. One reaction-half is a reduction reaction, and the other is an oxidation reaction. For example, for the following two reaction halves:

It is useful to determine which of the two reaction-halves will be reversed such that the potential of the entire system will be non-negative, i.e. proceed spontaneously. It is easily noted that if Equation 2.7a is reversed, the total potential of the system will be equal to $\phi_{ox} + \phi_{red} = -(-0.440\ \mathrm{V}) + 0.000\ \mathrm{V} = 0.440\ \mathrm{V}$, which is positive. The system will spontaneously behave according to the following equation:

When the two reaction halves are combined, the one with the smaller reduction potential will be reversed and the element will undergo oxidation. Returning to the two most commonly encountered corrodents, the acidic solution and neutral aerated water, it is evident from the electrochemical series that the reduction potential of both hydrogen ion reduction and oxygen reduction is higher than that of most metals of interest to engineers:

The combination of one of the above corrodents with a metal whose reduction potential is lower than that of the corrodent will result in the oxidation, or corrosion, of that element.
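The rule for combining two reaction halves can be written out as a short calculation. This is a minimal sketch using the two reduction potentials quoted in the text (-0.440 V and 0.000 V); the function name and the labels "metal" and "hydrogen" are hypothetical.

```python
def spontaneous_cell(couple_1, couple_2):
    """Each couple is a (label, reduction potential in volts) pair.

    The couple with the smaller reduction potential is reversed (oxidized),
    so the overall potential is -phi_red(anode) + phi_red(cathode).
    """
    anode, cathode = sorted((couple_1, couple_2), key=lambda couple: couple[1])
    overall = -anode[1] + cathode[1]
    return anode[0], cathode[0], overall


oxidized, reduced, phi_total = spontaneous_cell(("metal", -0.440),
                                                ("hydrogen", 0.000))
print(oxidized, reduced, round(phi_total, 3))
```

Because the couple with the lower reduction potential is always the one reversed, the overall potential returned is never negative, which matches the spontaneity criterion stated above.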

2.1.4 Determining the Rate of Corrosion

Examination of the electrochemical or galvanic series enables one to determine whether or not a metal will corrode in a given environment. But of more interest to the corrosion engineer is the determination of the rate at which this corrosion will proceed. Corrosion rates are determined by studying the polarization behavior of the two reaction halves. As seen previously, the two reaction halves are the following:

ANODIC REACTION (oxidation of metal M):

$$\mathrm{M} \rightarrow \mathrm{M}^{n+} + n\,e^-$$

CATHODIC REACTION, acid solution (reduction of H+):

$$2\,\mathrm{H}^+ + 2\,e^- \rightarrow \mathrm{H}_2$$

OR, neutral aerated water (reduction of O2):

$$\tfrac{1}{2}\,\mathrm{O}_2 + \mathrm{H_2O} + 2\,e^- \rightarrow 2\,\mathrm{OH}^-$$

In order to fully understand the above corrosion system, an analogy will first be made with the copper/zinc battery [2,7,9,14,15]. As seen in Figure 2.7, a Cu/Zn battery is made up of a copper rod immersed in a solution of Cu2+ ions (a solution of CuSO4), and a Zn rod immersed in a solution of Zn2+ ions (a solution of ZnSO4). The two solutions are connected by a diaphragm which allows the passage of ions, ensuring electrical neutrality. From the electromotive series, it is observed that the reduction potential of Zn is lower than that of Cu, and therefore Zn will be anodic to Cu and suffer oxidation. The two resulting equations are:

$$\mathrm{Zn} \rightarrow \mathrm{Zn}^{2+} + 2\,e^- \qquad \text{Oxidation: } \phi_{ox} = 0.76\ \mathrm{V} \qquad (2.9a)$$

$$\mathrm{Cu}^{2+} + 2\,e^- \rightarrow \mathrm{Cu} \qquad \text{Reduction: } \phi_{red} = 0.34\ \mathrm{V} \qquad (2.9b)$$

Prior to electrical contact of the metal rods, the two separate systems are at equilibrium, and no corrosion is occurring. Once the two metals at different potentials are placed in electrical contact, the system will attempt to reach a point of equilibrium at a potential somewhere between $\phi_{Zn}$ and $\phi_{Cu}$. The driving force of ($\phi_{Cu} - \phi_{Zn}$) volts will cause Zn to be oxidized, and copper to be reduced, according to the above equations. When Zn is oxidized, electrons and Zn2+ ions are released. The electrons travel through the wire to the surface of the copper rod, where they combine with the Cu2+ ions from solution to form Cu. As the electrons travel through the wire, a current is registered by an ammeter.

Figure 2.7 Schematic of a Cu/Zn Battery

In order to study the variation of the potential with current, the system is manipulated by varying the current permitted to flow through the wire, via various resistors. Figure 2.8 is obtained by plotting the potential of Cu and Zn versus the registered current [2,9]. Three distinct points on the diagram are of interest:
1. the open circuit at I=0,
2. a point of restricted current flow, and
3. the short circuit at I=Imax.

Figure 2.8 $\phi$ vs. log I for Cu/Zn Battery

1. Open Circuit

The point on the diagram representing an open circuit is at I=0, i.e. when no current flows. This represents the behavior of the system when electrical contact is not provided, and the two metals behave independently. It is observed that the potential of each of the metals is the Nernst potential, which is defined as [2,9]:

$$\phi_N = \phi_N^0 + \frac{2.303\,RT}{nF}\,\log\frac{[a_{ox}]}{[a_{red}]} \qquad (2.10)$$

where
$\phi_N$ = Nernst potential
$\phi_N^0$ = standard Nernst potential (equilibrium potential of metal in contact with its own ions, at unit activity)
R = gas constant (8.314 J/deg mole)
T = absolute temperature (K)
n = number of electrons transferred
F = Faraday's constant (96500 C/eq)
$[a_{ox}]$ = activity, or concentration, of oxidized species
$[a_{red}]$ = activity, or concentration, of reduced species

For the Cu/Zn battery, the Nernst potential is calculated for both the Zn and Cu electrodes. The derivation of the following equations can be found in Appendix A.

For the reduction of Zn at 25 °C: Zn → Zn2+ + 2 e-, Equation 2.10 becomes:

$$\phi_{N,Zn} = \phi_{N,Zn}^0 + \frac{0.0592}{2}\,\log[\mathrm{Zn}^{2+}] \qquad (2.11)$$

For the reduction of Cu at 25 °C: Cu → Cu2+ + 2 e-, Equation 2.10 becomes:

$$\phi_{N,Cu} = \phi_{N,Cu}^0 + \frac{0.0592}{2}\,\log[\mathrm{Cu}^{2+}] \qquad (2.12)$$
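Equations 2.10 to 2.12 can be checked numerically. The sketch below is illustrative only: the standard reduction potentials (-0.76 V for Zn, +0.34 V for Cu) follow from Equations 2.9a and 2.9b, while the ion concentration of 0.01 and the function name `nernst_potential` are assumptions.

```python
import math

R = 8.314    # gas constant, J/(deg mole)
F = 96500.0  # Faraday's constant, C/eq


def nernst_potential(phi_standard, n, activity_ox, activity_red=1.0,
                     temperature_k=298.15):
    """Equation 2.10: phi_N = phi_N0 + 2.303 RT/(nF) * log10(a_ox / a_red)."""
    slope = 2.303 * R * temperature_k / (n * F)
    return phi_standard + slope * math.log10(activity_ox / activity_red)


# Assumed ion concentration of 0.01 in each half-cell; the solid metal is
# taken at unit activity.
phi_zn = nernst_potential(-0.76, n=2, activity_ox=0.01)
phi_cu = nernst_potential(0.34, n=2, activity_ox=0.01)
print(round(phi_zn, 3), round(phi_cu, 3))
```

At 25 °C the factor 2.303 RT/F evaluates to approximately 0.0592 V, which is the constant appearing in Equations 2.11 and 2.12. At unit activity the logarithm vanishes and the function returns the standard potential.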

2. Point of Restricted Flow

Between I=0 and I=Imax, the current is manipulated to flow at a predetermined rate. Using resistors, the current is allowed to vary and the potential of each metal is measured and plotted versus the current. From the diagram, the change in potential for each metal, termed the overvoltage $\eta$, can be calculated as [2,9]:

$$\eta_{Cu} = \phi_a - \phi_{N,Cu} \qquad (2.13)$$

$$\eta_{Zn} = \phi_b - \phi_{N,Zn} \qquad (2.14)$$

3. Short Circuit

The point of short circuit is the point when the current is allowed to flow unrestrictedly, i.e. the resistance R=0 and the current I=Imax. The current is governed only by the potential difference of the system. This situation is the free corrosion situation. The equilibrium potential is called the free corrosion potential, $\phi_{corr}$, and the current associated with $\phi_{corr}$ is the corrosion current, $I_{corr}$.

The above analogy can serve to better understand the two corrodents that are most commonly encountered by corrosion engineers: the reduction of H+ ions in an acidic solution, and the reduction of O2 in a neutral aerated solution. When a metal M is placed in an acidic solution, the following reactions occur:

$$\mathrm{M} \rightarrow \mathrm{M}^{n+} + n\,e^-$$

$$2\,\mathrm{H}^+ + 2\,e^- \rightarrow \mathrm{H}_2$$

Since the reduction potential of H+ is larger than the reduction potential of most metals of interest, the metal will undergo oxidation (anodic reaction). The following diagram will result:

Figure 2.9 Polarization Diagram for Corrosion in Acidic Solution

When the system is allowed to corrode freely, the potential $\phi_{corr}$ and the corrosion current $I_{corr}$ will apply. Furthermore, the Nernst potential of H+ reduction becomes [2,9]:

$$\phi_{N,H^+} = -\frac{2.303\,RT}{F}\,\mathrm{pH}$$

It is very interesting to note that, in the case of corrosion in acidic environments, the Nernst potential depends only on the temperature and on the pH. The derivation of the above equation can be found in Appendix A.

When a metal M is placed in neutral aerated water, e.g. rainwater or seawater, the following reactions occur:

$$\mathrm{M} \rightarrow \mathrm{M}^{n+} + n\,e^-$$

$$\tfrac{1}{2}\,\mathrm{O}_2 + \mathrm{H_2O} + 2\,e^- \rightarrow 2\,\mathrm{OH}^-$$

Since the reduction potential of O2 is larger than the reduction potential of most metals of interest, the metal will undergo oxidation (anodic reaction). The following diagram will result:

Figure 2.10 Polarization Diagram for Corrosion in Neutral Aerated Water

When the system is allowed to corrode freely, the potential $\phi_{corr}$ and the corrosion current $I_{corr}$ will apply. Furthermore, the Nernst potential of O2 reduction becomes [2,9]:

$$\phi_{N,O_2} = \phi_{N,O_2}^0 + \frac{2.303\,RT}{4F}\,\log\frac{P_{O_2}}{[\mathrm{OH}^-]^4}$$

where $P_{O_2}$ = partial pressure of oxygen in the solution.

In the case of corrosion in neutral aerated water, the Nernst potential depends on the temperature, pH (or pOH), and the partial pressure of oxygen in the water.

2.1.5 The Exchange Current Density, $i_0$

A term that appeared often in the previous polarization diagrams was $i_0$, which represents the exchange current density. When a metal is in equilibrium with its own ions, the Nernst potential, $\phi_N$, and exchange current density, $i_0$, apply [2,9]. An example of this is a Cu rod placed in a solution of Cu2+ ions (a solution of CuSO4). The equilibrium reached is a dynamic one. Although no changes are visible to the naked eye, reduction and oxidation of the metal are taking place at equal rates. This rate is termed the exchange current density, $i_0$. Electrons travel through the metal from the anodic to the cathodic sites, which are continuously changing locations.

In reduction reactions such as H+ and O2 reduction, the exchange current density, $i_0$, is very sensitive to the condition of the metal surface. Furthermore, the corrosion current, $I_{corr}$, is highly dependent on the value of $i_0$. As shown schematically in Figure 2.11, $I_{corr}$ increases as $i_0$ increases. Consequently, the rate at which a metal will corrode varies as the surface preparation of the metal varies [2,9]. When testing metal samples to obtain $I_{corr}$, it is very important to ensure that the surfaces of the samples are prepared consistently, so that variations in $i_0$ will not introduce errors in the determination of $I_{corr}$.

Figure 2.11 Dependence of $I_{corr}$ on the Value of $i_0$

2.1.6 Determination of $\phi_{corr}$ and $i_{corr}$

As mentioned previously, the values of $I_{corr}$ and $\phi_{corr}$ are obtained from the intersection point between the anodic and the cathodic lines of polarization diagrams. Up to this point, the diagrams are similar in that each of the two lines is represented by a straight line. This will be correct in approximately 90% of cases, in which activation polarization behavior governs [2,9]. However, an alternative to this situation warrants some attention. This behavior is termed concentration polarization. The shape of the polarization lines is determined by either concentration or activation polarization.

For the sake of compatibility in calculations, the term $i_{corr}$ is used instead of $I_{corr}$. The term $i_{corr}$ represents the corrosion current density, and it is directly proportional to $I_{corr}$, the corrosion current. In fact, $I_{corr} = i_{corr} \cdot A$, where A is the surface area of the anode.

2.1.6.1 Activation Polarization

Activation polarization, or Tafel behavior, makes up 90% of the cases, and it occurs when the rate of a reaction is controlled by the slowest of the steps in the reaction sequence, i.e. the electrochemistry of the system governs [2,9]. This behavior occurs in well stirred solutions, where the reaction rate is not limited by the speed at which a slow species can move through the solution.

In activation polarization, both the reduction and the anodic reactions display Tafel behavior, i.e. both behave linearly. The Tafel equation relates the overvoltage $\eta$ to the current density $i$ by the following equation:

$$\eta = \beta \log (i/i_0) \qquad (2.17)$$

where $\beta$ = Tafel constant, or Tafel slope [2,9]. The values of $\beta$ have been tabulated for the various metals in different media. Table 2.3 shows some typical values.

Table 2.3 Typical $\beta$ Values [14] (columns: Metal, Temperature (°C), Solution; the tabulated solutions include 1N HCl, 0.1N HCl and 2N H2SO4)
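Since $\phi_{corr}$ and $i_{corr}$ are read from the intersection of the anodic and cathodic lines, the Tafel relation of Equation 2.17 can be used to locate that point in closed form. The sketch below assumes both lines are straight on a log scale and treats the cathodic slope as a positive magnitude; all numerical values and function names are hypothetical.

```python
import math


def tafel_overvoltage(beta, i, i0):
    """Equation 2.17: eta = beta * log10(i / i0)."""
    return beta * math.log10(i / i0)


def corrosion_point(phi_n_anode, beta_a, i0_a, phi_n_cathode, beta_c, i0_c):
    """Intersect the anodic line  phi = phi_n_anode + beta_a*log10(i/i0_a)
    with the cathodic line        phi = phi_n_cathode - beta_c*log10(i/i0_c).
    """
    log_i = (phi_n_cathode - phi_n_anode
             + beta_a * math.log10(i0_a)
             + beta_c * math.log10(i0_c)) / (beta_a + beta_c)
    phi_corr = phi_n_anode + beta_a * (log_i - math.log10(i0_a))
    return phi_corr, 10.0 ** log_i


# Hypothetical metal corroding in acid: Nernst potentials in V, slopes in
# V/decade, exchange current densities in A/cm^2.
phi_corr, i_corr = corrosion_point(-0.44, 0.10, 1.0e-6, 0.00, 0.10, 1.0e-6)
print(round(phi_corr, 3), i_corr)

# Raising the cathodic exchange current density raises i_corr.
_, i_corr_rough = corrosion_point(-0.44, 0.10, 1.0e-6, 0.00, 0.10, 1.0e-5)
print(i_corr_rough > i_corr)
```

The second call illustrates the sensitivity to $i_0$ discussed in Section 2.1.5: a tenfold increase in the cathodic exchange current density shifts the intersection to a higher corrosion current density.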

Figure 2.12 shows a typical diagram in which activation polarization governs. In order to solve for the values of $\phi_{corr}$ and $i_{corr}$, the following two equations are solved simultaneously:

Figure 2.12 Activation Polarization Diagram

2.1.6.2 Concentration Polarization

In concentration polarization, only the reduction reaction is affected. The oxidation reaction exhibits Tafel behavior as it did in the case of activation polarization. A typical diagram is shown in Figure 2.13.

Figure 2.13 Concentration Polarization Diagram [2,9]

Concentration polarization usually governs in cases where the solution is stagnant. The rate of the reaction depends on how quickly certain species are capable of diffusing through the stagnant solution, towards the metal surface where corrosion occurs [2,9]. For example, in corrosion due to H+ reduction where the solution is stagnant, the initial condition is represented by Figure 2.14a. However, as H+ reduction occurs at the surface of the metal, H+ ions are used up. As a consequence, a thin boundary layer is formed in which the concentration of H+ ions varies from the concentration of H+ in the bulk solution, $[\mathrm{H}^+]_b$, to zero. This concentration gradient causes H+ ions to diffuse towards the surface where they are then consumed by the corrosion process. Figure 2.14b illustrates the boundary layer in question.

Figure 2.14 Distribution of H+ Ions in Time (panels a and b; horizontal axis: distance from metal surface)

The rate at which certain species (in this case the H+ ion) are able to diffuse through the boundary layer will govern the rate of the corrosion reaction. In concentration polarization, it is said that mass transfer controls the rate of the reaction. The maximum expected current in concentration polarization cases is called the limiting current, $i_L$. As can be seen in Figure 2.13, when concentration polarization governs, $i_{corr}$ is smaller than it would be if activation polarization governed, i.e. if the solution had been well stirred.

The limiting current densities for the reduction reaction can be calculated from the following equation [2,9]:

$$i_L = k\,n\,F\,[a_{red}]_b \qquad (2.19)$$

where
k = mass transfer coefficient (cm/sec)
n = number of electrons transferred in the reduction
F = Faraday's constant (96500 C/eq)
$[a_{red}]_b$ = activity, or concentration, of the reduced species
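Equation 2.19 is a direct product, and a short calculation makes the units explicit. In this sketch the mass transfer coefficient and the bulk oxygen concentration are assumed values chosen only for illustration, and the function name is hypothetical.

```python
F = 96500.0  # Faraday's constant, C/eq


def limiting_current_density(k_cm_per_s, n, bulk_conc_mol_per_cm3):
    """Equation 2.19: i_L = k * n * F * [a]_b.

    Units: (cm/s) * (eq/mol) * (C/eq) * (mol/cm^3) = A/cm^2.
    """
    return k_cm_per_s * n * F * bulk_conc_mol_per_cm3


# Assumed values: k = 1e-3 cm/s and a dissolved O2 concentration of
# 2.5e-7 mol/cm^3, with n = 4 electrons per O2 molecule.
i_limit = limiting_current_density(1.0e-3, 4, 2.5e-7)
print(i_limit)
```

Doubling the bulk concentration doubles $i_L$, which is the proportionality relied upon when the effect of dissolved species on the corrosion rate is discussed.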

2.1.7 Effect of Varying Parameters Using Polarization Diagrams

Polarization diagrams have many uses. They help one visualize electrochemical phenomena which would otherwise be quite abstract. Polarization diagrams also help in understanding and predicting the effect of varying certain parameters influencing the corrosion rate in the two corrodents of interest, acidic solutions and neutral aerated water. The parameters discussed in the following sections are:
• PO2 and H+ concentration,
• O2 solubility,
• multiple corrodents,
• passivity,
• galvanic attack, and
• chloride content.

2.1.7.1 PO2 and H+ Concentration

In cases of corrosion in neutral aerated water, the value of PO2, the partial pressure of oxygen (1 atm for pure oxygen, 0.21 atm in air), affects only the value of the Nernst potential, $\phi_{N,O_2}$ [2,9]:

$$\phi_{N,O_2} = \phi_{N,O_2}^0 + \frac{2.303\,RT}{4F}\,\log\frac{P_{O_2}}{[\mathrm{OH}^-]^4}$$

As the value of PO2 increases, so does the value of $\phi_{N,O_2}$, and this causes the cathode line to shift upwards. This is illustrated in Figures 2.15a and 2.15b, for activation and concentration polarization, respectively. In the case of activation polarization, the results of increasing PO2 are as follows: increase in $i_{corr}$, increase in $\phi_{corr}$, no change in $i_0$, no change in the anode line. The above results also apply in the case of concentration polarization because, as PO2 increases, so does the value of $[\mathrm{O_2}]_b$, and this leads to an increase in $i_L$. The vertical line is therefore shifted to the right.
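The upward shift of the cathode line with PO2 can be quantified from the Nernst expression for O2 reduction. The sketch below assumes a standard potential of 0.401 V for the O2/OH- couple and a water ion product of 1e-14 at 25 °C; neither value is stated in the text, and the function name is hypothetical.

```python
import math

R = 8.314    # gas constant, J/(deg mole)
F = 96500.0  # Faraday's constant, C/eq


def oxygen_nernst_potential(p_o2_atm, ph, phi_standard=0.401,
                            temperature_k=298.15):
    """phi = phi0 + 2.303 RT/(4F) * log10(P_O2 / [OH-]^4).

    phi_standard = 0.401 V and Kw = 1e-14 are assumptions for 25 degrees C.
    """
    oh_conc = 10.0 ** (ph - 14.0)
    slope = 2.303 * R * temperature_k / (4.0 * F)
    return phi_standard + slope * math.log10(p_o2_atm / oh_conc ** 4)


phi_air = oxygen_nernst_potential(0.21, ph=7.0)   # air saturated water
phi_pure = oxygen_nernst_potential(1.0, ph=7.0)   # bubbling pure O2
shift = phi_pure - phi_air
print(round(phi_air, 3), round(phi_pure, 3), round(shift, 4))
```

Under these assumptions the shift between air saturated water and water saturated with pure oxygen is on the order of 10 mV, so the effect on the Nernst potential is modest compared with the effect of pH.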

In cases of corrosion in acidic solutions, as the concentration of H+ ions increases and the pH decreases, the Nernst potential $\phi_{N,H^+}$ increases ($\phi_{N,H^+} = -2.303\,(RT/F) \cdot \mathrm{pH}$). This results in the reduction lines shifting upwards. In the case of activation polarization, the results of increasing the concentration of H+ are as follows: increase in $i_{corr}$, increase in $\phi_{corr}$, no change in $i_0$, no change in the anode line. The above results also apply in the case of concentration polarization because, as the value of $[\mathrm{H}^+]_b$ increases, the value of $i_L$ increases as well, and the vertical portion of the reduction curve is shifted to the right.
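The dependence of the H+ Nernst potential on pH is linear, which a short calculation makes concrete. This is a minimal sketch at an assumed temperature of 25 °C; the function name is hypothetical.

```python
R = 8.314    # gas constant, J/(deg mole)
F = 96500.0  # Faraday's constant, C/eq


def hydrogen_nernst_potential(ph, temperature_k=298.15):
    """phi_NH+ = -2.303 * (RT/F) * pH, in volts."""
    return -2.303 * R * temperature_k / F * ph


for ph in (0.0, 3.0, 7.0):
    print(ph, round(hydrogen_nernst_potential(ph), 3))
```

Each unit decrease in pH raises the potential by roughly 59 mV at 25 °C, which corresponds to the upward shift of the reduction line described above.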

Figure 2.15 Effect of Varying PO2 (curves for PO2 = 1 atm, bubbling O2, and PO2 = 0.21 atm, air saturated water; horizontal axis: log i)

2.1.7.2 O2 Solubility

This parameter must not be confused with PO2 studied previously. In this case, the PO2 is kept constant, but the solubility of O2 varies depending on the presence of impurities such as chloride ions in the aqueous medium [2,9]. The solubility of O2 affects only cases of corrosion in neutral aerated water, where concentration polarization governs. The solubility of O2 varies with chloride content as illustrated in Figure 2.16. The O2 solubility is highest at approximately 3% NaCl content, which corresponds to typical seawater [2,9].

Figure 2.16 Variation of O2 Solubility with NaCl Concentration

Assuming a constant PO2, [OH-] and temperature, an increase in the concentration of dissolved O2 in the water causes an increase in $i_L$ ($i_L = k\,n\,F\,[\mathrm{O_2}]_b$). Consequently, this will result in an increase in $i_{corr}$.

2.1.7.3 Multiple Corrodents

Situations where a metal is subjected to the effects of more than one corrodent are not uncommon. For example, acid rain is a corrodent rich in both oxygen and H+ ions. In such a case, two cathodic reactions occur simultaneously [2,9]:

However, only one anodic reaction is involved:

$$\mathrm{M} \rightarrow \mathrm{M}^{n+} + n\,e^-$$

Figure 2.17a illustrates the situation of multiple corrodents in cases where activation polarization governs. The terms $i_a$ and $i_b$ on the polarization diagram represent the current density that would apply if one corrodent was acting at a time. In situations of multiple corrodents acting simultaneously, a new reduction line must be drawn. This line is constructed by adding $i_a$ and $i_b$ at any given value of $\phi$. This line is used to determine the actual current density existing at the metal surface. The value of $i_{total}$, the total current density of the oxidation of the metal, is equal to $i_{O_2} + i_{H^+}$, the current densities of the reduction of O2 and H+, respectively. Figure 2.17a shows clearly that when a second corrodent is introduced in a system, the values of $i_{corr}$ and $\phi_{corr}$ increase. Another interesting point to note is that the reduction of O2 has a higher contribution to $i_{corr}$ than does the reduction of H+ ions.

In cases where concentration polarization governs, the result of adding a second corrodent to a system is to increase both $i_{corr}$ and $\phi_{corr}$. This conclusion is more easily reached when studying the polarization diagram in Figure 2.17b. The new line representing the situation of multiple corrodents has, as before, a constant value of $i_{total}$, which is equal to $i_{O_2} + i_{H^+}$.

Figure 2.17 Effect of Multiple Corrodents [9]
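The construction of the combined reduction line, adding the individual cathodic current densities at each potential, can be sketched numerically for the activation polarization case. All Tafel parameters below are hypothetical, the cathodic slopes are treated as positive magnitudes, and bisection is simply one convenient way to find the intersection.

```python
def tafel_current(phi, phi_n, beta, i0, anodic):
    """Current density on a Tafel line at potential phi."""
    sign = 1.0 if anodic else -1.0
    return i0 * 10.0 ** (sign * (phi - phi_n) / beta)


def corrosion_point(anode, cathodes, lo, hi, tol=1e-12):
    """Bisect for the potential at which the anodic current density equals
    the sum of all cathodic current densities."""
    def net(phi):
        i_anodic = tafel_current(phi, *anode, anodic=True)
        i_cathodic = sum(tafel_current(phi, *c, anodic=False) for c in cathodes)
        return i_anodic - i_cathodic

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    phi = 0.5 * (lo + hi)
    return phi, tafel_current(phi, *anode, anodic=True)


# Each line is (Nernst potential in V, Tafel slope in V/decade, i0 in A/cm^2).
metal = (-0.44, 0.10, 1.0e-6)
hydrogen = (0.00, 0.10, 1.0e-6)
oxygen = (0.80, 0.10, 1.0e-14)

phi_one, i_one = corrosion_point(metal, [hydrogen], lo=-0.44, hi=0.80)
phi_two, i_two = corrosion_point(metal, [hydrogen, oxygen], lo=-0.44, hi=0.80)
print(round(phi_one, 3), i_one)
print(round(phi_two, 3), i_two)
```

With these assumed parameters, introducing the second corrodent raises both the corrosion potential and the corrosion current density, in line with the trend described for Figure 2.17a.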

2.1.7.4 Galvanic Attack

Galvanic attack occurs when two metals are placed in electrical contact in the presence of a corrodent. In this case, there is one cathodic reaction (reduction of H+ or O2), and two anodic reactions:

The resulting corrosion system is illustrated in Figure 2.18. If only metal $M_1$ is present, lines 2 and 4 would apply and $i_{corr,1}$ would result, while if only metal $M_2$ is present, lines 1 and 5 would apply and $i_{corr,2}$ would result [9]. When both metals are involved, then lines 3 and 6 would apply. Lines 3 and 6 are obtained by adding the values of the current densities of lines 1 and 2, and lines 4 and 5, respectively. It can be seen that when two metals are involved, the rate of corrosion of the more anodic of the two metals, $M_1$, will increase and the rate of corrosion of the more cathodic of the two, $M_2$, will decrease. Metal $M_1$ is said to suffer accelerated corrosion, or galvanic attack [2,9].

Figure 2.18 Galvanic Attack [9]

2.1.7.5 Passivity

Many metals, such as Fe, Cr, Ni, Ti, and Al, exhibit passivity in various corrodents. Passivity is the formation of a protective oxide layer on the surface of the metal which causes it to corrode at a much slower rate than that predicted by Tafel behavior [9]. Figure 2.19 illustrates a typical polarization line of a metal which exhibits passivity. Three distinct regions can be discerned: the active region, the passive region, and the transpassive region.

Figure 2.19 Polarization Diagram of a Metal Exhibiting Passivity [2,9]

The active region, considered so far, is the region limited by the Nernst potential, $\phi_{N,M}$, and the passive potential, $\phi_{pp}$. The currents in this region vary between the exchange current density, $i_0$, and the critical current density, $i_{crit}$. In this region, the metal exhibits standard Tafel behavior.

The passive region is the region limited by the passive potential, $\phi_{pp}$, and the transpassive potential, $\phi_{tp}$. In this region, the current is equal to the passive current density, $i_p$, and does not vary with potential. Passivity is due to the adsorption of O2 onto the metal surface. This adsorption occurs at potentials between $\phi_{pp}$ and $\phi_{tp}$, at which point passivity begins to break down. As can be seen in Figure 2.19, the lower the value of $i_p$, the lower the value of $i_{corr}$ obtained when the intersection of the two polarization lines occurs within the passive region.

The transpassive region is the region where the potential is higher than $\phi_{tp}$. The breakdown of passivity begins at $\phi_{tp}$, when the adsorbed layer of O2 is no longer stable and begins to disintegrate [2,9]. The value of the current is not constant in this region, but increases with increasing potential.

2.1.7.6 Chloride Content

The effect of the presence of chloride ions in a solution, and to a lesser extent other halogen ions, is to increase the value of the exchange current density, $i_0$, of the metal in the given solution, and to break down its passive layer [2]. Chloride ions break down, and/or prevent the formation of, a passive layer in metals such as Fe, Cr, Ni, Co, and stainless steels. The passive layer forms due to the adsorption of oxygen onto the metal surface. When chlorides are introduced into the solution, they compete with O2 for adsorption [2]. Unlike the adsorbed O2, which causes the rate of the metal dissolution to decrease, chloride ions favour hydration of the metal ions and therefore increase the rate of dissolution [2].

The value of the potential of the system will determine whether O2 or Cl- ions will be adsorbed, i.e. whether passivity will form or break down. Below a certain potential, chloride ions cannot displace the adsorbed O2, and the passive layer will remain stable and corrosion will be negligible. This potential is termed the critical potential [2]. At potentials higher (or more noble) than the critical potential, Cl- ions are capable of displacing adsorbed O2, thus destroying the passive layer.

Breakdown of passivity occurs locally and is not spread out uniformly over the metal surface. Destruction of the passive layer typically starts at a point of discontinuity in the passive film. The result is localized attack and the formation of pits [2]. This combination of small anodic area, the pit, and large cathodic area, the remaining metal surface, results in a situation of accelerated corrosion. Furthermore, the higher the current flow at any pit, the less likely that other pits will form nearby, i.e. the number of pits per unit area is smaller for deeper pits than for shallower ones [2].

An effective inhibitor for Cl- ion attack is the addition of extraneous anions to the solution. Species such as NO3- and SO4 2-, which will not break down the passive layer, compete with Cl- ions for sites on the passive film and, consequently, inhibit the formation of pits [2]. The effect of Cl- ions can be so pronounced that in some cases stainless steels, which are known for their resistance to most corrosive environments, have been observed to corrode at rates similar to those of metals that do not exhibit passivity at all [2].

2.2 Measuring Corrosion Rates

In this section, the theory behind the corrosion rate measurements is outlined. It is on these basic principles that corrosion-measuring equipment is developed. Essentially, there are two methods used to obtain the corrosion rate electrochemically: Tafel Extrapolation and Linear Polarization [9,16].

A metal which is exposed to a corrodent such as an acidic solution or neutral aerated water will acquire a certain potential, $\phi_{corr}$. It can be seen on the polarization diagram of Figure 2.20a that at this potential, the current resulting from the metal oxidation is equal in magnitude to the current feeding the reduction of the corrodent, i.e. at this point of equilibrium, the electrons are being produced and consumed at the same rate. This current is termed the corrosion current density, $i_{corr}$.

If the system is manipulated such that a potential $\phi$, other than $\phi_{corr}$, is applied, then the anodic and cathodic currents, $i_a$ and $i_c$, will no longer be equal and a net current, $i$, will flow. Figures 2.20b and 2.20c illustrate this point. When the potential increases above $\phi_{corr}$, the current leaving the anode will increase, causing the metal to dissolve more quickly. This phenomenon is called anodic polarization [9,16]. Conversely, if the potential is decreased below $\phi_{corr}$, the current leaving the anode will decrease and the metal will dissolve at a slower rate. This phenomenon is called cathodic polarization [9,16].

If the imposed potential is varied and each value is plotted against the logarithm of the resulting current, a curve resembling Figure 2.21 would be obtained. The section of the curve below $\phi_{corr}$ represents the region of cathodic polarization, and the section above it represents the region of anodic polarization. When the potential is equal to $\phi_{corr}$, no net current is expected to flow. The above theory forms the basis of the two methods used to determine corrosion rates electrochemically: Tafel Extrapolation and Linear Polarization.

Figure 2.20 Variation of $i_a$ and $i_c$ with Potential: panels (a), (b) and (c), each plotting potential against log i

Figure 2.21 Tafel Curve: cathodic and anodic polarization of metal M, plotting potential against log i

2.2.1 Tafel Extrapolation

In Tafel Extrapolation, corrosion rates are measured using data obtained by polarizing a metal sample cathodically and then anodically. The very simplified schematic diagram in Figure 2.22 illustrates the typical setup. The metal under study is called the working electrode. It is placed in the corrodent along with the auxiliary and the reference electrodes. The auxiliary electrode is usually made up of an inert material, such as graphite or platinum. The purpose of this electrode is to act as either a source, in the case of anodic polarization, or a sink, in the case of cathodic polarization, for the resulting current $i$. The reference electrode measures the potential of the metal, and a potentiometer records these values. Simultaneously, an ammeter records the current flow to or from the working electrode. Finally, a potentiostat is used to impose the desired potential on the system.

Figure 2.22 Schematic of Setup for Tafel Test [16]

The first step, prior to polarizing the metal sample, is to determine the value of $\phi_{corr}$. The metal sample is placed in the corrodent, the potential is allowed to attain its equilibrium, and the anodic and cathodic reactions are allowed to proceed undisturbed. There is no net flow of electrons, i.e. $i_a = i_c = i_{corr}$ and $\phi = \phi_{corr}$. This potential is called the open circuit corrosion potential, and it is measured by the reference electrode.

Once the value of $\phi_{corr}$ is recorded, the potentiostat then imposes a potential of $\phi_{corr} - \Delta\phi$. This situation is represented by point a in Figure 2.23. The potential remains at $\phi_{corr} - \Delta\phi$ for a specified amount of time, and the value of the resulting current, $i$, is recorded. The potential is then increased by a predetermined increment, and the resulting current is again plotted at this new $\phi$ value. This continues until the potential reaches $\phi_{corr} + \Delta\phi$, and thus all potential values between $\phi_{corr} - \Delta\phi$ and $\phi_{corr} + \Delta\phi$ have been scanned. The result of plotting the imposed potential versus the logarithm of the resulting current is the complete curve illustrated in Figure 2.23.

Figure 2.23 Tafel Curve Obtained by Varying $\phi$ [9,16]

Another typical curve is illustrated in Figure 2.24. At low currents, this curve is non-linear. However, the two branches of the curve become linear at higher current values. This region of linearity is called the Tafel region. The slopes of the cathodic and anodic polarization lines in the Tafel regions are termed $\beta_c$ and $\beta_a$, respectively. The value of $\Delta\phi$ can range from 50 to 250 mV, or more. Typically, the Tafel region begins at $\phi_{corr} \pm 50$ mV, and ends when various phenomena cause the linearity of the curve to be lost, e.g., the potential attained encourages the formation of a passive layer and the curve suddenly continues vertically upward (current does not increase with increasing potential) [9,16].

The value of $i_{corr}$ is obtained by extrapolating the Tafel regions back to the corrosion potential, $\phi_{corr}$, where the two lines intersect. Figure 2.25 shows the intersection of the two dashed lines at a point where $\phi = \phi_{corr}$ and $i = i_{corr}$. Once the value of $i_{corr}$ is known, the corrosion rate in mm/yr can be computed.
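The extrapolation step can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the function names, the synthetic scan and the 50 mV cutoff used to select the Tafel regions are hypothetical choices, not the procedure or data of this thesis. It fits a straight line to potential versus log10 of the current on each branch and takes the intersection of the two lines as the estimate of the corrosion current.

```python
import math

def fit_line(xs, ys):
    """Least-squares line y = m*x + b through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x

def tafel_extrapolation(potentials, currents, phi_corr, cutoff=0.05):
    """Estimate (i_corr, beta_a, beta_c) from a polarization scan.

    potentials are in V, currents are net currents (anodic positive).
    Only points more than `cutoff` volts away from phi_corr are treated
    as belonging to the Tafel regions (an assumed selection rule).
    """
    anodic = [(math.log10(i), p) for p, i in zip(potentials, currents)
              if p - phi_corr > cutoff and i > 0]
    cathodic = [(math.log10(-i), p) for p, i in zip(potentials, currents)
                if phi_corr - p > cutoff and i < 0]
    beta_a, b_a = fit_line(*zip(*anodic))      # potential = beta_a * log10(i) + b_a
    slope_c, b_c = fit_line(*zip(*cathodic))   # slope_c is negative
    # The two fitted lines intersect at log10(i_corr).
    log_i_corr = (b_c - b_a) / (beta_a - slope_c)
    return 10 ** log_i_corr, beta_a, -slope_c

def synthetic_scan(phi_corr, i_corr, beta_a, beta_c, delta_phi=0.25, step=0.01):
    """Made-up polarization data from an idealized exponential rate law."""
    n = int(round(2 * delta_phi / step))
    potentials = [phi_corr - delta_phi + k * step for k in range(n + 1)]
    currents = [i_corr * (10 ** ((p - phi_corr) / beta_a)
                          - 10 ** (-(p - phi_corr) / beta_c))
                for p in potentials]
    return potentials, currents

# Hypothetical example values, chosen only for illustration.
potentials, currents = synthetic_scan(-0.60, 1e-5, 0.12, 0.12)
i_est, ba_est, bc_est = tafel_extrapolation(potentials, currents, phi_corr=-0.60)
print(f"i_corr ~ {i_est:.2e}, beta_a ~ {ba_est:.3f} V, beta_c ~ {bc_est:.3f} V")
```

Because points close to the cutoff are not yet perfectly linear, the estimate recovers the assumed corrosion current only approximately; raising the cutoff trades fewer points for a cleaner Tafel region.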

Figure 2.24 Tafel Regions [9,16]

Figure 2.25 Obtaining $i_{corr}$ from the Tafel Curve [9,16]

2.2.2 Linear Polarization


An alternative to Tafel Extrapolation is the method of Linear Polarization, which has been studied extensively to date. The procedure is the same as that for Tafel Extrapolation with the following exceptions [9,16]:
• The value of $\Delta\phi$ is approximately 10 to 20 mV,
• The values of $\beta_a$ and $\beta_c$ are not obtained automatically, but must be known or estimated beforehand,
• The data points obtained during polarization are plotted on a linear-linear graph, and not on a linear-log plot.

In the method of Linear Polarization, once the value of $\phi_{corr}$ is recorded, the potential is dropped to ($\phi_{corr}$ - 20 mV). It is then raised incrementally up to a potential of ($\phi_{corr}$ + 20 mV), and the current is recorded at each step. The $\phi$ values and corresponding $i$ values are plotted on a linear scale and the resulting graph resembles Figure 2.26.

Figure 2.26 Linear Polarization Curve (potential from $\phi_{corr} - \Delta\phi$ to $\phi_{corr} + \Delta\phi$ plotted against current on linear axes)

Under these conditions of slight polarization, i.e. with $\Delta\phi \leq 20$ mV, the potential varies linearly with the resulting current. Stern and Geary (1957) derived the following relationship to obtain the value of $i_{corr}$:

$$i_{corr} = \frac{\beta_a \beta_c}{2.3\,(\beta_a + \beta_c)} \cdot \frac{\Delta i}{\Delta\phi}$$

where the term $(\Delta\phi / \Delta i)$ is also called the polarization resistance, $R_p$, given in ohms. The values of $\beta_a$ and $\beta_c$ can either be determined by the method of Tafel Extrapolation or be estimated. The value of $i_{corr}$ is determined by the Stern-Geary equation, and the corrosion rate in mm/yr can then be computed.
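The calculation chain described in this section can be sketched as follows. This is a minimal sketch under stated assumptions, not the computation used in this thesis: the function names and all numerical inputs are hypothetical, the currents are taken to be current densities in A/cm^2 (so that the slope has units of ohm·cm^2), and the conversion to mm/yr uses Faraday's law with textbook values for iron (55.85 g/mol, valence 2, 7.87 g/cm^3).

```python
def polarization_resistance(potentials, currents):
    """Least-squares slope d(phi)/d(i) of the linear polarization curve."""
    n = len(potentials)
    mean_i = sum(currents) / n
    mean_p = sum(potentials) / n
    sii = sum((i - mean_i) ** 2 for i in currents)
    sip = sum((i - mean_i) * (p - mean_p) for i, p in zip(currents, potentials))
    return sip / sii

def stern_geary_i_corr(r_p, beta_a, beta_c):
    """Stern-Geary: i_corr = beta_a*beta_c / (2.3*(beta_a + beta_c)*R_p)."""
    return beta_a * beta_c / (2.3 * (beta_a + beta_c) * r_p)

def corrosion_rate_mm_per_year(i_corr, atomic_mass, valence, density):
    """Faraday's law. i_corr in A/cm^2, atomic_mass in g/mol, density in g/cm^3."""
    faraday = 96485.0                       # C/mol
    seconds_per_year = 365.0 * 24 * 3600
    cm_per_second = i_corr * atomic_mass / (valence * faraday * density)
    return cm_per_second * seconds_per_year * 10.0   # cm -> mm

# Hypothetical scan of +/-20 mV around phi_corr = -0.60 V with R_p = 2000 ohm.cm^2.
currents = [k * 1e-6 for k in range(-10, 11)]        # A/cm^2
potentials = [-0.60 + 2000.0 * i for i in currents]  # V
r_p = polarization_resistance(potentials, currents)
i_corr = stern_geary_i_corr(r_p, beta_a=0.12, beta_c=0.12)  # assumed Tafel slopes
rate = corrosion_rate_mm_per_year(i_corr, atomic_mass=55.85, valence=2, density=7.87)
print(f"R_p = {r_p:.0f} ohm.cm^2, i_corr = {i_corr:.2e} A/cm^2, rate = {rate:.3f} mm/yr")
```

Since the Tafel slopes enter only through the factor in front of 1/R_p, an error in the assumed slopes shifts every computed rate by the same proportion, which is why estimated slopes are often acceptable for comparing soils.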

2.3 Soil Corrosion and Its Effects on Underground Infrastructure

This section deals with the principles of soil corrosion and its effects on the underground infrastructure. Underground pipelines make up the greatest proportion of the metals threatened by soil corrosion. The various mechanisms of soil corrosion are outlined and explained from an electrochemical perspective.

The deterioration of metal pipelines in soils can be due to many phenomena. The most important ones are the following:
• the formation of differential aeration cells,
• galvanic attack,
• selective leaching, and
• stress-corrosion cracking.

2.3.1 Differential Aeration Cells


When a pipeline is exposed to conditions which vary along its length, it can be subjected to variations in O2 exposure [1,2,9]. This results in potential differences and, consequently, in corrosion of the pipe section located in the area of low O2 content.

A situation which is often faced is a pipe which encounters different soil types along its path. Different soils have different porosities and therefore different O2 contents. For example, clays typically have very low porosities and, consequently, low O2 concentrations. On the other hand, sands are highly porous and well aerated, and generally contain higher levels of O2. When a pipe runs through both of these soils, a corrosion cell is created. The section of pipe located in the clay will have a lower potential (since the O2 concentration is lower) than the section in the sand. As a result, the section in the clay will be anodic to the section in the sand, and corrosion will occur in the pipe located in the clay. The pipe itself will serve as the electrical conductor allowing electrons to move from the anode to the cathode, and the groundwater will serve as the ionic conductor. The circuit is completed, and localized corrosion will proceed at an accelerated pace [1].

A similar situation may be created when a pipe passes under a paved surface, such as a parking lot or a street [1]. The soil beneath the paved surface generally has a lower oxygen content than does the soil beneath the unpaved surface, which is more readily exposed to air and oxygen-rich rainwater. A corrosion cell is therefore set up with the pipe beneath the pavement being anodic to the surrounding pipe. Once again, the pipe itself acts as the electrical conductor, and the groundwater as the ionic conductor.

Another cause of differential aeration cells is the improper installation of new pipes [7]. Pipes are usually rested directly on undisturbed soil and then covered with relatively loose backfill. The backfill is generally more permeable than the compacted, undisturbed soil, and will contain higher concentrations of oxygen. A cell is, therefore, formed with the pipe bottom being anodic to the pipe crown. Electrons move through the pipe itself, from the bottom to the more aerated crown, with the groundwater acting as the ionic conductor. This explains why most corrosive attacks on pipelines occur on the bottom 1/4 of the pipe.
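The direction of the potential difference in a differential aeration cell can be illustrated with the Nernst equation for oxygen reduction. The sketch below is an idealized equilibrium calculation and an assumption of this illustration, not a result from the thesis: it uses the textbook reaction O2 + 4H+ + 4e- -> 2H2O with a standard potential of 1.229 V, and the oxygen partial pressures are hypothetical. Potentials measured on buried pipes are mixed potentials and will differ from these equilibrium values.

```python
import math

def oxygen_reduction_potential(p_o2_atm, ph, temp_c=25.0):
    """Equilibrium potential (V vs. SHE) of O2 + 4H+ + 4e- -> 2H2O."""
    gas_constant = 8.314     # J/(mol K)
    faraday = 96485.0        # C/mol
    temp_k = temp_c + 273.15
    slope = math.log(10) * gas_constant * temp_k / faraday   # about 0.0592 V at 25 C
    return 1.229 - slope * ph + (slope / 4.0) * math.log10(p_o2_atm)

# Hypothetical oxygen levels: well-aerated sand versus poorly aerated clay.
e_sand = oxygen_reduction_potential(p_o2_atm=0.21, ph=7.0)
e_clay = oxygen_reduction_potential(p_o2_atm=0.021, ph=7.0)
print(f"sand: {e_sand:.3f} V, clay: {e_clay:.3f} V, difference: {e_sand - e_clay:.4f} V")
```

The section in the oxygen-poor soil comes out at the lower potential, which is consistent with it acting as the anode of the cell.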

2.3.2 Galvanic Attack

Another very common mechanism of soil corrosion is the phenomenon of galvanic attack. As described previously in Section 2.1.2.2, galvanic attack occurs when dissimilar metals are placed in electrical contact and exposed to a corrosive environment. The more anodic of the metals suffers accelerated corrosion, while the rate of corrosion of the more cathodic metal decreases.

A common example of galvanic attack is the corrosion of steel (iron) water and gas mains at the point of contact with the copper pipe services [1]. Copper, being cathodic to iron, will cause the iron pipe to suffer accelerated corrosion. Luckily, this situation does not cause too much damage because the area of the anode (the iron pipe) is much larger than the area of the cathode (the smaller copper line), and the corrosion is spread out over a large area.

Galvanic attack can also occur when a new pipe is placed in electrical contact with an old pipe, even if the pipes are made of the same material [1]. At first glance, galvanic attack may not be suspected because the materials are not different. However, over the years a protective film has formed on the surface of the old pipe, providing passivity and resistance to corrosion. The old steel is therefore cathodic to the new steel, which will suffer accelerated corrosion when the pipes are in contact with one another. Before long, the new pipe may be in worse condition than the old one, leading to the erroneous conclusion that the pipe material itself is to blame. This situation is often encountered when the capacity of a water pipe is insufficient, an additional water pipe is laid parallel to the old one, and the two are connected by cross-overs. The old pipe is the cathode, the new pipe is the anode, the metallic cross-over is the electrical conductor, and the groundwater is the ionic conductor.

Another example of galvanic attack is the accelerated corrosion of iron pipes placed in contact with a soil containing cinders [1]. Cinders are essentially made up of carbon, and are therefore cathodic to the iron pipe. The potential difference between the two materials is in the range of 0.8 to 1.1 V, which can cause very serious damage to the pipe.
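The area argument made for the copper service example can be put in numbers with a simple charge balance. The sketch below assumes that the total anodic current equals the total cathodic current and that each is spread uniformly over its electrode; the function name and all values are hypothetical and serve only to show how the area ratio scales the attack on the anode.

```python
def anode_current_density(cathode_current_density, cathode_area, anode_area):
    """Average anodic current density from charge balance:
    i_anode * A_anode = i_cathode * A_cathode."""
    return cathode_current_density * cathode_area / anode_area

# Hypothetical case: small copper service (cathode) on a large iron main (anode).
spread_out = anode_current_density(10.0, cathode_area=1.0, anode_area=100.0)
# Reversed area ratio: large cathode acting on a small anode.
concentrated = anode_current_density(10.0, cathode_area=100.0, anode_area=1.0)
print(spread_out, concentrated)
```

For the same cathodic current density, a large anode dilutes the attack while a small anode concentrates it, which is the same reasoning that explains the severity of pitting.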

2.3.3 Selective Leaching

Selective leaching, as described in Section 2.1.2.6, is the removal of one element from a solid alloy. This occurs because the alloy is composed of elements whose potentials are very different, resulting in the more anodic of the two being "corroded", leaving behind a porous mass consisting of the more cathodic element. An example of this is the graphitization of cast iron pipes [1]. Cast iron is composed of graphite flakes within a matrix of iron. Graphite is cathodic to iron, therefore a galvanic cell exists. As iron dissolves, it leaves behind a weak porous material which is characterized by a dark gray color.

2.3.4 Stress-Corrosion Cracking

Stress-corrosion cracking (SCC) results when a metal is subjected to a combination of a weak corrodent and a weak tensile stress [1,2,9]. As described in Section 2.1.2.7, failure can appear quite suddenly because no general surface corrosion is apparent. An example of localized stresses in buried pipes is the "cold bending" of pipes [1]. When underground pipes are manufactured, they are often subjected to "cold bending" to produce bends. This can result in significant residual stresses forming at the bends of the pipes. Also, the pipes can be subjected to localized stresses when they are forced into alignment once placed in the ground. These forces are sufficiently large to cause serious SCC problems. The weak corrodent is usually neutral aerated groundwater or weakly acidic groundwater. The result is the accelerated corrosion of the pipe in the areas where the pipe is subjected to tensile stresses.

2.4 Standards for Determining Corrosivity of Soils

The majority of the standards for determining soil corrosivity were designed to respond to a particular need, and as such, many different standards have been developed in North America, France, and Germany. Typically, the variables tested are the same, although the testing procedure may vary. Two standards which are used extensively in Quebec are AWWA C105 and PACE 82-3. This project focuses on these two standards.

2.4.1 AWWA C105

The American Water Works Association (AWWA) Standard was designed to assist the engineer in deciding whether or not to use polyethylene pipes instead of traditional materials. It must be kept in mind that the AWWA is a private organization and not an independent national entity, and as such, the grid developed may be biased to some extent. Nonetheless, the AWWA standard is used extensively in North America. The soil characteristics examined in the AWWA Standard are the following:
• soil type,
• drainage ability,
• soil resistivity,
• pH,
• oxidation-reduction potential, and
• sulfide content.

These soil characteristics are evaluated separately and the appropriate number of points is allocated to each result depending on the extent to which the factor contributes to the corrosivity of the soil. The points are then summed, and a final corrosivity index is reported. According to the AWWA standard, an index of 10 or more indicates that the soil tested is corrosive, whereas an index below 10 suggests that the soil is not corrosive [25].

A detailed description of the testing procedure is presented in Chapter 3: Procedures and Apparatus. This section deals with the factors tested and the points allocated to each. The following soil characteristics are considered:

• Soil type is a characteristic which is recorded in the AWWA grid, but which is not allocated any points. The type of soil (sand, clay, silt) is reported along with the following characteristics: color, odor, presence of rocks or pebbles, and the presence of organic materials.

• The drainage ability of a soil estimates the ease with which the soil is penetrated by water. The better the drainage ability of a soil, the less likely that the soil will become anaerobic and permit bacterial corrosion. The drainage ability is classified as either excellent, good or poor, and the following points are allocated:

Excellent

• Soil resistivity is an inverse measure of the ability of a soil to conduct a current. The lower the resistivity of a soil, the better are the soil's electrolytic properties, and the higher is the rate at which corrosion can proceed. Soil resistivity is measured in ohm-cm, and the following points are allocated:

• The pH of a soil is a measure of the H+ ion content of the soil. H+ ion reduction is an important reaction in the corrosion process. The following points are allocated to this factor:

• Oxidation-reduction potential, or redox potential, is a measure of the potential $\phi$ of the soil. The potential of a soil indicates whether or not a soil is capable of sustaining sulfate-reducing bacteria, which contribute greatly to the corrosion problem. A low potential indicates that the oxygen content of the soil is low and, consequently, that the conditions are ideal for the proliferation of sulfate-reducing bacteria. The following points are allocated:

• The sulfide content of a soil serves as an indicator of the presence of sulfate-reducing bacteria. The greater the sulfide content, the greater the possibility of the presence of sulfate-reducing bacteria. The following points are allocated:

When sulfides are present and the pH of the soil lies between 6.5 and 7.5, an additional 3 points shall be added to the calculated index. These points are added to account for the fact that the conditions are optimal for the proliferation of sulfate-reducing bacteria.
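The way the index is assembled can be sketched in a few lines of Python. The sketch covers only the rules stated in the text above: the points are summed, 3 points are added when sulfides are present and the pH lies between 6.5 and 7.5, and an index of 10 or more is read as corrosive. The function names are hypothetical, and the point values in the example are made up for illustration; they are not taken from the AWWA point tables.

```python
def awwa_corrosivity_index(points, sulfides_present, ph):
    """Sum the points per characteristic and apply the sulfide/pH rule."""
    index = sum(points.values())
    if sulfides_present and 6.5 <= ph <= 7.5:
        index += 3   # conditions favourable to sulfate-reducing bacteria
    return index

def is_corrosive(index, threshold=10):
    """An index at or above the threshold indicates a corrosive soil."""
    return index >= threshold

# Made-up point values, for illustration only.
example_points = {"resistivity": 5, "pH": 0, "redox": 1, "sulfides": 2, "drainage": 1}
index = awwa_corrosivity_index(example_points, sulfides_present=True, ph=7.0)
print(index, is_corrosive(index))
```

Passing the points in as a dictionary keeps the scoring tables separate from the summation logic, so the same function can be reused with whichever point allocation is being applied.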

2.4.2 PACE 82-3

The PACE 82-3 standard was designed to assist the engineer in the decision to provide protection to buried steel reservoirs, such as petroleum tanks. In the original standard, three soil samples are taken from the site and tested in the laboratory. Each soil sample is tested individually, and the results are compared with those of the other two samples. The three samples are originally located at a distance of 30 meters from one another, and their locations form an equilateral triangle when viewed from above.

An adaptation of this test was used in this project. The soil samples were received and tested individually, with no comparison made between samples. The soil characteristics examined in the PACE standard are the following:
• moisture content,
• soil resistivity,
• pH, and
• sulfide content.

These soil characteristics are evaluated separately and the appropriate number of points is allocated to each result depending on the extent to which the factor contributes to the corrosivity of the soil. The points are then summed, and a final corrosivity index is reported.

A detailed description of the testing procedure is presented in Chapter 3: Procedures and Apparatus. This section presents the factors tested and the points allocated to each. The following soil characteristics are considered:

• The moisture content of a soil describes the state in which the soil is received in the laboratory. This parameter indicates the extent to which a soil is saturated during the year. The soil is classified as either dry, moist or saturated, and the following points are allocated:


Moist

• Soil resistivity is measured in ohm-cm, and the following points are allocated:

• The pH of a soil is a measure of the H+ ion content of a soil. The following points are allocated to this factor:

• The sulfide content is classified as positive or negative. The following points are allocated:

Sulfide Content | Points

CHAPTER 3: PROCEDURES AND APPARATUS

The various laboratory experiments performed, the apparatus and the materials used, the purpose of each experiment, and the results obtained are described in this chapter. For each soil sample collected, the following variables were evaluated:
• soil type,
• drainage ability and moisture content,
• pH: direct and saturated,
• oxidation-reduction (redox) potential: direct and saturated,
• resistivity: direct and saturated,
• sulfide content: using HCl + lead acetate paper, and a solution of iodine + NaN3,
• concentration of Cl- ions,
• rate at which a standard metal sample will corrode in the given soil using the method of linear polarization, and
• calculated corrosion indices according to the AWWA and PACE methods.

3.1 Soil Samples

The soil samples tested were obtained from various regions of Quebec. In most cases, the samples were taken for the purpose of being tested according to the AWWA or the PACE methods by COREXCO, Montreal, to determine the need for cathodic protection of various metallic structures embedded in the given soil.

In total, 153 soil samples were tested. Of these, only 75 were available in quantities sufficient to permit testing for the corrosion rate using the method of linear polarization.

3.2 Soil Type

During the course of all of the tests to follow, the technician should observe certain characteristics that will enable the determination of the soil type, i.e. a sand, a clay, or a mixture of both (sand/clay). For example:
• The ability of water to penetrate a soil is a good clue to the soil type. For example, a sand is very quickly penetrated by water, a sand/clay is penetrated slowly, and a clay is almost not penetrated at all.
• The consistency of the soil when manipulated in one's hands: fine sand forms clumps that crumble easily, whereas clayey materials typically form clumps that are either hard or malleable, but do not crumble easily.
• The ease with which the soil is washed off the equipment, e.g. electrodes, plastic bowls, soil box, metal spatulas, etc. Sand rinses off equipment easily, requiring no scrubbing at all. Clays, on the other hand, require significant brushing to be removed, and sand/clays are relatively easy to wash off, but not as easily as pure sand.

Experience will enable the technician to confidently classify a soil as a sand, a sand/clay, or a clay. The soil types of the samples tested are presented in Table 3.1, in which a sand is represented by S, a clay by C, and a sand/clay by SC.

Soil #: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
Soil #: 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78
Soil #: 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117
Soil #: 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 247

Soil type: SC S S S S SC SC SC S SC S S S S S S S S S S S S C SC C S S SC SC SC SC SC S S SC SC SC SC SC
Soil type: SC S SC S S S S SC S S SC SC SC C SC SC SC SC SC SC SC SC SC SC SC SC S SC S S S SC S SC SC C S SC SC
Soil type: SC SC SC C SC C SC SC SC S S SC SC SC SC SC SC SC S SC S C SC SC S S S S S S C C C SC C S

Table 3.1 Soil Type Results

3.3 Drainage Ability / Moisture Content

The AWWA and PACE standards define humidity differently. According to AWWA, humidity is the ability of a soil to be penetrated by, or to drain, water. This variable is referred to as the drainage ability of the soil. According to PACE, humidity refers to the moisture content of a soil on site, or as it is received in the laboratory. This variable is termed the moisture content of the soil in this thesis.

Drainage Ability

The definition of the humidity index in the AWWA Standard is the drainage ability of a soil. In the laboratory, this parameter is determined very subjectively. Soil is placed in a bowl, and distilled water is added slowly to the soil. The speed with which the water penetrates the soil is observed. The drainage ability of the soil is then classified in one of the following three categories:
• Excellent: a soil that is easily penetrated by water, e.g. a sand
• Good: a soil that is penetrated slowly by water, e.g. a sand/clay
• Bad: a soil that is almost not penetrated by water at all, e.g. a clay

The drainage abilities of the soil samples tested are presented in Table 3.2, in which excellent drainage is represented by E, good drainage by G, and poor drainage by B.

Moisture Content

Unlike AWWA, the humidity index in the PACE grid is a measure of the moisture content of the soil sample as it is received in the laboratory. Again, this is a subjective evaluation, and it is dependent on the experience of the technician. The moisture content of the soil is determined by visual inspection, and by rolling the soil in one's hands. The moisture content of the soil is then classified in one of the following three categories:
• Saturated
• Moist
• Dry

This parameter estimates the moisture content of the soil under usual circumstances. Knowledge of the moisture conditions that a soil is subjected to throughout the year will enable the engineer to determine how corrosive the soil is to a water pipe placed permanently in that soil. For example, irrespective of the corrosivity of a soil in saturated condition, if it is kept very dry throughout the year, the pipe will not suffer any corrosion. However, the state of one sample does not indicate the general year-round conditions. This test should therefore be used in conjunction with interviews with individuals who are knowledgeable about the condition of the soil in general, i.e., the percentage of time that a soil is saturated, moist, and dry. The moisture contents of the soil samples tested are presented in Table 3.3, in which a dry soil is represented by D, a moist soil by M, and a saturated soil by S.

Soil #

79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117

Drainage Ability

G G
E

G
E E E

G G G
E E E E

E
E E E E

G
E E

B G B G G G

G E
E E E E E E E E E E E

E
E

B B B B B G
G

G G G G G
G

G
G

G
E

B
G

B
E E

G G G G G G B
G G

G
E

G G G E
E

E
E E E

B B B
G

E
E E

B
E E
G
G

E
E

G G B
E G

B B B B B
E

G G G

G

Table 3.2 Drainage Ability Results

40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78

Soil #

Table 3.3 Moisture Content Results

3.4 pH

The pH of the soil samples was measured in two different ways. The first method consisted of testing the soil in the state in which it was received in the laboratory, i.e. pH-direct. This test serves to represent the conditions found on site. The second method consisted of testing the soil once it had been saturated with distilled water, i.e. pH-saturated. This test may better represent the case in which the soil is saturated after a heavy rainfall or snow melt. Furthermore, it represents the conditions in which the soil is found during the linear polarization test. Although both procedures have their limitations in applicability, the pH of the soil was determined according to these two procedures because they were recommended by the AWWA and PACE grids, and were required to calculate the corrosivity index according to each of these grids.

Necessary Equipment
• pH meter
• 30 ml plastic container with cap
• Distilled water

The pH was measured using a pH meter, an electronic device with a probe that can be inserted into a solution of unknown pH. A pH meter is an example of an ion-selective, or ion-specific, electrode. It is based on the principle that the measured potential of a solution depends on the concentration of the reactants and the products involved in a cell reaction. The pH meter has three main components: a standard electrode of known potential, a special glass electrode that changes potential depending on the concentration of H+ ions in the solution into which it is dipped, and a potentiometer that measures the potential between the two electrodes. The potentiometer reading is automatically converted electronically to a direct reading of the pH of the solution being tested.

The glass electrode contains a reference solution of dilute hydrochloric acid in contact with a thin glass membrane. A silver wire coated with silver chloride is embedded in the solution. The electrical potential of the glass electrode depends on the difference in H+ concentration between the reference solution and the solution being used in the test. Thus the electrical potential varies with the pH of the solution tested [14,15].
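The conversion from electrode potential to pH can be sketched with the Nernst relation. The following Python sketch is an idealization and not a description of the meter used in this project: it assumes a perfectly Nernstian glass electrode, a single-point calibration in a buffer of known pH, and a sign convention in which the measured potential falls as the pH rises. All names and values are hypothetical.

```python
import math

def nernst_slope(temp_c=25.0):
    """Ideal electrode response in volts per pH unit: ln(10)*R*T/F."""
    gas_constant = 8.314   # J/(mol K)
    faraday = 96485.0      # C/mol
    return math.log(10) * gas_constant * (temp_c + 273.15) / faraday

def ph_from_potential(e_measured, e_buffer, ph_buffer, temp_c=25.0):
    """Convert a cell potential (V) to pH using one calibration buffer.

    Assumes the potential decreases by one Nernst slope per pH unit.
    """
    return ph_buffer + (e_buffer - e_measured) / nernst_slope(temp_c)

# Hypothetical readings: 0.000 V in a pH 7 buffer, -0.030 V in the soil slurry.
print(f"{nernst_slope() * 1000:.2f} mV per pH unit")
print(f"pH = {ph_from_potential(-0.030, e_buffer=0.0, ph_buffer=7.0):.2f}")
```

Real meters are usually calibrated against two or more buffers so that the actual electrode slope, rather than the ideal one, is used in the conversion.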

The AWWA recommends that the pH of the soil be determined for the soil as it is found in its natural state. The pH electrode is simply immersed into the soil and the value obtained is noted once it has stabilized. Extreme care must be taken when attempting to plunge the pH electrode into dry clay, or into a soil containing small pebbles, because of the delicate nature of the glass bulb. The values of pH-direct of the soil samples tested are listed in Table 3.4.

PACE recommends that the pH of the soil be determined by testing a slurry consisting of soil and distilled water. The 30 ml plastic container is filled halfway with soil and then filled almost to the top with distilled water. The container is then capped and shaken vigorously. The mixture is allowed to rest for approximately 5 minutes. The pH of the slurry is then determined by immersing the pH electrode into the saturated soil, and allowing the value to stabilize. The values of pH-saturated of the soil samples tested are listed in Table 3.5.

pH-direct: 6.8 7.2 7.5 7.6 7.4 7.8 8.8 8.1 6.1 6.7 7.6 6.7 6.8 7.9 6.9 7.7 7.3 7.7 6.9 6.9 7.5 7.1 5.9 6.7 5.8 6.1 6.2 7 7.2 7.1 7.8 7.7 5.9 7.7 7.4 8.2 7.7 7.5 7.6

Soil #: 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117

pH-direct: 7.3 7.9 7.4 8.2 7.2 6.7 6.1 5.7 6 5.9 7 7.2 7.3 7.4 7.4 7.6 7.4 8 7.4 7.8 7.2 7.6 7.9 7.5 6.7 6.8 7.8 7.7 8 7.5 7.1 8.1 8.3 7.1 6.6 7.3 7.3 6.8 7.2

Table 3.4 pH-direct Results

Soil #: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78

Table 3.5 pH-saturated Results

3.5 Oxidation-Reduction Potential (Redox Potential)

Like pH, the redox potential is measured in two different ways for each soil sample. The first method is the direct measurement, redox-direct, in which the soil is tested as it is received in the laboratory to represent the conditions found on site. Also, this variable is required to calculate the corrosivity index according to the AWWA standard. The second method involves testing the soil once it has been saturated with distilled water. This measurement is referred to as redox-saturated. This method serves to represent the conditions in which the soil is found during the linear polarization tests.

Necessary Equipment
• Digital voltmeter
• 30 ml plastic container with cap
• Distilled water

The oxidation-reduction potential is measured using a digital voltmeter. This instrument measures the "driving force" or the "pull" of the soil on electrons. These electrons would be supplied by the oxidation of an anode placed in contact with the soil, i.e. metal objects embedded in the soil. This potential is the electromotive force (emf) of the cell, and it is a measure of the tendency of the soil to corrode a metal. The unit of electrical potential is the volt, V.

A traditional voltmeter measures the potential by drawing current through a wire of known resistance [14,15]. However, when the current flows through a wire, the frictional heating that occurs wastes some of the potentially useful energy of the cell. A traditional voltmeter will therefore measure a potential that is less than the maximum cell potential. The key to determining the maximum potential is to perform the measurement under conditions of zero current, so that no energy is wasted. Traditionally, this has been accomplished by inserting a variable voltage device, powered from an external source, in opposition to the cell potential. The voltage on this instrument, called a potentiometer, is adjusted until no current flows in the cell circuit. Under such conditions, the cell potential is equal in magnitude and opposite in sign to the voltage setting of the potentiometer, and is the maximum cell potential since no energy is wasted in heating the wire.

More recently, advances in electronic technology have allowed the design of digital voltmeters, such as the one used in this project, that draw only a negligible amount of current [14,15]. These instruments have since replaced potentiometers in the modern laboratory due to their ease of use.

Redox-Direct

The AWWA recommends that the redox potential be determined for the soil as it is received in the laboratory. The platinum electrode is immersed into the soil, and the redox value is noted once the value has stabilized, i.e. once the redox value does not vary by more than 1 mV per minute. The values of redox-direct for the soil samples tested are listed in Table 3.6.

Redox-Saturated

The slurry prepared for pH testing according to the pH-saturated method is used to test for the redox-saturated value. The platinum electrode is immersed into the saturated soil, and the value of the potential is noted once it has stabilized. The values of redox-saturated for the soil samples tested are listed in Table 3.7.
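The stabilization criterion for the redox reading (a drift of no more than 1 mV per minute) can be expressed as a simple check on timed readings. The sketch below is a hypothetical helper, not part of the AWWA procedure; in particular, comparing only the last two readings is an assumption of this illustration, and the example readings are made up.

```python
def is_stable(readings, max_drift_mv_per_min=1.0):
    """Return True when the latest drift is within the allowed rate.

    readings: list of (time_in_minutes, potential_in_mV), in time order.
    """
    if len(readings) < 2:
        return False   # a single reading cannot show stability
    (t0, v0), (t1, v1) = readings[-2], readings[-1]
    if t1 <= t0:
        raise ValueError("readings must be in increasing time order")
    return abs(v1 - v0) / (t1 - t0) <= max_drift_mv_per_min

# Made-up readings taken one minute apart.
log = [(0, 180.0), (1, 184.0)]
print(is_stable(log))        # still drifting by 4 mV per minute
log.append((2, 184.5))
print(is_stable(log))        # drift is now 0.5 mV per minute
```

Recording the time with each reading, rather than assuming a fixed interval, lets the same check be used when readings are taken at irregular intervals.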

In testing for the redox potential, an attempt was made to limit the exposure of the soil to the ambient air. Redox tests on soil samples were always performed first, as soon as the container of soil was opened, and this container was closed as soon as possible after retrieval of the soil sample. It has been observed in the laboratory that a soil whose redox potential is below 0 mV may, after being left open to ambient air for half an hour to an hour, register a potential above 100 mV. It is essential to keep the soil container well sealed, to ensure that the reading taken is not affected by exposure to the oxygen in the air.
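The stabilization criterion used for the redox readings (a change of no more than 1 mV per minute) can be expressed as a simple check. The sketch below is illustrative only: the function name, the one-minute sampling interval, and the sample readings are assumptions and are not part of the AWWA procedure.

```python
def is_stable(readings_mv, interval_min=1.0, max_drift_mv_per_min=1.0):
    """Return True once the two most recent readings drift by no more
    than max_drift_mv_per_min (1 mV per minute in the text)."""
    if len(readings_mv) < 2:
        return False  # a single reading cannot demonstrate stability
    drift = abs(readings_mv[-1] - readings_mv[-2]) / interval_min
    return drift <= max_drift_mv_per_min


# Hypothetical readings (mV), assumed to be taken one minute apart
print(is_stable([180.0, 184.0]))         # last step drifts by 4 mV/min
print(is_stable([180.0, 184.0, 184.5]))  # last step drifts by 0.5 mV/min
```

The reading would be recorded only once the check passes; a stricter variant could require several consecutive stable intervals.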

Redox-direct (mV)
214 -3 8 150 184 178 118 175 191 219 232 200 183 180 200 220 23 1 203 210 20 1 224 -3 3 -64 178 240 134 14 -38 219 194 144 112 4 1O 164 170 154 187 192 214 169

(mV)
209 208 220 194 181 216 204 263 216 200 210 183 21G -49 132 185 208 81 228 228 274 260 228 219 228 185 193 171 155 175 190 147 121 183 229 180 260 225 218

(mV)
160 230 230 184 190 50 237 281 320 156 192 208 190 185 197 181 167 188 180 147 178 148 191 219 165 195 189 155 139 150 130 246 247 154 28 217

Table 3.6 Redox-direct Results

Soil #
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39

Redox-saturated (mV)
196 188 225 168 258 194 252 256 218 198 153 165 149 121 140 156 176 186 180 214 264 229 163 191 206 187 179 162 155 154 171 111 155 165 226 115 270 204 197

Redox-saturated (mV)
206 197 223 80 187 40 262 267 167 134 175 188 191 180 190 173 159 177 145 174 183 188 193 205 136 161 185 128 99 112 152 225 230 93 -15 222

Soil #
118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 247

Table 3.7 Redox-saturated Results

3.6

Resistivity, ρ

The nature of the electrolyte, in this case the soil, has a significant influence on the rate of corrosion of a metal exposed to it. The types and amounts of the various dissolved salts in a soil, particularly those which ionize most readily, are estimated by measuring the electrical resistivity of the soil. The lower the resistivity, the more the electrolyte contributes to corrosion.

The resistivity of a soil is measured in two different ways. The first method involves measuring the resistivity of the soil as it is received in the laboratory, ρ-direct. The second method is to measure the resistivity of the soil once it has been saturated with distilled water, ρ-saturated. The latter represents the worst case, when the soil conductivity is at its highest. Both measurements are necessary to calculate the corrosivity indices according to AWWA and PACE.

Necessary Equipment

Soil box
Ohmmeter
Four wires with clamps at both ends

The resistivity of a given object is calculated by making use of the relationship between resistivity, resistance and geometry. Resistance, R, is the property of a body or mass with discernible geometry, e.g. a piece of wire, or a block of soil of a given size. Resistivity, ρ, on the other hand, is a characteristic property of the material, e.g. copper, or a specific soil. While resistance is a function of geometry, resistivity does not depend on the geometry of the body [1].

The resistance of a rectangular body of any substance, when measured between parallel faces, is directly proportional to its length and inversely proportional to its cross-sectional area. In other words, as the depth and width increase, resistance decreases. The following equation shows the relationship between these variables:

$$R = \frac{\rho L}{W D} \qquad (3.1)$$

where
R = the resistance of the rectangular body (ohms)
ρ = the resistivity of the substance making up that body (ohm-cm)
W = the width of the body (cm)
D = the depth of the body (cm)
L = the length of the body (cm)
When measuring ρ, what is actually being measured is R, and ρ is then calculated using Equation 3.1. R is measured using a soil box and an ohmmeter. The soil box is a rectangular box with an open top, made of a non-conducting material (usually plastic), with metal ends and two metal pins inserted into the side of the box [1]. The box is filled to the top with soil, such that the values of W, D, and L are known. It is then connected to the ohmmeter; the current is introduced by means of the two end plates, and the potential is measured across the two pins. The value of R is then calculated according to the following relationship [1]:

$$R = \frac{V}{I}$$

where V is the potential measured across the two pins and I is the current introduced through the end plates. The value of the resistivity, ρ, is then calculated automatically and displayed by the ohmmeter.

ρ-saturated

The AWWA method suggests that the resistivity of the soil be determined when the soil is saturated. This represents the worst possible case, i.e. when the conductivity of the soil is at a maximum. A sufficient quantity of soil is placed in a bowl, and distilled water is added gradually in small quantities. The soil and water are mixed continuously to encourage penetration of the water into the soil. Some experience is needed to judge when the soil has reached saturation, and extreme care must be exercised when adding water to avoid supersaturating the soil. When a soil is supersaturated, the excess water may separate from the body of soil and will not be transferred to the box with the rest of the soil. Ions such as chlorides, which are found in this excess water and which give the water its conductive properties, will then be absent from the soil box. This will result in a higher resistivity, which will not be truly representative of the soil's ability to conduct ions.

When the soil is saturated, it is transferred to the soil box a little at a time, where it is compacted well to eliminate any air bubbles or voids, and to ensure uniformity and reproducibility of the measurement. The box is then attached to the ohmmeter, and a reading is taken. The values of ρ-saturated for the soil samples tested are listed in Table 3.8.

According to the PACE method, the soil is tested for resistivity in the state in which it is received in the laboratory. Therefore, the wet or dry soil is added to the soil box and compacted. Once more, air bubbles must be avoided, as they will result in higher values of ρ. The box is then wired to the ohmmeter, and the reading is taken. The values of ρ-direct for the soil samples tested are listed in Table 3.9.
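Because resistance is proportional to length and inversely proportional to cross-sectional area, the resistivity follows from a measured resistance as ρ = R·W·D/L. The short sketch below shows that rearrangement; the function name and the box dimensions are hypothetical and are chosen only for illustration (the ohmmeter used in this work performs the conversion automatically).

```python
def soil_box_resistivity(resistance_ohm, width_cm, depth_cm, length_cm):
    """Resistivity (ohm-cm) from a soil-box resistance: rho = R * W * D / L."""
    if min(width_cm, depth_cm, length_cm) <= 0:
        raise ValueError("box dimensions must be positive")
    return resistance_ohm * width_cm * depth_cm / length_cm


# Hypothetical box: 4 cm wide, 3 cm deep, 12 cm between the potential pins,
# so W * D / L = 1 cm and the resistivity equals the resistance numerically.
print(soil_box_resistivity(1500.0, 4.0, 3.0, 12.0))
```

A box proportioned so that W·D/L equals 1 cm lets the resistance reading be taken directly as the resistivity in ohm-cm.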

ρ-saturated (ohm-cm)
1947 1377 7410 1571 1888 1417 2623 2181 2211 2359 3060 5980 5140 2625 250 87.2 102.6 81.9 1838 5780 6490 2309 4800 4720 11000 8220 34400 18310 9370 17150 3070 2589 2523 1572 2135 2269

Soil #
118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 247

ρ-saturated (ohm-cm)
228 190 780 165 5300 4330 4770 2343 2097 1834 1284 1856 2396 1748 3580 367 6 1400 409 132600 5820 4020 90800 299 4080 3650 4960 3200 778 682 613 2076 1008 1112 1706 1040 2890 1591 2685 2275

Soil #
40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78

Table 3.8 ρ-saturated Results

ρ-direct (ohm-cm)
280 228 1592 173 5300 4340 4780 2277 1940 1713 1223 1904 3550 1813 3580 1443 101300 833 220800 5820 180000 309 18570 3840 13820 10580 825 707 723 2235 1311 1568 3830 1210 7750 2593 10870 3210

ρ-direct (ohm-cm)

3820 1410 9280 12550 31800 1987 2198 4120

ρ-direct (ohm-cm)
4100 3170 2218 82400

ρ-direct (ohm-cm)
1947 1683 13300 1577 5140 1429 3750 4400 7600 4850 90100 13360 27180 42000 iO58 568 779 28 1.4 5000 151800 7490 22340 16230 37600 27860 40900 30400 35100 3860 3460 2612 1548 2032 3870

Soil #
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39

40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78

9460 129300 1943 1618 1247 1205 3820 3220 5660 3140 3160 1271 205.3 1482 1736 841 496 4000 2012 8230 67900 26970 40200 120700 77200 6040 11930 4080 8900 1247 2980

62100 44700 10440 6040 9040 61800 14960 16610 11230 3830 1389 2121 2688 1131 2159 7140 1504 2078 1790 2064 623 146700 224.8 2449 7340 1901 8240 2599

Table 3.9 ρ-direct Results

3.7

Sulfide Content

The sulfide content of the soil is determined in two different ways. The first method uses a solution of iodine and 3% NaN3 (sodium azide), and the second a solution of HCl along with a strip of lead acetate paper. These two methods were chosen because they are recommended by AWWA and PACE, respectively, and are required to calculate the corrosivity indices.

Necessary Equipment

2 standard test tubes
Concentrated HCl acid (15%)
A strip of lead acetate paper
A solution of I2 (aq) + 3% NaN3

The AWWA procedure for testing for sulfides is to saturate a small quantity of the soil with a solution of iodine and 3% NaN3, and to observe the resulting reaction. A small amount of soil is placed in a test tube, and the iodine solution is poured into the test tube to cover the soil. The mixture is then shaken well, and the degree of reaction is observed and classified as either violent, normal, or absent. This test is qualitative, and may be quite subjective: the reaction is never very violent, and it is often difficult to differentiate between the degrees of reaction, especially between the normal and the violent. Although only visual observation is recommended by AWWA, sound was also used to help distinguish between a violent and a normal reaction. If bubbles can be heard bursting at a quick pace, the soil is considered to undergo a violent reaction. If only a slight sound can be heard (or none at all) and bubbles can be seen, then the soil is classified as reacting normally. If no bubbles are seen or heard, then the soil is assumed to contain no sulfides at all.

The degree of the reaction was then used to establish the sulfide content of the soil. If the reaction was classified as violent, then the sulfide content of the soil was assumed to be high. If the reaction was classified as normal, then the soil was assumed to contain traces of sulfide. Finally, if no reaction was observed, then the soil was assumed to contain no sulfides. The sulfide contents of the soil samples, as determined by the AWWA method, are presented in Table 3.10, in which N represents no sulfides, T represents traces of sulfides, and P represents the presence of sulfides.

HCl and Lead Acetate Paper

PACE recommends using concentrated HCl in combination with lead acetate paper to determine whether the soil contains sulfides. A small amount of soil is placed in a test tube, and then 15% HCl is added to cover the soil. A strip of lead acetate paper is introduced and held at the top, and the test tube is then covered at the top with the thumb of the tester. The mixture is then shaken gently, and care is taken not to wet the indicator paper. After a couple of minutes, the paper is observed for signs of a brown discoloration, usually present along the edges of the paper. Any discoloration indicates the presence of sulfides. Another indication of the presence of sulfides is the smell of rotten eggs, characteristic of H2S gas. However, as this gas is extremely toxic, it is highly recommended that one avoid breathing it, and that the room be kept well ventilated or, better still, that this experiment be carried out under a fume hood.

When the acid is added to the soil, it is very common to observe a violent reaction and a lot of bubbling. This may be the result of the reaction between HCl and any carbonates that may be present in the soil. The bubbling is then the result of the formation of carbon dioxide gas, and does not indicate the presence of sulfides. Only the discoloration of the lead acetate paper can correctly determine whether or not sulfides are present. The sulfide contents of the soil samples, as determined by the PACE method, are presented in Table 3.11, where N represents no sulfides, and P represents the presence of sulfides.
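The mapping from the observed iodine reaction to the sulfide code used in Table 3.10 (violent reaction recorded as P, normal reaction as T, no reaction as N) can be written down explicitly. The snippet below is only a bookkeeping sketch; the function and dictionary names are hypothetical.

```python
# Observed degree of reaction -> code used in the iodine-test results table
SULFIDE_CODE = {
    "violent": "P",  # sulfide content assumed high (presence of sulfides)
    "normal": "T",   # traces of sulfides
    "absent": "N",   # no sulfides
}


def classify_sulfide(reaction):
    """Return the table code for an observed iodine reaction."""
    key = reaction.strip().lower()
    if key not in SULFIDE_CODE:
        raise ValueError(f"unknown reaction: {reaction!r}")
    return SULFIDE_CODE[key]


print(classify_sulfide("Normal"))
```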

Sulfide content (iodine)


N T N T N P

Sulfide content (iodine)
T T N N N N N N T T N N N N N N N N T T P P N T T P P N T T P P N N N T

Sulfide content (iodine)


N T N N N N N N N N N T N P P N N N N T N N N N N P T P N N N P N T N T N N N

Sulfide content (iodine)


N
N

P
P P P P

Soil #
118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 247
T T N P N T N P P P P T T T P T T P P P T T N T T P P P P T P P P T

P
N N T N N N P P P P N

N
N N N N N N N P N N N T T T

T P

Table 3.10 Sulfide Content Results Using Iodine Solution

3.8

Chloride Concentration

The test for chloride content is not a part of either the AWWA or the PACE procedures. It is one of the goals of this project to determine whether knowledge of the Cl- ion content would permit the corrosivity of a soil to be evaluated better than if only the other parameters were known, i.e. ρ, redox potential, pH, etc. The overall procedure can be summarized as follows:

The sample is dried and pulverized.
The finest particles are kept and combined with distilled water.
The mixture is allowed to sit overnight, permitting the free Cl- ions to enter into the water.
The potential of the solution is recorded with a chloride-specific electrode, and the recorded value is compared with the pretabulated values to obtain the chloride ion concentration of the solution.
The chloride concentration of the soil itself is then calculated.

3.8.1 Necessary Equipment

Potentiometer
Chloride-specific electrode
Electrode wetting agents, i.e. solutions of 1 M (NH4)2SO4 or 1 M KNO3
Ceramic bowl and hammer
30 ml plastic containers
2 mm and 200 µm sieves
200 ml beakers
Microwave oven
Scale
Powdered KCl
Distilled water
J-cloth + an elastic band
Stop watch

The chloride ion concentration is calculated indirectly from the potential that is measured using an ion-specific electrode, i.e. an electrode that is sensitive to the concentration of a particular ion. It is based on the principle that the measured potential of a solution depends on the concentration of the reactants and the products involved in a cell reaction [14,15]. An example of an ion-specific electrode is the pH meter discussed in Section 3.4.1. Glass electrodes can be made sensitive to ions such as Na+, K+, NH4+, and Cl- by changing the composition of the membrane. In this case, a Cl- ion-specific electrode was used to determine the potential resulting from the presence of Cl- ions only [14,15]. Unlike with the pH meter, the potential is not converted automatically; the concentration must be obtained through a series of steps, which are discussed in the following sections.

3.8.2 Sample Preparation

A 200 ml beaker is half filled with soil, and covered with a piece of J-cloth which is secured in place with an elastic band. The sample is dried in a microwave at a high temperature for 3-5 minutes, or as long as necessary to thoroughly dry the soil. Extreme care must be exercised when handling the beaker, as it reaches very high temperatures. When the soil has cooled sufficiently to permit handling, some of it is transferred to a ceramic bowl. The soil is pounded with a hammer for a few minutes to separate any larger pebbles from the finer soil. The pulverized soil is passed through a 2 mm sieve to remove the pebbles that cannot be pulverized. The recovered fine soil is then returned to the bowl, where it is pulverized further to a fine consistency. The soil is then passed through a 200 µm sieve, and the soil recovered is transferred to a clean, tared 30 ml plastic container. Approximately 5 g of soil should be recovered. If the quantity is insufficient, the above procedure can be repeated until such an amount is retrieved.

The exact weight, in grams, of soil in the tared container is recorded, and distilled water is added to the soil in a ratio of 2:1, i.e. 10 g of water are added to 5 g of soil. The container is then capped, and shaken vigorously for 30 seconds. The sample is then allowed to sit overnight.

The potential of the previously prepared sample is recorded with the aid of an ion-specific electrode, which is sensitive to Cl- ions only. This electrode, when attached to a voltmeter, registers the potential of a solution due to the presence of the Cl- ions only. The electrode is rinsed thoroughly with distilled water, and filled with a wetting agent. The wetting agents used in these experiments were 1 M KNO3, 1 M (NH4)2SO4, or a commercially prepared wetting agent of unknown composition. The choice of wetting agent appears to make no difference in the final results. When filling the electrode, care must be exercised to ensure that no air bubbles are present in the wetting agent. When the electrode is ready, it must then be calibrated using solutions of known chloride ion concentration.

3.8.4 Preparation of Calibrating Solutions

In order to calibrate the electrode, the potentials of different solutions of known Cl- concentration are recorded. These solutions are prepared by simply adding KCl or NaCl crystals to distilled water in the correct quantities, such that solutions with the desired concentration of Cl- ions are obtained. Solutions of 0.01%, 0.03%, 0.33%, 0.65%, and 1.3% Cl- ions are required. Table 3.12 shows the weight of KCl to be added to 1 kg of distilled water in order to obtain the desired concentrations, as well as the equivalent concentration in ppm.

Table 3.12 Preparation of Calibrating Solutions
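The quantity of KCl needed for each calibrating solution follows from the mass fraction of chlorine in KCl. The sketch below is a worked illustration under a stated assumption, namely that the percentage refers to the mass of Cl- per mass of final solution; the results are computed under that assumption and are not taken from Table 3.12. The function names are hypothetical.

```python
M_CL = 35.45    # molar mass of Cl, g/mol
M_KCL = 74.55   # molar mass of KCl, g/mol


def kcl_mass_g(cl_percent, water_g=1000.0):
    """Mass of KCl (g) to dissolve in water_g of water so that Cl- makes up
    cl_percent of the total solution mass (assumed definition of the %)."""
    f = cl_percent / 100.0
    cl_share = M_CL / M_KCL  # mass fraction of Cl in KCl
    # Solve m * cl_share / (water_g + m) = f for m
    return water_g * f / (cl_share - f)


def percent_to_ppm(cl_percent):
    """1% by mass corresponds to 10,000 ppm."""
    return cl_percent * 1e4


for pct in (0.01, 0.03, 0.33, 0.65, 1.3):
    print(f"{pct}% Cl-: {kcl_mass_g(pct):.3f} g KCl per kg water, "
          f"{percent_to_ppm(pct):.0f} ppm")
```

If the percentage were instead defined relative to the mass of water alone, the required mass would simply be `water_g * f / cl_share`; the difference is small at these dilutions.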

3.8.5

Calibration of Electrode

Once the calibration solutions and the electrode are prepared, the electrode is calibrated. This is done by alternately reading the potential of each of the calibration solutions. The electrode is placed into a solution and held upright for a predetermined amount of time; 1 minute is usually sufficient, but 3 minutes may be needed to achieve stability of the reading. This time must be chosen prior to taking the first reading, and must remain the same for all subsequent readings. Each of the five calibration solutions is tested in turn, from the most concentrated to the least concentrated solution, and then in random order. In order to avoid contaminating the calibration solutions, the electrode should be rinsed with distilled water and tapped dry before taking the next reading. Each solution is tested twice, and the reproducibility of the potential is determined. If the potentials are approximately equal, calibration is complete. If the values vary significantly, then the electrode should be checked closely for any problems such as the presence of an air bubble in the wetting solution of the electrode, the lack of wetting agent due to a leak, etc. Further measurements are then taken until reproducibility of the potentials is obtained, and the technician is confident of the results obtained. An example of the potentials obtained during one calibration exercise is given in Appendix B.

3.8.6 Calibration Curve and Equation

The calibration curve is constructed from the potentials registered for each of the five calibration solutions. For each solution (0.01, 0.03, 0.33, 0.65, and 1.3% Cl-) the average potential is calculated. The values of the chloride ion concentration (%) are plotted against the average potential values, and an exponential curve is fitted to the five points. This curve, along with the corresponding equation, is used to obtain the Cl- concentrations of the solutions of the samples. A calibration curve, along with its equation, is presented in Appendix B.
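One way to fit an exponential calibration curve of the form C = a·exp(b·E) is to regress ln(C) linearly on the average potential E. The sketch below takes that approach, which is an assumption: the fitting method actually used for Appendix B is not stated here. The potentials in the example are made-up placeholder values, not measurements from this study, and the function names are hypothetical.

```python
import math


def fit_exponential(potentials_mv, concentrations_pct):
    """Least-squares fit of C = a * exp(b * E) via linear regression
    of ln(C) on E. Returns (a, b)."""
    n = len(potentials_mv)
    logs = [math.log(c) for c in concentrations_pct]
    mean_e = sum(potentials_mv) / n
    mean_l = sum(logs) / n
    sxx = sum((e - mean_e) ** 2 for e in potentials_mv)
    sxy = sum((e - mean_e) * (l - mean_l)
              for e, l in zip(potentials_mv, logs))
    b = sxy / sxx
    a = math.exp(mean_l - b * mean_e)
    return a, b


def concentration_from_potential(potential_mv, a, b):
    """Read a concentration (%) off the fitted calibration equation."""
    return a * math.exp(b * potential_mv)


# Placeholder average potentials (mV) for the five calibrating solutions
concentrations = [0.01, 0.03, 0.33, 0.65, 1.3]
potentials = [170.0, 143.0, 85.0, 68.0, 51.0]
a, b = fit_exponential(potentials, concentrations)
print(concentration_from_potential(100.0, a, b))
```

Fitting in log space weights the dilute and concentrated standards evenly, which suits a concentration range spanning two orders of magnitude.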

3.8.7

Testing Soil Samples

The samples prepared the previous day are now ready to be tested, given that the standards indicate that 2-6 hours are sufficient to allow all Cl- ions to enter into the distilled water. The mixture of soil and distilled water has now separated into two parts: the liquid part, which contains the Cl- ions, and the deposited soil particles. Care must be taken not to disturb the settled layer before testing the liquid part. The calibrated electrode is lowered into the 30 ml container until the membrane at the tip is fully immersed in the liquid above the settled soil. The electrode is then held upright for the time chosen during calibration (1-3 minutes), and the potential of the solution is registered.

3.8.8 Determination of Concentration of Chloride Ions of Soil

Once the potential of the liquid fraction of each sample is obtained, it must be transformed into a value that is more intuitively understandable: percentage concentration, or ppm of Cl- ions. This is done easily and quickly by reading off the concentration value in percentage terms from the calibration curve, or by calculating it using the calibration equation. An example of this is shown in Appendix B.

The variable of interest is the concentration of Cl- ions in the soil, and not in the liquid fraction of the prepared sample. This value is obtained by simply doubling the concentration of Cl- ions in the liquid fraction, because a ratio of 2:1 between the water and soil weights was used during the sample preparation. Finally, the concentration of Cl- ions of the soil, in ppm, is obtained by multiplying the concentration in percentage by 10,000. The values in ppm are retained for further data analysis, although the concentration in % could equally have been used. The chloride ion concentrations of the soil samples are presented in Table 3.13.
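The conversion from the concentration read off the calibration curve to the soil chloride content is a two-step arithmetic: multiply by the water-to-soil ratio (2 for the 2:1 preparation), then by 10,000 to pass from percent to ppm. A minimal sketch follows; the function name and the sample value are hypothetical.

```python
def soil_chloride_ppm(liquid_pct, water_to_soil_ratio=2.0):
    """Soil Cl- content in ppm from the liquid-fraction concentration in %."""
    soil_pct = liquid_pct * water_to_soil_ratio  # 2:1 water-to-soil by weight
    return soil_pct * 10_000                     # 1% = 10,000 ppm


# Hypothetical liquid-fraction reading of 0.012% Cl-
print(round(soil_chloride_ppm(0.012)))
```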

Cl- content (ppm)
243 347 59 160 282 310 268 419 328 258 125 52 34 82 9223 13652 22664 17754 759 75 53 29 149 30 30 28 14 26 72 47 95 154 155 73 35 1042

Table 3.13 Chloride Ion Concentrations

40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78

80 1835 155 168 81 2310 4592 1094 249 436 38 46 58 77 54 81 148 111 148 716 345 3257 123 758 448 2030 537 391 423 160 380 1345 7161 12556 210 56 407 36 157

79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117

172 190 222 17 5 0 0 6 3 51 362 283 272 253 142 207 81 22 109 265 5294 326 320 2067 5394 481 2712 340 191 42 1 1830 756 233 65 17 9 28 759 374

118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 247


3.9

Linear Polarization

The final test is the linear polarization of a standard steel sample exposed to each of the soil samples. The result of this test is the corrosion rate (a penetration depth per year) that the steel sample will undergo in the given soil. Although very informative and precise, linear polarization is a test that is time consuming and that requires a very experienced technician. Furthermore, the equipment is quite expensive. All this makes the test generally inaccessible, and encourages corrosion engineers to depend on corrosivity indices such as those proposed by AWWA and PACE, which can be calculated from the results of very simple and inexpensive tests.

The corrosion cell setup is made up of the following elements:

A working electrode: the steel specimen, which plays the role of the anode during anodic polarization, and the cathode during cathodic polarization,
An auxiliary electrode: the graphite rod, which plays the role of the anode in cathodic polarization, and the cathode in anodic polarization,
A reference electrode: the Cu/CuSO4 electrode, which measures the potential of the working electrode at any point during linear polarization,
An ionic conductor: the saturated soil sample, which allows ions to travel from the cathode to the anode, and
The potentiostat, which controls the potential of the system, and which acts as the electrical conductor between the anode and the cathode.

The procedure for testing the soil samples can be summarized as follows: the soil sample is saturated and a glass jar is filled up to a specific height with the saturated soil. The surface of the metal specimen (working electrode) is prepared according to a specific procedure and immersed into the saturated soil, along with the reference and the auxiliary electrodes. The electrodes are wired to the potentiostat, and the corrosion rate is obtained from Tafel and polarization resistance diagrams.

3.9.1

Necessary Equipment

Potentiostat: the CMS100 Electrochemical Measurement System by Gamry, Inc.
Graphite electrode (auxiliary electrode)
Cu/CuSO4 electrode (reference electrode)
Standard metal specimen, and specimen mount
A hand drill and support
No. 200 and 400 sand paper
1.0 micron, agglomerate-free, alpha alumina powder by Leco
Acetone
Caliper
Glass jar

The reference electrode used was the Cu/CuSO4 electrode, which consists of metallic copper immersed in a saturated solution of copper sulfate. This means that, instead of measuring the potential of the system against the standard of hydrogen ion reduction (whose potential is 0.000 V by convention), the potential is measured against the reduction of copper ions, Cu2+. This half-cell reaction is the reduction of Cu2+ ions:

$$\mathrm{Cu^{2+} + 2e^{-} \rightarrow Cu}$$

In order to obtain the potential against the standard of hydrogen ion reduction, the value of 0.337 V is added to the value of the potential obtained against the Cu/CuSO4 electrode. For example, a potential of 0.400 V vs. Cu2+ reduction is equivalent to a potential of 0.737 V against H+ reduction. Regardless of which electrode is chosen, it is used to measure the potential of the working electrode at any given time during the experiment.

The standard metal specimen used in the experiments consisted of a small cylinder with an approximate height of 14.2 mm and a diameter of 9 mm. It was created from material cut from a ductile iron pipe removed from the ground for testing by the authority of COREXCO, Montreal. The specimen was machined in the Civil Engineering Materials Testing Laboratory. It was formed with a "thread" running through the center, such that it could be screwed onto the end of a rod. This rod consists of a long thin tube through which runs a wire connecting the metal specimen at one end to the potentiostat at the other. This wire ensures that the steel specimen, which is later immersed into the soil, is in constant contact with the potentiostat. Figure 3.1 shows a schematic of the working electrode made up of a steel specimen screwed onto the rod.

Figure 3.1 Components of the Working Electrode (labels: to potentiostat, Rod, Steel Specimen)

3.9.2 Trial Runs and Reproducibility of Results

Before testing any of the soils that were retained for further analysis, trial runs were performed on expendable soil samples to determine the exact procedure to be followed such that reproducibility of results is ensured. This process is very important, because if the technician is unable to perform the test in a repeatable manner within a prescribed tolerance, then the result would be unreliable. The trial runs served the following purposes:

To identify the method of preparing the soil samples,
To identify the method of preparing the surface of the steel specimen,
To provide the technician with experience and the ability to perform the tests quickly and consistently,
To determine the scan rate and the scan range to be used in obtaining the Tafel and polarization resistance diagrams, and
To determine the time to be provided for the steel specimen to stabilize in the soil prior to polarization.

Trial runs were performed on three different soils. The complete procedure was established and is presented in the following section. The results of the trial run performed on sample No. 123 are presented in Appendix C. From these results, it can be seen that the procedure established yielded reproducible results to a satisfactory degree.

The soil sample is tested under conditions of saturation. Soil is placed in a bowl, and distilled water is added gradually in small quantities, and worked into the soil, until the soil is saturated. Care must be taken not to oversaturate the soil, because the conductive properties of the soil may be incorrectly estimated if the excess water bleeds to the surface of the soil during the polarization testing. The soil is saturated in order to represent the worst case scenario, in which the soil's conductive properties are highest. Furthermore, saturation eliminates one of the variables that differs between the samples, i.e. moisture content. It should be noted that, if a soil is completely dry, linear polarization of the steel specimen is not possible because the system is missing a key element: the ionic conductor.

Once the soil is saturated, it is transferred to the mason jar, which is filled to a specified height. The height requirement is intended to ensure reproducibility of the cathodic area, which consists of the area of the graphite rod that is in contact with the soil. If the graphite rod is immersed into the soil such that its end touches the bottom, and the height of the soil is always the same, then the same area of graphite will be in contact with the soil, i.e. the cathodic area is constant.

The soil must be observed for signs of air bubbles. Lightly shaking the jar may consolidate the saturated soil and eliminate any air bubbles, which tend to increase the overall resistivity of the soil. Furthermore, if the steel specimen is in contact with an air bubble, the actual anode area will be smaller than what has been assumed, and therefore the corrosion rate will be underestimated. The soil sample is prepared first, and the reference electrode and graphite rod are secured into place within the mason jar. The steel specimen is prepared next.

3.9.4 Preparation of the Working Electrode

This section outlines the method of preparing the surface of the specimen prior to each polarization sequence. As observed previously, surface preparation plays an extremely important part in the corrosion process, i.e. it can greatly affect the rate at which the corrosion will proceed. For example, the presence of a protective surface film would result in a lower corrosion rate. If such a film is not properly removed, or if the steel sample is exposed to ambient air after cleaning such that a protective film is allowed to form prior to testing, then the results obtained would be misleading. For this reason, the specimen must be prepared in a consistent manner each time to ensure reproducibility of the results.

For each soil sample, the steel specimen is polarized four times. The surface must be prepared thoroughly prior to each of the tests. The first step in the surface preparation is the sanding of the surface. In order to obtain a uniform sanding, an ordinary hand drill is mounted securely on a stand and a screw, whose diameter is compatible with that of the steel specimen thread, is inserted into the "nose". The steel specimen is then secured onto the end of the screw. Two sanding papers are used: sizes 400 and 600. As the drill rotates the specimen, it is sanded on all sides with the size 400 paper first, and then with the size 600 paper. The specimen is then sanded with alumina paste, which ensures a smooth preparation of 1.0 µm. The specimen is then removed from the drill, and screwed onto the end of the working electrode rod. When a tight seal is ensured, the specimen is rinsed thoroughly with acetone to eliminate any greases, and then rinsed with distilled water. The specimen is then quickly immersed into the saturated soil sample, and the appropriate test is run.
3.9.5

Polarization of the Steel Specimen

Once the soil sample has been prepared and the working, auxiliary and reference electrodes have been immersed into the soil, the first of the four polarization tests is initiated. The goal of this test is to obtain the Tafel diagram, and to extract from it the values of the Tafel constants, βc and βa. The potentiostat used enables the technician to introduce the desired values of the scan rate, the scan range, etc. The following variables were specified:

Scan range: ±250 mV from Open Circuit Potential, Eoc
Delay provided to attain Eoc: 1000 s or 0.017 mV/s (1 mV/min)
IR drop compensation: [value missing in source]
Anodic area, i.e. metal surface area: approximately 4.5 cm² (subject to change)

Table 3.14 Values Specified for Tafel Test

Once this test is completed, a graph such as that illustrated in Figure 3.2 is obtained. The values of βa and βc are obtained by plotting the anode and cathode lines such that their slopes coincide with the slope of the linear Tafel regions.


Figure 3.2 Typical Tafel Plot
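The graphical extraction of the Tafel constants can also be expressed numerically. The following Python sketch is an illustration only and is not the procedure implemented in the potentiostat software: it fits a straight line to E versus log10|i| on each branch of a synthetic polarization curve. The function names, the synthetic constants, and the 100 to 250 mV window taken as the "linear" Tafel region are all assumptions.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def tafel_slopes(eta_mV, i_net, window=(100.0, 250.0)):
    """Return (beta_a, beta_c) in mV/decade.

    eta_mV : overpotential E - Eoc in mV
    i_net  : net current density in A/cm^2 (anodic positive)
    window : |eta| range treated as the linear Tafel region (assumed)
    """
    lo, hi = window
    anodic = [(math.log10(i), e) for e, i in zip(eta_mV, i_net)
              if lo <= e <= hi and i > 0]
    cathodic = [(math.log10(-i), -e) for e, i in zip(eta_mV, i_net)
                if lo <= -e <= hi and i < 0]
    # The slope of E against log10|i| is the Tafel constant of each branch.
    _, beta_a = fit_line(*zip(*anodic))
    _, beta_c = fit_line(*zip(*cathodic))
    return beta_a, beta_c


# Synthetic curve; all three constants are made-up illustration values.
ICORR, BETA_A, BETA_C = 8.9e-6, 60.0, 120.0
eta = [float(e) for e in range(-250, 251, 5)]
i_net = [ICORR * (10 ** (e / BETA_A) - 10 ** (-e / BETA_C)) for e in eta]

beta_a, beta_c = tafel_slopes(eta, i_net)
print(f"beta_a = {beta_a:.1f} mV/decade, beta_c = {beta_c:.1f} mV/decade")
```

On real data the choice of window matters: points too close to Eoc are curved because both reactions contribute, so the fitted slopes should be checked against the plotted lines.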

Once the values of βa and βc are obtained, the steel specimen is removed from the soil and cleaned according to the standard method. The specimen is then inserted into the soil again, and the second test is initiated with the goal of determining the corrosion rate of the steel sample by the method of linear polarization. The following variables are introduced into the program prior to polarization:

Scan range                                ±20 mV from Open Circuit Potential, Eoc
Delay provided to attain Eoc              1000 s or 0.017 mV/s (1 mV/min)
IR drop compensation                      On
Anodic area, i.e. metal surface area      approximately 4.5 cm² (subject to change)
Density of metal                          7.87 g/cm³
Equivalent weight of metal                27.92 g

Table 3.15 Values Specified for Linear Polarization Test


Once the test is completed, a curve such as that illustrated in Figure 3.3 is obtained. The values of Rp, icorr, and the corrosion rate are obtained by plotting a line whose slope coincides with that of the line in the region immediately surrounding the point on the curve at which the current equals zero. Once the corrosion rate is obtained, the saturated soil is discarded and the entire process is repeated a second time with a fresh sample of the same soil. The soil and the steel specimen are prepared according to the specified methods, and the two tests are run again to obtain new values of βa and βc, and then the corrosion rate. Table 3.16 gives the values obtained for soil sample No. 96. The results obtained indicate that the procedure followed yielded reproducible results. The values of the corrosion rate for each of the soil samples tested are presented in Table 3.17.
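The conversion from the fitted slope to a corrosion rate is not spelled out above. A minimal Python sketch of the usual route is given here, assuming the Stern-Geary relation for icorr and the ASTM G102 constant for converting current density to penetration rate; whether the potentiostat software uses exactly these expressions is an assumption. The inputs are the values reported for soil sample No. 96 and the density and equivalent weight of Table 3.15, with Rp read as 2.970 x 10^3 ohm·cm².

```python
def stern_geary_icorr(beta_a_mV, beta_c_mV, rp_ohm_cm2):
    """Corrosion current density (A/cm^2) from the Stern-Geary relation."""
    b_volts = (beta_a_mV * beta_c_mV) / (2.303 * (beta_a_mV + beta_c_mV)) / 1000.0
    return b_volts / rp_ohm_cm2


def corrosion_rate_mm_per_yr(icorr_A_cm2, equivalent_weight_g, density_g_cm3):
    """Penetration rate in mm/yr (ASTM G102 form, icorr converted to uA/cm^2)."""
    K1 = 3.27e-3  # mm*g/(uA*cm*yr)
    return K1 * (icorr_A_cm2 * 1e6) * equivalent_weight_g / density_g_cm3


# Values reported for soil sample No. 96 (linear polarization test).
icorr = 8.914e-6   # A/cm^2
rp = 2.970e3       # ohm*cm^2 (power of ten is an assumed reading)

rate = corrosion_rate_mm_per_yr(icorr, equivalent_weight_g=27.92, density_g_cm3=7.87)
implied_b_mV = icorr * rp * 1000.0  # Stern-Geary constant B = icorr * Rp

print(f"corrosion rate = {rate:.3f} mm/yr")
print(f"implied B      = {implied_b_mV:.1f} mV")
```

Under these assumptions the rate works out to roughly 0.10 mm/yr, which lies inside the range of CorrRate values reported later, and the implied Stern-Geary constant of about 26 mV is of a typical magnitude for steel.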

Figure 3.3 Typical Linear Polarization Curve

Linear Polarization

Eoc (mV)                    -851
Ecorr (mV)                  -852.1
icorr (×10^-6 A/cm²)        8.914
Rp (×10^3 ohm·cm²)          2.970

Table 3.16 Results Obtained for Soil Sample #96

Table 3.17 Corrosion Rates

3.10 Calculating the Corrosivity Indices According to AWWA and PACE

The definitions of the variables included in each of the corrosivity grids have been discussed in Chapter 2, and reviewed in the previous sections of this chapter. This section introduces the spreadsheets used to obtain the corrosivity indices quickly and without error. Figure 3.4 shows the spreadsheet used to calculate the corrosivity indices according to AWWA and PACE. The values of the appropriate variables are entered in lines A, C, and E, and the corrosivity indices are given automatically in lines D and G. The corrosivity indices of the soils are presented in Tables 3.18 and 3.19.

[Spreadsheet "ANALYSIS OF SOIL CORROSIVITY" for soil sample JK-33 (description: silty clay, light brown), showing Method 1: AWWA C-105 and Method 2: PACE. Note on the AWWA sheet: if the pH is between 6.5 and 7.5, and sulfides are present and/or the redox is negative, add 3 points.]

Figure 3.4 Spreadsheet Used for Quick Calculation of Corrosivity Indices


Table 3.18 Corrosivity Indices According to AWWA

Table 3.19 Corrosivity Indices According to PACE

CHAPTER 4: ANALYSIS OF EXPERIMENTAL RESULTS AND DISCUSSION

4.1 Analysis of Preliminary Data

The statistical package SAS was used to analyze the data collected during the experimental phase of this project. These data were presented in Chapter 3. The information presented in this chapter is selective and consists only of the material deemed to be essential. Furthermore, Appendix D: Principles of Regression Analysis is included for the information of the reader, and it is recommended that Appendix D be consulted prior to reading this chapter.

The analysis consists primarily of regressing the variables, individually and in combination, with the dependent variable. The dependent variable (Y) in the analysis is the corrosion rate obtained by the method of linear polarization. This variable is denoted 'CorrRate', and it is considered to be the "true" corrosivity of a soil. It was the objective of this study to derive the relationships of the other variables with CorrRate, both individually and in appropriate combinations. Once the relationships between the variables are understood, the importance of the chloride content of a soil is evaluated, and a decision is made on whether or not this variable provides sufficient information to be considered significant.

There is a total of 12 independent variables (Xi), seven discrete and five categorical:

1. pHdir: pH of the soil, obtained by testing the soil in the state in which it is received in the laboratory (discrete),
2. pHsat: pH of the soil, obtained by testing a portion of soil supersaturated with distilled water (discrete),
3. Reddir: redox potential of the soil, in mV, obtained by testing the soil in the state in which it is received in the laboratory (discrete),
4. Redsat: redox potential of the soil, in mV, obtained by testing a portion of soil supersaturated with distilled water (discrete),
5. Resdir: resistivity of the soil, in ohm-cm, obtained by testing the soil in the state in which it is received in the laboratory (discrete),
6. Ressat: resistivity of the soil, in ohm-cm, once it had been saturated with distilled water (discrete),
7. Chl: chloride ion content of the soil in ppm (discrete),
8. SoilType: categorical variable representing soil type (S for sand, SC for sand/clay, and C for clay),
9. Moisture: categorical variable representing moisture content of the soil as it is received in the laboratory (D for dry, M for moist, and S for saturated),
10. SulfI: categorical variable representing sulfide content obtained by testing the soil using a solution of iodine and NaN3 (N for negative, T for trace, and P for positive),
11. SulfHCl: categorical variable representing sulfide content obtained by testing the soil using concentrated HCl and lead acetate paper (N for negative and P for positive), and
12. Drainage: categorical variable representing ability of the soil to 'drain' water (E for excellent, G for good, and B for bad).

Of the 12 variables, only 10 will be used in this analysis. Drainage will not be included in any of the following analyses because the information it provides is almost identical to that of the variable SoilType. In the majority of the cases, a sand will have an excellent drainage ability, a sand/clay will have a good drainage ability, and a clay will have a poor drainage ability. One of the two variables is therefore redundant, and it was decided to retain the variable SoilType. Furthermore, the variable SulfHCl will also be eliminated from the list because it is felt that errors were made during testing for this parameter. As a consequence, the SulfHCl value is unavailable for many observations, and this results in a decrease in the reliability of the results of the statistical analyses obtained using this variable.

The analysis of the data is divided into the following sections:

• Data Exploration
• Transformation of Variables
• Regressing Discrete Variables One At A Time
• Correlation Matrix
• RSQUARE Procedure
• Including Categorical Variable
• Variables Retained for Further Analysis
• Determining Significance
• Discussion of Results

4.1.1 Data Exploration

The first step in any analysis is the familiarization with the experimental data. Each of the eight discrete variables is studied individually and the distribution of the values is observed for signs of normality, outliers, skewness, etc. The distribution of the data plays a very important role in ensuring that the results of regression analyses are consistent. Furthermore, outliers are also very influential, and they must be identified and observed during the course of the statistical tests that follow. For each variable, the following information is extracted from SAS output files, and examined:

• The number of observations (N), the mean, the standard deviation, the variance, and the skewness,
• The five highest and five lowest observations,
• The five quantiles, the range and the hinge spread,
• The stem and leaf diagram, the box plot, and the normal distribution plot, and
• The outliers.
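As an illustration only, the summary quantities listed above can be reproduced outside SAS. The following Python sketch is not the procedure used in this study: the pH readings are hypothetical, the quartiles use Python's "inclusive" interpolation rather than the SAS definition, and outliers are flagged with the common rule of 1.5 times the hinge spread, which is an assumption.

```python
import statistics


def summarize(values):
    """Return the descriptive statistics examined for each discrete variable."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    # Adjusted Fisher-Pearson sample skewness.
    skew = n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)
    # Outlier fences at 1.5 hinge spreads beyond the hinges (assumed rule).
    low_fence, high_fence = q1 - 1.5 * spread, q3 + 1.5 * spread
    ordered = sorted(values)
    return {
        "n": n, "mean": mean, "std": sd, "variance": sd ** 2, "skewness": skew,
        "min": ordered[0], "q1": q1, "median": median, "q3": q3, "max": ordered[-1],
        "range": ordered[-1] - ordered[0], "hinge_spread": spread,
        "lowest5": ordered[:5], "highest5": ordered[-5:],
        "outliers": [v for v in ordered if v < low_fence or v > high_fence],
    }


# Hypothetical pH readings, for illustration only.
ph_values = [4.2, 6.9, 7.1, 7.3, 7.4, 7.5, 7.6, 7.8, 7.9, 8.0, 8.1, 8.8]
summary = summarize(ph_values)
for key, value in summary.items():
    print(f"{key:>12}: {value}")
```

In this made-up sample the mean falls below the median and the skewness is negative, the same pattern described for pHdir, and the single low reading is the only value flagged as an outlier.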

Figure 4.1 displays the information produced by SAS for the variable pHdir, including the stem and leaf diagram, the box plot and the normal distribution plot.

The pHdir values range between 4.2 and 8.8. The data are slightly negatively skewed, i.e. the mean is slightly smaller than the median. This generally indicates the presence of outliers in the lower end of the distribution, and this is quite evident when the stem and leaf diagram and the box plot are studied. There are seven outliers: one in the upper end, and six in the lower. Besides the outliers, the data points seem to be well distributed and the box plot appears to have a standard shape. Finally, the normal distribution plot is not exactly linear; in fact, it appears to be slightly curved. This indicates a small deviation from normality. The usefulness of a transformation is examined in the next section.

Figure 4.2 displays the information produced by SAS for the variable pHsat, including the stem and leaf diagram, the box plot and the normal distribution plot. The pHsat values range between 4.7 and 9.2. As with pHdir, the data are slightly negatively skewed, with four outliers in the lower end only. Besides the outliers, the data seem to be well distributed, with a relatively good box plot. The normal distribution plot is a little less curved than that of pHdir, but a slight deviation from normality is still observable.


Reddir

Figure 4.3a displays the information produced by SAS for the variable Reddir, including the stem and leaf diagram, the box plot and the normal distribution plot. The Reddir values range between -528 mV and 320 mV. It may appear from the various diagrams that the situation is unacceptable and that the distribution is not at all normal. This may not be the case. The very large extreme values (-528, -138 and 320 mV) may be the cause of the box plot and the normal distribution plot having such a distorted form. The presence of these three observations forces the diagrams to be drawn with large intervals, and as such, the remaining values tend to be lumped together. A clue to this can be drawn from the observation of the quantiles. If the extreme values are ignored and only the values within the hinge range are examined (between Q3 and Q1), an equal number of observations are noted to be above and below the median. This is characteristic of the symmetric normal distribution. In this case, the hinge range is equal to Q3 - Q1 = 61.5 mV. Dividing this value in two gives a value of 30.8 mV. For a normal distribution, the values obtained by adding and subtracting this value from the median will correspond approximately to the values of Q3 and Q1. The values obtained by 187.5 ± 30.8 mV are 218.3 mV and 156.7 mV. These values are very close to the actual ones of 216.5 mV and 155 mV, and therefore, the values are well distributed within the hinge area. However, it cannot be concluded from the above test that the data set is normally distributed.

In order to determine the normality of the variable, the three extreme values are removed and the process is repeated. Figure 4.3b displays the relevant information, including the stem and leaf diagram, the box plot, and the normal probability plot for Reddir without the three extreme values. The data are slightly negatively skewed, with 13 outliers in the lower end. This is a somewhat high number. However, aside from the outliers, the data points are well distributed and the box plot appears to have a standard shape. Finally, the normal distribution plot is not quite linear; in fact, it is significantly curved. This indicates a deviation from normality which may be corrected by a transformation in the next section.

Redsat
Figure 4.4a displays the information produced by SAS for the variable Redsat, including the stem and leaf diagram, the box plot and the normal distribution plot. The Redsat values range between -475 mV and 282 mV. Once more, it may appear from the various diagrams that the situation is unacceptable and that the distribution is not at all normal. However, an examination of the quantiles results in the following: the value of half the hinge range is equal to 59/2 = 29.5 mV, and 174 ± 29.5 gives 203.5 mV and 144.5 mV. These values are quite close to the true values of 196 mV and 137 mV, respectively. This was the result anticipated, and further analysis is therefore warranted. In this case, it appears that the cause of the distortion in the diagrams is the one extreme value of -475 mV. When this value is removed and the process repeated, the resulting boxplot and normal distribution plot are greatly improved. The results obtained are presented in Figure 4.4b. The data are slightly negatively skewed, with the outliers in the lower end. However, aside from the outliers, the data points are well distributed and the box plot appears to have a standard shape. Finally, the normal distribution plot is fairly linear, except for the lower end outliers. This indicates a slight deviation from normality.


Resdir
Figure 4.5 displays the information produced by SAS for the variable Resdir, including the stem and leaf diagram, the box plot and the normal distribution plot. The Resdir values range between 173 and 22080 ohm-cm. Once more, it appears from the various diagrams that the situation is unacceptable and that the distribution is not at all normal. Unlike the case of the redox potentials, the situation is not the result of one or two extreme values. In fact, it appears that the entire set of values contributes to the problem. This can be concluded from the examination of the quantiles. The value of half the hinge range is equal to 9252/2 = 4626 ohm-cm and the predicted quantiles are equal to 3750 ± 4626 ohm-cm, that is 8376 and -876 ohm-cm. There is a very large difference between these values and the actual quantiles reported in Table 4.5 and therefore, it is not only the extreme points that contribute to the distortion of the diagrams, but also the entire body of the values. Furthermore, examination of the normal probability plot suggests that the majority of the values are concentrated below 5000 ohm-cm, but that there are a significant number of points which are several orders of magnitude larger. It appears that a logarithmic transformation may be indicated. This will be studied in the next section.

Ressat
Figure 4.6 displays the information produced by SAS for the variable Ressat, including the stem and leaf diagram, the box plot and the normal distribution plot. The Ressat values range between 73 and 183400 ohm-cm. As in the case of Resdir, it appears that the whole set of observations contributes to the distorted shape of the boxplot and normal probability plot. Examination of the quantiles reveals the following: half the hinge range is equal to 3504/2 = 1752 ohm-cm, and 2259 ± 1752 gives 4011 and 507 ohm-cm. These values are far from the calculated quantiles of 4800 and 1296 ohm-cm, respectively. Furthermore, examination of the normal probability plot seems to suggest, as in the case of Resdir, that a logarithmic transformation may be able to correct the normality problem.
Chloride

Figure 4.7 displays the information produced by SAS for the variable Chloride, including the stem and leaf diagram, the box plot and the normal distribution plot. The Chloride values range between 0 and 22664 ppm. As in the case of Resdir and Ressat, it appears that the whole set of observations contributes to the distorted shape of the boxplot and normal probability plot. Examination of the quantiles reveals the following: half the hinge range is equal to 423/2 = 211.5 ppm, and 190 ± 211.5 gives 401.5 ppm and -21.5 ppm. These values are far from the calculated quantile values of 481 and 58 ppm, respectively. Furthermore, examination of the normal probability plot seems to suggest, as in the case of Resdir and Ressat, that a logarithmic transformation may be able to correct the normality problem.
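The hinge comparison used for the redox, resistivity and chloride variables amounts to checking whether the median sits midway between Q1 and Q3. The short Python sketch below repeats that arithmetic with the quartiles and medians quoted in this section; the function name and the "mismatch" ratio (offset of the median from the mid-hinge, divided by the hinge range) are illustrative additions rather than quantities used in the study.

```python
def hinge_check(q1, median, q3):
    """Predict Q1 and Q3 as median -/+ half the hinge range and compare."""
    half = (q3 - q1) / 2.0
    predicted_q1, predicted_q3 = median - half, median + half
    # Offset of the median from the mid-hinge, relative to the hinge range.
    mismatch = abs(median - (q1 + q3) / 2.0) / (q3 - q1)
    return predicted_q1, predicted_q3, mismatch


# (Q1, median, Q3) as quoted in the text for each variable.
quartiles = {
    "Reddir (mV)": (155.0, 187.5, 216.5),
    "Redsat (mV)": (137.0, 174.0, 196.0),
    "Ressat (ohm-cm)": (1296.0, 2259.0, 4800.0),
    "Chloride (ppm)": (58.0, 190.0, 481.0),
}

results = {name: hinge_check(*q) for name, q in quartiles.items()}
for name, (p1, p3, mismatch) in results.items():
    print(f"{name:>16}: predicted Q1 = {p1:8.2f}, predicted Q3 = {p3:8.2f}, "
          f"mismatch = {mismatch:.3f}")
```

A small ratio means the central half of the data is roughly symmetric about the median; as noted for Reddir, this alone does not establish normality.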
CorrRate

Figure 4.8 displays the information produced by SAS for the variable CorrRate, including the stem and leaf diagram, the box plot and the normal distribution plot. The CorrRate values range between 0.06 and 0.26 mm/yr. The data are positively skewed, i.e. the mean is slightly larger than the median. This generally indicates the presence of outliers in the upper end of the distribution, and this is quite evident when the stem and leaf diagram and the box plot are studied. There are two outliers: 0.19 and 0.26 mm/yr. Besides the outliers, the data points seem to be well distributed and the box plot appears to have a standard shape. Finally, the normal distribution plot is not exactly linear; in fact, it appears to be slightly curved. This indicates a small deviation from normality. The usefulness of a transformation will be studied in the next section.


4.1.2 Transformation of Variables

There are many different ways to transform data. The values can be logged, inversed, square-rooted, and so forth. Although a transformation can be applied to any variable, each one seems to work best on a particular type of variable. Of all the transformations, the one that may be of value is the logarithmic (or log) transformation, which is typically applied to variables representing physical characteristics such as length, weight and concentrations. The variables being studied represent concentrations (as in the case of pH, redox potential and chlorides) and physical characteristics (such as soil resistivity), and may be rendered 'normal' by the log transformation.

The variables which appear to be in most need of transformation, Chloride, Resdir, and Ressat, are studied first. The normal probability plots of these three variables are very similar, and it is suspected that the same transformation can be useful for all three. The transformation proposed is the replacement of the original value with the logarithm of that value. These new variables, referred to as LChl, LResdir and LRessat, are presented in Tables 4.1, 4.2, and 4.3. The analysis performed on the original data (Section 4.2) is repeated using the logged data, and the results are presented in Figures 4.9, 4.10, and 4.11.

In the case of LResdir and LRessat, the distribution has greatly improved. The shape of the boxplots is acceptable, and the normal probability plots are fairly linear. The transformation is therefore considered a success and, from this point forward, the variables LResdir and LRessat will be used instead of Resdir and Ressat.
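The effect of the log transformation on a right-skewed variable can be illustrated with synthetic numbers. The Python sketch below is not based on the study data: it builds a made-up, resistivity-like sample whose logarithms are symmetric and compares the skewness before and after the transformation. Base-10 logarithms are assumed, since tabulated LChl values such as 0.699 and 1.000 are consistent with log10 of 5 and 10 ppm.

```python
import math
from statistics import NormalDist, fmean, stdev


def skewness(values):
    """Adjusted Fisher-Pearson sample skewness."""
    n = len(values)
    mean, sd = fmean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)


# Synthetic sample: evenly spaced normal quantiles, mapped so that
# log10(resistivity) is centred on 3.5 with a spread of 0.7 decades.
n = 100
z = [NormalDist().inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
resistivity = [10 ** (3.5 + 0.7 * zk) for zk in z]

log_resistivity = [math.log10(r) for r in resistivity]

skew_raw = skewness(resistivity)
skew_log = skewness(log_resistivity)
print(f"skewness of raw values:    {skew_raw:.2f}")
print(f"skewness of logged values: {skew_log:.2f}")
```

The raw values are strongly positively skewed, with a few points far above the bulk, while the logged values are symmetric by construction; this is the behaviour reported for Resdir and Ressat.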

In the case of LChl, the distribution has also improved dramatically. The boxplot has a shape which is almost perfect, and the normal probability plot is very close to being perfectly linear. This transformation is considered a success, and the variable LChl will replace Chloride from this point forward.

The transformation of the remaining variables, pHdir, pHsat, Reddir, and Redsat, is considered next. Like Chloride, these variables represent concentrations: H+ ions in the case of pH, and oxygen in the case of the redox potential. However, unlike Chloride, the concentration has already been logged in obtaining the pH and the redox potential. The formulae for obtaining the pH and the redox potential, φ, of a solution are the following:
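In standard textbook notation, which may differ from the notation intended here, these relations are commonly written as the definition of pH and the Nernst equation. The symbols φ⁰ (standard potential), n (electrons transferred), and [Ox] and [Red] (concentrations, strictly activities, of the oxidized and reduced species) are assumptions of this sketch.

```latex
\mathrm{pH} = -\log_{10}\,[\mathrm{H^{+}}]
\qquad\qquad
\varphi = \varphi^{0} + \frac{2.303\,RT}{nF}\,\log_{10}\frac{[\mathrm{Ox}]}{[\mathrm{Red}]}
```

In both expressions the measured quantity is linear in the logarithm of a concentration.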

It can be clearly seen that the pH and the potential φ are not direct measurements of the concentration, but represent the concentration indirectly given that these concentrations have been logged. For this reason, it is considered unreasonable to perform a second logarithmic transformation on these variables, which are already the result of a logarithmic transformation. It is possible that a transformation will render the data more attractive, but it must be understood why a transformation is performed. Any data, through a series of transformations, may be made to exhibit characteristics of 'normality'. However, if these transformations cannot be justified or understood intuitively, it is better not to include them at all.

Finally, another variable which is considered a candidate for the log transformation is CorrRate. The values of the new variable, denoted LCorr, are presented in Table 4.4. The LCorr values were analyzed and the results are presented in Figure 4.12. It can be seen that the distribution of the data has not improved much. For this reason, the original variable will be retained for the analysis to follow.

In conclusion, the following discrete variables will be used from this point forward: pHdir, pHsat, Reddir, Redsat, LResdir, LRessat, LChl, and CorrRate.

LChl

-1.903 3.264 2.190 2.225 1.908 3.364 3.662 3.039 2.396 2.639
1.580 1.663 1.763 1.886 1.732 1.908 2.170 2.045 2.170 2.855
2.538 3.513 2.090 2.880 2.651 3.307 2.730 2.592 2.626 2.204
2.580 3.129 3.855 4.099 2.322 1.748 2.610 1.556 2.196

3.813 3.994 3.474 3.951 2.464 2.173 2.255 1.255 0.845 1.204
1.477 1.380 1.000 0.699 1.041 1.623 1.903 2.117 3.294 2.201
1.544 1.954 1.892 2.806 2.857 3.311 2.467 2.816 2.744 2.072
2.771 2.525 2.913 2.342 2.431

Table 4.1 Values of LChl

Table 4.2 Values of LResdir

Table 4.3 Values of LRessat

Table 4.4 Values of LCorr

4.1.3 Regression of the Individual Variables

Now that the distribution of each variable has been studied, it is time to study the distribution of the residuals arising from the regression of the independent variables with the dependent variable CorrRate. The first step is to regress each X variable individually and to observe the residual distribution. Here, the goal is to study the distribution of the residuals for signs of anomalies, and to keep track of the outliers on Y. In the previous section, the outliers obtained were outliers on the X variable only. In this section, the outliers on the residuals are examined, i.e. outliers on the model chosen to fit the data. It is most probable that the distribution will not be perfectly normal. This usually suggests that another X variable should be added to the model to account for the variance which was not accounted for by the first variable. In a later section, a complete model (consisting of a combination of X variables) is regressed against CorrRate and a normal distribution is anticipated. If this does not occur, it will be assumed that the model is not correct, and the search for the missing X variable to be added to the model will continue. For each variable, the following information was extracted and examined:

• Number of observations N, SSE, PRESS statistic, R², R²-adjusted, F-ratio,
• Outliers identified by a z-score > 3,
• Outliers identified by a Cook's Distance (CD) > 1,
• Plot of residuals vs. predicted value of y (e vs. y),
• Stem and leaf diagram, box plot and normal distribution plot constructed from the set of residuals.

pHdir and pHsat

Tables 4.5a and 4.5b display the information related to the variables pHdir and pHsat, respectively. Furthermore, Figures 4.13a through 4.13d display the SAS output for the variables pHdir and pHsat, including the e vs. y plot, the stem and leaf diagram, the box plot, and the normal probability plot for the residuals.

Residual Information: Variable pHdir

N
SSE
PRESS statistic
R²
R²-adjusted
F-Ratio
Outliers with z > 3
Outliers with CD > 1

Table 4.5a Residual Characteristics: pHdir

Residual Information: Variable pHsat

N
SSE
PRESS statistic
R²
R²-adjusted
F-Ratio
Outliers with z > 3
Outliers with CD > 1

Table 4.5b Residual Characteristics: pHsat

Only 74 observations were available for regressing the variable pHdir against CorrRate. The increase in the error sum of squares was 8%, which is quite acceptable. This value is obtained by the following equation:

% increase in the error sum of squares = [(PRESS - SSE) / SSE] × 100%

The above percentage gives an idea of the ability of the equation to fit foreign data, i.e. to fit data which were not a part of the observations used to create the equation itself.
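For a single-predictor regression, the PRESS statistic can be computed without refitting the model for each omitted observation, using the leverage of each point. The Python sketch below illustrates this on hypothetical pH and corrosion-rate pairs; it is not the SAS procedure used here, and the data and function names are assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def sse_and_press(x, y):
    """Return (SSE, PRESS) for the simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    intercept, slope = fit_line(x, y)
    sse = press = 0.0
    for xi, yi in zip(x, y):
        residual = yi - (intercept + slope * xi)
        leverage = 1.0 / n + (xi - mean_x) ** 2 / sxx
        sse += residual ** 2
        # Leave-one-out prediction error equals residual / (1 - leverage).
        press += (residual / (1.0 - leverage)) ** 2
    return sse, press


def percent_increase(sse, press):
    """Percent increase in the error sum of squares, (PRESS - SSE) / SSE."""
    return 100.0 * (press - sse) / sse


# Hypothetical observations (pH, corrosion rate in mm/yr), illustration only.
x = [5.1, 6.0, 6.8, 7.2, 7.5, 7.9, 8.3, 8.8]
y = [0.16, 0.13, 0.12, 0.10, 0.11, 0.09, 0.08, 0.09]

sse, press = sse_and_press(x, y)
print(f"SSE = {sse:.5f}, PRESS = {press:.5f}, "
      f"increase = {percent_increase(sse, press):.1f}%")
```

PRESS is never smaller than SSE, so the percentage is non-negative; with only a handful of points, as in this toy sample, it will generally be much larger than the 8% reported for the 74 observations.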

Another way to measure this is to compare R²-adjusted to R². The decrease in R² is called the shrinkage, and it represents the decrease in predictive power of the equation when used on the population as a whole. In this case, the shrinkage is 9%, which is acceptable.

The F-ratio is used to determine if the variable is significant in predicting y. The value obtained for pHdir is equal to 10.61, which is substantially higher than the critical value of 4.00 (see Figure F.1) and, as suspected, pHdir is considered significant in predicting CorrRate.

The outliers are identified by two different methods: by the z-score of the residual and by Cook's Distance (CD). When the z-score of a residual is larger than 3 and/or when the value of CD is larger than 1, the observation is considered an outlier. For every variable, observation #72 is considered an outlier with a z-score > 5. Furthermore, observation #149 has a z-score close to 3 and should also be examined in future analyses.

Finally, normality of the residuals is determined using the e vs. y plot and the normal distribution plot. The e vs. y plot is used to identify any underlying trends in the distribution of the residuals. If the plot shows a particular pattern in the residuals, this generally indicates that the proposed model is not a complete one, i.e. that not all the variance in the system has been accounted for by the model proposed, and that the addition of another variable may be necessary. This can also be determined by studying the shape of the curve of the normal distribution plot. A straight line is characteristic of a normally distributed set of values, and therefore, any non-linear curve would indicate a deviation from normality which may be corrected by the addition of another variable.

In the case of pHdir, the e vs. y plot does not show any clear trends in the distribution of the points. However, it does show that the majority of the residual values are between ±0.05 mm/yr, and that one point in particular (#72) is much higher than the norm. This can also be seen on the boxplot, where observation #72 is clearly an outlier, and on the normal probability plot, where the outlier is located far above the curve. Finally, the curve on the normal probability plot is slightly curved which, as suspected, indicates that some other variable should be added to the model of pHdir alone.
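The two outlier criteria can be illustrated with a short Python sketch for a single-predictor regression. It is a sketch under stated assumptions, not the SAS computation: the z-score is taken here as the residual divided by the root mean square error, the Cook's Distance uses the standard formula with two fitted parameters, and the data are synthetic with one deliberately displaced point.

```python
import math


def regression_diagnostics(x, y):
    """Return (z_scores, cooks_distances) for the regression of y on x."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in residuals) / (n - p)
    leverages = [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    z_scores = [e / math.sqrt(mse) for e in residuals]
    cooks = [(e * e / (p * mse)) * (h / (1.0 - h) ** 2)
             for e, h in zip(residuals, leverages)]
    return z_scores, cooks


# Synthetic data: a linear trend with a small alternating disturbance,
# plus one observation (index 12) displaced upward by 0.15.
x = [5.0 + 0.2 * k for k in range(20)]
y = [0.20 - 0.012 * xi + 0.005 * (-1) ** k for k, xi in enumerate(x)]
y[12] += 0.15

z_scores, cooks = regression_diagnostics(x, y)
flagged_by_z = [i for i, z in enumerate(z_scores) if abs(z) > 3]
flagged_by_cd = [i for i, d in enumerate(cooks) if d > 1]
print("z-score > 3:", flagged_by_z)
print("Cook's D > 1:", flagged_by_cd)
```

In this example the displaced point exceeds the z-score threshold but not the Cook's Distance threshold, because it lies near the centre of the x range and therefore has little leverage on the fitted line. The two criteria measure different things, which is why both are examined.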
In the case of pHsat, 74 observations were analyzed and revealed an F-ratio of 4.12. This value is larger than the critical value of 4.00 and, as such, pHsat is considered to be significant in predicting CorrRate. Furthermore, there is an 8% increase in the error sum of squares, and a 24% decrease in R². From the above results, it is clear that pHsat is not as good as pHdir in predicting CorrRate, even though the distribution of the residuals is very similar. However, because one variable is being studied at a time, it cannot be concluded that pHsat will not be better in predicting CorrRate when it is used in combination with other variables. This will be discussed in a later section.
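The reported F-ratios and shrinkage percentages can be cross-checked against each other, because for a regression on a single X variable the F-ratio and R² are linked by F = R²(N - 2)/(1 - R²). The Python sketch below performs this check for pHdir and pHsat. It assumes that the shrinkage is expressed as the decrease from R² to R²-adjusted relative to R², which is an interpretation that happens to reproduce the quoted figures.

```python
def r2_from_f(f_ratio, n):
    """R-squared implied by the F-ratio of a single-predictor regression."""
    return f_ratio / (f_ratio + (n - 2))


def adjusted_r2(r2, n, predictors=1):
    """Adjusted R-squared for n observations and the given number of predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - predictors - 1)


def shrinkage_percent(r2, r2_adjusted):
    """Relative decrease from R-squared to adjusted R-squared, in percent."""
    return 100.0 * (r2 - r2_adjusted) / r2


N = 74  # observations available for pHdir and pHsat

r2_dir = r2_from_f(10.61, N)
shrink_dir = shrinkage_percent(r2_dir, adjusted_r2(r2_dir, N))

r2_sat = r2_from_f(4.12, N)
shrink_sat = shrinkage_percent(r2_sat, adjusted_r2(r2_sat, N))

print(f"pHdir: R2 = {r2_dir:.4f}, shrinkage = {shrink_dir:.1f}%")
print(f"pHsat: R2 = {r2_sat:.4f}, shrinkage = {shrink_sat:.1f}%")
```

Under this interpretation the shrinkage reduces to 1/F for a single predictor, which is why a variable with an F-ratio barely above the critical value, such as pHsat, shows a much larger shrinkage than pHdir.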


Reddir and Redsat


Tables 4.6a and 4.6b display the information related to the variables Reddir and Redsat, respectively. Furthermore, Figures 4.14a through 4.14d display the SAS output for the variables Reddir and Redsat, including the e vs. y plot, the stem and leaf diagram, the box plot, and the normal probability plot for the residuals.

Residual Information: Variable Reddir

N
SSE
PRESS statistic
R²
R²-adjusted
F-Ratio
Outliers with z > 3
Outliers with CD > 1

Table 4.6a Residual Characteristics: Reddir

Residual Information: Variable Redsat

N
SSE
PRESS statistic
R²
R²-adjusted
F-Ratio
Outliers with z > 3
Outliers with CD > 1

Table 4.6b Residual Characteristics: Redsat

Only 74 observations were available for regressing the variable Reddir against CorrRate. The F-ratio is equal to 0.324, which is far below the critical ratio of 4.00 and which indicates that Reddir is not significant in predicting CorrRate. This is also indicated by the value of 0.0044 for R², which shows that the correlation between Reddir and CorrRate is very low. The results obtained from Reddir appear to indicate that this variable will not play an important role in future analyses. This will be shown to be true in the following sections. The e vs. y plot shows that the majority of the residuals are between ±0.05 mm/yr. There are two points which appear to be located far from the rest: observations #72 and #149. According to the z-score and Cook's Distance, the only true outlier is observation #72, but #149 should also be observed because of its high z-score. Finally, the normal probability plot indicates a deviation from normality, as well as the presence of the two outliers which are located far from the line.

In the case of Redsat, the results indicate that this variable performs worse than Reddir. The F-ratio of 0.0012 clearly shows that this variable is not significant in predicting CorrRate, and the extremely low value of R² indicates that there is little correlation between Redsat and CorrRate.

[Figures 4.14a through 4.14d: SAS output for the variables Reddir and Redsat (e vs. y plot, stem and leaf diagram, box plot, and normal probability plot of the residuals)]

LResdir and LRessat


Tables 4.7a and 4.7b display the information related to the variables LResdir and LRessat, respectively. Furthermore, Figures 4.15a through 4.15d display the SAS output for the variables LResdir and LRessat, including the e vs. y plot, the stem and leaf diagram, the box plot, and the normal probability plot for the residuals.

Residual Information: Variable LResdir

N
SSE
PRESS statistic
R²
R²-adjusted
F-Ratio
Outliers with z > 3
Outliers with CD > 1

Table 4.7a Residual Characteristics: LResdir

Residual Information: Variable LRessat

N
SSE
PRESS statistic
R²
R²-adjusted
F-Ratio
Outliers with z > 3
Outliers with CD > 1

Table 4.7b Residual Characteristics: LRessat

In the case of LResdir, 70 observations were used. The F-ratio obtained is 2.553, which is below the critical F-ratio of 4.00 and indicates that the variable is not significant in predicting CorrRate. This is contrary to expectation, considering the importance of resistivity in the corrosion process. There is a 39% decrease in the R² value and a 14% increase in the error sum of squares when applying the equation to the population as a whole. These values are somewhat higher than expected. Furthermore, observation #72 is identified as an outlier, and observation #149 also has a high z-score. The e vs. y plot shows that 90% of the residuals lie within ±0.04 mm/yr, with only two residuals above 0.06 mm/yr. These two residuals also appear on the normal probability plot as points located off the curve, as seen in Figure 4.16d. The curve of the normal probability plot also suggests a slight deviation from normality.

In the case of LRessat, 74 observations were analyzed. The F-ratio of 2.598 is also below the critical value and therefore, contrary to what is expected, the analysis shows that LRessat is not significant in predicting CorrRate. Furthermore, there is a 39% decrease in the R² value and a 4% increase in the error sum of squares. Once more, observation #72 is identified as an outlier, with observations #149 and #42 exhibiting outlier behavior. The e vs. y plot shows that 95% of the residuals lie within ±0.04 mm/yr, with only two residuals above 0.06 mm/yr. In general, the results suggest that LRessat is better than LResdir in predicting CorrRate. This is the expected result because CorrRate was obtained by testing a soil which had been saturated with distilled water and whose resistivity during testing was therefore better represented by Ressat than by Resdir.
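The comparison between the error sum of squares and the PRESS statistic listed in the residual tables can be sketched as follows. This is a minimal illustration, assuming that the PRESS statistic is the usual leave-one-out prediction sum of squares and that the quoted "increase in the error sum of squares" is PRESS relative to SSE; both readings are assumptions, the function names are hypothetical, and the data are made up.

```python
def fit_line(x, y):
    """Return intercept and slope of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - b1 * mean_x, b1


def sse_and_press(x, y):
    """SSE and PRESS for a one-variable regression.

    PRESS uses the identity e_(i) = e_i / (1 - h_i), where h_i is the
    leverage of observation i, so no refitting is needed.
    """
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b0, b1 = fit_line(x, y)
    sse = 0.0
    press = 0.0
    for xi, yi in zip(x, y):
        residual = yi - (b0 + b1 * xi)
        leverage = 1.0 / n + (xi - mean_x) ** 2 / sxx
        sse += residual ** 2
        press += (residual / (1.0 - leverage)) ** 2
    return sse, press


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.3, 5.9]
sse, press = sse_and_press(x, y)
increase_pct = 100.0 * (press - sse) / sse
```

PRESS is never smaller than SSE, so increase_pct gives a rough idea of how much the fit degrades when each observation is predicted from the others.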

[Figures 4.15a through 4.15d: SAS output for the variables LResdir and LRessat (e vs. y plot, stem and leaf diagram, box plot, and normal probability plot of the residuals)]

LChl

Table 4.8 displays the information related to the variable LChl. Furthermore, Figures 4.16a and 4.16b display the SAS output, including the e vs. y plot, the stem and leaf diagram, the box plot, and the normal probability plot for the residuals.

Residual Information: Variable LChl

N
SSE
PRESS statistic
R²
R²-adjusted
F-Ratio
Outliers with z > 3
Outliers with CD > 1

Table 4.8 Residual Characteristics: LChl

In total, 73 observations were used to analyze the relationship between LChl and CorrRate. The F-ratio of 8.572 is much higher than the critical value of 4.00 and, as such, this variable is considered very significant in predicting CorrRate. The decrease in R² is only 12% and the increase in the error sum of squares is 7%. These values are quite acceptable. Furthermore, observations #72 and #149 are identified as outliers, with #42 exhibiting outlier behavior. This can also be seen on the e vs. y plot, where 95% of the residuals lie within ±0.04 mm/yr, with only the two outliers above 0.06 mm/yr. Finally, the normal probability plot shows a slight deviation from normality which may be corrected by the addition of another variable to the model.
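The two outlier criteria used in the residual tables (z > 3 and CD > 1) can be sketched for a one-variable regression as shown below. This is an illustration under stated assumptions: the z-score is taken to be the residual divided by the root mean square error, which may differ from the definition used in the SAS output, and the function name and data are hypothetical.

```python
def outlier_screen(x, y, z_limit=3.0, cd_limit=1.0):
    """Flag observations by standardized residual and by Cook's distance.

    Returns two lists of 0-based indices: those with |z| > z_limit and
    those with Cook's distance > cd_limit.
    """
    n = len(x)
    p = 2  # parameters in the model: intercept and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual mean square
    high_z, high_cd = [], []
    for i, (xi, e) in enumerate(zip(x, residuals)):
        leverage = 1.0 / n + (xi - mean_x) ** 2 / sxx
        z = e / s2 ** 0.5
        cooks_d = (e ** 2 / (p * s2)) * leverage / (1.0 - leverage) ** 2
        if abs(z) > z_limit:
            high_z.append(i)
        if cooks_d > cd_limit:
            high_cd.append(i)
    return high_z, high_cd


# Made-up data: an exact line with one grossly deviant response.
x = [float(i) for i in range(20)]
y = [2.0 * xi + 1.0 for xi in x]
y[10] += 50.0
high_z, high_cd = outlier_screen(x, y)
```

In this made-up example the deviant point has a large z-score but, because it sits near the middle of the x range and has little leverage, its Cook's distance stays below 1. The two criteria therefore need not agree, which is one possible reason an observation can show a high z-score without being classed as a true outlier.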

The results analyzed up to this point suggest that the pHdir variable performs the best, followed by LChl, pHsat and LRessat. The variables which appear to add the least information are Reddir and Redsat. It is not at all surprising that the pH value plays such an important role, as the reduction of H⁺ ions is one of the two reactions expected to contribute to the corrosion problem. The other reaction expected is the reduction of O₂ and, as such, the insignificant role of Reddir comes as a surprise. The influence of the chloride content was also expected. Chlorides have a dual effect on the corrosion rate. Firstly, they decrease the resistivity of the soil because they are ions and conductive by nature, and secondly, they inhibit the formation of the protective passive layer on the steel specimen. On the other hand, what is very surprising is the insignificant role played by the variable LRessat. It was thought that, because resistivity indirectly measures the chloride content as well as the general ionic content of the soil, LRessat would provide almost as much information as LChl. However, this has not been shown yet.

4.1.4 Correlation Matrix

In the previous section, the relationship between CorrRate and each of the independent variables was studied. One could easily proceed and regress each variable in turn with all the others in order to identify the extent to which the variables are intercorrelated. An alternative to this is to study the correlation matrix of the set of independent variables, plus the dependent one. The correlation matrix for the data under study is presented in Figure 4.17.

The ideal situation is one in which the correlations between the dependent variable (CorrRate) and each independent variable are high, and the correlation between the independent variables themselves is low. This would result in the least amount of multicollinearity, i.e. redundant information, and would lead to a situation where each variable that is added to an equation would provide new information and would serve to significantly increase the effectiveness of the equation.

An examination of the correlation between CorrRate and the independent variables will quickly reveal that the results are the same as those obtained in the previous section: the pHdir variable performs the best, followed by LChl, pHsat and LRessat. The variables which appear to add the least information are Reddir and Redsat. Another important point is the sign of the correlation between CorrRate and LResdir. One would expect that the correlation would be negative, i.e. the higher the resistivity, the lower the corrosion rate, but this is not the case. However, it must be kept in mind that the variable LResdir was obtained by testing the soil in the state in which it was received in the laboratory, which means that some soils were tested when dry and others were tested in a saturated condition. The results obtained are therefore misleading.

Another important fact observed from the correlation matrix is the presence of high correlations between pHdir and pHsat, between Reddir and Redsat, and between LResdir and LRessat. As each of these pairs essentially measures the same soil property, it is not at all surprising to see high correlations. This indicates that the information provided by the variables is essentially the same, and that only one of the two variables needs to be included in a model. The choice of the variable to be retained will be discussed later. Another correlation of particular interest is that between LChl and each of the resistivity variables, LRessat and LResdir. The correlation between LChl and LRessat is very high, and this indicates that the two variables essentially provide the same information. Although LChl is the better of the two variables, it remains to be determined whether or not the extra information provided by LChl is sufficient to consider this variable significant when the variable LResdir is already known.

The correlation matrix provides information about the interaction of the variables. This is the first step in the determination of a model which describes the corrosion phenomenon well. The next step consists of comparing the possible models, which will be done using the RSQUARE procedure in SAS.
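Because not every variable was measured on every sample, each entry of a correlation matrix of this kind is computed from the observations available for that particular pair. A minimal Python sketch of such a pairwise calculation is given below; the function names are hypothetical, the readings are made up, and missing values are represented by None as an assumption about the data layout.

```python
def pearson_pairwise(a, b):
    """Pearson r between two columns, using only rows where both are present.

    Missing readings are represented by None. Returns (r, n_used).
    """
    pairs = [(u, v) for u, v in zip(a, b) if u is not None and v is not None]
    n = len(pairs)
    mean_a = sum(u for u, _ in pairs) / n
    mean_b = sum(v for _, v in pairs) / n
    saa = sum((u - mean_a) ** 2 for u, _ in pairs)
    sbb = sum((v - mean_b) ** 2 for _, v in pairs)
    sab = sum((u - mean_a) * (v - mean_b) for u, v in pairs)
    return sab / (saa * sbb) ** 0.5, n


def correlation_matrix(columns):
    """Map each pair of column names to its (r, n_used) tuple."""
    names = list(columns)
    return {(p, q): pearson_pairwise(columns[p], columns[q])
            for p in names for q in names}


# Made-up readings; None marks a missing measurement.
columns = {
    "CorrRate": [0.02, 0.05, 0.03, 0.08, 0.04],
    "LChl": [1.0, 2.1, None, 3.0, 1.8],
    "LRessat": [3.5, 2.4, 3.1, None, 2.9],
}
matrix = correlation_matrix(columns)
```

Each entry carries its own number of observations, which is why the counts can differ from cell to cell in the same matrix.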

4.1.5 RSQUARE Results

The RSQUARE procedure in SAS is used to obtain a list of the 10 best 1-variable, 2-variable, 3-variable models, etc. This procedure is considered better than the stepwise, forward, and backward regression procedures because it does not present one final model as the best model. Instead, it provides the analyst with a set of models that perform best, and allows the analyst to compare the models and to select the one shown to be the most logical. Table 4.9 presents several 1, 2, 3 and 4-variable models which are made up of combinations of variables considered acceptable by the analyst.

In the case of the one-variable models, the results obtained are similar to those obtained in the previous exercises. As expected, the variables which appear to be correlated best with CorrRate are the pH variables and LChl. However, it is surprising that the variable LResdir performs better than LRessat, and that LResdir appears in almost all the best 2 and 3-variable models, whereas LRessat appears in a few. Furthermore, these two variables often appear in the same models, which suggests that the information provided by each of the variables is not necessarily repetitive. And finally, even though LResdir often appears together with LChl, LRessat never does. These results will be considered further in a later section.

Another important result is that pHsat and pHdir never appear in the same model. This suggests that the information provided by one variable is not necessary when the other variable is already in the model. This was an expected result. Furthermore, the models containing pHdir almost always perform better than those containing pHsat. For this reason, it can be safely concluded that pHdir outperforms pHsat.

Finally, Redsat does not appear in any of the models and, although Reddir does, it does not appear to play a very important role. This is certainly a surprising result that will be considered further in a later section.

Variables in Model

1-variable models:
PHDIR
PHSAT
LCHL
LRESDIR
LRESSAT
REDDIR
REDSAT

2-variable models:
LRESDIR LCHL
PHDIR LCHL
PHDIR LRESDIR
PHSAT LCHL
PHDIR LRESSAT
PHDIR REDSAT
PHDIR REDDIR
PHSAT LRESDIR

3-variable models:
PHDIR LRESDIR LCHL
PHSAT LRESDIR LCHL
PHSAT LRESSAT LRESDIR
REDSAT LRESDIR LCHL
REDDIR LRESDIR LCHL
REDSAT LRESSAT LRESDIR
REDDIR LRESSAT LRESDIR

4-variable models:
PHDIR REDDIR LRESSAT LRESDIR
PHDIR REDDIR LRESDIR LCHL
PHSAT REDDIR LRESDIR LCHL

Table 4.9 Possible 1, 2, 3, and 4 Variable Models
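The idea behind an all-subsets comparison of this kind can be sketched in a few lines of Python: every combination of candidate variables is fitted by least squares and the combinations are ranked by R² within each model size. This is only an illustration of the principle, not the SAS RSQUARE procedure itself; the function names and the small data set are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of the least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)]
           for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    beta = solve(xtx, xty)  # normal equations
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def best_subsets(data, y, top=10):
    """For each model size, list the `top` variable subsets ranked by R^2."""
    names = sorted(data)
    result = {}
    for size in range(1, len(names) + 1):
        scored = [(r_squared([data[name] for name in combo], y), combo)
                  for combo in combinations(names, size)]
        scored.sort(reverse=True)
        result[size] = scored[:top]
    return result


# Made-up data in which y depends on "a" and "b" only.
data = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    "c": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
}
y = [5.0, 4.0, 9.0, 8.0, 13.0, 12.0]
ranking = best_subsets(data, y)
```

The ranking only orders the candidate models; as noted above, the choice among them is left to the analyst.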

4.1.6 Categorical Variables

Up to this point, only discrete variables have been considered. The influence of categorical variables such as Soiltype, Moisture and Sulfi has been ignored. One way of including a categorical variable in the analysis is to transform it into a set of dummy variables which can then be treated like discrete variables (see Section D.5). The results of such an analysis are not very obvious and, for this reason, a simpler exercise will be performed to investigate the general effect of the categorical variables. This exercise consists simply of calculating the correlation matrix and performing the RSQUARE procedure on the data which has been sorted. For example, to examine the effect of the variable Soiltype, the data is first sorted into the three categories: sand, sand/clay, and clay. Then, for each of these three categories, the correlation matrix is calculated and the RSQUARE procedure is performed. The correlation matrices obtained for the variable Soiltype are presented in Figures 4.18a through 4.18c. The following important points are extracted from the matrices:

• For clays, the variables which perform best are pHdir and LChl. Conversely, LRessat performs very poorly. Furthermore, the variable Reddir performs quite well.
• For sand/clays, pHdir performs best, followed by LChl and LRessat, which perform equally well. The redox variables appear not to be very helpful.
• For sands, the variables which perform best are LChl and LRessat. The pHdir variable seems to perform poorly.

The above results suggest that the variable Soiltype plays an important role. For example, the pH of a soil seems to be more important when the soil is a clay, and the resistivity is more important when the soil is a sand. Furthermore, both variables are important when the soil is a sand/clay, and the chloride content appears to be important irrespective of the soil type. It is therefore concluded that the variable Soiltype should be included in future analyses and, as such, this categorical variable will be transformed into a set of three dummy variables and treated in the same manner as the other discrete variables.
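The transformation of a categorical variable into dummy variables can be sketched as follows. The category labels follow the text, while the function name, the sample column, and the choice of "sand" as a reference level are assumptions made for illustration; the coding actually used is the one described in Section D.5.

```python
def dummy_code(values, levels, reference=None):
    """Turn a categorical column into 0/1 indicator columns.

    With reference=None one column per level is produced. With a reference
    level, that level is dropped, which avoids exact collinearity with the
    intercept in a regression.
    """
    kept = [lv for lv in levels if lv != reference]
    return {lv: [1 if v == lv else 0 for v in values] for lv in kept}


# Category labels follow the text; the sample column itself is made up.
levels = ["sand", "sand/clay", "clay"]
soiltype = ["clay", "sand", "sand/clay", "sand", "clay"]
three_columns = dummy_code(soiltype, levels)
two_columns = dummy_code(soiltype, levels, reference="sand")
```

When a model already contains an intercept, the three indicator columns sum to one for every sample, so only two of them carry independent information. Dropping a reference level is one common way to handle this.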

[Figure 4.18a SAS Output: Correlation Matrix]

Correlation Analysis

Pearson Correlation Coefficients / Prob > |R| under Ho: Rho=0 / Number of Observations

            PHDIR     PHSAT    REDDIR    REDSAT   LRESDIR   LRESSAT      LCHL

PHDIR     1.00000   0.61974  -0.03843  -0.03715  -0.06464  -0.13583  -0.01416
              0.0    0.0001    0.7826    0.7897    0.6660    0.3419    0.9223
               54        51        54        54        47        51        50

PHSAT     0.61974         -  -0.06473  -0.23635  -0.02610  -0.26105   0.20294
           0.0001         -    0.6452    0.0884    0.8633    0.0671    0.1620
               51         -        53        53        46        50        49

REDDIR   -0.03843  -0.06473   1.00000   0.82414   0.38833   0.37746  -0.41076
           0.7826    0.6452       0.0    0.0001    0.0070    0.0063    0.0030
               54        53        54        54        47        51        50

REDSAT   -0.03715  -0.23635   0.82414   1.00000   0.22604   0.40600  -0.48817
           0.7897    0.0884    0.0001       0.0    0.1266    0.0031    0.0003
               54        53        54        54        47        51        50

LRESDIR  -0.06464  -0.02610   0.38833   0.22604   1.00000   0.74594  -0.57108
           0.6660    0.8633    0.0070    0.1266       0.0    0.0001    0.0001
               47        46        47        47        47        47        45

LRESSAT  -0.13583  -0.26105   0.37746   0.40600   0.74594   1.00000  -0.88587
           0.3419    0.0671    0.0063    0.0031    0.0001       0.0    0.0001
               51        50        51        51        47        51        48

LCHL     -0.01416   0.20294  -0.41076  -0.48817  -0.57108  -0.88587   1.00000
           0.9223    0.1620    0.0030    0.0003    0.0001    0.0001       0.0
               50        49        50        50        45        48        50

(The diagonal entry for PHSAT is not legible in the scan.)

Figure 4.18b SAS Output: Correlation Matrix for Sand Samples

[Figure 4.18c SAS Output: Correlation Matrix]

The results obtained from the RSQUARE procedure indicate that the variables LChl and pHdir play a very important role, and that both the resistivity variables contribute new information. In some cases, LRessat performs better than LResdir, and in others the opposite is true. Furthermore, the variables LChl and LResdir often appear in the same model, whereas LRessat rarely appears in a model with LChl. It is becoming clear that the variables LRessat and LChl provide overlapping information, and that LResdir appears to represent some other inherent characteristic of the soil.

After a similar analysis involving the categorical variables Moisture and Sulfi, it was concluded that these variables do not introduce new information, and it was decided that they will not be considered in future analyses. In the case of the variable Moisture, it is not at all surprising that the variable is unnecessary. Moisture is a measure of the saturation state of the soil as it was received in the laboratory. This is not the state in which the soil was tested to obtain the value of CorrRate, the dependent variable. Therefore, the variable Moisture cannot be useful in predicting the value of CorrRate when it is so wholly unrelated to the conditions under which CorrRate was obtained.

The variable Sulfi appeared to provide no new information, and was considered not to be useful. However, it is believed that this conclusion is not one that would necessarily apply to future studies. The sulfide content of a soil is generally considered an important influence on corrosivity. In fact, sulfides (S²⁻) are very corrosive and can cause severe damage to metal surfaces. The fact that the variable Sulfi does not play an important role in this analysis may be a result of errors in the testing procedure. Another reason why the effect of sulfides is minimized may be that sulfides are only measured qualitatively. Perhaps sulfide content should be measured with more precision, as in the case of chloride content. This simply means that it may be insufficient to qualify a soil as containing either no sulfides (N), trace amounts of sulfide (T), or a lot of sulfides (P). In the case of chloride content, it was observed in Section 4.1.2 that the variable characteristics, prior to the logarithmic transformation, were simply unacceptable. The variable Chloride did not exhibit normality, and could not be used in regression analyses. Like the chloride content, sulfide content is a concentration, and it may be necessary to perform a similar transformation on this variable before it can be used. This will be discussed further in a later section.
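The effect of a logarithmic transformation on a concentration variable can be illustrated with a simple skewness calculation. This is a sketch only: the base of the logarithm (10), the function names, and the data are assumptions, and zero concentrations, for which the logarithm is undefined, are not handled.

```python
import math


def skewness(values):
    """Moment-based sample skewness (0 for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_transform(concentrations):
    """Base-10 logarithm of strictly positive concentrations."""
    return [math.log10(c) for c in concentrations]


# Made-up concentrations spanning several orders of magnitude.
chloride = [1.0, 10.0, 100.0, 1000.0, 10000.0]
lchl = log_transform(chloride)
skew_raw = skewness(chloride)
skew_log = skewness(lchl)
```

In this made-up example the raw values are strongly right-skewed while their logarithms are symmetric, which is the kind of improvement a transformation of this type is meant to achieve before regression.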

4.1.7 Variables Retained For Further Analyses

Up to this point, all of the discrete variables were included in the analyses. However, it has become clear that some variables perform better than others. From this point on, only the variables which are considered useful in predicting the dependent variable CorrRate will be retained. This step is one of "cleaning-up". Many different variables were measured during the experimental portion of this project, but the reason they were measured should be remembered. Not all of the variables can provide information on the phenomenon measured by the variable CorrRate.

The first step is to choose between the pH variables. Which one of the two represents most accurately the conditions under which the variable CorrRate was obtained? pHdir was obtained by testing the soil as it was received in the laboratory, whereas pHsat was measured when the soil was supersaturated. It is felt that the method used to obtain pHsat is unacceptable. When the soil is mixed with water at a ratio of 1:1, the soil is greatly beyond saturation. However, the soil tested for CorrRate is just barely saturated. In most cases, the soil received in the laboratory is already moist, and only a small amount of water is added prior to testing for CorrRate. For this reason, pHdir is considered to be the more representative variable of the two.

In the case of Reddir vs. Redsat, it is felt that Redsat is not a good measure of the oxidation-reduction potential of the soil because of the large quantity of water added to the soil prior to testing. The water added contains a certain amount of oxygen which will influence the reading. For this reason, the variable Reddir is considered the more representative variable of the two.

In the case of LResdir and LRessat, it was decided to include both variables in future analyses. Although LRessat is the variable which represents the condition of CorrRate testing most accurately, the variable LResdir appears to introduce information that LRessat does not. It is suspected that LResdir represents some inherent property of the soil which influences its resistivity. This suspicion arises from the fact that the majority of the soils obtained in the laboratory are already moist, i.e. they have a very similar moisture content. However, they do not have a moisture content which optimizes their conductive properties until they are saturated, i.e. until the movement of ions is optimized. It may be possible that LResdir measures some conductive property of the soil which is not the result of the movement of the ions in the soil. It is further suspected that the property represented by LResdir may be the soil type, or the soil content. Perhaps certain soil particles are inherently more conductive than others and LResdir measures this phenomenon. It is for this reason that both LRessat and LResdir are retained for further analyses.

In summary, the following variables are retained for further analyses: CorrRate, LChl, pHdir, LResdir, LRessat, Reddir, and Soiltype.

4.2 Consideration of Chlorides in Predicting the Corrosion Rate

4.2.1 Determining Significance

Now that the general behavior of each variable is understood, it is time to determine the significance of the chosen variables. The term significance refers to the statistical significance of a variable. A variable is considered significant if the information provided by this variable is sufficiently important, such that its addition to a set of other variables increases the ability of the set to explain the phenomenon under consideration. Determining significance consists of studying the results of a series of ANOVA tables and determining, at each step, whether the variable added is significant. An ANOVA table permits rapid calculation of the F-ratio and the correlation coefficient, R² (see Section D.4).

The variables on which attention is focused are the following: pHdir, LChl, LRessat, LResdir and Soiltype. The goal of the analysis is to answer the following two questions: 'Is the variable LChl necessary when LRessat is already included in the model?', and 'Does the variable LResdir provide the same information as Soiltype and, if so, should it be included in a model containing the variable Soiltype?'. The following information is entered in the ANOVA table, and is required to determine the significance of a model Ω:

• A benchmark model, ω, to which model Ω is compared,
• The error sum of squares, SSE, of each of the two models,
• The degrees of freedom, DOF, of each of the two models, and
• The critical F-ratio with which the calculated F-ratio is compared.

The SSE and the DOF of the possible models are listed in Table 4.10. The critical F-ratio is obtained from Table D.1. For the sake of simplicity, the critical F-ratio will be taken as 4.00 when comparing two models with a difference of 1 DOF, and 3.15 when comparing two models with a difference of 2 DOF. These values correspond to an α value of .05 and a DOF of 60 for the Ω-model. This is a conservative choice.

Variables in Model                                        DOF    SSE
Intercept
Intercept + Soiltype
Intercept + pHdir
Intercept + LChl
Intercept + Soiltype + pHdir
Intercept + Soiltype + LChl
Intercept + Soiltype + LRessat
Intercept + Soiltype + Reddir
Intercept + Soiltype + LResdir
Intercept + pHdir + LResdir
Intercept + Soiltype + pHdir + LChl
Intercept + Soiltype + pHdir + LRessat
Intercept + Soiltype + pHdir + LResdir
Intercept + Soiltype + pHdir + Reddir
Intercept + Soiltype + pHdir + LChl + LRessat

Table 4.10 Possible Models with Corresponding SSE and DOF Values

There are an infinite number of ANOVA tables which can be constructed from these variables, but only the tables considered relevant are presented here. The model tested (Ω) and the model against which it was tested (ω) are presented along with the F-ratio and the result of the significance test, i.e. whether it is significant or not. The first step of the analysis is to determine if the variable Soiltype is significant. This variable is selected first because of the need to determine if the variable LResdir represents some aspect of Soiltype. The following ANOVA table results:

SOURCE                        DOF    SSE        MS         F       R²
Difference                      2    0.00752    0.00376    4.70    0.12
intercept + Soiltype (Ω)       68    0.05433    0.00079
intercept (ω)                  70    0.06185
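The entries of the ANOVA table above can be checked with a short calculation. The sketch below assumes the usual extra-sum-of-squares F-test between a benchmark model and an extended model; the function and parameter names are hypothetical, while the numerical inputs are those of the table.

```python
def partial_f_test(sse_small, dof_small, sse_big, dof_big, f_critical):
    """Compare a benchmark model (small) with an extended model (big).

    sse_* are error sums of squares and dof_* the error degrees of freedom.
    Returns the F-ratio, the share of the benchmark SSE explained by the
    added terms, and whether F exceeds the critical value.
    """
    extra_dof = dof_small - dof_big
    ms_difference = (sse_small - sse_big) / extra_dof
    ms_error = sse_big / dof_big
    f_ratio = ms_difference / ms_error
    r2 = (sse_small - sse_big) / sse_small
    return f_ratio, r2, f_ratio > f_critical


# Values from the ANOVA table above: intercept vs. intercept + Soiltype.
f_ratio, r2, significant = partial_f_test(
    sse_small=0.06185, dof_small=70,
    sse_big=0.05433, dof_big=68,
    f_critical=3.15,
)
```

The calculation gives an F-ratio of about 4.71 and an R² of about 0.12, in line with the tabulated 4.70 and 0.12, and the F-ratio exceeds the critical value of 3.15 adopted for a difference of 2 DOF.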

The results indicate that the variable Soiltype is indeed significant. The next step is to determine which variables can be added to Soiltype significantly. Each variable is tested in turn, and the results indicate that only pHdir, LChl and LRessat can be added, with variable pHdir performing best. The next step consists of determining which variables can be added to Soiltype and pHdir significantly. Each of the remaining variables was tested in turn, and the results indicate that both LChl and LRessat can be added. As suspected, LResdir cannot be added to pHdir and Soiltype with significance. It would be interesting to see whether or not LResdir could have been added to pHdir if Soiltype were not already in the model. The following ANOVA table results:

The results indicate that LResdir could have been added to pHdir if Soiltype were not already in the model. It appears that these two variables contribute similar information. Perhaps LResdir measures some property of the soil independent of conductivity related to the ion content. LResdir was determined by testing a soil which was not saturated with water, but was only moist. It may be possible that, in this state, resistivity measures the conductivity of a soil due to particle charge, conductivity of certain types of soil particles, or even the air content. Perhaps these soil properties are taken into account by the variable Soiltype and, as such, one variable is unnecessary when the other is present in the model.

As stated earlier, both LRessat and LChl can be added to pHdir and Soiltype significantly, with LChl performing better than LRessat. The question that must now be answered is whether or not LChl is necessary when LRessat is already in the model, and vice versa. The corresponding ANOVA table, in which the model pHdir+Soiltype+LChl+LRessat is compared to the model pHdir+Soiltype+LChl, follows:

SOURCE                                                  SSE        MS         F       R²
Difference                                              0.00000    0.00000    0.00    0.00
intercept + Soiltype + pHdir + LChl + LRessat (Ω)       0.04032    0.00062
intercept + Soiltype + pHdir + LChl (ω)                 0.04032

It is very clear that the variable LRessat need not be added to the model containing LChl. Is the addition of LChl to a model containing LRessat necessary? The following ANOVA table illustrates the situation:

The results indicate that LChl need not be added to the model when LRessat is already included. Although either one of the two variables can be added to the pHdir+Soiltype model, when one of the two variables is included in the model, the other is not needed. The better variable is LChl, but it is also a variable which is much more difficult to obtain. As described in Chapter 2, obtaining the chloride concentration requires more time and effort. It was one of the objectives of this thesis to determine whether or not the determination of chloride concentration is essential to estimate the corrosivity of a soil. If so, a suggestion would be made to incorporate the chloride ion concentration into the existing grids to estimate corrosivity, i.e. PACE and AWWA. It was initially suspected that the concentration of conductive chloride ions was already incorporated in the resistivity measurement, but the effect of chloride ions is so important that further research is warranted. In fact, it is strongly believed that the effect of chloride ions is very important and that, even though the present results show that chloride ion concentration need not be added to the existing grids, the measurement of chloride concentration provides invaluable information on the potential corrosion problem.

In conclusion, the model which appears to provide the most information with the least amount of redundancy is the following: Soiltype + pHdir + LRessat. For the moment, it is suggested that the variable LChl need not be added to the existing corrosivity grids. However, it must be remembered that the variable LChl performed better than the variable LRessat and that the only reason the suggestion of replacing LRessat with LChl was not made is that the chloride content is more difficult and time consuming to obtain. If a soil testing laboratory is equipped to test for the chloride ion content, then it is recommended that it do so. The information provided by this parameter can be invaluable in certain cases.

4.2.2 The Effect of Removing Outliers

Outliers can have a very large influence on the results obtained using regression analysis. For this reason, outliers must be identified and studied carefully. It is incorrect to eliminate an outlier from a data set simply because its behavior is different from the rest of the data. In fact, in a data set made up of 150 observations, the presence of one or two deviant observations is not unusual. A popular way of dealing with outliers is to perform two separate studies, one including the outliers and the other excluding them. The results are then compared and the final conclusions are drawn.

In this case, the outliers have been identified as observations #72 and #149. These observations have unusually high corrosion rates which the variables studied were unable to explain fully. This does not mean that they must be eliminated from the set. On the contrary, these two soils should be studied further because they may provide information unlike all the other soils. However, for the sake of completeness, the analysis was repeated on the data set without the two outliers. The conclusions drawn from the results obtained are very similar to those obtained from the complete data set. Certain differences did appear and they warrant some attention. These differences are:

• The variable Soiltype does not appear to be very significant, and is not included in the final model.
• In general, the variable LRessat performs more poorly and, as a consequence,
• The variable LChl appears to be even more important than before.

The conclusion that would have been drawn from the above data set would be that the variable LRessat is not useful in determining corrosivity, and that only LChl and pHdir can predict the corrosivity of a soil. These results are difficult to accept. How can the variable LRessat be useless? It is well known that the resistivity of a soil is a key indicator of its corrosivity. Then why is this parameter badly represented in this data set? Furthermore, can the two deviant observations be eliminated without further study? It is felt that, in this preliminary analysis, the outliers should not be ignored. It is also suggested that these two soils be studied further to determine what variable(s), which have not been studied up to this point, are responsible for the corrosive nature of the soils.

Finally, it has already been determined that the variable LChl performs better than LRessat. The goal of this report was not to convince the industry to begin testing for chloride content. Any legitimate corrosion testing company is already aware of the importance of chloride ions in the corrosion process, and is most probably already testing for this parameter. The goal of this report was to determine whether or not the suggestion should be put forward to add this parameter to the existing grids. The results obtained up to this point do not suggest this conclusively.

4.3

Power Analysis
The power of a statistical test is the probability of finding a variable significant when it is in fact so. In cases when a variable is not significant, it is important to determine the power of the statistical test. A low power may be the reason why a variable did not prove to be significant, and consequently, the analyst may choose to disregard the results obtained. In this case, the variable LChl proved not to be significant when the variable LRessat was already in the model. The power will therefore be checked to ensure that the probability of finding the variable significant is adequate. A power of 0.70 is generally considered acceptable. When the variable LChl was added to the model consisting of pHdir, LRessat and the two soil type variables, the values of the power parameters were as follows:

k_A = 4
k_B = 1
k = 5
The value of N, the number of observations, is taken conservatively as 70 and the value of L is determined to be 9.6 (see Equation D.13). The power is obtained by interpolating between values obtained from Table D.8. The power of this statistical test is 0.87, which means that there is an 87% chance of finding LChl significant, if it is so. This result indicates that the power of the statistical tests is not responsible for finding LChl insignificant when LRessat is already in the model.
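
As a rough cross-check of the tabulated power quoted above, the sketch below replaces the interpolation in Table D.8 with a large-sample normal approximation to a test on one added variable. It treats L = 9.6 as the noncentrality parameter and assumes a two-sided significance level of 0.05. Both the approximation and the significance level are assumptions, and `approx_power` is a hypothetical helper name, not part of the original analysis.

```python
from statistics import NormalDist


def approx_power(noncentrality, alpha=0.05):
    """Approximate power of a test on one added variable.

    Large-sample normal approximation (an assumption, not the
    table-based method used in the text): the test statistic is
    treated as a standard normal shifted by sqrt(noncentrality).
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = noncentrality ** 0.5
    # Probability of landing in either rejection tail.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


power = approx_power(9.6)  # L = 9.6 as quoted in the text
print(round(power, 2))
```

Under these assumptions the approximation lands close to the tabulated value of 0.87, which supports the reading that low power does not explain the result for LChl.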

CHAPTER 5: CONCLUSIONS AND RECOMMENDATIONS


In total, 153 soils were tested for the following: pH, oxidation-reduction potential, sulfide content, resistivity, soil type, drainage ability, moisture content, and chloride ion content. Of these, 75 soils were tested using the method of linear polarization, an accelerated electrochemical test used to evaluate the corrosion rate of ductile iron embedded in soil. This testing method proved to be a powerful tool in the evaluation of soil corrosivity, and the applications in this field appear endless (see Section 5.2 for more details on possible future work). However, certain limitations of linear polarization testing must be remembered. The corrosion rate obtained using this method is the corrosion rate of the soil as it is found during testing. Any future changes to the soil, or the presence of any external influences affecting the corrosion rate, cannot be accounted for by the method of linear polarization. For example, the following possibilities cannot be accounted for:

• The presence of stray current corrosion,
• Galvanic attack of a ductile iron pipe when connected to copper service laterals,
• The future migration of chloride ions from the surface to the depth of the embedded metal, and
• The potential establishment and proliferation of sulfate-reducing bacteria.

In sum, only the corrosivity of the soil itself is measured, and as such, this parameter must be considered one part of a complete study in the determination of the corrosion potential.

5.1

Summary of Results
Each soil was tested according to the AWWA C105 and PACE 82-3 Standards, and additionally, the chloride ion content was determined and the corrosion rate was obtained using the linear polarization test. These data were analyzed using the Statistical Analysis System (SAS) and the following results were obtained:

• The variable which plays the most important role in the prediction of the corrosion rate is the pH of the soil. Furthermore, of the two pH testing procedures, saturated vs. unaltered, the method in which the soil is tested in its unaltered state proved to be the better predictor of the corrosion rate.

• The chloride ion content proved to be an excellent predictor of the corrosion rate. Second only to pH, this variable was highly correlated with the corrosion rate, and appeared in all the best predictor models.

• The information provided by the chloride ion content and the resistivity of the soil when saturated overlapped, and as such, when one variable is included in a predictor model, the other variable is insignificant. Although chloride ion content outperformed soil resistivity, the additional information provided by this variable was not significant enough to suggest that it be added to a model containing soil resistivity. Furthermore, the power of the significance test was examined and was shown to be acceptable.

• Soil resistivity was retained instead of the chloride ion content, because this variable is currently being used in the industry standards, and it can be determined rapidly and easily. Conversely, chloride ion testing is time consuming and requires an experienced technician. Furthermore, soil resistivity can be measured in-situ, whereas chloride ion content can only be measured under laboratory conditions. For these reasons of practicality, it is suggested that chloride ion content not replace resistivity in the existing standards.

• The important role of chloride ion content in the prediction of the corrosion rate has been established. It is therefore strongly recommended that this variable be tested whenever possible. Although not included in the industry standards, the information provided by this variable will provide a better understanding of the soil and its corrosive properties.

• Soil resistivity proved to be a good predictor of the corrosion rate, although not as good as expected. Of the two testing procedures, unaltered vs. saturated, the resistivity measured when the soil was saturated with distilled water was the better of the two in predicting the corrosion rate.

• The variable representing the resistivity of the unaltered soil, i.e. measured as received in the laboratory, proved to be an insignificant predictor of the corrosion rate when the soil type was included in the predictor model. It appears that when the resistivity of a moist soil is measured, the result may indirectly represent some inherent resistivity which depends on the soil type, e.g. the inherent conductivity of clay particles vs. that of sand particles.

• The variable which proved to be the least important in the prediction of the corrosion rate is the oxidation-reduction potential of the soil. This is a very surprising result, given the importance of this parameter in the corrosion process. It is strongly suspected that the method by which the soil is handled, the time the soil is exposed to air before testing, and the addition of distilled water all served to alter the potential of the soil, and as a consequence, only a small correlation between the oxidation-reduction potential and the corrosion rate was observed.

• All statistical analysis steps were performed on two data sets: one containing all observations, and another from which outliers were excluded. Although the numerical results varied, the conclusions drawn were essentially the same.

5.2

Recommendations for Future Work

First and foremost, a new set of soil samples should be created in the laboratory. Soil samples should have a predetermined pH, chloride content, sulfide content, oxidation-reduction potential, resistivity, and clay content. This will enable the researcher to study the effect of varying one parameter at a time, which is not possible in a set of randomly selected soil samples.

The accuracy of the method of linear polarization can be studied further. Soil samples obtained from sites of pipe failures can be tested using this method, and the results can be compared to the actual record of "years to break" or "breaks per year". This requires a well organized, long term study in which a minimum of 100 soil samples must be analyzed and the background information related to the corroded pipe must be gathered. Furthermore, a thorough knowledge of the other influencing phenomena will enable the researcher to identify cases of stray current corrosion, galvanic attack, and sulfide attack, which may cause the pipe to fail prematurely, and which are not measurable using the method of linear polarization.

The tests used to determine the sulfide content should be investigated further. Inconsistencies in the results obtained from the two tests suggest that the tests may be improved for future use of the AWWA and PACE standards. It is also strongly suggested that the sulfide ion content be determined more accurately in future laboratory experiments, since the results obtained using the tests described in this report may not be sufficient to represent the true effect of sulfide ions in the corrosion process.

Study the various methods currently being used to determine the chloride ion content (e.g. reading potentials, and titration). Determine which method is most accurate, which is least subjective, and which is the least subject to human error. This project is a very important one because it will create a proper base for further research in the area of chloride content. Furthermore, in the testing procedure outlined in this report, only the finest particles of soil were retained for the chloride ion test. The effect of retaining only the finest particles, as opposed to using a more representative specimen, should be studied further.

Based on the method of linear polarization, develop an in-situ test for corrosion rate.

Repeat the study undertaken in this report with the following changes:

• Sulfide test is replaced by the actual sulfide ion concentration,
• Soils are tested for pH, oxidation-reduction potential and resistivity immediately after, or before, linear polarization testing, in order to ensure that all tests are performed under the same conditions,
• Use the chloride ion test which yields the most reproducible results, and which is subject to the least human error,
• Upgrade the soil type classification to include more types of soils, e.g. organic, and silt, and
• Include the various types of metals in the study.

Consider incorporating temperature and moisture content in the existing grids to account for seasonal variations in these factors. Using the method of linear polarization, laboratory testing can be done to determine the effect of temperature and moisture content on the corrosion rate, and the knowledge of in-situ conditions will enable the engineer to determine the potential risk for corrosion with more accuracy.

Using the method of linear polarization, study the variation of the corrosion rate with time. Time-dependent phenomena, such as the development and subsequent proliferation of a sulfate-reducing bacteria colony, can be studied by creating the proper environment and testing for the corrosion rate at given intervals.

BIBLIOGRAPHY
[1] Parker, M.E., "Corrosion by Soils", NACE Basic Corrosion Course, National Association of Corrosion Engineers, Houston, Texas, 1969, p. 6-1.

[2] Uhlig, H.H., and Revie, R.W., Corrosion and Corrosion Control, John Wiley & Sons, New York, 1985.

[3] Funahashi, M., and Young, W.T., "Investigation of E-LOG I Tests and Cathodically Polarized Steel in Concrete", Proceedings of NACE Conference CORROSION-94, paper no. 301, National Association of Corrosion Engineers, Houston, Texas, 1994, p. 301/1.

[4] Sehgal, A.D., Kho, Y.T., Osseo-Asare, K., and Pickering, H.W., "Reproducibility of Polarization Resistance Measurements in Steel-in-Concrete Systems", Corrosion, Vol. 48, No. 9, September 1992, p. 706.

[5] Feliu, S., Gonzalez, J.A., Andrade, C., and Feliu, V., "Polarization Resistance Measurements in Large Concrete Specimens: Mathematical Solution for a Unidirectional Current Distribution", Materials and Structures, Vol. 22, 1989, p. 199.

[6] Macdonald, D.D., Urquidi-Macdonald, M., Rocha-Filho, R.C., and El-Tantawy, Y., "Determination of the Polarization Resistance of Rebar in Reinforced Concrete", Corrosion, Vol. 47, No. 5, May 1991, p. 330.

[7] Lavrenko, V.A., and Shvets, V.A., "Determination of the Corrosion Activity of Soil in Relation to Steel by the Polarization Resistance Method", Institute of Problems of Material Science, Academy of Sciences of the Ukraine, Kiev, Translated from Fiziko-Khimicheskaya Mekhanika Materialov, No. 3, May-June 1992, p. 108.

[8] Rogers, W.F., "Statistical Predictions of Corrosion Failures", Proceedings of NACE Conference CORROSION-89 (New Orleans), paper no. 596, National Association of Corrosion Engineers, Houston, Texas, 1989, p. 596/1.

[9] Fontana, M.G., and Greene, N.D., Corrosion Engineering, McGraw-Hill, New York, 1967.

[10] Wakelin, R.G., and Gummow, R.A., "The Effect of Copper on the Corrosion of Iron Watermains", Proceedings of NACE Conference CORROSION-90 (Las Vegas), paper no. 383, National Association of Corrosion Engineers, Houston, Texas, 1990, p. 383/1.

[11] ASTM, Standard Guide for Examination and Evaluation of Pitting Corrosion, ASTM Specification G 46-94.

[12] Sears, E.C., "Comparison of the Soil Corrosion Resistance of Ductile Iron Pipe and Gray Cast Iron Pipe", Materials Protection, Vol. 7, No. 10, October 1968, p. 33.

[13] De Rosa, P.J., and Parkinson, R.W., "Corrosion of Ductile Iron Pipe", Water Research Center External Report TR 241, United Kingdom, October 1986.

[14] Segal, B.G., Chemistry: Experiment and Theory, John Wiley & Sons, New York, 1985.

[15] Zumdahl, S.S., Chemistry, D.C. Heath and Company, Massachusetts, 1989.

[16] Ailor, W.H., Handbook on Corrosion Testing and Evaluation, John Wiley & Sons, New York, 1971.

[17] Oldham, K.B., and Mansfeld, F., "On the So-Called Linear Polarization Method for Measurement of Corrosion Rates", Corrosion, Vol. 27, No. 10, October 1971, p. 434.

[18] Mansfeld, F., and Oldham, K.B., "A Modification of the Stern-Geary Linear Polarization Equation", Corrosion Science, Vol. 11, 1971, p. 787.

[19] Stern, M., "A Method for Determining Corrosion Rates From Linear Polarization Data", Corrosion, Vol. 14, 1958, p. 440.

[20] Townley, D.W., "Determination of Maximum Scan Rate for Linear Polarization Measurements", Corrosion, Vol. 47, No. 10, October 1991, p. 737.

[21] Oldham, K.B., and Mansfeld, F., "Corrosion Rates from Polarization Curves: A New Method", Corrosion Science, Vol. 13, 1973, p. 813.

[22] ASTM, Standard Practice for Calculation of Corrosion Rates and Related Information from Electrochemical Measurements, ASTM Specification G 102-89.

[23] ASTM, Standard Reference Test Method for Making Potentiostatic and Potentiodynamic Anodic Polarization Measurements, ASTM Specification G 5-87.

[24] Fitzgerald III, J.H., "Evaluating Soil Corrosivity - Then and Now", Proceedings of NACE Conference CORROSION-93 (New Orleans), paper no. 4, National Association of Corrosion Engineers, Houston, Texas, 1993, p. 4/1.

[25] Stroud, T.F., "Corrosion Control Measures for Ductile Iron Pipe", Proceedings of NACE Conference CORROSION-93 (Las Vegas), paper no. 585, National Association of Corrosion Engineers, Houston, Texas, 1993, p. 585/1.

[26] Stevens, J., Applied Multivariate Statistics for the Social Sciences, Lawrence Erlbaum Associates, Mahwah, New Jersey, 1996.

[27] Draper, N.R., and Smith, H., Applied Regression Analysis, John Wiley & Sons, New York, 1966.

[28] Montgomery, D.C., and Peck, E.A., Introduction to Linear Regression Analysis, John Wiley & Sons, New York, 1982.

[29] SAS Institute, SAS/STAT User's Guide, Volume 1, Version 6, SAS Institute Inc., North Carolina, 1990.

[30] SAS Institute, SAS/STAT User's Guide, Volume 2, Version 6, SAS Institute Inc., North Carolina, 1990.

[31] Cohen, J., "A Power Primer", Psychological Bulletin, Vol. 112, No. 1, 1992, p. 155.

APPENDIX A: DERIVATION OF POTENTIAL EQUATIONS

A.1

Equation for $\phi_{N,Zn}$

The reduction of Zn is represented by the following equation:

$$\mathrm{Zn^{2+}} + 2e^- \rightarrow \mathrm{Zn(s)}$$

The Nernst potential is obtained by substituting the appropriate values into the following equation:

$$\phi_N = \phi^0 + 2.303\,\frac{RT}{nF}\,\log\frac{[a_{ox}]}{[a_{red}]}$$ (A.2)

For the reduction of Zn, the following values are substituted into Equation A.2:

• R = 8.314 J/deg mole,
• T = 298 Kelvin,
• n = 2 electrons transferred,
• F = 96500 C/eq,
• $[a_{ox}] = [\mathrm{Zn^{2+}}]$, and
• $[a_{red}] = [\mathrm{Zn(s)}] = 1$, because the concentration of a solid is equal to 1.

Once the above values are substituted, Equation A.2 becomes:

$$\phi_{Zn} = \phi_{Zn}^0 + \frac{0.0592}{2}\,\log[\mathrm{Zn^{2+}}]$$

A.2

Equation for $\phi_{N,Cu}$

The reduction of Cu is represented by the following equation:

$$\mathrm{Cu^{2+}} + 2e^- \rightarrow \mathrm{Cu(s)}$$

The Nernst potential is obtained by substituting the appropriate values into the following equation:

$$\phi_N = \phi^0 + 2.303\,\frac{RT}{nF}\,\log\frac{[a_{ox}]}{[a_{red}]}$$ (A.2)

For the reduction of Cu, the following values are substituted into Equation A.2:

• R = 8.314 J/deg mole,
• T = 298 Kelvin,
• n = 2 electrons transferred,
• F = 96500 C/eq,
• $[a_{ox}] = [\mathrm{Cu^{2+}}]$, and
• $[a_{red}] = [\mathrm{Cu(s)}] = 1$, because the concentration of a solid is equal to 1.

Once the above values are substituted, Equation A.2 becomes:

$$\phi_{Cu} = \phi_{Cu}^0 + \frac{0.0592}{2}\,\log[\mathrm{Cu^{2+}}]$$

A.3

Equation for $\phi_{N,H^+}$

The reduction of H+ is represented by the following equation:

$$2\mathrm{H^+} + 2e^- \rightarrow \mathrm{H_2(g)}$$

The Nernst potential is obtained by substituting the appropriate values into the following equation:

$$\phi_N = \phi^0 + 2.303\,\frac{RT}{nF}\,\log\frac{[a_{ox}]}{[a_{red}]}$$ (A.2)

For the reduction of H+, the following values are substituted into Equation A.2:

• n = 2 electrons transferred for each H2 released,
• $[a_{ox}] = [\mathrm{H^+}]^2$, and
• $[a_{red}]$ = partial pressure of H2(g) = 1 atm, because the reduced species is a gas under normal pressure.

Once the above values are substituted, Equation A.2 becomes:

$$\phi_H = \phi_H^0 + 2.303\,\frac{RT}{2F}\,\log\frac{[\mathrm{H^+}]^2}{1}$$

Furthermore, given the following two facts:

• the reduction of H+ is taken as the baseline potential, i.e. $\phi_H^0 = 0$, and
• pH = - log [H+],

the final equation becomes:

$$\phi_H = -0.0592\,\mathrm{pH}$$
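
The Nernst relation used throughout this appendix can be checked numerically. The sketch below is a minimal illustration using the constants listed above (R = 8.314, T = 298, F = 96500). The function names are hypothetical, and the standard potential used for zinc is a commonly tabulated value assumed here for illustration, not a figure taken from this report.

```python
import math

R = 8.314     # gas constant, J/(deg mole), as listed in this appendix
T = 298.0     # temperature, Kelvin
F = 96500.0   # Faraday constant, C/eq


def nernst_potential(phi_standard, n, a_ox, a_red=1.0):
    """Equation A.2: phi0 + 2.303 (RT/nF) log10(a_ox / a_red)."""
    return phi_standard + 2.303 * R * T / (n * F) * math.log10(a_ox / a_red)


def hydrogen_potential(ph):
    """Section A.3: phi0 = 0, n = 2, a_ox = [H+]^2, a_red = 1 atm."""
    h_conc = 10.0 ** (-ph)
    return nernst_potential(0.0, 2, h_conc ** 2, 1.0)


# Assumed standard potential for zinc (illustrative, not from the report).
PHI_ZN_STANDARD = -0.76
phi_zn = nernst_potential(PHI_ZN_STANDARD, 2, a_ox=0.01)

phi_h = hydrogen_potential(7.0)
print(round(phi_h, 3), round(phi_zn, 3))
```

With these constants the hydrogen potential works out to roughly -0.059 V per pH unit, which is the slope that appears in the substituted equations above.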

APPENDIX B: TESTING FOR CHLORIDE ION CONCENTRATION

B.1

Creating a Concentration vs. Potential Curve

The first step in creating a concentration vs. potential curve is to record the potentials of five calibrating solutions of known concentration: 0.01%, 0.03%, 0.33%, 0.65%, and 1.3%. This is done a minimum of two times, and the average potential for each calibrating solution is calculated. The standard deviation of the concentrations obtained for each calibrating solution is then compared to the maximum values permitted: 1.5 for 0.01%, and 1.0 for the remaining solutions. An example of the potentials obtained during a calibration exercise is presented in Figure B.1.

For each calibration solution, the average potential is plotted versus the known chloride ion concentration, and a curve, such as the one presented in Figure B.2, is obtained. The curve which fits the five data points best is an exponential one. The equation of this curve is presented in the upper right hand corner of Figure B.2.

The chloride ion concentrations of the soil samples tested are obtained from the curve or from the equation in Figure B.2. The measured potential of the soil sample is located on the curve and the corresponding concentration is obtained either from the equation, or from the curve itself. For example, for a potential of -40 mV, the following concentrations are obtained:

• From the curve, the concentration is approximately equal to 0.82 %
• From Equation B.2, the concentration is equal to 0.1622 e^(…) = 0.816 %

The choice of which method to use depends on the degree of precision required.
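
The exponential calibration curve can be sketched from the five plotted values listed in Figure B.1. The example below assumes the curve is fitted by ordinary least squares on the natural logarithm of the concentration, which is a common way to fit an exponential trend but is not confirmed by the text. The names `fit_exponential` and `concentration` are hypothetical.

```python
import math

# Average potentials (mV) and known chloride concentrations (%) from Figure B.1.
potentials_mv = [-51.3, -34.8, -16.9, 40.2, 70.1]
concentrations_pct = [1.3, 0.65, 0.33, 0.03, 0.01]


def fit_exponential(xs, ys):
    """Fit y = prefactor * exp(rate * x) by least squares on ln(y)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (lg - mean_log) for x, lg in zip(xs, logs))
    rate = sxl / sxx
    prefactor = math.exp(mean_log - rate * mean_x)
    return prefactor, rate


prefactor, rate = fit_exponential(potentials_mv, concentrations_pct)


def concentration(potential_mv):
    """Chloride concentration (%) read from the fitted curve."""
    return prefactor * math.exp(rate * potential_mv)


print(round(prefactor, 4), round(rate, 4), round(concentration(-40.0), 2))
```

Under this assumption the fitted prefactor agrees with the 0.1622 quoted in the example, and the concentration at -40 mV comes out near 0.82 %, consistent with the value read from the curve.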

ELECTRODE: JACQUES-CARTIER
DATE: MAY 26, 1995
SAMPLES: JK-01 JK-82

Trial 1 through Trial 10 (individual readings not legible)

VALUES TO BE PLOTTED:

Potential (mV)    Chloride Concentration (%)
-51.3             1.3
-34.8             0.65
-16.9             0.33
 40.2             0.03
 70.1             0.01

Figure B.1 Potentials Obtained for Calibrating Solutions: Series 1

[Plot: SERIES 1, Chloride Concentration (%) vs. Potential]

Figure B.2 Calibration Curve for Series 1

APPENDIX C: TRIALS FOR REPRODUCIBILITY

Trial runs were performed on the various soil samples in order to establish a complete procedure for testing subsequent soil samples. Soil sample No. 123 is presented for analysis. The corrosion rate was obtained twice, through two independent sets of tests. Each test is composed of two parts: the Tafel test, from which the values of $\beta_a$ and $\beta_c$ are obtained, and the Linear Polarization test, from which the corrosion rate is obtained. The results indicate that the corrosion rate of a metal sample placed in soil No. 123 is equal to 0.119 mm/yr. This result was obtained from both Trial No. 1 and Trial No. 2. The results of each test are presented in Figures B.3 through B.6. As can be seen, the results obtained from the two trial runs are very similar, indicating that the procedure followed yields reproducible results.

Figure B.3 Trial No. 1: Tafel Results

Figure B.4 Trial No. 1: Linear Polarization Results

Figure B.5 Trial No. 2: Tafel Results

Figure B.6 Trial No. 2: Linear Polarization Results

APPENDIX D: PRINCIPLES OF REGRESSION ANALYSIS

This appendix presents the techniques used in analyzing the data presented in Chapter 3: Procedures and Apparatus, and includes the following:

• Data exploration
• Simple linear regression analysis
• Data transformation
• Multiple variable regression
• Categorical data
• Outliers
• Variable Selection
• Model Validation
• Power
• The SAS Statistical Package

D.1

Data Exploration
Prior to beginning statistical analysis of the data using sophisticated computer packages and advanced statistical techniques, it is very important to be familiar with the data set. Each variable should be examined individually, and the following quantities should be obtained for each [26,29]:

• the usual descriptive statistics: number of data points, mean, standard deviation, variance, skewness, etc.,
• quantiles including the median,
• stem-and-leaf diagram and box-and-whisker plot, and
• the normal probability plot.

Quantities such as the number of data points, the mean, and the standard deviation of a variable are easily calculated; they provide considerable information about the data. They are also the values that will be needed in subsequent calculations, e.g. the number of data points, N, is a quantity that plays an important part in almost every statistical calculation: it is used to determine significance in the ANOVA table, it is necessary to calculate the power of a statistical test, and it plays a key role in selecting the number of variables that will make up an equation.

The quantiles obtained (median, upper and lower hinges, etc.) provide considerable information about the range of values of a given variable. Are the values all within a limited range, or are they spread out? Are there any values that are remarkably different from the general trend? Quantiles are also used to calculate the range outside which a value is considered an outlier. The identification of outliers is extremely important in statistical analysis.

Stem-and-leaf diagrams, such as the one presented in Figure D.1, are constructed with the values of a given variable, and can help summarize the distribution of the data in a visual way, which is usually easier to understand [26,29]. Furthermore, they make the calculation of the quantiles, quantile ranges and outliers quick and easy.

[Stem-and-leaf display; multiply Stem.Leaf by 10**-1]

Figure D.1 Stem and Leaf Diagram
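
The data exploration steps described above can be sketched in a few lines. The example below computes the basic descriptive statistics, the median and hinges, and flags values lying beyond 1.5 hinge-spreads from the hinges. The 1.5 multiplier is the usual box-plot convention and is an assumption here, since the text does not state the rule used. The sample values are hypothetical and are not taken from the soil data set.

```python
from statistics import mean, median, stdev


def hinges(values):
    """Return (lower hinge, median, upper hinge).

    Each half includes the middle value when the count is odd.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = (n + 1) // 2
    return median(ordered[:half]), median(ordered), median(ordered[n - half:])


def outlier_fences(values, k=1.5):
    """Range outside which a value is flagged (k = 1.5 is assumed)."""
    lower, _, upper = hinges(values)
    spread = upper - lower
    return lower - k * spread, upper + k * spread


def explore(values):
    """Collect the quantities listed for each variable."""
    lower, mid, upper = hinges(values)
    low_fence, high_fence = outlier_fences(values)
    return {
        "n": len(values),
        "mean": mean(values),
        "stdev": stdev(values),
        "median": mid,
        "hinges": (lower, upper),
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


# Hypothetical readings for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]
summary = explore(sample)
print(summary)
```

A flagged value is a candidate for closer study rather than automatic removal, in line with the treatment of outliers elsewhere in this report.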

It is almost always a good idea to display numerical information graphically where possible. The stem-and-leaf diagram is an excellent tool for the graphical display of the distribution of the data, but it contains more information than is often needed. The box and whisker plot aims to display the elementary information (median, hinges and outliers) graphically [26,29]. A box and whisker plot is illustrated in Figure D.2.

[Stem-and-leaf display with box-and-whisker plot; multiply Stem.Leaf by 10**-1]

Figure D.2 Stem and Leaf Diagram and Boxplot

The normal probability plot is a graphical display that permits the analyst to determine if the data values are distributed normally. The observations are arranged in increasing order of magnitude and then plotted against expected normal distribution values. The plot should resemble a straight line if normality is tenable [26,29]. Figure D.3 shows a normal probability plot.

Figure D.3 Normal Probability Plot
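
The construction of a normal probability plot can be sketched as follows. The observations are sorted and paired with expected standard normal values, and the straightness of the resulting points is summarized by their correlation. The plotting positions (i - 0.375)/(n + 0.25) are one common convention and are an assumption here, as are the function names and the illustrative data.

```python
import math
from statistics import NormalDist, mean


def normal_scores(n):
    """Expected standard normal values for ranks 1..n (assumed convention)."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def plot_points(values):
    """Pair each ordered observation with its expected normal value."""
    ordered = sorted(values)
    return list(zip(normal_scores(len(ordered)), ordered))


def straightness(values):
    """Correlation of the plotted points; close to 1 suggests normality."""
    points = plot_points(values)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative data only: one set built to be normal-like, one skewed.
near_normal = [10.0 + 2.0 * z for z in normal_scores(15)]
skewed = [math.exp(z) for z in normal_scores(15)]
print(straightness(near_normal), straightness(skewed))
```

The skewed set bends away from a straight line and gives a lower correlation, which is the visual cue the plot is meant to provide.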

D.2

Simple Linear Regression Analysis

In simple terms, linear regression analysis is the action of fitting a straight line to the data. Simple regression analysis involves a dependent variable, Y, and an independent variable, X. For each value $x_i$ observed, there is a corresponding value of $y_i$. The goal is to derive an equation that will link the values of $x_i$ and $y_i$, with the least amount of error possible. This is better understood graphically. Figure D.4 shows a plot of Y vs. X. It is our goal to find the line which runs through these points such that the vertical distances between the line and the y values are minimized, i.e. the values of $e_i$ are minimized. The equation relating the values of each observation of y and x is of the following form:

$$y_i = (\beta_0 + \beta_1 x_i) + e_i$$ (D.1)

where $(\beta_0 + \beta_1 x_i)$ is the portion of $y_i$ predicted by the straight line, and $e_i$ is the portion of $y_i$ that the straight line fails to predict, which is the error or the residual. The term $\beta_0$ represents the intercept, i.e. the value of y at the point where the straight line meets the Y-axis. The term $\beta_1$ represents the slope of the straight line.

Figure D.4 Y vs. X Plot

How can one determine if the line represents a good estimate of the relationship between X and Y? The answer lies in the study of the residuals, $e_i$. The method most commonly used to calculate the magnitude of the error of the equation is to add the squared values of each of the individual errors. This value is referred to as the Error Sum of Squares, or SSE:

$$SSE = \sum e_i^2$$ (D.2)

It should be noted that when the errors are squared, the effect of the larger values is emphasized. The result of this is that outliers, whose residuals are high, can have an enormous effect on the value of SSE and, consequently, on the best fit line. It is, therefore, important that outliers not be ignored, but studied closely. Outliers will be discussed later. For the case when the line is to be chosen such that the SSE is minimized, there exists a closed form solution for the parameters $\beta_0$ and $\beta_1$, given by [26,29]:

$$\beta_1 = \frac{\sum(xy) - \frac{(\sum x)(\sum y)}{N}}{\sum x^2 - \frac{(\sum x)^2}{N}}$$

$$\beta_0 = y' - \beta_1 x'$$

where
x' = the mean of the x values,
y' = the mean of the y values,
N = the number of observations of x and y,
$\sum(xy)$ = the sum of the product of x and y for the N observations,
$\sum x$ = the sum of the x values for the N observations,
$\sum y$ = the sum of the y values for the N observations, and
$\sum x^2$ = the sum of the $x^2$ for the N observations.

Besides SSE, other quantities are computed to determine the measure of fit of a line. The variance of the estimate, $S_{y|x}^2$, represents the average of the squared errors, $\{\sum e_i^2/(N-2)\}$, and the standard error of the estimate, $S_{y|x}$, is simply the square root of the variance [26,29].

Up to this point, it has been assumed that the variable X has an influence on the value of Y, and an equation has been chosen which includes the variable X. It is not always the case that a variable X provides any information about the behavior of the variable Y.

Figure D.5 illustrates such a case. The slope of the line, $\beta_1$, is very small or even insignificant. In fact, the straight line which best represents the points seems to be the horizontal line which runs through the mean value of y, y'. It will not always be so obvious that a variable X is insignificant in predicting Y. So, how can one determine if the variable X is significant? This is done by comparing the SSE of the model including X to the SSE of the model based only on the mean value of y. This last model is referred to as the benchmark model. The equation relating X and Y using the benchmark model based only on the mean value of y, i.e. the horizontal line, is the following:

$$y_i = \beta_0 + e_i$$

where $\beta_0$ is equal to y'. The value of the error sum of squares is referred to, in the case of this model only, as SSY.

Figure D.5 Example of an Insignificant Predictor
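
The quantities introduced so far in this section can be computed directly. The sketch below fits the least squares line from the sums of x, y, xy and x squared, then evaluates SSE for the fitted line and SSY for the benchmark model based only on the mean of y. The data are hypothetical and chosen only to make the arithmetic easy to follow; the function names are illustrative.

```python
def fit_line(xs, ys):
    """Closed form least squares estimates of the intercept and slope."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


def error_sum_of_squares(xs, ys, intercept, slope):
    """SSE: sum of squared residuals about the fitted line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def benchmark_sum_of_squares(ys):
    """SSY: sum of squared residuals about the mean of y."""
    y_mean = sum(ys) / len(ys)
    return sum((y - y_mean) ** 2 for y in ys)


# Hypothetical observations for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

b0, b1 = fit_line(xs, ys)
sse = error_sum_of_squares(xs, ys, b0, b1)
ssy = benchmark_sum_of_squares(ys)
variance = sse / (len(xs) - 2)      # variance of the estimate
std_error = variance ** 0.5         # standard error of the estimate
print(b0, b1, sse, ssy, std_error)
```

For these values the fitted line leaves a much smaller error sum of squares than the benchmark model, which is the comparison that the significance tests in this section formalize.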
A measure of fit that is very commonly used is the squared multiple correlation, $R^2$. This value represents how well a model predicts Y in comparison to the benchmark model. $R^2$ is calculated as follows [26,29]:

$$R^2 = \frac{SSY - SSE}{SSY}$$ (D.6)

The value of $R^2$ is never negative, and ranges between 0 and 1. For example, $R^2$ = 0.10 means that taking X into consideration in predicting Y will result in a 10 % decrease in the error sum of squares, i.e. a 10 % improvement in the prediction of Y.

However, considering that this improvement is compared to a model which is based on nothing but the mean, it does not appear to be such an important improvement. Is the improvement important enough to consider the variable X significant? This significance is determined by examining another important parameter in statistics: the F-ratio.

Prior to introducing the F-ratio, it should be mentioned that the benchmark model used in this computation does not have to be based only on y'. A model being tested can be compared to any model that is a submodel of itself. This simply means that the model being tested is an extension of the benchmark model. For example, if it is required to prove whether or not a variable X3 can be added significantly to a model which includes the variables X1 and X2, then the benchmark model is the one that contains X1 and X2, and the model to be tested is the one containing all three. From this point onward, this benchmark model will be termed the ω-model, and the model being tested the Ω-model.

Like R², the F-ratio is a measure of the improvement of one model over another; however, the F-ratio takes into account the number of variables that were added to obtain this improvement. For example, it is certainly better if an investment of $100 yielded a return of $10 000, rather than if this return were obtained from an investment of $1000. The investment made can be compared to the variables added to obtain a better prediction. It is preferable that fewer variables be added and, as such, the significance of a model must be determined by taking into account the number of variables added. This is done by including the degrees of freedom in the equation. The degrees of freedom of a model are equal to the number of observations, N, minus the number of parameters being fitted by the model. The value of the F-ratio is calculated as follows [26,29]:

F = [ (SSE(ω) - SSE(Ω)) / (DOF(ω) - DOF(Ω)) ] / [ SSE(Ω) / DOF(Ω) ]

The ideal situation is a large drop in the SSE accompanied by a small drop in the degrees of freedom. This will result in a large F-ratio. F = 1 is considered the baseline performance. If the ratio is close to 1, this signifies that almost no significant improvement was made, i.e. there has been no return on the investment made. Table D.1 shows the critical values of F that must be obtained in order to consider the Ω-model significant. The term df for Numerator refers to the drop in the degrees of freedom in going from the ω-model to the Ω-model. The term df error refers to the degrees of freedom of the Ω-model. If the F-ratio calculated is larger than the appropriate critical value, the Ω-model is considered significant [26,29].

A tool that is used to allow the analyst to quickly calculate the values of R², F, and S(y|x), and check all the relevant information at a single glance, is the analysis of variance table, or ANOVA table.

Figure D.6 shows a typical ANOVA table. The degrees of freedom and SSE of the ω-model and the Ω-model are entered into the table. The degrees of freedom and SSE of the Diff-model, i.e. the difference model, are obtained by subtracting the entries of the Ω-model from those of the ω-model. The mean squares (MS) for each model can be obtained by dividing the SSE by the degrees of freedom, and the value of Sy|x is obtained by taking the square root of the mean squares. The R² value is obtained by dividing the SSE of the Diff-model by the SSE of the ω-model, and the F-ratio is obtained by dividing the MS of the Diff-model by the MS of the Ω-model. The ANOVA table permits rapid calculation of the relevant parameters, and presents the information in an organized format.

(1) DOF(Diff) = DOF(ω) − DOF(Ω)
(2) SSE(Diff) = SSE(ω) − SSE(Ω)
(3) MS(Diff) = SSE(Diff) / DOF(Diff)
(4) MS(Ω) = SSE(Ω) / DOF(Ω)
(5) MS(ω) = SSE(ω) / DOF(ω)
(6) F = MS(Diff) / MS(Ω)
(7) R² = SSE(Diff) / SSE(ω)

Figure D.6 ANOVA Table
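The entries of the ANOVA table can be sketched in a few lines of Python. This is only an illustration of relations (1) to (7); the function name `anova_compare` and its argument names are hypothetical, and the input values are the degrees of freedom and SSE quoted for the automobile example of Section D.4.

```python
import math


def anova_compare(sse_small, dof_small, sse_full, dof_full):
    """Compare a benchmark (omega) model with a larger (Omega) model.

    Returns the quantities that would be entered in the ANOVA table.
    """
    dof_diff = dof_small - dof_full      # relation (1)
    sse_diff = sse_small - sse_full      # relation (2)
    ms_diff = sse_diff / dof_diff        # relation (3)
    ms_full = sse_full / dof_full        # relation (4)
    ms_small = sse_small / dof_small     # relation (5)
    return {
        "dof_diff": dof_diff,
        "sse_diff": sse_diff,
        "ms_diff": ms_diff,
        "ms_full": ms_full,
        "ms_small": ms_small,
        "F": ms_diff / ms_full,          # relation (6)
        "R2": sse_diff / sse_small,      # relation (7)
        "s_yx_full": math.sqrt(ms_full), # standard error of the Omega-model
    }


# omega: Intercept + W (SSE = 30, DOF = 43); Omega: Intercept + W + L (SSE = 29, DOF = 42)
result = anova_compare(sse_small=30.0, dof_small=43, sse_full=29.0, dof_full=42)
print(round(result["F"], 2), round(result["R2"], 3))
```

The resulting F-ratio must still be compared with the critical value read from a table such as Table D.1; the sketch does not look that value up.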

Table D.1 Critical Values for F [26]

D.3 Data Transformations

Linear regression is used to calculate the best-fit equation relating a set of x variables to a set of y variables, i.e. β0, β1, and ei are determined. Once this equation is obtained, the next step is to determine if this model is significant. When a model is being tested for significance, there are certain assumptions that must be checked in order to ensure that the result obtained is credible. If these assumptions do not hold true, then the results obtained from the test cannot be depended upon. The following three assumptions must be checked [26, 29]:
• The set of residuals for all x values is normally distributed, with a mean value of zero and a standard deviation of σ.

As can be seen in Figure D.7, when all the individual residuals, ei, are ordered and plotted, they must exhibit normality. The distribution has a mean of zero and a standard deviation of σ, with only about 5% of the residuals falling outside of 0 ± 2σ. A tool which is very helpful in determining normality is the normal probability plot, as discussed in Section D.1.

Figure D.7 Normal Distribution

• For each individual x value, the y values are normally distributed about the regression line, with a standard deviation of σ.

As can be seen in Figure D.8, the y values must be distributed normally for each value of x. Figure D.9 shows an example of data failing to meet this criterion. Again, the normal probability plot is a useful tool in determining normality of the y values.

Figure D.8 Normally Distributed Y values

Figure D.9 Y Values Not Distributed Normally

• The residuals are distributed independently from one another.

This statement implies that the residuals behave independently of one another, e.g. the reason that one point has a high residual has nothing to do with the fact that another point has a high residual. However, this is often not true. Most commonly, there is another factor, not yet accounted for, which links the two phenomena.

The first two assumptions usually go hand in hand. If one holds true, usually the other will as well. When the raw data is received, Y is usually plotted versus X and the characteristics of the resulting curve are studied. The ideal situation is that the plot resembles the one presented in Figure D.10a. Ideally, the relationship between X and Y is a linear one. However, this is not often the case. When a non-linear relationship exists, such as those presented in Figures D.10b and D.10c, linear regression cannot be applied to the raw data. Fortunately, with an appropriate transformation most relationships can be made linear. The most commonly used transformation is the logarithm of the data, either of the x variable, or the y variable, or both, if necessary. It is generally agreed that the data that responds best to this transformation is data representing physical magnitudes such as weight, temperature, concentration, length, etc. Furthermore, the data must be nonnegative, with values which are not very close to zero. Other transformations include the following [26, 29]:

• reciprocal, where xi becomes 1/xi: usually for physical measurements,
• square root, where xi becomes √xi: usually for frequencies,
• arcsine, where xi becomes arcsin √xi: usually for proportions, and
• log odds, where xi becomes log{xi / (1 − xi)}: usually for proportions, where no 0 or 1 values are present.

Any data can be manipulated to eventually appear linear; however, the key to a simple, effective transformation is knowledge of the phenomenon being studied.

Figure D.10a Ideal Y vs. X Distribution

Figure D.10b,c Non-linear Relationships Between X and Y
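The transformations listed above can be written as short Python functions. The sketch below also shows, on a purely hypothetical exponential data set, how taking the logarithm of y can turn a curved relationship into a linear one; the helper names and the data are assumptions made for illustration and do not come from the project data.

```python
import math


def reciprocal(x):
    return 1.0 / x


def square_root(x):
    return math.sqrt(x)


def arcsine_sqrt(p):
    # Intended for proportions, 0 <= p <= 1.
    return math.asin(math.sqrt(p))


def log_odds(p):
    # Intended for proportions strictly between 0 and 1.
    return math.log(p / (1.0 - p))


def pearson_r(xs, ys):
    """Simple correlation coefficient between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data following y = 2 * exp(0.5 * x): curved in y, linear in log(y).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

r_raw = pearson_r(xs, ys)
r_log = pearson_r(xs, [math.log(y) for y in ys])
print(r_raw, r_log)
```

In this constructed case the correlation of x with log(y) is essentially 1, whereas the correlation with the untransformed y is lower, which is the behavior the logarithmic transformation is meant to achieve.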

D.4 Multiple Variable Regression

Multiple variable regression involves one dependent variable, Y, and two or more independent variables, X1, X2, etc. The equation relating Y to the X's is of the following form:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + e_i \qquad \text{(D.8)}$$

The value of the error sum of squares, SSE, is calculated according to the following equation:

$$SSE = \sum_i \left( y_i - \beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} - \dots - \beta_k x_{ik} \right)^2 \qquad \text{(D.9)}$$
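Equations D.8 and D.9 can be illustrated with a small least-squares sketch. It assumes the usual normal-equations solution, (XᵀX)β = Xᵀy, solved here by Gaussian elimination; the function names `solve` and `fit` and the data set are hypothetical, and the project itself relied on SAS for these calculations.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(rows, y):
    """Least-squares fit with an intercept; returns (betas, SSE) as in D.8 and D.9."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    betas = solve(xtx, xty)
    sse = sum((yi - sum(b * v for b, v in zip(betas, row))) ** 2
              for row, yi in zip(X, y))
    return betas, sse


# Hypothetical observations: two predictors (x1, x2) and a response y.
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
y = [9.1, 7.9, 19.2, 17.8, 29.0]

betas_both, sse_both = fit(rows, y)
betas_one, sse_one = fit([[r[0]] for r in rows], y)
print(sse_one, sse_both)
```

Fitting the same response with one predictor and then with both shows that the SSE cannot increase when a variable is added; whether the decrease is significant is a separate question, answered by the F-ratio.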

The parameters of the equation are obtained easily by making use of certain basic principles of matrix algebra. These lengthy calculations are reserved for software packages such as SAS, which will be introduced in a later section.

The concepts presented in Section D.1 on simple linear regression also apply to multiple variable regression. Although more difficult to visualize, multiple regression with two independent variables can be thought of as fitting a plane through a set of points in three-dimensional space, or a hyperplane in a space of higher dimension. The residual can be thought of as the distance in the y-direction between a point in space and the fitted surface. As in simple regression, the SSE is a measure of the accumulated error of a model predicting Y. In multiple regression, SSE is used to determine which combination of variables best predicts Y. This can be best explained using an example:

The gas consumption (Y) of 45 automobiles is studied. The independent variables considered to best predict gas consumption are the weight of the automobile (W) and the automobile length (L). The analyst must determine which of the two variables best predicts the gas consumption, and whether or not both variables should be used together. The analyst begins by obtaining the equation, and the SSE, of all the possible combinations of L, W, and the intercept. The following results were obtained:

Table D.2 Information about Possible Models

The results indicate that the best one-variable model is the one consisting of only the weight, because it has the smaller SSE. The best two-variable model is the one consisting of weight + intercept. Finally, the best (and only) three-variable model is the one consisting of all three variables. It is obvious that the lowest SSE is obtained for the model in which all three variables are involved. This will always be the case. However, the analyst must decide whether or not adding a variable produces a decrease in the SSE which is significant. The first step is to decide between the one-variable model and the two-variable model. The following ANOVA table shows all the relevant information:

The F-ratio obtained is larger than the critical F-ratio; therefore, the addition of the weight is significant and the two-variable model is retained. The next step is to compare the two-variable model with the three-variable model to determine if the variable L is significant. The following ANOVA table shows all of the relevant information:

Model                      DOF    SSE    MS      F
Difference                   1      1    1       1.45 < F critical
Intercept + W + L (Ω)       42     29    0.69
Intercept + W (ω)           43     30    0.70

The F-ratio obtained is smaller than the critical F-ratio; therefore, the addition of L is not significant. This means that the best model to predict gas consumption is the one containing only weight and the intercept. It should be noted that if the one-variable model containing only the intercept had been used initially, and the significance of adding L to the model had been checked, the answer would have been affirmative. Continuing the exercise to check whether adding W to this two-variable model would be significant, it would have been found that it is not. The conclusion would have been that the model consisting of automobile length and the intercept was the best model.

An explanation of the significance of the various models follows. When two independent variables, such as L and W, provide redundant information, it is easy to understand that only one of the two variables will be needed in the model. One of the two might be better than the other, as W is in this case, but in the absence of this variable, the second one may provide almost as much information. This concept is called multicollinearity, and it can be better understood with the aid of the Venn diagrams presented in Figure D.11.

Figure D.11a shows the case of one independent variable, X. If each of the two circles is of unit area, the shaded area represents R², or R²yx, i.e. the proportion of the variance in Y that can be explained by the variable X. In the case when X is insignificant in predicting Y, the shaded area is very small. Conversely, the higher the correlation between X and Y, the larger the shaded area.

Figure D.11b shows the case when two independent variables are involved. The total proportion of the variance in Y that can be explained by the two variables, X1 and X2, is equal to the total shaded area. This value is called the squared multiple correlation, R²y.12. The proportion of the variance of Y accounted for by X2, with X1 partialled out, is indicated by the shaded area in Figure D.11c. This represents the extra information that X2 provides when X1 is already in the equation. It is referred to as the squared partial correlation of X2 with Y, with X1 partialled out, R²y2.1. It is easy to see that the less X1 and X2 overlap, the higher the usefulness of each of the two variables, and the larger the proportion of the variance of Y accounted for on an overall basis [26].

Figure D.11 Venn Diagrams for 1 and 2 Independent Variables [26]

A good tool for examining the extent of overlapping is to study the correlation matrix of a set of variables, including the dependent and the independent variables. One such matrix is presented in Table D.3. The ideal situation is to have high correlations between the Y variable and each of the X variables, and to have low correlations between the X variables themselves. This will most probably result in a large part of the variance of Y being accounted for, i.e. a large growth in R² as each of the variables is added to the model.

Table D.3 Correlation Matrix
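A correlation matrix of the kind shown in Table D.3 can be sketched as follows. The variable names and values are hypothetical and were chosen only so that Y correlates strongly with X1 and weakly with X2; they are not the values of Table D.3.

```python
import math


def pearson_r(a, b):
    """Simple correlation coefficient between two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def correlation_matrix(columns):
    """Return a nested dict: matrix[row][col] = correlation of the two variables."""
    names = list(columns)
    return {r: {c: pearson_r(columns[r], columns[c]) for c in names} for r in names}


# Hypothetical data set: one dependent variable and two independent variables.
data = {
    "Y": [2.1, 3.9, 6.2, 7.8, 10.1],
    "X1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "X2": [5.0, 3.0, 6.0, 2.0, 4.0],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, [round(v, 2) for v in row.values()])
```

Reading the off-diagonal entries between the X variables gives a first impression of how much they overlap before any model is fitted.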

D.5 Categorical Variables

The techniques studied so far take only discrete (numerical) variables into account. Discrete variables represent a specific quantity, e.g. a pH of 7.4 or a chloride content of 4763 ppm. Parameters such as the mean, standard deviation, and SSE can be calculated for such variables, and a linear equation can be determined. But how can this be achieved for categorical variables, which represent a category instead of a specific quantity, e.g. soil type (sand, clay, or sand/clay) or sulfide content (positive, trace, or negative)? This is achieved by "expanding" the categorical variable into an appropriate number of dummy variables. This can be best explained with an example.

The variable Soiltype is a categorical variable with the following classes: sand, clay, and sand/clay. In its present form, the variable cannot be studied in the same way as the discrete variables. For this to be possible, categorical variables such as this one must be "expanded" into a set of dummy variables. The number of dummy variables to be created will depend on the number of classes. In this case three classes exist and the analyst can choose to use either two or three variables. In the two variable case, the dummy variables will assume the following values [26]:

Table D.4 Dummy Variables for Soiltype: 2 Variable Case

In this case, the class 'clay' is considered the reference category against which the behavior of 'sand' and 'sand/clay' is assessed. In the three variable case, no reference category exists, and the dummy variables will assume the following values:

Soiltype      Sand    Sand/Clay    Clay
Sand            1         0          0
Sand/Clay       0         1          0
Clay            0         0          1

Table D.5 Dummy Variables for Soiltype: 3 Variable Case
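The expansion of Soiltype into dummy variables can be sketched in Python as follows. The function `dummy_code` and its `reference` argument are hypothetical names introduced for illustration: passing a reference class gives the two variable case (with 'clay' as the reference category), and omitting it gives the three variable case.

```python
CLASSES = ["sand", "sand/clay", "clay"]


def dummy_code(value, classes=CLASSES, reference=None):
    """Expand one categorical value into a list of 0/1 dummy variables.

    If `reference` is given, that class gets no dummy variable of its own
    and is coded as all zeros.
    """
    if value not in classes:
        raise ValueError(f"unknown class: {value!r}")
    kept = [c for c in classes if c != reference]
    return [1 if value == c else 0 for c in kept]


# Two variable case: 'clay' is the reference category.
for soil in CLASSES:
    print(soil, dummy_code(soil, reference="clay"))

# Three variable case: one dummy variable per class, no reference category.
for soil in CLASSES:
    print(soil, dummy_code(soil))
```

Each observation is thus replaced by a short row of 0's and 1's that can enter the regression like any other numerical variable.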

The difference between the discrete and categorical variables is the effect they have on the final equation relating Y to the X's. For example, if the discrete variable 'pH' and the categorical variable 'Soiltype' are used to predict the variable 'CorrRate', the following overall equation would result (for the model with three dummy variables):

$$\text{CorrRate} = \beta_0 + \beta_1\,\text{pH} + \beta_2\,\text{Sand} + \beta_3\,\text{Sand/Clay} + \beta_4\,\text{Clay} \qquad \text{(D.10)}$$

The variable pH has an effect on both β0 and β1, i.e. on the intercept and the slope of the line. However, the dummy variables can be viewed as having a direct effect on only the intercept because they assume a value equal to 0 or 1. For example, for a sand the equation would become:

$$\text{CorrRate} = (\beta_0 + \beta_2) + \beta_1\,\text{pH} \qquad \text{(D.11)}$$

In essence, the term β2 represents the 'jump' in CorrRate resulting from the soil being a sand. Similarly, β3 and β4 represent the jumps resulting from a sand/clay and clay, respectively.

The technique of creating dummy variables for coding categorical variables is used to extend the use of multiple regression analysis to include variables that could not be included otherwise. Other techniques are also available, e.g. interaction variables, and non-linear combinations relating x and y. These methods did not prove to be useful in this project, but could be beneficial in future research on the subject. References 29 and 30 should be consulted for further information on these techniques.

D.6 Outliers

Outliers are data points that split off, or are very different from the rest of the data. They can occur for two fundamental reasons: (1) a data recording or entry error was made, or (2) the subjects are simply different from the rest. The first type of outlier can be identified by always listing the data and checking to ensure that the data has been entered accurately. The amount of time it takes to list and check the data for accuracy is well worth the effort, and the computer time is minimal.

Statistical procedures in general can be quite sensitive to outliers. This is particularly true for the regression techniques. It is very important to be able to identify outliers and then decide how to consider them, because the results of the statistical analysis must reflect most of the data, and not be highly influenced by just one or two errant points [26]. Outliers can have a very large effect on the correlation coefficient, R. Figure D.12 shows graphically how the inclusion of an outlier can drastically change the interpretation of the relationship between X and Y. In case A, there is no relationship without the outlier, but there is a strong relationship with the outlier. Conversely, in case B the relationship changes from strong, without the outlier, to weak when the outlier is included.

Figure D.12 Effect of Outliers on R² [26]

Besides the graphical method, outliers can be detected by studying z scores. For each variable being studied, the z score can be calculated as follows:

$$z_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j} \qquad \text{(D.12)}$$

where zij = the z score of observation i for variable j,
xij = the recorded value of observation i for variable j,
μj = the mean value of the observations of variable j, and
σj = the standard deviation of the observations of variable j.
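Equation D.12 can be applied to a single variable with a few lines of Python. The sketch assumes the sample standard deviation (as returned by `statistics.stdev`) and flags observations whose z score exceeds 3 in absolute value; the data values are hypothetical.

```python
import statistics


def z_scores(values):
    """z score of every observation of one variable (Equation D.12)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (an assumption)
    return [(v - mean) / sd for v in values]


# Hypothetical measurements of one variable; the last value is suspicious.
values = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10,
          11, 9, 10, 12, 8, 10, 11, 9, 10, 60]

scores = z_scores(values)
flagged = [i for i, z in enumerate(scores) if abs(z) > 3]
print(flagged)
```

With very few observations a single extreme value inflates the standard deviation so much that its own z score cannot reach 3, so the rule is more useful for moderately sized data sets such as this one.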

If the variable is approximately normally distributed, then z scores with absolute values near 3 should be considered as potential outliers. This is because, in a distribution which is normal, about 99% of the scores should lie within three standard deviations of the mean. Therefore, any z score larger than 3 in absolute value indicates a value very unlikely to occur. Of course, if the number of observations is large (say > 100), then simply by chance, it may be reasonable to expect a few subjects to have z scores of over three. However, the above rule is generally considered reasonable [26].

Up to this point, only outliers on the predictor variables, Xj, have been considered. The z scores can also be calculated for the residuals obtained when a model is fitted to the data. These standardized residuals are used for finding observations whose predicted y values are quite different from their actual y values, i.e. they do not fit the model well. As in the previous case, an observation whose standardized residual is greater than three in absolute value is considered an outlier [26].

Alternatively, an outlier can be defined as a point which, if deleted, produces a substantial change in at least one of the regression coefficients. That is, the prediction equations with and without the point are quite different. A quantity that measures this change is Cook's distance (CD). Unlike the z scores, which identify the outliers on Y or on the X's individually, Cook's distance measures the combined effect of a point being an outlier on Y and on the set of predictors. Cook and Weisberg (1982) indicate that a CDi > 1 would generally be considered large, and would therefore identify probable outliers [26].

Once the outliers are identified, a decision must be made on whether or not the errant point should be eliminated from the set. This action must not be taken lightly or without serious consideration. If one finds after further investigation of the outlying points that an outlier was due to a recording or entry error, then of course the appropriate correction should be implemented and the analysis must be repeated with the corrected data. Likewise, if the errant data is due to an instrumentation error, then it is legitimate to drop the outlier. However, if neither of these appears to be the case, then one should not drop the outlier, but report two analyses (one including the outlier and the other excluding it). Outliers should not necessarily be regarded as 'bad'. As a matter of fact, it has been argued that outliers can provide some of the most interesting cases for further research [26].
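Cook's distance can be sketched for the simple case of one predictor and an intercept. The code below assumes the standard closed form CDi = (ei² / (p·MSE)) · hi / (1 − hi)², where hi is the leverage of observation i and p = 2 is the number of fitted parameters; the function name and the data are hypothetical.

```python
def cooks_distances(xs, ys):
    """Cook's distance of each point for a straight-line fit y = b0 + b1 * x."""
    n = len(xs)
    p = 2  # fitted parameters: intercept and slope
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    mse = sum(e * e for e in residuals) / (n - p)
    leverages = [1.0 / n + (x - mx) ** 2 / sxx for x in xs]
    return [(e * e / (p * mse)) * h / (1.0 - h) ** 2
            for e, h in zip(residuals, leverages)]


# Hypothetical data: nine points close to y = x and one point far from the trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 20.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 8.1, 9.0, 5.0]

distances = cooks_distances(xs, ys)
influential = [i for i, d in enumerate(distances) if d > 1.0]
print(influential)
```

The last point is extreme on X and does not follow the trend of the others, so it dominates the fitted line; this combined effect is what Cook's distance is designed to capture.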

D.7 Variable Selection

The number and type of variables which should be included in a model need to be considered. Most of the methods of model selection are strongly based on the concepts of multicollinearity and semipartial correlations, which were introduced in Section D.4. Prior to introducing the techniques for model selection, it must be emphasized that the single most important tool in selecting a subset of variables for use in a model is knowledge of the area under study. Furthermore, it is important for the investigator to be judicious in the selection of predictors. If too many variables are used, the prospects of cross validation may be influenced negatively. The analyst can exercise his/her judgment in the creation of new variables from the existing ones. If, for example, the analyst knows that two different variables essentially measure the same thing, a new variable may be created by averaging them, or by adding the z scores of the two. An alternative is the removal of one of the variables from the set.

A quantity which measures the extent to which a variable provides redundant information is the variance inflation factor (VIF), which is based on the calculation of the correlation between the independent variables only. Each independent variable is regressed in turn against the remaining X's, and the correlation is obtained. A high correlation indicates that the remaining X variables account for a large amount of the variation in the variable under study. This means that the variable provides little information that the remaining variables do not already provide, i.e. it provides redundant information. It is suggested that a variable be removed if VIF > 10. Variables should be eliminated one at a time, and the new VIF values should be calculated prior to the removal of any subsequent variables [26].

The methods most commonly used to select a model are the forward, backward and stepwise selection procedures. All these procedures involve examining the contribution of a predictor with the effect of the other predictors partialled out, or held constant. Through the use of semipartial correlations, as was done in the ANOVA tables presented in Section D.4, the correlations among the predictors are disentangled and the unique variance of each predictor related to the variance of y is determined.

The automobile example of Section D.4 is a good example of the forward selection procedure. The first predictor that enters the equation is the one with the highest simple correlation with y. If this predictor is significant, the predictor with the largest semipartial correlation with y is considered next, etc. At some point, a given predictor will not be significant and the procedure will be terminated. In the forward selection procedure, once a variable enters the equation, it is not removed [26].

The stepwise procedure is basically a variation of the forward selection procedure. However, at each stage of the procedure, a test is made for the least useful predictor. The importance of each predictor is constantly reassessed, and a predictor that may have been the best entry candidate earlier may now be superfluous, and is removed.

The backward selection procedure involves the removal of predictors from an equation initially containing all the predictors. At each step, the partial F-ratio is calculated for every predictor. The smallest value is compared to the critical F-ratio and, if it is not significant, the corresponding variable is removed. The new equation is computed, and the process continues until all insignificant variables are removed.

The forward, stepwise and backward selection procedures do not necessarily propose the same final model. In general, the stepwise procedure is considered the best of the three methods because it verifies all of the variables at each step and removes the one(s) that are redundant. A mistake that is commonly made by analysts is to consider the final model proposed by these methods as the best model possible. This is not the case. The model proposed may be just one of many models which provide a good prediction for Y. For this reason, these methods are limited in their use.

The one technique that appears to offer the analyst the most choice in the model is the R-square procedure. This technique does not propose a model, but simply lists the 10 best combinations of one-variable, two-variable, three-variable models, etc., ranked according to their overall R² value. The analyst can then compare between the various combinations with high R² values, and is free to use his/her judgment in selecting the most reasonable model.

It is generally agreed that the number of variables, k, to be included in an equation depends on the number of observations, n, in the data set. The rule of thumb proposed is to choose k such that n/k > 10 [26].

Another criterion often used is Mallows' Cp. This measure was introduced by Mallows (1973) as a criterion for selecting a model. It measures the total squared error, and chooses the model(s) where Cp = p, where p = k + 1. For these models, the amount of underfitting and/or overfitting is minimized, i.e. there are neither too many nor too few predictors in the equation. Mallows' Cp is given in the output file created whenever a SAS program is used to propose a model [26].

It is suggested that all of the above methods be examined individually prior to deciding on a final model. However, ultimately it is the researcher's knowledge of the phenomena under study that will ensure that the best and most reasonable model is selected.
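Two of the quantities discussed in this section reduce to one-line formulas, sketched below under stated assumptions. The VIF of a variable is taken as 1 / (1 − Rj²), where Rj² is obtained by regressing that variable on the remaining X's, and Mallows' Cp is taken in its usual form SSEp / MSE(full) − (n − 2p). The function names are hypothetical; the SSE and degrees of freedom are those quoted for the automobile example of Section D.4, and the Rj² value is illustrative only.

```python
def vif(r_squared_j):
    """Variance inflation factor from the R-squared of X_j regressed on the other X's."""
    return 1.0 / (1.0 - r_squared_j)


def mallows_cp(sse_subset, mse_full, n, p):
    """Mallows' Cp for a subset model with p fitted parameters (p = k + 1)."""
    return sse_subset / mse_full - (n - 2 * p)


n = 45                    # automobiles in the example of Section D.4
mse_full = 29.0 / 42      # full model (Intercept + W + L): SSE = 29, DOF = 42

cp_full = mallows_cp(29.0, mse_full, n, p=3)    # full model
cp_subset = mallows_cp(30.0, mse_full, n, p=2)  # Intercept + W only

print(round(cp_full, 2), round(cp_subset, 2))
print(round(vif(0.90), 1))  # illustrative R_j squared of 0.90
```

For the full model Cp equals p by construction, so the criterion is informative only for the subset models, whose Cp values are compared with their own p.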

D.8 Model Validation

It is crucial for the researcher to obtain some measure of how well the regression equation will predict on an independent sample of data, i.e. can the equation be generalized? There are essentially three ways of validating a model: data splitting, computing the adjusted R², and the PRESS statistic.

Data splitting involves randomly splitting the available data into two parts (roughly 1/3 and 2/3). The regression equation is derived using the so-called derivation data (2/3), and then applied to the other set of data, the validation data. The predicted values of y for the validation data are compared with the recorded y values, and the correlation between the two sets is calculated. This correlation represents how well the equation works on an independent sample of data [26].

The adjusted R² value measures the shrinkage in predictive power. Shrinkage refers to the decrease in R² as it is measured in the sample with the equation derived from it, versus what it would be in the population as a whole using the same equation. Certainly the equation will not predict as well. The adjusted R² value estimates how well a prediction equation derived from one sample would work on the population, i.e. the theoretical sample consisting of all possible data points. It does not indicate how well the derived equation will predict for other samples from the same population. The adjusted R² value of the population is compared to the R² value of the sample, and the percentage of decrease is noted.

In many cases, there is not enough data to permit random splitting. One can still obtain a good measure of the predictive power by the use of the PRESS statistic (predicted residual sum of squares). In this approach, each observation is set aside in turn and a predictive equation is derived with the remaining data. This is done for each of the n observations, and as a result, n prediction equations are derived and n true errors are determined. The PRESS statistic is simply the sum of the squares of these errors. Unlike the SSE, the PRESS statistic is more representative of the true error because the equation of the line was obtained without the observation under study, i.e. the line was not fitted to this particular point prior to computing the error.
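The PRESS statistic can be sketched for a straight-line model by refitting the line n times, each time leaving one observation out. The sketch also includes the commonly used adjustment 1 − (1 − R²)(n − 1)/(n − k − 1) for the adjusted R²; this particular formula is an assumption, since the text does not state which adjustment was used. Function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope of y = b0 + b1 * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b1 * mx, b1


def sse(xs, ys):
    """Error sum of squares of the line fitted to all observations."""
    b0, b1 = fit_line(xs, ys)
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


def press(xs, ys):
    """Sum of squared prediction errors, each point predicted without itself."""
    total = 0.0
    for i in range(len(xs)):
        b0, b1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (b0 + b1 * xs[i])) ** 2
    return total


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors (assumed formula)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.2, 2.1, 2.8, 4.3, 4.9, 6.4, 6.8, 8.1]

print(sse(xs, ys), press(xs, ys))
print(adjusted_r2(0.50, n=11, k=1))
```

The PRESS value is never smaller than the SSE of the same model, which reflects the point made above: errors measured on observations that were not used in the fit are the more honest estimate.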

D.9 Power

Type I error, or the level of significance (α), is the probability of rejecting the null hypothesis when it is true, i.e. finding a variable to be significant when in fact it is not [26, 31]. The α level set by the experimenter is a subjective decision, but it is usually set at .05 or .01 to minimize the probability of making that kind of error. There is, however, another type of error that can be made in conducting a statistical test: type II error, denoted β, which is the probability of accepting the null hypothesis when it is false, i.e. finding a variable to be insignificant when in fact it is significant. Not only can either of these errors occur, but they are inversely related. An example of the two-group problem with 15 observations follows:

Table D.6 Relationship between α, β, and Power [26]

The entries in the last column, (1 − β), are called the power of the experiment, which is the probability of rejecting the null hypothesis when it is false, i.e. finding a variable to be significant when it is. Depending on the circumstance, power analysis can be undertaken before or after the data has been collected and analyzed. For example, if a researcher is going to invest a lot of time and money in carrying out a study, then he or she would certainly want to have a high power, i.e. a high probability of finding what they are looking for if it is really there. Alternatively, if a researcher has already completed a study and has found that a certain variable is insignificant, it is important to know whether or not the power was high enough. If the power was low, the chances of finding significance may have been too low, and as such, significance was not found even though it may have been there. A low power may lead the researcher to make false conclusions about the significance of a variable [26].
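The inverse relationship between α and β can be illustrated with a rough calculation. The sketch below is not the calculation behind Table D.6: it assumes a two-sided, two-group comparison treated with a normal approximation, a hypothetical standardized difference d between the group means, and n observations per group.

```python
import math
from statistics import NormalDist


def power_two_group(d, n_per_group, alpha):
    """Approximate power (1 - beta) of a two-sided two-group test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = d * math.sqrt(n_per_group / 2.0)
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical standardized difference between the two groups.
d = 0.5

for alpha in (0.10, 0.05, 0.01):
    power = power_two_group(d, n_per_group=15, alpha=alpha)
    print(alpha, round(1.0 - power, 2), round(power, 2))  # alpha, beta, power

# The same effect with a larger sample.
print(round(power_two_group(d, n_per_group=100, alpha=0.05), 2))
```

Lowering α raises β and therefore lowers the power, while increasing the sample size raises the power for the same α.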

The power of a statistical test depends on the following factors [26]:

• The α level set by the experimenter,
• The sample size n, and
• The effect size, i.e. to what extent the effect of the variable is observable.

Power is heavily dependent on the sample size. For example, for a medium effect size and an α = .05, the power of a test for different values of n is presented in Table D.7.

Table D.7 Relationship between n and Power [26]

As the above example suggests, when the sample size is large, power is rarely a problem. It is only when small sample sizes are evaluated that power can influence the results obtained.

The effect size is usually classified as small (f² ≈ 0.02), medium (f² = 0.15), or large (f² > 0.35) [31]. A large effect size is usually associated with a phenomenon which, when present, is very easy to detect. In general, the effect size of most phenomena is considered medium. The equation relating the sample size n, the effect size f², and the number of variables in the Ω-model, K, is [31]:

$$n = \frac{L}{f^2} + K + 1 \qquad \text{(D.13)}$$

where L is a parameter which depends on the α value chosen, on the difference in the number of variables between the Ω-model and the ω-model (kB), and on the power of the statistical test. L is obtained by consulting tables such as Table D.8 for α = .05 [31].

It is generally considered true that a sample size of 50 or larger is sufficient to detect a medium effect, i.e. the power of the test would be approximately 0.70.
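The sample size relation can be evaluated directly once L has been read from a table. The sketch below assumes the form n = L / f² + K + 1 and a medium effect size of f² = 0.15; the value L = 7.85 is an assumed illustrative entry (α = .05, one added variable, power of about 0.80) and should be replaced by the value actually read from Table D.8.

```python
import math


def required_n(L, f2, K):
    """Sample size from n = L / f2 + K + 1, rounded up to a whole observation."""
    return math.ceil(L / f2 + K + 1)


L_assumed = 7.85   # assumed table entry; read the real value from Table D.8
f2_medium = 0.15   # assumed medium effect size
K = 3              # hypothetical number of variables in the larger model

print(required_n(L_assumed, f2_medium, K))
```

Because f² appears in the denominator, halving the effect size roughly doubles the number of observations required for the same power.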

D.10 The SAS Statistical Package


The Statistical Analysis System (SAS) was selected for use in this project because [26, 29, 30]:

• It is very widely distributed,
• It is easy to use,
• It can be used for a very wide range of analyses, from very simple statistics to complex multivariate analyses, and
• It is a well documented package, having been in development and use for over two decades.

Essentially, the SAS program reads a file created by the analyst and performs the various analyses requested. Structurally, a SAS program is composed of three fundamental blocks: the statements setting up the data, the data lines, and a series of procedure (PROC) statements which describe the statistical analyses to be performed on the data entered [29, 30]. For a list of the procedures and a complete description, it is suggested that the reader refer to two volumes: the SAS/STAT User's Guide, Volumes 1 and 2 [29, 30]. The volume used most in this project is Volume 2, which contains the fundamental regression procedures. However, it is suggested that both volumes be consulted to fully understand the scope of this statistical package, and to become familiar with all of the possible techniques that may be used to analyze the data.

Table D.8 Values of L for α = .05 [31]