Table of contents
Abstract
Introduction
Objective
Method
  Data set
  Exploratory data analysis
  Statistical analysis
    Linear Mixed Models (LMM)
    Generalized Estimating Equations (GEE)
  Software
Results
  Exploratory data analysis
  Linear Mixed Models (LMM)
  Generalized Estimating Equations (GEE)
Discussion and conclusion
References
Abstract
Wood is often used to make fire, but it may be contaminated with dioxins. The extent of contamination affects the chromatogram (the light spectrum of the fire). In this experiment on wood fires, contamination is expressed as a fraction between 0.00 and 1.00, where 1.00 refers to wood that is 100 percent free of contamination and 0.00 to maximally contaminated material. The study consists of 42 trials: 7 trials at each of 6 contamination levels. Each chromatogram is divided into seven zones, and the abundance of the chromatogram within each zone is measured. The purpose of this work is to study how the chromatogram changes with the level of contamination. The data were analyzed using statistical methods that take the correlated nature of the measurements into account. The analysis shows a significant interaction between zone and contamination fraction: the effect of wood contamination on the area of the chromatogram differs by zone.
Introduction
Fire, which can be generated by burning wood, is the rapid oxidation of a material in the chemical process of combustion, releasing heat, light and various reaction products.(1) The flame, the visible portion of the fire, is a mixture of reacting gases and solids emitting visible, infrared and sometimes ultraviolet light, whose frequency spectrum depends on the chemical composition of the burning material and of the intermediate reaction products.(1) Burning wood and other solid particles produces the familiar red-orange glow that is visible to the human eye.(2) The visible spectrum is the part of the electromagnetic spectrum that the human eye can perceive. Its wavelengths, which are related to the frequency and energy of the light, range from 4 × 10^-7 to 7 × 10^-7 m; impurities in the burning material affect the colors produced within this range. The spectrum is continuous but is divided into seven areas, although there is no clear boundary where one color ends.(3) In most cases neighboring colors overlap, and the intensity of the colors in the different zones that make up the light spectrum can be studied.
Contamination affects both the intensity and the color of the flame produced by firewood. This report studies the effect of different levels of dioxin contamination in the wood on the different zones of the chromatogram, by comparing the emission in each zone of the chromatogram of contaminated wood with the control chromatogram (the chromatogram of uncontaminated wood).
Objective
The objective of this report is to study whether, and how, the chromatogram characteristics shift with wood fraction.
Method
Data set
The data come from 42 wood-fire trials that differ in the dioxin contamination fraction of the wood: 7 trials at each of 6 contamination levels. The contamination levels range from 0 (maximally contaminated) to 1 (uncontaminated wood). Each trial results in a chromatogram, which is divided into 7 zones. The integrated abundance of the chromatogram within each zone is quantified; it is referred to as area throughout this report.
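The design described above can be sketched in a few lines of Python. This is only an illustration of the data layout, not the code used for the report (the analyses were run in SAS); the six contamination fractions are the ones listed in the Results section, and the function and field names are hypothetical.

```python
from itertools import product

# Contamination fractions as listed in the report (1 = uncontaminated wood).
FRACTIONS = [0.0, 0.5, 0.666, 0.75, 0.875, 1.0]
REPLICATES = 7  # trials per contamination fraction
ZONES = 7       # zones per chromatogram


def build_layout():
    """Return one record per (chromatogram, zone); the area is left unset."""
    rows = []
    chromatogram_id = 0
    for fraction, _replicate in product(FRACTIONS, range(REPLICATES)):
        chromatogram_id += 1  # each trial yields one chromatogram
        for zone in range(1, ZONES + 1):
            rows.append({
                "chromatogram": chromatogram_id,
                "zone": zone,
                "fraction": fraction,
                "area": None,  # measured value, not available here
            })
    return rows


layout = build_layout()
n_chromatograms = len({row["chromatogram"] for row in layout})
print(n_chromatograms, len(layout))
```

The layout has 42 chromatograms and 42 × 7 = 294 rows, which matches the number of observations reported for the dataset.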
Exploratory data analysis
Exploratory data analysis was done to gain insight into the structure of the data, using graphs and tables.
Statistical analysis
In the statistical analysis, area is treated both as a continuous and as a binary response variable. Given the clustered nature of the data, classical methods such as ordinary linear regression and ordinary logistic regression are not appropriate. Methods that take the correlation structure of the data into account are used instead. Linear mixed models (LMM) were used to analyze area as a continuous response variable, and generalized estimating equations (GEE) were used to analyze area as a binary response variable. A significance level (alpha) of 0.05 was used in all analyses.
Linear Mixed Models (LMM)
An LMM (4) was used to study the effect of contamination fraction on area when area is considered a continuous response variable. Zone and contamination fraction are treated as continuous variables. An interaction between zone and contamination fraction was included to allow the effect of contamination fraction to differ between zones. A compound-symmetry covariance structure was used because we are mainly interested in the population-averaged effect rather than in subject-specific effects. The compound-symmetry model also allows a negative intraclass correlation. The compound-symmetry model is formulated as:
$$Y_{ij} = \beta_0 + \beta_1\,\text{zone}_{ij} + \beta_2\,\text{contamination fraction}_{i} + \beta_3\,\text{zone}_{ij} \times \text{contamination fraction}_{i} + \varepsilon_{ij}$$
$$\varepsilon_i \sim N(0, V_i), \qquad V_i = d\,J_{n_i} + \sigma^2 I_{n_i}$$
Here Y_ij is the area in zone j of chromatogram i, J_{n_i} is the n_i × n_i matrix of ones and I_{n_i} is the n_i × n_i identity matrix. σ² is the within-chromatogram (intrasubject) variability and d is the between-chromatogram (intersubject) variability. In a random-intercept model d is a variance and must be non-negative; in the compound-symmetry model d is a covariance between two measurements from the same chromatogram, so it may be negative, which allows a negative intraclass correlation.
Generalized Estimating Equations (GEE)
GEE (5) was used to study the effect of contamination fraction on area in the different zones when area is considered a binary response variable. Zone and contamination fraction are again treated as continuous variables, and an interaction between zone and contamination fraction was included in the model. GEE takes the dependency of observations within clusters (chromatograms) into account by specifying a working correlation structure. Two working correlation structures were considered: independent and exchangeable. To choose the most appropriate one, the empirical standard errors were compared with the model-based standard errors, and the structure for which the two were closest was chosen. This was the independent working correlation structure. The reported p-values are based on the empirical standard errors. The GEE model is formulated as:
$$\mathrm{logit}(\pi_{ij}) = \beta_0 + \beta_1\,\text{zone}_{ij} + \beta_2\,\text{contamination fraction}_{i} + \beta_3\,\text{zone}_{ij} \times \text{contamination fraction}_{i}$$
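To make the two working correlation structures concrete, the sketch below builds each one for the seven zones of a single chromatogram. It is an illustration only: the helper name is hypothetical, and the exchangeable correlation of 0.3 is a made-up value, not an estimate from the report.

```python
def working_correlation(n, structure, alpha=0.0):
    """Return an n x n working correlation matrix as nested lists.

    structure: "independent" (all off-diagonal entries 0) or
               "exchangeable" (all off-diagonal entries equal to alpha).
    """
    if structure == "independent":
        off_diagonal = 0.0
    elif structure == "exchangeable":
        off_diagonal = alpha
    else:
        raise ValueError(f"unknown structure: {structure!r}")
    return [[1.0 if i == j else off_diagonal for j in range(n)]
            for i in range(n)]


# Seven zones per chromatogram; alpha = 0.3 is a made-up illustrative value.
independent = working_correlation(7, "independent")
exchangeable = working_correlation(7, "exchangeable", alpha=0.3)
```

Under the independent structure the zones of a chromatogram are treated as uncorrelated when the estimating equations are set up, while the exchangeable structure assumes one common correlation for every pair of zones. In both cases the empirical (robust) standard errors are intended to remain valid when the working structure is wrong.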
Software
All analyses were done in SAS version 9.2.
Results
Exploratory data analysis
The dataset contains 4 variables: chromatogram, zone, area and contamination fraction. The variables are described in Table 1.
Table 1: Description of the variables in the dataset
Variable                 Description
Chromatogram             Indicator of the independent trials (subjects)
Zone                     Numerical indicator for each of the seven zones into which the chromatogram is subdivided
Area                     Integrated abundance of the chromatogram within a zone
Contamination fraction   Fraction of wood present (1.000 refers to uncontaminated material, 0.000 refers to maximal contamination)
Chromatogram is an indicator of the 42 independent trials, and zone is an indicator of the zones within each chromatogram. Zone ranges from 1 to 7 and represents the wavelength areas of the light spectrum, which is a continuous scale ranging from 4 × 10^-7 to 7 × 10^-7 m. Zone is therefore treated as a continuous variable throughout this study, under the assumption that the zone numbers follow the natural order of the wavelength areas. Area is the integrated abundance of the chromatogram within each zone. Contamination fraction indicates the level of dioxin contamination of the wood used for the fire. There are 6 contamination levels: 0, 0.5, 0.666, 0.75, 0.875 and 1, where 0 refers to maximal contamination and 1 to uncontaminated wood. Contamination fraction is also treated as a continuous variable in this study. The dataset consists of 294 observations from 42 independent chromatograms (7 zones per chromatogram). Observations from different zones within the same chromatogram are not independent. To explore area as a continuous variable, the mean area within each zone was calculated for each contamination level (Table 2).
Table 2: Average area per zone by wood fraction (contamination level)
Wood fraction   Zone 1   Zone 2   Zone 3   Zone 4   Zone 5   Zone 6   Zone 7   Overall mean
0                 2.99     1.63     8.92     7.05    26.84    21.67    30.90          14.29
0.5               5.91    11.14     8.68    14.98    24.50    15.04    19.76          14.29
0.666             6.02    11.84     7.66    15.78    20.73    17.14    20.83          14.29
0.75              7.59    15.18     8.00    18.78    20.44    14.45    15.56          14.29
0.875             7.00    16.66     6.70    20.60    16.29    16.25    16.50          14.29
1                 8.05    15.85     4.98    19.73    14.39    19.15    17.85          14.29
Mean              6.26    12.05     7.49    16.15    20.53    17.28    20.23          14.29
The areas over the 7 zones of a chromatogram sum to 100. Therefore the average area at each contamination level, ignoring zone, equals 100/7 = 14.29. The average area differs between zones. Figure 1 presents Table 2 graphically and shows the effect of contamination fraction on the average area within each zone. The average area within a zone changes with the contamination level, and the effect of contamination fraction is not the same in each zone. In some zones (such as zones 1, 2 and 4) the average area increases as the contamination fraction increases. In other zones (such as zones 5 and 7) the average area decreases as the contamination fraction increases. This may indicate an interaction between zone and contamination fraction.
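The arithmetic behind these statements can be checked directly from the means in Table 2. The sketch below copies the table values, verifies that each row sums to about 100 (so that every row mean is 100/7, about 14.29) and compares each zone between the two extreme contamination fractions. The variable names are hypothetical, and the endpoint comparison is a simplification of the profiles in Figure 1.

```python
# Mean area per zone (zones 1..7) by contamination fraction, copied from Table 2.
TABLE2 = {
    0.0:   [2.99, 1.63, 8.92, 7.05, 26.84, 21.67, 30.90],
    0.5:   [5.91, 11.14, 8.68, 14.98, 24.50, 15.04, 19.76],
    0.666: [6.02, 11.84, 7.66, 15.78, 20.73, 17.14, 20.83],
    0.75:  [7.59, 15.18, 8.00, 18.78, 20.44, 14.45, 15.56],
    0.875: [7.00, 16.66, 6.70, 20.60, 16.29, 16.25, 16.50],
    1.0:   [8.05, 15.85, 4.98, 19.73, 14.39, 19.15, 17.85],
}

# Row sums should be 100 up to rounding, so every row mean is about 14.29.
row_sums = {f: sum(areas) for f, areas in TABLE2.items()}
row_means = {f: round(total / 7, 2) for f, total in row_sums.items()}

# Change in mean area per zone between fraction 0 and fraction 1.
change = [high - low for low, high in zip(TABLE2[0.0], TABLE2[1.0])]
increasing = [zone for zone, c in enumerate(change, start=1) if c > 0]
decreasing = [zone for zone, c in enumerate(change, start=1) if c < 0]

print(row_means)
print("increasing:", increasing, "decreasing:", decreasing)
```

Between the extremes, zones 1, 2 and 4 gain area while zones 3, 5, 6 and 7 lose area. Because each row sums to 100, a gain in some zones has to be offset by a loss in others.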
Figure 1: Average profile for each zone at the different wood fraction levels
A binary response variable is created by comparing the area within a zone with the average area in the corresponding zone of the control chromatogram (the pure willow chromatogram). If the area is higher than or equal to the average area in the control chromatogram the response is 1, otherwise 0. The proportions of areas higher than or equal to the control average, per zone and contamination fraction, are presented in Table 3. Proportions at contamination level 1 (pure willow chromatogram) are, as expected, around 0.5. A deviation from 0.5 indicates that the areas tend to be higher (when the proportion goes to 1) or lower (when the proportion goes to 0) than the average area in the control chromatogram. The proportions within a zone change as the wood fraction decreases. In some zones the proportions go to 0 (such as zones 1 and 2) and in other zones they go to 1 (such as zones 3 and 5), which again indicates that there might be an interaction between zone and contamination fraction: the effect of contamination fraction depends on zone.
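The dichotomization rule can be sketched as follows. The control averages are taken from the fraction-1 row of Table 2, on the assumption that this row is the control average referred to above; the function names are hypothetical, and the example input is the mean profile at fraction 0 from Table 2 rather than an individual chromatogram.

```python
# Mean area per zone in the control (fraction 1) chromatograms, from Table 2.
CONTROL_MEAN = [8.05, 15.85, 4.98, 19.73, 14.39, 19.15, 17.85]


def dichotomize(areas):
    """Return 1 where the area is >= the control mean of that zone, else 0."""
    if len(areas) != len(CONTROL_MEAN):
        raise ValueError("expected one area per zone")
    return [1 if area >= mean else 0
            for area, mean in zip(areas, CONTROL_MEAN)]


def proportion_at_or_above(chromatograms, zone):
    """Share of chromatograms whose area in `zone` (1-based) is >= control."""
    flags = [dichotomize(areas)[zone - 1] for areas in chromatograms]
    return sum(flags) / len(flags)


# Mean profile at contamination fraction 0 (Table 2), used as example input.
profile_fraction_0 = [2.99, 1.63, 8.92, 7.05, 26.84, 21.67, 30.90]
flags = dichotomize(profile_fraction_0)
print(flags)
```

For this profile the response is 0 in zones 1, 2 and 4 and 1 in zones 3, 5, 6 and 7, in line with the direction of the proportions described above for zones 1, 2, 3 and 5.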
Table 3: Proportions for each zone at the different wood fraction levels
       Wood fraction (contamination level)
Zone   0      0.5    0.666  0.75   0.875  1
1      0.00   0.00   0.00   0.29   0.14   0.57
2
3
4
5
6
7
This implies that the area measurements for different zones in the same trial (chromatogram) are negatively correlated. This was also observed in the exploratory data analysis. The areas of the zones in a chromatogram sum to 100, so any increase in the area of one zone must be compensated by a decrease in the area of another zone of the same chromatogram. It also means that the assumption of positive correlation between repeated measures (areas per zone) within a cluster (chromatogram) is not valid for this data set. Negative estimates of variance components in a linear mixed model have a meaningful interpretation in the implied marginal model (Verbeke and Molenberghs, 2000). The parameter estimates for the fixed effects under the compound-symmetry covariance structure are shown in Table 4.
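A short derivation shows why the sum constraint forces a negative correlation. It is a sketch under a stated assumption, not a result taken from the report: the n = 7 deviations from the mean within a chromatogram are taken to be exactly exchangeable, with common variance σ² + d and common covariance d as in the compound-symmetry structure.

```latex
\begin{aligned}
\sum_{j=1}^{n} Y_{ij} = 100
  \;&\Rightarrow\; \operatorname{Var}\Big(\sum_{j=1}^{n} Y_{ij}\Big) = 0 \\
  &\Rightarrow\; n(\sigma^2 + d) + n(n-1)\,d = 0 \\
  &\Rightarrow\; \rho = \frac{d}{\sigma^2 + d} = -\frac{1}{n-1} = -\frac{1}{6} \approx -0.17
\end{aligned}
```

Under that assumption the intraclass correlation is negative by construction, which a random-intercept model with a non-negative variance component could not represent.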
Table 4: parameter estimates for fixed effects from compound symmetry model
There is a significant interaction between zone and contamination fraction (p-value < 0.0001). A significant interaction implies that the chromatogram characteristics, measured as areas per wavelength zone, change over the zones with the level of wood contamination. Because the interaction term is significant, the main effects of zone and contamination fraction are not interpreted on their own. Instead, the effect of contamination fraction is expressed as a function of zone (in natural order). The change in the average area with a unit increase in contamination fraction, when zone is held constant, is:
Change in area = 15.03 - 3.76 (zone)
Likewise, the change in the average area with a unit increase in zone, when contamination fraction is held constant, is:
Change in area = 4.71 - 3.76 (contamination fraction)
The first expression is of most interest here, because the aim is to study the effect of contamination level rather than of zone.
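As a worked example, the expression 15.03 - 3.76 (zone) can be evaluated for each of the seven zones. The sketch reads 15.03 as the contamination fraction coefficient and -3.76 as the interaction coefficient; this is an interpretation of the expressions in the text, not values read from Table 4, and the names are hypothetical.

```python
# Coefficients as quoted in the text for the compound-symmetry model
# (assumed roles: fraction main effect and zone-by-fraction interaction).
BETA_FRACTION = 15.03
BETA_INTERACTION = -3.76


def fraction_effect(zone):
    """Change in mean area per unit increase in contamination fraction."""
    return BETA_FRACTION + BETA_INTERACTION * zone


effects = {zone: round(fraction_effect(zone), 2) for zone in range(1, 8)}
total_effect = sum(fraction_effect(zone) for zone in range(1, 8))

for zone, effect in effects.items():
    print(f"zone {zone}: {effect:+.2f}")
print(f"sum over zones: {total_effect:+.2f}")
```

A unit increase in contamination fraction corresponds to going from maximally contaminated (0) to uncontaminated (1) wood. The implied effect is about +11.3 in zone 1, close to 0 in zone 4 and about -11.3 in zone 7, and the effects nearly cancel over the seven zones, as expected when the areas of a chromatogram sum to 100.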
Parameter   Estimate   Standard error   P-value
There is again a significant interaction between zone and contamination fraction (p-value < 0.0001). The significant interaction tells us that the effect of wood contamination on the chromatogram characteristics depends on the zone.
The GEE analysis shows the same results. The analysis revealed again a significant interaction effect between zone and contamination fraction indicating that the effect of contamination level of the wood depends on the zone in a chromatogram.
References
1. Fire. Wikipedia, date retrieved: 19/05/2011, URL: http://en.wikipedia.org/wiki/Fire
2. Flame. Wikipedia, date retrieved: 10/05/2011, URL: http://en.wikipedia.org/wiki/Flame
3. ... retrieved: 19/05/2011, URL: ...
4. Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data.
5. Molenberghs, G. and Verbeke, G. (2005). Models for Discrete Longitudinal Data. New York: Springer.