
Table of contents

Abstract
Introduction
Objective
Method
    Data set
    Exploratory data analysis
    Statistical analysis
        Linear Mixed Models (LMM)
        Generalized Estimating Equations (GEE)
    Software
Results
    Exploratory data analysis
    Linear Mixed Models (LMM)
    Generalized Estimating Equations (GEE)
Discussion and conclusion
References

Abstract
Wood is often used as fuel for fire, but it may be contaminated by dioxins. The extent of contamination affects the chromatogram (light spectrum of the fire). An experiment was conducted on wood fires in which the contamination level was expressed as a fraction between 0.00 and 1.00, where 1.00 refers to wood that is 100 percent free of contamination and 0.00 refers to maximally contaminated material. The study consists of 42 trials: 7 trials at each of 6 contamination levels. The chromatogram is divided into seven zones, and the abundance of the chromatogram within each zone is measured. The purpose of this work is to study how the chromatogram changes with the level of contamination. The data were analyzed using statistical methods that take the correlated nature of the measured variables into account. The analysis shows a significant interaction between zone and contamination fraction: the effect of wood contamination on the area of the chromatogram changes with zone.

Introduction
Fire, which can be generated by burning wood, is the rapid oxidation of a material in the chemical process of combustion, releasing heat, light and various reaction products.(1) The flame, which is the visible portion of the fire, is a mixture of reacting gases and solids emitting visible, infrared and sometimes ultraviolet light, the frequency spectrum of which depends on the chemical composition of the burning material and the intermediate reaction products.(1) The burning of wood and other solid particles produces the familiar red-orange glow of fire that is visible to the human eye.(2) The visible spectrum is the part of the electromagnetic spectrum that the human eye can perceive. Its wavelength, which is related to the frequency and energy of the light and determines the color produced when impurities are present in the burning material, ranges from 4x10^-7 to 7x10^-7 m. The spectrum is continuous and is divided into seven areas, although there are no clear bounds on where one color actually ends.(3) In most cases the colors overlap, and the intensity of the colors in the different zones that make up the light spectrum can be studied. Contamination affects both the intensity and the color of the flame produced by firewood. In this report the effect of different contamination levels on the wavelength of the flame produced by firewood is studied. The effect of the level of dioxin contamination in the wood on the different zones of the chromatogram is studied by comparing the emission in each zone of the chromatogram of contaminated wood with the control chromatogram (chromatogram of uncontaminated wood).

Objective
The objective of this report is to study whether, and how, the chromatogram characteristics shift with wood fraction.

Method
Data set
42 wood fire trials were conducted, characterized by different dioxin contamination fractions of the wood: 7 trials at each of 6 contamination levels. The contamination levels range from 0 (maximally contaminated) to 1 (uncontaminated wood). Each trial results in a chromatogram, which is divided into 7 zones. The integrated abundance of the chromatogram within each zone is quantified and will be referred to as area throughout this report.

Exploratory data analysis


To study the effect of the contamination fraction on the chromatogram characteristics the response variable area will be considered as a continuous variable and as a binary variable. The binary variable is created in the following way:

\[
\text{binary} =
\begin{cases}
1 & \text{if } \text{area}_{\text{zone } i} \ge \operatorname{mean}(\text{area}_{\text{zone } i}) \text{ for pure willow chromatograms}, \quad i = 1, \dots, 7 \\
0 & \text{otherwise}
\end{cases}
\]


The area in a zone is compared to the average area in the corresponding zone of the control chromatogram (pure willow chromatogram). Areas higher than or equal to the average area of the control chromatogram are coded as 1; areas lower than the average area of the control chromatogram are coded as 0.
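The coding rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: the field names (`chromatogram`, `zone`, `area`, `fraction`) and the toy values are made up for the example, and the analysis in this report was done in SAS, not with this code.

```python
from statistics import mean


def add_binary_response(rows, control_fraction=1.0):
    """Return copies of the rows with an added 0/1 'binary' field.

    rows: list of dicts with the (hypothetical) keys 'chromatogram',
    'zone', 'area' and 'fraction'. A row is coded 1 when its area is at
    least the mean area of the control chromatograms in the same zone.
    """
    # Collect the control (uncontaminated) areas per zone.
    control_areas = {}
    for r in rows:
        if r["fraction"] == control_fraction:
            control_areas.setdefault(r["zone"], []).append(r["area"])
    control_mean = {zone: mean(areas) for zone, areas in control_areas.items()}

    # Compare every observation with the control mean of its own zone.
    return [
        {**r, "binary": 1 if r["area"] >= control_mean[r["zone"]] else 0}
        for r in rows
    ]


# Made-up toy data: two control trials and one contaminated trial, zone 1 only.
toy = [
    {"chromatogram": 1, "zone": 1, "area": 7.0, "fraction": 1.0},
    {"chromatogram": 2, "zone": 1, "area": 9.0, "fraction": 1.0},
    {"chromatogram": 3, "zone": 1, "area": 3.0, "fraction": 0.0},
]
coded = add_binary_response(toy)
print([r["binary"] for r in coded])  # [0, 1, 0]
```

In the toy data the control mean for zone 1 is 8.0, so only the trial with area 9.0 is coded 1.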

Exploratory data analysis was done to gain insight into the structure of the data. Graphs and tables are used for this purpose.

Statistical analysis
In the statistical analysis, area will be treated both as a binary and as a continuous response variable. Given the clustered nature of the data, classical methods such as ordinary linear regression and ordinary logistic regression are no longer appropriate. Methods that take the correlation structure of the data into account will be used. Linear mixed models (LMM) were used to analyze area as a continuous response variable and generalized estimating equations (GEE) were used to analyze area as a binary response variable. An alpha level of 0.05 was used in all analyses.

Linear Mixed Models (LMM)
LMM (4) was used to study the effect of contamination fraction on the area when area is considered as a continuous response variable. Zone and contamination fraction are treated as continuous variables. An interaction between zone and contamination fraction was introduced into the model to allow the effect of contamination fraction to differ between zones. A compound-symmetry covariance structure is used since we are mainly interested in the population-averaged effect rather than the subject-specific effects. The compound-symmetry model also allows a negative intraclass correlation. The compound-symmetry model is formulated as:

\[
Y_{ij} = \beta_0 + \beta_1\,\text{zone}_{ij} + \beta_2\,\text{contamination fraction}_i + \beta_3\,\text{zone}_{ij} \times \text{contamination fraction}_i + \varepsilon_{ij}
\]

\[
\varepsilon_i \sim N(0, V_i), \qquad V_i = d\,J_{n_i} + \sigma^2 I_{n_i}
\]

Here \(Y_{ij}\) is the area in zone \(j\) of chromatogram \(i\), \(J_{n_i}\) is the \(n_i \times n_i\) matrix of ones and \(I_{n_i}\) is the \(n_i \times n_i\) identity matrix. \(\sigma^2\) is the intrasubject variability and \(d\) is the intersubject variability. In the compound-symmetry model \(d\) is a covariance and not a variance as in the random-intercept model; it may therefore be negative, which allows a negative intraclass correlation.

Generalized Estimating Equations (GEE)
GEE (5) was used to study the effect of the contamination fraction on the area in the different zones when area is considered as a binary response variable. Zone and contamination fraction are again treated as continuous variables. An interaction between zone and contamination fraction was introduced into the model. GEE takes the dependency of observations within clusters (chromatograms) into account by specifying a working correlation structure. Two working correlation structures were considered: independent and exchangeable. To choose the most appropriate working correlation structure, the empirical standard errors are compared with the model-based standard errors. The working correlation structure for which the difference between the empirical and the model-based standard errors is smallest is chosen; this was the independent working correlation structure. The p-values based on the empirical standard errors are reported. The GEE model is formulated as:

\[
\text{logit}(\pi_{ij}) = \beta_0 + \beta_1\,\text{zone}_{ij} + \beta_2\,\text{contamination fraction}_i + \beta_3\,\text{zone}_{ij} \times \text{contamination fraction}_i
\]

where \(\pi_{ij}\) is the probability that the binary response for zone \(j\) of chromatogram \(i\) equals 1.

Software
All analyses were done in SAS version 9.2.

Results
Exploratory data analysis
The dataset contains 4 variables: chromatogram, zone, area and contamination fraction. A description of the variables is presented in table 1.
Table 1: explanation of different variables in the dataset

Variable                  Description
CHROMATOGRAM              Indicator of the independent trials (subjects)
ZONE                      Numerical indicator for each of the seven zones into which the chromatogram is subdivided
AREA                      Integrated abundance of the chromatogram within a zone
CONTAMINATION FRACTION    Fraction of wood present (1.000 refers to uncontaminated material, 0.000 refers to maximal contamination)

Chromatogram is an indicator of the 42 independent trials and zone is an indicator of the different zones within each chromatogram. Zone ranges from 1 to 7. It represents the different wavelength areas of the light spectrum, which is a continuous scale ranging from 4x10^-7 to 7x10^-7 m. Therefore zone will be treated as a continuous variable throughout this study, under the assumption that zone follows the natural order of the wavelength areas. Area is the integrated abundance of the chromatogram within each zone. Contamination fraction is an indicator of the level of dioxin contamination present in the wood used for the fire. There are 6 contamination levels: 0, 0.5, 0.666, 0.75, 0.875 and 1, where 0 refers to maximal contamination and 1 to uncontaminated wood. In this study contamination fraction will also be treated as a continuous variable. The dataset consists of 294 observations from 42 independent chromatograms. Observations from different zones within the same chromatogram are not independent. To explore area as a continuous variable, the mean area within each zone is calculated for each contamination level (table 2).
Table 2: the average area of zone by wood fraction contamination level

Wood fraction                             Zone                             Overall
contamination level     1      2      3      4      5      6      7       mean
0                    2.99   1.63   8.92   7.05  26.84  21.67  30.90      14.29
0.5                  5.91  11.14   8.68  14.98  24.50  15.04  19.76      14.29
0.666                6.02  11.84   7.66  15.78  20.73  17.14  20.83      14.29
0.75                 7.59  15.18   8.00  18.78  20.44  14.45  15.56      14.29
0.875                7.00  16.66   6.70  20.60  16.29  16.25  16.50      14.29
1                    8.05  15.85   4.98  19.73  14.39  19.15  17.85      14.29
Overall mean         6.26  12.05   7.49  16.15  20.53  17.28  20.23      14.29

The areas of the 7 zones in a chromatogram sum to 100. Therefore the average area at each contamination level, when the zones are ignored, equals 100/7 = 14.29. The average area differs between zones. A graphical presentation of table 2 is given in figure 1. This figure shows the effect of the contamination fraction on the average area within each zone. The average area within a zone changes with the contamination level, and the effect of contamination fraction is not the same in each zone. In certain zones (such as zones 1, 2 and 4) the average area increases when the contamination fraction increases. In other zones (such as zones 5 and 7) the average area decreases when the contamination fraction increases. This may indicate an interaction effect between zone and contamination fraction.
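The two observations above (rows summing to 100, and opposite trends in different zones) can be checked directly from the zone means in table 2. The sketch below only re-uses the published table values; the variable names are illustrative.

```python
# Zone means per contamination level, copied from table 2 (zones 1 to 7).
table2 = {
    0.0:   [2.99, 1.63, 8.92, 7.05, 26.84, 21.67, 30.90],
    0.5:   [5.91, 11.14, 8.68, 14.98, 24.50, 15.04, 19.76],
    0.666: [6.02, 11.84, 7.66, 15.78, 20.73, 17.14, 20.83],
    0.75:  [7.59, 15.18, 8.00, 18.78, 20.44, 14.45, 15.56],
    0.875: [7.00, 16.66, 6.70, 20.60, 16.29, 16.25, 16.50],
    1.0:   [8.05, 15.85, 4.98, 19.73, 14.39, 19.15, 17.85],
}

# Each row should sum to about 100, so the mean over 7 zones is about 14.29.
for level, zone_means in table2.items():
    total = sum(zone_means)
    print(level, round(total, 2), round(total / 7, 2))

# Change in mean area between maximal contamination (0) and pure wood (1).
change = [pure - dirty for dirty, pure in zip(table2[0.0], table2[1.0])]
for zone, diff in enumerate(change, start=1):
    print("zone", zone, "increases" if diff > 0 else "decreases")
```

The end-to-end comparison is cruder than the profiles in figure 1, since it ignores the intermediate levels, but it already shows that the direction of the change differs between zones.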

Figure 1: average profile for each zone at different wood fraction level

A binary response variable is created by comparing the area within a zone with the average area in the corresponding zone of the control chromatogram (pure willow chromatogram). If the area is higher than or equal to the average area in the control chromatogram the response is 1, otherwise 0. The proportions of areas higher than or equal to the control average within each zone, for the different contamination fractions, are presented in table 3. Proportions at contamination level 1 (pure willow chromatogram) are, as expected, around 0.5. A deviation from 0.5 indicates that the areas tend to be higher (proportion approaching 1) or lower (proportion approaching 0) than the average area in the control chromatogram. The proportions within a zone change as the wood fraction decreases. In some zones the proportions go to 0 (such as zones 1 and 2) and in other zones they go to 1 (such as zones 3 and 5), which again indicates that there might be an interaction between zone and contamination fraction: the effect of the contamination fraction depends on zone.
Table 3: proportions for each zone at different wood fraction levels

Zone         Wood fraction contamination level
           0     0.5   0.666    0.75   0.875       1
1       0.00    0.00    0.00    0.29    0.14    0.57
2       0.00    0.00    0.00    0.43    0.86    0.57
3       1.00    1.00    1.00    1.00    1.00    0.43
4       0.00    0.00    0.14    0.29    0.86    0.43
5       1.00    1.00    1.00    1.00    1.00    0.57
6       0.86    0.00    0.14    0.00    0.00    0.57
7       1.00    0.57    0.71    0.00    0.14    0.43

Linear Mixed Models (LMM)


The compound-symmetry model can be fitted directly by absorbing the effect of the random intercept into the marginal covariance structure. This model is fitted to study how the chromatogram characteristics shift with wood fraction when area is considered as a continuous response variable. The compound-symmetry covariance estimate was -3.28 and the residual variance estimate was 22.97. The intraclass correlation was -0.167.
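As a small check on the reported value, the intraclass correlation under compound symmetry is the common covariance divided by the total variance. The sketch below uses the two estimates quoted above; the function and variable names are illustrative. It also prints -1/(n - 1), the lowest correlation that n equally correlated measurements can have, which is attained when they sum to a constant; treating the 7 zone areas that way is an assumption made here for comparison only.

```python
def intraclass_correlation(d, sigma2):
    """ICC under compound symmetry: common covariance over total variance."""
    return d / (d + sigma2)


d_hat = -3.28        # compound-symmetry covariance estimate (from the text)
sigma2_hat = 22.97   # residual variance estimate (from the text)
n_zones = 7          # zones per chromatogram

icc = intraclass_correlation(d_hat, sigma2_hat)
lower_bound = -1 / (n_zones - 1)  # minimum for 7 exchangeable measurements

print(round(icc, 3))          # -0.167
print(round(lower_bound, 3))  # -0.167
```

The estimate sits essentially on this bound, which is what one would expect when the areas within a chromatogram are constrained to sum to 100.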

This implies that the area measurements for different zones in the same trial (chromatogram) are negatively correlated, as was also observed in the exploratory data analysis. The areas of the different zones in the same chromatogram sum to 100, so any increase in the area of one zone must be compensated by a decrease in the area of another zone of the same chromatogram. This also indicates that the assumption of positive correlation between repeated measures (area in zone) within a cluster (chromatogram) is not valid for the data set at hand. Negative estimates for variance components in linear mixed models have a meaningful interpretation in the implied marginal model (Verbeke and Molenberghs, 2000). The parameter estimates for the fixed effects under the compound-symmetry covariance structure are shown in table 4.
Table 4: parameter estimates for fixed effects from compound symmetry model

Parameter                       Estimate    Standard Error    P-value
Intercept                        -4.5559            1.2279     0.0002
Zone                              4.7104            0.3070     <.0001
Contamination fraction           15.0296            1.7302     <.0001
Zone*Contamination fraction      -3.7573            0.4326     <.0001

There is a significant interaction between zone and contamination fraction (p-value < 0.0001). The significant interaction implies that the effect of wood contamination on the chromatogram characteristics, measured as the areas of the wavelength zones, changes over the zones. Because the interaction term is significant, the main effects of zone and contamination fraction are not interpreted separately. Since the interest lies in the effect of contamination rather than of zone, the interaction is expressed as the change in the average area with a unit increase in contamination fraction when zone is held constant:
Change in area = 15.03 - 3.76(zone)
Equivalently, the change in the average area with a unit increase in zone (in natural order) when the contamination fraction is held constant is:
Change in area = 4.71 - 3.76(contamination fraction)
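To make the interaction concrete, the zone-specific slopes can be evaluated from the fixed-effect estimates in table 4. This is a sketch for illustration only; the function names are hypothetical and the code simply plugs the published estimates into the linear predictor.

```python
# Fixed-effect estimates copied from table 4.
b_zone = 4.7104
b_fraction = 15.0296
b_interaction = -3.7573


def fraction_slope(zone):
    """Change in mean area per unit increase in contamination fraction, zone fixed."""
    return b_fraction + b_interaction * zone


def zone_slope(fraction):
    """Change in mean area per unit increase in zone, contamination fraction fixed."""
    return b_zone + b_interaction * fraction


# Effect of contamination fraction in each of the seven zones.
for zone in range(1, 8):
    print(zone, round(fraction_slope(zone), 2))
```

With these estimates the slope in contamination fraction is positive in zones 1 to 3, essentially zero in zone 4 and negative in zones 5 to 7. Because zone enters the model linearly, this is a smooth approximation of the zone-specific patterns in table 2 rather than an exact reproduction of them.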

Generalized Estimating Equations (GEE)


A GEE model based on the independence working correlation was fitted to the binary data to answer the question of interest. Table 5 shows the parameter estimates together with the empirical standard errors.
Table 5: parameter estimates from GEE model

Parameter                       Estimate    Standard Error    P-value
Intercept                         3.6713            0.5454     <.0001
Zone                              0.8884            0.1427     <.0001
Contamination fraction            4.5897            1.0178     <.0001
Zone*Contamination fraction      -1.1430            0.2454     <.0001

There is again a significant interaction between zone and contamination fraction (p-value < 0.0001). The significant interaction tells us that the effect of wood contamination on the chromatogram characteristics depends on the zone.
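On the logit scale the same dependence on zone can be expressed as an odds ratio. The sketch below uses only the contamination fraction and interaction estimates from table 5; the step of 0.1 in contamination fraction is an arbitrary choice made for this illustration, and the function name is hypothetical.

```python
import math

# Estimates copied from table 5 (log-odds scale).
b_fraction = 4.5897       # contamination fraction
b_interaction = -1.1430   # zone * contamination fraction


def odds_ratio(zone, delta=0.1):
    """Odds ratio of response 1 for a `delta` increase in contamination fraction at a zone."""
    return math.exp((b_fraction + b_interaction * zone) * delta)


for zone in range(1, 8):
    print(zone, round(odds_ratio(zone), 2))
```

An odds ratio above 1 means that the odds of an area at or above the control average rise with the wood fraction in that zone; a value below 1 means they fall. Under these estimates the ratio is above 1 in the low zones and below 1 in the high zones.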

Discussion and conclusion


The objective of this report was to study whether and how the chromatogram characteristics shift with contamination fraction. 7 independent trials were conducted at each of 6 contamination fractions. The integrated abundance of the chromatogram within each zone (area) was measured. Area measurements in different zones of the same chromatogram are not independent, so techniques that take the correlated nature of the data into account were used for the analysis. The response variable area was analyzed both as a continuous variable and as a binary variable. The binary response variable was created by comparing the area to the average area of the control chromatogram (chromatogram from uncontaminated material) within the corresponding zone: an area higher than or equal to the average area of the control chromatogram was coded 1, otherwise 0. Since we are mainly interested in the marginal, population-averaged effects and not in the subject-specific effects, only models with a marginal interpretation were used. Area as a continuous response variable was analyzed with a compound-symmetry model and the binary response variable was analyzed using generalized estimating equations. The results from the linear mixed model revealed a negative correlation between the area observations in different zones of the same chromatogram. This makes sense because the sum of the areas over the zones within the same chromatogram is fixed: an increase of the area in one zone needs to be compensated by a decrease in area in another zone. Furthermore, there was a significant interaction between zone and contamination fraction: the effect of the contamination in the wood on the area in the chromatogram changes with zone.


The GEE analysis shows the same results. The analysis revealed again a significant interaction effect between zone and contamination fraction indicating that the effect of contamination level of the wood depends on the zone in a chromatogram.


References
1. Fire. Wikipedia, date retrieved: 19/05/2011, URL: http://en.wikipedia.org/wiki/Fire
2. Flame. Wikipedia, date retrieved: 10/05/2011, URL: http://en.wikipedia.org/wiki/Flame
3. Visible spectrum. Wikipedia, date retrieved: 19/05/2011, URL: http://en.wikipedia.org/wiki/Visible_spectrum
4. Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. New York: Springer.
5. Molenberghs, G. and Verbeke, G. (2005). Models for Discrete Longitudinal Data. New York: Springer.
