

Published on STAT 507 (https://onlinecourses.science.psu.edu/stat507)



3.5 - Bias, Confounding and Effect Modification
Consider the figure below. If the true value is the center of the target, the measured
responses in the first instance may be considered reliable and precise (having negligible
random error), but all the responses missed the true value by a wide margin: a biased
estimate has been obtained. In contrast, the target on the right has more random error in
the measurements; however, the results are valid, lacking systematic error. The average
response is exactly in the center of the target. The middle target depicts our goal:
observations that are both reliable (small random error) and valid (without systematic
error).

Accuracy for a Sample Size of 5
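To make the distinction concrete, here is a minimal Python sketch (not part of the original course material) that simulates the two failure modes with hypothetical numbers: a biased but precise measurement process and an unbiased but imprecise one.

import numpy as np

rng = np.random.default_rng(0)
true_value = 10.0  # hypothetical true value (the center of the target)

# Biased but precise: readings cluster tightly but are systematically off target.
biased = rng.normal(loc=true_value + 2.0, scale=0.2, size=5)
# Unbiased but imprecise: readings scatter widely but are centered on the truth.
unbiased = rng.normal(loc=true_value, scale=2.0, size=5)

for name, x in [("biased, precise", biased), ("unbiased, imprecise", unbiased)]:
    print(f"{name}: mean error = {x.mean() - true_value:+.2f}, SD = {x.std(ddof=1):.2f}")

With only five measurements both averages are noisy, but the systematic offset of the first series persists no matter how many readings are averaged, which is why bias cannot be fixed by collecting more data.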

Bias, confounding and effect modification in epidemiology


When examining the relationship between an explanatory factor and an outcome, we are
interested in identifying factors that may modify the factor's effect on the outcome (effect
modifiers). We must also be aware of potential bias or confounding in a study because
these can cause a reported association (or lack thereof) to be misleading. Bias and
confounding are related to measurement and study design. Let's define these terms:

Bias: A systematic error in the design, recruitment, data collection or analysis that
results in a mistaken estimate of the true association between the exposure and the outcome.

Confounding: A situation in which the effect or association between an exposure


and outcome is distorted by the presence of another variable. Positive confounding
(when the observed association is biased away from the null) and negative
confounding (when the observed association is biased toward the null) both occur.


Effect modification: occurs when a variable differentially (positively or negatively) modifies
the observed effect of a risk factor on disease status. Different groups have different
risk estimates when effect modification is present.

If the method used to select subjects or collect data results in an incorrect association,

THINK >> Bias!

If an observed association is not correct because a different (lurking) variable is


associated with both the potential risk factor and the outcome, but it is not a causal factor
itself,

THINK >> Confounding!

If an effect is real but the magnitude of the effect is different for different groups of
individuals (e.g., males vs. females or blacks vs. whites),

THINK >> Effect modification!

Bias Resulting from Study Design

Bias limits validity (the ability to measure the truth within the study design) and
generalizability (the ability to confidently apply the results to a larger population) of study
results. Bias is rarely eliminated during analysis. There are two major types of bias:

1. Selection bias: systematic error in the selection or retention of participants

Examples of selection bias in case-control studies:

Suppose you are selecting cases of rotator cuff tears (a shoulder injury). Many older
people have experienced this injury to some degree, but have never been treated for
it. Persons who are treated by a physician are far more likely to be diagnosed (and
identified as cases) than persons who are not treated by a physician. If a study only
recruits cases among patients receiving medical care, there will be selection bias.
Some investigators may identify cases predicated upon previous exposure. Suppose
a new outbreak is related to a particular exposure, for example, a particular pain
reliever. If a press release encourages people taking this pain reliever to report to a
clinic to be checked to determine if they are a case and these people then become
the cases for the study, a bias has been created in sample selection. Only those
taking the medication were assessed for the problem. Ascertaining a case based
upon previous exposure creates a bias that cannot be removed once the sample is
selected.
Exposure may affect the selection of controls, e.g., hospitalized patients are more
likely to have been smokers than the general population. If controls are selected
among hospitalized patients, the relationship between an outcome and smoking may
be underestimated because of the increased prevalence of smoking in the control
population.
In a cohort study, people who share a similar characteristic may be lost to follow-up.
For example, people who are mobile are more likely to change their residence and
be lost to follow-up. If length of residence is related to the exposure, then our
sample is biased toward subjects with less exposure.
In a cross-sectional study, the sample may have been non-representative of the
general population. This leads to bias. For example, suppose the study population
includes multiple racial groups but members of one race participate less frequently in
this type of study. A bias results.

2. Information bias (misclassification bias): Systematic error due to inaccurate


measurement or classification of disease, exposure or other variables.

Instrumentation - an inaccurately calibrated instrument creating systematic error


Misdiagnosis - if a diagnostic test is consistently inaccurate, then information bias
would occur
Recall bias - if individuals can't remember exposures accurately, then information
bias would occur
Missing data - if certain individuals consistently have missing data, then information
bias would occur
Socially desirable response - if study participants consistently give the answer that the
investigator wants to hear, then information bias would occur

Misclassification can be differential or non-differential.

Differential misclassification: the probability of misclassification varies for the different


study groups, i.e., misclassification is conditional upon exposure or disease status.

Are we more likely to misclassify cases than controls? For example, if you interview cases
in-person for a long period of time, extracting exact information while the controls are
interviewed over the phone for a shorter period of time using standard questions, this can
lead to a differential misclassification of exposure status between controls and cases.

Nondifferential misclassification: the probability of misclassification does not vary for


the different study groups; it is not conditional upon exposure or disease status, but appears
random. Using the above example, if half the subjects (cases and controls) were
randomly selected to be interviewed by phone and the other half were interviewed in
person, the misclassification would be nondifferential.

Either type of misclassification can produce misleading results.
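The attenuating effect of nondifferential misclassification can be illustrated with a small simulation. The sketch below uses hypothetical exposure and disease probabilities (not data from this course) and records 20% of exposure values incorrectly, irrespective of disease status; the misclassified odds ratio falls toward the null of 1.0.

import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical cohort in which exposure roughly doubles the odds of disease.
exposure = rng.random(n) < 0.30
disease = rng.random(n) < np.where(exposure, 0.10, 0.05)

def odds_ratio(exposed, diseased):
    a = np.sum(exposed & diseased)    # exposed cases
    b = np.sum(exposed & ~diseased)   # exposed non-cases
    c = np.sum(~exposed & diseased)   # unexposed cases
    d = np.sum(~exposed & ~diseased)  # unexposed non-cases
    return (a * d) / (b * c)

# Nondifferential misclassification: 20% of exposure values are recorded
# incorrectly, regardless of disease status.
flip = rng.random(n) < 0.20
recorded_exposure = np.where(flip, ~exposure, exposure)

print(f"OR from true exposure:          {odds_ratio(exposure, disease):.2f}")
print(f"OR from misclassified exposure: {odds_ratio(recorded_exposure, disease):.2f}")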

Confounding and Confounders

Confounding: A situation in which a measure of association or relationship between


exposure and outcome is distorted by the presence of another variable. Positive
confounding (when the observed association is biased away from the null) and negative
confounding (when the observed association is biased toward the null) both occur.

Confounder: an extraneous variable that wholly or partially accounts for the observed
effect of a risk factor on disease status. The presence of a confounder can lead to
inaccurate results.

A confounder meets all three conditions listed below:

1. It is a risk factor for the disease, independent of the putative risk factor.
2. It is associated with the putative risk factor.
3. It is not in the causal pathway between exposure and disease.

The first two of these conditions can be tested with


data. The third is more biological and conceptual.

Confounding masks the true effect of a risk factor on a disease or outcome due to the
presence of another variable. We identify potential confounders from our:

1. Knowledge
2. Prior experience with data
3. Three criteria for confounders

Example of Confounding

Hypothesis: Diabetes is a positive risk factor for coronary heart disease

We survey patients as part of a cross-sectional study, asking whether they have
coronary heart disease and whether they are diabetic. We generate a 2 × 2 table (below):

'0' indicates those who do not have coronary heart disease, '1' is for those with coronary
heart disease; similarly for diabetes, '0' is the absence and '1' the presence of diabetes.

The prevalence of coronary heart disease among people without diabetes is 91 divided by
2340, or 3.9%. Similarly, the prevalence among those with diabetes is 12.04%. Our
prevalence ratio, assessing whether diabetes is a risk factor for coronary heart disease, is
12.04 / 3.9 = 3.1: the prevalence of coronary heart disease in people with diabetes is 3.1
times as great as it is in people without diabetes.

We can also use the 2 × 2 table to calculate an odds ratio, as shown above:

(2249 × 26) / (91 × 190) = 3.38

The odds of having diabetes among those with coronary heart disease is 3.38 times as
high as the odds of having diabetes among those who do not have coronary heart
disease.
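The same arithmetic can be scripted. The sketch below reconstructs the 2 × 2 counts implied by the text (91 and 2249 non-diabetics with and without CHD; 26 and 190 diabetics with and without CHD) and computes both measures.

# Counts implied by the text; rows are diabetes status, columns are CHD status.
#                 CHD = 0   CHD = 1
# diabetes = 0      2249        91
# diabetes = 1       190        26
no_chd_no_dm, chd_no_dm = 2249, 91
no_chd_dm, chd_dm = 190, 26

prev_no_dm = chd_no_dm / (chd_no_dm + no_chd_no_dm)   # 91 / 2340, about 0.039
prev_dm = chd_dm / (chd_dm + no_chd_dm)               # 26 / 216,  about 0.120
prevalence_ratio = prev_dm / prev_no_dm               # about 3.1

odds_ratio = (no_chd_no_dm * chd_dm) / (chd_no_dm * no_chd_dm)  # (2249*26)/(91*190), about 3.38

print(f"Prevalence ratio: {prevalence_ratio:.2f}")
print(f"Odds ratio:       {odds_ratio:.2f}")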

Which of these do you use? They come up with slightly different estimates.

It depends upon your primary purpose. Is your purpose to compare prevalences? Or, do
you wish to address the odds of diabetes as related to coronary heart disease status?

Now, let's add hypertension as a potential confounder.

1) Ask: "Is hypertension a risk factor for CHD (among non-diabetics)?"

First of all, prior knowledge tells us that hypertension is related to many heart-related
diseases. Prior knowledge is an important first step, but let's test this with data.

We consider the 2 × 2 table below:

We are evaluating the relationship of CHD to hypertension in non-diabetics. You can


calculate the prevalence ratios and odds ratios as suits your purpose.

These data show that there is a positive relationship between hypertension and CHD in
non-diabetics. (note the small p-values)

2) This leads us to our next question, "Is diabetes (exposure) associated with
hypertension?"

We can answer this with our data as well (below):

Again, the results are highly significant! Therefore, our first two criteria have been met for
hypertension as a confounder in the relationship between diabetes and coronary heart
disease.
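A quick way to check the first two confounder criteria against data is a chi-square test on each 2 × 2 table. The counts below are hypothetical placeholders, chosen only to be consistent with the margins of the crude table above (the actual tables appear only as images in the original page); the sketch assumes SciPy is available.

from scipy.stats import chi2_contingency

# Hypothetical placeholder counts, consistent with the 2340 non-diabetics
# and 216 diabetics in the crude table but NOT the course's actual data.
chd_by_htn_nondiabetics = [[1980, 55],   # normotensive: no CHD, CHD
                           [ 269, 36]]   # hypertensive: no CHD, CHD
htn_by_diabetes = [[2035, 305],          # non-diabetic: normotensive, hypertensive
                   [ 160,  56]]          # diabetic:     normotensive, hypertensive

for label, table in [("CHD vs. hypertension (non-diabetics)", chd_by_htn_nondiabetics),
                     ("hypertension vs. diabetes", htn_by_diabetes)]:
    chi2, p, dof, _ = chi2_contingency(table)
    print(f"{label}: chi-square = {chi2:.1f}, p = {p:.3g}")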

3) A final question: "Is hypertension an intermediate step in the causal pathway between
diabetes (exposure) and the development of CHD?" That is, does diabetes cause
hypertension, which then causes coronary heart disease? Based on the biology, that is not
the case. Diabetes in
and of itself can cause coronary heart disease. Using the data and our prior knowledge,
we conclude that hypertension is a major confounder in the diabetes-CHD relationship.

What do we do now that we know that hypertension is a confounder?

Stratify... let's consider some stratified assessments.

Stratification and Adjustment - Diabetes and CHD relationship confounded by


hypertension: A cross-sectional study - Example

Earlier we arrived at a crude odds ratio of 3.38.

Now we will use an extended Mantel-Haenszel method to adjust for hypertension and
produce an adjusted odds ratio. When we do so, the adjusted OR = 2.84.

The Mantel-Haenszel method takes into account the effect of the strata, presence or
absence of hypertension.

If we limit the analysis to normotensives we get an odds ratio of 2.4.

Among hypertensives we get an odds ratio of 3.04.
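Here is a minimal sketch of the Mantel-Haenszel calculation itself. The stratified counts are hypothetical stand-ins, chosen only to be consistent with the margins of the crude 2 × 2 table above (the real stratified counts appear as figures in the original page), so the printed values approximate rather than reproduce 2.40, 3.04 and 2.84.

def two_by_two_or(a, b, c, d):
    # Odds ratio for one 2 x 2 table: a = exposed cases, b = exposed non-cases,
    # c = unexposed cases, d = unexposed non-cases.
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    # Mantel-Haenszel pooled odds ratio: sum(a*d/n) / sum(b*c/n) over strata.
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical stratified counts (diabetic CHD, diabetic no CHD,
# non-diabetic CHD, non-diabetic no CHD).
strata = {"normotensive": (10, 150, 55, 1980),
          "hypertensive": (16, 40, 36, 269)}

for name, counts in strata.items():
    print(f"{name}: OR = {two_by_two_or(*counts):.2f}")
print(f"Mantel-Haenszel adjusted OR = {mantel_haenszel_or(strata.values()):.2f}")

As in the course example, both stratum-specific odds ratios fall on the same side of the crude value, which is the pattern discussed next.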

Both estimates of the odds ratio are lower than the odds ratio based on the entire sample.
If you stratify a sample, without losing any data, wouldn't you expect to find the crude odds
ratio to be a weighted average of the stratified odds ratios?

This is an example of confounding: the stratified results are both on the same side of the
crude odds ratio. This is positive confounding because the unstratified estimate is biased
away from the null hypothesis. The null is 1.0. The true odds ratio, accounting for the
effect of hypertension, is 2.84 from the Mantel-Haenszel method. The crude odds ratio of 3.38
was biased away from the null of 1.0. (In some studies you are looking for a positive
association; in others, a negative association, a protective effect; either way, differing from
the null of 1.0.)

This is one way to demonstrate the presence of confounding. You may have a priori
knowledge of confounded effects, or you may examine the data and determine whether
confounding exists. Either way, when confounding is present, as in this example, the
adjusted odds ratio should be reported. In this example, we report the odds-ratio for the
association of diabetes with CHD = 2.84, adjusted for hypertension.

If you are analyzing data using multivariable logistic regression, a rule of thumb is if the
odds ratio changes by 10% or more, include the potential confounder in the multi-variable
model. The question is not so much the statistical significance, but the amount the
confounding variable changes the effect. If a variable changes the effect by 10% or more,
then we consider it a confounder and leave it in the model.
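The change-in-estimate rule can be written as a one-line check; the 10% threshold below is the rule of thumb from the text, not a statistical test.

def changes_estimate(crude_or, adjusted_or, threshold=0.10):
    # Flag the covariate as a confounder worth keeping if adjusting for it
    # moves the odds ratio by 10% or more (change-in-estimate rule of thumb).
    relative_change = abs(adjusted_or - crude_or) / crude_or
    return relative_change >= threshold, relative_change

keep, change = changes_estimate(crude_or=3.38, adjusted_or=2.84)
print(f"Relative change = {change:.0%}; keep hypertension in the model: {keep}")  # 16%, True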

We will talk more about this later, but briefly here are some methods to control for a
confounding variable (known a priori):

randomize individuals to different groups (use an experimental approach)


restrict / filter for certain groups
match in case-control studies
analysis (stratify, adjust)

Controlling potential confounding starts with good study design including anticipating
potential confounders.

Effect Modification (interaction)

Effect modification: occurs when the effect of a factor is different for different groups.
We see evidence of this when the crude estimate of the association (odds ratio, rate ratio,
risk ratio) is very close to a weighted average of group-specific estimates of the
association. Effect modification is similar to statistical interaction, but in epidemiology,
effect modification is related to the biology of disease, not just a data observation.

In the previous example we saw that both stratum-specific estimates of the odds ratio fell to
one side of the crude odds ratio. With effect modification, we expect the crude odds ratio
to be between the estimates of the odds ratio for the stratum-specific estimates.

Effect modifier: a variable that differentially (positively or negatively) modifies the
observed effect of a risk factor on disease status.

Consider the following examples:

1) The immunization status of an individual modifies the effect of exposure to a pathogen
on the development of specific infectious diseases. Why?

2) Breast cancer occurs in both men and women. Breast cancer occurs in men at
approximately a rate of 1.5/100,000 men. Breast cancer occurs in women at
approximately a rate of 122.1/100,000 women. This is roughly an 80-fold difference. We
can build a statistical model that shows that gender interacts with other risk factors for
breast cancer, but why is this the case? Obviously, there are many biological reasons why
this interaction should be present. This is the part that we want to look at from an
epidemiological perspective. Consider whether the biology supports a statistical interaction
that you might observe.

Think about it!

Why study effect modification? Why do we care?

to define high-risk subgroups for preventive actions,


to increase precision of effect estimation by taking into account groups that may be
affected differently,
to increase the ability to compare across studies that have different proportions of
effect-modifying groups, and
to aid in developing causal hypotheses for the disease

If you do not identify and properly handle an effect modifier, you will get an incorrect crude
estimate. The (incorrect) crude estimator (e.g., RR, OR) is a weighted average of the
(correct) stratum-specific estimators. If you do not sort out the stratum-specific results, you
miss an opportunity to understand the biologic or psychosocial nature of the relationship
between risk factor and outcome.

To consider effect modification in the design and conduct of a study:

1. Collect information on potential effect modifiers.


2. Power the study to test potential effect modifiers - if a priori you think that the effect
may differ depending on the stratum, power the study to detect a difference.
3. Don't match on a potentially important effect modifier - if you do, you can't examine
its effect.

To consider effect modification in the analysis of data:

1. Again, consider what potential effect modifiers might be.


2. Stratify the data by potential effect modifiers and calculate stratum-specific estimates
of the effect of the risk on the outcome; determine if effect modification is present. If
so,
3. Present stratum-specific estimates. Use the Breslow-Day test for homogeneity of the
odds ratios (from the extended Mantel-Haenszel method) or the -2 log-likelihood test from
logistic regression to test the statistical significance of potential effect modifiers, and
calculate the estimators of the exposure-disease association according to the levels of
significant effect modifiers (see the sketch below). Alternatively, if assumptions are met,
use proportional hazards regression to produce an adjusted hazard ratio.
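As a sketch of steps 2 and 3, the statsmodels package provides a StratifiedTable class that, as I understand its API, reports the Mantel-Haenszel pooled odds ratio and a Breslow-Day-type test of homogeneity; the counts below are the same hypothetical strata used earlier, not the course's data.

import numpy as np
from statsmodels.stats.contingency_tables import StratifiedTable

# Hypothetical stratum tables; rows = exposure (diabetes yes/no),
# columns = outcome (CHD yes/no).
tables = [np.array([[10, 150], [55, 1980]]),   # stratum 1 (e.g., normotensive)
          np.array([[16,  40], [36,  269]])]   # stratum 2 (e.g., hypertensive)

st = StratifiedTable(tables)
print(f"Pooled (Mantel-Haenszel) OR: {st.oddsratio_pooled:.2f}")

# Test of homogeneity of the stratum-specific odds ratios; a small p-value
# points toward effect modification rather than a single common odds ratio.
homogeneity = st.test_equal_odds()
print(f"Homogeneity statistic = {homogeneity.statistic:.2f}, p = {homogeneity.pvalue:.3f}")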

Example: Diabetes as a Risk for Coronary Heart Disease

When you combine men and women the crude odds ratio = 4.30.

Stratifying by gender, we can calculate different measures. Look at the odds ratios above.
The odds ratio for women is 6.66, compared to the crude odds ratio of 4.30. Therefore,
women are at much greater risk of diabetes leading to incident coronary heart disease.
For men, the odds ratio is 2.23.

Is diabetes a risk for incident heart disease in men and in women? Yes. Is it the same
level of risk? No. For men the OR is 2.23; for women it is 6.66. The overall estimate is
closer to a weighted average of the two stratum-specific estimates. Gender modifies the
effect of diabetes on incident heart disease. We can see that numerically because the
crude odds ratio is more representative of a weighted average of the two groups.

What is the most informative estimate of the risk of diabetes for heart disease? 4.30 is not
very informative of the true relationship. What is much more informative is to present the
stratum-specific analysis.

During data analysis, major confounders and effect modifiers can be identified by
comparing stratified results to overall results.

In summary, the process is as follows:

1. Compute a crude (unadjusted) estimate of the association between exposure and disease.


2. Stratify the analysis by any potential major confounders to produce stratum-specific
estimates.
3. Compare the crude estimator with stratum-specific estimates and examine the kind
of relationships exhibited.
With a Confounder:
the crude estimator (e.g., RR, OR) is outside the range of the two stratum-
specific estimators (in the hypertension example, the crude odds ratio was
higher than both of the stratum-specific ratios);
If the adjusted estimator is importantly (not necessarily statistically) different
(often 10%) from the crude estimator, the adjusted variable is a confounder.
In other words, if including the potential confounder changes the estimate of
the risk by 10% or more, we consider it important and leave it in the model.
Statistical methods (Extended Mantel-Haenszel method, multiple regression,
multiple logistic regression, proportional hazards) are available to calculate the
adjusted estimator, accounting for confounders.
With Effect modifiers:
the crude estimator (e.g. RR, OR) is closer to a weighted average of the
stratum-specific estimators;
the two stratum-specific estimators differ from each other
Report separate stratified models or report an interaction term (see the sketch below).
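These two numeric patterns can be restated as a small helper. This is only a heuristic summary of the bullet points above (with an arbitrary 10% cutoff for "the stratum-specific estimators differ"), not a formal test.

def interpret(crude, stratum_estimates, differ_threshold=0.10):
    # Heuristic from the notes: a crude estimate outside the stratum-specific range
    # suggests confounding; a crude estimate between clearly different
    # stratum-specific estimates suggests effect modification.
    lo, hi = min(stratum_estimates), max(stratum_estimates)
    if not (lo <= crude <= hi):
        return "pattern consistent with confounding"
    if (hi - lo) / lo >= differ_threshold:
        return "pattern consistent with effect modification"
    return "little evidence of confounding or effect modification"

print(interpret(3.38, [2.40, 3.04]))   # hypertension example -> confounding
print(interpret(4.30, [2.23, 6.66]))   # gender example -> effect modification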

To review: confounders mask a true effect, while effect modification means that the effect
is different for different groups.

You have reached the end of the reading material for Week 3!!! Go to the Week 3
activities in ANGEL.

Source URL: https://onlinecourses.science.psu.edu/stat507/node/34

