
CHAPTER

QUALITATIVE EXPLANATORY
VARIABLES REGRESSION MODELS

QUALITATIVE VARIABLES
Qualitative variables are nominal scale variables which have
no particular numerical values.
We can quantify them by creating so-called dummy
variables, which take the values 0 and 1:
0 indicates the absence of an attribute;
1 indicates the presence of the attribute.

For example, a variable denoting gender can be quantified as
female = 1 and male = 0, or vice versa.
Dummy variables are also called indicator variables,
categorical variables, and qualitative variables.
Examples: gender, race, color, religion, nationality, geographical region,
party affiliation, and political upheavals
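
The 0/1 coding described above can be sketched in a few lines of Python. This is only an illustration: the sample labels and the helper name make_dummy are hypothetical and do not come from the slides.

```python
# Hypothetical sample of a qualitative variable (labels are illustrative only).
genders = ["female", "male", "female", "male", "female"]


def make_dummy(values, category):
    """Return a 0/1 list: 1 where the value equals `category`, otherwise 0."""
    return [1 if v == category else 0 for v in values]


female = make_dummy(genders, "female")  # female = 1, male = 0
male = make_dummy(genders, "male")      # the "vice versa" coding

print(female)  # [1, 0, 1, 0, 1]
print(male)    # [0, 1, 0, 1, 0]
```

Either coding carries the same information; what changes is which category serves as the reference.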

DUMMY VARIABLE TRAP


If an intercept is included in the model and if a qualitative
variable has m categories, then introduce only (m - 1) dummy
variables.
For example, gender has only two categories; hence we introduce only
one dummy variable for gender.
This is because if a female gets a value of 1, ipso facto a male gets a
value of zero.

If we consider political affiliation as a choice among the Democratic,
Republican, and Independent parties, we can have at most two
dummy variables to represent the three parties.
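If we do not follow this rule, we will fall into what is called the
dummy variable trap, that is, the situation of perfect collinearity:
the dummies for all m categories add up to the intercept column, so
the coefficients cannot be estimated uniquely.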
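
A minimal sketch of why the trap arises, using the three-party example. The sample observations and the helper name dummies are hypothetical. With an intercept column of ones, the dummies for all three categories sum to that column in every row, which is an exact linear dependence.

```python
# Hypothetical observations of party affiliation (illustrative only).
parties = ["Democratic", "Republican", "Independent", "Republican", "Democratic"]
categories = ["Democratic", "Republican", "Independent"]


def dummies(values, cats):
    """Build one 0/1 column per category in `cats`."""
    return {c: [1 if v == c else 0 for v in values] for c in cats}


intercept = [1] * len(parties)  # the intercept is a column of ones

# One dummy per category: the columns sum to the intercept in every row.
all_three = dummies(parties, categories)
row_sums = [sum(all_three[c][i] for c in categories) for i in range(len(parties))]
in_trap = row_sums == intercept
print(in_trap)  # True

# Only (m - 1) = 2 dummies, with "Independent" as the reference category:
# the remaining columns no longer reproduce the intercept column.
two_only = dummies(parties, categories[:-1])
safe_sums = [sum(two_only[c][i] for c in two_only) for i in range(len(parties))]
print(safe_sums == intercept)  # False
```

The second check only shows that this particular dependence is gone; it is not a general test of full column rank.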

REFERENCE CATEGORY
The category that gets the value of 0 is called
the reference, benchmark, or comparison
category.
All comparisons are made in relation to the
reference category.

If there are several dummy variables, you must
keep track of the reference category; otherwise,
it will be difficult to interpret the results.

POINTS TO KEEP IN MIND


If there is an intercept in the regression model, the number of dummy
variables must be one less than the number of classifications of each
qualitative variable.
If you drop the (common) intercept from the model, you can have as
many dummy variables as the number of categories of the qualitative
variable.
The coefficient of a dummy variable must always be interpreted in
relation to the reference category.
Dummy variables can interact with quantitative regressors as well as
with qualitative regressors. If a model has several qualitative variables
with several categories, introduction of dummies for all the
combinations can consume a large number of degrees of freedom.

INTERPRETATION OF DUMMY VARIABLES

Dummy coefficients are often called
differential intercept dummies, for they show
the differences in the intercept values of the
category that gets the value of 1 as compared to
the reference category.
The common intercept value refers to all those
categories that take a value of 0.

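If we have: Y_i = B_1 + B_2 F_i + u_i
where Y = wage, F = female dummy variable (1 for females, 0 for
males), and u = error term.
Then, on average, females earn a wage of (B_1 + B_2) and males
earn a wage of B_1. (Note that B_2 can be negative.)
Thus B_2 is the difference between the average female wage and the
average male wage: females earn more than males if B_2 is positive
and less if B_2 is negative.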
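
To make this interpretation concrete, here is a small sketch with made-up wage data (the numbers are hypothetical, not from any dataset). It uses the textbook closed-form OLS formulas for a regression with one regressor and checks that the intercept equals the male average wage and that the intercept plus the dummy coefficient equals the female average wage.

```python
from statistics import mean

# Hypothetical hourly wages and a female dummy (1 = female, 0 = male).
wages = [20.0, 22.0, 24.0, 15.0, 17.0, 19.0]
female = [0, 0, 0, 1, 1, 1]

# Closed-form OLS for a single regressor:
# slope = sum of cross-deviations / sum of squared deviations of the regressor.
f_bar, y_bar = mean(female), mean(wages)
b2 = (sum((f - f_bar) * (y - y_bar) for f, y in zip(female, wages))
      / sum((f - f_bar) ** 2 for f in female))
b1 = y_bar - b2 * f_bar

male_mean = mean([y for y, f in zip(wages, female) if f == 0])
female_mean = mean([y for y, f in zip(wages, female) if f == 1])

print(b1, b2)                  # 22.0 -5.0
print(male_mean, female_mean)  # 22.0 17.0
```

Here the dummy coefficient is negative, so in this made-up sample females earn 5 less than males on average.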

Since wages tend to be skewed to the right, we might
instead model the wage function as: ln Y_i = B_1 + B_2 F_i + u_i
In this case, females earn (e^{B_2} - 1) x 100% more than males on
average.
Taking antilogs, the typical (geometric mean) male wage is e^{B_1},
and the corresponding female wage is e^{B_1 + B_2}.
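
A short numerical sketch of the log-wage interpretation. The coefficient values below are hypothetical and chosen only for illustration. It shows that the percentage difference comes from the antilog of the dummy coefficient, not from 100 times the coefficient itself.

```python
import math

# Hypothetical estimates from a log-wage regression (illustrative only).
b1, b2 = 3.0, -0.25

# Percentage difference of females relative to males.
pct_diff = (math.exp(b2) - 1) * 100

# Antilogs of the fitted log wages for each group.
male_level = math.exp(b1)
female_level = math.exp(b1 + b2)

print(round(pct_diff, 2))  # -22.12
print(round(100 * b2, 2))  # -25.0, the naive reading, which overstates the gap
```

The ratio female_level / male_level equals e raised to the dummy coefficient, which is where the percentage formula comes from.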

USE OF DUMMY VARIABLES IN SEASONAL DATA

The process of removing the seasonal component from a
time series is called deseasonalization or seasonal
adjustment.
The resulting time series is called a deseasonalized or seasonally
adjusted time series.

Consider the following model predicting the sales of
fashion clothing:
Sales_t = A_1 + A_2 D_{2t} + A_3 D_{3t} + A_4 D_{4t} + u_t
where D_2 = 1 for the second quarter, D_3 = 1 for the third quarter,
D_4 = 1 for the fourth quarter (each is 0 otherwise), and Sales = real
sales per thousand square feet of retail space. The first quarter is
the reference category.

In order to deseasonalize the sales time series, we
proceed as follows:
1. From the estimated model, obtain the estimated (fitted) sales
volume for each period.
2. Subtract the estimated sales volume from the actual sales
volume to obtain the residuals.
3. Add the (sample) mean value of sales to the estimated
residuals. The resulting values are the deseasonalized
sales values.
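
The three steps above can be sketched as follows. The quarterly sales figures are made up for illustration. One shortcut is assumed: because the regressors are only an intercept and quarter dummies, the OLS fitted value for each quarter equals that quarter's sample mean, so the coefficients are computed from quarter means rather than with a regression routine.

```python
from statistics import mean

# Hypothetical quarterly sales over two years (illustrative only).
sales = [70.0, 80.0, 90.0, 120.0, 74.0, 84.0, 94.0, 124.0]
quarters = [1, 2, 3, 4, 1, 2, 3, 4]

# Seasonal dummies; the first quarter is the reference category.
d2 = [1 if q == 2 else 0 for q in quarters]
d3 = [1 if q == 3 else 0 for q in quarters]
d4 = [1 if q == 4 else 0 for q in quarters]


def quarter_mean(q):
    """Sample mean of sales in quarter q."""
    return mean([s for s, k in zip(sales, quarters) if k == q])


# OLS estimates for an intercept-plus-dummies model:
# intercept = mean of the reference quarter, each dummy coefficient =
# that quarter's mean minus the reference quarter's mean.
a1 = quarter_mean(1)
a2 = quarter_mean(2) - a1
a3 = quarter_mean(3) - a1
a4 = quarter_mean(4) - a1

# Step 1: estimated (fitted) sales.
fitted = [a1 + a2 * x2 + a3 * x3 + a4 * x4 for x2, x3, x4 in zip(d2, d3, d4)]
# Step 2: residuals = actual minus estimated sales.
residuals = [s - f for s, f in zip(sales, fitted)]
# Step 3: add the sample mean of sales to the residuals.
sales_mean = mean(sales)
deseasonalized = [r + sales_mean for r in residuals]

print(deseasonalized)  # [90.0, 90.0, 90.0, 90.0, 94.0, 94.0, 94.0, 94.0]
```

In this made-up series the quarterly pattern disappears and only the year-to-year change remains, while the mean of the adjusted series equals the mean of the original sales.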