CHRISTOPHER A. LLONES
Presented to:
Dr. Moises Neil V. Serio
As Partial Requirements in
Econometrics
1st Semester S.Y. 2015-2016
SEPTEMBER 2015
Table of Contents
Introduction
Data Collection
Data Analysis
Choice of Model and Variables
Scope and Limitation
Regression Analysis
Summary and Descriptive Statistics of the Model
Modifying and Organizing the Variables
Correlation
Regression
Regression Diagnostic Tests
Test for the Normality of Residuals
Test for Heteroscedasticity
Test for Multicollinearity
Test for Specification Error
Test for Autocorrelation
Prais-Winsten Regression
Findings
References
Introduction
The United Arab Emirates (UAE) is a member of the Organization of the Petroleum Exporting
Countries (OPEC) and is located in the Middle East. According to Energy Supply Security 2014 of
the International Energy Agency (IEA), the country's large crude oil production makes it the
fourth largest supplier of crude oil, accounting for 12% of total world supply. Crude oil is a
non-renewable resource found in natural underground reservoirs. No close substitute has yet been
found, so consumer demand for it is inelastic; supply is also expected to be quite inelastic
because the oil market is monopolistic in nature.
This paper applies regression analysis to estimate the supply elasticity of UAE crude oil,
modeling crude oil exports as a function of the real price of crude oil, crude oil production
advanced by one year, and crude oil exports lagged by one year.
Data Collection
The data used in this paper were collected from the websites of OPEC and the World Bank.
Data Analysis
The study uses descriptive statistics to describe the data, and Stata to run the regression
analysis and diagnostic tests used in estimating the supply elasticity of crude oil.
Choice of Model and Variables
The study uses a basic double-log model with lagged and lead variables to estimate the
elasticity of crude oil supply. The dependent variable is the UAE's annual export of crude oil,
which represents supply outside the country. The explanatory variables are the real crude oil
price, crude oil production advanced by one year, and crude oil exports lagged by one year.
Exports are lagged by a year to account for the effect of the country's previous supply on
present exports, while production is advanced by a year to account for the exporter's
anticipation of the future oil market, which can affect its present willingness to supply.
Scope and Limitation
The study uses annual time-series data from 1960 to 2014 collected from OPEC and the World Bank.
The paper focuses primarily on conducting and applying regression analysis and gives less
attention to the implications of the estimates generated by the model. The author discourages
using the model for policy recommendations. This paper's primary objective is only to apply
regression methods using Stata as the statistical software.
Regression Analysis
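The model is based on the relationship export of crude oil = f(real oil price, production of
crude oil advanced by one year, export of crude oil lagged by one year). In double-log form,
this can be expressed as:
\[
\ln(\text{CrudeXport}_t) = \beta_0 + \beta_1 \ln(\text{RealOilPrice}_t) + \beta_2 \ln(\text{CrudeProd}_{t+1}) + \beta_3 \ln(\text{CrudeXport}_{t-1}) + \varepsilon_t
\]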
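Because the model is in double-log form, each slope coefficient can be read as an elasticity. The short derivation below sketches that step for the price coefficient; the symbol \beta_1 for the coefficient on the log of real oil price and the time subscript t are notation assumed here for illustration.

```latex
\[
\beta_1
= \frac{\partial \ln(\text{CrudeXport}_t)}{\partial \ln(\text{RealOilPrice}_t)}
= \frac{\partial \text{CrudeXport}_t / \text{CrudeXport}_t}
       {\partial \text{RealOilPrice}_t / \text{RealOilPrice}_t}
\approx \frac{\%\Delta\, \text{CrudeXport}_t}{\%\Delta\, \text{RealOilPrice}_t}
\]
```

Under this reading, a coefficient below 1 in absolute value indicates inelastic supply with respect to price.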
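Summary and Descriptive Statistics of the Model

    Variable |   Obs       Mean   Std. Dev.        Min        Max
-------------+---------------------------------------------------
        year |    55       1987   16.02082       1960       2014
  CrudeXport |    35   7.634224   .3228516   6.975414   8.161945
RealOilPrice |    43   3.024035   .5816263   2.257588    4.23931
   CrudeProd |    53   7.165557   .9962443   2.639057   7.936303

             |                         |            Obs<.
    Variable |  Obs=.   Obs>.   Obs<.  |  Unique values        Min        Max
-------------+-------------------------+-------------------------------------
        year |                     55  |             55       1960       2014
  CrudeXport |     20              35  |             34   6.975414   8.161945
RealOilPrice |     12              43  |             42   2.257588    4.23931
   CrudeProd |      2              53  |             51   2.639057   7.936303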
The summarize command reports, for each variable, the number of observations, the mean, the
standard deviation, and the minimum and maximum values. The variable year runs from 1960 to
2014, which gives 55 observations, so any variable with fewer than 55 observations has missing
values. The statistics reported for CrudeXport, RealOilPrice and CrudeProd are in natural
logarithms. The command misstable summarize, all tabulates the number of missing observations
(Obs=.) and the number of non-missing observations (Obs<.); the latter is labeled "less than
missing" because Stata treats missing values as larger than any number. The data therefore have
20, 12 and 2 missing values in crude oil exports, real oil price and crude oil production,
respectively.
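The missing-value counts can be cross-checked by simple arithmetic. The following Python sketch (not part of the original Stata workflow) takes the observation counts reported by summarize and subtracts them from the 55 years in the sample; the dictionary names are chosen here for illustration.

```python
# Observation counts reported by `summarize` in the text above.
TOTAL_YEARS = 2014 - 1960 + 1  # annual data from 1960 through 2014

non_missing = {"CrudeXport": 35, "RealOilPrice": 43, "CrudeProd": 53}

# A variable's missing count is its shortfall from the full span of years.
missing = {name: TOTAL_YEARS - n for name, n in non_missing.items()}

for name, count in missing.items():
    print(f"{name}: {count} missing out of {TOTAL_YEARS}")
```

The counts agree with the Obs=. column of the misstable summarize output.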
Modifying and Organizing the Variables
. d CrudeXport RealOilPrice CrudeProd

               storage   display    value
variable name    type    format     label      variable label
--------------------------------------------------------------
CrudeXport       int     %8.0g                 Xport Crude
RealOilPrice     float   %9.0g                 Real Oil Price
CrudeProd        int     %8.0g                 Crude Production
. replace CrudeXport=ln( CrudeXport)
CrudeXport was int now float
(35 real changes made)
. replace RealOilPrice=ln( RealOilPrice)
(43 real changes made)
. replace CrudeProd=ln( CrudeProd)
CrudeProd was int now float
(55 real changes made, 2 to missing)
After the variables are transformed into logs, their storage type changes to float; the output
of the command d (short for describe) shows the storage types before the transformation. Because
the data are a time series, the tsset command is used to declare them as such to Stata. Once the
data are declared as a time series, the lag and lead operators can be used to generate lagged
and lead variables.
. tsset year, yearly
        time variable:  year, 1960 to 2014
                delta:  1 year
. gen lagCrudeXport=L1.CrudeXport
(21 missing values generated)
. gen leadCrudeProd=F1.CrudeProd
(2 missing values generated)
The yearly option tells Stata that the time variable is annual. The time-series operators L.
(lag) and F. (lead) can then be used. The number 1 means that crude oil exports are lagged by
one year and crude production is advanced by one year. This captures the researcher's hypothesis
that, aside from price, expected production and past exports affect the amount supplied at
present.
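To illustrate what the lag and lead operators do, and why they generate additional missing values, here is a minimal Python sketch. The functions lag, lead and count_missing and the toy series are hypothetical illustrations, not the UAE data, and the sketch assumes consecutive years with no gaps in the time variable.

```python
from typing import List, Optional

Series = List[Optional[float]]


def lag(series: Series, k: int = 1) -> Series:
    """Value observed k periods earlier (in the spirit of Stata's L. operator)."""
    n = len(series)
    return [series[t - k] if 0 <= t - k < n else None for t in range(n)]


def lead(series: Series, k: int = 1) -> Series:
    """Value observed k periods later (in the spirit of Stata's F. operator)."""
    n = len(series)
    return [series[t + k] if 0 <= t + k < n else None for t in range(n)]


def count_missing(series: Series) -> int:
    """Number of missing (None) entries in a series."""
    return sum(value is None for value in series)


# Hypothetical toy series; None marks a year with no observation.
toy_exports = [None, 7.0, 7.2, 7.5, 7.4]
toy_production = [None, None, 2.6, 3.1, 3.4]

lag_exports = lag(toy_exports)          # previous year's exports
lead_production = lead(toy_production)  # following year's production

print(lag_exports, count_missing(lag_exports))
print(lead_production, count_missing(lead_production))
```

In the toy example the lag adds one missing value at the start of the series, while the lead drops one at the start and adds one at the end. This pattern is consistent with the 21 and 2 missing values reported by the gen commands.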
Correlation
Before running the regression, it is useful to examine the associations among the variables in
the model. The pwcorr command computes pairwise correlations.
. pwcorr CrudeXport RealOilPrice CrudeProd lagCrudeXport leadCrudeProd

             | CrudeX~t RealOi~e CrudeP~d lagCru~t leadCr~d
-------------+---------------------------------------------
  CrudeXport |   1.0000
RealOilPrice |   0.7437   1.0000
   CrudeProd |   0.9844   0.6701   1.0000
lagCrudeXp~t |   0.9551   0.7250   0.9681   1.0000
leadCrudeP~d |   0.9681   0.6385   0.9855   0.9168   1.0000
The pairwise correlations show that every explanatory variable is positively associated with the
dependent variable. The associations are strong: the production and lagged-export variables have
correlations with CrudeXport above 0.95, while RealOilPrice has a more moderate correlation of
0.74.
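Pairwise correlation uses, for each pair of variables, only the observations where both are non-missing. A minimal Python sketch of that idea follows; the function pairwise_corr and the toy data are hypothetical and stand in for what pwcorr computes on the actual data.

```python
import math
from typing import List, Optional


def pairwise_corr(x: List[Optional[float]], y: List[Optional[float]]) -> float:
    """Pearson correlation using only observations where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data; None marks a missing observation.
toy_x = [1.0, 2.0, None, 4.0, 5.0]
toy_y = [2.0, 4.0, 6.0, None, 10.0]

# Only the three complete pairs (1, 2), (2, 4) and (5, 10) are used.
r_toy = pairwise_corr(toy_x, toy_y)
print(round(r_toy, 4))
```

Because each pair can rest on a different set of years, the correlations in the matrix are not all computed on the same sample.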
Regression
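The basic double-log form of the model is:
\[
\ln(\text{CrudeXport}_t) = \beta_0 + \beta_1 \ln(\text{RealOilPrice}_t) + \beta_2 \ln(\text{CrudeProd}_{t+1}) + \beta_3 \ln(\text{CrudeXport}_{t-1}) + \varepsilon_t
\]
The dependent variable is regressed on the explanatory variables using the regress command in
Stata.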
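. regress CrudeXport RealOilPrice leadCrudeProd lagCrudeXport

      Source |       SS       df       MS              Number of obs =      33
-------------+------------------------------           F(  3,    29) =  320.71
       Model |  3.20418579     3  1.06806193           Prob > F      =  0.0000
    Residual |  .096579859    29   .00333034           R-squared     =  0.9707
-------------+------------------------------           Adj R-squared =  0.9677
       Total |  3.30076565    32  .103148927           Root MSE      =  .05771

--------------------------------------------------------------------
  CrudeXport |      Coef.   Std. Err.      t    P>|t|   Upper 95% CI
-------------+------------------------------------------------------
RealOilPrice |   .0580351   .0258407     2.25   0.032       .1108853
leadCrudeP~d |   .6313507   .0851185     7.42   0.000       .8054375
lagCrudeXp~t |    .357686   .0882998     4.05   0.000       .5382794
       _cons |  -.0604724    .309896    -0.20   0.847        .573336
--------------------------------------------------------------------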
The F-test, which measures the overall significance of the model, shows that the model is
significant at the 1% level, since its p-value is less than 0.01. This implies that at least one
of the explanatory variables explains variation in crude oil exports. Based on the R-squared,
the explanatory variables explain about 97% of the variation in crude oil exports. The t-tests,
which test whether each explanatory variable has a linear relationship with the dependent
variable, show that RealOilPrice is significant at the 5% level and that leadCrudeProd and
lagCrudeXport are significant at the 1% level. However, before drawing inferences from these
results, the model must undergo diagnostic tests to determine whether the coefficient estimates
are unbiased and the p-values are valid.
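The summary statistics in the regression output follow directly from the sums of squares and degrees of freedom. As a cross-check, the Python sketch below recomputes the F statistic, R-squared, adjusted R-squared, root MSE and t statistics from the figures reported above; the variable names are chosen here for illustration.

```python
import math

# Sums of squares and degrees of freedom from the regress output above.
ss_model, df_model = 3.20418579, 3
ss_resid, df_resid = 0.096579859, 29

ss_total = ss_model + ss_resid
df_total = df_model + df_resid
ms_model = ss_model / df_model
ms_resid = ss_resid / df_resid

f_stat = ms_model / ms_resid                          # overall F test
r_squared = ss_model / ss_total                       # share of variation explained
adj_r_squared = 1 - ms_resid / (ss_total / df_total)  # penalized for regressors
root_mse = math.sqrt(ms_resid)                        # standard error of the regression

# Coefficients and standard errors from the same output.
estimates = {
    "RealOilPrice": (0.0580351, 0.0258407),
    "leadCrudeProd": (0.6313507, 0.0851185),
    "lagCrudeXport": (0.357686, 0.0882998),
    "_cons": (-0.0604724, 0.309896),
}
# Each t statistic is the coefficient divided by its standard error.
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

print(f"F = {f_stat:.2f}, R2 = {r_squared:.4f}, adj R2 = {adj_r_squared:.4f}")
print(f"Root MSE = {root_mse:.5f}")
for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")
```

The recomputed values agree with the reported output up to rounding.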
Regression Diagnostic Tests
Test for the Normality of Residuals
. predict r, residual
(22 missing values generated)
. pnorm r
. kdensity r, normal
The kernel density estimate (kdensity) and the standardized normal probability plot (pnorm) show
a slight deviation from normality; nonetheless, the residuals are quite close to a normal
distribution. To obtain a clearer result on whether the residuals are normally distributed, the
Shapiro-Wilk test for normality is run with the swilk command.
. swilk r

                   Shapiro-Wilk W test for normal data

    Variable |    Obs        W          V         z      Prob>z
-------------+-------------------------------------------------
           r |     33    0.97667     0.796    -0.473    0.68200
The null hypothesis of the Shapiro-Wilk test is that the data are normally distributed. With a
p-value of 0.682, the test fails to reject the null hypothesis, so there is no evidence that the
residuals depart from normality.
Test for Heteroscedasticity
It is difficult to discern a pattern of heteroscedasticity from the residual plot, since there
are too few points to establish a clear pattern. Roughly, however, heteroscedasticity does not
appear to be present, since the data points do not narrow toward the right. The White test and
the Breusch-Pagan test provide a formal check based on their p-values. The commands estat imtest
(whose heteroskedasticity component is used here as the White test) and estat hettest are used
for the White test and the Breusch-Pagan test, respectively.
. estat imtest

Cameron & Trivedi's decomposition of IM-test

              Source |     chi2     df        p
---------------------+-------------------------
  Heteroskedasticity |     7.90      9   0.5440
            Skewness |     2.00      3   0.5723
            Kurtosis |     0.85      1   0.3552
---------------------+-------------------------
               Total |    10.76     13   0.6311

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of CrudeXport

         chi2(1)      =     1.19
         Prob > chi2  =   0.2756
Both tests have the null hypothesis that the variance of the residuals is constant
(homoscedastic). Since the p-values of both tests are large (0.5440 and 0.2756), the tests fail
to reject the null hypothesis, so there is no evidence of heteroscedasticity. Had the tests
rejected the null hypothesis, the vce(robust) option of the regress command would have been used
to obtain standard errors that are robust to heteroscedasticity.
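The p-values of the one-degree-of-freedom statistics above can be reproduced with the standard library alone, because a chi-squared variable with one degree of freedom is the square of a standard normal. The Python sketch below is a cross-check and not Stata's own computation; the function name chi2_sf_1df is chosen here for illustration, and small differences arise because the reported statistics are rounded to two decimals.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-squared variable with 1 df.

    Uses P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    if x < 0:
        raise ValueError("chi-squared statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2.0))


# Statistics reported above, each with one degree of freedom.
p_breusch_pagan = chi2_sf_1df(1.19)  # reported Prob > chi2 = 0.2756
p_kurtosis = chi2_sf_1df(0.85)       # reported p = 0.3552

print(f"Breusch-Pagan p-value: {p_breusch_pagan:.4f}")
print(f"IM-test kurtosis p-value: {p_kurtosis:.4f}")
```

Both p-values are well above 0.05, which matches the conclusion that the null hypothesis of constant variance is not rejected.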
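Test for Multicollinearity

    Variable |       VIF       1/VIF
-------------+----------------------
lagCrudeXp~t |      7.09    0.140980
leadCrudeP~d |      6.27    0.159400
RealOilPrice |      1.96    0.510924
-------------+----------------------
    Mean VIF |      5.11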
As a rule of thumb, a VIF greater than 10, or equivalently a tolerance (1/VIF) less than 0.1,
signals multicollinearity. All variance inflation factors (VIF) here are below 10 and all
tolerances are above 0.1, so no variable is a near-perfect linear combination of the others;
that is, the variables are not capturing the same information.
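The rule of thumb can be applied mechanically to the reported figures. The Python sketch below recomputes the tolerance and the mean VIF, and also backs out the R-squared of each auxiliary regression using the standard identity VIF = 1 / (1 - R-squared); the dictionary names are chosen here for illustration, and small differences from the table come from the VIFs being rounded to two decimals.

```python
# Variance inflation factors reported above.
vif = {"lagCrudeXport": 7.09, "leadCrudeProd": 6.27, "RealOilPrice": 1.96}

# Tolerance is the reciprocal of the VIF.
tolerance = {name: 1.0 / v for name, v in vif.items()}

# R-squared from regressing each variable on the other explanatory variables.
auxiliary_r2 = {name: 1.0 - 1.0 / v for name, v in vif.items()}

mean_vif = sum(vif.values()) / len(vif)

# Rule of thumb: flag a variable when VIF > 10 (tolerance < 0.1).
flagged = [name for name, v in vif.items() if v > 10 or 1.0 / v < 0.1]

for name in vif:
    print(f"{name}: VIF={vif[name]:.2f}, 1/VIF={tolerance[name]:.4f}, "
          f"auxiliary R2={auxiliary_r2[name]:.4f}")
print(f"Mean VIF = {mean_vif:.2f}; flagged: {flagged}")
```

No variable is flagged, although the lagged-export and lead-production variables each share a large part of their variation with the other regressors.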
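Test for Specification Error
. linktest

      Source |       SS       df       MS              Number of obs =      33
-------------+------------------------------           F(  2,    30) =  498.11
       Model |  3.20427346     2  1.60213673           Prob > F      =  0.0000
    Residual |   .09649219    30  .003216406           R-squared     =  0.9708
-------------+------------------------------           Adj R-squared =  0.9688
       Total |  3.30076565    32  .103148927           Root MSE      =  .05671

--------------------------------------------------------------------
  CrudeXport |      Coef.   Std. Err.      t    P>|t|   Upper 95% CI
-------------+------------------------------------------------------
        _hat |   .7272369   1.651505     0.44   0.663        4.10006
      _hatsq |   .0181491   .1098676     0.17   0.870       .2425286
       _cons |   1.022852   6.196664     0.17   0.870       13.67813
--------------------------------------------------------------------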
The link test creates two variables, _hat and _hatsq: _hat is the prediction and _hatsq is the
squared prediction. If the model is specified correctly, _hatsq should not be significant, so
the primary concern is the test of _hatsq. Here _hatsq is not significant (p = 0.870), so the
test fails to reject the hypothesis that the model is specified correctly.
The ovtest command tests whether the model has omitted a variable that should have been
included.
. ovtest

Ramsey RESET test using powers of the fitted values of CrudeXport
       Ho:  model has no omitted variables
                 F(3, 26) =      0.82
                 Prob > F =    0.4969
The test fails to reject the null hypothesis that the model has no omitted variables
(p = 0.4969), so there is no evidence of omitted variables.
Test for Autocorrelation
Lastly, a time-series model with lagged and lead variables is prone to autocorrelation. The
commands estat bgodfrey and estat dwatson run the Breusch-Godfrey and Durbin-Watson tests,
respectively, for serial correlation of the error term.
. estat bgodfrey

Breusch-Godfrey LM test for autocorrelation

    lags(p)  |     chi2
-------------+----------
       1     |    4.867

Durbin-Watson d-statistic(  4,    33) =  2.55771
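For reference, the Durbin-Watson statistic is the sum of squared changes in the residuals divided by the sum of squared residuals, and it is approximately 2(1 - rho), where rho is the first-order autocorrelation of the residuals. The Python sketch below shows the formula on hypothetical toy residuals and backs out the rho implied by the reported statistic; the function names are chosen here for illustration. It also computes the p-value of the Breusch-Godfrey statistic under the assumption that it has one degree of freedom, matching the single lag tested.

```python
import math
from typing import Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson d: sum of squared first differences over sum of squares."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability for a chi-squared variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical residuals that alternate in sign (negative autocorrelation).
toy_residuals = [1.0, -1.0, 1.0, -1.0]
d_toy = durbin_watson(toy_residuals)  # values above 2 signal negative correlation

# Reported statistics from the output above.
d_reported = 2.55771
implied_rho = 1.0 - d_reported / 2.0  # from d ~ 2 * (1 - rho)
p_bgodfrey = chi2_sf_1df(4.867)       # assumes 1 degree of freedom

print(f"toy d = {d_toy:.2f}")
print(f"implied rho = {implied_rho:.4f}")
print(f"Breusch-Godfrey p-value = {p_bgodfrey:.4f}")
```

A d-statistic above 2 and a negative implied rho both point to negative first-order serial correlation.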
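The Breusch-Godfrey statistic of 4.867 exceeds the 5% critical value of 3.84 for a chi-squared
distribution with one degree of freedom, so the test rejects the null hypothesis of no serial
correlation, and the Durbin-Watson d-statistic of 2.56 lies above 2, which points to negative
serial correlation. The error term is therefore treated as serially correlated.
Prais-Winsten Regression
According to the Stata manual, prais, which implements the Prais-Winsten and Cochrane-Orcutt
regressions, uses the generalized least-squares method to estimate the parameters in a linear
regression model in which the errors are serially correlated; specifically, the errors are
assumed to follow a first-order autoregressive process. The Prais-Winsten regression therefore
yields new coefficient estimates that are adjusted for autocorrelation.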
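. prais CrudeXport RealOilPrice lagCrudeXport leadCrudeProd, rhotype(theil)

Iteration 0:  rho = 0.0000
Iteration 1:  rho = -0.2901
Iteration 2:  rho = -0.3060
Iteration 3:  rho = -0.3064
Iteration 4:  rho = -0.3064
Iteration 5:  rho = -0.3064

      Source |       SS       df       MS              Number of obs =      33
-------------+------------------------------           F(  3,    29) = 1611.70
       Model |  14.0513271     3   4.6837757           Prob > F      =  0.0000
    Residual |  .084277236    29  .002906112           R-squared     =  0.9940
-------------+------------------------------           Adj R-squared =  0.9934
       Total |  14.1356043    32  .441737635           Root MSE      =  .05391

--------------------------------------------------------------------
  CrudeXport |      Coef.   Std. Err.      t    P>|t|   Upper 95% CI
-------------+------------------------------------------------------
RealOilPrice |   .0497158   .0190028     2.62   0.014       .0885809
lagCrudeXp~t |   .3543518   .0688691     5.15   0.000       .4952049
leadCrudeP~d |   .6477729   .0662351     9.78   0.000       .7832388
       _cons |  -.1344191   .2252445    -0.60   0.555       .3262576
-------------+------------------------------------------------------
         rho |  -.3063733
--------------------------------------------------------------------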
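The core of the Prais-Winsten method is a transformation of the data for a given value of rho: the first observation is rescaled by the square root of (1 - rho squared), and each later observation has rho times its predecessor subtracted. The Python sketch below shows only this transformation step on a hypothetical toy series; the function name is chosen here for illustration. In the full procedure the same transformation is applied to every regressor and to the constant, ordinary least squares is rerun on the transformed data, and rho is re-estimated until it converges, which this sketch does not attempt.

```python
import math
from typing import List, Sequence


def prais_winsten_transform(series: Sequence[float], rho: float) -> List[float]:
    """Apply the Prais-Winsten AR(1) transformation for a given rho.

    First observation: sqrt(1 - rho^2) * y_1 (kept rather than dropped).
    Later observations: y_t - rho * y_{t-1}.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    first = math.sqrt(1.0 - rho ** 2) * series[0]
    rest = [series[t] - rho * series[t - 1] for t in range(1, len(series))]
    return [first] + rest


# Hypothetical toy series, not the UAE data.
toy_series = [2.0, 3.0, 5.0]

transformed_half = prais_winsten_transform(toy_series, 0.5)
# Using the rho reported in the prais output above.
transformed_est = prais_winsten_transform(toy_series, -0.3063733)

print(transformed_half)
print(transformed_est)
```

Keeping the rescaled first observation is what distinguishes Prais-Winsten from the Cochrane-Orcutt variant, which drops it.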
Findings
Because the p-value of the F-test is less than 0.01, the model is significant at the 1% level:
there is evidence that at least one of the explanatory variables explains variation in UAE crude
oil exports. Based on the R-squared, the explanatory variables explain about 99% of the
variation in crude oil exports. Furthermore, all explanatory variables are individually
significant based on the p-values of the t-tests: RealOilPrice at the 5% level, and
lagCrudeXport and leadCrudeProd at the 1% level. Since the model has undergone diagnostic tests
and the coefficient estimates have been adjusted for autocorrelation, inferences can now be
drawn from the estimated coefficients.
Based on the classical law of supply, real oil price is expected to have a positive sign, as are
future production of crude oil and past exports. The coefficient of RealOilPrice is the supply
elasticity of crude oil for the United Arab Emirates. At 0.0497, supply is inelastic, meaning
that crude oil exports are not very responsive to changes in the price of crude oil. Crude oil
has no close substitute, so demand for it is inelastic. The UAE is a member of OPEC, which has
monopoly-like power to set prices; this is consistent with the estimate that the supply of crude
oil is inelastic. A 1% decrease in price would decrease exports by only about 0.05%, since
prices are dictated by OPEC itself. A 1% increase in the previous year's crude oil exports
increases present exports by 0.354%. The present increase is smaller than the previous one
because sellers keep supply low to maintain a higher price level and because crude oil is a
non-renewable resource. Lastly, if anticipated production increases by 1%, crude oil exports
increase by about 0.65%. The explanation is straightforward: when production of the good
increases, sellers have more to supply.
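In a double-log model, reading a coefficient as "percent change per 1% change" is an approximation that is very close for small changes. The Python sketch below compares that approximation with the exact change implied by the model, using the Prais-Winsten coefficients reported above; the function pct_change_in_exports is a hypothetical helper written for this illustration.

```python
def pct_change_in_exports(beta: float, pct_change_x: float) -> float:
    """Exact percent change in exports implied by a log-log coefficient.

    If ln(y) = ... + beta * ln(x), scaling x by (1 + g) scales y by (1 + g) ** beta.
    """
    growth = pct_change_x / 100.0
    return ((1.0 + growth) ** beta - 1.0) * 100.0


# Coefficients from the Prais-Winsten output above.
beta_price = 0.0497158
beta_lag_exports = 0.3543518
beta_lead_production = 0.6477729

effect_price_1 = pct_change_in_exports(beta_price, 1.0)
effect_price_10 = pct_change_in_exports(beta_price, 10.0)
effect_lag_1 = pct_change_in_exports(beta_lag_exports, 1.0)
effect_lead_1 = pct_change_in_exports(beta_lead_production, 1.0)

print(f"1% price rise  -> {effect_price_1:.4f}% more exports")
print(f"10% price rise -> {effect_price_10:.4f}% more exports")
print(f"1% rise in lagged exports  -> {effect_lag_1:.4f}%")
print(f"1% rise in lead production -> {effect_lead_1:.4f}%")
```

For a 1% change the exact figures are nearly identical to the coefficients themselves, and even a 10% rise in price implies an increase in exports of less than 0.5%.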
References
Chen, X., Ender, P., Mitchell, M. and Wells, C. (2003). Regression with Stata. Retrieved from
http://www.ats.ucla.edu/stat/stata/webbooks/reg/default.htm
International Energy Agency (IEA) (2014). Energy Supply Security 2014: Emergency Response of IEA
Countries, pp. 502-510.
Organization of the Petroleum Exporting Countries (OPEC) (2015). OPEC Annual Statistical
Bulletin, 50th Edition.
Wooldridge, J. (2009). Introductory Econometrics, Fourth Edition, pp. 339-435.