Abstract
Theil (1968) proposed a transformation of regression residuals that is
best (it minimizes the trace of the covariance matrix), linear, unbiased and subject to the
constraint that the covariance matrix is scalar (BLUS), that is, proportional
to the identity matrix. Despite their desirable theoretical properties, Theil’s tests for
autocorrelation and heteroscedasticity using BLUS residuals are not much used by
researchers, perhaps because of computational difficulties. My R program is checked
against Ford (2008), who provides an example with implementations in Eviews and SAS
software. Vinod (2010) suggests going beyond testing by making efficient adjustments
to overcome the ill effects of non-scalar covariances. Links for R software to implement
those tools are provided here near the end of the paper. I hope that my R software will
help researchers fill a gap in the literature by studying the size and power of Theil’s tests
in comparison with other tests in the literature, and begin to focus on simultaneously
overcoming these two common problems.
which plays an important role in statistical diagnostics and inference. The vector of BLUS
residuals is developed in Theil (1968) as a transformation of r and is claimed to be more
useful for inference than r.
Note that the matrix M is symmetric and idempotent, with (T − p) eigenvalues equal to 1
and p eigenvalues equal to zero. The trace (sum of diagonal elements) of M is T − p. Since
M = (I − H), we have M X = (X − X) = 0, a matrix of zeros. Substituting (1) in (4) gives an
important relation between the residuals and the true unknown regression errors:
$$r = M y = M \varepsilon, \tag{5}$$
where the residuals are obviously not ‘independent’ since they are orthogonal to the column
space of X (by construction), implying that
$$X' r = 0, \tag{6}$$
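The properties of M stated above can be checked numerically. The following is a minimal sketch on simulated data, not part of the original program: the sample sizes, the coefficient vector and the seed are arbitrary, and H is assumed to be the usual hat matrix X(X'X)^{-1}X'.

```r
# Numerical check of the properties of M = I - H on simulated data.
# All sizes and coefficients below are hypothetical choices for illustration.
set.seed(1)
T <- 12; p <- 3
X <- cbind(1, matrix(rnorm(T * (p - 1)), T, p - 1))  # first column: intercept
eps <- rnorm(T)                      # "true" errors, known only in a simulation
y <- X %*% c(1, 2, -1) + eps

H <- X %*% solve(crossprod(X)) %*% t(X)  # hat matrix (assumed definition of H)
M <- diag(T) - H
r <- M %*% y                             # OLS residuals, r = M y

ev <- eigen(M, symmetric = TRUE)$values
checks <- c(
  symmetric  = isTRUE(all.equal(M, t(M))),
  idempotent = isTRUE(all.equal(M %*% M, M)),
  trace_Tmp  = isTRUE(all.equal(sum(diag(M)), T - p)),
  eig_ones   = sum(abs(ev - 1) < 1e-8) == T - p,
  eig_zeros  = sum(abs(ev) < 1e-8) == p,
  MX_zero    = max(abs(M %*% X)) < 1e-10,
  r_eq_Meps  = isTRUE(all.equal(r, M %*% eps)),      # eq. (5)
  Xr_zero    = max(abs(crossprod(X, r))) < 1e-10     # eq. (6)
)
print(checks)
```

Each entry of `checks` should be TRUE up to floating-point tolerance.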
$$([X X_0^{-1}]' [X X_0^{-1}])^{-1} = [(X_0')^{-1} X' X X_0^{-1}]^{-1} = X_0 (X'X)^{-1} X_0'. \tag{8}$$
Now consider the eigenvalue-eigenvector decomposition of the last matrix expression of
eq. (8):
$$X_0 (X'X)^{-1} X_0' \tag{9}$$
which is symmetric and positive definite. Hence its eigenvalues are always positive and may
be denoted as $d_i^2$, with the corresponding eigenvectors denoted as $g_i$ for $i = 1, \dots, p$. Using these
eigenvalues and outer products of eigenvectors we define a p × p matrix
$$Z = \sum_{i=1}^{p} \frac{d_i}{1 + d_i} [g_i g_i']. \tag{10}$$
$$u = r_1 - X_1 X_0^{-1} Z r_0. \tag{11}$$
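To see what eq. (11) delivers, the transformation can be written as u = A y and its defining properties checked directly. The sketch below is illustrative only and uses simulated data with arbitrary sizes; the matrices `C` and `A` are names introduced here, and the first p observations are assumed to form the nonsingular block X_0.

```r
# Sketch: build BLUS residuals from eqs. (9)-(11) and verify that the implied
# linear map A (u = A y) has scalar covariance (A A' = I) and annihilates X.
# Data, sizes and the names C and A are hypothetical.
set.seed(2)
T <- 15; p <- 3
X <- cbind(1, matrix(rnorm(T * (p - 1)), T, p - 1))
y <- X %*% c(1, 0.5, -2) + rnorm(T)

XtXinv <- solve(crossprod(X))
M  <- diag(T) - X %*% XtXinv %*% t(X)
r  <- M %*% y                                   # OLS residuals
X0 <- X[1:p, , drop = FALSE];       r0 <- r[1:p]
X1 <- X[(p + 1):T, , drop = FALSE]; r1 <- r[(p + 1):T]

ei <- eigen(X0 %*% XtXinv %*% t(X0), symmetric = TRUE)  # matrix of eq. (9)
d  <- sqrt(ei$values)                                   # d_i
G  <- ei$vectors                                        # columns are g_i
Z  <- G %*% diag(d / (1 + d)) %*% t(G)                  # eq. (10)

u <- r1 - X1 %*% solve(X0) %*% Z %*% r0                 # eq. (11)

C <- cbind(-X1 %*% solve(X0) %*% Z, diag(T - p))        # u = C r
A <- C %*% M                                            # u = A y
checks <- c(
  scalar_cov = isTRUE(all.equal(A %*% t(A), diag(T - p))),
  unbiased   = max(abs(A %*% X)) < 1e-8,
  same_rss   = isTRUE(all.equal(sum(u^2), sum(r^2)))
)
print(checks)
```

The last check reflects that the T − p BLUS residuals carry the same sum of squares as the T OLS residuals.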
The BLUS residuals are widely known in theoretical econometrics. For example, Chow
(1976) provides an alternative derivation of BLUS residuals. However, due to computational
difficulty they are not popular. Vougas (1980) uses them for unit root testing. I hope that
the availability of the new R software provided in Section 2 will make them more accessible.
1.1 Von-Neumann ratio test for autocorrelation
Theil (1971, sec. 5.4) describes the modified Von Neumann ratio test statistic
$$Q' = \frac{\sum_{t=1}^{T-p-1} (u_{t+1} - u_t)^2}{(T - p) s^2} \tag{13}$$
where the $u_t$ are the elements of u defined in eq. (11). Theil’s Appendix also provides tables for one-tailed tests
against positive or negative autocorrelation at the (0.1, 1, 5)% levels.
The popular Durbin-Watson (DW) test, which also uses the relation (5) in its derivation,
is subject to an indeterminate range of values (DWL , DWU ) of the statistic within which we can
neither accept nor reject the null hypothesis (regarding autocorrelated regression errors). An
appeal of the test based on eq. (13) is that one can avoid such an indeterminate range, which
can be important for small samples. Section 2 provides an R function called ‘Qdash’ to
implement the test.
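The numerator of eq. (13) is a sum of squared first differences, so it can be written compactly with `diff()`. The helper below is a hypothetical sketch, not the ‘Qdash’ function itself; it assumes, as the ‘Qdash’ code in Section 2 does, that the denominator (T − p)s^2 is supplied as the residual sum of squares.

```r
# Hypothetical helper: u holds the T - p BLUS residuals and rss is the residual
# sum of squares, taken here to be the (T - p) s^2 in the denominator of eq. (13).
q_prime <- function(u, rss) sum(diff(u)^2) / rss

# Toy numbers chosen for easy hand-checking; they are not BLUS residuals.
u_demo <- c(1, 3, 2)
q_demo <- q_prime(u_demo, rss = sum(u_demo^2))
print(q_demo)  # (2^2 + (-1)^2) / (1 + 9 + 4) = 5/14
```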
making sure that the number of terms in the numerator and denominator are equal. Section
2 provides an R function called ‘hete’ to implement the test in equations (14) and (15).
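The requirement of equal numbers of terms can be illustrated by the index bookkeeping alone. This is a small sketch with a hypothetical helper name, assuming the same rule that the ‘hete’ function uses: with an odd number of BLUS residuals the middle one is left out.

```r
# Hypothetical helper: indices of the two equal-sized groups of residuals,
# for n = T - p BLUS residuals. With odd n the middle residual is dropped.
split_halves <- function(n) {
  m <- n %/% 2                       # number of terms in each group
  list(first = 1:m, second = (n - m + 1):n)
}

s17 <- split_halves(17)  # first = 1:8, second = 10:17 (residual 9 unused)
s16 <- split_halves(16)  # first = 1:8, second = 9:16
print(s17)
```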
BLUS residuals allow testing of both autocorrelation and heteroscedasticity, since both
problems may be simultaneously present. More research is needed comparing the power and
size properties of these tests based on BLUS residuals. Vinod (2010) describes R tools for
overcoming both autocorrelation and heteroscedasticity.
rm(list=ls()) #clean up R memory and remove the prompt
options(prompt = " ", continue = " ", width = 68,
useFancyQuotes = FALSE)
#print(date())
The following R code contains comments to help the reader understand the steps taken using
the definitions from the previous section. It defines an R function called ‘blus’ with two
arguments: ‘y’, which is a T × 1 vector, and ‘x’, which is a T × (p − 1) matrix of regressors. Our
definition of X has a column of ones as the additional first column for the intercept. The
user of the function ‘blus’ should not provide the column of ones.
# blus residuals
blus = function(y, x) {
  T = length(y)
  u = resid(lm(y ~ x))  # OLS residuals r
  ane = rep(1, T)
  bigx = cbind(ane, x)  # X with the column of ones added
  p = NCOL(bigx)
  u0 = u[1:p]  # r0: first p OLS residuals
  # print(u0)
  u1 = u[(p + 1):T]  # r1: remaining T-p OLS residuals
  X0 = bigx[1:p, 1:p]
  X0inv = solve(X0)
  # print(X0inv)
  X1 = bigx[(p + 1):T, ]
  xtxinv = solve(t(bigx) %*% bigx)
  mtx = X0 %*% xtxinv %*% t(X0)  # matrix of eq. (9)
  ei = eigen(mtx, symmetric = TRUE)
  disq = ei$values  # eigenvalues d_i^2
  # print(c("disq=", disq),quote=FALSE)
  di = sqrt(disq)
  c1 = di/(1 + di)  # p constants
  q = ei$vectors  # matrix of eigenvectors
  sum1 = matrix(0, p, p)  # initialize to 0 before summing
  for (i in 1:p) {
    qi = q[, i]  # ith column is the eigenvector
    mtx2 = qi %*% t(qi)  # outer product
    # print(mtx2)
    sum1 = sum1 + (c1[i] * mtx2)  # p by p matrix Z of eq. (10)
    # print(sum1)
  }  # end of the for loop
  # The definition of v1 is missing from this copy of the listing;
  # it is restored here from eq. (11) as X0inv Z r0
  v1 = X0inv %*% sum1 %*% u0  # p by 1 vector
  u1blus = u1 - X1 %*% v1  # (T-p) by 1 vector
  return(u1blus)
}
# Examples:
# blus(y,x)
# b1=blus(y,cbind(x1,x2))
We applied the above function to the example with T = 100 and two regressors in Ford (2008),
which has been implemented in the literature using Eviews and SAS. Assuming
that the data for y, x1 and x2 are in the memory of R, we issue the following commands.
R=blus(y,cbind(x1,x2))
cbind(R,Eviews,SAS)
The side-by-side estimates of the vector of BLUS residuals by R, Eviews and SAS software
programs are given below in the form of R output.
R Eviews SAS
[1,] 2.474975028 2.4750 2.4749
[2,] -0.242299798 -0.2423 -0.2423
[3,] 1.309228165 1.3092 1.3092
[4,] -0.332577320 -0.3326 -0.3325
[5,] 0.805051654 0.8051 0.8051
[6,] -0.992374033 -0.9924 -0.9924
[7,] -0.270666400 -0.2707 -0.2707
[8,] -1.361790435 -1.3618 -1.3618
[9,] 0.232320116 0.2323 0.2324
[10,] -0.661842061 -0.6618 -0.6619
....
Several lines are omitted for brevity.
....
[89,] 0.900319449 0.9003 0.9003
[90,] -0.457601186 -0.4576 -0.4576
[91,] -0.057414171 -0.0574 -0.0574
[92,] 0.439014636 0.4390 0.4390
[93,] 0.501753239 0.5018 0.5018
[94,] -0.168714062 -0.1687 -0.1687
[95,] -1.118825156 -1.1188 -1.1189
[96,] 0.716946513 0.7169 0.7170
[97,] -2.116965075 -2.1170 -2.1170
This Table shows that our implementation of BLUS residuals agrees with that in other
software programs. Now we provide code for computing Theil’s test statistics described in
Sections 1.1 and 1.2.
# bring the function blus in R memory
Qdash = function(y, x) {
  bigt = length(y)
  p = NCOL(x) + 1
  r = resid(lm(y ~ x))
  s22 = sum(r^2)
  u = blus(y, x)
  su1 = 0
  for (i in 1:(bigt - p - 1)) {
    su1 = su1 + (u[i + 1] - u[i])^2
  }
  qd = su1/s22
  print(c("Q prime statistic =", qd), quote = FALSE)
  return(qd)
}
#now test hetero
hete = function(y, x) {
  bigt = length(y)
  p = NCOL(x) + 1
  u = blus(y, x)
  ndash = floor((bigt - p)/2)  # number of terms in each sum of squares
  nd = (bigt - p)/2
  nlow = ndash + 1
  if (nd > ndash) nlow = ndash + 2  # odd T-p: skip the middle residual
  num = sum(u[1:ndash]^2)
  den = sum(u[nlow:(bigt - p)]^2)
  fndash = num/den
  print(c("n prime =", ndash, "statistic=", fndash), quote = FALSE)
  qf1 = qf(0.95, df1 = ndash, df2 = ndash)
  print(c("95 percent critical value=", qf1), quote = FALSE)
  list(ndash = ndash, fndash = fndash)
}
#Now an illustrative example of using the above functions
set.seed(342)
y=runif(20);x1=runif(20);x2=runif(20)
Qdash(y,cbind(x1,x2))
hete(y,cbind(x1,x2))
The output of an illustrative example showing the use of these two functions is as follows.
We have T = 20, p = 3.
Qdash(y,cbind(x1,x2))
[1] Q prime statistic = 1.29446047305466
[1] 1.29446
hete(y,cbind(x1,x2))
[1] n prime = 8 statistic= 0.607158355696447
[1] 95 percent critical value= 3.43810123337316
$ndash
[1] 8
$fndash
[1] 0.6071584
The first three autocorrelations for the residuals of the regression of y on x1 and x2 of
the artificial example are (0.123, 0.193, 0.027). Hence we are concerned about positively
correlated errors. The degrees of freedom for the Q' test for this artificial example are
17. From Theil’s tables the 5% level value is 1.207 for a one-sided test against positive
autocorrelation. The tabled value for a one-sided test against negative autocorrelation is
2.809. Small values of Q' point to positive autocorrelation and large values to negative
autocorrelation. Since the observed Q' (= 1.2945) is larger than 1.207 and smaller than 2.809,
neither one-sided test rejects the null of no autocorrelation at the 5% level. Failing to
reject does not, of course, rule out positively autocorrelated errors.
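Residual autocorrelations of this kind can be obtained in base R. The sketch below regenerates the artificial data with the same seed as the example and uses `acf()` from the stats package; whether the reported figures were computed with `acf()` is an assumption, so the printed values are not guaranteed to reproduce them exactly.

```r
# Sketch: first three sample autocorrelations of the OLS residuals.
# The use of acf() is an assumption; the paper does not say how its
# autocorrelations were computed.
set.seed(342)
y <- runif(20); x1 <- runif(20); x2 <- runif(20)
r <- resid(lm(y ~ x1 + x2))
rho <- acf(r, lag.max = 3, plot = FALSE)$acf[2:4]  # drop lag 0
print(round(rho, 3))
```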
The conclusion regarding heteroscedasticity is that, since the observed F value 0.6072 is
smaller than the critical value 3.4381, computed here by using the R function ‘qf’ with
suitable degrees of freedom, we do not reject the null hypothesis of homoscedasticity. This
conclusion agrees with the
conclusion based on the Breusch-Pagan test implemented by the R function ‘bptest’ of the
R package called ‘lmtest’, where the p-value for the null hypothesis of homoscedasticity is
0.2241.
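The comparison with the critical value can be complemented by a p-value from the same F distribution. This is a minimal sketch that takes the reported statistic and n prime = 8 from the output of the illustrative example; the one-sided upper-tail p-value via `pf()` is an addition here and is not computed by ‘hete’.

```r
# Sketch: critical value and upper-tail p-value for the reported F statistic.
f_obs <- 0.6071584  # statistic reported for the illustrative example
ndash <- 8          # terms in the numerator and in the denominator
crit  <- qf(0.95, df1 = ndash, df2 = ndash)  # same call as in 'hete'
pval  <- pf(f_obs, df1 = ndash, df2 = ndash, lower.tail = FALSE)
print(c(critical = crit, p_value = pval))
```

A p-value above 0.05 corresponds to the observed statistic lying below the 95 percent critical value.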
3 Final Remarks
In conclusion, the R software here will help researchers fill a gap in the literature by studying
the size and power of Theil’s BLUS-based tests in comparison with other tests in the literature.
Of course, it is even better to go beyond testing. One can use the R tools described
in Vinod (2010) to simultaneously overcome autocorrelation and heteroscedasticity. The R
software using generalized least squares (GLS) for efficiently correcting for both autocorrelation
and heteroscedasticity is available for download as described below.
1. The following code is needed because the function ‘sort.matrix’ is called by my other
programs. This function sorts the rows of a matrix on a specified column while carrying along all
other columns. http://www.fordham.edu/economics/vinod/sort.matrix.txt.
2. This code provides initial OLS estimation, estimation of autocorrelations among residuals
and of heteroscedasticity indicated by squared residuals, and finally generalized least
squares (GLS) estimation correcting both problems. http://www.fordham.edu/economics/vinod/autohetero.txt.
3. The following code describes the simulation discussed in Vinod (2010) showing that my
tools work. This code snippet may not be complete, but will give the interested reader
enough software tools and hints for evaluating my proposals. http://www.fordham.edu/economics/vinod/hetero5b.noprint.txt.
References
Chow, G. N. (1976), “A Note on the Derivation of Theil’s BLUS Residuals,” Econometrica,
44, 609–610.
Ford, G. S. (2008), “BLUS Residuals for Eviews,” SSRN eLibrary, URL http://ssrn.com/abstract=1293947.
Theil, H. (1968), “A Simplification of the BLUS Procedure for Analyzing Regression Disturbances,”
Journal of the American Statistical Association, 63, 242–251.
Vinod, H. D. (2010), “Superior Estimation and Inference Avoiding Heteroscedasticity and Flawed Pivots:
R-example of Inflation Unemployment Trade-Off,” in “Advances in Social Science
Research Using R,” ed. Vinod, H. D., New York, USA: Springer, pp. 39–63.
Vougas, D. (1980), “Unit root testing based on BLUS residuals,” Statistics and Probability
Letters, 78 (13), 1–9.