Applied Econometrics using MATLAB

James P. LeSage
Department of Economics
University of Toledo

October, 1999

Preface

This text describes a set of MATLAB functions that implement a host of econometric estimation methods. Toolboxes are the name given by the MathWorks to related sets of MATLAB functions aimed at solving a particular class of problems. Toolboxes of functions useful in signal processing, optimization, statistics, finance and a host of other areas are available from the MathWorks as add-ons to the standard MATLAB software distribution. I use the term Econometrics Toolbox to refer to the collection of function libraries described in this book.

The intended audience is faculty and students using statistical methods, whether they are engaged in econometric analysis or more general regression modeling. The MATLAB functions described in this book have been used in my own research as well as in teaching both undergraduate and graduate econometrics courses. Researchers currently using Gauss, RATS, TSP, or SAS/IML for econometric programming might find switching to MATLAB advantageous. MATLAB software has always had excellent numerical algorithms, and has recently been extended to include sparse matrix algorithms, very good graphical capabilities, and a complete set of object-oriented and graphical user-interface programming tools. MATLAB software is available on a wide variety of computing platforms including mainframe, Intel, Apple, and Linux or Unix workstations.

When contemplating a change in software, there is always the initial investment in developing a set of basic routines and functions to support econometric analysis. It is my hope that the routines in the Econometrics Toolbox provide a relatively complete set of basic econometric analysis tools. The toolbox also includes a number of functions to mimic those available in Gauss, which should make converting existing Gauss functions and applications easier. For those involved in vector autoregressive modeling, a complete set of estimation and forecasting routines is available that implements a wider variety of these estimation methods than RATS software. For example, Bayesian Markov Chain Monte Carlo (MCMC) estimation of VAR models that robustify against outliers and accommodate heteroscedastic disturbances has been implemented. In addition, the estimation functions for error correction models (ECM) carry out Johansen’s tests to determine the number of cointegrating relations, which are automatically incorporated in the model.

In the area of vector autoregressive forecasting, routines are available for VAR and ECM methods that automatically handle data transformations (e.g., differencing, seasonal differences, growth rates). This allows users to work with variables in raw levels form. The forecasting functions carry out needed transformations for estimation and return forecasted values in level form. Comparison of forecast accuracy from a wide variety of vector autoregressive, error correction and other methods is quite simple. Users can avoid the difficult task of unraveling transformed forecasted values from alternative estimation methods and proceed directly to forecast accuracy comparisons.

The collection of around 300 functions and demonstration programs is organized into libraries that are described in each chapter of the book. Many faculty use MATLAB or Gauss software for research in econometric analysis, but the functions written to support research are often suitable for only a single problem. This is because time and energy (both of which are in short supply) are involved in writing more generally applicable functions. The functions described in this book are intended to be re-usable in any number of applications. Some of the functions implement relatively new Markov Chain Monte Carlo (MCMC) estimation methods, making these accessible to undergraduate and graduate students with absolutely no programming involved on the students’ part. Many of the automated features available in the vector autoregressive, error correction, and forecasting functions arose from my own experience in dealing with students using these functions.
It seemed a shame to waste valuable class time on implementation details when these can be handled by well-written functions that take care of the details.

A consistent design was implemented that provides documentation, example programs, and functions to produce printed as well as graphical presentation of estimation results for all of the econometric functions. This was accomplished using the “structure variables” introduced in MATLAB Version 5. Information from econometric estimation is encapsulated into a single variable that contains “fields” for individual parameters and statistics related to the econometric results. A thoughtful design by the MathWorks allows these structure variables to contain scalars, vectors, matrices, strings, and even multi-dimensional matrices as fields. This allows the econometric functions to return a single structure that contains all estimation results. These structures can be passed to other functions that can intelligently decipher the information and provide a printed or graphical presentation of the results.

The Econometrics Toolbox should allow faculty to use MATLAB in undergraduate and graduate level econometrics courses with absolutely no programming on the part of students or faculty. An added benefit to using MATLAB and the Econometrics Toolbox is that faculty have the option of implementing methods that best reflect the material in their courses as well as their own research interests. It should be easy to implement a host of ideas and methods by drawing on existing functions in the toolbox, extending these functions, or operating on the results from these functions. As there is an expectation that users are likely to extend the toolbox, examples of how to accomplish this are provided at the outset in the first chapter. Another way to extend the toolbox is to download MATLAB functions that are available on Internet sites. (In fact, some of the routines in the toolbox originally came from the Internet.) I would urge you to rewrite the documentation for these functions in a format consistent with the other functions in the toolbox and return the results from the function in a “structure variable”. A detailed example of how to do this is provided in the first chapter.
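
To make the structure-variable idea concrete, here is a minimal sketch of how estimation results might be packed into a single structure and then read by downstream code. It is not the toolbox's own code: the simulated data, the use of `rng` (which needs a MATLAB release much newer than Version 5), and the field names `meth`, `nobs`, `nvar`, `beta`, `resid`, `sige` and `tstat` are assumptions chosen for illustration and need not match the fields returned by the toolbox functions.

```matlab
% Minimal sketch: least-squares results packed into one structure variable.
% Field names and data are illustrative assumptions, not the toolbox's own.
rng(0);                                   % fix the random seed for repeatability
n = 100; k = 3;
x = [ones(n,1) randn(n,k-1)];             % regressors, first column is the intercept
y = x*[1; 2; -1] + 0.5*randn(n,1);        % simulated dependent variable

results = struct();                       % start from an empty structure
results.meth  = 'ols';                    % string field naming the method
results.nobs  = n;                        % scalar fields
results.nvar  = k;
results.beta  = x\y;                      % k x 1 vector of least-squares estimates
results.resid = y - x*results.beta;       % n x 1 vector of residuals
results.sige  = (results.resid'*results.resid)/(n-k);   % residual variance estimate
results.tstat = results.beta ./ sqrt(results.sige*diag(inv(x'*x)));

% A downstream routine can decide what to print by inspecting the fields.
switch results.meth
    case 'ols'
        fprintf('Ordinary least-squares estimates, nobs = %d\n', results.nobs);
    otherwise
        error('unknown method: %s', results.meth);
end
for i = 1:results.nvar
    fprintf('beta(%d) = %8.4f   t-statistic = %8.4f\n', ...
            i, results.beta(i), results.tstat(i));
end
```

Because everything travels in one variable, a printing or plotting routine needs only the structure as its argument, and new fields can be added later without changing the calling code.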

In addition to providing a set of econometric estimation routines and documentation, the book has another goal. Programming approaches as well as design decisions are discussed in the book. This discussion should make it easier to use the toolbox functions intelligently, and facilitate creating new functions that fit into the overall design and work well with existing toolbox routines. This text can be read as a manual for simply using the existing functions in the toolbox, which is how students tend to approach the book. It can also be seen as providing programming and design approaches that will help implement extensions for research and teaching of econometrics. This is how I would think faculty would approach the text. Some faculty in Ph.D. programs expect their graduate students to engage in econometric problem solving that requires programming, and certainly this text would eliminate the burden of spending valuable course time on computer programming and implementation details. Students in Ph.D. programs receive the added benefit that functions implemented for dissertation work can be easily transported to another institution, since MATLAB is available for almost any conceivable hardware/operating system environment.

Finally, there are obviously omissions, bugs and perhaps programming errors in the Econometrics Toolbox. This would likely be the case with any such endeavor. I would be grateful if users would notify me when they encounter problems. It would also be helpful if users who produce generally useful functions that extend the toolbox would submit them for inclusion.

Much of the econometric code I encounter on the Internet is simply too specific to a single research problem to be generally useful in other applications. If econometric researchers are serious about their newly proposed estimation methods, they should take the time to craft a generally useful MATLAB function that others could use in applied research. Inclusion in the Econometrics Toolbox would also have the benefit of introducing the method to faculty teaching econometrics and their students.

The latest version of the Econometrics Toolbox functions can be found on the Internet at http://www.econ.utoledo.edu under the MATLAB gallery icon. Instructions for installing these functions are in an Appendix to this text, along with a listing of the functions in the library and a brief description of each.

Acknowledgements

First, I would like to acknowledge my students who suffered through previous buggy and error-filled versions of the toolbox functions. I owe a great deal to their patience and persistence in completing projects under such conditions.

I would like to thank Michael R. Dowd, who was brave enough to use early versions of the vector autoregressive library routines in his research. In addition to providing feedback regarding the functions and reading early drafts of the text, Mike also provided LaTeX assistance to produce this text. David C. Black is another faculty colleague who tested many of the functions and read early drafts of the text. Michael Magura, who has always been a great editor and helpful critic, also read early drafts of the text.

Some of the functions in the Econometrics Toolbox represent algorithms crafted by others who unselfishly placed them in the public domain on the Internet. The authors’ names and affiliations have been preserved in these functions. In all cases, the documentation was modified to match the format of the Econometrics Toolbox and in many cases, the arguments returned by the function were converted to MATLAB structure variables. In other cases, functions written in other languages were converted to the MATLAB language. I am indebted to those who implemented these routines as well.

The following persons have generously contributed their functions for inclusion in the Econometrics Toolbox: C. F. Baum from Boston College, Mike Cliff from UNC, Matt Hergott, a Duke B-school graduate, and Tony E. Smith from the University of Pennsylvania. Tony also spent a great deal of time looking at the spatial econometrics functions and reporting problems and bugs.

Many others have sent e-mail comments regarding errors and I am grateful to all of you.

Contents

1 Introduction ... 1
2 Regression using MATLAB ... 5
    2.1 Design of the regression library ... 6
    2.2 The ols function ... 8
    2.3 Selecting a least-squares algorithm ... 12
    2.4 Using the results structure ... 17
    2.5 Performance profiling the regression toolbox ... 28
    2.6 Using the regression library ... 30
        2.6.1 A Monte Carlo experiment ... 31
        2.6.2 Dealing with serial correlation ... 32
        2.6.3 Implementing statistical tests ... 38
    2.7 Chapter summary ... 41
    Chapter 2 Appendix ... 42
3 Utility Functions ... 45
    3.1 Calendar function utilities ... 45
    3.2 Printing and plotting matrices ... 49
    3.3 Data transformation utilities ... 65
    3.4 Gauss functions ... 69
    3.5 Wrapper functions ... 73
    3.6 Chapter summary ... 76
    Chapter 3 Appendix ... 77
4 Regression Diagnostics ... 80
    4.1 Collinearity diagnostics and procedures ... 80
    4.2 Outlier diagnostics and procedures ... 94
    4.3 Chapter summary ... 100
    Chapter 4 Appendix ... 101
5 VAR and Error Correction Models ... 103
    5.1 VAR models ... 103
    5.2 Error correction models ... 113
    5.3 Bayesian variants ... 125
        5.3.1 Theil-Goldberger estimation of these models ... 138
    5.4 Forecasting the models ... 139
    5.5 Chapter summary ... 145
    Chapter 5 Appendix ... 148
6 Markov Chain Monte Carlo Models ... 151
    6.1 The Bayesian Regression Model ... 154
    6.2 The Gibbs Sampler ... 156
        6.2.1 Monitoring convergence of the sampler ... 159
        6.2.2 Autocorrelation estimates ... 163
        6.2.3 Raftery-Lewis diagnostics ... 163
        6.2.4 Geweke diagnostics ... 165
    6.3 A heteroscedastic linear model ... 169
    6.4 Gibbs sampling functions ... 175
    6.5 Metropolis sampling ... 184
    6.6 Functions in the Gibbs sampling library ... 190
    6.7 Chapter summary ... 197
    Chapter 6 Appendix ... 199
7 Limited Dependent Variable Models ... 204
    7.1 Logit and probit regressions ... 206
    7.2 Gibbs sampling logit/probit models ... 211
        7.2.1 The probit_g function ... 218
    7.3 Tobit models ... 220
    7.4 Gibbs sampling Tobit models ... 224
    7.5 Chapter summary ... 227
    Chapter 7 Appendix ... 228
8 Simultaneous Equation Models ... 230
    8.1 Two-stage least-squares models ... 230
    8.2 Three-stage least-squares models ... 235
    8.3 Seemingly unrelated regression models ... 240
    8.4 Chapter summary ... 244
    Chapter 8 Appendix ... 246
9 Distribution functions library ... 247
    9.1 The pdf, cdf, inv and rnd functions ... 248
    9.2 The specialized functions ... 249
    9.3 Chapter summary ... 256
    Chapter 9 Appendix ... 257
10 Optimization functions library ... 260
    10.1 Simplex optimization ... 261
        10.1.1 Univariate simplex optimization ... 261
        10.1.2 Multivariate simplex optimization ... 268
    10.2 EM algorithms for optimization ... 269
    10.3 Multivariate gradient optimization ... 278
    10.4 Chapter summary ... 287
    Chapter 10 Appendix ... 288
11 Handling sparse matrices ... 289
    11.1 Computational savings with sparse matrices ... 289
    11.2 Estimation using sparse matrix algorithms ... 297
    11.3 Gibbs sampling and sparse matrices ... 304
    11.4 Chapter summary ... 309
    Chapter 11 Appendix ... 310
References ... 313
Appendix: Toolbox functions ... 320

List of Examples

2.1 Demonstrate regression using the ols() function ... 8
2.2 Least-squares timing information ... 12
2.3 Profiling the ols() function ... 28
2.4 Using the ols() function for Monte Carlo ... 31
2.5 Generate a model with serial correlation ... 33
2.6 Cochrane-Orcutt iteration ... 34
2.7 Maximum likelihood estimation ... 37
2.8 Wald’s F-test ... 38
2.9 LM specification test ... 40
3.1 Using the cal() function ... 47
3.2 Using the tsdate() function ... 48
3.3 Using cal() and tsdates() functions ... 49
3.4 Reading and printing time-series ... 49
3.5 Using the lprint() function ... 51
3.6 Using the tsprint() function ... 56
3.7 Various tsprint() formats ... 58
3.8 Truncating time-series and the cal() function ... 60
3.9 Common errors using the cal() function ... 60
3.10 Using the tsplot() function ... 61
3.11 Finding time-series turning points with fturns() ... 64
3.12 Seasonal differencing with sdiff() function ... 67
3.13 Annual growth rates using growthr() function ... 68
3.14 Seasonal dummy variables using sdummy() function ... 68
3.15 Least-absolute deviations using lad() function ... 71
4.1 Collinearity experiment ... 82
4.2 Using the bkw() function ... 86
4.3 Using the ridge() function ... 88
4.4 Using the rtrace() function ... 90
4.5 Using the theil() function ... 92
4.6 Using the dfbeta() function ... 94
4.7 Using the robust() function ... 97
4.8 Using the pairs() function ... 99
5.1 Using the var() function ... 105
5.2 Using the pgranger() function ... 107
5.3 VAR with deterministic variables ... 110
5.4 Using the lrratio() function ... 111
5.5 Impulse response functions ... 112
5.6 Using the adf() and cadf() functions ... 116
5.7 Using the johansen() function ... 119
5.8 Estimating error correction models ... 123
5.9 Estimating BVAR models ... 128
5.10 Using bvar() with general weights ... 130
5.11 Estimating RECM models ... 137
5.12 Forecasting VAR models ... 140
5.13 Forecasting multiple related models ... 141
5.14 A forecast accuracy experiment ... 143
6.1 A simple Gibbs sampler ... 157
6.2 Using the coda() function ...