Sie sind auf Seite 1von 348

Applied Econometrics using MATLAB

James P. LeSage Department of Economics University of Toledo

October, 1999

Preface

This text describes a set of MATLAB functions that implement a host of econometric estimation methods. Toolboxes are the name given by the MathWorks to related sets of MATLAB functions aimed at solving a par- ticular class of problems. Toolboxes of functions useful in signal processing, optimization, statistics, ﬁnance and a host of other areas are available from the MathWorks as add-ons to the standard MATLAB software distribution. I use the term Econometrics Toolbox to refer to the collection of function libraries described in this book. The intended audience is faculty and students using statistical methods, whether they are engaged in econometric analysis or more general regression modeling. The MATLAB functions described in this book have been used in my own research as well as teaching both undergraduate and graduate econometrics courses. Researchers currently using Gauss, RATS, TSP, or SAS/IML for econometric programming might ﬁnd switching to MATLAB advantageous. MATLAB software has always had excellent numerical algo- rithms, and has recently been extended to include: sparse matrix algorithms, very good graphical capabilities, and a complete set of object oriented and graphical user-interface programming tools. MATLAB software is available on a wide variety of computing platforms including mainframe, Intel, Apple, and Linux or Unix workstations. When contemplating a change in software, there is always the initial investment in developing a set of basic routines and functions to support econometric analysis. It is my hope that the routines in the Econometrics Toolbox provide a relatively complete set of basic econometric analysis tools. The toolbox also includes a number of functions to mimic those available in Gauss, which should make converting existing Gauss functions and ap- plications easier. For those involved in vector autoregressive modeling, a complete set of estimation and forecasting routines is available that imple- ment a wider variety of these estimation methods than RATS software. For example, Bayesian Markov Chain Monte Carlo (MCMC) estimation of VAR

i

ii

models that robustify against outliers and accommodate heteroscedastic dis- turbances have been implemented. In addition, the estimation functions for error correction models (ECM) carry out Johansen’s tests to determine the number of cointegrating relations, which are automatically incorporated in the model. In the area of vector autoregressive forecasting, routines are available for VAR and ECM methods that automatically handle data trans- formations (e.g. diﬀerencing, seasonal diﬀerences, growth rates). This allows users to work with variables in raw levels form. The forecasting functions carry out needed transformations for estimation and return forecasted values in level form. Comparison of forecast accuracy from a wide variety of vector autoregressive, error correction and other methods is quite simple. Users can avoid the diﬃcult task of unraveling transformed forecasted values from alternative estimation methods and proceed directly to forecast accuracy comparisons. The collection of around 300 functions and demonstration programs are organized into libraries that are described in each chapter of the book. Many faculty use MATLAB or Gauss software for research in econometric analysis, but the functions written to support research are often suitable for only a single problem. This is because time and energy (both of which are in short supply) are involved in writing more generally applicable functions. The functions described in this book are intended to be re-usable in any number of applications. Some of the functions implement relatively new Markov Chain Monte Carlo (MCMC) estimation methods, making these accessible to undergraduate and graduate students with absolutely no programming involved on the students part. Many of the automated features available in the vector autoregressive, error correction, and forecasting functions arose from my own experience in dealing with students using these functions. It seemed a shame to waste valuable class time on implementation details when these can be handled by well-written functions that take care of the details. A consistent design was implemented that provides documentation, ex- ample programs, and functions to produce printed as well as graphical pre- sentation of estimation results for all of the econometric functions. This was accomplished using the “structure variables” introduced in MATLAB Version 5. Information from econometric estimation is encapsulated into a single variable that contains “ﬁelds” for individual parameters and statistics related to the econometric results. A thoughtful design by the MathWorks allows these structure variables to contain scalar, vector, matrix, string, and even multi-dimensional matrices as ﬁelds. This allows the econometric functions to return a single structure that contains all estimation results. These structures can be passed to other functions that can intelligently de-

iii

cipher the information and provide a printed or graphical presentation of the results. The Econometrics Toolbox should allow faculty to use MATLAB in un- dergraduate and graduate level econometrics courses with absolutely no pro- gramming on the part of students or faculty. An added beneﬁt to using MATLAB and the Econometrics Toolbox is that faculty have the option of implementing methods that best reﬂect the material in their courses as well

as their own research interests. It should be easy to implement a host of ideas and methods by: drawing on existing functions in the toolbox, extending

these functions, or operating on the results from these functions. As there is an expectation that users are likely to extend the toolbox, examples of how

to accomplish this are provided at the outset in the ﬁrst chapter. Another

way to extend the toolbox is to download MATLAB functions that are avail- able on Internet sites. (In fact, some of the routines in the toolbox originally came from the Internet.) I would urge you to re-write the documentation

for these functions in a format consistent with the other functions in the toolbox and return the results from the function in a “structure variable”.

A detailed example of how to do this is provided in the ﬁrst chapter.

In addition to providing a set of econometric estimation routines and doc- umentation, the book has another goal. Programming approaches as well as design decisions are discussed in the book. This discussion should make it easier to use the toolbox functions intelligently, and facilitate creating new

functions that ﬁt into the overall design, and work well with existing toolbox routines. This text can be read as a manual for simply using the existing functions in the toolbox, which is how students tend to approach the book.

It can also be seen as providing programming and design approaches that

will help implement extensions for research and teaching of econometrics. This is how I would think faculty would approach the text. Some faculty

in Ph.D. programs expect their graduate students to engage in econometric

problem solving that requires programming, and certainly this text would eliminate the burden of spending valuable course time on computer pro- gramming and implementation details. Students in Ph.D. programs receive the added beneﬁt that functions implemented for dissertation work can be easily transported to another institution, since MATLAB is available for almost any conceivable hardware/operating system environment. Finally, there are obviously omissions, bugs and perhaps programming errors in the Econometrics Toolbox. This would likely be the case with any

such endeavor. I would be grateful if users would notify me when they en- counter problems. It would also be helpful if users who produce generally useful functions that extend the toolbox would submit them for inclusion.

iv

Much of the econometric code I encounter on the internet is simply too speciﬁc to a single research problem to be generally useful in other appli- cations. If econometric researchers are serious about their newly proposed estimation methods, they should take the time to craft a generally useful MATLAB function that others could use in applied research. Inclusion in the Econometrics Toolbox would also have the beneﬁt of introducing the method to faculty teaching econometrics and their students. The latest version of the Econometrics Toolbox functions can be found on the Internet at: http://www.econ.utoledo.edu under the MATLAB gallery icon. Instructions for installing these functions are in an Appendix to this text along with a listing of the functions in the library and a brief description of each.

Acknowledgements

First I would like to acknowledge my students who suﬀered through previous buggy and error ﬁlled versions of the toolbox functions. I owe a great deal to their patience and persistence in completing projects under such conditions. I would like to thank Michael R. Dowd would was brave enough to use early versions of the vector autoregressive library routines in his research. In addition to providing feedback regarding the functions and reading early drafts of the text, Mike also provided LaTeX assistance to produce this text. David C. Black is another faculty colleague who tested many of the functions and read early drafts of the text. Michael Magura who has always been a great editor and helpful critic also read early drafts of the text. Some of the functions in the Econometrics Toolbox represent algorithms crafted by others who unselﬁshly placed them in the public domain on the Internet. The authors names and aﬃliations have been preserved in these functions. In all cases, the documentation was modiﬁed to match the format of the Econometrics Toolbox and in many cases, the arguments returned by the function were converted to MATLAB structure variables. In other cases, functions written in other languages were converted to the MATLAB language. I am indebted to those who implemented these routines as well. The following persons have generously contributed their functions for inclusion in the Econometrics Toolbox. C. F. Baum from Boston College, Mike Cliﬀ from UNC, Matt Hergott a Duke B-school graduate and Tony E. Smith from University of Pennsylvania. Tony also spent a great deal of time looking at the spatial econometrics functions and reporting problems and bugs. Many others have sent e-mail comments regarding errors and I am grate- ful to all of you.

v

Contents

 1 Introduction 1 2 Regression using MATLAB 5 2.1 Design of the regression library 6 2.2 The ols function 8 2.3 Selecting a least-squares algorithm . . . . . . . . . . . . . . . 12 2.4 Using the results structure . . . . . . . . . . . . . . . . . . . . 17 2.5 Performance proﬁling the regression toolbox 28 2.6 Using the regression library . . . . . . . . . . . . . . . . . . . 30 2.6.1 A Monte Carlo experiment . . . . . . . . . . . . . . . 31 2.6.2 Dealing with serial correlation . . . . . . . . . . . . . 32 2.6.3 Implementing statistical tests 38 2.7 Chapter summary . . . . . . . . . . . . . . . . . . . . . . . . 41 Chapter 2 Appendix 42 3 Utility Functions 45 3.1 Calendar function utilities . . . . . . . . . . . . . . . . . . . . 45 3.2 Printing and plotting matrices . . . . . . . . . . . . . . . . . 49 3.3 Data transformation utilities . . . . . . . . . . . . . . . . . . 65 3.4 Gauss functions . . . . . . . . . . . . . . . . . . . . . . . . . . 69 3.5 Wrapper functions . . . . . . . . . . . . . . . . . . . . . . . . 73 3.6 Chapter summary . . . . . . . . . . . . . . . . . . . . . . . . 76 Chapter 3 Appendix 77 4 Regression Diagnostics 80 4.1 Collinearity diagnostics and procedures . . . . . . . . . . . . 80 4.2 Outlier diagnostics and procedures . . . . . . . . . . . . . . . 94 4.3 Chapter summary . . . . . . . . . . . . . . . . . . . . . . . . 100 vi

CONTENTS

vii

 Chapter 4 Appendix 101 5 VAR and Error Correction Models 103 5.1 VAR models . . . . . . . . . . . . . . . . . . . . . . . . . 103 5.2 Error correction models . . . . . . . . . . . . . . . . . . . . . 113 5.3 Bayesian variants . . . . . . . . . . . . . . . . . . . . . . . . . 125 5.3.1 Theil-Goldberger estimation of these models . . . . . 138 5.4 Forecasting the models . . . . . . . . . . . . . . . . . . . . . . 139 5.5 Chapter summary . . . . . . . . . . . . . . . . . . . . . . . . 145 Chapter 5 Appendix 148 6 Markov Chain Monte Carlo Models 151 6.1 The Bayesian Regression Model 154 6.2 The Gibbs Sampler . . . . . . . . . . . . . . . . . . . . . . . . 156 6.2.1 Monitoring convergence of the sampler 159 6.2.2 Autocorrelation estimates . . . . . . . . . . . . . . . . 163 6.2.3 Raftery-Lewis diagnostics . . . . . . . . . . . . . . . . 163 6.2.4 Geweke diagnostics . . . . . . . . . . . . . . . . . . . . 165 6.3 A heteroscedastic linear model . . . . . . . . . . . . . . . . . 169 6.4 Gibbs sampling functions . . . . . . . . . . . . . . . . . . . . 175 6.5 Metropolis sampling . . . . . . . . . . . . . . . . . . . . . . . 184 6.6 Functions in the Gibbs sampling library . . . . . . . . . . . . 190 6.7 Chapter summary . . . . . . . . . . . . . . . . . . . . . . . . 197 Chapter 6 Appendix 199 7 Limited Dependent Variable Models 204 7.1 Logit and probit regressions . . . . . . . . . . . . . . . . . . . 206 7.2 Gibbs sampling logit/probit models 211 7.2.1 The probit g function . . . . . . . . . . . . . . . . . . 218 7.3 Tobit models . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 7.4 Gibbs sampling Tobit models . . . . . . . . . . . . . . . . . . 224 7.5 Chapter summary . . . . . . . . . . . . . . . . . . . . . . . . 227 Chapter 7 Appendix 228 8 Simultaneous Equation Models 230 8.1 Two-stage least-squares models . . . . . . . . . . . . . . . . . 230 8.2 Three-stage least-squares models . . . . . . . . . . . . . . . . 235 8.3 Seemingly unrelated regression models . . . . . . . . . . . . . 240

CONTENTS

viii

 8.4 Chapter summary . . . . . . . . . . . . . . . . . . . . . . . . 244 Chapter 8 Appendix 246 9 Distribution functions library 247 9.1 The pdf, cdf, inv and rnd functions 248 9.2 The specialized functions . . . . . . . . . . . . . . . . . . . . 249 9.3 Chapter summary . . . . . . . . . . . . . . . . . . . . . . . . 256 Chapter 9 Appendix 257 10 Optimization functions library 260 10.1 Simplex optimization . . . . . . . . . . . . . . . . . . . . . . . 261 10.1.1 Univariate simplex optimization . . . . . . . . . . . . 261 10.1.2 Multivariate simplex optimization . . . . . . . . . . . 268 10.2 EM algorithms for optimization 269 10.3 Multivariate gradient optimization . . . . . . . . . . . . . . . 278 10.4 Chapter summary . . . . . . . . . . . . . . . . . . . . . . . . 287 Chapter 10 Appendix 288 11 Handling sparse matrices 289 11.1 Computational savings with sparse matrices 289 11.2 Estimation using sparse matrix algorithms 297 11.3 Gibbs sampling and sparse matrices . . . . . . . . . . . . . . 304 11.4 Chapter summary . . . . . . . . . . . . . . . . . . . . . . . . 309 Chapter 11 Appendix 310 References 313 Appendix: Toolbox functions 320

List of Examples

 1.1 Demonstrate regression using the ols() function 8 2.2 Least-squares timing information . . . . . . . . . . . . . . . . 12 2.3 Proﬁling the ols() function . . . . . . . . . . . . . . . . . . . . 28 2.4 Using the ols() function for Monte Carlo 31 2.5 Generate a model with serial correlation . . . . . . . . . . . . 33 2.6 Cochrane-Orcutt iteration . . . . . . . . . . . . . . . . . . . . 34 2.7 Maximum likelihood estimation 37 2.8 Wald’s F-test . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 2.9 LM speciﬁcation test . . . . . . . . . . . . . . . . . . . . . . . 40 3.1 Using the cal() function . . . . . . . . . . . . . . . . . . . . . 47 3.2 Using the tsdate() function . . . . . . . . . . . . . . . . . . . 48 3.3 Using cal() and tsdates() functions . . . . . . . . . . . . . . . 49 3.4 Reading and printing time-series . . . . . . . . . . . . . . . . 49 3.5 Using the lprint() function . . . . . . . . . . . . . . . . . . . . 51 3.6 Using the tsprint() function . . . . . . . . . . . . . . . . . . . 56 3.7 Various tsprint() formats . . . . . . . . . . . . . . . . . . . . . 58 3.8 Truncating time-series and the cal() function . . . . . . . . . 60 3.9 Common errors using the cal() function . . . . . . . . . . . . 60 3.10 Using the tsplot() function . . . . . . . . . . . . . . . . . . . 61 3.11 Finding time-series turning points with fturns() 64 3.12 Seasonal diﬀerencing with sdiﬀ() function . . . . . . . . . . . 67 3.13 Annual growth rates using growthr() function . . . . . . . . 68 3.14 Seasonal dummy variables using sdummy() function 68 3.15 Least-absolute deviations using lad() function . . . . . . . . 71 4.1 Collinearity experiment . . . . . . . . . . . . . . . . . . . . . 82 4.2 Using the bkw() function . . . . . . . . . . . . . . . . . . . . 86 4.3 Using the ridge() function . . . . . . . . . . . . . . . . . . . . 88 4.4 Using the rtrace() function . . . . . . . . . . . . . . . . . . . 90 4.5 Using the theil() function . . . . . . . . . . . . . . . . . . . . 92 4.6 Using the dfbeta() function . . . . . . . . . . . . . . . . . . . 94 ix

LIST OF EXAMPLES

x

 4.7 Using the robust() function . . . . . . . . . . . . . . . . . . . 97 4.8 Using the pairs() function . . . . . . . . . . . . . . . . . . . . 99 5.1 Using the var() function . . . . . . . . . . . . . . . . . . . . . 105 5.2 Using the pgranger() function 107 5.3 VAR with deterministic variables . . . . . . . . . . . . . . . . 110 5.4 Using the lrratio() function . . . . . . . . . . . . . . . . . . . 111 5.5 Impulse response functions . . . . . . . . . . . . . . . . . . . . 112 5.6 Using the adf() and cadf() functions . . . . . . . . . . . . . . 116 5.7 Using the johansen() function 119 5.8 Estimating error correction models . . . . . . . . . . . . . . . 123 5.9 Estimating BVAR models . . . . . . . . . . . . . . . . . . . . 128 5.1 Using bvar() with general weights . . . . . . . . . . . . . . . 130 5.11 Estimating RECM models . . . . . . . . . . . . . . . . . . . . 137 5.12 Forecasting VAR models . . . . . . . . . . . . . . . . . . . . . 140 5.13 Forecasting multiple related models . . . . . . . . . . . . . . 141 5.14 A forecast accuracy experiment 143 6.1 A simple Gibbs sampler . . . . . . . . . . . . . . . . . . . . . 157 6.2 Using the coda() function . . . . . .