
Econometrics

V Semester IIM Tiruchirappalli

Course structure
Course Description: Introduction to econometric models and techniques, emphasizing regression and simultaneous equations. Advanced topics include instrumental variables, panel data methods, measurement error, limited dependent variable models and time series models.
Course evaluation
Quizzes, home assignments: 10%
Mid-term: 25%
End-term: 25%
Project reports (2) & presentations (2): 30%
Class attendance, attitude: 10%

Text Books
Econometric Methods, by Jack Johnston and John DiNardo, 4th Edition.
Econometric Models and Economic Forecasts, by Robert S. Pindyck and Daniel L. Rubinfeld, 4th Edition.
An Introduction to Applied Econometrics: A Time Series Approach, by Kerry Patterson.

Wine talk and other examples


Bordeaux & Burgundy wines spend 18-24 months in oak casks and are then aged in bottles.
Wine tasting is done after 4 months, during fermentation; it influences the wine futures market.
Wine critics and publications: Robert Parker (The Wine Advocate); Wine Spectator.
Low rainfall: concentrated grapes.
High temperature: grapes ripen faster, with lower acidity.
Wine quality = 12.145 + 0.00117 × winter rainfall + 0.0614 × average growing season temperature - 0.00386 × harvest rainfall
Ashenfelter published results in Liquid Assets (http://www.liquidasset.com/); Journal of Wine Economics.
Refer to: http://www.nytimes.com/1990/03/04/us/wine-equation-puts-some-noses-out-of-joint.html?
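The wine equation above can be evaluated directly. The following is a minimal sketch in Python, assuming only the coefficients quoted on the slide; the function name and the input values are hypothetical, and the slide does not state the units of rainfall or temperature.

```python
def wine_quality(winter_rainfall, growing_season_temp, harvest_rainfall):
    """Evaluate the fitted wine equation with the coefficients quoted on the slide."""
    return (12.145
            + 0.00117 * winter_rainfall
            + 0.0614 * growing_season_temp
            - 0.00386 * harvest_rainfall)

# Hypothetical inputs for illustration only (not real vintage data).
baseline = wine_quality(600, 17.0, 100)
wetter_harvest = wine_quality(600, 17.0, 200)

# More harvest rainfall lowers the predicted quality, as the negative sign implies.
print(baseline, wetter_harvest)
```

Holding the other inputs fixed, each coefficient gives the change in predicted quality per unit change in that variable.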

Baseball & LoJack


Bill James published results in Baseball Abstracts (refer to Michael Lewis's Moneyball).
Runs created = (Hits + Walks) × Total Bases / (At Bats + Walks)
Later, the Boston Red Sox, under Theo Epstein, won the World Series.
SABR: Society for American Baseball Research. Sabermetrics: research in baseball.
LoJack: a small radio transmitter hidden in a car that can be remotely activated when the car is stolen.
Are there any positive externalities in installing LoJack?
Results: as the percentage of cars with LoJack increased, the level of auto theft fell. Insurance companies did not give enough discounts to pass on the reduction in payouts on unprotected cars.
The impact of LoJack: Ian Ayres and Steven D. Levitt, Measuring the positive externalities from unobservable victim precaution: An empirical analysis of LoJack, 113 Q.J.Econ. 43 (1998). http://pricetheory.uchicago.edu/levitt/Papers/LevittAyres1998.pdf
Studies use regression and randomization to promote better public policy.
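The runs created formula, (Hits + Walks) × Total Bases / (At Bats + Walks), is easy to compute. A minimal sketch in Python follows; the function name and the season totals are hypothetical and are not taken from any real player.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic runs created: (H + BB) * TB / (AB + BB)."""
    opportunities = at_bats + walks
    if opportunities == 0:
        raise ValueError("at_bats + walks must be positive")
    return (hits + walks) * total_bases / opportunities

# Hypothetical season totals, for illustration only.
example = runs_created(hits=150, walks=50, total_bases=250, at_bats=500)
print(round(example, 1))
```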

Framing the right question to solve the problem


If drug dealers are floating in money, why do they still stay with their mothers?
Why are street prostitutes like a department store Santa?
Why do terrorists tend to be drawn from educated, middle class or high income backgrounds?
Why should suicide bombers buy life insurance?
How do prostitutes and their customers, or johns, find one another?
How much do prostitutes charge for a service, or trick, and how is that price negotiated?
If a john prefers not to use a condom, how much more does he have to pay?
How does a prostitute's wage compare to what she earns for doing other jobs?
What happens when there's a sudden surge in demand for prostitutes, and how do prostitutes meet this demand?
Steven D. Levitt and Sudhir Alladi Venkatesh, An Economic Analysis of a Drug-Selling Gang's Finances, Q.J.Econ. (2000). http://pricetheory.uchicago.edu/levitt/Papers/LevittVenkateshAnEconomicAnalysis2000.pdf
Steven D. Levitt and Sudhir Alladi Venkatesh, An Empirical Analysis of Street-Level Prostitution. 2007.

Freakonomics: Steven D. Levitt & Stephen J. Dubner
Super Freakonomics: Steven D. Levitt & Stephen J. Dubner
Super Crunchers: Ian Ayres

Data: newer & faster sources


Web data mining
Crowd-sourced tracking systems (Google Trends, Google Flu, Flu Near You, GrippeNet.fr, Twitter)

Sources: Google Flu Trends (www.google.org/flutrends); CDC; Flu Near You

Data
Data sources
Data definition
Cross-section data; time series data
Panel data: $y_{i,t} = \beta_{i,1} + \beta_{i,2} x_{i,t} + \varepsilon_{i,t}$, where $(i,t)$ denotes individual $i$ at time $t$.
Implication of $\beta_{i,1} = \beta_1$ and $\beta_{i,2} = \beta_2$. Or, $\beta_{i,2} \sim N(\beta_2, \sigma^2)$.
Data transformation & aggregation (example: is it better to forecast all component inflation series and then aggregate the forecasts, or is it better to aggregate right away?)

Preliminaries
Data cleaning
Non-random attrition
Sample selection bias (non-random sample)
Influential observations: robust estimation methods
Find the parameters that minimize $\sum_{i=1}^{n} \varepsilon_i^2 = (y_1 - \beta_1 - \beta_2 x_1)^2 + \dots + (y_n - \beta_1 - \beta_2 x_n)^2$
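Minimizing the sum of squared errors $(y_i - \beta_1 - \beta_2 x_i)^2$ over an intercept and a slope has a closed-form solution. The sketch below, in plain Python, assumes a single explanatory variable; the function names and the data are made up for illustration.

```python
def ols_fit(x, y):
    """Return (b1, b2) minimizing sum_i (y_i - b1 - b2 * x_i)^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = s_xy / s_xx          # slope
    b1 = y_bar - b2 * x_bar   # intercept
    return b1, b2

def sum_of_squares(x, y, b1, b2):
    """Objective function: the sum of squared residuals."""
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))

# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, b2 = ols_fit(x, y)
print(b1, b2, sum_of_squares(x, y, b1, b2))
```

Any other pair of parameter values gives a larger sum of squares on the same data, which is a quick way to check the fit.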

Appropriate econometric model


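Normal distribution of dependent variable: $y_i \sim N(\beta_1 + \beta_2 x_i, \sigma^2)$
Binary dependent variable: $y_i \sim \mathrm{Bernoulli}(\pi)$. Replace $\pi$ by $\Phi(\beta_1 + \beta_2 x_i)$: probit model.
Time series data: $y_t = \alpha + \beta y_{t-1} + \varepsilon_t$ or $y_t = \alpha + \beta t + \varepsilon_t$
Parameter estimation
Ordinary least squares: all $\varepsilon_i$ have common variance $\sigma^2$
Generalized least squares: $\varepsilon_i$ have variances $\sigma_i^2$
Nonlinear least squares: the model is nonlinear in the parameters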
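For a binary dependent variable, the probit model sets the probability of a one equal to the standard normal distribution function evaluated at a linear index. A minimal sketch in Python follows, assuming one explanatory variable; the names `b1`, `b2` and the function names are illustrative, not from the slides.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_probability(x, b1, b2):
    """P(y = 1 | x) under a probit model with index b1 + b2 * x."""
    return std_normal_cdf(b1 + b2 * x)

def log_likelihood(xs, ys, b1, b2):
    """Bernoulli log-likelihood of 0/1 outcomes ys given regressors xs."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = probit_probability(x, b1, b2)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Made-up observations, for illustration only.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [0, 0, 1, 1, 1]
print(probit_probability(0.0, 0.0, 1.0), log_likelihood(xs, ys, 0.0, 1.0))
```

Maximizing `log_likelihood` over `b1` and `b2` would give the maximum likelihood estimates; the optimizer is left out of this sketch.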

Preliminaries continued
Alternative estimation methods
Maximum likelihood method: find the parameter values under which the observed data are the most likely values
Bayesian method: estimate the posterior distribution of the parameters using data, model and priors
Diagnostics
Portmanteau test or model specification test
Tests on the error terms
Comparing two models
Likelihood Ratio (LR) test, Lagrange Multiplier (LM) principle or Wald method
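As an illustration of comparing two models with a likelihood ratio test, the sketch below uses a deliberately simple case that is not from the slides: a restricted model $y_i \sim N(0, \sigma^2)$ against an unrestricted model $y_i \sim N(\mu, \sigma^2)$. With the maximum likelihood variance estimates plugged in, the statistic reduces to $LR = n \ln(\hat\sigma_0^2 / \hat\sigma_1^2)$, which is compared with a chi-squared critical value with one degree of freedom. The data are made up.

```python
import math

def lr_statistic_zero_mean(y):
    """LR statistic for H0: mean = 0 against an unrestricted mean (normal data)."""
    n = len(y)
    y_bar = sum(y) / n
    var_unrestricted = sum((yi - y_bar) ** 2 for yi in y) / n  # ML estimate, mean free
    var_restricted = sum(yi ** 2 for yi in y) / n              # ML estimate, mean fixed at 0
    return n * math.log(var_restricted / var_unrestricted)

# 5% critical value of the chi-squared distribution with 1 degree of freedom.
CHI2_1_CRIT_5PCT = 3.841

# Made-up sample, for illustration only.
y = [0.8, 1.3, -0.2, 1.9, 0.7, 1.1, 0.4, 1.6]
lr = lr_statistic_zero_mean(y)
reject = lr > CHI2_1_CRIT_5PCT
print(lr, reject)
```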

Specification

Examples
Convergence between rich and poor countries Do countries converge in per capita GDP? Or in living standards? Or, instead, are they caught in a poverty trap?

Direct mail target selection


Bas Donkers, Richard Paap, Jedid-Jah Jonker, Philip Hans Franses, Deriving target selection rules from endogenously selected samples. Journal of Applied Econometrics, Volume 21, Issue 5, pages 549-562, July/August 2006. DOI: 10.1002/jae.858. http://ideas.repec.org/a/jae/japmet/v21y2006i5p549-562.html

Forecasting sharp increases in unemployment
Censored latent effects autoregressive model

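$y_t = \beta_1 + \beta_2 y_{t-1} + v_t + \varepsilon_t$, with $|\beta_2| < 1$ and $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$, where $v_t$ is a censored variable:
$v_t = \alpha_1 + \alpha_2 x_t + u_t$ if $\alpha_1 + \alpha_2 x_t + u_t \ge 0$, and $v_t = 0$ if $\alpha_1 + \alpha_2 x_t + u_t < 0$,
with $u_t \sim N(0, \sigma_u^2)$ and $x_t$ an explanatory variable.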

Franses, Ph.H.B.F. and R. Paap, Censored latent effects autoregression, with an application to US unemployment. Journal of Applied Econometrics, Volume 17, Issue 4, pages 347-366, July/August 2002. http://hdl.handle.net/1765/1532
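A censored latent effects autoregression can be simulated in a few lines. The sketch below is an illustration under stated assumptions, not the authors' code: it takes a first-order autoregression for the observed series plus a latent effect that is set to zero whenever a linear index in an explanatory variable plus noise is negative. The parameter names (`b1`, `b2`, `a1`, `a2`, `sigma_eps`, `sigma_u`) and all numerical values are hypothetical.

```python
import random

def simulate_clear(n, b1, b2, a1, a2, sigma_eps, sigma_u, x, seed=0):
    """Simulate y_t = b1 + b2*y_{t-1} + v_t + eps_t with v_t = max(a1 + a2*x_t + u_t, 0)."""
    if abs(b2) >= 1.0:
        raise ValueError("the autoregressive parameter must satisfy |b2| < 1")
    rng = random.Random(seed)
    y_prev = b1 / (1.0 - b2)  # starting value: mean of the AR(1) part without v_t
    ys, vs = [], []
    for t in range(n):
        latent = a1 + a2 * x[t] + rng.gauss(0.0, sigma_u)
        v = max(latent, 0.0)  # censoring: negative latent effects are set to zero
        y = b1 + b2 * y_prev + v + rng.gauss(0.0, sigma_eps)
        ys.append(y)
        vs.append(v)
        y_prev = y
    return ys, vs

# Hypothetical settings: the latent effect is mostly censored, so it shows up
# only occasionally, as sharp upward moves in the simulated series.
n_obs = 200
x = [0.0] * n_obs
ys, vs = simulate_clear(n_obs, b1=0.5, b2=0.8, a1=-1.0, a2=1.0,
                        sigma_eps=0.2, sigma_u=1.0, x=x)
print(sum(1 for v in vs if v > 0.0), "periods with a positive latent effect")
```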

Modelling brand choice dynamics


Paap, R. and P. H. Franses, A Dynamic Multinomial Probit Model for Brand Choice with Different Long-run and Short-run Effects of Marketing-Mix Variables. Journal of Applied Econometrics, 15, 717-744.

Examples
Voting decisions
The number of undecided voters tends to fall as elections near. Results show that undecided voters start to make up their minds nine weeks before the national elections.
Forecasting weekly temperatures
Is the forecast uncertainty for weekly temperatures constant throughout the year?
Franses, Philip Hans, Jack Neele and Dick J.C. van Dijk (2001), Modeling asymmetric volatility in weekly Dutch temperature data, Environmental Modelling & Software, 16, 131-137. http://repub.eur.nl/res/pub/1533/

Distribution
Normal distribution

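Log-normal distribution
Pdf: $f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)$ for $x > 0$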

[Figure: probability density function and cumulative distribution function]

[Figure: area under the pdf for a normal distribution]

[Figure: probability density function and cumulative distribution function]

Distribution
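The probability density function (pdf) of an exponential distribution with rate $\lambda > 0$ is $f(x; \lambda) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $0$ for $x < 0$.

[Figure: probability density function and cumulative distribution function]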

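The probability density function (pdf) of a Cauchy distribution with location $x_0$ and scale $\gamma > 0$ is $f(x; x_0, \gamma) = \frac{1}{\pi \gamma \left[1 + \left(\frac{x - x_0}{\gamma}\right)^2\right]}$.

[Figure: probability density function and cumulative distribution function]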

Distribution
Operations on a single random variable
If $X$ is distributed normally with mean $\mu$ and variance $\sigma^2$, then:
The exponential of $X$ is distributed log-normally: $e^X \sim \ln N(\mu, \sigma^2)$.
The absolute value of $X$ has the folded normal distribution: $|X| \sim N_f(\mu, \sigma^2)$. If $\mu = 0$ this is known as the half-normal distribution.
The square of $X/\sigma$ has the noncentral chi-squared distribution with one degree of freedom: $X^2/\sigma^2 \sim \chi^2_1(\mu^2/\sigma^2)$. If $\mu = 0$, the distribution is called simply chi-squared.
The distribution of the variable $X$ restricted to an interval $[a, b]$ is called the truncated normal distribution.
$(X - \mu)^{-2}$ has a Lévy distribution with location 0 and scale $\sigma^{-2}$.

Combination of two independent random variables
If $X_1$ and $X_2$ are two independent standard normal random variables with mean 0 and variance 1, then:
Their sum and difference are distributed normally with mean zero and variance two: $X_1 \pm X_2 \sim N(0, 2)$.
Their product $Z = X_1 X_2$ follows the "product-normal" distribution with density function $f_Z(z) = \pi^{-1} K_0(|z|)$, where $K_0$ is the modified Bessel function of the second kind. This distribution is symmetric around zero, unbounded at $z = 0$, and has the characteristic function $\varphi_Z(t) = (1 + t^2)^{-1/2}$.
Their ratio follows the standard Cauchy distribution: $X_1 / X_2 \sim \mathrm{Cauchy}(0, 1)$.
Their Euclidean norm $\sqrt{X_1^2 + X_2^2}$ has the Rayleigh distribution, also known as the chi distribution with 2 degrees of freedom.
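These results for two independent standard normal variables can be checked by simulation. The following is a rough Monte Carlo sketch in plain Python; the sample size and seed are arbitrary choices. Because the Cauchy distribution has no mean, the ratio is summarized by its quartiles, which are -1, 0 and 1 for the standard Cauchy, and the Rayleigh mean used for comparison is $\sqrt{\pi/2}$.

```python
import math
import random
import statistics

rng = random.Random(42)
n_draws = 50_000

x1 = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
x2 = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]

# Sum of two independent N(0, 1) variables: variance should be close to 2.
sums = [a + b for a, b in zip(x1, x2)]
mean_sum = sum(sums) / n_draws
var_sum = sum((s - mean_sum) ** 2 for s in sums) / n_draws

# Euclidean norm: Rayleigh distributed, with mean sqrt(pi / 2).
norms = [math.hypot(a, b) for a, b in zip(x1, x2)]
mean_norm = sum(norms) / n_draws

# Ratio: standard Cauchy, summarized by quartiles.
ratios = [a / b for a, b in zip(x1, x2)]
q1, median, q3 = statistics.quantiles(ratios, n=4)

print(var_sum, mean_norm, math.sqrt(math.pi / 2), (q1, median, q3))
```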

Distribution
Combination of two or more independent random variables
If $X_1, X_2, \dots, X_n$ are independent standard normal random variables, then the sum of their squares has the chi-squared distribution with $n$ degrees of freedom: $X_1^2 + \dots + X_n^2 \sim \chi^2_n$.
If $X_1, X_2, \dots, X_n$ are independent normally distributed random variables with means $\mu$ and variances $\sigma^2$, then their sample mean $\bar{X}$ is independent of the sample standard deviation $S$, and the ratio of the centred sample mean to its estimated standard error has the Student's t-distribution with $n - 1$ degrees of freedom: $t = \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1}$.

If $X_1, \dots, X_n, Y_1, \dots, Y_m$ are independent standard normal random variables, then the ratio of their normalized sums of squares has the F-distribution with $(n, m)$ degrees of freedom: $F = \frac{(X_1^2 + \dots + X_n^2)/n}{(Y_1^2 + \dots + Y_m^2)/m} \sim F_{n,m}$.
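The t ratio for a sample mean, $(\bar{X} - \mu) / (S / \sqrt{n})$ with $n - 1$ degrees of freedom, can be computed with the Python standard library. This is a minimal sketch; the function name, the sample and the hypothesized mean are made up for illustration.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """Return the t ratio for H0: mean = mu0, and its degrees of freedom."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    return (x_bar - mu0) / (s / math.sqrt(n)), n - 1

# Made-up sample, for illustration only.
sample = [4.8, 5.1, 5.3, 4.9, 5.4]
t_value, dof = t_statistic(sample, mu0=5.0)
print(t_value, dof)
```

The resulting value would then be compared with a critical value of the t-distribution with `dof` degrees of freedom.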

Distribution
A discrete random variable $X$ is said to have a Poisson distribution with parameter $\lambda > 0$ if, for $n = 0, 1, 2, \dots$, the probability mass function is given as $f(n; \lambda) = \frac{\lambda^n e^{-\lambda}}{n!}$. The real number $\lambda$ is equal to the expected value of $X$ and also its variance.

[Figure: probability mass function and cumulative distribution function]

Examples: The number of phone calls arriving at a call centre within a minute. The number of goals in sports involving two competing teams.
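For a count such as calls per minute, the Poisson probabilities $\lambda^n e^{-\lambda} / n!$ can be computed directly. The sketch below uses plain Python; the arrival rate of 3 calls per minute is a made-up value for illustration.

```python
import math

def poisson_pmf(n, lam):
    """P(X = n) for a Poisson random variable with parameter lam > 0."""
    if n < 0:
        return 0.0
    return lam ** n * math.exp(-lam) / math.factorial(n)

def poisson_cdf(n, lam):
    """P(X <= n), obtained by summing the probability mass function."""
    return sum(poisson_pmf(k, lam) for k in range(n + 1))

# Hypothetical call centre receiving on average 3 calls per minute.
rate = 3.0
p_no_calls = poisson_pmf(0, rate)
p_at_most_five = poisson_cdf(5, rate)

# Numerical check that the mean and the variance both equal the parameter.
mean = sum(k * poisson_pmf(k, rate) for k in range(60))
variance = sum((k - mean) ** 2 * poisson_pmf(k, rate) for k in range(60))
print(p_no_calls, p_at_most_five, mean, variance)
```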

Central Limit theorem

As the number of discrete events increases, the function begins to resemble a normal distribution

Comparison of the probability functions $p(k)$ for the sum of $n$ fair 6-sided dice, showing their convergence to a normal distribution with increasing $n$, in accordance with the central limit theorem.
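The dice comparison can be reproduced numerically. The sketch below, in plain Python, builds the exact distribution of the sum of n fair dice by repeated convolution and measures its largest gap from a normal density with the same mean (3.5n) and variance (35n/12); the function names are illustrative.

```python
import math

def dice_sum_pmf(n_dice, sides=6):
    """Exact probability mass function of the sum of n_dice fair dice."""
    pmf = {0: 1.0}
    for _ in range(n_dice):
        new_pmf = {}
        for total, prob in pmf.items():
            for face in range(1, sides + 1):
                new_pmf[total + face] = new_pmf.get(total + face, 0.0) + prob / sides
        pmf = new_pmf
    return pmf

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def max_abs_gap(n_dice):
    """Largest absolute gap between the dice pmf and the matching normal density."""
    mean = 3.5 * n_dice
    var = n_dice * 35.0 / 12.0
    pmf = dice_sum_pmf(n_dice)
    return max(abs(p - normal_pdf(k, mean, var)) for k, p in pmf.items())

for n in (1, 2, 5, 10):
    print(n, max_abs_gap(n))
```

The printed gap shrinks as n grows, which is the convergence the figure illustrates.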

Classical assumptions
1. Regression model is linear, correctly specified, with an additive error term.
2. Error term has zero population mean.
3. All explanatory variables are uncorrelated with the error term.
4. Observations of the error term are uncorrelated with each other (no serial correlation).
5. Error term has constant variance (no heteroskedasticity).
6. No explanatory variable is a perfect linear function of any other explanatory variables.
7. Error term is normally distributed.

Regression analysis
1. Review the literature and develop the theoretical model
2. Specify the model
3. Hypothesize the expected sign of the coefficients
4. Collect the data. Inspect and clean the data
5. Estimate and evaluate the equation
6. Document the result