1. (a) Why does OLS estimation involve taking vertical deviations of the points to
the line rather than horizontal distances?
(b) Why are the vertical distances squared before being added together?
(c) Why are the squares of the vertical distances taken rather than the absolute
values?
2. Explain, with the use of equations, the difference between the sample regression
function and the population regression function.
3. What is an estimator? Is the OLS estimator superior to all other estimators? Why
or why not?
4. What five assumptions are usually made about the unobservable
error terms in the classical linear regression model (CLRM)? Briefly explain the
meaning of each. Why are these assumptions made?
E(Ri) = Rf + βi[E(Rm) − Rf]
The first step in using the CAPM is to estimate the stock’s beta using the market
model:
Rit = αi + βiRmt + ut
where Rit is the excess return for security i at time t, Rmt is the excess return on a
proxy for the market portfolio at time t, and ut is an iid random disturbance term.
The coefficient beta in this case is also the CAPM beta for security i.
Suppose that you had estimated the market model and found that the estimated value of
beta for a stock, β̂, was 1.147. The standard error associated with this coefficient,
SE(β̂), is estimated to be 0.0548.
A city analyst has told you that this security closely follows the market, but that it
is no more risky, on average, than the market. This can be tested by the null
hypothesis that the value of beta is one. The model is estimated over 62 daily
observations. Test this hypothesis against a one-sided alternative that the security
is more risky than the market, at the 5% level. Write down the null and alternative
hypotheses. What do you conclude? Are the analyst’s claims empirically verified?
7. The analyst also tells you that shares in Chris Mining PLC have no systematic
risk, in other words that the returns on its shares are completely unrelated to
movements in the market. The value of beta and its standard error are calculated to
be 0.214 and 0.186, respectively. The model is estimated over 38 quarterly
observations. Write down the null and alternative hypotheses. Test this null
hypothesis against a two-sided alternative.
8. Form and interpret a 95% and a 99% confidence interval for beta using the
figures given in question 7.
9. Are hypotheses tested concerning the actual values of the coefficients (i.e. β) or
their estimated values (i.e. β̂) and why?
10. Using EViews, select one of the other stock series from the ‘capm.wk1’ file
and estimate a CAPM beta for that stock. Test the null hypothesis that the true beta
is one and also test the null hypothesis that the true alpha (intercept) is zero. What
are your conclusions?
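The t-tests and confidence intervals asked for in questions 6 to 8 above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the critical values are hard-coded from standard t-tables (an assumption, since the source does not list them). The degrees of freedom are taken as the number of observations minus the two estimated parameters.

```python
def t_ratio(estimate, null_value, std_error):
    """t-ratio for the null hypothesis that the coefficient equals null_value."""
    return (estimate - null_value) / std_error


def confidence_interval(estimate, std_error, t_crit):
    """Two-sided interval: estimate +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


# Critical values copied from standard t-tables (assumed, not computed here).
T_60_ONE_SIDED_5PCT = 1.671   # 62 observations - 2 parameters = 60 df
T_36_TWO_SIDED_5PCT = 2.028   # 38 observations - 2 parameters = 36 df
T_36_TWO_SIDED_1PCT = 2.719

# Question 6: H0: beta = 1 against H1: beta > 1.
t_q6 = t_ratio(1.147, 1.0, 0.0548)
reject_q6 = t_q6 > T_60_ONE_SIDED_5PCT

# Question 7: H0: beta = 0 against H1: beta != 0.
t_q7 = t_ratio(0.214, 0.0, 0.186)
reject_q7 = abs(t_q7) > T_36_TWO_SIDED_5PCT

# Question 8: 95% and 99% confidence intervals for beta.
ci_95 = confidence_interval(0.214, 0.186, T_36_TWO_SIDED_5PCT)
ci_99 = confidence_interval(0.214, 0.186, T_36_TWO_SIDED_1PCT)

print(f"Q6: t = {t_q6:.3f}, reject H0: {reject_q6}")
print(f"Q7: t = {t_q7:.3f}, reject H0: {reject_q7}")
print(f"Q8: 95% CI = ({ci_95[0]:.3f}, {ci_95[1]:.3f})")
print(f"Q8: 99% CI = ({ci_99[0]:.3f}, {ci_99[1]:.3f})")
```

With these figures the question 6 ratio is about 2.68, which exceeds 1.671, so the null of beta equal to one is rejected in favour of the stock being riskier than the market. The question 7 ratio is about 1.15, so the null of zero beta is not rejected, and both intervals contain zero.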
Review questions
(a) H0 : β3 =2
(b) H0 : β3 +β4 =1
(e) H0 : β2β3 =1
4. Which would you expect to be bigger – the unrestricted residual sum of squares
or the restricted residual sum of squares, and why?
6. You estimate a regression of the form given by (3.52) below in order to evaluate
the effect of various firm-specific factors on the returns of a sample of firms. You
run a cross-sectional regression with 200 firms
Calculate the t-ratios. What do you conclude about the effect of each variable on
the returns of the security? On the basis of your results, what variables would you
consider deleting from the regression? If a stock’s beta increased from 1 to 1.2,
what would be the expected effect on the stock’s return? Is the sign on beta as you
would have expected? Explain your answers in each case.
(b) R2,
where ut and vt are iid disturbances and x3t is an irrelevant variable which does
not enter into the data generating process for yt. Will the value of
(a) R2,
(b) Adjusted R2, be higher for the second model than the first? Explain your
answers.
9. Re-open the CAPM EViews file and estimate CAPM betas for each of the other
stocks in the file.
(a) Which of the stocks, on the basis of the parameter estimates you obtain, would
you class as defensive stocks and which as aggressive stocks? Explain your
answer.
(b) Is the CAPM able to provide any reasonable explanation of the overall
variability of the returns to each of the stocks over the sample period? Why or why
not?
10. Re-open the Macro file and apply the same APT-type model to some of the
other time-series of stock returns contained in the CAPM-file.
(a) Run the stepwise procedure in each case. Is the same sub-set of variables
selected for each stock? Can you rationalize the differences between the series
chosen?
(b) Examine the sizes and signs of the parameters in the regressions in each case –
do these make sense?
Review questions
1. Are assumptions made concerning the unobservable error terms (εi) or about
their sample counterparts, the estimated residuals (ε̂i)? Explain your answer.
2. What pattern(s) would one like to see in a residual plot and why?
3. A researcher estimates the following model for stock market returns, but thinks
that there may be a problem with it. By calculating the t-ratios, and considering
their significance and by examining the value of R2 or otherwise, suggest what the
problem might be.
4. (a) State in algebraic notation and explain the assumption about the CLRM’s
disturbances that is referred to by the term ‘homoscedasticity’.
(b) What would the consequence be for a regression model if the errors were not
homoscedastic?
(c) How might you proceed if you found that (b) were actually the case?
5. (a) What do you understand by the term ‘autocorrelation’?
(b) An econometrician suspects that the residuals of her model might be
autocorrelated. Explain the steps involved in testing this theory using the
Durbin–Watson (DW) test.
(c) The econometrician follows your guidance (!!!) in part (b) and calculates a
value for the Durbin–Watson statistic of 0.95. The regression has 60 quarterly
observations and three explanatory variables (plus a constant term). Perform the
test. What is your conclusion?
(d) In order to allow for autocorrelation, the econometrician decides to use a model
in first differences with a constant
By attempting to calculate the long-run solution to this model, explain what might
be a problem with estimating models entirely in first differences.
(e) The econometrician finally settles on a model with both first differences and
lagged levels terms of the variables
Δyt = β1 + β2x2t + β3 … + β7x4t−1 + vt
7. What might Ramsey’s RESET test be used for? What could be done if it were
found that the RESET test has been failed?
8. (a) Why is it necessary to assume that the disturbances of a regression model are
normally distributed?
(b) In a practical econometric modeling situation, how might the problem that the
residuals are not normally distributed be addressed?
(b) A financial econometrician thinks that the stock market crash of October 1987
fundamentally changed the risk–return relationship given by the CAPM equation.
He decides to test this hypothesis using a Chow test. The model is estimated using
monthly data from January 1980–December 1995, and then two separate
regressions are run for the sub-periods corresponding to data before and after the
crash.
The model is
so that the excess return on a security at time t is regressed upon the excess return
on a proxy for the market portfolio at time t. The results for the three models
estimated for shares in British Airways (BA) are as follows:
1981M1–1995M12
1981M1–1987M10
1987M11–1995M12
(c) What are the null and alternative hypotheses that are being tested here, in terms
of α and β?
(d) Perform the test.
10. For the same model as above, and given the following results, do a forward and
backward predictive failure test:
1981M1–1995M12
rt = 0.0215 + 1.491rmt RSS = 0.189 T = 180 (4.83)
1981M1–1994M12
1982M1–1995M12
13. Re-open the ‘macro.wf1’ and apply the stepwise procedure including all of the
explanatory variables as listed above, i.e. erased dropped credit inflation money
spread term with a strict 5% threshold criterion for inclusion in the model. Then
examine the resulting model both financially and statistically by investigating the
signs, sizes and significances of the parameter estimates and by conducting all of
the diagnostic tests for model adequacy.
Key points unit 1 Introduction
■ The three steps in applying financial econometrics are model selection, model
estimation, and model testing. In model selection, the modeler chooses a family of
models with given statistical properties. Financial economic theory is used to
justify the model choice. The financial econometric tool used is determined in this
step.
■ Data mining is an approach to model selection based solely on the data and,
although useful, must be used with great care because the risk is that the model
selected might capture special characteristics of the sample which will not repeat in
the future.
■ Model testing is needed because model selection and model estimation are
performed on historical data and, as a result, there is the risk that the estimation
process captures characteristics that are specific to the sample data used but are not
general and will not necessarily reappear in future samples.
■ Model testing involves assessing the model’s performance using fresh data. The
procedure for doing so is called backtesting, and the most popular way of doing so
is to use a moving window.
■ The data generating process refers to the mathematical model that represents
future data as a function of past and present data. When the data generating
process is known as a mathematical expression, computer programs that simulate data using
Monte Carlo methods can be implemented, and the data generated can be used to
compute statistical quantities that would be difficult or even impossible to compute
mathematically.
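As a rough illustration of the Monte Carlo idea in the last key point, the sketch below assumes a simple data generating process y = alpha + beta*x + u with normally distributed errors, simulates it many times, and looks at the average of the estimated OLS slopes. The process, the parameter values, and the function names are all assumptions made for the example, not taken from the source.

```python
import random


def ols_slope(xs, ys):
    """OLS slope of y on x (with an intercept) for a simple regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate_slopes(alpha, beta, sigma, n_obs, n_reps, seed=0):
    """Draw n_reps samples from the assumed DGP and estimate the slope in each."""
    rng = random.Random(seed)                 # seeded for reproducibility
    xs = [float(i) for i in range(n_obs)]     # regressor held fixed across draws
    slopes = []
    for _ in range(n_reps):
        ys = [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]
        slopes.append(ols_slope(xs, ys))
    return slopes


slopes = simulate_slopes(alpha=0.5, beta=1.2, sigma=1.0, n_obs=50, n_reps=2000)
mean_slope = sum(slopes) / len(slopes)
print(f"average estimated slope over {len(slopes)} replications: {mean_slope:.3f}")
```

The simulated slopes form an approximation to the sampling distribution of the estimator, so quantities such as its mean or spread can be read off the simulation rather than derived analytically.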
■ When the linear regression model includes only one explanatory variable, the
model is said to be a simple linear regression.
■ The error term in a simple linear regression model captures the variation in the
dependent variable that is not due to the explanatory variable. The residual is its
estimated sample counterpart.
■ The error term is assumed to be normally distributed with zero mean and
constant variance.
■ The parameters of a simple linear regression model are estimated using the
method of ordinary least squares, which provides the best linear unbiased estimates
of the parameters.
■ A multiple linear regression is a linear regression that has more than one
independent or explanatory variable.
■ There are three assumptions regarding the error terms in a multiple linear
regression:
■ The three steps involved in designing a multiple linear regression model are
(3) evaluating the quality of the model with respect to the given data (diagnosis of
the model).
■ There are criteria for diagnosing the quality of a model. The tests used involve
statistical tools from inferential statistics. The estimated regression errors play an
important role in these tests and the tests accordingly are based on the three
assumptions about the error terms.
■ The first test is for the statistical significance of the multiple coefficient of
determination, which is the ratio of the sum of squares explained by the regression
to the total sum of squares.
■ If the standard deviation of the regression errors from a proposed model is found
to be too large, the fit could be improved by an alternative specification. Some of
the variance of the errors may be attributable to the variation in some independent
variable not considered in the model.
■ An analysis of variance test is used to test for the statistical significance of the
entire model.
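The coefficient of determination described in the key points above can be sketched as follows. The data are hypothetical, and the fitted values are assumed to come from the OLS line y = 2.2 + 0.6x for these points; the function name is illustrative only.

```python
def anova_decomposition(ys, fitted):
    """Return (TSS, ESS, RSS, R^2) given observed values and OLS fitted values."""
    n = len(ys)
    mean_y = sum(ys) / n
    tss = sum((y - mean_y) ** 2 for y in ys)              # total sum of squares
    ess = sum((f - mean_y) ** 2 for f in fitted)          # explained by the regression
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # residual sum of squares
    return tss, ess, rss, ess / tss


# Hypothetical sample; the OLS fit for these points is y = 2.2 + 0.6 x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
fitted = [2.2 + 0.6 * x for x in xs]

tss, ess, rss, r_squared = anova_decomposition(ys, fitted)
print(f"TSS = {tss:.2f}, ESS = {ess:.2f}, RSS = {rss:.2f}, R^2 = {r_squared:.2f}")
```

For an OLS fit with an intercept, the total sum of squares splits exactly into the explained and residual parts, which is what the analysis of variance test builds on.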
A correlation is a number between -1 and 1 that indicates how well a straight line
represents a series of points. A value greater than zero means it shows a positive slope; a
value less than zero, a negative slope. The farther away the correlation is from 0, the more
accurately a straight line describes the data.
What is the difference between the stochastic error term and the residual?
The residual is the difference between the observed Y and the estimated regression
line (Ŷ), while the error term is the difference between the observed Y and the true
regression equation (the expected value of Y). The error term is a theoretical concept that can
never be observed, but the residual is a real-world value that is calculated for each
observation every time a regression is run. The residual can be thought of as an estimate of
the error term, and e could have been denoted as ^e.
It is very important to remember that correlation and regression measure only the linear
relationship between variables. A symmetrical relationship, for example y = x² over
values of x that are symmetric about zero (-a < x < a), has a correlation coefficient of 0, and the
regression line will be a horizontal line. Also, a relationship found using correlation or
regression need not be causal.
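The point about symmetric relationships can be checked numerically. The sketch below is a minimal, hand-written Pearson correlation (the function name and the sample points are assumptions for illustration) applied to y = x² on points symmetric about zero, and to an exactly linear relationship for comparison.

```python
import math


def pearson_correlation(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = [-3, -2, -1, 0, 1, 2, 3]              # symmetric about zero
r_quadratic = pearson_correlation(xs, [x ** 2 for x in xs])
r_linear = pearson_correlation(xs, [2 * x + 1 for x in xs])

print(f"y = x^2:    r = {r_quadratic:.3f}")
print(f"y = 2x + 1: r = {r_linear:.3f}")
```

The quadratic case gives a correlation of zero even though y is completely determined by x, which is why a zero correlation should not be read as "no relationship".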
What does the term residual mean?
Let's say that you fit a simple regression line y = mx + b to a set of (x,y) data points. In a
typical research situation the regression line will not touch all of the points; it might not
touch any of them. The vertical difference between the y-co-ordinate of one of the data
points and the y value of the regression line for the x-co-ordinate of that data point is
called a residual.
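A small sketch of this definition, assuming a hypothetical data set and taking the fitted line y = 2.2x + 0.7 as given (it is the least squares line for these four points):

```python
def residuals(xs, ys, slope, intercept):
    """Vertical differences between observed y values and the fitted line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical data; m and b are the least squares estimates for these points.
xs = [0, 1, 2, 3]
ys = [1, 3, 4, 8]
m, b = 2.2, 0.7

res = residuals(xs, ys, m, b)
for x, y, e in zip(xs, ys, res):
    print(f"x = {x}, observed y = {y}, residual = {e:+.1f}")
```

A positive residual means the point lies above the line and a negative one means it lies below. For a least squares line with an intercept, the residuals sum to zero.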
Population Regression Function vs Sample Regression Function?
The PRF curve or line is the locus of the conditional means (expectations) of the dependent
variable Y for fixed values of the explanatory variable X in the population.
The SRF shows the estimated relation between the dependent variable Y and the explanatory
variable X in a sample.
We can draw one SRF for each sample taken from that population.
In a regression of a time series that states data as a function of calendar year, what requirement
of regression is violated?
It all depends on what data set you're working with. There are quite a number of different
regression analysis models that run the gamut of all the functions you can think of. Obviously
some are more useful than others. Logistic regression is extremely useful for population
modelling because population growth follows a logistic curve. The final goal of any regression
analysis is to have a mathematical function that most closely fits your data, so advantages and
disadvantages depend entirely upon that.
Regression analysis is based on the assumption that the dependent variable is distributed
according to some function of the independent variables together with independent identically
distributed random errors. If the error terms were not stochastic then some of the properties of
the regression analysis would not be valid.
To take a simple case, let's suppose you have a set of pairs (x1, y1), (x2, y2), ... (xn, yn). You have
obtained these by choosing the x values and then observing the corresponding y values
experimentally. This set of pairs would be called a sample.
For whatever reason, you assume that the y's are related to the x's by some function f(.), whose
parameters are, say, a1, a2, ... . In by far the most frequent case, the y's will be assumed to be a
simple linear function of the x's: y = f(x) = a + bx.
Since you have observed the y's experimentally they will almost always be subject to some error.
Therefore, you apply some statistical method for obtaining an estimate of f(.) using the sample of
pairs that you have.
This estimate can be called the sample regression function. (The theoretical or 'true' function f(.)
would simply be called the regression function, because it does not depend on the sample.)
Not a function because it should not map one value to many (eg square root).
Not the regression coefficient since for an even function it would be 0.
The F-ratio can be expressed as a function of the R^2 only under certain assumptions (e.g. linear
regression model). There are econometric models where the R^2 is not meaningfully defined or
the F-ratio cannot be expressed in terms of the R^2, but you can still carry out an F-test.
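For the linear regression case, the relationship can be sketched as below. This assumes a model with an intercept and the joint null hypothesis that all slope coefficients are zero; the function name and the example figures are hypothetical.

```python
def f_from_r_squared(r_squared, n_obs, n_params):
    """F-statistic for H0: all slopes are zero, computed from R^2.

    n_params counts every estimated coefficient, including the intercept.
    """
    num_df = n_params - 1        # number of restrictions (the slopes)
    den_df = n_obs - n_params    # residual degrees of freedom
    return (r_squared / num_df) / ((1.0 - r_squared) / den_df)


# Hypothetical figures: R^2 = 0.6 from 5 observations and 2 parameters.
f_stat = f_from_r_squared(r_squared=0.6, n_obs=5, n_params=2)
print(f"F = {f_stat:.2f}")
```

The statistic would then be compared with a critical value from the F distribution with (n_params - 1, n_obs - n_params) degrees of freedom.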
What are some examples where the mean the median and the mode might be the
same?
The answer above displays a sample in which the sample mean, sample median and sample
mode assume the same value.
If you were asking about populations, then the population mean, population median and
population mode are the same whenever the probability density function for the population is
symmetric and has a single peak. For example, the normal and t probability density functions
are symmetric and unimodal. The uniform density is also symmetric, so its mean and median
coincide, but it has no unique mode.
Is it possible for a function that has a horizontal asymptote to attain the value of
an asymptote?
Yes.
Think of a function that starts at the origin, increases rapidly at first and then decays gradually to
an asymptotic value of 0. It will have attained its asymptotic value at the start.
An example is the density of the Fisher F distribution (with more than two numerator degrees
of freedom), which is often used in statistics to test the significance of regression coefficients.
Here's how you do it in Excel: use the function =STDEV(<range with data>). That function
calculates standard deviation for a sample.
Mean is the sum of several values of the same type (x1, x2,..., xN ) divided by the number of
values.
Mean = (x1 + x2 + ... + xN) / N
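The same two calculations can be sketched in Python with the standard library. The data are hypothetical; `statistics.stdev` uses the n - 1 divisor, which corresponds to the sample standard deviation described for Excel's STDEV, while `statistics.pstdev` uses the n divisor.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]              # hypothetical sample

mean_value = sum(values) / len(values)          # (x1 + x2 + ... + xN) / N
sample_sd = statistics.stdev(values)            # divides by N - 1
population_sd = statistics.pstdev(values)       # divides by N

print(f"mean = {mean_value}")
print(f"sample standard deviation = {sample_sd:.3f}")
print(f"population standard deviation = {population_sd:.3f}")
```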
The least squares method is used when fitting a function (linear, parabolic, hyperbolic, etc.)
to a cloud of points { (x1,y1), (x2,y2), etc. }. With this algorithm we get the function f(x)
that most closely approximates the cloud of points.
f(x, Beta) ~ y
Beta = (XᵀX)⁻¹XᵀY = coefficients of the regression
The method needs to differentiate the function f with respect to the coefficients, because the
sum of squared errors is minimized by setting its derivatives to zero.
I think that the least square mean is not the proper term because you have a function f ... What is
the mean of f (x) = a *x + b ???
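For a straight line with an intercept, the matrix formula for Beta reduces to a 2x2 system that can be solved by hand. The sketch below works under that assumption (one explanatory variable plus a constant); the function name and the data are hypothetical, and it is written in plain Python rather than with a matrix library.

```python
def ols_normal_equations(xs, ys):
    """Solve Beta = (X'X)^(-1) X'Y for X = [1, x]; returns (intercept, slope)."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # X'X = [[n, sx], [sx, sxx]] and X'Y = [sy, sxy].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X'X is singular: all x values are identical")

    # Apply the explicit inverse of the 2x2 matrix X'X to X'Y.
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


# Hypothetical points lying exactly on y = 1 + 2x.
a, b = ols_normal_equations([1, 2, 3, 4], [3, 5, 7, 9])
print(f"intercept = {a}, slope = {b}")
```

With more explanatory variables the same formula applies, but the inverse is normally computed with a linear algebra routine instead of written out.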
The difference between multicollinearity and autocorrelation is that multicollinearity is a linear
relationship between 2 or more explanatory variables in a multiple regression, while
autocorrelation is a type of correlation between values of a process at different points in time, as a
function of the two times or of the time difference.
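One common way to look for first-order autocorrelation in regression residuals is the Durbin–Watson statistic. The sketch below computes it for two hypothetical residual series; the function name and the data are assumptions for illustration, and formal testing would still need the tabulated lower and upper critical values.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residuals: runs of the same sign versus alternating signs.
dw_positive = durbin_watson([1, 1, 1, -1, -1, -1])
dw_negative = durbin_watson([1, -1, 1, -1, 1, -1])

print(f"persistent residuals:  DW = {dw_positive:.3f}")
print(f"alternating residuals: DW = {dw_negative:.3f}")
```

Values well below 2 point towards positive autocorrelation and values well above 2 towards negative autocorrelation, since the statistic is approximately 2(1 - r), where r is the first-order sample autocorrelation of the residuals.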
For the classical regression model, the OLS (ordinary least squares) estimators (the betas)
are BLUE (best linear unbiased estimators) when:
1. The regression is linear in the coefficients, is correctly specified and has an additive
error term.
2. The mean of the error term is zero. (Including a constant term B0 in the regression
will force the mean to be zero.)
3. The independent variables are not correlated with the error term. (If they are correlated
then the betas will be biased.)
4. Observations of the error term are not correlated with each other.
5. The error term has a constant variance (homoskedasticity).
6. No independent variable is a perfect linear function of any of the other independent
variables. (If this is violated, perfect multicollinearity occurs.)
I will assume that you are asking about probability distribution functions. There are two types:
discrete and continuous. Some might argue that a third type exists, which is a mix of discrete and
continuous distributions.
When representing discrete random variables, the probability distribution is described by a
probability mass function or "pmf." For continuous distributions, the theoretical distribution is
the probability density function or "pdf."
Some textbooks refer to pmf's as discrete probability distributions.
Common pmf's are binomial, multinomial, uniform discrete and Poisson.
Common pdf's are the uniform, normal, log-normal, and exponential.
Two common pdf's used in sample size, hypothesis testing and confidence intervals are the "t
distribution" and the chi-square. Finally, the F distribution is used in more advanced hypothesis
testing and regression.
Cox model applies to observations in time (i.e. processes, or functions of t). The true likelihood
for that function would be a function of (functions of t), obtained by expressing the probability in
a space of (functions of t) as
[density]*[reference measure on (functions of t)]
The factor [density] would be the true likelihood.
The partial likelihood is a factor of [density] involving only the parameters of interest:
[density] = [partial likelihood]*[....]
There is no point in working with the full likelihood, in the sense that the nice properties of the
MLE apply to parameters from a finite dimensional space, and would not automatically apply to
the full likelihood in the space of (functions of t).
That is why, for example, one needs to rework the large sample theory of estimators based on
partial likelihood.
Correlation and regression are frequently misunderstood terms. Correlation suggests or indicates
that a linear relationship may exist between two random variables, but does not indicate whether
X causes Y or Y causes X. In regression, we make the assumption that X, as the independent
variable, can be related to Y, the dependent variable, and that an equation of this relationship
is useful.
Definitions from Wikipedia: In probability theory and statistics, correlation (often measured as
a correlation coefficient) indicates the strength and direction of a linear relationship between
two random variables. In statistics, regression analysis refers to techniques for the modeling and
analysis of numerical data consisting of values of a dependent variable (also called a response
variable) and of one or more independent variables (also known as explanatory variables or
predictors). The dependent variable in the regression equation is modeled as a function of the
independent variables, corresponding parameters ("constants"), and an error term. The error term
is treated as a random variable. It represents unexplained variation in the dependent variable.
The parameters are estimated so as to give a "best fit" of the data. Most commonly the best fit is
evaluated by using the least squares method, but other criteria have also been used.
Given any sample size there are many samples of that size that can be drawn from the
population. If the population size is N and the sample size is n, then there are
NCn = N!/(n!(N−n)!) possible samples, but remember that the population can be infinite.
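For a finite population the count can be computed directly with the standard library. The population and sample sizes below are hypothetical.

```python
import math

population_size = 10    # N (hypothetical)
sample_size = 3         # n (hypothetical)

# Number of distinct samples of size n drawn without replacement, ignoring order.
n_samples = math.comb(population_size, sample_size)
print(f"{n_samples} possible samples of size {sample_size} from {population_size}")
```

The count grows very quickly with N, which is one reason sampling distributions are usually derived theoretically or approximated by simulation rather than enumerated.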
A test statistic is a value that is calculated from only the observations in a sample (no unknown
parameters are estimated). The value of the test statistic will change from sample to sample. The
sampling distribution of a test statistic is the probability distribution function for all the values
that the test statistic can take across all possible samples.
The line of best fit is found by statistical calculations which this site is too crude for. Look up
least squares regression equation if you really wish to follow up. The slope of a graph is the
slope of the tangent to the graph curve at the point in question. If the function of the graph is y =
f(x) then this is the limit, as dx tends to 0, of [f(x + dx) - f(x)]/dx.
A random variable is a function that assigns unique numerical values to all possible outcomes of
a random experiment.
A real valued function defined on a sample space of an experiment is also called random
variable.
A function is added to the inverse function and multiplied by the squared and cubic, etc.
Logarithmic Function
A reduced chi-square value, calculated after a nonlinear regression has been performed, is
the chi-square value divided by the degrees of freedom (DOF). The degrees of freedom in this
case is N-P, where N is the number of data points and P is the number of parameters in the fitting
function that has been used. The reduced chi-square is useful in assessing the goodness of fit
of a non-linear regression equation. In fitting an equation to the data, it is also possible to
"over fit", which is to account for small and random errors in the data with additional
parameters. The reduced chi-square value will increase (show a worse fit) if the addition of a
parameter does not significantly improve the fit.
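A minimal sketch of the calculation, assuming each data point comes with a known measurement uncertainty (sigma) and that the chi-square is the usual sum of squared, sigma-scaled deviations. The function name and the numbers are hypothetical.

```python
def reduced_chi_square(observed, predicted, sigmas, n_params):
    """Chi-square divided by the degrees of freedom N - P."""
    chi_square = sum(
        ((o - p) / s) ** 2 for o, p, s in zip(observed, predicted, sigmas)
    )
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    return chi_square / dof


# Hypothetical fit: four data points, a two-parameter fitting function.
observed = [1.1, 1.9, 3.2, 3.8]
predicted = [1.0, 2.0, 3.0, 4.0]
sigmas = [0.1, 0.1, 0.1, 0.1]

value = reduced_chi_square(observed, predicted, sigmas, n_params=2)
print(f"reduced chi-square = {value:.2f}")
```

Because the divisor N - P shrinks with every extra parameter, an added parameter that barely lowers the chi-square leaves the reduced value higher than before, which is the over-fitting signal described above.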
How does the graph of the Mandelbrot set function relate to composite
functions?
The Mandelbrot graph is generated iteratively and so is a function of a function of a function ...
and in that sense it is a composite function.
A formula or graph are two ways to describe a math function. How a math function is described
depends on the domain of the function or the complexity of the function.
No, an function only contains a certain amount of vertices; leaving a logarithmic function to
NOT be the inverse of an exponential function.
Often it is that the two means are the same. But more generally, it is that some function of the
two means is zero.
What is the function of a calorimeter?
Its function is to measure the heat released or absorbed by an object.
Is a function with a graph that is symmetric about the origin an even function?
An even function is symmetric about the y-axis. If a function is symmetric about the origin, it is
odd.
That is not a function, although it does involve the function of addition. A function is something
that is done to numbers.
Yes, the sine function is a periodic function. It has a period of 2 pi radians or 360 degrees.
Zero order hold is used in Digital - Analogue converters (DACs). It literally holds the digital
signal for the sample time, then moves to the next digital sample and holds that signal for the