
ST 203: Statistical Models and Data Analysis
Lecture 8

Clifford Lam
Department of Statistics London School of Economics and Political Science


Lecture 8 rundown

Recap from last lecture
What is a time series?
Why are they so important?
How do we model them?
We're back to Excel


Time series
Time series (TS) data are any sequence of measurements taken on a response that varies over time.
Examples:
  Weather (pressure, temperature, rainfall; daily, weekly, annual)
  Health (HIV: white cell count; cancer: tumor growth)
  Finance (shares, interest rates, exchange rates)


TS - why do we care
In the business world, TS are the main object of statistical analysis:
shares, interest rates, real estate prices, the price of gold, petrol prices, inflation etc.
In your future jobs you might need to know something about TS.
TS are also important for weather forecasting and, in particular lately, for identifying global warming.


TS - why do we care
The aim of TS modelling is to understand seasonal (cyclical) and directional trends,
in order to be able to FORECAST (i.e. predict) the values of the variable of interest at a future date.
This allows people in the financial world to make profits
by buying or selling shares, options etc.


TS - forecasting

Forecasting is extremely uncertain.
Remember when we talked about out-of-sample predictions in MLR?
I.e. predicting the outcome for ranges of the explanatory variables that we have not seen.
We are always less sure about out-of-sample predictions than we are about in-sample predictions.


TS - forecasting

In TS forecasting we only care about out-of-sample prediction.
This becomes difficult because TS are very variable and often unpredictable:
  Markets have crashes and recessions
  Weather is highly variable


TS and regression
Consider the data on petrol sales for cars per quarter for 4 years.
From the scatterplot of quarter (time) versus sales we can see that a linear downward trend would be suitable.


TS and regression
However, we shouldn't really fit a plain linear regression, as the error terms are dependent.
This is because there is a seasonal trend, which shows clearly in the residual plot.


TS and regression
It makes sense that there should be a seasonal trend in petrol sales.
People travel more in the summer, and therefore buy more petrol then.
Of the NICEL assumptions, I is being violated: the errors are not Independent.
This is quite typical for time series models.
But there is a linear trend too.
We need to consider both elements.

TS Components
TS have 4 elements:
1. Trend: the long-term direction of the data. This can be linear, exponential etc.
   It can be described by a regression.
2. Seasonal effects: cycles related to seasons, months, weeks or days of the week.
3. Cycles: long-term cycles that are not necessarily related to the season. No need to worry about these in this course.
4. Irregular fluctuations: random error plus blips and market crashes.

TS Components
We assume a multiplicative model for how the TS components mix:
$$Y_t = \text{Trend} \times \text{Seasonality} \times \text{Cyclicality} \times \text{Irregularities} = T \times S \times C \times I$$
Typically we focus on T and S, as these are easier to predict than long-term cycles and irregular fluctuations.

TS Components
We use a multiplicative model because it makes more sense than an additive model.
  Think about the residual plot from before.
  It had a seasonal pattern but was also heteroscedastic (i.e. funnel shaped).
So if we just add seasonality to linearity, we don't take into account that for larger times there is often extra variability as well.
A multiplicative model does!
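
To make the contrast concrete, here is a minimal Python sketch. The trend, the seasonal indices and the additive effects are all made-up numbers for illustration, not the lecture's data. It compares how far the seasonal swing reaches in an early year and a later year under each model.

```python
# Hypothetical quarterly seasonal indices; they sum to 4 (average 1).
seasonal_index = [0.8, 1.0, 1.3, 0.9]
# Hypothetical additive seasonal effects; they sum to 0.
seasonal_effect = [-20.0, 0.0, 30.0, -10.0]


def trend(t):
    """Hypothetical upward linear trend at quarter t = 1, 2, ..."""
    return 100.0 + 10.0 * t


def multiplicative(t):
    """Y_t = T_t * S_t: the seasonal swing scales with the trend level."""
    return trend(t) * seasonal_index[(t - 1) % 4]


def additive(t):
    """Y_t = T_t + S_t: the seasonal swing is the same size every year."""
    return trend(t) + seasonal_effect[(t - 1) % 4]


def yearly_swing(model, year):
    """Peak-to-trough gap of (value - trend) over the 4 quarters of a year."""
    quarters = range(4 * (year - 1) + 1, 4 * year + 1)
    gaps = [model(t) - trend(t) for t in quarters]
    return max(gaps) - min(gaps)


for year in (1, 4):
    print(year, yearly_swing(additive, year), yearly_swing(multiplicative, year))
```

Under these assumptions the additive swing stays the same in every year, while the multiplicative swing grows as the trend level rises, which is the funnel shape seen in the residual plot.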


Trend example
Below are monthly data on petrol sales.
As you can see from this longer time series, there is an upward, probably linear, trend.
As the data are monthly there is also a monthly (seasonal) trend, but it is harder to see.


Irregular fluctuations example


Below are monthly data on chemical sales.
As you can see from this longer time series, there are a number of irregular jumps, with linear trends in between.


Season example
Below are seasonal data on ice-cream sales.
As you can see from this time series, there is a definite seasonal trend in ice-cream sales.


Other examples

Employment will have:
  Cycles: recessions have a cyclical nature
  Irregularities: market crashes
  Seasonality: more work in the summer
  Trend: as more people are born, more are employed


TS components
In this course we focus on retrieving the main components of a time series: Trend and Seasonality.
This can get pretty intense before you understand it, so please pay attention.
The way this works is to find the underlying trend of the time series,
and then divide the time series by this trend in order to get the seasonal component.


Stationarity

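In this course, time series without cycles or seasonality are called stationary.
I.e. if a time series has only a trend and can be explained by a linear regression, then we call it stationary.
(Strictly speaking, such a series is trend-stationary: it is stationary only once the trend has been removed.)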


TS components

Let Y represent a time series.
$Y = T \times S \times C \times I$ is the most general form.
I is unpredictable.
C is hard to estimate unless we know the data cycle.
T and S can be found, so we assume $Y = T \times S$.


TS how to
1. S: First we find a preliminary trend by Smoothing the data.
   Think carefully about the type of moving average you might choose.
2. M: We find a preliminary seasonal component by dividing the time series Y by the Moving-average trend: $S = Y / T$.
3. S: We then get the true Seasonal effects by estimating first the seasonal averages and then the seasonal indices.


TS how to
4. T: Now that we have a good estimate of seasonality, divide the data by the seasonal index to get the real Trend: $T = Y / S$.
5. R: Use a linear Regression on the trend to estimate the trend parameters.
6. F: Multiply the trend Forecast by the seasonal estimates, and then do the usual residual analyses.


Smoothing
Smoothing is the idea of getting rid of the seasonal, cyclical and irregular components of the time series, thus extracting the trend.
The idea is to summarise what a time series is doing by averaging the data points over a number of time points.
Some people don't like this and prefer autocorrelation models.
We'll see these later.


Smoothing

There are three main ways of smoothing:
1. Moving averages
2. Weighted moving averages
3. Exponential smoothing (later)
We'll do moving averages first.


TS example: Moving average


If we are looking at quarters we should use a 4 point centered moving average.
If we were looking at months we would use a 12 point centered moving average.
If we were looking at years we would choose a 3 or 5 point moving average.
If the moving average has an even number of points it needs to be centered.
We can also use weighted moving averages.


TS example: Moving average


The moving average is a way of smoothing out the fluctuations and seasonality.
Let's say we want to calculate 5 and 3 point moving averages:
1. 5 point MA:
$$\bar{x}^{5}_t = \frac{x_{t-2} + x_{t-1} + x_t + x_{t+1} + x_{t+2}}{5}$$
2. 3 point MA:
$$\bar{x}^{3}_t = \frac{x_{t-1} + x_t + x_{t+1}}{3}$$
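
As a minimal sketch of these two formulas in Python (the lecture itself works in Excel), the hypothetical helper `moving_average` below averages each point with its neighbours on either side. The `series` values are made up for illustration.

```python
def moving_average(x, k):
    """k-point moving average for odd k, aligned with the middle point.

    Returns a list as long as x, with None wherever the window
    would run off either end of the series.
    """
    if k % 2 == 0:
        raise ValueError("an even-length moving average must be centered separately")
    half = k // 2
    out = [None] * len(x)
    for t in range(half, len(x) - half):
        out[t] = sum(x[t - half : t + half + 1]) / k
    return out


series = [3, 5, 4, 6, 8, 7, 9]  # hypothetical annual values
ma3 = moving_average(series, 3)
ma5 = moving_average(series, 5)
print(ma3)
print(ma5)
```

The `None` entries at both ends show why the 3 point MA starts at point 2 and the 5 point MA at point 3, and why both finish early.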


TS example: Moving average


TS example: Moving average


The data are annual, so we look at 3 and 5 point moving averages.
These do not have to be centered.
As you can see, there is an upward trend in the data (exponential or quadratic?).
The 3PtMA starts at point 2 and the 5PtMA starts at point 3, and they both finish early, as they need all the points above and below for the estimates.


TS example: Moving average


Weighted moving average

Instead of just using simple averages we can try using weighted moving averages.
These give more weight to values close to the time point we are estimating for.
E.g. in our minks example we use a 5 point MA where the furthest points are worth less:
$$\bar{x}^{5}_t = \frac{x_{t-2} + 2x_{t-1} + 4x_t + 2x_{t+1} + x_{t+2}}{10}$$
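
The weighted formula can be sketched in Python as follows. The helper name `weighted_moving_average` and the `series` values are assumptions for illustration; only the weights 1, 2, 4, 2, 1 (summing to 10) come from the slide.

```python
def weighted_moving_average(x, weights=(1, 2, 4, 2, 1)):
    """Weighted moving average with an odd number of weights.

    Each window is multiplied by the weights and divided by their sum,
    so the middle point counts most. Ends without a full window are None.
    """
    if len(weights) % 2 == 0:
        raise ValueError("this sketch expects an odd number of weights")
    half = len(weights) // 2
    total = sum(weights)
    out = [None] * len(x)
    for t in range(half, len(x) - half):
        window = x[t - half : t + half + 1]
        out[t] = sum(w * v for w, v in zip(weights, window)) / total
    return out


series = [3, 5, 4, 6, 8, 7, 9]  # hypothetical annual values
wma5 = weighted_moving_average(series)
print(wma5)
```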


TS example: Moving average


The weighted moving average formula looks like this in Excel


TS example: Moving average


Consider the example of car petrol price per gallon, in dollars, per quarter over 4 years.
It has a downward trend.
It has a seasonal (quarterly) component.


TS example: Moving average


For quarters we need a 4 point centered moving average.
If the number of points in the MA is even then we also have to center it.
In this example we calculate a 4 point centered moving average:
1. 4 point MA:
$$\bar{x}^{4}_t = \frac{x_{t-2} + x_{t-1} + x_t + x_{t+1}}{4}$$
2. Say we get $\bar{x}^{4}_t$ and $\bar{x}^{4}_{t+1}$; the centered 4 point MA is
$$\bar{x}^{4c}_t = \frac{\bar{x}^{4}_t + \bar{x}^{4}_{t+1}}{2}$$
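
A minimal Python sketch of the two-stage calculation, assuming 16 quarterly values held in a list (0-based, so list position 2 is quarter 3). The function name `centered_ma4` and the example series are hypothetical.

```python
def centered_ma4(x):
    """4 point centered moving average.

    Stage 1: ma4[t] averages x[t-2], x[t-1], x[t], x[t+1].
    Stage 2: cma[t] averages ma4[t] and ma4[t+1].
    Positions without enough neighbours are left as None.
    """
    n = len(x)
    ma4 = [None] * n
    for t in range(2, n - 1):
        ma4[t] = sum(x[t - 2 : t + 2]) / 4
    cma = [None] * n
    for t in range(2, n - 2):
        cma[t] = (ma4[t] + ma4[t + 1]) / 2
    return cma


# Hypothetical series: a linear trend 10 + t plus a quarterly pattern summing to 0.
pattern = [2, -1, -3, 2]
series = [10 + t + pattern[t % 4] for t in range(16)]
cma = centered_ma4(series)
print(cma)
```

For this made-up additive series the centered MA returns the straight line 10 + t exactly, and it is defined only for quarters 3 to 14 of the 16.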



TS example: Moving average


In our example we have quarters, so we calculate a 4 point centered MA.
First a 4 point MA:


TS example: Moving average


Then we center it:


TS example: Moving average


As you can see, the MAs smooth the seasonal trend: the series no longer goes up and down as much as the original time series.
It now looks much more linear.
This can also be seen in the plot.
The MAs only start after the first 2-3 values of the time series, as you need at least that many to smooth.
They also end early for the same reason.


TS example: Moving average


Seasonal components
Once we've got the hang of the trend and smoothed out the season (if there is one),
we can try to isolate the seasonal element by dividing the time series by the trend.
Remember that $Y = T \times S$ if there are no cycles or irregularities, so $Y / T = S$.
What we estimate is the Ratio-to-moving-average (R2MA).
In our petrol example there was a seasonal component, so let's find the R2MA.

TS example: R2MA
$$\text{R2MA}_t = \frac{Y_t}{\text{4ptCMA}_t}$$
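
In Python the ratio is a one-line division wherever the centered MA exists. This is a sketch only: `ratio_to_moving_average`, `y` and `cma` are hypothetical names, and the six values are made up so that the arithmetic is easy to follow.

```python
def ratio_to_moving_average(y, cma):
    """R2MA_t = Y_t / CMA_t, or None where the centered MA is undefined."""
    return [None if m is None else v / m for v, m in zip(y, cma)]


# Hypothetical quarterly values and their 4 point centered moving average.
y = [80.0, 100.0, 120.0, 100.0, 80.0, 100.0]
cma = [None, None, 100.0, 100.0, None, None]
r2ma = ratio_to_moving_average(y, cma)
print(r2ma)
```

A ratio above 1 marks a quarter that sits above the smoothed trend, and a ratio below 1 a quarter that sits below it.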


Seasonal components

We now have an idea of the seasonal trends.
However, if you think about it, we want one estimate of the seasonal component for each quarter.
Currently we have 3 for each quarter (we're looking over 4 years).
Remember that, because we have to leave out the first couple and the last couple of values to get the 4 point centered MA, we only have the R2MA from the 3rd quarter to the 14th.
The best thing is to list them.

TS example: R2MA


Seasonal components
Once we've listed them, we get average values of the seasonal component for each season.
For each season we have 3 values, so we average them for each season.
If some seasons have a different number of values, we take the average of them anyway.
E.g. if for Autumn we have only 2 values, then we take the average of these 2 values for Autumn.
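
The grouping step can be sketched in Python as below. It assumes the series starts in the first season, so list position `t` belongs to season `t % period`. The function name and the R2MA values are hypothetical, with 2 usable values per quarter rather than the lecture's 3.

```python
def seasonal_averages(r2ma, period=4):
    """Average the available R2MA values season by season.

    Missing values (None) are skipped, so seasons may be averaged
    over different numbers of values. Every season needs at least one.
    """
    buckets = [[] for _ in range(period)]
    for t, ratio in enumerate(r2ma):
        if ratio is not None:
            buckets[t % period].append(ratio)
    return [sum(b) / len(b) for b in buckets]


# Hypothetical R2MA values for 12 quarters (3 years), starting at quarter 1.
r2ma = [None, None, 1.20, 0.90, 0.80, 1.10, 1.24, 0.94, 0.78, 1.06, None, None]
averages = seasonal_averages(r2ma)
print(averages, sum(averages))
```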


Seasonal Average


Seasonal components - Seasonal averages

Next we look at the sum of the seasonal averages.
In this case it is 4.0013, very close to 4.
The idea is to imagine seasonality as going up and down around the trend line.
We want it to be on average 1 each year, so we want the sum of the four seasonal averages to be exactly 4.
Remember that we are using a multiplicative model, so 1 plays the role that 0 plays in additive models.

Seasonal components

We have to estimate the seasonal indices.
The way we do this is by normalising the seasonal averages so that they sum to 4.
E.g. for summer we compute
$$\text{Summer.Index} = \text{Summer.Average} \times \frac{4}{\text{Sum.of.Avgs}}$$
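
The normalisation can be sketched in a few lines of Python. The function name `seasonal_indices` and the four averages are made up for illustration; they are not the lecture's petrol figures.

```python
def seasonal_indices(averages):
    """Rescale seasonal averages so they sum to the number of seasons.

    For quarterly data this is Average * 4 / Sum.of.Avgs.
    """
    period = len(averages)
    total = sum(averages)
    return [a * period / total for a in averages]


averages = [0.79, 1.08, 1.22, 0.92]  # hypothetical; they sum to 4.01
indices = seasonal_indices(averages)
print(indices, sum(indices))
```

The rescaling leaves the relative sizes of the seasons unchanged and only forces their average to be exactly 1.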


Seasonal Indices


De-seasonalising
Now we have a grip on the seasonal component of the data.
We want to get a better grip on the trend.
So we use the same trick we used before: $Y = T \times S$, so $Y / S = T$.
To get a de-seasonalised trend for the petrol data, we divide Petrol by the Seasonal Index.
From the plot we can see that the de-seasonalised data now just look regularly linear.
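
A minimal Python sketch of this division, assuming the series starts in the first season so that list position `t` uses index `t % period`. The name `deseasonalise`, the indices and the data are hypothetical.

```python
def deseasonalise(y, indices):
    """Divide each observation by the seasonal index of its season (T = Y / S)."""
    period = len(indices)
    return [value / indices[t % period] for t, value in enumerate(y)]


indices = [0.8, 1.0, 1.3, 0.9]                   # hypothetical seasonal indices
y = [88.0, 120.0, 169.0, 126.0, 120.0, 160.0]    # hypothetical quarterly data
trend = deseasonalise(y, indices)
print(trend)
```

In this made-up example the seasonal wiggle disappears and the result rises by the same amount every quarter.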

De-seasonalising


De-seasonalising


Trend

What we do now with the trend is to fit a linear regression to it.
You do this in the usual way, so I won't go over it.
I just keep the intercept and the coefficient of Quarter.
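
For reference, the intercept and the coefficient of Quarter come from the usual least-squares formulas. The Python sketch below uses a hypothetical helper `fit_line` and made-up de-seasonalised values that lie exactly on a line, so the answer is easy to check by eye.

```python
def fit_line(t, y):
    """Simple least-squares line y = intercept + slope * t."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxy = sum((a - t_bar) * (b - y_bar) for a, b in zip(t, y))
    sxx = sum((a - t_bar) ** 2 for a in t)
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return intercept, slope


quarters = list(range(1, 9))
deseasonalised = [200.0 - 5.0 * q for q in quarters]  # hypothetical values
intercept, slope = fit_line(quarters, deseasonalised)
print(intercept, slope)
```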


Trend


Forecasting
We have the seasonal indices (seasonal component).
We have the trend line (linear regression).
The aim is now to forecast the next few points:
1. Use the linear regression to forecast (predict) the trend values for quarters 17-20.
2. Multiply the regression predictions by the appropriate seasonal index.
These are the forecasts.
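
The two forecasting steps can be sketched in Python as follows. The intercept, slope and seasonal indices are made-up numbers, not the fitted petrol values, and the sketch assumes quarter 1 falls in the first season.

```python
def forecast(intercept, slope, indices, quarters):
    """Trend prediction times seasonal index for each future quarter."""
    period = len(indices)
    result = {}
    for q in quarters:
        trend_value = intercept + slope * q        # step 1: regression forecast
        season = indices[(q - 1) % period]         # quarter 1 -> first index
        result[q] = trend_value * season           # step 2: re-seasonalise
    return result


indices = [0.8, 1.0, 1.3, 0.9]   # hypothetical seasonal indices
predictions = forecast(intercept=200.0, slope=-5.0, indices=indices,
                       quarters=range(17, 21))
print(predictions)
```

The forecasts keep falling along the trend line while repeating the same quarterly pattern, which is exactly what this method can and cannot capture.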


Forecast


Forecast
We now look at the plot of the predicted time series.
As you can see, the downward trend continues and the seasonal pattern repeats itself.


Real data
We actually have the data for those next 4 quarters.
As you can see, there was a crash in petrol prices in 1985 (where quarter 16 ends).
We could not have predicted that without further information.


Real data
At the end of 1985 there was a pretty big crash in the price of crude oil that lasted all the way through 1986.
This was due to a maneuver by OPEC countries to secure their future in the market in the face of competition from other countries, e.g. the USA.
Crude oil production has gone down pretty much since then.


Main points to take away


Time series analysis violates assumption I (Independence) from NICEL.
Time series have 4 components: Trend, Seasonality, Cyclicality and Irregularity.
We can use moving averages to smooth the time series and identify a trend.
We can use SMSTRF to find the moving average, the seasonal indices, a better estimate of the trend by linear regression, and a forecast.

