
Short Term Prediction - Restricted Data Set Selection

Prediction is one of my primary focuses, and I only work on short-term predictions.

Most people think that prediction is easy with Deep Learning. If that were the case, why would
companies hire Data Scientists? They would just have to buy a “Big Data Machine”, put data into it,
and harvest predictions. Yet companies continue to hire Data Scientists.

---

For Deep Learning or Machine Learning, your Data Scientists will separate the data set into a
“training set” and a “test set”. Mastering this separation is one of the skills of a Data Scientist.

The “Big Data Machine” learns from the training set and outputs predictions.

Compare them with the test set and you’ll get the accuracy of the prediction.

A rule of thumb is to start by separating the data set into an 80% training set and a 20% test set,
and then to change these ratios for different runs.
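
As a minimal sketch of this rule of thumb, the split could look like the following. It assumes the data is an ordered time series, so the cut is chronological (no shuffling) and the test set is the most recent part. The function name `chronological_split` and the sample values are hypothetical, chosen only for illustration.

```python
def chronological_split(series, train_ratio=0.8):
    """Split an ordered series into (train, test) without shuffling.

    The first `train_ratio` share of the points becomes the training set,
    the remaining most recent points become the test set.
    """
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    cut = round(len(series) * train_ratio)
    return series[:cut], series[cut:]


# Hypothetical series of 20 observations, for illustration only.
series = [float(v) for v in range(1, 21)]

# Start at 80/20, then vary the ratio across runs.
for ratio in (0.8, 0.7, 0.9):
    train, test = chronological_split(series, ratio)
    print(f"ratio={ratio:.1f}: {len(train)} training points, {len(test)} test points")
```

Keeping the time order matters here: a random split would leak future observations into the training set.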

---

Take the ausbeer data set available in the R package "fpp". (The following works with any data set
that has an inflection.)

The ausbeer data set presents two parts:

- one part with a strongly growing trend
- another part with a slightly downward trend

When you separate this data set into a training set and a test set, your Big Data Machine will learn
from data with both an upward and a downward trend (training set), and the resulting prediction will
be compared with a downward trend (test set).

For short-term prediction, it’s better to use a data set restricted to the slightly downward part.
Something like the right part of the following figure.
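
The effect can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the series is synthetic (not ausbeer), the position of the inflection is taken as known, and a plain least-squares line stands in for whatever model the “Big Data Machine” would use.

```python
def fit_line(ts, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def mean_abs_error(model, ts, ys):
    """Mean absolute error of a fitted line on the points (ts, ys)."""
    slope, intercept = model
    return sum(abs(y - (slope * t + intercept)) for t, y in zip(ts, ys)) / len(ts)


# Synthetic quarterly series: strong growth up to t = 60, then a slight decline,
# plus a fixed seasonal pattern. Hypothetical values, not the ausbeer data.
INFLECTION = 60  # assumed to be known for this sketch
SEASON = [10.0, -5.0, -10.0, 5.0]
ts = list(range(100))
ys = [
    (200.0 + 3.0 * t if t < INFLECTION else 380.0 - 0.5 * (t - INFLECTION)) + SEASON[t % 4]
    for t in ts
]

# Chronological 80/20 split.
cut = 80
train_t, train_y = ts[:cut], ys[:cut]
test_t, test_y = ts[cut:], ys[cut:]

# Model 1 learns from the whole training set (upward and downward parts).
full_model = fit_line(train_t, train_y)
# Model 2 learns only from the training points after the inflection.
restricted_model = fit_line(train_t[INFLECTION:], train_y[INFLECTION:])

mae_full = mean_abs_error(full_model, test_t, test_y)
mae_restricted = mean_abs_error(restricted_model, test_t, test_y)
print(f"MAE, full training set:       {mae_full:.1f}")
print(f"MAE, restricted training set: {mae_restricted:.1f}")
```

On this synthetic series the restricted model gives the smaller test error, because the full training set is dominated by the growth phase that no longer applies.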

Some change point (or breakpoint) detection algorithms are available in R packages, such as:

- "cpt.mean" and "cpt.var" in R package "changepoint"
- "breakpoints" in R package "strucchange"
- the "Student", "GLR", "Mann-Whitney", "Bartlett" and "Exponential" methods in R package "cpm"
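
To show the kind of problem these algorithms address, here is a minimal sketch of a single change-in-mean search. It is not the implementation used by any of the packages above; it simply tries every split position and keeps the one where two separate means fit the data best. The function name `best_mean_split` and the sample values are hypothetical.

```python
def best_mean_split(ys, min_size=2):
    """Return the index k that minimises the total squared error when
    ys[:k] and ys[k:] are each summarised by their own mean."""
    n = len(ys)
    if n < 2 * min_size:
        raise ValueError("series is too short for the requested segment size")

    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    best_k, best_cost = None, float("inf")
    for k in range(min_size, n - min_size + 1):
        cost = sse(ys[:k]) + sse(ys[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Hypothetical series whose level jumps from about 10 to about 20 at index 6.
ys = [10, 11, 9, 10, 11, 10, 20, 21, 19, 20, 21, 20]
k = best_mean_split(ys)
print(k)  # index of the first point of the second segment
```

This sketch always returns a split, even when the series has no real change. A practical detector also needs a test or penalty to decide whether the split is significant, and a series like ausbeer changes in trend rather than in mean, which is a harder case.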

I’ve tried all of the methods listed above with default parameters, and none of them works the way
I want. (I probably need to be more tenacious and experiment with more sets of parameters.)
No breakpoints are detected with the Exponential method!

So I’ve developed my own algorithm; BestFit is the working title.

Tu-Anh Pho

pho.tuanh@gmail.com
