Forecasting for Business - Blog -

http://blog.lokad.com/journal/category/forecasting

Call us: +1 (716) 989 6531 or email at: contact@lokad.com

22/12/2011 14:29

Entries in forecasting (45)

Seasonality illustrated


Monday, September 19, 2011 at 12:00PM

Seasonality is one of the strongest statistical patterns that can be leveraged to refine forecasts. Below are 4 time-series aggregated at the weekly level (159 weeks). Historical data are in red and forecasts are in purple. Vertical gray markers indicate January 1st.

When illustrating seasonality, everyone (Lokad included) tends to use long time-series, much like the first three series above. Indeed, they are more visual and more appealing. However, long time-series do not represent your usual situation. On average, consumer goods have a lifespan of no more than 3 or 4 years. Thus, long time-series are typically a small minority in your dataset. Worse, those long time-series might be outliers that do not reflect the behavior of other shorter-lived products. Above, the short 4th time-series is a much more representative case, with less than 1 year of data. In such a situation, however, it's much less clear how seasonality can be leveraged. The Lokad trick consists of using multiple time-series analysis. Learn more in our seasonality definition article.
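As a rough illustration of the multiple time-series idea, the sketch below borrows a seasonal profile from long series and applies it to a short one. It is not Lokad's actual method: the function names, the multiplicative seasonal index, and the toy period of 4 are all assumptions made for the example, and every long series is assumed to start at position 0 of the seasonal cycle.

```python
from statistics import mean

def seasonal_profile(series, period):
    """Multiplicative seasonal index per position in the cycle.

    Assumes the series starts at position 0 and covers at least one period.
    """
    overall = mean(series)
    return [mean(series[pos::period]) / overall for pos in range(period)]

def pooled_profile(long_series_list, period):
    """Average the seasonal profiles of several long series."""
    profiles = [seasonal_profile(s, period) for s in long_series_list]
    return [mean(p[pos] for p in profiles) for pos in range(period)]

def forecast_short(short_series, start_pos, profile, horizon):
    """Deseasonalize a short history to estimate its level,
    then re-apply the pooled profile to future positions."""
    period = len(profile)
    level = mean(
        value / profile[(start_pos + i) % period]
        for i, value in enumerate(short_series)
    )
    n = len(short_series)
    return [level * profile[(start_pos + n + h) % period] for h in range(horizon)]

# Hypothetical data: two long series sharing the same seasonal shape.
long_a = [10, 20, 30, 20] * 3
long_b = [20, 40, 60, 40] * 3
profile = pooled_profile([long_a, long_b], period=4)

# A short series with only two observations, starting at cycle position 2.
forecast = forecast_short([15, 10], start_pos=2, profile=profile, horizon=2)
```

The short series alone has far too little history to reveal any seasonality; the pooled profile supplies it.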
Joannes Vermorel

Video: How the Forecasting Engine works?


Tuesday, September 13, 2011 at 09:00AM

Questions about the under-the-hood details of Lokad are frequent. We have recently added a big FAQ to our Forecasting Technology section. Today, we are releasing a new video that gives the big picture of how our forecasting engine works.

Again, special thanks to Ray Grover for the voice over.
Joannes Vermorel

Weekly/Monthly aggregation is a lossy process


Thursday, April 14, 2011 at 12:19PM

When practitioners have a first look at a forecast report produced by Lokad, they tend to stumble upon various oddities. For example, some forecasts may look way too low. Without any observable trend or seasonality, Lokad anticipates something rather unexpected. Sometimes it's a by-product of rather advanced correlation analytics, but sometimes it's something both simpler and deeper.

The graph on the left represents a typical situation: steady sales for a couple of months, and then a somewhat inexplicable drop in the forecasts. Common sense is yelling that this can't be right, let's fix this broken forecast; and yet forecasting and common sense do not mix well.

The way we observe sales is deeply misleading. Indeed, we are observing here monthly aggregated sales, not the sales themselves. Many businesses favor monthly forecasts because they feel their sales are too low or too erratic at the daily or weekly level to be of any practical use. Hence, they aggregate sales data over long(er) periods of time. By doing so, sales appear smoother and, consequently, more predictable. This visualization of sales, i.e. thinking in totals rather than in an endless stream of transactions, is so ubiquitous that many businesses fail to realize that aggregating sales primarily means losing information that is potentially valuable for forecasting.

Let's illustrate the point with a fresh look at the same sales history, this time through weekly aggregation. The picture is extremely different. We realize that the seemingly steady monthly averages were just resulting from two super-heavy weeks: one between January and February and a second in March. Such spikes routinely appear in businesses because of promotions and various other kinds of exceptional events.

With the second illustration, low forecasts make a lot more sense: sales include infrequent spikes that should not be accounted for, and, when we mentally discard those spikes, we obtain forecasts that just follow the usual averaging pattern. A traditional forecasting system would typically be fooled by such a situation, and would anticipate a much higher monthly forecast, which would turn out to be much less accurate.

But Lokad is definitely not your traditional forecasting system. When monthly or weekly forecasts are requested, we keep looking at the most fine-grained data available. This lets us identify patterns that would otherwise be lost through the sales aggregation process.
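The information loss described above can be reproduced with a few made-up numbers. In this minimal sketch, the weekly figures are hypothetical, each "month" is assumed to be exactly 4 weeks, and the median is used only as a stand-in for "mentally discarding the spikes"; it is not presented as Lokad's model.

```python
from statistics import mean, median

WEEKS_PER_MONTH = 4  # simplifying assumption: each "month" is exactly 4 weeks

# Hypothetical weekly sales: a baseline of 10 units with two promotion spikes.
weekly_sales = [10, 10, 10, 40, 10, 40, 10, 10]

def monthly_totals(weekly, weeks_per_month=WEEKS_PER_MONTH):
    """Aggregate weekly sales into consecutive monthly totals."""
    return [
        sum(weekly[i:i + weeks_per_month])
        for i in range(0, len(weekly), weeks_per_month)
    ]

monthly = monthly_totals(weekly_sales)  # [70, 70]: looks perfectly steady

# Forecast built on the aggregated view: the spikes are baked in.
naive_monthly_forecast = mean(monthly)  # 70

# Forecast built on the fine-grained view: the median ignores the rare spikes.
baseline_monthly_forecast = median(weekly_sales) * WEEKS_PER_MONTH  # 40.0
```

The monthly totals hide the two spikes completely, while the weekly view shows that the usual level of demand is much lower than the monthly average suggests.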
Joannes Vermorel

Business is UP but forecasts are DOWN


Friday, April 1, 2011 at 11:11AM

Statistical demand forecasting is a counter-intuitive science. This point has been made a couple of times before, but let's have a look at another misleading situation. If every single product segment of my business is growing fast, then at least some products should have an upward sales trend as well. Right? Otherwise, we would not be growing at all. This statement looks like plain common sense; and yet it's wrong, very wrong. We live in a fast-paced economy. Having an identical product being sold for more than 3 years is the exception rather than the norm in most consumer goods businesses. As a result, product life-cycles tend to dwarf the organic growth of retailers. This situation is illustrated by the schema below.

This is a set of product sales plotted on the same graph. Each curve is associated with a particular product, and products are launched over time. Each product comes with its own lifecycle pattern. The lifecycle patterns here illustrate a typical novelty effect: sales quickly ramp up after product launch, and then the product enters its downward phase, which ends when the product is finally phased out of the market. Yet, how does an upward trend of the retailer itself impact this picture? Let's have another look at the illustration below.

Sales are higher with a positively trended retailer, yet this growth is nowhere near strong enough to compensate for the product lifecycle effect. The sales of the product are still decreasing, albeit at a slower rate. This situation outlines how we can have a fast-growing retail business with only negatively trended product sales. The main trick lies in the fact that new products keep being launched.

Alas, this situation generates a lot of confusion. Indeed, when sales forecasts severely mismatch overall expectations, it becomes very tempting to fix the forecasts. Since most forecasting tools are poorly suited to deal with highly variable or intermittent demand anyway, it is tempting to aggregate sales per family or per category to produce an aggregated forecast, and then to de-aggregate forecasts at the SKU level using ratios. This approach is named top-down forecasting, and it is heavily used in many industries (textile among others).

Top-down forecasts produce results that look much closer to intuitive expectations: growth is observed in the sales forecasts, and it matches the growth observed in the various business segments. Yet, by producing the forecast at the TOP level, the forecasting model is capturing a fictitious upward trend that only results from the contribution of regular product launches. If this fictitious trend ends up applied to a lower level, i.e. SKUs or products, then we significantly over-forecast the sales of each individual product. Near worst case: massive overstock is generated for products precisely at the time they are phased out of the market.

From a forecasting perspective, a good forecasting system should be able to capture lifecycle effects. It means that sales forecasts may significantly differ from the overall business forecast. Business can go UP while every single product is going DOWN. In such a situation, trying to fix forecasts is most likely going to make them worse.
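The effect described above can be reproduced with a toy model. In this sketch, every number is hypothetical: one product is launched per period, launch volumes grow by 30% per period (the retailer's growth), and each product loses half of its sales every period after launch (the lifecycle). The top-down rule used here, extrapolating the last growth ratio of the total and splitting it by current shares, is just one simple way to de-aggregate with ratios.

```python
GROWTH = 1.3   # hypothetical growth of launch volumes from one period to the next
DECAY = 0.5    # hypothetical lifecycle decay applied each period after launch
PERIODS = 4    # number of observed periods; one product is launched per period

def product_sales(launch, t):
    """Sales at period t of the product launched at period `launch`."""
    if t < launch:
        return 0.0
    return 100.0 * GROWTH ** launch * DECAY ** (t - launch)

history = {p: [product_sales(p, t) for t in range(PERIODS)] for p in range(PERIODS)}
totals = [sum(history[p][t] for p in history) for t in range(PERIODS)]

# Top-down: extrapolate the growing total, then split it using current shares.
total_forecast = totals[-1] * (totals[-1] / totals[-2])
top_down = {p: total_forecast * history[p][-1] / totals[-1] for p in history}

# What each existing product actually sells in the next period under this model.
actual_next = {p: product_sales(p, PERIODS) for p in history}

for p in history:
    print(f"product {p}: top-down {top_down[p]:.1f} vs actual {actual_next[p]:.1f}")
```

The total grows every period although each product declines from its launch onward, and the top-down split over-forecasts every existing product.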
Addendum: Despite the date of this post (April 1st, 2011), this post is not a joke.
Joannes Vermorel

New Forecasting Technology FAQ


Wednesday, March 9, 2011 at 11:28AM

Lately, we realized that the page detailing our forecasting technology was somewhat vague concerning under-the-hood aspects such as seasonality, trend, product life-cycle, promotions, etc. Hence, we have just posted a new extensive Forecasting Technology FAQ.

Questions and Answers

Nuts and bolts
- How accurate are your forecasts?
- Forecasting competitions, do you have any academic validation of your technology?
- Do you evaluate the accuracy of your forecasts?

General patterns
- Macro trends (ex: financial crisis), how are they handled?
- Seasonality, trend, how is it handled?
- Promotions, how are they handled?
- Product Life Cycles and product launches, how are they handled?
- Intermittent / low volume products, how are they handled?
- Cannibalization, how are they handled?
- Weather, how is it handled?

Demand artifacts
- Lost sales caused by stock-outs, how are they handled?
- Exceptional sales, how are they handled?
- Aggregation, top-down or bottom-up?

Obviously, we are barely scratching the surface here. Don't hesitate to post your own questions; we will do our best to address them as well.
Joannes Vermorel

Fallacies in data cleaning for (short-term) sales forecasts


Friday, November 19, 2010 at 11:43AM

When it comes to data analysis, experts frequently emphasize (and rightly so) the importance of having a clean dataset before starting any analysis. Otherwise, you end up with Garbage In, Garbage Out. As a result, most forecasting toolkits provide extensive features to support data cleaning and data preparation; and yet, Lokad does not provide any explicit feature supporting data cleaning.

Have we missed something BIG here? We don't believe so. There are some misunderstandings when it comes to data cleaning for the purpose of (short-term) sales forecasting. Indeed, nowadays, the sales of most retailers, wholesalers and manufacturers are stored in either an ERP or some accounting system. In our experience, as of 2010, transactional data associated with sales are remarkably clean. If there is a transaction recorded on November 1st, 2010 indicating that product X has been sold in quantity Y, then the probability for this information to be true is very high, above 99.9% for most sales processes.

Indeed, companies cannot afford not to know what they are selling. As a result, massive efforts have been invested in the last two decades to make really sure that sales data are reliable. We are not saying that no erroneous sales entry ever enters the system; we are only saying that the proportion is typically insignificant.

If sales data are clean, why is so much effort still spent on data cleaning? We have been observing a lot of data cleaning practices in the industry, and it turns out that the operations referred to as cleaning tend to be much more than actually looking for the 0.1% of erroneous transactions. The illustration above gives some insights about the actual operations involved in a typical data cleaning phase: it's all about smoothing the extremes. For example, partial sales during shortages are manually increased, and promotional/exceptional sales are capped.

Needless to say, we are not believers in this approach. Real sales data should not be replaced by fictitious sales data. Indeed, nothing can tell with 100% confidence how many products would have been sold if there had not been any shortage. The partial sales are the only tangible data that we have that does not already rely on statistical extrapolation. Yet, there is one interesting side-effect of the smooth-the-extremes practice: smoothing improves the accuracy of the naive forecasting methods that behave much like the moving average.

"It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail."
Abraham Maslow, 1966

Trying to adjust the sales data to better fit the only forecasting model at hand is just a bad case of the Law of the instrument. Our approach consists of directly tackling the complex patterns instead of trying to circumvent them.
Joannes Vermorel

Width vs. Depth, Rotate your sales forecasts by 90 degrees


Tuesday, August 31, 2010 at 06:05PM

We have already discussed why Lokad did not care much about forecasting Chinese food rather than Sport Bar beverages. Another way of thinking about our technology consists of rotating your sales forecasts by 90 degrees. We are observing that a consumer product has, on average, a 3-year lifecycle. This means that, on average, the amount of data available for every single product is about 18 months. When we look at the sales history with a monthly aggregation, 18 months of data means 18 points. With 18 data points, no matter how smart or advanced your forecasting theory is, you can't do much, simply because we face an utter lack of data to perform any robust statistical analysis. With 18 points, even a pattern as obvious as seasonality becomes a challenge to observe because we don't even have 2 complete seasonal observations. Your mileage may vary from one industry to the next, but unless your products stay in the market for decades, you are most likely to face this issue.

As a direct consequence, classical forecasting toolkits require statisticians to tweak forecasting models for every single product because no non-trivial statistical model can be robustly fit with only 18 points as input data. Yet, Lokad does not require any statistician, and the magic lies in the 90 degrees rotation: our models do not iterate over the data one time-series at a time, but against all time-series at once. Thus, we have a lot more input data available, and consequently we can succeed with rather advanced models.

This approach is just common sense: if you want to forecast the seasonality of your new chocolate bar, the seasonality of the other chocolate bars seems like a good candidate. Why should you treat each chocolate bar in strict isolation from the others?

Yet, from a computational perspective, the problem has just become a lot harder: if you have 10,000 SKUs, the number of associations between two SKUs is roughly 100 million (and 10,000 SKUs is nowhere near a large number). That's precisely where the cloud kicks in: even if your algorithms are well-designed not to suffer a strict quadratic complexity, you're still going to need a lot of processing power. The cloud just happens to make this processing power available on demand at a very low price. Without the cloud, it is simply not possible to deliver this kind of technology.
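The order of magnitude quoted above is easy to check. The helper names below are made up for this sketch; counting ordered pairs gives the "roughly 100 million" figure, while counting each pair only once gives about half of that.

```python
def ordered_pairs(n):
    """Number of (a, b) associations with a != b, where direction matters."""
    return n * (n - 1)

def unordered_pairs(n):
    """Number of distinct pairs {a, b}, each counted once."""
    return n * (n - 1) // 2

for sku_count in (1_000, 10_000, 100_000):
    print(sku_count, ordered_pairs(sku_count), unordered_pairs(sku_count))
```

Either way, the count grows quadratically: ten times more SKUs means about a hundred times more pairs to consider.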
Joannes Vermorel

Forecast's species: classification vs. regression


Tuesday, April 6, 2010 at 12:21PM

The word forecasting covers a very large spectrum of processes, technologies and even markets. In the past, we introduced the worlds of forecasting software, distinguishing between:
- Deterministic simulation software
- Expert aggregation software
- Statistical forecasting software

Lokad falls in the last category, as our technology is purely statistical. Yet, Lokad is far from covering the entire statistical spectrum on its own. Two broad categories of forecasts exist in statistical forecasting (*):
- Classification forecasts
- Regression forecasts

(*) We are oversimplifying here for the sake of clarity, as statistical learning subtleties are well beyond the scope of this modest blog post.

Classification attempts to separate (or classify) objects according to their properties. The illustration below from Tomasz Malisiewicz illustrates a classification task trying to separate images picturing a chair from images picturing a table.

Illustration from tombone's blog

The output of a classification is binary (or rather discrete): objects get assigned to classes with more or less confidence, i.e. higher or lower probabilities. Regressions, on the other hand, typically output curves. The illustration below considers a time-series representing historical sales, and displays the corresponding forecast.

The regression forecast is a curve rather than a binary setting (or a combination of binary settings). Inputs get prolonged into the future. How does this distinction impact the business? Well, it turns out that Lokad, as it stands in early 2010, only delivers regression forecasts. Thus, there are many interesting problems that cannot be tackled by Lokad because these are classification problems:
- Customer segmentation: for each customer, we would like to evaluate the probability of achieving a successful up-sell through a direct marketing action. Following the same idea, we could try to predict churn as well.
- Fraud detection: for each transaction, we would like to evaluate, based on the transaction pattern, the probability for the operation to be a fraud attempt.
- Deal prioritization: based on the properties of the prospect (availability of budget, industry, contact rank in the company, expressed level of interest, ...), we would like to evaluate the likelihood of getting a profitable deal out of each prospect in order to prioritize the sales team's efforts.

Frequently, we are asked whether Lokad could deliver classification forecasts as well. Unfortunately, the answer will be negative for the time being. Although rooted in the same mathematical theory, classification and regression entail very different technologies, and Lokad is pushing all its efforts toward regression problems. That said, we are not dismissive of classification problems; they truly deserve attention and effort. For 2010, we are sticking to our roadmap, but further ahead, classification could be a natural extension of our forecasting services.
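To make the classification/regression distinction discussed in this post concrete, here is a minimal sketch of both kinds of output. The nearest-centroid classifier, the least-squares line, and the feature encoding (number of legs, presence of a backrest) are illustrative assumptions chosen for brevity, not the techniques used by Lokad.

```python
def classify(features, centroids):
    """Nearest-centroid classifier: the output is a discrete label."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(features, centroids[label]))

def regress(history, horizon):
    """Least-squares line fitted to the history and prolonged into the future:
    the output is a curve (a list of numbers). Needs at least 2 points."""
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + h) for h in range(horizon)]

# Hypothetical features: (number of legs, has a backrest).
centroids = {"chair": (4, 1), "table": (4, 0)}
label = classify((4, 1), centroids)   # a class name

curve = regress([1, 2, 3, 4], horizon=2)   # numbers extending the series
```

The first function answers "which class?", the second answers "how much, and when?".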
Joannes Vermorel

Measuring forecast accuracy


Tuesday, February 23, 2010 at 09:32AM

Most engineers will tell you that:

"You can't optimize what you don't measure"

Turns out that forecasting is no exception. Measuring forecast accuracy is one of the few cornerstones of any forecasting technology. A frequent misconception about accuracy measurement is that Lokad has to wait for the forecasts to become past, to finally compare the forecasts with what really happened. Although this approach works to some extent, it comes with severe drawbacks:
- It's painfully slow: a 6-months-ahead forecast takes 6 months to be validated.
- It's very sensitive to overfitting. Overfitting should not be taken lightly, and it's one of the few things that is very likely to wreak havoc in your accuracy measurements.

Measuring the accuracy of delivered forecasts is a tough piece of work for us. Accuracy measurement accounts for roughly half of the complexity of our forecasting technology: the more advanced the forecasting technology, the greater the need for robust accuracy measurements. In particular, Lokad returns the forecast accuracy associated with every single forecast that we deliver (for example, our Excel add-in reports forecast accuracy). The metric used for accuracy measurement is the MAPE (Mean Absolute Percentage Error).

In order to compute an estimated accuracy, Lokad proceeds (roughly) through cross-validation tuned for time-series forecasts. Cross-validation is simpler than it sounds. If we consider a weekly forecast 10 weeks ahead with 3 years (roughly 150 weeks) of history, then the cross-validation looks like:
1. Take the 1st week, forecast 10 weeks ahead, and compare results to the original.
2. Take the first 2 weeks, forecast 10 weeks ahead, and compare.
3. Take the first 3 weeks, forecast 10 weeks ahead, and compare.
4. ...

The process is rather tedious, as we end up recomputing forecasts about 150 times for only 3 years of history. Obviously, cross-validation screams for automation, and there is little hope of going through such a process without computer support. Yet, computers typically cost less than business forecast errors, and Lokad relies on cloud computing to deliver such intensive computations. Attempts to "simplify" the process outlined above are very likely to end up with overfitting problems. We suggest staying very careful, as overfitting isn't a problem to be taken lightly. When in doubt, stick to a complete cross-validation.
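The cross-validation steps listed above can be sketched in a few lines. This is a simplified illustration, not Lokad's implementation: the naive "repeat the last value" forecaster is a placeholder for any model, zero actuals are skipped in the MAPE (an assumption, since the percentage error is undefined there), and only cut points that leave a full horizon to compare against are used, so 150 weeks with a 10-week horizon yields 140 forecasts rather than 150.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, in percent. Zero actuals are skipped."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actuals are zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)

def naive_forecast(history, horizon):
    """Placeholder model: repeat the last observed value."""
    return [history[-1]] * horizon

def rolling_origin_mape(series, horizon, forecaster=naive_forecast):
    """Forecast from the first 1, 2, 3, ... points and average the errors."""
    errors = []
    for cut in range(1, len(series) - horizon + 1):
        train = series[:cut]                 # what the model is allowed to see
        test = series[cut:cut + horizon]     # what actually happened next
        errors.append(mape(test, forecaster(train, horizon)))
    return sum(errors) / len(errors)

# Hypothetical weekly sales, evaluated 2 weeks ahead.
weekly_sales = [12, 15, 14, 18, 20, 19, 23, 25]
print(f"estimated MAPE: {rolling_origin_mape(weekly_sales, horizon=2):.1f}%")
```

Each forecast only sees data that precedes the values it is compared against, which is what keeps the estimate honest; passing a different `forecaster` function lets several models be compared on the same footing.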
Joannes Vermorel

Internet is needed for your forecasts


Saturday, November 14, 2009 at 07:28PM

"Do I really need an Internet connection to get your forecasts?" is a question frequently asked by prospects having a look at our forecasting technology. Well, the answer is YES. With Lokad, there is no work-around. Our forecasting engine does not come as an on-premises solution.

But why should we need an internet connection for algorithmic processing such as forecasting? The answer to this question is one of the core reasons that have led to the very existence of Lokad in the first place. When we started working on the Lokad project, back in 2006, we quickly realized that forecasting, despite appearances, was a total misfit for local processing.

1. You can't get your forecasts right without having the data at hand. Researchers have been looking for decades for a universal forecasting model, but the consensus among the community is that there is no free lunch; universal models do not exist, or rather, they tend to perform poorly. This is the primary reason why forecasting toolkits feature so many models (don't click this link, it's a 3,000-page manual for a popular toolkit). With Lokad, the process is much simpler because the data is made available to Lokad. Hence, it does not matter any more if thousands of parameters are needed, as parameters are handled by Lokad directly.

2. Advanced forecasting is quite resource intensive, but the need to forecast is only intermittent. Even a small retailer with 10 points of sale and 10k product references already represents 100k time-series to be forecasted. If we consider a typical performance of 10k series per hour for a single CPU (which is already quite optimistic for complex models), then computing sales forecasts for the 10 points of sale takes a total of 10h of CPU time. Obviously, retailers prefer not to wait for 10h to get their forecasts. Buying an amazingly powerful workstation is possible, but then does it make sense to have so much processing power staying idle 99% of the time when forecasts are made only once a week? Outsourcing the processing power is the obvious cost-effective approach here.

3. Forecasting is still under fast-paced evolution. Since our launch about 3 years ago, Lokad has been upgraded every month or so. Our forecasting technology is not some indisputable achievement carved in stone; on the contrary, it is still undergoing a rapid evolution. Every month, the statistical learning research community moves forward with loads of fresh ideas. In such a context, on-premise solutions undergo a rapid decay until the day the discrepancy between the performance of the current version and the performance of the deployed version is so great that the company has no choice but to rush an upgrade. Aggressively developed SaaS ensures that customers benefit from the latest improvements without even having to worry about it.

In our opinion, going for an on-premise solution for your forecasts is like entering a golf competition with a large handicap. It might make the game more interesting, but it does not maximize your chances. Don't expect your competitors to be fair enough to start with the same handicap just because you do.
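The back-of-the-envelope figures in point 2 above can be checked in a few lines. The inputs come from the text; the 1-hour turnaround target used to size the hardware is an assumption added for this sketch.

```python
import math

HOURS_PER_WEEK = 7 * 24  # 168

points_of_sale = 10
product_references = 10_000
series_count = points_of_sale * product_references   # 100,000 time-series

series_per_cpu_hour = 10_000                          # throughput quoted in the text
cpu_hours = series_count / series_per_cpu_hour        # 10.0 hours on a single CPU

def cpus_needed(total_cpu_hours, wall_clock_hours):
    """CPUs required to finish within the wall-clock budget (perfect scaling assumed)."""
    return math.ceil(total_cpu_hours / wall_clock_hours)

def idle_fraction(busy_hours_per_week):
    """Share of the week the hardware sits unused."""
    return 1 - busy_hours_per_week / HOURS_PER_WEEK

# Assumption: the retailer wants the weekly forecasts within 1 hour.
print(cpus_needed(cpu_hours, wall_clock_hours=1))
print(f"{idle_fraction(busy_hours_per_week=1):.1%}")
```

Hardware sized to deliver the weekly run in an hour is busy for one hour out of 168, which is consistent with the "idle 99% of the time" remark.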
Joannes Vermorel
