What to do when a statistical forecast doesn’t make sense

Sometimes a statistical forecast just doesn’t make sense.  Every forecaster has been there.  They may double-check that the data was input correctly or review the model settings but are still left scratching their head over why the forecast looks very unlike the demand history.   When the occasional forecast doesn’t make sense, it can erode confidence in the entire statistical forecasting process.

This blog will help a layman understand what the Smart statistical models are and how they are chosen automatically.  It will address how that choice sometimes fails, how you can know if it did, and what you can do to ensure that the forecasts can always be justified.  It’s important to know what to expect and how to catch the exceptions so you can rely on your forecasting system.


How methods are chosen automatically

The criterion for automatically choosing one statistical method out of a set is which method came closest to correctly predicting held-out history.  Earlier history is passed to each method and the result is compared to actuals to find the one that came closest overall.  That automatically chosen method is then fed all the history to produce the forecast.  To learn more about model selection, see this blog: https://smartcorp.com/uncategorized/statistical-forecasting-how-automatic-method-selection-works/

For most time series, this process can capture trends, seasonality, and average volume accurately.  But sometimes a chosen method comes mathematically closest to predicting the held-out history but doesn’t project it forward in a way that makes sense.  That means the system-selected method isn’t always the best choice, particularly for some “hard to forecast” items.


Hard to forecast items

Hard to forecast items may have large, unpredictable spikes in demand, or typically no demand but random irregular blips, or unusual recent activity.  Noise in the data sometimes randomly wanders up or down, and the automated best-pick method might forecast a runaway trend or a grind into zero.  In a small percentage of any reasonably varied group of items, it will do worse than common sense.  So, you will need to identify these cases and respond by overriding the forecast or changing the forecast inputs.


How to find the exceptions

Best practice is to filter or sort the forecasted items to identify those where the sum of the forecast over the next year is significantly different than the corresponding history last year.  The forecast sum may be much lower than the history or vice versa.  Use supplied metrics to identify these items; then you can choose to apply overrides to the forecast or modify the forecast settings.
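The filter described above can be sketched in a few lines of Python.  This is only an illustration, not how Smart computes its metrics: the function name, the 12-period year (which assumes monthly buckets), and the 0.5 and 2.0 thresholds are all assumptions chosen for the example.

```python
def flag_exceptions(items, periods=12, low=0.5, high=2.0):
    """Return (item_id, ratio) pairs whose forecast total looks out of line.

    items maps an item id to (history, forecast), two lists of per-period
    quantities, oldest first. The thresholds are illustrative assumptions.
    """
    flagged = []
    for item_id, (history, forecast) in items.items():
        hist_total = sum(history[-periods:])   # demand over the last year
        fcst_total = sum(forecast[:periods])   # forecast over the next year
        if hist_total == 0:
            # No demand last year: any positive forecast deserves a look.
            ratio = float("inf") if fcst_total > 0 else 1.0
        else:
            ratio = fcst_total / hist_total
        if ratio < low or ratio > high:
            flagged.append((item_id, ratio))
    # Lowest ratios first, so collapsing forecasts appear at the top.
    return sorted(flagged, key=lambda pair: pair[1])


# Made-up items for illustration only.
items = {
    "A": ([10] * 12, [11] * 12),  # forecast close to history
    "B": ([10] * 12, [1] * 12),   # forecast far below history
    "C": ([5] * 12, [20] * 12),   # forecast far above history
}
flagged = flag_exceptions(items)
print(flagged)
```

Items B and C would land on the review list, while item A passes through untouched.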


How to fix the exceptions

Often when the forecast seems odd, an averaging method, like Single Exponential Smoothing or even a simple average using Freestyle, will produce a more reasonable forecast.  If a trend is plausible, you can remove only the seasonal methods to avoid a falsely seasonal result.  Or do the opposite and use only seasonal methods if seasonality is expected but wasn’t projected in the default forecast.  You can use the what-if features to create any number of forecasts, evaluate and compare them, and continue to fine-tune the settings until you are comfortable with the forecast.

Cleaning the history, with or without changing the automatic method selection, is also effective at producing reasonable forecasts.  You can embed forecast parameters to reduce the amount of history used to forecast those items, or the number of periods passed into the algorithm, so earlier, outdated history is no longer considered.  You can edit spikes or drops in the demand history that are known anomalies so they don’t influence the outcome.  You can also work with the Smart team to implement automatic outlier detection and removal so that the data is already cleansed of these anomalies before it is forecasted.
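To make the two cleaning ideas concrete, here is a minimal Python sketch that drops older periods and caps spikes.  The function name, the `keep_last` and `spike_factor` parameters, and the rule of capping at a multiple of the median are assumptions made for illustration; they are not Smart’s outlier detection logic, and the sketch only handles spikes, not drops.

```python
from statistics import median


def clean_history(history, keep_last=None, spike_factor=3.0):
    """Trim outdated periods and cap spikes in a demand history.

    keep_last: number of most recent periods to keep (None keeps everything).
    spike_factor: values above spike_factor * median are capped at that level.
    Both rules are illustrative assumptions, not a vendor algorithm.
    """
    series = list(history[-keep_last:]) if keep_last else list(history)
    typical = median(series)
    if typical <= 0:
        # Mostly-zero history: a median-based cap would not be meaningful.
        return series
    cap = spike_factor * typical
    return [min(value, cap) for value in series]


# Made-up history: an outdated high-volume era, then a one-off spike of 95.
history = [40, 42, 38, 12, 10, 11, 95, 9, 12, 10]
cleaned = clean_history(history, keep_last=6)
print(cleaned)  # [10, 11, 31.5, 9, 12, 10]
```

In practice you would only cap values you know to be anomalies, since a real surge in demand carries information the forecast should see.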

If the demand is truly intermittent, it is going to be nearly impossible to forecast “accurately” per period.  If a level-loading average is not acceptable, handling the item by setting inventory policy with a lead time forecast can be effective.  Alternatively, you may choose to use “same as last year” models which, while not especially accurate, will generally be accepted by the business given the alternative forecasts.

Finally, if the item was introduced so recently that the algorithms do not have enough input to accurately forecast, a simple average or manual forecast may be best.  You can identify new items by filtering on the number of historical periods.


Manual selection of methods

Once you have identified rows where the forecast doesn’t make sense to the human eye, you can choose a smaller subset of all methods to allow into the forecast run and compare to history.  Smart will allow you to use a restricted set of methods just for one forecast run or embed the restricted set to use for all forecast runs going forward. Different methods will project the history into the future in different ways.  Having a sense of how each works will help you choose which to allow.


Rely on your forecasting tool

The more you use Smart period over period to embed your decisions about how to forecast and what historical data to consider, the less often you will face exceptions as described in this blog.  Entering forecast parameters is a manageable task when starting with critical or high impact items.  Even if you don’t embed any manual decisions on forecast methods, the forecast re-runs every period with new data. So, an item with an odd result today can become easily forecastable in time.



Statistical Forecasting: How Automatic method selection works in Smart IP&O

Smart IP&O offers automated statistical forecasting that selects the method that best forecasts the data.  It does this for each time series in the data set.  This blog will help a layman understand how the forecast methods are chosen automatically.

Smart makes many methods available including single and double exponential smoothing, linear and simple moving average, and Winters models.  Each model is designed to capture a different sort of pattern.  The criterion for automatically choosing one statistical method out of a set of choices is which method came closest to correctly predicting held-out history.

Earlier demand history is passed to each method and the result is compared to actuals to find the one that came closest overall.  That “winning” automatically chosen method is then fed all the history for that item to produce the forecast.

The overall nature of the demand pattern for the item is captured by holding out different portions of the history so that an occasional outlier does not unduly influence the choice of method.  You can visualize it using the diagram below, where each row represents a 3-period forecast of held-out history, based on different amounts of the earlier history (shown in red).  The variances from each pass are averaged together to determine the method’s overall ranking against all other methods.

[Diagram: Automatic Forecasting and Statistical Forecasting App]
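The holdout tournament described above can be sketched in Python.  This is a simplified illustration under stated assumptions, not Smart’s engine: the four candidate methods, the smoothing weight of 0.3, the use of four passes, and scoring by mean absolute error (rather than the variances mentioned above) are all choices made for the example.

```python
def naive_last(history, horizon):
    # Repeat the most recent value.
    return [history[-1]] * horizon


def simple_average(history, horizon):
    return [sum(history) / len(history)] * horizon


def single_exp_smoothing(history, horizon, alpha=0.3):
    # alpha is an assumed smoothing weight, not a tuned value.
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


def linear_trend(history, horizon):
    # Least-squares line through the history, extended forward.
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


METHODS = {
    "naive_last": naive_last,
    "simple_average": simple_average,
    "single_exp_smoothing": single_exp_smoothing,
    "linear_trend": linear_trend,
}


def holdout_error(method, history, horizon=3, passes=4):
    """Average error over several passes, each holding out a different slice."""
    if len(history) - horizon - (passes - 1) < 2:
        raise ValueError("not enough history for the requested passes")
    errors = []
    for p in range(passes):
        cutoff = len(history) - horizon - p
        train = history[:cutoff]
        actual = history[cutoff:cutoff + horizon]
        forecast = method(train, horizon)
        errors.append(sum(abs(f - a) for f, a in zip(forecast, actual)) / horizon)
    return sum(errors) / len(errors)


def pick_method(history, horizon=3, passes=4):
    scores = {name: holdout_error(fn, history, horizon, passes)
              for name, fn in METHODS.items()}
    best = min(scores, key=scores.get)
    return best, scores


# A made-up, steadily rising series: the trend method should win.
history = [10 + 2 * t for t in range(24)]
best, scores = pick_method(history)
print(best)  # linear_trend
```

Once a winner is picked, it would be re-run on the full history to produce the actual forecast, as described above.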

For most time series, this process can accurately capture trends, seasonality, and average volume.  But sometimes a chosen method comes mathematically closest to predicting the held-out history but doesn’t project it forward in a way that makes sense.

Users can correct this by using the system’s exception reports and filtering features to identify items that merit review.  They can then configure the automatic forecast methods that they wish to be considered for that item.



How much time should it take to compute statistical forecasts?
The top factors that impact the speed of your forecast engine 

How long should it take for a demand forecast to be computed using statistical methods?  This question is often asked by customers and prospects.  The answer truly depends.  Forecast results for a single item can be computed in the blink of an eye, in as little as a few hundredths of a second, but sometimes they may require as much as five seconds.  To understand the differences, it’s important to understand that there is more involved than grinding through the forecast arithmetic itself.   Here are six factors that influence the speed of your forecast engine.

1) Forecasting method.  Traditional time-series extrapolative techniques (such as exponential smoothing and moving average methods), when cleverly coded, are lightning fast.  For example, the Smart Forecast automatic forecasting engine that leverages these techniques and powers our demand planning and inventory optimization software can crank out statistical forecasts on 1,000 items in 1 second!  Extrapolative methods produce an expected forecast and a summary measure of forecast uncertainty.  However, more complex models in our platform that generate probabilistic demand scenarios take much longer given the same computing resources.  This is partly because they create a much larger volume of output, usually thousands of plausible future demand sequences.  More time, yes, but not time wasted, since these results are much more complete and form the basis for downstream optimization of inventory control parameters.

2) Computing resources.  The more resources you throw at the computation, the faster it will be.  However, resources cost money and it may not be economical to invest in these resources.  For example, to make certain types of machine learning-based forecasts work, the system will need to multi-thread computations across multiple servers to deliver results quickly.  So, make sure you understand the assumed compute resources and associated costs. Our computations happen on the Amazon Web Services cloud, so it is possible to pay for a great deal of parallel computation if desired.

3) Number of time-series.  Do you have to forecast only a few hundred items in a single location or many thousands of items across dozens of locations?  The greater the number of SKU x Location combinations, the greater the time required.  However, it is possible to trim the time to get demand forecasts by better demand classification.  For example, it may not be necessary to forecast every single SKU x Location combination.  Modern demand planning software can first subset the data based on volume/frequency classifications before running the forecast engine.  We’ve observed situations where over one million SKU x Location combinations existed, but only ten percent had demand in the preceding twelve months.
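As a rough illustration of that kind of subsetting, the Python sketch below keeps only the SKU x Location combinations with any demand in the most recent twelve periods.  The function name and the data layout are hypothetical, and twelve periods equal a year only if the buckets are monthly.

```python
def active_combinations(demand, recent_periods=12):
    """Return the (sku, location) keys with any demand in the recent window.

    demand maps (sku, location) to a list of per-period quantities,
    oldest first. The layout is an assumption made for this example.
    """
    return [
        key
        for key, series in demand.items()
        if any(quantity > 0 for quantity in series[-recent_periods:])
    ]


# Made-up data: 24 monthly buckets per combination.
demand = {
    ("SKU-1", "NYC"): [0] * 12 + [3, 0, 0, 5, 0, 0, 0, 2, 0, 0, 0, 1],
    ("SKU-1", "LAX"): [4, 2, 6] + [0] * 21,   # dormant for over a year
    ("SKU-2", "NYC"): [0] * 24,               # never sold
}
to_forecast = active_combinations(demand)
print(to_forecast)  # [('SKU-1', 'NYC')]
```

Only the combinations returned here would be passed to the forecast engine; the rest could be handled by a simpler rule or reviewed separately.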

4) Historical Bucketing.  Are you forecasting using daily, weekly, or monthly time buckets?  The more granular the bucketing, the more time it is going to take to compute statistical forecasts.  Many companies will wonder, “Why would anyone want to forecast on a daily basis?” However, state-of-the-art demand forecasting software can leverage daily data to detect simultaneous day-of-week and week-of-month patterns that would otherwise be obscured with traditional monthly demand buckets. And the speed of business continues to accelerate, threatening the competitive viability of the traditional monthly planning tempo.

5) Amount of History.  Are you limiting the model by only feeding it the most recent demand history, or are you feeding all available history to the demand forecasting software? The more history you feed the model, the more data must be analyzed and the longer it is going to take.

6) Additional analytical processing.  So far, we’ve imagined feeding items’ demand history in and getting forecasts out. But the process can also involve additional analytical steps that can improve results. Examples include:

a) Outlier detection and removal to minimize the distortion caused by one-off events like storm damage.

b) Machine learning that decides how much history should be used for each item by detecting regime change.

c) Causal modeling that identifies how changes in demand drivers (such as price, interest rate, customer sentiment, etc.) impact future demand.

d) Exception reporting that uses data analytics to identify unusual situations that merit further management review.


The Rest of the Story. It’s also critical to understand that the time to get an answer involves more than the speed of forecasting computations per se.  Data must be loaded into memory before computing can begin. Once the forecasts are computed, your browser must load the results so that they may be rendered on screen for you to interact with.  If you re-forecast a product, you may choose to save the results.  If you are working with product hierarchies (aggregating item forecasts up to product families, families up to product lines, etc.), the new forecast is going to impact the hierarchy, and everything must be reconciled.   All of this takes time.

Fast Enough for You? When you are evaluating software to see whether your need for speed will be satisfied, all of this can be tested as part of a proof of concept or trial offered by demand planning software solution providers.  Test it out, and make sure that the compute, load, and save times are acceptable given the volume of data and forecasting methods you want to use to support your process.




Do your statistical forecasts suffer from the wiggle effect?

 What is the wiggle effect? 

It’s when your statistical forecast incorrectly predicts the ups and downs observed in your demand history when there really isn’t a pattern.  It’s important to make sure your forecasts don’t wiggle unless there is a real pattern.

Here is a transcript from a recent customer conversation in which this issue was discussed:

Customer: “The forecast isn’t picking up on the patterns I see in the history.  Why not?” 

Smart:  “If you look closely, the ups and downs you see aren’t patterns.  It’s really noise.”  

Customer:  “But if we don’t predict the highs, we’ll stock out.”

Smart: “If the forecast were to ‘wiggle’ it would be much less accurate.  The system will forecast whatever pattern is evident, in this case a very slight uptrend.  We’ll buffer against the noise with safety stocks. The wiggles are used to set the safety stocks.”

Customer: “Ok. Makes sense now.” 

[Graphic: Do your statistical forecasts suffer from the wiggle effect]

The wiggle looks reassuring but, in this case, it is resulting in an incorrect demand forecast. The ups and downs aren’t really occurring at the same times each month.  A better statistical forecast is shown in light green.
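A small simulation shows why a wiggling forecast is less accurate when the ups and downs are only noise.  The sketch below is an assumption-laden illustration: demand is simulated as a constant level plus independent random noise, the “wiggle” forecast simply copies last year’s values, and the flat forecast uses last year’s average.  Under those assumptions, theory says the wiggle forecast’s squared error should be roughly twice the noise variance, while the flat forecast’s should be only slightly more than the noise variance.

```python
import random


def compare_wiggle_to_flat(trials=2000, periods=12, level=100.0,
                           noise_sd=10.0, seed=0):
    """Return (wiggle_mse, flat_mse) for simulated pure-noise demand."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    wiggle_sq = 0.0
    flat_sq = 0.0
    for _ in range(trials):
        last_year = [rng.gauss(level, noise_sd) for _ in range(periods)]
        next_year = [rng.gauss(level, noise_sd) for _ in range(periods)]
        flat = sum(last_year) / periods  # flat forecast: last year's average
        for copied, actual in zip(last_year, next_year):
            wiggle_sq += (copied - actual) ** 2  # forecast repeats the noise
            flat_sq += (flat - actual) ** 2
    n = trials * periods
    return wiggle_sq / n, flat_sq / n


wiggle_mse, flat_mse = compare_wiggle_to_flat()
print(f"wiggle MSE: {wiggle_mse:.1f}, flat MSE: {flat_mse:.1f}")
```

The noise does not go away with the flat forecast; it is simply handled where it belongs, in the safety stock rather than in the forecast.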



How to Handle Statistical Forecasts of Zero

A statistical forecast of zero can cause lots of confusion for forecasters, especially when the historical demand is non-zero.  Sure, it’s obvious that demand is trending downward, but should it trend to zero?  When the older demand is much greater than the more recent demand and the more recent demand is very low volume (e.g., 1, 2, or 3 units demanded), the answer is, statistically speaking, yes.  However, this might not jibe with the planner’s business knowledge and expected minimum level of demand.  So, what should a forecaster do to correct this?  Here are three suggestions:


  1. Limit the historical data fed to the model. In a down trending situation, the older data is often much greater than the recent data.  When the older, much higher volume demand is ignored, the down trend won’t be nearly as significant.  You’ll still forecast a down trend, but results are more likely to be in line with business expectations.
  2. Try trend dampening. Smart Demand Planner has a feature called “trend hedging” that enables users to define how a trend should phase out over time. The higher the percentage trend hedge (0-100%), the more pronounced the trend dampening.  As a result, a forecasted trend will not continue through the whole forecast horizon, and the demand forecast will start to flatten before it hits zero on a downtrend.
  3. Change the forecast model. Switch from a trending method like Double Exponential Smoothing or Linear Moving Average to a non-trending method such as Single Exponential Smoothing or Simple Moving Average. You won’t forecast a downtrend, but at least your forecast won’t be zero, so it is more likely to be accepted by the business.
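To see how dampening keeps a downtrend from reaching zero, here is a minimal Python sketch of a generic damped-trend projection.  It is not the “trend hedging” calculation in Smart Demand Planner: the `damping` factor, the starting level, and the slope are assumptions chosen for illustration, and how a hedge percentage would map to such a factor is not specified here.

```python
def project(level, slope, horizon, damping=1.0):
    """Project a trend forward, flooring the forecast at zero.

    damping=1.0 continues the full trend every period; damping < 1.0
    shrinks the per-period change so the forecast gradually flattens.
    """
    forecast = []
    cumulative = 0.0
    for h in range(1, horizon + 1):
        cumulative += damping ** h  # each step adds a smaller share of the slope
        forecast.append(max(0.0, level + slope * cumulative))
    return forecast


# Made-up low-volume item: currently about 3 units, falling 0.5 per period.
undamped = project(level=3.0, slope=-0.5, horizon=12)
damped = project(level=3.0, slope=-0.5, horizon=12, damping=0.7)

print([round(v, 2) for v in undamped])
print([round(v, 2) for v in damped])
```

With these numbers the undamped projection reaches zero by the sixth period and stays there, while the damped one levels off a little above 1.8 units.  Setting the slope to zero gives the flat forecast of a non-trending method, as in the third suggestion.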