What makes a probabilistic forecast?

What’s all the hoopla around the term “probabilistic forecasting?” Is it just a more recent marketing term some software vendors and consultants have coined to feign innovation? Is there any real tangible difference compared to predecessor “best fit” techniques?  Aren’t all forecasts probabilistic anyway?

To answer this question, it is helpful to think about what the forecast really is telling you in terms of probabilities.  A “good” forecast should be unbiased and therefore yield a 50/50 probability being higher or lower than the actual.  A “bad” forecast will build in subjective buffers (or artificially depress the forecast) and result in demand that is either biased high or low.  Consider a salesperson that intentionally reduces their forecast by not reporting sales they expect to close to be “conservative.” Their forecasts will have negative forecast bias as actuals will nearly always be higher than what they predicted.   On the other hand, consider a customer that provides an inflated forecast to their manufacturer.  Worried about stockouts, they overestimate demand to ensure their supply.  Their forecast will have a positive bias as actuals will nearly always be lower than what they predicted. 

These types of one-number forecasts described above are problematic.  We refer to these predictions as “point forecasts” since they represent one point (or a series of points over time) on a plot of what might happen in the future.   They don’t provide a complete picture because to make effective business decisions such as determining how much inventory to stock or the number of employees to be available to support demand requires detailed information on how much lower or higher the actual will be!  In other words, you need the probabilities for each possible outcome that might occur.  So, by itself, the point forecast isn’t probabilistic one.   

To get a probabilistic forecast, you need to know the distribution of possible demands around that forecast.  Once you compute this, the forecast becomes “probabilistic.”  How forecasting systems and practitioners such as demand planners, inventory analysts, material managers, and CFOs determine these probabilities is the heart of the question: “what makes a forecast probabilistic?”     

Normal Distributions
Most forecasts and the systems/software that produce them start with a prediction of demand.  Then they figure out the range of possible demands around that forecast by making incorrect theoretical assumptions about the distribution.  If you’ve ever used a “confidence interval” in your forecasting software, this is based on a probability distribution around the forecast.  The way this range of demand is determined is to assume a particular type of distribution.  Most often this means assuming a bell shaped, otherwise known as a normal distribution.  When demand is intermittent, some inventory optimization and demand forecasting systems may assume the demand is Poisson shaped. 

After creating the forecast, the assumed distribution is slapped around the demand forecast and you then have your estimate of probabilities for every possible demand – i.e., a “probabilistic forecast.”  These estimates of demand and associated probabilities can then be used to determine extreme values or anything in between if desired.  The extreme values at the upper percentiles of the distribution (i.e., 92%, 95%, 99%, etc.) are most often used as inputs to inventory control models.  For example, reorder points for critical spare parts in an electrical utility might be planned based on a 99.5% service level or even higher.  While a non-critical service part might be planned at an 85% or 90% service level.

The problem with making assumptions about the distribution is that you’ll get these probabilities wrong.  For example, if the demand isn’t normally distributed but you are forcing a bell shaped/normal curve on the forecast then how can then the probabilities will be incorrect.  Specifically, you might want to know the level of inventory needed to achieve a 99% probability of not running out of stock and the normal distribution will tell you to stock 200 units.  But when compared to the actual demand, you come to find out that 200 units only filled demand entirely in 40/50 observations.  So, instead of getting a 99% service level you only achieved an 80% service level!  This is a gigantic miss resulting from trying to fit a square peg into a round hole.  The miss would have led you to take an incorrect inventory reduction.

Empirically Estimated Distributions are Smart
To produce a smart (read accurate) probabilistic forecast you need to first estimate the distribution of demand empirically without any naïve assumptions about the shape of the distribution.  Smart Software does this by running tens of thousands of simulated demand and lead time scenarios.  Our solution leverages patented techniques that incorporate Monte Carlo simulation, Statistical Bootstrapping, and other methods.  The scenarios are designed to simulate real life uncertainty and randomness of both demand and lead times.  Actual historical observations are utilized as the primary inputs, but the solution will give you the option of simulating from non-observed values as well.  For example, just because 100 units was the peak historical demand, that doesn’t mean you are guaranteed to peak out at 100 in the future.  After the scenarios are done you will know the exact probability for each outcome. The “point” forecast then becomes the center of that distribution.  Each future period over time is expressed in terms of the probability distribution associated with that period.

Leaders in Probabilistic Forecasting
Smart Software, Inc. was the first company to ever introduce statistical bootstrapping as part of a commercially available demand forecasting software system twenty years ago.  We were awarded a US patent at the time for it and named a finalist in the APICS Corporate Awards of Excellence for Technological Innovation.  Our NSF Sponsored research that led to this and other discoveries were instrumental in advancing forecasting and inventory optimization.    We are committed to ongoing innovation, and you can find further information about our most recent patent here.

 

 

Correlation vs Causation: Is This Relevant to Your Job?

Outside of work, you may have heard the famous dictum “Correlation is not causation.” It may sound like a piece of theoretical fluff that, though involved in a recent Noble Prize in economics, isn’t relevant to your work as a demand planner. Is so, you may be only partially correct.

Extrapolative vs Causal Models

Most demand forecasting uses extrapolative models. Also called time-series models, these forecast demand using only the past values of an item’s demand. Plots of past values reveal trend and seasonality and volatility, so there is a lot they are good for. But there is another type of model – causal models —that can potentially improve forecast accuracy beyond what you can get from extrapolative models.

Causal models bring more input data to the forecasting task: information on presumed forecast “drivers” external to the demand history of an item. Examples of potentially useful causal factors include macroeconomic variables like the inflation rate, the rate of GDP growth, and raw material prices. Examples not tied to the national economy include industry-specific growth rates and your own and competitors’ ad spending.  These variables are usually used as inputs to regression models, which are equations with demand as an output and causal variables as inputs.

Forecasting using Causal Models

Many firms have an S&OP process that involves a monthly review of statistical (extrapolative) forecasts in which management adjusts forecasts based on their judgement. Often this is an indirect and subjective way to work causal models into the process without doing the regression modeling.

To actually make a causal regression model, first you have to nominate a list of potentially-useful causal predictor variables. These may come from your subject matter expertise. For example, suppose you manufacture window glass. Much of your glass may end up in new homes and new office buildings. So, the number of new homes and offices being built are plausible predictor variables in a regression equation.

There is a complication here: if you are using the equation to predict something, you must first predict the predictors. For example, sales of glass next quarter may be strongly related to numbers of new homes and new office buildings next quarter. But how many new homes will there be next quarter? That’s its own forecasting problem. So, you have a potentially powerful forecasting model, but you have extra work to do to make it usable.

There is one way to simplify things: if the predictor variables are “lagged” versions of themselves. For example, the number of new building permits issued six months ago may be a good predictor of glass sales next month. You don’t have to predict the building permit data – you just have to look it up.

Is it a causal relationship or just a spurious correlation?

Causal models are the real deal: there is an actual mechanism that relates the predictor variable to the predicted variable. The example of predicting glass sales from building permits is an example.

A correlation relationship is more iffy. There is a statistical association that may or may not provide a solid basis for forecasting. For example, suppose you sell a product that happens to appeal most strongly to Dutch people but you don’t realize this. The Dutch are, on average, the tallest people in Europe. If your sales are increasing and the average height of Europeans is increasing, you might use that relationship to good effect. However, if the proportion of Dutch in the Euro zone is decreasing while the average height is increasing because the mix of men versus women is shifting toward men, what can go wrong? You will expect sales to increase because average height is increasing. But your sales are really mostly to the Dutch, and their relative share of the population is shrinking, so your sales are really going to decrease instead. In this case the association between sales and customer height is a spurious correlation.

How can you tell the difference between true and spurious relationships? The gold standard is to do a rigorous scientific experiment. But you are not likely to be in position to do that. Instead, you have to rely on your personal “mental model” of how your market works. If your hunches are right, then your potential causal models will correlate with demand and causal modeling will pay off for you, either to supplement extrapolative models or to replace them.

 

 

 

 

Three Ways to Estimate Forecast Accuracy

Forecast accuracy is a key metric by which to judge the quality of your demand planning process. (It’s not the only one. Others include timeliness and cost; See 5 Demand Planning Tips for Calculating Forecast Uncertainty.) Once you have forecasts, there are a number of ways to summarize their accuracy, usually designated by obscure three- or four-letter acronyms like MAPE, RMSE, and MAE.  See Four Useful Ways to Measure Forecast Error for more detail.

A less discussed but more fundamental issue is how computational experiments are organized for computing forecast error. This post compares the three most important experimental designs. One of them is old-school and essentially amounts to cheating. Another is the gold standard. A third is a useful expedient that mimics the gold standard and is best thought of as predicting how the gold standard will turn out. Figure 1 is a schematic view of the three methods.

 

Three Ways to Estimate Forecast Accuracy Software Smart

Figure 1: Three ways to assess forecast error

 

The top panel of Figure 1 depicts the way forecast error was assessed back in the early 1980’s before we moved the state of the art to the scheme shown in the middle panel. In the old days, forecasts were assessed on the same data used to compute the forecasts. After a model was fit to the data, the errors computed were not for model forecasts but for model fits. The difference is that forecasts are for future values, while fits are for concurrent values. For example, suppose the forecasting model is a simple moving average of the three most recent observations. At time 3, the model computes the average of observations 1, 2, and 3. This average would then be compared to the observed value at time 3. We call this cheating because the observed value at time 3 got a vote on what the forecast should be at time 3. A true forecast assessment would compare the average of the first three observations to the value of the next, fourth, observation. Otherwise, the forecaster is left with an overly optimistic assessment of forecast accuracy.

The bottom panel of Figure 1 shows the best way to assess forecast accuracy. In this schema, all the historical demand data are used to fit a model, which is then used to forecast future, unknown demand values. Eventually, the future unfolds, the true future values reveal themselves, and actual forecast errors can be computed. This is the gold standard. This information populates the “forecasts versus actuals” report in our software.

The middle panel depicts a useful halfway measure. The problem with the gold standard is that you must wait to learn how well your chosen forecasting methods perform. This delay does not help when you are required to choose, in the moment, which forecasting method to use for each item. Nor does it provide a timely estimate of the forecast uncertainty you will experience, which is important for risk management such as forecast hedging. The middle way is based on hold-out analysis, which excludes (“holds out”) the most recent observations and asks the forecasting method to do its work without knowing those ground truths. Then the forecasts based on the foreshortened demand history can be compared to the held-out actual values to get an honest assessment of forecast error.

 

 

Elephants and Kangaroos ERP vs. Best of Breed Demand Planning

“Despite what you’ve seen in your Saturday morning cartoons, elephants can’t jump, and there’s one simple reason: They don’t have to. Most jumpy animals—your kangaroos, monkeys, and frogs—do it primarily to get away from predators.”  — Patrick Monahan, Science.org, Jan 27, 2016.

Now you know why the largest ERP companies can’t develop high quality best-of-breed like solutions. They never had to, so they never evolved to innovate outside of their core focus. 

However, as ERP systems have become commoditized, gaps in their functionality became impossible to ignore. The larger players sought to protect their share of customer wallet by promising to develop innovative add-on applications to fill all the white spaces.  But without that “innovation muscle,” many projects failed, and mountains of technical debt accumulated.

Best-of-breed companies evolved to innovate and have deep functional expertise in specific verticals.  The result is that best of breed ERP add-ons are easier to use, have more features, and deliver more value than the native ERP modules they replace. 

If your ERP provider has already partnered with an innovative best of breed add-on provider*, you’re all set! But if you can only get the basics from your ERP, go with a best-of-breed add-on that has a bespoke integration to the ERP. 

A great place to start your search is to look for ERP demand planning add-ons that add brains to the ERP’s brawn, i.e., those that support inventory optimization and demand forecasting.  Leverage add-on tools like Smart’s statistical forecasting, demand planning, and inventory optimization apps to develop forecasts and stocking policies that are fed back to the ERP system to drive daily ordering. 

*App-stores are a license for the best of breed to sell into the ERP companies base –  being listed  partnerships.

 

 

 

 

Is your demand planning and forecasting process a black box?

There’s one thing I’m reminded of almost every day at Smart Software that puzzle me: most companies do not understand how forecasts are created, and stocking policies are determined.  It’s an organizational black box. Here is an example from a recent sales call:

How do you forecast?
We use history.

How do you use history?
What do you mean?

Well, you can take an average of the last year, last two years, average the most recent periods, or use some other type of formula to generate the forecast.
I’m pretty sure we use an average of the last 12 months.

Why 12 months instead of a different amount of history?
12 months is a good amount of time to use because it doesn’t get skewed by older data but it’s recent enough

How do you know it’s more accurate than using 18 months or some other length of history?
We don’t know. We do adjust the forecasts based on feedback from sales.  

Do you know if the adjustments make things more accurate or less than if you just used the average?
We don’t know but are confident that forecasts are inflated

What do the inventory buyers do then if they think the numbers are inflated?
They have lots of business knowledge and adjust their buys accordingly

So, is it fair to say they would ignore the forecasts at least some of the time?
Yes, some of the time.

How do the buyers decide when to order more? Do you have a reorder point or safety stock specified in your ERP system that helps guide these decisions?
Yes, we use a safety stock field.

How is safety stock calculated?
Buyers determine this based on the importance of the item, lead times, and other considerations such as how many customers purchase the item, the velocity of the item, it’s cost.  They’ll carry different amounts of safety stock depending on this.

The discussion continued. The main takeaway here is that when you scratch just below the surface, far more questions are revealed than answers.  This often means that the inventory planning and demand forecast process is highly subjective, varies from planner to planner, is not well understood by the rest of the organization, and likely to be reactive.  As Tom Willemain has described it’s “chaos masked by improvisation.”   The “as-is” process needs to be fully identified and documented.  Only then can gaps be exposed, and improvements can be made.   Here is a list of 10 questions  you can ask that will reveal your organization’s true forecasting, demand planning, and inventory planning process.