A Practical Guide to Growing a Professional Forecasting Process

Many companies looking to improve their forecasting process don’t know where to start. It can be confusing to contend with learning new statistical methods, making sure data is properly structured and updated, agreeing on who “owns” the forecast, defining what ownership means, and measuring accuracy. Having seen this over forty-plus years of practice, we wrote this blog to outline the core focus and to encourage you to keep it simple early on.

1. Objectivity. First, understand and communicate that the Demand Planning and Forecasting process is an exercise in objectivity. The focus is on getting inputs from various sources (stakeholders, customers, functional managers, databases, suppliers, etc.) and deciding whether those inputs add value. For example, if you override a statistical forecast and add 20% to the projection, you should not just assume that you automatically got it right. Instead, be objective and check whether that override increased or decreased forecast accuracy. If you find that your overrides made things worse, you’ve gained something: This informs the process and you know to better scrutinize override decisions in the future.

2.  Teamwork. Recognize that forecasting and demand planning are team sports. Agree on who will captain the team. The captain is responsible for creating the baseline statistical forecasts and supervising the demand planning process. But results depend on everyone on the team making positive contributions, providing data, suggesting alternative methodologies, questioning assumptions, and executing recommended actions. The final results are owned by the company and every single stakeholder.

3. Measurement. Don’t fixate on industry forecast accuracy benchmarks. Every SKU has its own level of “forecastability”, and you may be managing any number of difficult items. Instead, create your own benchmarks based on a sequence of increasingly advanced forecasting methods. Advanced statistical forecasts may seem dauntingly complex at first, so start simple with a basic method, such as forecasting the historical average demand. Then measure how close that simple forecast comes to the actual observed demand. Work up from there to techniques that deal with complications like trend and seasonality. Measure progress using accuracy metrics calculated by your software, such as the mean absolute percentage error (MAPE). This will allow your company to get a little bit better each forecast cycle.

4. Tempo. Then focus efforts on making forecasting a standalone process that isn’t combined with the complex process of inventory optimization. Inventory management is built on a foundation of sound demand forecasting, but it is focused on other topics: what to purchase, when to purchase, minimum order quantities, safety stocks, inventory levels, supplier lead times, etc. Let inventory management go to later. First build up “forecasting muscle” by creating, reviewing, and evolving the forecasting process to have a regular cadence. When your process is sufficiently matured, catch up with the increasing speed of business by increasing the tempo of your forecasting process to at least a monthly cadence.

Remarks

Revising a company’s forecasting process can be a major step. Sometimes it happens when there is executive turnover, sometimes when there is a new ERP system, sometimes when there is new forecasting software. Whatever the precipitating event, this change is an opportunity to rethink and refine whatever process you had before. But trying to eat the whole elephant in one go is a mistake. In this blog, we’ve outlined some discrete steps you can take to make for a successful evolution to a better forecasting process.

 

 

 

 

Correlation vs Causation: Is This Relevant to Your Job?

Outside of work, you may have heard the famous dictum “Correlation is not causation.” It may sound like a piece of theoretical fluff that, though involved in a recent Noble Prize in economics, isn’t relevant to your work as a demand planner. Is so, you may be only partially correct.

Extrapolative vs Causal Models

Most demand forecasting uses extrapolative models. Also called time-series models, these forecast demand using only the past values of an item’s demand. Plots of past values reveal trend and seasonality and volatility, so there is a lot they are good for. But there is another type of model – causal models —that can potentially improve forecast accuracy beyond what you can get from extrapolative models.

Causal models bring more input data to the forecasting task: information on presumed forecast “drivers” external to the demand history of an item. Examples of potentially useful causal factors include macroeconomic variables like the inflation rate, the rate of GDP growth, and raw material prices. Examples not tied to the national economy include industry-specific growth rates and your own and competitors’ ad spending.  These variables are usually used as inputs to regression models, which are equations with demand as an output and causal variables as inputs.

Forecasting using Causal Models

Many firms have an S&OP process that involves a monthly review of statistical (extrapolative) forecasts in which management adjusts forecasts based on their judgement. Often this is an indirect and subjective way to work causal models into the process without doing the regression modeling.

To actually make a causal regression model, first you have to nominate a list of potentially-useful causal predictor variables. These may come from your subject matter expertise. For example, suppose you manufacture window glass. Much of your glass may end up in new homes and new office buildings. So, the number of new homes and offices being built are plausible predictor variables in a regression equation.

There is a complication here: if you are using the equation to predict something, you must first predict the predictors. For example, sales of glass next quarter may be strongly related to numbers of new homes and new office buildings next quarter. But how many new homes will there be next quarter? That’s its own forecasting problem. So, you have a potentially powerful forecasting model, but you have extra work to do to make it usable.

There is one way to simplify things: if the predictor variables are “lagged” versions of themselves. For example, the number of new building permits issued six months ago may be a good predictor of glass sales next month. You don’t have to predict the building permit data – you just have to look it up.

Is it a causal relationship or just a spurious correlation?

Causal models are the real deal: there is an actual mechanism that relates the predictor variable to the predicted variable. The example of predicting glass sales from building permits is an example.

A correlation relationship is more iffy. There is a statistical association that may or may not provide a solid basis for forecasting. For example, suppose you sell a product that happens to appeal most strongly to Dutch people but you don’t realize this. The Dutch are, on average, the tallest people in Europe. If your sales are increasing and the average height of Europeans is increasing, you might use that relationship to good effect. However, if the proportion of Dutch in the Euro zone is decreasing while the average height is increasing because the mix of men versus women is shifting toward men, what can go wrong? You will expect sales to increase because average height is increasing. But your sales are really mostly to the Dutch, and their relative share of the population is shrinking, so your sales are really going to decrease instead. In this case the association between sales and customer height is a spurious correlation.

How can you tell the difference between true and spurious relationships? The gold standard is to do a rigorous scientific experiment. But you are not likely to be in position to do that. Instead, you have to rely on your personal “mental model” of how your market works. If your hunches are right, then your potential causal models will correlate with demand and causal modeling will pay off for you, either to supplement extrapolative models or to replace them.

 

 

 

 

What data is needed to support Demand Planning Software Implementations

We recently met with the IT team at one of our customers to discuss data requirements and installation of our API based integration that would pull data from their on-premises installation of their ERP system.   The IT manager and analyst both expressed significant concern about providing this data and seriously questioned why it needed to be provided at all.  They even voiced concerns that their data might be resold to their competition. Their reaction was a big surprise to us.  We wrote this blog with them in mind and to make it easier for others to communicate why certain data is necessary to support an effective demand planning process. 

Please note that if you are a forecast analyst, demand planner, of supply chain professional then most of what you’ll read below will be obvious.  But what this meeting taught me is that what is obvious to one group of specialists isn’t going to be obvious to another group of specialists in an entirely different field. 

The Four main types of data that are needed are:  

  1. Historical transactions, such as sales orders and shipments.
  2. Job usage transactions, such as what components are needed to produce finished goods
  3. Inventory Transfer transactions, such as what inventory was shipped from one location to another.
  4. Pricing, costs, and attributes, such as the unit cost paid to the supplier, the unit price paid by the customer, and various meta data like product family, class, etc.  

Below is a brief explanation of why this data is needed to support a company’s implementation of demand planning software.

Transactional records of historical sales and shipments by customer
Think of what was drawn out of inventory as the “raw material” required by demand planning software.  This can be what was sold to whom and when or what you shipped to whom and when.  Or what raw materials or subassemblies were consumed in work orders and when.  Or what is supplied to a satellite warehouse from a distribution center and when.

The history of these transactions is analyzed by the software and used to produce statistical forecasts that extrapolate observed patterns.  The data is evaluated to uncover patterns such as trend, seasonality, cyclical patterns, and to identify potential outliers that require business attention.  If this data is not generally accessible or updated in irregular intervals, then it is nearly impossible to create a good prediction of the future demand.  Yes, you could use business knowledge or gut feel but that doesn’t scale and nearly always introduces bias into the forecast (i.e., consistently forecasting too high or too low). 

Data is needed at the transactional level to support finer grained forecasting at the weekly or even daily levels.  For example, as a business enters its busy season it may want to start forecasting weekly to better align production to demand.  You can’t easily do that without having the transactional data in a well-structured data warehouse. 

It might also be the case that certain types of transactions shouldn’t be included in demand data.  This can happen when demand results from a steep discount or some other circumstance that the supply chain team knows will skew the results.  If the data is provided in the aggregate, it is much harder to segregate these exceptions.  At Smart Software, we call the process of figuring out which transactions (and associated transactional attributes) should be counted in the demand signal as “demand signal composition.” Having access to all the transactions enables a company to modify their demand signal as needed over time within the software.  Only providing some of the data results in a far more rigid demand composition that can only be remedied with additional implementation work.

Pricing and Costs
The price you sold your products for and the cost you paid to procure them (or raw materials) is critical to being able to forecast in revenue or costs.  An important part of the demand planning process is getting business knowledge from customers and sales teams.  Sales teams tend to think of demand by customer or product category and speak in the language of dollars.  So, it is important to express a forecast in dollars.  The demand planning system cannot do that if the forecast is shown in units only. 

Often, the demand forecast is used to drive or at least influence a larger planning & budgeting process and the key input to a budget is a forecast of revenue.  When demand forecasts are used to support the S&OP process, the Demand Planning software should either average pricing across all transactions or apply “time-phased” conversions that consider the price sold at that time.   Without the raw data on pricing and costs, the demand planning process can still function, but it will be severely impaired. 

Product attributes, Customer Details, and Locations
Product attributes are needed so that forecasters can aggregate forecasts across different product families, groups, commodity codes, etc. It is helpful to know how many units and total projected dollarized demand for different categories.  Often, business knowledge about what the demand might be in the future is not known at the product level but is known at the product family level, customer level, or regional level.  With the addition of product attributes to your demand planning data feed, you can easily “roll up” forecasts from the item level to a family level.  You can convert forecasts at these levels to dollars and better collaborate on how the forecast should be modified.  

Once the knowledge is applied in the form of a forecast override, the software will automatically reconcile the change to all the individual items that comprise the group.  This way, a forecast analyst doesn’t have to individually adjust every part.  They can make a change at the aggregate level and let the demand planning software do the reconciliation for them. 

Grouping for ease of analysis also applies to customer attributes, such as assigned salesperson or a customer’s preferred ship from location.  And location attributes can be useful, such as assigned region.  Sometimes attributes relate to a product and location combination, like preferred supplier or assigned planner, which can differ for the same product depending on warehouse.

 

A final note on confidentiality

Recall that our customer expressed concern that we might sell their data to a competitor. We would never do that. For decades, we have been using customer data for training purposes and for improving our products. We are scrupulous about safeguarding customer data and anonymizing anything that might be used, for instance, to illustrate a point in a blog post.

 

 

 

Types of forecasting problems we help solve

Here are examples of forecasting problems that SmartForecasts can solve, along with the kinds of business data representative of each.

Forecasting an item based on its pattern

Given the following six quarterly sales figures, what sales can you expect for the third and fourth quarters of 2023?

Forecasting an item based on its pattern

Sales by Quarter

SmartForecasts gives you many ways to approach this problem. You can make your own statistical forecasts using any of six different exponential smoothing and moving average methods. Or, like most nontechnical forecasters, you can use the time-saving Automatic command, which has been programmed to automatically select and use the most accurate method for your data. Finally, to incorporate your business judgment into the forecasting process, you can graphically adjust any statistical forecast result using SmartForecasts’ “eyeball” adjustment capabilities.

 

Forecasting an item based on its relationship to other variables.

Given the following historical relationship between unit sales and the number of sales representatives, what sales levels can you expect when the planned increase in sales staff takes place over the final two quarters of 2023?

Forecasting an item based on its relationship to other variables.

Sales and Sales Representatives by Quarter

You can answer a question like this using SmartForecasts’ powerful Regression command, designed specifically to facilitate forecasting applications that require regression analysis solutions. Regression models with an essentially unlimited number of independent/predictor variables are possible, although most useful regression models use only a handful of predictors.

 

Simultaneously forecasting a number of product items and their total

Given the following total sales for all dress shirts and the distribution of sales by color, what will individual and total sales be over the next six months?

Forecasting an item based on its relationship to other variables.

Monthly Dress Shirt Sales by Color

SmartForecasts’ unique Group Forecasting features automatically and simultaneously forecasts closely related time series, such as these items in the same product group. This saves considerable time and provides forecast results not only for the individual items but also for their total. “Eyeball” adjustments at both the item and group levels are easy to make. You can quickly create forecasts for product groups with hundreds or even thousands of items.

 

Forecasting thousands of items automatically

Given the following record of product demand at the SKU level, what can you expect demand to be over the next six months for each of the 5,000 SKUs?

Forecasting thousands of items automatically

Monthly Product Demand by SKU (Stock Keeping Unit)

In just a few minutes, SmartForecasts’ powerful Automatic Selection can take a forecasting job of this size, read the product demand data, automatically create statistical forecasts for each SKU, and saves the result. The results are then ready for export to your ERP system leveraging any one of our API-based connectors or via file export.  Once set up, forecasts will automatically be produced each planning cycle without intervention by the user.

 

Forecasting demand that is most often zero

A distinct and especially challenging type of data to forecast is intermittent demand, which is most often zero but jumps up to random nonzero values at random times. This pattern is typical of demand for slow moving items, such as service parts or big ticket capital goods.

For example, consider the following sample of demand for aircraft service parts. Note the preponderance of zero values with nonzero values mixed in, often in bursts.

Forecasting demand that is most often zero

SmartForecasts has a unique method designed especially for this type of data: the Intermittent Demand forecasting feature. Since intermittent demand arises most often in the context of inventory control, this feature focuses on forecasting the range of likely values for the total demand over a lead time, e.g., cumulative demand over the period Jun-23 to Aug-23 in the example above.

 

Forecasting inventory requirements

Forecasting inventory requirements is a specialized variant of forecasting that focuses on the high end of the range of possible future values.

For simplicity, consider the problem of forecasting inventory requirements for just one period ahead, say one day ahead. Usually, the forecasting job is to estimate the most likely or average level of product demand. However, if available inventory equals the average demand, there is about a 50% chance that demand will exceed inventory, resulting in lost sales and/or lost good will. Setting the inventory level at, say, ten times the average demand will probably eliminate the problem of stockouts, but will just as surely result in bloated inventory costs.

The trick of inventory optimization is to find a satisfactory balance between having enough inventory to meet most demand without tying up too many resources in the process. Usually, the solution is a blend of business judgment and statistics. The judgmental part is to define an acceptable inventory service level, such as meeting 95% of demand immediately from stock. The statistical part is to estimate the 95th percentile of demand.

When not dealing with intermittent demand, SmartForecasts estimates the required inventory level by assuming a bell-shaped (Normal) curve of demand, estimating both the middle and the width of the bell curve, then using a standard statistical formula to estimate the desired percentile. The difference between the desired inventory level and the average level of demand is called the safety stock because it protects against the possibility of stockouts.

When dealing with intermittent demand, the bell-shaped curve is a poor approximation to the statistical distribution of demand. In this special case, SmartForecasts uses patented intermittent demand forecasting technology to estimate the required inventory service level.

 

 

Three Ways to Estimate Forecast Accuracy

Forecast accuracy is a key metric by which to judge the quality of your demand planning process. (It’s not the only one. Others include timeliness and cost; See 5 Demand Planning Tips for Calculating Forecast Uncertainty.) Once you have forecasts, there are a number of ways to summarize their accuracy, usually designated by obscure three- or four-letter acronyms like MAPE, RMSE, and MAE.  See Four Useful Ways to Measure Forecast Error for more detail.

A less discussed but more fundamental issue is how computational experiments are organized for computing forecast error. This post compares the three most important experimental designs. One of them is old-school and essentially amounts to cheating. Another is the gold standard. A third is a useful expedient that mimics the gold standard and is best thought of as predicting how the gold standard will turn out. Figure 1 is a schematic view of the three methods.

 

Three Ways to Estimate Forecast Accuracy Software Smart

Figure 1: Three ways to assess forecast error

 

The top panel of Figure 1 depicts the way forecast error was assessed back in the early 1980’s before we moved the state of the art to the scheme shown in the middle panel. In the old days, forecasts were assessed on the same data used to compute the forecasts. After a model was fit to the data, the errors computed were not for model forecasts but for model fits. The difference is that forecasts are for future values, while fits are for concurrent values. For example, suppose the forecasting model is a simple moving average of the three most recent observations. At time 3, the model computes the average of observations 1, 2, and 3. This average would then be compared to the observed value at time 3. We call this cheating because the observed value at time 3 got a vote on what the forecast should be at time 3. A true forecast assessment would compare the average of the first three observations to the value of the next, fourth, observation. Otherwise, the forecaster is left with an overly optimistic assessment of forecast accuracy.

The bottom panel of Figure 1 shows the best way to assess forecast accuracy. In this schema, all the historical demand data are used to fit a model, which is then used to forecast future, unknown demand values. Eventually, the future unfolds, the true future values reveal themselves, and actual forecast errors can be computed. This is the gold standard. This information populates the “forecasts versus actuals” report in our software.

The middle panel depicts a useful halfway measure. The problem with the gold standard is that you must wait to learn how well your chosen forecasting methods perform. This delay does not help when you are required to choose, in the moment, which forecasting method to use for each item. Nor does it provide a timely estimate of the forecast uncertainty you will experience, which is important for risk management such as forecast hedging. The middle way is based on hold-out analysis, which excludes (“holds out”) the most recent observations and asks the forecasting method to do its work without knowing those ground truths. Then the forecasts based on the foreshortened demand history can be compared to the held-out actual values to get an honest assessment of forecast error.