Correlation vs Causation: Is This Relevant to Your Job?

Outside of work, you may have heard the famous dictum “Correlation is not causation.” It may sound like a piece of theoretical fluff that, though involved in a recent Nobel Prize in economics, isn’t relevant to your work as a demand planner. If so, you may be only partially correct.

Extrapolative vs Causal Models

Most demand forecasting uses extrapolative models. Also called time-series models, these forecast demand using only the past values of an item’s demand. Plots of past values reveal trend, seasonality, and volatility, so there is a lot they are good for. But there is another type of model, the causal model, that can potentially improve forecast accuracy beyond what you can get from extrapolative models.

Causal models bring more input data to the forecasting task: information on presumed forecast “drivers” external to the demand history of an item. Examples of potentially useful causal factors include macroeconomic variables like the inflation rate, the rate of GDP growth, and raw material prices. Examples not tied to the national economy include industry-specific growth rates and your own and competitors’ ad spending.  These variables are usually used as inputs to regression models, which are equations with demand as an output and causal variables as inputs.

Forecasting using Causal Models

Many firms have an S&OP process that involves a monthly review of statistical (extrapolative) forecasts in which management adjusts forecasts based on their judgement. Often this is an indirect and subjective way to work causal models into the process without doing the regression modeling.

To actually make a causal regression model, you first have to nominate a list of potentially useful causal predictor variables. These may come from your subject matter expertise. For example, suppose you manufacture window glass. Much of your glass may end up in new homes and new office buildings. So, the numbers of new homes and offices being built are plausible predictor variables in a regression equation.

There is a complication here: if you are using the equation to predict something, you must first predict the predictors. For example, sales of glass next quarter may be strongly related to numbers of new homes and new office buildings next quarter. But how many new homes will there be next quarter? That’s its own forecasting problem. So, you have a potentially powerful forecasting model, but you have extra work to do to make it usable.

There is one way to simplify things: use “lagged” versions of the predictor variables, i.e., their values from earlier periods. For example, the number of new building permits issued six months ago may be a good predictor of glass sales next month. You don’t have to predict the building permit data – you just have to look it up.

Is it a causal relationship or just a spurious correlation?

Causal models are the real deal: there is an actual mechanism that relates the predictor variable to the predicted variable. Predicting glass sales from building permits is an example.

A correlation relationship is more iffy. There is a statistical association that may or may not provide a solid basis for forecasting. For example, suppose you sell a product that happens to appeal most strongly to Dutch people but you don’t realize this. The Dutch are, on average, the tallest people in Europe. If your sales are increasing and the average height of Europeans is increasing, you might use that relationship to good effect. However, if the proportion of Dutch in the Euro zone is decreasing while the average height is increasing because the mix of men versus women is shifting toward men, what can go wrong? You will expect sales to increase because average height is increasing. But your sales are really mostly to the Dutch, and their relative share of the population is shrinking, so your sales are really going to decrease instead. In this case the association between sales and customer height is a spurious correlation.

How can you tell the difference between true and spurious relationships? The gold standard is to do a rigorous scientific experiment. But you are not likely to be in a position to do that. Instead, you have to rely on your personal “mental model” of how your market works. If your hunches are right, then your candidate causal variables will correlate with demand and causal modeling will pay off for you, either to supplement extrapolative models or to replace them.

What data is needed to support Demand Planning Software Implementations

We recently met with the IT team at one of our customers to discuss data requirements and installation of our API-based integration, which would pull data from their on-premises ERP system. The IT manager and analyst both expressed significant concern about providing this data and seriously questioned why it needed to be provided at all. They even voiced concerns that their data might be resold to their competition. Their reaction was a big surprise to us. We wrote this post with them in mind, to make it easier for others to communicate why certain data is necessary to support an effective demand planning process.

Please note that if you are a forecast analyst, demand planner, or supply chain professional, then most of what you’ll read below will be obvious. But what this meeting taught me is that what is obvious to one group of specialists isn’t going to be obvious to another group of specialists in an entirely different field.

The four main types of data needed are:

  1. Historical transactions, such as sales orders and shipments.
  2. Job usage transactions, such as what components are needed to produce finished goods.
  3. Inventory transfer transactions, such as what inventory was shipped from one location to another.
  4. Pricing, costs, and attributes, such as the unit cost paid to the supplier, the unit price paid by the customer, and various metadata like product family, class, etc.

Below is a brief explanation of why this data is needed to support a company’s implementation of demand planning software.

Transactional records of historical sales and shipments by customer
Think of what was drawn out of inventory as the “raw material” required by demand planning software.  This can be what was sold to whom and when or what you shipped to whom and when.  Or what raw materials or subassemblies were consumed in work orders and when.  Or what is supplied to a satellite warehouse from a distribution center and when.

The history of these transactions is analyzed by the software and used to produce statistical forecasts that extrapolate observed patterns. The data is evaluated to uncover patterns such as trend, seasonality, and cyclical behavior, and to identify potential outliers that require business attention. If this data is not generally accessible or is updated at irregular intervals, then it is nearly impossible to create a good prediction of future demand. Yes, you could use business knowledge or gut feel, but that doesn’t scale and nearly always introduces bias into the forecast (i.e., consistently forecasting too high or too low).

Data is needed at the transactional level to support finer grained forecasting at the weekly or even daily levels.  For example, as a business enters its busy season it may want to start forecasting weekly to better align production to demand.  You can’t easily do that without having the transactional data in a well-structured data warehouse. 
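
To make the point concrete, here is a minimal sketch of why transaction-level data keeps the choice of time bucket open. The rows, item codes, and the bucket function are hypothetical; a real feed would come from your ERP and carry many more fields.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction rows: (ship_date, item, quantity)
transactions = [
    (date(2024, 1, 1), "A100", 5),
    (date(2024, 1, 3), "A100", 2),
    (date(2024, 1, 9), "A100", 4),
    (date(2024, 1, 10), "B200", 7),
]


def bucket(rows, period="week"):
    """Total quantity per (item, time bucket) at the requested granularity."""
    totals = defaultdict(int)
    for ship_date, item, qty in rows:
        if period == "week":
            iso = ship_date.isocalendar()
            key = (iso[0], iso[1])  # (ISO year, ISO week number)
        elif period == "month":
            key = (ship_date.year, ship_date.month)
        else:
            raise ValueError(f"unsupported period: {period}")
        totals[(item, key)] += qty
    return dict(totals)


weekly = bucket(transactions, "week")
monthly = bucket(transactions, "month")
```

The same rows roll up to either weekly or monthly history. Had only monthly totals been provided, the weekly view could not be recovered.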

It might also be the case that certain types of transactions shouldn’t be included in demand data. This can happen when demand results from a steep discount or some other circumstance that the supply chain team knows will skew the results. If the data is provided in the aggregate, it is much harder to segregate these exceptions. At Smart Software, we call the process of figuring out which transactions (and associated transactional attributes) should be counted in the demand signal “demand signal composition.” Having access to all the transactions enables a company to modify its demand signal as needed over time within the software. Providing only some of the data results in a far more rigid demand composition that can only be remedied with additional implementation work.
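
As a rough illustration of demand signal composition, the sketch below filters transactions by their attributes before totaling them. The field names (order_type, discount_pct), the example rows, and the compose_demand function are all assumptions made for illustration; they are not how any particular product implements this.

```python
# Hypothetical transactions with attributes that can drive inclusion rules.
transactions = [
    {"item": "A100", "qty": 10, "order_type": "standard", "discount_pct": 0},
    {"item": "A100", "qty": 40, "order_type": "standard", "discount_pct": 50},
    {"item": "A100", "qty": 3, "order_type": "warranty_replacement", "discount_pct": 0},
    {"item": "A100", "qty": 8, "order_type": "standard", "discount_pct": 5},
]


def compose_demand(rows, excluded_types=frozenset(), max_discount_pct=100):
    """Sum quantities, skipping excluded order types and steep discounts."""
    return sum(
        row["qty"]
        for row in rows
        if row["order_type"] not in excluded_types
        and row["discount_pct"] <= max_discount_pct
    )


everything = compose_demand(transactions)
filtered = compose_demand(
    transactions,
    excluded_types={"warranty_replacement"},
    max_discount_pct=20,
)
```

Because the rules are applied to individual transactions, they can be changed later without asking for a new data extract.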

Pricing and Costs
The price you sold your products for and the cost you paid to procure them (or their raw materials) are critical to being able to forecast in revenue or costs. An important part of the demand planning process is getting business knowledge from customers and sales teams. Sales teams tend to think of demand by customer or product category and speak in the language of dollars. So, it is important to express a forecast in dollars. The demand planning system cannot do that if the forecast is shown in units only.

Often, the demand forecast is used to drive or at least influence a larger planning & budgeting process and the key input to a budget is a forecast of revenue.  When demand forecasts are used to support the S&OP process, the Demand Planning software should either average pricing across all transactions or apply “time-phased” conversions that consider the price sold at that time.   Without the raw data on pricing and costs, the demand planning process can still function, but it will be severely impaired. 
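
The two conversion options can be sketched as follows. The history figures and function names are hypothetical, and falling back to the average price for periods with no recorded price is an assumption of this sketch rather than a documented rule.

```python
# Hypothetical sales history per month: (period, units_sold, revenue)
history = [
    ("2024-01", 100, 1000.0),
    ("2024-02", 100, 1200.0),
    ("2024-03", 200, 2800.0),
]


def average_price(rows):
    """One price for everything: total revenue divided by total units."""
    total_units = sum(units for _, units, _ in rows)
    total_revenue = sum(revenue for _, _, revenue in rows)
    return total_revenue / total_units


def time_phased_prices(rows):
    """The price realized in each period, keyed by period."""
    return {period: revenue / units for period, units, revenue in rows if units}


def dollarize(units_by_period, prices, fallback_price):
    """Convert units to dollars, using the period's own price when known."""
    return {
        period: units * prices.get(period, fallback_price)
        for period, units in units_by_period.items()
    }


avg = average_price(history)
prices = time_phased_prices(history)
# "2024-02" has a recorded price; "2024-04" is a future period, so it uses avg.
dollars = dollarize({"2024-02": 100, "2024-04": 150}, prices, avg)
```

Neither function can run without the underlying price and revenue data, which is the point of this section.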

Product attributes, Customer Details, and Locations
Product attributes are needed so that forecasters can aggregate forecasts across different product families, groups, commodity codes, etc. It is helpful to know both the unit forecast and the total projected dollar demand for each category. Often, business knowledge about future demand is not available at the product level but is available at the product family level, customer level, or regional level. With the addition of product attributes to your demand planning data feed, you can easily “roll up” forecasts from the item level to a family level. You can convert forecasts at these levels to dollars and better collaborate on how the forecast should be modified.

Once the knowledge is applied in the form of a forecast override, the software will automatically reconcile the change to all the individual items that comprise the group.  This way, a forecast analyst doesn’t have to individually adjust every part.  They can make a change at the aggregate level and let the demand planning software do the reconciliation for them. 
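
One simple way to picture this reconciliation is proportional allocation: each item keeps its share of the group total. The sketch below assumes that rule purely for illustration; the actual reconciliation logic in any given software package may differ, and the item names and numbers are made up.

```python
def reconcile_override(item_forecasts, new_total):
    """Spread a group-level override across items in proportion to their forecasts."""
    old_total = sum(item_forecasts.values())
    if old_total == 0:
        # No basis for proportions, so split the override evenly (an assumption).
        even_share = new_total / len(item_forecasts)
        return {item: even_share for item in item_forecasts}
    return {
        item: new_total * forecast / old_total
        for item, forecast in item_forecasts.items()
    }


# Hypothetical product family whose statistical forecasts sum to 100 units.
family = {"A100": 50, "B200": 30, "C300": 20}

# Sales expects the family to sell 120 units instead.
adjusted = reconcile_override(family, 120)
```

The analyst makes one change at the family level, and every item forecast moves with it while still adding up to the new total.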

Grouping for ease of analysis also applies to customer attributes, such as assigned salesperson or a customer’s preferred ship from location.  And location attributes can be useful, such as assigned region.  Sometimes attributes relate to a product and location combination, like preferred supplier or assigned planner, which can differ for the same product depending on warehouse.

A final note on confidentiality

Recall that our customer expressed concern that we might sell their data to a competitor. We would never do that. For decades, we have been using customer data for training purposes and for improving our products. We are scrupulous about safeguarding customer data and anonymizing anything that might be used, for instance, to illustrate a point in a blog post.

Elephants and Kangaroos: ERP vs. Best-of-Breed Demand Planning

“Despite what you’ve seen in your Saturday morning cartoons, elephants can’t jump, and there’s one simple reason: They don’t have to. Most jumpy animals—your kangaroos, monkeys, and frogs—do it primarily to get away from predators.”  — Patrick Monahan, Science.org, Jan 27, 2016.

Now you know why the largest ERP companies can’t develop high-quality solutions that match the best of breed. They never had to, so they never evolved to innovate outside of their core focus.

However, as ERP systems have become commoditized, gaps in their functionality became impossible to ignore. The larger players sought to protect their share of customer wallet by promising to develop innovative add-on applications to fill all the white spaces.  But without that “innovation muscle,” many projects failed, and mountains of technical debt accumulated.

Best-of-breed companies evolved to innovate and have deep functional expertise in specific verticals. The result is that best-of-breed ERP add-ons are easier to use, have more features, and deliver more value than the native ERP modules they replace.

If your ERP provider has already partnered with an innovative best-of-breed add-on provider*, you’re all set! But if you can only get the basics from your ERP, go with a best-of-breed add-on that has a bespoke integration to the ERP.

A great place to start your search is to look for ERP demand planning add-ons that add brains to the ERP’s brawn, i.e., those that support inventory optimization and demand forecasting.  Leverage add-on tools like Smart’s statistical forecasting, demand planning, and inventory optimization apps to develop forecasts and stocking policies that are fed back to the ERP system to drive daily ordering. 

*App stores are a license for the best of breed to sell into the ERP companies’ base – being listed partnerships.

Is your demand planning and forecasting process a black box?

There’s one thing I’m reminded of almost every day at Smart Software that puzzles me: most companies do not understand how forecasts are created and stocking policies are determined. It’s an organizational black box. Here is an example from a recent sales call:

How do you forecast?
We use history.

How do you use history?
What do you mean?

Well, you can take an average of the last year, last two years, average the most recent periods, or use some other type of formula to generate the forecast.
I’m pretty sure we use an average of the last 12 months.

Why 12 months instead of a different amount of history?
12 months is a good amount of time to use because it doesn’t get skewed by older data but it’s recent enough.

How do you know it’s more accurate than using 18 months or some other length of history?
We don’t know. We do adjust the forecasts based on feedback from sales.  

Do you know if the adjustments make things more accurate or less than if you just used the average?
We don’t know, but we are confident that forecasts are inflated.

What do the inventory buyers do then if they think the numbers are inflated?
They have lots of business knowledge and adjust their buys accordingly.

So, is it fair to say they would ignore the forecasts at least some of the time?
Yes, some of the time.

How do the buyers decide when to order more? Do you have a reorder point or safety stock specified in your ERP system that helps guide these decisions?
Yes, we use a safety stock field.

How is safety stock calculated?
Buyers determine this based on the importance of the item, lead times, and other considerations such as how many customers purchase the item, the velocity of the item, and its cost. They’ll carry different amounts of safety stock depending on this.

The discussion continued. The main takeaway here is that when you scratch just below the surface, far more questions are revealed than answers. This often means that the inventory planning and demand forecasting process is highly subjective, varies from planner to planner, is not well understood by the rest of the organization, and is likely to be reactive. As Tom Willemain has described it, it is “chaos masked by improvisation.” The “as-is” process needs to be fully identified and documented. Only then can gaps be exposed and improvements made. Here is a list of 10 questions you can ask that will reveal your organization’s true forecasting, demand planning, and inventory planning process.
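
The question “How do you know 12 months is more accurate than 18?” has an empirical answer. Below is a minimal sketch of one way to get it: replay history, forecast each month from the average of the preceding months, and compare mean absolute errors. The demand series is invented, and the rolling_mae helper and the choice of mean absolute error as the yardstick are assumptions of this sketch.

```python
from statistics import mean


def rolling_mae(demand, window, start):
    """Mean absolute error of a moving-average forecast, replayed over history.

    For each period t from start onward, the forecast is the average of the
    `window` periods just before t, and it is scored against actual demand at t.
    """
    if start < window:
        raise ValueError("start must leave at least `window` periods of history")
    errors = []
    for t in range(start, len(demand)):
        forecast = mean(demand[t - window:t])
        errors.append(abs(demand[t] - forecast))
    return mean(errors)


# Hypothetical 30 months of demand for one item, oldest first.
demand = [112, 98, 105, 120, 131, 125, 118, 109, 115, 128,
          140, 135, 122, 108, 117, 133, 146, 138, 129, 119,
          127, 141, 155, 149, 134, 121, 130, 147, 160, 152]

WINDOWS = (12, 18)
start = max(WINDOWS)  # score every window on the same months for a fair comparison
scores = {w: rolling_mae(demand, w, start) for w in WINDOWS}
best = min(scores, key=scores.get)

for w in WINDOWS:
    print(f"{w}-month average: MAE = {scores[w]:.1f}")
print(f"Lower error on this history: {best} months")
```

The answer will differ by item and over time, which is exactly why it is worth checking rather than assuming.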

The Role of Trust in the Demand Forecasting Process Part 2: What do you Trust

“Regardless of how much effort is poured into training forecasters and developing elaborate forecast support systems, decision-makers will either modify or discard the predictions if they do not trust them.”  — Dilek Onkal, International Journal of Forecasting 38:3 (July-September 2022), p.802.

The words quoted above grabbed my attention and prompted this post. Those of a geekly persuasion, like your blogger, are inclined to think of forecasting as a statistical problem. While that is obviously true, those of a certain age, like your blogger, understand that forecasting is also a social activity and therefore has a large human component.

What Do You Trust?

There is a related dimension of trust: not who do you trust but what do you trust? By this, I mean both data and software.

Trust in Data

Trust in data underpins trust in the forecaster using the data. Most of our customers have their data in an ERP system. This data must be understood as a key corporate asset. For the data to be trustworthy, it must have the “three C’s”, i.e., it must be correct, complete, and current.

Correctness is obviously fundamental. We once had a customer who was implementing a new, strong forecasting process, but found the results completely at odds with their sense of what was happening in the business. It turned out that several of their data streams were incorrect by a factor of two, which is a huge error. Of course, this set back the implementation process until they could identify and correct all the gross errors in their demand data.

There is a less obvious point to be made about correctness. That is, data are random, so what you see now is not likely to be what you see next. Planning production based on the assumption that next week’s demand will be exactly the same as this week’s demand is clearly foolish, but classical formula-based forecasting models like simple exponential smoothing (described in Part 1 of this post) will project the same number throughout the forecast horizon. This is where scenario-based planning is essential for coping with the inevitable fluctuations in key variables such as customers’ demands and suppliers’ replenishment lead times.

Completeness is the second requirement for data to be trusted. Our software ultimately gets much of its value from exposing the links between operational decisions (e.g., selecting the reorder points governing replenishment of stock) and business-related metrics like inventory costs. Yet often implementation of forecasting software is delayed because item demand information is available someplace, but holding, ordering and/or shortage costs are not. Or, to cite another recent example, a customer was able to properly size only half their inventory of spares for reparable parts because nobody had been tracking when the other half was breaking down. With no information on mean time between failures (MTBF), it was not possible to model the breakdown behavior of half the fleet of reparable spares.

Finally, the currency of data matters. As the speed of business increases and company planning cycles drop from a quarterly or monthly tempo to a weekly or daily tempo, it becomes desirable to exploit the agility provided by overnight uploads of daily transactional data into the cloud. This allows high-frequency adjustments of forecasts and/or inventory control parameters for items that experience high volatility and sudden shifts in demand. The fresher the data, the more trustworthy the analysis.
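
A few of these checks can be automated. The sketch below is a deliberately simple audit against the three C’s; the record layout, the field names, and the one-day freshness threshold are assumptions for illustration, and a negative-quantity test is only one narrow proxy for correctness.

```python
from datetime import date

REQUIRED_COSTS = ("holding_cost", "ordering_cost", "shortage_cost")


def audit_item(record, today, max_age_days=1):
    """Return a list of problems found in one item's data, labeled by 'C'."""
    problems = []
    # Correct: a crude sanity check; real validation needs business context.
    if any(qty < 0 for qty in record["demand"]):
        problems.append("correct: negative demand quantity")
    # Complete: the cost fields needed to link decisions to business metrics.
    for field in REQUIRED_COSTS:
        if record.get(field) is None:
            problems.append(f"complete: missing {field}")
    # Current: how stale is the latest upload?
    age_days = (today - record["last_updated"]).days
    if age_days > max_age_days:
        problems.append(f"current: data is {age_days} days old")
    return problems


# Hypothetical item record with one problem of each kind (and two missing costs).
item = {
    "demand": [12, 0, 7, -3],
    "holding_cost": 0.25,
    "ordering_cost": None,
    "last_updated": date(2024, 3, 1),
}
report = audit_item(item, today=date(2024, 3, 8))
for problem in report:
    print(problem)
```

Checks like these will not catch a data stream that is off by a factor of two, but they do surface the gaps that most often stall an implementation.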

Trust in Demand Forecasting Software

Even with high-quality data, forecasters must still trust the analytical software that processes the data. This trust must extend to both the software itself and to the computational environment in which it functions.

If forecasters use on-premises software, they must rely on their own IT departments to safeguard the data and keep it available for use. If they wish instead to exploit the power of cloud-based analytics, customers must trust their confidential information to their software vendors. Professional-level software, such as ours, justifies customers’ trust through SOC 2 certification. The SOC 2 framework was developed by the American Institute of CPAs (AICPA) and defines criteria for managing customer data based on five “trust service principles”: security, availability, processing integrity, confidentiality, and privacy.

What about the software itself? What is needed to make it trustworthy? The main criteria here are the correctness of algorithms and functional reliability. If the vendor has a professional program development process, there will be little chance that the software ends up computing the wrong numbers because of a programming error. And if the vendor has a rigorous quality assurance process, there will be little chance that the software will crash just when the forecaster is on deadline or must deal with a pop-up analysis for a special situation.

Summary

To be useful, forecasters and their forecasts must be trusted by decision-makers. That trust depends on characteristics of forecasters and their processes and communication. It also depends on the quality of the data and software used in creating the forecasts.

Read the 1st part of this Blog “Who do you Trust” here: https://smartcorp.com/forecasting/the-role-of-trust-in-the-demand-forecasting-process-part-1-who/

The Role of Trust in the Demand Forecasting Process Part 1: Who do you Trust

“Regardless of how much effort is poured into training forecasters and developing elaborate forecast support systems, decision-makers will either modify or discard the predictions if they do not trust them.”  — Dilek Onkal, International Journal of Forecasting 38:3 (July-September 2022), p.802.

The words quoted above grabbed my attention and prompted this post. Those of a geekly persuasion, like your blogger, are inclined to think of forecasting as a statistical problem. While that is obviously true, those of a certain age, like your blogger, understand that forecasting is also a social activity and therefore has a large human component.

Who Do You Trust?

Trust is always a two-way street, but let’s stay on the demand forecaster’s side. What characteristics of and actions by forecasters and demand planners build trust in their work? The above quoted Professor Onkal reviewed academic research on this topic going back to 2006. She summarized results from practitioner surveys that identified key trust factors related to forecaster characteristics, forecasting process, and forecasting communication.

Forecaster characteristics

Key to building trust among the users of forecasts are perceptions of forecaster and demand planner competence and objectivity. Competence has a mathematical component, but many managers confuse computer skills with analytic skills, so users of forecasting software can usually clear this hurdle. However, since the two are not the same, it pays dividends to absorb your vendor’s training and learn not just the math but the lingo of your forecasting software. In my observation, trust can also be increased by showing knowledge of the company’s business.

Objectivity is also a key to trustworthiness. It may be uncomfortable for the forecaster to be put in the middle of occasional departmental squabbles, but those will come up and must be handled with tact. Squabbles? Well, silos exist and tilt in different directions. Sales departments favor higher demand forecasts that drive production increases, so that they never have to say “Sorry, we are fresh out of that.” Inventory managers are wary of high demand forecasts, because “excess enthusiasm” can leave them holding the bag, sitting on bloated inventory.

Sometimes the forecaster becomes a de facto referee, and in this role must display overt signs of objectivity. That can mean first recognizing that every management decision involves tradeoffs of good things against other good things, e.g., product availability versus lean operations, and then helping the parties strike a painful but tolerable balance by surfacing the links between operational decisions and the key performance metrics that matter to folks like Chief Financial Officers.

The Forecasting process

The forecasting process can be thought of as having three phases: data inputs, calculations, and outputs. Actions can be taken to increase trust in each phase.

Regarding inputs:

Trust can be increased if obviously relevant inputs are at least acknowledged if not directly used in calculations. Thus, factors like social media sentiment and regional sales managers’ gut instincts can be legitimate parts of a forecast consensus process. However, objectivity requires that these putative predictors of profit be tested objectively. For instance, a professional-grade forecasting process may well include subjective adjustment to statistical forecasts but must then also assess whether the adjustments actually end up improving accuracy, not just making some people feel listened to.
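
Testing whether adjustments help can be as simple as scoring both versions of the forecast against what actually happened. The sketch below uses invented numbers and two common yardsticks, mean absolute error and bias (average signed error); the choice of metrics and the variable names are assumptions of this illustration.

```python
from statistics import mean


def mae(forecasts, actuals):
    """Mean absolute error: the average size of the miss, ignoring direction."""
    return mean(abs(f - a) for f, a in zip(forecasts, actuals))


def bias(forecasts, actuals):
    """Average signed error: positive means forecasting too high."""
    return mean(f - a for f, a in zip(forecasts, actuals))


# Hypothetical four periods for one item.
statistical = [100, 110, 105, 120]  # forecast before judgmental adjustment
adjusted = [120, 125, 115, 140]     # forecast after adjustment
actual = [98, 112, 101, 118]

# Positive value_added means the adjustments reduced error.
value_added = mae(statistical, actual) - mae(adjusted, actual)

print(f"MAE statistical: {mae(statistical, actual)}")
print(f"MAE adjusted:    {mae(adjusted, actual)}")
print(f"Bias adjusted:   {bias(adjusted, actual)}")
```

In this made-up case the adjustments inflate the forecast and make it less accurate, a pattern worth checking for in your own history.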

Regarding the second phase, calculations:

The forecaster will be trusted to the extent that they are able to deploy more than one way to calculate forecasts and then articulate a good reason why they chose the method eventually used. In addition, the forecaster should be able to explain in accessible language how even complicated techniques do their job. It is difficult to put trust in a “black box” method that is so opaque as to be inscrutable. The importance of explainability is amplified by the fact of life that the forecaster’s superior must themselves in turn be able to justify the choice of technique to their supervisor.

For instance, exponential smoothing uses this equation: S(t) = αX(t) + (1-α)S(t-1), where X(t) is the demand observed in period t, S(t) is the smoothed value, and α is a weight between 0 and 1. Many forecasters are familiar with this equation, but many forecast users are not. There is a story that explains the equation in terms of averaging out irrelevant “noise” in an item’s demand history and the need to strike a balance between smoothing out noise and being able to react to sudden shifts in the level of demand. The forecaster who can tell that story will be more credible. (My own version of that story uses phrases from sports, i.e., “head fakes” and “jukes”. Finding folksy analogs appropriate to your specific audience always pays dividends.)
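
For those who prefer code to stories, the equation translates almost word for word into Python. This is a minimal sketch: starting the recursion at the first observation is one common convention that I am assuming here, and the demand series and α values are made up.

```python
def exponential_smoothing(x, alpha, s0=None):
    """Apply S(t) = alpha * X(t) + (1 - alpha) * S(t-1) to a demand series.

    If no starting value s0 is given, the first observation is used
    (an assumed convention; other initializations exist).
    """
    s = x[0] if s0 is None else s0
    smoothed = []
    for value in x:
        s = alpha * value + (1 - alpha) * s
        smoothed.append(s)
    return smoothed


# Hypothetical demand: a one-period blip ("head fake"), then a real level shift.
demand = [100, 100, 140, 100, 100, 140, 140, 140]

cautious = exponential_smoothing(demand, alpha=0.1)  # mostly ignores the blip
reactive = exponential_smoothing(demand, alpha=0.8)  # chases the blip

# The forecast for every future period is simply the last smoothed value.
flat_forecast = cautious[-1]
```

A small α shrugs off the head fake but is slow to accept the real shift; a large α does the opposite. That tradeoff is the whole story.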

A final point: best practice demands that any forecast be accompanied by an honest assessment of its uncertainty. A forecaster who tries to build trust by being overly specific (“Sales next quarter will be 12,184 units”) will always fail. A forecaster who says “Sales next quarter will have a 90% chance of falling between 12,000 and 12,300 units” will be both correct more often and more helpful to decision makers. After all, forecasting is essentially a job of risk management, so the decision maker is best served by knowing the risks.
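
One simple way to produce such a range is to look at how far off past forecasts were and attach the middle 90% of those errors to the new point forecast. The sketch below assumes that approach; it is not the only way to build an interval, the past errors are invented, and it presumes future errors will resemble past ones.

```python
from statistics import quantiles


def interval_90(point_forecast, past_errors):
    """Attach the 5th and 95th percentiles of past errors to a point forecast.

    past_errors holds (actual - forecast) from earlier periods.
    """
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(past_errors, n=20, method="inclusive")
    return point_forecast + cuts[0], point_forecast + cuts[-1]


# Hypothetical errors from the last 12 quarterly forecasts, in units.
past_errors = [-150, -120, -80, -40, -10, 0, 20, 30, 60, 90, 110, 150]

lower, upper = interval_90(12184, past_errors)
print(f"90% chance of falling between {lower:,.0f} and {upper:,.0f} units")
```

With only a dozen past errors the percentiles are rough, so treat the result as a conversation starter about risk rather than a precise guarantee.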

Forecasting communication:

Finally, consider the third phase, communication of forecast results. Research suggests that continual communication with forecast users builds trust. It avoids those horrible, deflating moments when a nicely formatted report is shot down because of some fatal flaw that could have been foreseen: “This is no good because you didn’t take account of X, Y or Z” or “We really wanted you to present results rolled up to the top of the product hierarchies (or by sales region or by product line or…)”.

Even when everybody is aligned as to what is expected, trust is enhanced by presenting results using well-crafted graphics, with massive numerical tables provided for backup but not as the main way of communicating results. My experience has been that, just as a meeting-control device, a graph is usually much better than a large numerical table. With a graph, everybody’s attention is focused on the same thing and many aspects of the analysis are immediately (and literally) visible. With a table of results, the table of participants often splinters into side conversations in which each voice is focused on different pieces of the table.

Onkal summarizes the research this way: “Take-aways for those who make forecasts and those who use them converge around clarity of communication as well as perceptions of competence and integrity.”

What Do You Trust?

There is a related dimension of trust: not who do you trust but what do you trust? By this I mean both data and software. Read the 2nd part of this Blog “What do you Trust” here: https://smartcorp.com/forecasting/the-role-of-trust-in-the-demand-forecasting-process-part-2-what/