Most statistical forecasting works in one direct flow from past data to forecast. Forecasting with leading indicators works a different way. A leading indicator is a second variable that may influence the one being forecasted. Applying testable human knowledge about the predictive power in the relationship between these different sets of data will sometimes provide superior accuracy.

Most of the time, a forecast is based solely on the history of the item being forecasted. Let’s assume that the forecaster’s problem is to predict future unit sales of an important product. The process begins with gathering data on the product’s past sales. (Gregory Hartunian shares some practical advice on choosing the best available data in a previous post to the Smart Forecaster.) This data flows into forecasting software, which analyzes the sales record to measure the level of random variability and exploit any predictable aspects, such as trend or regular patterns of seasonal variability. Nothing that might have caused the wiggles and jiggles in the product’s sales graph is explicitly accounted for. This approach is fast, simple, self-contained and scalable, because software can zip through a huge number of forecasts automatically.

But sometimes the forecaster can do better, at the cost of more work. If the forecaster can peer through the fog of randomness and identify a leading indicator, that is, a second variable that influences the one being forecasted, more accurate predictions are possible.

For example, suppose the product is window glass for houses. It may well be that increases or decreases in the number of construction permits for new houses will be reflected in corresponding increases or decreases in the number of sheets of glass ordered several months later. If the forecaster can distill this “lagged” or delayed relationship into an equation, that equation can be used to forecast glass sales several months hence using known values of the leading indicator. This equation is called a “regression equation” and has a form something like:

Sales of glass in 3 months = 210.9 + 26.7 × Number of construction permits issued this month.

Forecasting software can take the construction permit and glass sales data and convert them into such a regression equation.
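As a rough illustration of what that conversion involves, here is a minimal sketch in Python of fitting a lagged regression by ordinary least squares. Everything in it is an assumption made for the example: the data are synthetic (generated from the illustrative equation above plus random noise), the three-month lag is taken as known, and the function name `fit_lagged_regression` is hypothetical. It is not how any particular forecasting package does the job.

```python
import random
from statistics import mean

LAG = 3  # months by which permits are assumed to lead glass orders


def fit_lagged_regression(indicator, target, lag):
    """Least-squares fit of target[t + lag] = intercept + slope * indicator[t]."""
    x = indicator[:len(indicator) - lag]  # indicator values whose outcome is known
    y = target[lag:]                      # target observed `lag` periods later
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Synthetic monthly data, invented purely for illustration.
rng = random.Random(0)
n_months = 36
permits_full = [rng.randint(20, 60) for _ in range(n_months + LAG)]
permits = permits_full[LAG:]  # permits issued in months 0..35
glass_sales = [
    # Each month's sales are driven by the permits issued LAG months earlier.
    210.9 + 26.7 * permits_full[m] + rng.gauss(0, 25)
    for m in range(n_months)
]

intercept, slope = fit_lagged_regression(permits, glass_sales, LAG)

# The last LAG permit counts are already known, so they yield forecasts
# for the next LAG months of glass sales.
forecasts = [intercept + slope * p for p in permits[-LAG:]]

print(f"Sales in {LAG} months = {intercept:.1f} + {slope:.1f} x permits this month")
print("Forecasts for the next months:", [round(f) for f in forecasts])
```

The key point is the alignment: each permit count is paired with the sales figure observed three months later, which is why the most recent permit counts can be turned directly into forecasts.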

Figure: Leading indicators demonstrated. The graph displays a relationship between example figures for time-shifted building permits and demand for glass.

However, unlike automatic statistical forecasting based on a product’s past sales, forecasting with a leading indicator faces the same problem as the proverbial recipe for rabbit stew: “First catch a rabbit.” Here the forecaster’s subject matter expertise is critical to success. The forecaster must be able to nominate one or more candidates for the job of leading indicator. After this crucial step, which draws on the forecaster’s knowledge, experience and intuition, software can be used to verify that there really is a predictive, time-delayed relationship between the candidate leading indicator and the variable to be forecasted.

This verification step is done using a “cross-correlation” analysis. The software essentially takes as input a sequence of values of the variable to be forecasted and another sequence of values of the supposed leading indicator. Then it shifts the two sequences relative to each other by, successively, one, two, three, etc. time periods, so that each value of the leading indicator is paired with the value of the forecast variable that many periods later. At each shift in time (called a “lag”, because the paired indicator value falls further and further behind the forecast variable), the software checks for a pattern of association between the two variables. If it finds a pattern that is too strong to be explained as a statistical accident, the forecaster’s hunch is confirmed.
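The cross-correlation check can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any particular product's method: the data are synthetic with a built-in three-period lag, the function names are hypothetical, and the cutoff of 2 divided by the square root of the number of paired observations is only a common rule of thumb that presumes neither series has strong trend or autocorrelation.

```python
import math
import random
from statistics import mean


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def cross_correlation(indicator, target, max_lag):
    """Correlate indicator[t] with target[t + lag] for lag = 0..max_lag."""
    return {
        lag: correlation(indicator[:len(indicator) - lag], target[lag:])
        for lag in range(max_lag + 1)
    }


# Synthetic data, invented for illustration: the target follows the
# indicator with a delay of three periods, plus random noise.
rng = random.Random(1)
true_lag, n_periods = 3, 60
indicator_full = [rng.randint(20, 60) for _ in range(n_periods + true_lag)]
indicator = indicator_full[true_lag:]
target = [
    210.9 + 26.7 * indicator_full[t] + rng.gauss(0, 25)
    for t in range(n_periods)
]

ccf = cross_correlation(indicator, target, max_lag=6)
for lag, r in ccf.items():
    threshold = 2 / math.sqrt(n_periods - lag)  # rough two-standard-error bound
    flag = "  <-- unlikely to be chance" if abs(r) > threshold else ""
    print(f"lag {lag}: r = {r:+.2f}{flag}")

# The lag with the strongest association is the candidate lead time.
best_lag = max(ccf, key=lambda lag: abs(ccf[lag]))
print("Strongest association at lag", best_lag)
```

With real data the picture is rarely this clean, and trending or seasonal series can produce spuriously large correlations, so the flagged lags are best treated as candidates to examine rather than proof.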

Obviously, forecasting with leading indicators is more work than forecasting using only an item’s own past values. The forecaster has to identify a leading indicator, starting with a list suggested by the forecaster’s subject matter expertise. This is a “hand-crafting” process that is not suited to mass production of forecasts. But it can be a successful approach for a smaller number of important items that are worth the extra effort. The role of forecasting software, such as our SmartForecasts system, is to help the forecaster authenticate the leading indicator and then exploit it.

*Thomas Willemain, PhD, co-founded Smart Software and currently serves as Senior Vice President for Research. Dr. Willemain also serves as Professor Emeritus of Industrial and Systems Engineering at Rensselaer Polytechnic Institute and as a member of the research staff at the Center for Computing Sciences, Institute for Defense Analyses.*
