Rethinking forecast accuracy: A shift from accuracy to error metrics

Measuring the accuracy of forecasts is an undeniably important part of the demand planning process. This forecasting scorecard could be built based on one of two contrasting viewpoints for computing metrics. The error viewpoint asks, “how far was the forecast from the actual?” The accuracy viewpoint asks, “how close was the forecast to the actual?” Both are valid, but error metrics provide more information.

Accuracy is represented as a percentage between zero and 100, while error percentages start at zero but have no upper limit. Reports of MAPE (mean absolute percent error) or other error metrics can be titled “forecast accuracy” reports, which blurs the distinction.  So, you may want to know how to convert from the error viewpoint to the accuracy viewpoint that your company espouses.  This blog describes how with some examples.

Accuracy metrics are computed such that when the actual equals the forecast, the accuracy is 100%, and when the forecast is double the actual (or zero), the accuracy is 0%. Reports that compare the forecast to the actual often include the following:

  • The Actual
  • The Forecast
  • Unit Error = Forecast – Actual
  • Absolute Error = Absolute Value of Unit Error
  • Absolute % Error = Abs Error / Actual, as a %
  • Accuracy % = 100% – Absolute % Error

Look at a couple of examples that illustrate the difference between the approaches. Say the actual is 8 and the forecast is 10.

Unit Error is 10 – 8 = 2

Absolute % Error = 2 / 8, as a % = 0.25 * 100 = 25%

Accuracy = 100% – 25% = 75%.

Now let’s say the actual is 8 and the forecast is 24.

Unit Error is 24 – 8 = 16

Absolute % Error = 16 / 8, as a % = 2 * 100 = 200%

Accuracy = 100% – 200% = –100%, which is negative and so is set to 0%.

In the first example, accuracy measurements provide the same information as error measurements since the forecast and actual are already relatively close. But when the forecast is more than double the actual, accuracy measurements bottom out at zero. That does correctly indicate the forecast was not at all accurate. But the second example is more accurate than a third, where the actual is 8 and the forecast is 200. That’s a distinction a 0 to 100% range of accuracy doesn’t register. In this final example:

Unit Error is 200 – 8 = 192

Absolute % Error = 192 / 8, as a % = 24 * 100 = 2,400%

Accuracy = 100% – 2,400% = –2,300%, which is negative and so is set to 0%.

Error metrics continue to provide information on how far the forecast is from the actual and arguably better represent forecast accuracy.
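
The three worked examples can be reproduced with a short Python sketch. The function name `forecast_metrics` is illustrative, the floor at zero follows the "negative is set to 0%" rule used in the examples, and the sketch assumes the actual is not zero.

```python
def forecast_metrics(actual, forecast):
    """Return (unit_error, abs_pct_error, accuracy_pct) for one forecast.

    Assumes actual != 0; percent error is undefined otherwise.
    """
    unit_error = forecast - actual
    abs_pct_error = abs(unit_error) / actual * 100
    # Accuracy is floored at 0%, which is where information gets lost.
    accuracy_pct = max(0.0, 100 - abs_pct_error)
    return unit_error, abs_pct_error, accuracy_pct


for forecast in (10, 24, 200):
    unit_error, error_pct, accuracy_pct = forecast_metrics(8, forecast)
    print(f"forecast={forecast}: error={error_pct:.0f}%, accuracy={accuracy_pct:.0f}%")
```

The last two forecasts both report 0% accuracy, while the error percentages (200% and 2,400%) still separate them.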

We encourage adopting the error viewpoint. You simply hope for a small error percentage to indicate the forecast was not far from the actual, instead of hoping for a large accuracy percentage to indicate the forecast was close to the actual.  This shift in mindset offers the same insights while eliminating distortions.


What data is needed to support Demand Planning Software Implementations

We recently met with the IT team at one of our customers to discuss data requirements and installation of our API based integration that would pull data from their on-premises installation of their ERP system.   The IT manager and analyst both expressed significant concern about providing this data and seriously questioned why it needed to be provided at all.  They even voiced concerns that their data might be resold to their competition. Their reaction was a big surprise to us.  We wrote this blog with them in mind and to make it easier for others to communicate why certain data is necessary to support an effective demand planning process. 

Please note that if you are a forecast analyst, demand planner, or supply chain professional, then most of what you’ll read below will be obvious.  But what this meeting taught me is that what is obvious to one group of specialists isn’t going to be obvious to another group of specialists in an entirely different field. 

The four main types of data that are needed are:  

  1. Historical transactions, such as sales orders and shipments.
  2. Job usage transactions, such as what components are needed to produce finished goods.
  3. Inventory transfer transactions, such as what inventory was shipped from one location to another.
  4. Pricing, costs, and attributes, such as the unit cost paid to the supplier, the unit price paid by the customer, and various metadata like product family, class, etc.  

Below is a brief explanation of why this data is needed to support a company’s implementation of demand planning software.

Transactional records of historical sales and shipments by customer
Think of what was drawn out of inventory as the “raw material” required by demand planning software.  This can be what was sold to whom and when or what you shipped to whom and when.  Or what raw materials or subassemblies were consumed in work orders and when.  Or what is supplied to a satellite warehouse from a distribution center and when.

The history of these transactions is analyzed by the software and used to produce statistical forecasts that extrapolate observed patterns.  The data is evaluated to uncover patterns such as trend, seasonality, and cyclical patterns, and to identify potential outliers that require business attention.  If this data is not generally accessible or is updated at irregular intervals, then it is nearly impossible to create a good prediction of future demand.  Yes, you could use business knowledge or gut feel, but that doesn’t scale and nearly always introduces bias into the forecast (i.e., consistently forecasting too high or too low). 

Data is needed at the transactional level to support finer grained forecasting at the weekly or even daily levels.  For example, as a business enters its busy season it may want to start forecasting weekly to better align production to demand.  You can’t easily do that without having the transactional data in a well-structured data warehouse. 

It might also be the case that certain types of transactions shouldn’t be included in demand data.  This can happen when demand results from a steep discount or some other circumstance that the supply chain team knows will skew the results.  If the data is provided in the aggregate, it is much harder to segregate these exceptions.  At Smart Software, we call the process of figuring out which transactions (and associated transactional attributes) should be counted in the demand signal “demand signal composition.” Having access to all the transactions enables a company to modify its demand signal as needed over time within the software.  Providing only some of the data results in a far more rigid demand composition that can only be remedied with additional implementation work.
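
As a rough illustration of why transaction-level data matters, the sketch below excludes one transaction type and then buckets what remains by week. The record layout, the `order_type` field, and the `"clearance"` exclusion rule are all hypothetical; the real rules would come from the business.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records: (order_date, item, quantity, order_type)
transactions = [
    (date(2023, 1, 2), "A100", 40, "standard"),
    (date(2023, 1, 4), "A100", 500, "clearance"),  # steep discount
    (date(2023, 1, 5), "A100", 35, "standard"),
    (date(2023, 1, 10), "A100", 50, "standard"),
]

EXCLUDED_TYPES = {"clearance"}  # assumption: a rule supplied by the business


def weekly_demand(records, excluded_types):
    """Sum quantities per (item, ISO year, ISO week), skipping excluded types."""
    totals = defaultdict(int)
    for order_date, item, quantity, order_type in records:
        if order_type in excluded_types:
            continue
        iso_year, iso_week, _ = order_date.isocalendar()
        totals[(item, iso_year, iso_week)] += quantity
    return dict(totals)


print(weekly_demand(transactions, EXCLUDED_TYPES))
```

Neither the exclusion nor the weekly bucketing would be possible if only monthly totals had been provided.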

Pricing and Costs
The price you sold your products for and the cost you paid to procure them (or raw materials) are critical to being able to forecast in revenue or costs.  An important part of the demand planning process is getting business knowledge from customers and sales teams.  Sales teams tend to think of demand by customer or product category and speak in the language of dollars.  So, it is important to express a forecast in dollars.  The demand planning system cannot do that if the forecast is shown in units only. 

Often, the demand forecast is used to drive or at least influence a larger planning & budgeting process and the key input to a budget is a forecast of revenue.  When demand forecasts are used to support the S&OP process, the Demand Planning software should either average pricing across all transactions or apply “time-phased” conversions that consider the price sold at that time.   Without the raw data on pricing and costs, the demand planning process can still function, but it will be severely impaired. 
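
The two conversion options can be sketched in a few lines of Python. This is only an illustration: the function names, the sample sales history, and the per-period price schedule are assumptions, not a description of how any particular product performs the conversion.

```python
def average_price(sales):
    """Quantity-weighted average unit price from (quantity, unit_price) pairs."""
    total_qty = sum(qty for qty, _ in sales)
    total_revenue = sum(qty * price for qty, price in sales)
    return total_revenue / total_qty


def to_revenue(unit_forecast, price_by_period):
    """Convert a unit forecast to dollars, one price per forecast period."""
    return [units * price for units, price in zip(unit_forecast, price_by_period)]


sales_history = [(10, 20.0), (30, 24.0)]  # hypothetical (quantity, unit price)
unit_forecast = [100, 120, 90]

# Option 1: one average price applied to every period.
flat_price = average_price(sales_history)
flat = to_revenue(unit_forecast, [flat_price] * len(unit_forecast))

# Option 2: a time-phased price per period (assumed schedule).
phased = to_revenue(unit_forecast, [24.0, 24.0, 26.0])

print(flat_price, flat, phased)
```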

Product attributes, Customer Details, and Locations
Product attributes are needed so that forecasters can aggregate forecasts across different product families, groups, commodity codes, etc. It is helpful to know the projected units and the total projected dollarized demand for different categories.  Often, business knowledge about what the demand might be in the future is not known at the product level but is known at the product family level, customer level, or regional level.  With the addition of product attributes to your demand planning data feed, you can easily “roll up” forecasts from the item level to a family level.  You can convert forecasts at these levels to dollars and better collaborate on how the forecast should be modified.  

Once the knowledge is applied in the form of a forecast override, the software will automatically reconcile the change to all the individual items that comprise the group.  This way, a forecast analyst doesn’t have to individually adjust every part.  They can make a change at the aggregate level and let the demand planning software do the reconciliation for them. 
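
One common way to reconcile a group-level override down to items is proportional allocation, sketched below. This is an assumption made for illustration; the text above does not say which allocation rule the software uses, and the item names are hypothetical.

```python
def reconcile_override(item_forecasts, group_override):
    """Spread a group-level override across items in proportion to their
    statistical forecasts. Proportional allocation is an assumption here."""
    group_total = sum(item_forecasts.values())
    if group_total == 0:
        # No basis for proportions, so split evenly.
        share = group_override / len(item_forecasts)
        return {item: share for item in item_forecasts}
    scale = group_override / group_total
    return {item: qty * scale for item, qty in item_forecasts.items()}


family = {"widget-S": 100, "widget-M": 300, "widget-L": 600}
adjusted = reconcile_override(family, 1200)  # raise the family total from 1,000
print(adjusted)
```

Each item keeps its share of the family total, so the analyst makes one change instead of three.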

Grouping for ease of analysis also applies to customer attributes, such as assigned salesperson or a customer’s preferred ship from location.  And location attributes can be useful, such as assigned region.  Sometimes attributes relate to a product and location combination, like preferred supplier or assigned planner, which can differ for the same product depending on warehouse.

 

A final note on confidentiality

Recall that our customer expressed concern that we might sell their data to a competitor. We would never do that. For decades, we have been using customer data for training purposes and for improving our products. We are scrupulous about safeguarding customer data and anonymizing anything that might be used, for instance, to illustrate a point in a blog post.


How to interpret and manipulate forecast results with different forecast methods

Smart IP&O is powered by the SmartForecasts® forecasting engine that automatically selects the most appropriate method for each item.  Smart Forecast methods are listed below:

  • Simple Moving Average and Single Exponential Smoothing for flat, noisy data
  • Linear Moving Average and Double Exponential Smoothing for trending data
  • Winters Additive and Winters Multiplicative for seasonal and seasonal & trending data.

This blog explains how each model works using time plots of historical and forecast data.  It outlines how to go about choosing which model to use.   The examples below show the same history, in red, forecasted with each method, in dark green, compared to the Smart-chosen winning method, in light green.

 

Seasonality
If you want to force (or prevent) seasonality to show in the forecast, then restrict the chosen methods to (or remove) the Winters models.  Both Winters methods require 2 full years of history.

Winters multiplicative will determine the size of the peaks or valleys of seasonal effects based on a percentage difference from a trending average volume.  It is not a good fit for very low volume items due to division by zero when determining that percentage. Note in the image below that the large percentage drop in seasonal demand in the history is being projected to continue over the forecast horizon, making it look like there isn’t any seasonal demand despite using a seasonal method.

 

Statistical forecast produced with Winters multiplicative method.

 

Winters additive will determine the size of the peaks or valleys of seasonal effects based on a unit difference from the average volume.  It is not a good fit if there’s significant trend in the data.  Note in the image below that seasonality is now being forecasted based on the average unit change in seasonality. So, the forecast still clearly reflects the seasonal pattern despite the downtrend in both the level and seasonal peaks/valleys.

Statistical forecast produced with Winters additive method.
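
The difference between the two seasonal approaches can be seen in a simplified calculation of seasonal indices. This sketch is not a full Winters model: it ignores trend and smoothing and only contrasts a unit difference (additive) with a ratio (multiplicative). The function name and data are hypothetical.

```python
def seasonal_indices(history, season_length):
    """Return (additive, multiplicative) seasonal indices.

    Simplified: compares each season's mean to the overall mean and
    ignores trend, unlike a full Winters model.
    """
    overall_mean = sum(history) / len(history)
    additive, multiplicative = [], []
    for season in range(season_length):
        values = history[season::season_length]
        season_mean = sum(values) / len(values)
        additive.append(season_mean - overall_mean)
        # The ratio is undefined when average volume is zero.
        multiplicative.append(
            season_mean / overall_mean if overall_mean else float("nan")
        )
    return additive, multiplicative


# Two full years of hypothetical quarterly demand
quarterly = [10, 20, 30, 20, 14, 24, 34, 24]
additive, multiplicative = seasonal_indices(quarterly, 4)
print(additive)        # unit differences from the average
print(multiplicative)  # ratios to the average
```

With an all-zero history the ratio cannot be computed, which is the division-by-zero concern for very low volume items.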

 

Trend

If you want to force (or prevent) trend up or down to show in the forecast, then restrict the chosen methods to (or remove the methods of) Linear Moving Average and Double Exponential Smoothing.

Double exponential smoothing will pick up on a long-term trend.  It is not a good fit if there are few historical data points.

Statistical forecast produced with Double Exponential Smoothing.

 

Linear moving average will pick up on nearer term trends.  It is not a good fit for highly volatile data.

Statistical forecast produced with Linear Moving Average.

 

Non-Trending and Non-Seasonal Data
If you want to force (or prevent) an average from showing in the forecast, then restrict the chosen methods to (or remove the methods of) Simple Moving Average and Single Exponential Smoothing.

Single exponential smoothing will weigh the most recent data more heavily and produce a flat-line forecast.  It is not a good fit for trending or seasonal data.

Statistical forecast using Single Exponential Smoothing.

Simple moving average will find an average for each period, sometimes appearing to wiggle, and is better for longer-term averaging.  It is not a good fit for trending or seasonal data.

Statistical forecast using Simple Moving Average.
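
For readers who want to see the arithmetic, here is a minimal sketch of the two averaging methods. The smoothing weight `alpha`, the `window` size, and the choice to feed each moving-average forecast back into the next average (one possible explanation for the "wiggle") are assumptions for illustration, not a description of the SmartForecasts implementation.

```python
def single_exponential_smoothing(history, alpha=0.3, horizon=3):
    """Flat-line forecast: the last smoothed level repeated over the horizon."""
    level = history[0]
    for value in history[1:]:
        # Recent data gets weight alpha; the prior level gets the rest.
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


def simple_moving_average(history, window=3, horizon=3):
    """Average the last `window` points, feeding each forecast back in."""
    series = list(history)
    forecasts = []
    for _ in range(horizon):
        next_value = sum(series[-window:]) / window
        forecasts.append(next_value)
        series.append(next_value)
    return forecasts


print(single_exponential_smoothing([10, 20], alpha=0.5))
print(simple_moving_average([3, 6, 9]))
```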


What to do when a statistical forecast doesn’t make sense

Sometimes a statistical forecast just doesn’t make sense.  Every forecaster has been there.  They may double-check that the data was input correctly or review the model settings but are still left scratching their head over why the forecast looks very unlike the demand history.   When the occasional forecast doesn’t make sense, it can erode confidence in the entire statistical forecasting process.

This blog will help a layman understand what the Smart statistical models are and how they are chosen automatically.  It will address how that choice sometimes fails, how you can know if it did, and what you can do to ensure that the forecasts can always be justified.  It’s important to know what to expect and how to catch the exceptions so you can rely on your forecasting system.

 

How methods are chosen automatically

The criterion for automatically choosing one statistical method out of a set is which method came closest to correctly predicting held-out history.  Earlier history is passed to each method and the result is compared to the held-out actuals to find the method that came closest overall.  That automatically chosen method is then fed all the history to produce the forecast. Check out this blog to learn more about the model selection: https://smartcorp.com/uncategorized/statistical-forecasting-how-automatic-method-selection-works/
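
The hold-out idea can be sketched as follows. The three candidate methods are simple stand-ins chosen for brevity, and mean absolute error is an assumed measure of "closest"; neither is claimed to match what Smart uses internally.

```python
def mean_method(history, horizon):
    return [sum(history) / len(history)] * horizon


def last_value_method(history, horizon):
    return [history[-1]] * horizon


def drift_method(history, horizon):
    """Extend the average period-over-period change (a simple trend)."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (step + 1) for step in range(horizon)]


METHODS = {"mean": mean_method, "last value": last_value_method, "drift": drift_method}


def choose_method(history, holdout=3):
    """Pick the method with the lowest mean absolute error on held-out history."""
    train, test = history[:-holdout], history[-holdout:]
    errors = {}
    for name, method in METHODS.items():
        predicted = method(train, holdout)
        errors[name] = sum(abs(p - a) for p, a in zip(predicted, test)) / holdout
    best = min(errors, key=errors.get)
    # The winner is then refit on all of the history to produce the forecast.
    return best, METHODS[best](history, holdout)


best, forecast = choose_method([10, 12, 14, 16, 18, 20, 22, 24])
print(best, forecast)
```

On this steadily rising series the trend-following stand-in wins the hold-out test, then forecasts from the full history.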

For most time series, this process can capture trends, seasonality, and average volume accurately. But sometimes a chosen method comes mathematically closest to predicting the held-out history but doesn’t project it forward in a way that makes sense.  That means the system-selected method isn’t always the best one, particularly for some “hard to forecast” items.

 

Hard to forecast items

Hard to forecast items may have large, unpredictable spikes in demand, or typically no demand but random irregular blips, or unusual recent activity.  Noise in the data sometimes randomly wanders up or down, and the automated best-pick method might forecast a runaway trend or a grind into zero.  It will do worse than common sense for a small percentage of any reasonably varied group of items.  So, you will need to identify these cases and respond by overriding the forecast or changing the forecast inputs.

 

How to find the exceptions

Best practice is to filter or sort the forecasted items to identify those where the sum of the forecast over the next year is significantly different than the corresponding history last year.  The forecast sum may be much lower than the history or vice versa.  Use supplied metrics to identify these items; then you can choose to apply overrides to the forecast or modify the forecast settings.
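
Outside the application, the same filter could be sketched like this. The ratio thresholds `low` and `high` and the item names are hypothetical; in practice you would tune the cutoffs to your own data or use the supplied metrics.

```python
def flag_exceptions(items, low=0.5, high=2.0):
    """Return item IDs whose 12-period forecast total is far from last year's.

    `low` and `high` are hypothetical ratio thresholds.
    """
    flagged = []
    for item_id, (history, forecast) in items.items():
        last_year = sum(history[-12:])
        next_year = sum(forecast[:12])
        if last_year == 0:
            # No ratio can be computed; flag only if demand is forecast.
            if next_year > 0:
                flagged.append(item_id)
            continue
        ratio = next_year / last_year
        if ratio < low or ratio > high:
            flagged.append(item_id)
    return flagged


items = {
    "steady": ([10] * 12, [11] * 12),
    "runaway": ([10] * 12, [40] * 12),
    "grind-to-zero": ([10] * 12, [1] * 12),
}
print(flag_exceptions(items))  # ['runaway', 'grind-to-zero']
```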

 

How to fix the exceptions

Often when the forecast seems odd, an averaging method, like Single Exponential Smoothing or even a simple average using Freestyle, will produce a more reasonable forecast.  If trend is possibly valid, you can remove only seasonal methods to avoid a falsely seasonal result.  Or do the opposite and use only seasonal methods if seasonality is expected but wasn’t projected in the default forecast.  You can use the what-if features to create any number of forecasts, evaluate & compare, and continue to fine tune the settings until you are comfortable with the forecast.

Cleaning the history, with or without changing the automatic method selection, is also effective at producing reasonable forecasts. You can embed forecast parameters to reduce the amount of history used to forecast those items or the number of periods passed into the algorithm so earlier, outdated history is no longer considered.  You can edit spikes or drops in the demand history that are known anomalies so they don’t influence the outcome.  You can also work with the Smart team to implement automatic outlier detection and removal so that data prior to being forecasted is already cleansed of these anomalies.

If the demand is truly intermittent, it is going to be nearly impossible to forecast “accurately” per period. If a level-loading average is not acceptable, handling the item by setting inventory policy with a lead time forecast can be effective.  Alternatively, you may choose to use “same as last year” models which, while not especially accurate, will generally be accepted by the business given the alternative forecasts.

Finally, if the item was introduced so recently that the algorithms do not have enough input to accurately forecast, a simple average or manual forecast may be best.  You can identify new items by filtering on the number of historical periods.

 

Manual selection of methods

Once you have identified rows where the forecast doesn’t make sense to the human eye, you can choose a smaller subset of all methods to allow into the forecast run and compare to history.  Smart will allow you to use a restricted set of methods just for one forecast run or embed the restricted set to use for all forecast runs going forward. Different methods will project the history into the future in different ways.  Having a sense of how each works will help you choose which to allow.

 

Rely on your forecasting tool

The more you use Smart period over period to embed your decisions about how to forecast and what historical data to consider, the less often you will face exceptions as described in this blog.  Entering forecast parameters is a manageable task when starting with critical or high impact items.  Even if you don’t embed any manual decisions on forecast methods, the forecast re-runs every period with new data. So, an item with an odd result today can become easily forecastable in time.

 

 

Implementing Demand Planning and Inventory Optimization Software with the Right Data

Data verification and validation are essential to the success of the implementation of software that performs statistical analysis of data, like Smart IP&O.  This article describes the issue and serves as a practical guide to doing the job right, especially for the user of the new application.

The less experience your organization has in validating historical transactions or item master attributes, the more likely it is there were problems or mistakes with data entry into the ERP that have so far gone unnoticed.  The garbage in, garbage out rule means you need to prioritize this step of the software onboarding process or risk delay and possible failure to generate ROI.

Ultimately the best person to confirm data in your ERP is entered correctly is the person who knows the business and can assert, for example, “this part doesn’t belong to that product group.”  That’s usually the same person who will open and use Smart. Though a database administrator or IT support can also play a key role by being able to say, “This part was assigned to that product group last December by Jane Smith.” Ensuring data is correct may not be a regular part of your day job but can be broken down into manageable small tasks that a good project manager will allocate the time and resources to complete.

The demand planning software vendor receiving the data also has a role.  They will confirm that the raw data was ingested without issue. The vendor can also identify abnormalities in the raw data files that point to the need for validation.  But relying on the software vendor to reassure you the data looks fine is not enough.  You don’t want to discover, after go-live, that you can’t trust the output because some of the data “doesn’t make sense.”

Each step in the data flow needs verification and validation.  Verification means the data at one step is still the same after flowing to the next step.  Validation means the data is correct and usable for analysis.

The most common data flow looks like this:

(Diagram: ERP master data -> interfacing files -> mirrored database in the cloud -> application)

Less commonly, the first step between ERP master data and the interfacing files can sometimes be bypassed, where files are not used as an interface.  Instead, an API built by IT or the inventory optimization software vendor is responsible for data to be written directly from the ERP to the mirrored database in the cloud.  The vendor would work with IT to confirm the API is working as expected.  But the first validation step, even in that case, can still be performed.  After ingesting the data, the vendor can make the mirrored data available in files for the DBA/IT verification and business validation.

The confirmation that the mirrored data in the cloud completes the flow into the application is the responsibility of the vendor of software as a service.  SaaS vendors continually test that the software works correctly between the front-end application their subscribers see and the back-end data in the cloud database. If the subscribers still think the data doesn’t make sense in the application even after validating the interfacing files before going live, that is an issue to raise with the vendor’s customer support.

However the interfacing files are obtained, the largest part of verification and validation falls to the project manager and their team.  They must resource a test of the interfacing files to confirm:

  1. They match the data in the ERP, and all of the ERP data needed by the application, and only that data, was extracted.
  2. Nothing “jumps out” to the business as incorrect for each of the types of information in the data
  3. They are formatted as expected.

 

DBA/IT Verification Tasks

  1. Test the extract:

IT’s verification step can be done with various tools, by comparing files or by importing files back to the database as temporary tables and joining them with the original data to confirm a match.  IT can depend on a query to pull the requested data into a file, but that file can fail to match. The existence of delimiters or line returns within the data values can cause a file to be different than its original database table.  This is because the file relies heavily on delimiters and line returns to identify fields and records, while the table doesn’t rely on those characters to define its structure.

  2. No bad characters:

Free form data entry fields in the ERP, such as product descriptions, can sometimes themselves contain line returns, tabs, commas, and/or double quotes that can affect the structure of the output file.  Line returns should not be allowed in values that will be extracted to a file.  Characters equal to the delimiter should be stripped during extract or else a different delimiter used.

Tip: if commas are the file delimiter, numbers greater than 999 can’t be extracted with a comma. Use “1000” rather than “1,000”.
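
A small sketch shows how a single free-form value can break a comma-delimited record, and how stripping the offending characters during extract restores it. The helper names `clean_value` and `to_line` and the sample record are hypothetical.

```python
def clean_value(value, delimiter=","):
    """Replace line returns and delimiter characters with spaces."""
    for bad in ("\r\n", "\r", "\n", delimiter):
        value = value.replace(bad, " ")
    return value


def to_line(values, delimiter=","):
    """Join cleaned values into one delimited record."""
    return delimiter.join(clean_value(v, delimiter) for v in values)


record = ["P-100", "Bracket, steel\nleft-hand", "1000"]

broken = ",".join(record)  # raw join: the description splits the record
fixed = to_line(record)    # one physical line with exactly three fields

print(len(broken.splitlines()), len(fixed.splitlines()))  # 2 1
print(fixed.split(","))
```

Note that the quantity is written as "1000", so it does not introduce an extra comma.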

  3. Confirm the filters:

The other way that query extracts can return unexpected results is if conditions on the query are entered incorrectly.  The simplest way to avoid mistaken “where clauses” is to not use them.  Extract all data and allow the vendor to filter out some records according to rules supplied by the business.  If this will produce extract files so large that too much computing time is spent on the data exchange, the DBA/IT team should meet with the business to confirm exactly what filters on the data can be applied to avoid exchanging records that are meaningless to the application.

Tip: Bear in mind that Active/Inactive or item lifecycle information should not be used to filter out records.  This information should be sent to the application so it knows when an item becomes inactive.

  4. Be consistent:

The extract process must produce files of consistent format every time it is executed.  File names, field names and positions, the delimiter, the Excel sheet name (if Excel is used), numeric and date formats, and the use of quotes around values should never differ from one execution of the extract to the next. A hands-off report or stored procedure should be prepared and used for every execution of the extract.
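
A consistency check of this kind can be automated before each file is sent. The sketch below compares the header and field count of a delimited extract against an expected layout; the column names in `EXPECTED_HEADER` are hypothetical and would come from the vendor’s template.

```python
import csv
import io

EXPECTED_HEADER = ["Product", "Location", "Date", "Quantity"]  # hypothetical


def check_extract(text, expected_header, delimiter=","):
    """Return a list of problems found in one delimited extract."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ["file is empty"]
    problems = []
    if rows[0] != expected_header:
        problems.append(f"unexpected header: {rows[0]}")
    for line_number, row in enumerate(rows[1:], start=2):
        if len(row) != len(expected_header):
            problems.append(f"line {line_number}: {len(row)} fields")
    return problems


good = "Product,Location,Date,Quantity\nP1,L1,2023-01-02,5\n"
bad = "Product,Location,Qty,Date\nP1,L1,5\n"
print(check_extract(good, EXPECTED_HEADER))  # []
print(check_extract(bad, EXPECTED_HEADER))
```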

 

Business Validation Background

Below is a breakdown of each validation step into considerations, specifically for the case where the vendor has provided a template format for the interfacing files and each type of information is provided in its own file.  Files sent from your ERP to Smart are formatted for easy export from the ERP.  That sort of format makes the comparison back to the ERP a relatively simple job for IT, but it can be harder for the business to interpret.  Best practice is to manipulate the ERP data using pivot tables or similar tools in a spreadsheet.  IT may assist by providing re-formatted data files for review by the business.

To delve into the interfacing files, you’ll need to understand them.  The vendor will supply a precise template, but generally interfacing files fall into three types: catalog data, item attributes, and transactional data.

  • Catalog data contains identifiers and their attributes. Identifiers are typically for products, locations (which could be plants or warehouses), your customers, and your suppliers.
  • Item attributes contain information about products at locations that are needed for analysis on the product and location combination. Such as:
    • Current replenishment policy in the form of a Min and Max, Reorder Point, or Review Period and Order Up To value, or Safety Stock
    • Primary supplier assignment and nominal lead time and cost per unit from that supplier
    • Order quantity requirements such as minimum order quantity, manufacturing lot size, or order multiples
    • Active/Inactive status of the product/location combination or flags that identify its state in its lifecycle, such as pre-obsolete
    • Attributes for grouping or filtering, such as assigned buyer/planner or product category
    • Current inventory information like on hand, on order, and in transit quantities.
  • Transactional data contains references to identifiers along with dates and quantities. Such as quantity sold in a sales order of a product, at a location, for a customer, on a date.  Or quantity placed on purchase order of a product, into a location, from a supplier, on a date. Or quantity used in a work order of a component product at a location on a date.

 

Validating Catalog Data

Considering catalog data first, you may have catalog files similar to these examples:

(Image: example Product catalog file, with attributes such as Status and Group)

Location Identifier | Description     | Region | Source Location | etc…
Location1           | First location  | North  |                 |
Location2           | Second location | South  | Location1       |
Location3           | Third location  | South  | Location1       |
…etc…

 

Customer Identifier | Description     | SalesPerson | Ship From Location | etc…
Customer1           | First customer  | Jane        | Location1          |
Customer2           | Second customer | Jane        | Location3          |
Customer3           | Third customer  | Joe         | Location2          |
…etc…

 

Supplier Identifier | Description     | Status | Typical Lead Time Days | etc…
Supplier1           | First supplier  | Active | 18                     |
Supplier2           | Second supplier | Active | 60                     |
Supplier3           | Third supplier  | Active | 5                      |
…etc…

 

1: Check for a reasonable count of catalog records

For each file of catalog data, open it in a spreadsheet tool like Google Sheets or MS Excel. Answer these questions:

  1. Is the record count in the ballpark? If you have about 50K products, there should not be only 10K rows in its file.
  2. If it’s a short file, maybe the Location file, you can confirm exactly that all expected identifiers are in it.
  3. Filter by each attribute value and confirm again the count of records with that attribute value makes sense.
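
If a file is too large to filter comfortably in a spreadsheet, the same counts can be produced with a few lines of Python. The column names and sample rows below are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical product catalog extract
catalog_text = """Product,Description,Status,Group
P1,First product,Active,Group1
P2,Second product,Active,Group2
P3,Third product,Inactive,Group1
"""


def attribute_counts(text, attribute):
    """Count catalog records for each value of one attribute column."""
    reader = csv.DictReader(io.StringIO(text))
    return Counter(row[attribute] for row in reader)


print(attribute_counts(catalog_text, "Status"))
print(attribute_counts(catalog_text, "Group"))
```

The total of the counts gives the record count, and each individual count can be compared with what the business expects for that attribute value.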

2: Check the correctness of values in each attribute field

Someone who knows what the products are and what the groups mean needs to take the time to confirm it is actually right, for all the attributes of all the catalog data.

So, if your Product file contains the attributes as in the example above, you would filter for Status of Active, and check that all resulting products are actually active.  Then filter for Status of Inactive and check that all resulting products are actually inactive.  Then filter for the first Group value and confirm all resulting products are in that group.  Repeat for Group2 and Group3, etc.  Then repeat for every attribute in every file.

It can help to do this validation with a comparison to an already existing and trusted report.  If you have another spreadsheet that shows products by Group for any reason, you can compare the interfacing files to it.  You may need to familiarize yourself with the VLOOKUP function that helps with spreadsheet comparison.

Validating Item Attribute Data

1: Check for a reasonable count of item records

The item attribute data confirmation is similar to the catalog data.  Confirm the product/location combination count makes sense in total and for each of the unique item attributes, one by one. This is an example item data file:

(Image: example item data file)

2: Find and explain weird numbers in item file

There tend to be many numerical values in the item attributes, so “weird” numbers merit review.  To validate data for a numerical attribute in any file, search for where the number is:

  • Missing entirely
  • Equal to zero
  • Less than zero
  • More than most others, or less than most others (sort by that column)
  • Not a number at all, when it should be
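
The checks in this list can be sketched as a single pass over one column. The `outlier_factor` cutoff (relative to the median of the positive values) is a hypothetical rule of thumb, and the lead time values are made up.

```python
from statistics import median


def parse_number(raw):
    """Return a float, or None when the value is missing or not numeric."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None


def find_weird(raw_values, outlier_factor=10):
    """Map row index to the reason a value deserves review.

    `outlier_factor` is a hypothetical cutoff relative to the median.
    """
    positives = [n for n in map(parse_number, raw_values) if n is not None and n > 0]
    typical = median(positives) if positives else None
    weird = {}
    for index, raw in enumerate(raw_values):
        number = parse_number(raw)
        if raw is None or str(raw).strip() == "":
            weird[index] = "missing"
        elif number is None:
            weird[index] = "not a number"
        elif number == 0:
            weird[index] = "zero"
        elif number < 0:
            weird[index] = "negative"
        elif typical and (number > typical * outlier_factor
                          or number < typical / outlier_factor):
            weird[index] = "far from typical"
    return weird


lead_times = ["14", "21", "", "0", "-5", "N/A", "900", "18"]
print(find_weird(lead_times))
```

A flagged value is not necessarily wrong; it is a prompt for someone who knows the business to explain it.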

A special consideration of files that are not catalog files is they may not show the descriptions of the products and locations, just their identifiers, which can be meaningless to you.  You can insert columns to hold the product and location descriptors that you are used to seeing and fill them into the spreadsheet to assist in your work.  The VLOOKUP function works for this as well.  Whether or not you have another report to compare the Items file to, you have the catalog files for Products and Locations, which show both the identifier and the description for each row.

3: Spot check

If you are frustrated to find that there are too many attribute values to manually check in a reasonable amount of time, spot checking is a solution. It can be done in a manner likely to pick up on any problems.  For each attribute, get a list of the unique values in each column.  You can copy a column into a new sheet, then use the Remove Duplicates function to see the list of possible values.   With it:

  1. Confirm that no attribute values are present that shouldn’t be.
  2. It can be harder to remember which attribute values are missing that should be there, so it can help to look at another source to remind you. For example, if Group1 through Group12 are present, you might check another source to remember if these are all the Groups possible.  Even if it is not required for the interfacing files for the application, it may be easy for IT to extract a list of all the possible Groups that are in your ERP which you can use for the validation exercise.  If you find extra or missing values that you don’t expect, bring an example of each to IT to investigate.
  3. Sort alphabetically and scan down to see if any two values are similar but slightly different, maybe only in punctuation, which could mean one record had the attribute data entered incorrectly.
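
The three checks above can also be scripted. The following is a minimal sketch, assuming the attribute column has already been read into a list of strings. The function names and the sample Group values are hypothetical, and treating values as "similar" when they match after removing case, spaces, and punctuation is just one possible rule.

```python
import re
from collections import defaultdict

def unique_values(column):
    """Equivalent of Remove Duplicates: the sorted set of distinct values."""
    return sorted(set(column))

def compare_to_reference(column, reference):
    """Return (values present but not expected, values expected but missing)."""
    present, expected = set(column), set(reference)
    return sorted(present - expected), sorted(expected - present)

def near_duplicates(column):
    """Group values that differ only in case, spacing, or punctuation."""
    groups = defaultdict(set)
    for value in column:
        normalized = re.sub(r"[^a-z0-9]", "", value.lower())
        groups[normalized].add(value)
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]

# Hypothetical attribute column and a reference list extracted from the ERP.
group_column = ["Group1", "Group 1", "Group2", "group-2", "Group3", "Group1"]
erp_groups = ["Group1", "Group2", "Group3", "Group4"]

print(unique_values(group_column))
print(compare_to_reference(group_column, erp_groups))
print(near_duplicates(group_column))
```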

For each type of item, maybe one from each product group and/or location, check that all its attributes in every file are correct or at least pass a sanity check.  The more you can spot check from a broad range of items, the less likely you will have issues post go live.

 

Validating Transactional Data

Transactional files may all have a format similar to this:

[Image: example layout of a transactional file]

 

1: Find and explain weird numbers in each transactional file

Check each transactional file for “weird” numbers in the Quantity field. Then you can proceed to:

  1. Filter for dates outside the range you expect, and look for expected dates that are missing entirely.
  2. Find where Transaction identifiers and line numbers are missing. They shouldn’t be.
  3. If there is more than one record for a given Transaction ID and Transaction Line Number combination, is that a mistake? Put another way, should duplicate records have their quantities summed together or is that double counting?
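
As a rough sketch, the three checks above could be run over rows that have already been parsed from the file. The field names (`transaction_id`, `line_number`, `date`, `quantity`), the sample rows, and the expected date range are all assumptions for illustration. The script only lists the suspect rows. Whether a duplicate is a mistake or a legitimate split that should be summed is still a question for you and IT.

```python
from collections import Counter
from datetime import date

def is_blank(value):
    return value is None or str(value).strip() == ""

def check_transactions(rows, earliest, latest):
    """Return row indexes with bad dates, missing identifiers, or duplicate keys."""
    findings = {"bad_date": [], "missing_id": [], "duplicate_key": []}
    # Count each Transaction ID + Line Number combination that is fully populated.
    key_counts = Counter(
        (r["transaction_id"], r["line_number"])
        for r in rows
        if not is_blank(r["transaction_id"]) and not is_blank(r["line_number"])
    )
    for index, r in enumerate(rows):
        when = r["date"]
        if when is None or not (earliest <= when <= latest):
            findings["bad_date"].append(index)
        if is_blank(r["transaction_id"]) or is_blank(r["line_number"]):
            findings["missing_id"].append(index)
        elif key_counts[(r["transaction_id"], r["line_number"])] > 1:
            findings["duplicate_key"].append(index)
    return findings

# Hypothetical rows.
transactions = [
    {"transaction_id": "T1", "line_number": 1, "date": date(2022, 3, 1), "quantity": 5},
    {"transaction_id": "T1", "line_number": 1, "date": date(2022, 3, 1), "quantity": 5},
    {"transaction_id": "T2", "line_number": None, "date": date(2022, 4, 2), "quantity": 3},
    {"transaction_id": "T3", "line_number": 1, "date": date(1900, 1, 1), "quantity": 7},
    {"transaction_id": "T4", "line_number": 1, "date": None, "quantity": 2},
]

print(check_transactions(transactions, date(2019, 1, 1), date(2022, 12, 31)))
```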

2: Sanity check summed quantities

Do a sanity check by filtering to a particular product you’re familiar with and to a relatable date range, such as last month or last year, and then sum the quantities. Is that total what you expected for that product in that time frame? If you have information on total usage out of a location, you can slice the data that way to sum the quantities and compare to what you expect. Pivot tables come in handy for verification of transactional data. With them, you can view the data like this:

Product    Year    Quantity Total
Prod1      2022    9,034
Prod1      2021    8,837
etc    

 

A product’s yearly totals may be simple to sanity check if you know the products well. Or you can use VLOOKUP to add attributes, such as product group, and pivot on those to see a higher level that is more familiar:

Product Group    Year    Quantity Total
Group1           2022    22,091
Group2           2021    17,494
etc    
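
If the file is too large for a spreadsheet, the same pivots can be approximated in Python. The sketch below uses made-up rows and a hypothetical product-to-group mapping standing in for the VLOOKUP step. The quantities are illustrative only and are unrelated to the figures in the tables above.

```python
from collections import defaultdict

def pivot_sum(rows, key_fields, value_field="quantity"):
    """Sum `value_field` for each combination of `key_fields`, like a pivot table."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[field] for field in key_fields)] += row[value_field]
    return dict(totals)

# Hypothetical transactional rows, with the year already taken from the date.
sales = [
    {"product": "Prod1", "year": 2022, "quantity": 40},
    {"product": "Prod1", "year": 2022, "quantity": 60},
    {"product": "Prod2", "year": 2022, "quantity": 25},
    {"product": "Prod1", "year": 2021, "quantity": 90},
]

# Hypothetical mapping taken from the Products catalog.
product_groups = {"Prod1": "Group1", "Prod2": "Group1"}
sales_with_group = [{**row, "group": product_groups[row["product"]]} for row in sales]

by_product = pivot_sum(sales, ["product", "year"])
by_group = pivot_sum(sales_with_group, ["group", "year"])
print(by_product)
print(by_group)
```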

 

3: Sanity check count of records

It may help to display a count of transactions rather than a sum of the quantities, especially for purchase order data. For example:

Product    Year    Number of POs
Prod1      2022    4
Prod1      2021    1
etc    

 

You can also apply the same summarization at a higher level, for example:

Product Group    Year    Number of POs
Group1           2022    609
Group2           2021    40
etc    
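
A count can be scripted in the same spirit. This sketch assumes a hypothetical `po_number` field and counts distinct purchase orders rather than rows, so that a PO with several lines is counted once. If your file has one row per PO, a plain row count gives the same answer.

```python
from collections import defaultdict

def count_orders(rows, key_fields, id_field="po_number"):
    """Count distinct order identifiers for each combination of `key_fields`."""
    seen = defaultdict(set)
    for row in rows:
        seen[tuple(row[field] for field in key_fields)].add(row[id_field])
    return {key: len(ids) for key, ids in seen.items()}

# Hypothetical purchase order lines; PO-1 has two lines for the same product.
po_lines = [
    {"po_number": "PO-1", "product": "Prod1", "year": 2022, "quantity": 10},
    {"po_number": "PO-1", "product": "Prod1", "year": 2022, "quantity": 5},
    {"po_number": "PO-2", "product": "Prod1", "year": 2022, "quantity": 20},
    {"po_number": "PO-3", "product": "Prod1", "year": 2021, "quantity": 8},
]

po_counts = count_orders(po_lines, ["product", "year"])
print(po_counts)
```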

 

4: Spot checking

Spot checking the correctness of a single transaction, for each type of item and each type of transaction, completes due diligence. Pay special attention to which date is tied to the transaction, and whether it is the right one for the analysis. The date may be a creation date, like the date a customer placed an order with you. It may be a promise date, like the date you expected to deliver on the customer’s order at the time of creating it. Or it may be a fulfilment date, when you actually delivered on the order. Sometimes a promise date gets modified days after the order is created if it can’t be met. Make sure the date in use is the one that most closely reflects the customer’s actual demand for the product.
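
To see why the choice of date matters, consider a single hypothetical order summed by month under each of the three dates. The field names and dates below are invented for illustration.

```python
from collections import defaultdict
from datetime import date

def monthly_totals(orders, date_field):
    """Sum quantities per (year, month) using the chosen date field."""
    totals = defaultdict(int)
    for order in orders:
        when = order[date_field]
        totals[(when.year, when.month)] += order["quantity"]
    return dict(totals)

# One hypothetical order: placed in January, promised for February, delivered in March.
orders = [
    {
        "created": date(2022, 1, 28),
        "promised": date(2022, 2, 4),
        "fulfilled": date(2022, 3, 2),
        "quantity": 10,
    },
]

for field in ("created", "promised", "fulfilled"):
    print(field, monthly_totals(orders, field))
```

The same 10 units land in January, February, or March depending on the date field, so the demand history that the application sees shifts accordingly.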

What to do about bad data 

If the mis-entries are few or one-off, you can edit the ERP records by hand as they are found, cleaning up your catalog attributes, even after go-live with the application.  But if large swathes of attributes or transaction quantities are off, this can spur an internal project to re-enter data correctly and possibly to change or start to document the process that needs to be followed when new records are entered into your ERP.

Care must be taken to avoid too long a delay in implementation of the SaaS application while waiting on clean attributes.  Break the work into chunks and use the application to analyze the clean data first so the data cleansing project occurs in parallel with getting value out of the new application.