Forecast accuracy is the percentage of demand a forecast got right when measured against actual sales. If a planner forecasts 1,000 units and the business sells 950, the forecast is 95% accurate at the unit level. A higher percentage means a closer match, and a closer match means less safety stock, fewer stockouts, and less wasted capacity.
The simple-sounding definition hides a sharp question that trips up most planning teams: accurate at what level, over what time bucket, and using which formula? A forecast that looks 95% accurate at the national, monthly level can be 60% accurate at the SKU-location-weekly level where the actual replenishment decisions are made. The same dataset can produce very different "accuracy" numbers depending on the math chosen.
This page covers the four formulas planners actually use, where each one breaks down, and which one to pick depending on what decision the number will drive.
Horizon tracks MAPE, WMAPE, MAD, and bias at every level of the forecast hierarchy: SKU, SKU-location, customer-SKU, category, and total. The same forecast is automatically scored at every level, so the planner can see where aggregate accuracy hides SKU-level problems.
Horizon also reports Forecast Value Add (FVA) on top of accuracy. FVA compares each layer of the forecasting process (naive forecast vs statistical baseline vs ML model vs sales overlay) to show which steps are improving accuracy and which are destroying it. Many planning teams discover their sales-overlay layer is reducing accuracy, and FVA makes that visible in numbers, not opinions.
Accuracy is computed automatically on every forecast cycle. There is no separate scorecard to maintain.
Forecast accuracy is not a planning vanity metric. It is the variable that determines safety stock investment, customer service levels, and how much capacity sits idle. A 5 percentage-point improvement in SKU-weekly MAPE typically reduces safety stock by 8-15% for mid-volume items, because safety stock formulas are built on top of forecast error variance.
The downstream effect compounds. Higher accuracy lets the supply plan commit to firmer production schedules, which lets procurement negotiate better terms on raw materials, which lowers cost of goods. Lower accuracy forces the opposite: bloated safety stock, expedited freight, and last-minute scheduling chaos.
The trap most teams fall into is measuring accuracy at a level too aggregated to be useful. National monthly accuracy of 92% feels great in a leadership review. The same forecast at the SKU-DC-weekly level is what production scheduling actually consumes, and that number is often 30-40 points lower. Measure at the level where the decision is made.
There are four widely-used formulas. Each answers a slightly different question.
MAPE is the most common formula in industry. It is the average of the absolute percentage error across all forecasted periods.
Formula: MAPE = (1/n) × Σ (|Actual − Forecast| / |Actual|) × 100
Worked example: Forecast 1,000 / Actual 950 = 5.3% error. Forecast 200 / Actual 280 = 28.6% error. Forecast 50 / Actual 30 = 66.7% error. MAPE = (5.3 + 28.6 + 66.7) / 3 = 33.5%.
Strength: Easy to explain. Comparable across SKUs of different volume.
Weakness: Explodes on low-volume SKUs (a 20-unit error on a 30-unit SKU is a 66.7% error). Undefined when actuals are zero.
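As a quick sketch, here is the MAPE calculation on the three SKUs from the worked example, dividing by actual as the formula specifies. The names `pairs` and `mape` are illustrative, not from any particular library:

```python
# (forecast, actual) pairs for the three SKUs in the worked example
pairs = [(1000, 950), (200, 280), (50, 30)]

def mape(pairs):
    """Mean absolute percentage error; skips zero-actual periods,
    where the metric is undefined."""
    errors = [abs(actual - fcst) / abs(actual)
              for fcst, actual in pairs if actual != 0]
    return 100 * sum(errors) / len(errors)

print(round(mape(pairs), 1))  # → 33.5
```

Note the guard for zero actuals: in practice those periods must be excluded or handled separately, which is exactly the weakness called out above.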
WMAPE weights each SKU's error by its actual volume, so high-volume items dominate the metric. This is what most enterprise APS tools report by default.
Formula: WMAPE = (Σ |Actual − Forecast| / Σ |Actual|) × 100
Worked example (same data): Total absolute error = 50 + 80 + 20 = 150. Total actual = 950 + 280 + 30 = 1,260. WMAPE = 150 / 1,260 = 11.9%. Notice this is roughly a third of the MAPE, because the high-volume SKU was forecasted well and now dominates the metric.
Strength: Reflects the business impact (a small % error on a million-unit SKU matters more than a large % error on a 30-unit SKU).
Weakness: Hides bad performance on low-volume long-tail SKUs.
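The same three SKUs run through WMAPE make the volume weighting visible. A minimal sketch (function and variable names are illustrative):

```python
# (forecast, actual) pairs for the three SKUs in the worked example
pairs = [(1000, 950), (200, 280), (50, 30)]

def wmape(pairs):
    """Volume-weighted MAPE: total absolute error over total actuals."""
    total_error = sum(abs(actual - fcst) for fcst, actual in pairs)
    total_actual = sum(abs(actual) for fcst, actual in pairs)
    return 100 * total_error / total_actual

print(round(wmape(pairs), 1))  # → 11.9
```

Because the denominator is total volume rather than a per-SKU average, the 950-unit SKU's small error pulls the number down, which is both the strength and the weakness described above.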
MAD is the average absolute error expressed in units, not percent. It is the input for most safety stock formulas.
Formula: MAD = (1/n) × Σ |Actual − Forecast|
Worked example: MAD = (50 + 80 + 20) / 3 = 50 units.
Strength: Direct input into safety stock calculation (Safety Stock ≈ 1.25 × MAD × service factor × √lead time).
Weakness: Not comparable across SKUs of different scale.
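MAD and its use in the safety stock approximation above can be sketched together. The service factor (z = 1.65, roughly a 95% service level) and the 4-week lead time are assumed example inputs, not values from the text:

```python
import math

# (forecast, actual) pairs for the three SKUs in the worked example
pairs = [(1000, 950), (200, 280), (50, 30)]

def mad(pairs):
    """Mean absolute deviation, in units."""
    return sum(abs(actual - fcst) for fcst, actual in pairs) / len(pairs)

def safety_stock(mad_units, service_factor, lead_time_periods):
    """Illustrative safety stock using the 1.25 × MAD approximation
    of standard deviation from the text."""
    return 1.25 * mad_units * service_factor * math.sqrt(lead_time_periods)

m = mad(pairs)  # 50.0 units
print(round(safety_stock(m, 1.65, 4), 2))  # z=1.65, 4-period lead time
```

The 1.25 multiplier is the standard approximation converting MAD to a standard deviation under normally distributed errors; real safety stock formulas often work from error variance directly.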
RMSE squares the errors before averaging, then takes the square root. It penalizes large errors much more heavily than small ones.
Formula: RMSE = √[(1/n) × Σ (Actual − Forecast)²]
Worked example: RMSE = √[(50² + 80² + 20²) / 3] = √[(2,500 + 6,400 + 400) / 3] = √3,100 = 55.7 units.
Strength: Useful when large errors are disproportionately costly (e.g. perishables, capacity-constrained production).
Weakness: Hard to explain to non-statisticians. A single large miss can dominate the metric.
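The squaring behavior is easy to see in code. A minimal sketch on the same three SKUs (names are illustrative):

```python
import math

# (forecast, actual) pairs for the three SKUs in the worked example
pairs = [(1000, 950), (200, 280), (50, 30)]

def rmse(pairs):
    """Root mean squared error, in units. Squaring before averaging
    penalizes large misses disproportionately."""
    mse = sum((actual - fcst) ** 2 for fcst, actual in pairs) / len(pairs)
    return math.sqrt(mse)

print(round(rmse(pairs), 1))  # → 55.7
```

Compare 55.7 to the MAD of 50 units on the same data: the gap between the two is driven almost entirely by the single 80-unit miss, which is the "large errors dominate" property in action.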