"AI in demand planning" is a phrase used to cover many things, some genuinely transformative and some marketing-only. Five specific capabilities account for almost all the real impact: automatic model selection per SKU, machine learning forecasting on volatile SKUs, integration of external drivers, exception detection, and conversational planning assistants. The rest is mostly relabelling existing capabilities with an "AI" prefix.
This page explains each of the five capabilities, where they add measurable value, where they don't, and what to ask vendors to separate substance from marketing.
One framing point worth stating upfront: AI does not replace the planner. It changes what the planner spends time on. Without AI, planners spend 60-70% of their time on routine forecast generation and review. With well-implemented AI, that ratio inverts: planners spend 60-70% of their time on exceptions, overlays, and reconciliation, the work where human judgment actually adds value.
Horizon's AI capabilities cover the five areas above. Automatic model selection runs every cycle across a candidate ensemble (Holt-Winters variants, ARIMA, Croston for intermittent demand, gradient-boosted trees for volatile SKUs) and picks the best model per SKU. ML methods handle external drivers: promotions, calendar effects, and configurable custom signals.
The exception engine flags SKUs by accuracy drop, bias drift, large cycle-over-cycle change, and model-confidence drop, surfacing 10-20% of the portfolio per cycle for planner review. This is where most teams see the biggest practical improvement in planner experience.
The Horizon Assistant, an LLM interface, lets planners and executives query the plan conversationally. "Why did the forecast for SKU 1234 change?" returns the specific drivers (statistical baseline change, ML adjustment, overlay applied, exception flag). The assistant doesn't change the forecast; it makes the existing forecast explainable.
The honest qualifier: Horizon's NVIDIA Inception membership reflects investment in AI capability, but we don't position AI as magic. Specific algorithms, specific drivers, specific exceptions: explainable AI, not black-box AI.
Despite years of AI marketing in the supply chain category, real adoption varies enormously. Some companies have measurable accuracy improvements from AI; many have spent money on AI initiatives that produced flat results.
The pattern that distinguishes success from failure is consistent. Companies that succeed treat AI as a component within a larger demand planning process, automating specific tasks where ML genuinely outperforms statistical methods while keeping the process discipline (overlays, FVA, exception review) human-driven. Companies that fail try to replace the demand planning process with AI, expecting the model to absorb the entire workflow. The first pattern produces 5-10 percentage points of MAPE improvement and faster cycles. The second produces an expensive black box that the team doesn't trust.
The deciding factor isn't the algorithm; it's the integration of AI into the workflow. AI capabilities that surface insights without disrupting the planner's process get adopted. AI capabilities that try to replace planner judgment get bypassed.
Different SKUs have different demand structures. Some are best forecast with Holt-Winters exponential smoothing, some with ARIMA, some with gradient-boosted trees, some with intermittent demand methods. Picking the right model per SKU manually across a 5,000-SKU portfolio is impossible. AI-driven model selection runs candidate models on historical data, scores them on holdout periods, and picks the best per SKU automatically.
Real impact: Typically 3-7 percentage points of MAPE improvement across a heterogeneous portfolio versus standardising on one method. This is the most reliable AI win in demand planning and the one where the value is least disputed.
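The mechanism behind per-SKU model selection can be sketched in a few lines. This is an illustrative toy, not Horizon's implementation: the candidate models here are deliberately simple stand-ins (naive, moving average, simple exponential smoothing) for the richer ensemble the text describes, and the holdout length and sample series are made up.

```python
# Toy sketch of per-SKU model selection: fit candidate models on a
# training window, score each on a holdout window by MAPE, keep the
# winner. Real tools use richer candidates (Holt-Winters, ARIMA,
# Croston, gradient-boosted trees); the selection mechanism is the same.

def naive(train, horizon):
    # Repeat the last observed value.
    return [train[-1]] * horizon

def moving_average(train, horizon, window=3):
    avg = sum(train[-window:]) / window
    return [avg] * horizon

def ses(train, horizon, alpha=0.3):
    # Simple exponential smoothing: flat forecast at the final level.
    level = train[0]
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon

CANDIDATES = {"naive": naive, "moving_average": moving_average, "ses": ses}

def mape(actuals, forecasts):
    return sum(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)) / len(actuals) * 100

def select_model(history, holdout=4):
    # Score every candidate on the holdout period; return the best.
    train, test = history[:-holdout], history[-holdout:]
    scores = {name: mape(test, fn(train, holdout)) for name, fn in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]

history = [100, 104, 98, 110, 105, 112, 108, 115, 111, 118, 114, 120]
best, score = select_model(history)
print(best, round(score, 1))
```

Run per SKU, this replaces a manual "which method fits this item?" decision with a scored, repeatable one; the portfolio-level gain comes from thousands of such small, individually correct choices.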
For SKUs whose demand depends on external drivers (promotions, weather, price, macro indicators), gradient-boosted tree models (XGBoost, LightGBM) and neural networks typically outperform statistical methods. The reason: statistical methods treat external drivers as exogenous variables to be fit linearly, while ML methods can capture non-linear interactions. For example, a 10% price drop might produce a 20% lift on Tuesdays but a 40% lift during a holiday week.
Real impact: 5-12 percentage points of MAPE improvement on volatile SKUs with rich external data. Limited or zero impact on stable SKUs.
Beyond running ML, the practical question is which external variables to include. Modern AI-assisted demand planning tools include built-in connectors for weather data, price elasticity modelling, web traffic, and social signals. The AI does the variable selection, flagging which signals actually predict demand for which SKU categories.
Real impact: Variable. Strong for businesses with clear external drivers (seasonal products, weather-sensitive categories, promotion-heavy). Weak for B2B and project-driven businesses.
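At its simplest, automated variable selection is a screening step: test each candidate signal against demand history and keep only those that carry predictive information. The sketch below uses a plain Pearson-correlation screen on invented data; production tools use stronger tests (cross-validated lift, feature importance), and the signal names and threshold here are purely illustrative.

```python
# Minimal sketch of signal screening: keep candidate external signals
# whose correlation with demand clears a threshold. Illustrative only;
# real variable selection uses cross-validated predictive tests.
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

demand = [80, 95, 120, 140, 130, 100, 85, 90]
signals = {
    "temperature": [10, 14, 20, 26, 24, 16, 11, 12],            # tracks demand
    "web_traffic": [55, 60, 70, 75, 72, 62, 57, 58],            # tracks demand
    "fx_rate":     [1.1, 1.0, 1.2, 1.1, 1.0, 1.2, 1.1, 1.0],    # noise
}

def screen_signals(demand, signals, threshold=0.6):
    # Return only the signals whose absolute correlation clears the bar.
    return {name: round(pearson(demand, series), 2)
            for name, series in signals.items()
            if abs(pearson(demand, series)) >= threshold}

print(screen_signals(demand, signals))
```

The point of the screen is the negative result as much as the positive one: dropping uninformative signals keeps the downstream ML model from fitting noise, which matters most for the B2B and project-driven businesses where external drivers are weak.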
Reviewing 5,000 SKUs per cycle is impossible. The bottleneck is identifying which SKUs need attention. AI models trained on historical patterns can flag anomalies: SKUs where actuals are diverging from forecast, where bias is drifting, or where the model itself has lost confidence. This converts the planner's job from "review everything" to "review the flagged 10%".
Real impact: Not a direct accuracy improvement, but a 40-60% reduction in planner hours per cycle, which lets the planner spend more time on the exceptions where judgment matters.
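Exception flagging of this kind reduces, at its core, to per-SKU rules over tracked statistics. The sketch below is a hypothetical rule-based version: the metric names, thresholds, and portfolio numbers are all invented for illustration, and real engines typically learn thresholds rather than hard-coding them.

```python
# Hypothetical rule-based exception flagging per SKU: compare recent
# error and bias statistics against thresholds and collect the reasons
# each SKU was flagged. Thresholds are illustrative, not recommendations.

def flag_exceptions(sku_stats, mape_jump=5.0, bias_limit=0.10, change_limit=0.25):
    """sku_stats maps SKU -> dict with keys mape_now, mape_prev, bias, cycle_change."""
    flags = {}
    for sku, s in sku_stats.items():
        reasons = []
        if s["mape_now"] - s["mape_prev"] > mape_jump:
            reasons.append("accuracy drop")
        if abs(s["bias"]) > bias_limit:
            reasons.append("bias drift")
        if abs(s["cycle_change"]) > change_limit:
            reasons.append("large cycle-over-cycle change")
        if reasons:
            flags[sku] = reasons
    return flags

portfolio = {
    "SKU-1001": {"mape_now": 12.0, "mape_prev": 11.5, "bias": 0.02, "cycle_change": 0.05},
    "SKU-1002": {"mape_now": 28.0, "mape_prev": 14.0, "bias": 0.18, "cycle_change": 0.06},
    "SKU-1003": {"mape_now": 15.0, "mape_prev": 14.0, "bias": 0.01, "cycle_change": 0.40},
}
for sku, reasons in flag_exceptions(portfolio).items():
    print(sku, "->", ", ".join(reasons))
```

Note that the output is not just a flag but a reason list: telling the planner why a SKU surfaced is what makes the flagged 10-20% reviewable in minutes rather than re-investigated from scratch.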
An emerging capability: LLM-based assistants that let planners ask questions in natural language ("Why did the forecast for SKU 1234 increase 20% this cycle?", "Show me all SKUs where sales overlay is destroying accuracy") and get answers that would otherwise require navigating multiple reports. The assistant doesn't change the forecast; it changes how the planner explores the data.
Real impact: Time-to-insight improvement of 30-50% for diagnostic questions. Useful for senior planners and for executives who want to understand the plan without learning the tool's full UI.
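What makes such an assistant trustworthy is that the answer is assembled from stored forecast components, not generated from scratch; the LLM only phrases the summary. The sketch below shows that decomposition layer in isolation. All field names and numbers are illustrative, not an actual product schema.

```python
# Illustrative explanation layer behind a conversational assistant:
# "why did the forecast change?" is answered by decomposing the logged
# forecast components, largest driver first. An LLM would turn this
# structured answer into prose; the arithmetic is plain lookup.

forecast_log = {
    "SKU-1234": {
        "prev_total": 1000,
        "baseline_change": 80,    # statistical model update
        "ml_adjustment": 150,     # promotion signal picked up by ML layer
        "overlay": -30,           # sales team overlay
    }
}

def explain_change(sku):
    rec = forecast_log[sku]
    parts = {k: v for k, v in rec.items() if k != "prev_total"}
    total = rec["prev_total"] + sum(parts.values())
    lines = [f"{sku}: {rec['prev_total']} -> {total} ({total - rec['prev_total']:+d})"]
    # List drivers in order of absolute impact.
    for name, delta in sorted(parts.items(), key=lambda kv: -abs(kv[1])):
        lines.append(f"  {name}: {delta:+d}")
    return "\n".join(lines)

print(explain_change("SKU-1234"))
```

Because the decomposition is deterministic, the assistant's answers are auditable against the plan itself, which is what separates an explainable interface from a black box with a chat window.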