Large Language Models (LLMs) like ChatGPT, Claude, Gemini, and Mistral are prediction engines. Given text, they predict what text would plausibly come next. They do this well because they've been trained on vast amounts of digitized text: books, websites, and documents of every kind. They know what a supply chain analyst would say, what a CFO memo looks like, what a board presentation contains.
But prediction is not reasoning. When you ask an LLM “what should we do about our inventory risk,” it doesn’t analyze your inventory. It predicts what text would typically follow that question, based on patterns from millions of similar conversations.
What LLMs Do Brilliantly
ChainAlign uses machine learning for forecasting and optimization. We use LLMs for communication. The distinction matters.
LLMs excel at:
Semantic bridging. Your ERP calls it “Material_Qty_OnHand.” Our system needs “inventory_level.” Our ML-based semantic mapper handles most of these translations automatically. LLMs handle edge cases where pattern matching alone doesn’t resolve the ambiguity, drawing on their broad exposure to how businesses describe their operations.
Translation. Our forecasting engine produces statistical outputs. An LLM transforms “demand forecast shows 2.3σ deviation from seasonal baseline with declining confidence interval” into “Demand for this product is running 15% above normal seasonal patterns, and we’re increasingly confident this trend will continue.”
These are tasks where prediction works. The LLM isn’t deciding anything. It’s helping humans understand what our computational engines have determined.
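Take the semantic bridging case. Below is a minimal sketch of the routing idea, not our actual mapper: plain string similarity stands in for the ML-based semantic mapper, and anything below a confidence threshold falls through to the LLM-assisted path. The canonical field names and the threshold are illustrative.

```python
# Sketch: route field mappings between a confident automatic path and an
# LLM-assisted fallback. Plain string similarity stands in for the ML-based
# semantic mapper; canonical names and the threshold are illustrative.
from difflib import SequenceMatcher

CANONICAL_FIELDS = ["inventory_level", "lead_time_days", "unit_cost"]

def similarity(a: str, b: str) -> float:
    # Normalize separators so "Material_Qty_OnHand" and "inventory level"
    # are compared on comparable text.
    norm = lambda s: s.lower().replace("_", " ").strip()
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

def map_field(source_field: str, threshold: float = 0.6):
    score, best = max((similarity(source_field, c), c) for c in CANONICAL_FIELDS)
    if score >= threshold:
        return best                      # confident automatic mapping
    return None                          # ambiguous: hand off to LLM-assisted resolution

print(map_field("Unit_Cost_EUR"))        # maps automatically
print(map_field("Material_Qty_OnHand"))  # falls through to the LLM path
```

Confident matches never touch the LLM; only the ambiguous remainder does.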
What LLMs Cannot Do
Here’s what an LLM cannot do, by design:
Stable numerical reasoning. Ask the same mathematical question twice and you may get different answers. The model predicts plausible responses, not correct ones. For decisions involving millions in revenue, "plausible" isn't acceptable.
Awareness of consequences. An LLM has no model of your business constraints. It doesn’t know that recommending a production increase will exceed your warehouse capacity, violate your supplier contracts, or breach your working capital limits. It simply predicts what recommendation would sound reasonable.
When an LLM suggests “increase safety stock by 20%,” it has no awareness that this ties up €2M in working capital, pushes you past your credit facility covenant, and requires warehouse space you don’t have. It made a prediction that sounded like good supply chain advice. The consequences are invisible to it.
Consistent state. Every conversation starts fresh. The LLM has no memory of what it told you yesterday, what you accepted or rejected, what your organization has learned. Each interaction is an isolated prediction, disconnected from your decision history.
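For contrast, take the safety-stock example above. A constraint check is ordinary, deterministic code; here is a minimal sketch with illustrative limits:

```python
# Sketch: a constraint check is deterministic code, not a prediction.
# The limits and the 20% safety-stock example below are illustrative.
from dataclasses import dataclass

@dataclass
class Constraints:
    working_capital_headroom_eur: float   # room left under the credit facility
    warehouse_space_m3: float             # free storage space

def check_safety_stock_increase(extra_capital_eur: float,
                                extra_space_m3: float,
                                c: Constraints) -> list[str]:
    violations = []
    if extra_capital_eur > c.working_capital_headroom_eur:
        violations.append("working capital / covenant limit exceeded")
    if extra_space_m3 > c.warehouse_space_m3:
        violations.append("warehouse capacity exceeded")
    return violations

# A 20% increase that ties up €2M against €1.5M of headroom:
print(check_safety_stock_increase(
    extra_capital_eur=2_000_000, extra_space_m3=4_000,
    c=Constraints(working_capital_headroom_eur=1_500_000, warehouse_space_m3=2_500)))
```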
The Chatbot Trap
The current wave of enterprise chatbots takes this form: connect an LLM to your data, let users ask questions in natural language, receive answers.
This feels powerful. You can ask anything. You get articulate, confident responses.
But you’ve asked a prediction engine to guess at decisions that carry real consequences. The articulate confidence masks the absence of actual analysis. There’s no audit trail of how the recommendation was derived. No visibility into which constraints were considered, because none were.
When the CFO asks “why did we make this decision,” the answer is: because the LLM predicted that’s what a good answer would sound like.
Computation Before Communication
ChainAlign separates what requires computation from what requires communication.
Forecasting. Statistical and machine learning models analyze your historical data, external signals, and market patterns. These are deterministic algorithms. Same inputs, same outputs, every time. The math is auditable.
Constraints. Before any recommendation surfaces, it’s tested against your actual business constraints. Capacity limits. Supplier lead times. Working capital boundaries. Regulatory requirements. The system knows what’s feasible, not just what sounds good.
Objectives. Recommendations are scored against your defined business objectives. Margin targets. Service levels. Risk tolerance. The weighting is explicit and adjustable.
These three layers synthesize into recommendations that are computationally derived, constraint-aware, and strategically aligned.
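A simplified sketch of that synthesis, not our production logic: candidate plans are filtered for feasibility, then scored against explicit, adjustable objective weights. All candidates, weights, and numbers are illustrative.

```python
# Sketch: filter candidate plans for feasibility, then score them against
# explicit, adjustable objective weights. Candidates, weights, and numbers
# are illustrative.
candidates = [
    {"name": "hold current plan", "margin": 0.62, "service_level": 0.941, "risk": 0.30, "feasible": True},
    {"name": "+10% safety stock", "margin": 0.58, "service_level": 0.973, "risk": 0.18, "feasible": True},
    {"name": "+20% safety stock", "margin": 0.55, "service_level": 0.981, "risk": 0.15, "feasible": False},  # fails the constraint layer
]

# Explicit, adjustable weighting of objectives (risk is penalized).
weights = {"margin": 0.4, "service_level": 0.4, "risk": -0.2}

def score(candidate: dict) -> float:
    return sum(w * candidate[k] for k, w in weights.items())

feasible = [c for c in candidates if c["feasible"]]
best = max(feasible, key=score)
print(best["name"], round(score(best), 3))   # "+10% safety stock" 0.585
```

Because the weights are plain data, shifting the organization's priorities is a configuration change, and the resulting scores stay reproducible and auditable.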
Only then does an LLM enter the picture, to present these recommendations in clear business language, to help you explore the reasoning, to answer questions about the underlying data.
The LLM never decides. It presents what the engines have determined.
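What that hand-off might look like, as a minimal sketch rather than our actual pipeline: the engines' structured output is packaged into a prompt that asks the model only to present it, never to alter numbers or add advice. The call_llm function is a placeholder for whichever model client is in use, and the payload fields are illustrative.

```python
# Sketch: package the engines' structured output into a prompt that asks the
# LLM only to present it, never to change numbers or invent recommendations.
# call_llm() is a placeholder for whichever model client is actually used;
# the payload fields are illustrative.
recommendation = {
    "action": "increase safety stock by 10%",
    "working_capital_delta_eur": 1_200_000,
    "service_level": 0.973,
    "stockout_risk": 0.021,
}

def build_presentation_prompt(payload: dict) -> str:
    return (
        "Explain the following recommendation in clear business language. "
        "Keep every number exactly as given and do not add new advice.\n"
        f"Recommendation: {payload}"
    )

def call_llm(prompt: str) -> str:
    return f"[LLM response to: {prompt!r}]"  # placeholder only

print(call_llm(build_presentation_prompt(recommendation)))
```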
Consequences Are Computable
When ChainAlign surfaces a decision, you see more than a recommendation. You see the impact:
- Accept this recommendation: working capital increases by €1.2M, service level improves to 97.3%, stockout risk drops to 2.1%
- Reject it: current trajectory continues, here’s the projected outcome
- Modify the parameters: the consequences are recomputed instantly
This isn’t prediction. It’s simulation. Monte Carlo analysis running millions of scenarios against your actual constraint model. The consequences are calculated, not guessed.
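A toy version of the idea, far smaller than the production simulation: sample demand scenarios against a proposed inventory position and measure stockout risk and service level. The normal demand model, the fixed seed, and every number here are illustrative; the real simulation runs against the fitted forecast and the full constraint model.

```python
# Sketch: Monte Carlo consequence estimation. Sample demand scenarios against
# a proposed inventory position and report stockout risk and average fill rate.
# The normal demand model, seed, and numbers are illustrative.
import random

random.seed(42)  # fixed seed: same inputs, same outputs

def simulate(inventory_units: float, mean_demand: float, sd: float, runs: int = 100_000):
    stockouts, fill = 0, 0.0
    for _ in range(runs):
        demand = max(0.0, random.gauss(mean_demand, sd))
        if demand > inventory_units:
            stockouts += 1
        fill += min(demand, inventory_units) / max(demand, 1e-9)
    return {"stockout_risk": stockouts / runs, "service_level": fill / runs}

print(simulate(inventory_units=1_200, mean_demand=1_000, sd=120))
```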
And when you accept or reject a decision, that judgment is captured. The system learns what matters to your organization. Knowledge compounds rather than evaporating between conversations.
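Capturing that judgment can be as simple as an append-only decision log that future recommendations are weighed against. A minimal sketch; the file name and fields are illustrative:

```python
# Sketch: append each accept/reject/modify decision to a persistent log so
# later recommendations can take the organization's past judgments into
# account. The file name and fields are illustrative.
import json
import pathlib
from datetime import datetime, timezone

LOG = pathlib.Path("decision_log.jsonl")

def record_decision(recommendation_id: str, action: str, reason: str = "") -> None:
    entry = {
        "recommendation_id": recommendation_id,
        "action": action,  # "accepted", "rejected", or "modified"
        "reason": reason,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    with LOG.open("a") as f:
        f.write(json.dumps(entry) + "\n")

record_decision("REC-2024-0187", "rejected", "capital headroom reserved for Q3 launch")
```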
Every Recommendation Traces Back
Every recommendation in ChainAlign traces back to:
- Which data inputs were used
- Which models processed them
- Which constraints were applied
- Which objectives were weighted
- What alternatives were considered
- Why this recommendation scored highest
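One way to picture such a trace is as a structured record that mirrors the list above. A minimal sketch, with illustrative field names and values rather than our actual schema:

```python
# Sketch: a decision trace as a structured record mirroring the list above.
# Field names and values are illustrative.
from dataclasses import dataclass, field

@dataclass
class DecisionTrace:
    decision_id: str
    data_inputs: list[str]           # which data inputs were used
    models: list[str]                # which models processed them
    constraints_applied: list[str]   # which constraints were applied
    objective_weights: dict          # which objectives were weighted, and how
    alternatives: list[str]          # what alternatives were considered
    winning_score: float             # why this recommendation scored highest
    alternative_scores: dict = field(default_factory=dict)

trace = DecisionTrace(
    decision_id="REC-2024-0187",
    data_inputs=["sales_history_2022_2024", "supplier_lead_times"],
    models=["seasonal_forecast_v3", "inventory_optimizer_v2"],
    constraints_applied=["warehouse_capacity", "working_capital_limit"],
    objective_weights={"margin": 0.4, "service_level": 0.4, "risk": -0.2},
    alternatives=["hold current plan", "+20% safety stock"],
    winning_score=0.585,
    alternative_scores={"hold current plan": 0.564},
)
print(trace.decision_id, trace.winning_score)
```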
When regulators ask, when the board asks, when you ask yourself six months later, the reasoning is there. Not a plausible-sounding explanation generated after the fact, but the actual computational path that produced the decision.
Looking Forward
The LLM limitations we’ve described aren’t permanent. The field is evolving rapidly.
Diffusion language models are emerging as an alternative to today’s autoregressive architecture. Where current LLMs predict one token at a time in sequence, diffusion models generate and refine multiple tokens in parallel. Early results suggest this iterative refinement produces more coherent reasoning, not just faster output.
As these capabilities mature, we can extend where they add value in ChainAlign. Better reasoning could mean more nuanced translation of complex outputs. Faster generation could enable richer real-time interaction with recommendations.
But the architecture remains the same: computation for decisions, language models for communication. Whatever form language models take, they present what our engines have determined. They don't replace the constraint modeling, the simulation, or the audit trail.
The goal isn’t to use less AI. It’s to use each type of AI where it creates value: machine learning for analysis, language models for communication, and rigorous computation where decisions demand it.
The Difference
Open a typical chatbot and ask: “What should we do about our supply chain risk?”
You’ll get an articulate answer. It will sound reasonable. It may even be correct.
But you won’t know if it considered your capacity constraints. You won’t know if it’s consistent with what you were told yesterday. You can’t trace how the recommendation was derived. And when circumstances change, you’ll start over with a fresh guess.
ChainAlign doesn’t guess. It computes, within your constraints, toward your objectives, with full transparency.
That’s the difference between predicting what good advice sounds like and actually providing it.