The most dangerous failure mode for an AI agent isn’t being wrong. It’s being wrong while sounding exactly as confident as when it’s right.
A fluent, well-structured output reads as trustworthy regardless of whether every claim in it is real. That’s the core problem with deploying AI agents into workflows where accuracy actually matters: the model doesn’t get quieter when it’s guessing. It writes a fabricated detail with the same tone as a verified one, and unless something has been deliberately built to stop that, nobody downstream can tell the difference until it’s too late.
We don’t treat this as a prompting problem. A well-written instruction telling an agent to “only use real information” doesn’t reliably prevent fabrication, because the model will still produce something plausible when it doesn’t have a clear answer. The fix has to be architectural: constrain what the agent can draw from, and design explicit behaviour for what happens when it doesn’t have enough to work with.
In practice, that comes down to three things we build into every agent before it goes near a real workflow.
1. Source-boundedness: the agent only knows what it’s been given
The single biggest design decision in any agent we build is deciding, upfront, exactly what it’s allowed to draw from, and refusing to let it reach beyond that.
Take a content-generation agent we built for a multi-brand organisation that needed press releases written consistently across several distinct brand identities. The agent doesn’t write from general knowledge about how press releases “usually” sound. It’s restricted to a defined set of sources: the organisation’s own templates, its approved sample releases, its tone-of-voice documentation, and its messaging hierarchy. Every sentence it produces has to be assembled from those materials, not from the model’s broader sense of good writing.
This sounds like a small constraint, but it changes the entire risk profile of the agent. A model left to draw freely on its training knowledge will happily generate a confident-sounding boilerplate quote, a plausible statistic, or a detail that fits the tone but was never actually supplied. An agent that’s been architecturally restricted to named sources has far less room to do that, because anything that can’t be matched to those sources has nowhere legitimate to come from.
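As a rough illustration of what "restricted to a defined set of sources" can look like in code, here is a minimal sketch. It is not the production implementation: `SourceDocument`, `SourceRegistry`, the source ids, and the sample text are all hypothetical names and placeholders chosen for this example. The idea is that the agent's context is assembled only from an approved registry, and a request for anything outside it fails instead of falling back to something else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceDocument:
    source_id: str  # hypothetical label such as "tone-of-voice"
    text: str


class SourceRegistry:
    """The complete set of documents the agent may draw from."""

    def __init__(self, documents):
        self._documents = {doc.source_id: doc for doc in documents}

    def get(self, source_id):
        # A request for anything outside the approved set fails loudly
        # instead of falling back to some other lookup.
        if source_id not in self._documents:
            raise KeyError(f"not an approved source: {source_id}")
        return self._documents[source_id]

    def build_context(self, source_ids):
        # Label every passage with its source id so a later step can
        # check which document a sentence was built from.
        parts = []
        for source_id in source_ids:
            doc = self.get(source_id)
            parts.append(f"[{doc.source_id}]\n{doc.text}")
        return "\n\n".join(parts)


# Placeholder documents, invented purely for this example.
registry = SourceRegistry([
    SourceDocument("template", "FOR IMMEDIATE RELEASE"),
    SourceDocument("tone-of-voice", "Plain, warm, no superlatives."),
])
context = registry.build_context(["template", "tone-of-voice"])
```

On its own, a registry like this only controls what is supplied to the model. It becomes a real boundary when it is paired with a check on what comes back out, which is where the source trace in section 3 comes in.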
2. Silence over speculation: missing information stays missing
Source-boundedness only works if the agent has a defined behaviour for what happens when the sources don’t cover something. Without that, a constrained agent just fabricates differently, by filling the gap with something that sounds consistent with the materials it does have, even though it isn’t actually supported by them.
We design for the opposite. If a brief is incomplete, the agent asks once for the missing detail. It doesn’t proceed on an assumption, and it doesn’t try to infer a plausible answer from context. Anything it genuinely cannot resolve is left as a clearly marked placeholder in the output, visible to whoever reviews it, rather than quietly smoothed over.
This is a deliberate trade-off. It means the agent sometimes produces an output with a gap in it, instead of a complete-looking draft. We think that’s the correct trade-off every time. A visible gap is something a human can fix in seconds. An invented detail that reads as real is something nobody catches until it’s already gone out.
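The ask-once-then-placeholder behaviour can be sketched in a few lines. This is an assumption-laden example, not a description of any specific agent: the field names in `REQUIRED_FIELDS`, the `[[MISSING: ...]]` marker format, and the `ask` callback (standing in for however the agent puts a question to the requester) are all hypothetical.

```python
# Hypothetical required fields for a press-release brief.
REQUIRED_FIELDS = ("headline", "spokesperson", "launch_date")

# Hypothetical marker format; anything a reviewer cannot miss would do.
PLACEHOLDER = "[[MISSING: {field}]]"


def resolve_brief(brief, ask):
    """Ask once for missing fields; mark whatever is still unresolved.

    `ask` receives the list of missing field names and returns a dict of
    answers. It is called at most once, and no value is ever inferred.
    """
    resolved = dict(brief)
    missing = [name for name in REQUIRED_FIELDS if not resolved.get(name)]
    answers = ask(missing) if missing else {}

    unresolved = []
    for name in missing:
        value = answers.get(name)
        if value:
            resolved[name] = value
        else:
            # Leave a visible gap instead of a plausible-looking guess.
            resolved[name] = PLACEHOLDER.format(field=name)
            unresolved.append(name)
    return resolved, unresolved
```

Returning the list of unresolved fields alongside the draft data means a reviewer, or a downstream publishing step, can refuse to release anything that still contains a marker.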
3. Claims tied to a source, not just internal consistency
The final piece is making sure that what looks true and what is actually traceable are the same thing. An output can be entirely self-consistent, well-structured, and free of obvious errors, and still contain claims that were never grounded in anything real.
This is why the agents we build don’t just produce a final draft. They produce a record of what was used to build it: which document was consulted for which part of the output, and why. We described this in an earlier post as a transparency log, and it applies just as directly here. The point isn’t only accountability after the fact. It’s a structural forcing function during generation. If a claim can’t be tied back to a specific source, it shouldn’t appear in the output at all, and building the agent to produce that trace by default is what makes that rule enforceable rather than aspirational.
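One way to make that rule mechanical is to have each generated claim carry the id of the source it came from and the passage it relies on, then keep only the claims whose passage can be found in that source. The sketch below assumes exactly that and nothing more: `Claim`, `partition_claims`, and the log fields are hypothetical names, and the verbatim substring match is a deliberately crude stand-in for whatever support check a real system would use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    text: str                  # sentence proposed for the draft
    source_id: Optional[str]   # which approved document it came from
    evidence: Optional[str]    # passage in that document it relies on


def partition_claims(claims, sources):
    """Split claims into traceable and untraceable, and build the log.

    `sources` maps source ids to document text. A claim is accepted only
    if its evidence appears verbatim in the source it names (a simplifying
    assumption for this sketch).
    """
    accepted, rejected, log = [], [], []
    for claim in claims:
        source_text = sources.get(claim.source_id or "")
        if source_text and claim.evidence and claim.evidence in source_text:
            accepted.append(claim)
            log.append({
                "claim": claim.text,
                "source": claim.source_id,
                "evidence": claim.evidence,
            })
        else:
            # No traceable source: the claim does not reach the output.
            rejected.append(claim)
    return accepted, rejected, log
```

Because the log is produced by the same step that decides what survives, a claim without a trace has no route into the final draft, and the rejected list tells a reviewer what the agent tried to say but could not support.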
Why this has to be designed, not requested
None of these three things happen reliably just because an agent has been told to behave this way. A system prompt asking a model to “stick to the facts” is a request, not a constraint, and requests get quietly ignored under the right conditions, especially when a gap in the source material makes a fabricated answer the easier path to a fluent-sounding response.
Designing for grounded output means treating “no invented facts” as an architectural requirement: restrict what the agent can access, give it an explicit, designed response to missing information, and make every claim traceable by default. None of that depends on the model being well-behaved. It depends on the system around the model making fabrication structurally unavailable as an option.
Where this fits
This is the technical discipline underneath the governance conversation we’ve written about before. Governance asks whether an organisation can trust what an agent produces at scale. Grounded design is how that trust gets earned at the level of a single output, one generation at a time.
Ready to put this into practice?
If you’re evaluating an AI agent, whether built in-house, by a vendor, or by us, the question worth asking isn’t how good the writing sounds. It’s whether every claim in it can be traced back to something real, and what happens when the agent doesn’t have enough to go on.
Xemper’s AI Consulting practice helps organisations evaluate exactly this: whether the agents they’re using, or planning to build, are architected to stay grounded, or simply prompted to stay grounded in the hope that the instruction holds.
If you have a workflow that needs an agent built around your own documentation and data from the ground up, our Tailored Agentic Solutions team can take it from concept to deployment.