Fluent answers from large language models can mask unreliable reasoning; stabilising both human and model processes with uncertainty cues and an Epistemic Control Loop is necessary to make AI trustworthy in high-stakes decision-making.

The Missing Knowledge Layer in AI: A Framework for Stable Human-AI Reasoning

Rikard Rosenbacke, Carl Rosenbacke, Victor Rosenbacke, Martin McKee · April 16, 2026

arXiv · theoretical · evidence: n/a · relevance: 7/10

The paper argues that fluency should not be equated with reliability and proposes a two-layer approach — human-side stability mechanisms and a model-side Epistemic Control Loop — to make uncertainty and reasoning drift visible and manageable in high-stakes human-AI decision-making.

Large language models are increasingly integrated into decision-making in areas such as healthcare, law, finance, engineering, and government. Yet they share a critical limitation: they produce fluent outputs even when their internal reasoning has drifted. A confident answer can conceal uncertainty, speculation, or inconsistency, and small changes in phrasing can lead to different conclusions. This makes LLMs useful assistants but unreliable partners in high-stakes contexts. Humans exhibit a similar weakness, often mistaking fluency for reliability. When a model responds smoothly, users tend to trust it, even when both model and user are drifting together. This paper is the first in a five-paper research series on stabilising human-AI reasoning. The series proposes a two-layer approach: Parts II-IV introduce human-side mechanisms such as uncertainty cues, conflict surfacing, and auditable reasoning traces, while Part V develops a model-side Epistemic Control Loop (ECL) that detects instability and modulates generation accordingly. Together, these layers form a missing operational substrate for governance by increasing signal-to-noise at the point of use. Stabilising interaction makes uncertainty and drift visible before enforcement is applied, enabling more precise capability governance. This aligns with emerging compliance expectations, including the EU AI Act and ISO/IEC 42001, by making reasoning processes traceable under real conditions of use. The central claim is that fluency is not reliability. Without structures that stabilise both human and model reasoning, AI cannot be trusted or governed where it matters most.

Summary

Main Finding

Fluency is not reliability: modern LLMs and human users share a structural failure mode, “epistemic collapse”, in which linguistic fluency masks uncertainty, speculation, and drift. To govern and deploy AI safely in high‑stakes settings, the paper argues for adding a dedicated knowledge/operational layer that stabilises reasoning during interaction. This requires (1) human‑side scaffolds that make uncertainty, reliance, and epistemic status legible at decision time and (2) a model‑side Epistemic Control Loop (ECL) that detects and responds to internal instability during inference. Only once Layers 1 and 2 raise the signal‑to‑noise ratio of interaction can conventional capability governance (Layer 3) operate precisely rather than bluntly.

Key Points

Epistemic collapse: humans routinely conflate belief/inference with fact; LLMs inherit and amplify that tendency because training optimises fluent/coherent text, not explicit epistemic markers.
Shared failure mode: fluent surface text conceals transitions from grounded retrieval to speculative interpolation, producing predictable failure patterns: hallucinations, overconfidence, and false coherence.
Human vulnerabilities: interfaces and conversational framing encourage System 1 acceptance of fluent outputs; users mistake agreement and fluency for epistemic grounding (false confirmation).
Ordering principle: “Stabilisation before capability governance.” Governance is ineffective when interaction signals are epistemically degraded—better signals enable more precise, proportionate enforcement.
Three‑layer architecture:
- Layer 1: Human‑side stabilisation (epistemic scaffolds, uncertainty cues, conflict surfacing, auditable reasoning traces).
- Layer 2: Model‑side stabilisation (Epistemic Control Loop) — monitors internal signals (e.g., sample divergence, perturbation sensitivity, hidden‑state dynamics), detects drift/instability, and modulates inference (slow down, verify, maintain commitments).
- Layer 3: Capability governance and deployment control that operates on clearer signals and traces produced by Layers 1–2.
Epistemic Control Loop (high level): a regulatory layer on top of existing transformer backbones (not a new architecture) that interprets internal signatures of instability and actuates changes in decoding/behaviour. Intended to be inference‑time, audit‑friendly, and complementary to human scaffolds.
Scope and limitations: This paper is conceptual/theoretical (Part I of a five‑paper series). Empirical signals for model epistemic state are promising but currently probabilistic, model‑ and task‑dependent, and not yet fully validated. Parts II–V develop cognitive, interface, governance, and technical proposals; Part V will provide a technical ECL design.
Practical urgency: The problem grows as models act as agents across long decision horizons, where small epistemic errors can compound into systemic failures in healthcare, law, finance, and public policy.
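
The Layer 2 idea above (monitor a signal, detect instability, modulate inference) can be illustrated with a minimal sketch. It is not the authors' ECL design, which is deferred to Part V. It assumes that sample divergence is approximated by disagreement among several sampled answers; the function names and both thresholds are hypothetical.

```python
from collections import Counter
from typing import List


def sample_divergence(answers: List[str]) -> float:
    """Fraction of sampled answers that disagree with the most common one.

    Assumption: exact match after trimming and lower-casing stands in for
    semantic agreement, which a real system would need to measure properly.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    normalised = [a.strip().lower() for a in answers]
    _, top_count = Counter(normalised).most_common(1)[0]
    return 1.0 - top_count / len(normalised)


def ecl_action(divergence: float,
               verify_at: float = 0.2,
               abstain_at: float = 0.5) -> str:
    """Map a divergence score to a hypothetical modulation of inference."""
    if divergence >= abstain_at:
        return "abstain"   # too unstable to commit to an answer
    if divergence >= verify_at:
        return "verify"    # slow down and check before answering
    return "answer"        # samples agree; proceed normally


# Four stochastic samples for the same question (illustrative data only).
samples = ["Warfarin", "warfarin ", "Heparin", "Warfarin"]
action = ecl_action(sample_divergence(samples))
```

Here three of the four samples agree, so the divergence is 0.25 and the sketch chooses "verify". The thresholds are placeholders; the summary itself notes that such signals are model- and task-dependent.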

Data & Methods

Nature of the paper: a conceptual synthesis and architecture proposal (theoretical framework), not an empirical study.
Methods used:
- Literature synthesis across cognitive science, behavioural economics, AI safety, governance, and standards (e.g., EU AI Act, ISO/IEC guidance, clinical governance).
- Analytic argumentation showing how human cognitive biases interact with LLM training objectives to produce shared failure modes.
- Survey of emerging empirical signals (divergence across stochastic samples, perturbation sensitivity, hidden‑state covariance changes) as candidate indicators of epistemic instability—treated as hypotheses rather than validated diagnostics.
- System design reasoning to propose a three‑layer solution and the Epistemic Control Loop as an inference‑time regulator.
Empirical status and caveats:
- Claims that internal signals correlate with hallucination or instability are preliminary; the authors stress that these are “hypothesised regulatory affordances.”
- The ECL is proposed only at a high level; Part V will present a technical implementation and validation work, both of which are still under development.
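
Perturbation sensitivity, one of the candidate signals surveyed above, can be sketched as the share of paraphrased prompts that change the model's answer. This is an illustrative assumption about how the signal might be measured, not a validated diagnostic from the paper. `toy_model` is a hypothetical stand-in for a real LLM call.

```python
from typing import Callable, Sequence


def perturbation_sensitivity(model: Callable[[str], str],
                             prompt: str,
                             paraphrases: Sequence[str]) -> float:
    """Share of paraphrases whose answer differs from the baseline answer."""
    if not paraphrases:
        raise ValueError("need at least one paraphrase")
    baseline = model(prompt).strip().lower()
    changed = sum(model(p).strip().lower() != baseline for p in paraphrases)
    return changed / len(paraphrases)


def toy_model(prompt: str) -> str:
    """Hypothetical model that is deliberately brittle to phrasing."""
    return "approve" if "low risk" in prompt.lower() else "escalate"


score = perturbation_sensitivity(
    toy_model,
    "Is this low risk?",
    [
        "Is this a low risk case?",
        "Would you call the risk low?",
        "Is the risk low here?",
    ],
)
```

In this toy run two of the three paraphrases flip the answer, so `score` is about 0.67. A high score of this kind would be a hypothesis about instability to investigate, not proof of an error.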

Implications for AI Economics

Governance economics and signal quality:
- Transaction costs and enforcement efficiency: Better interaction signals (Layers 1–2) reduce evidentiary costs for audits and regulators, enabling more targeted interventions and lower monitoring/enforcement costs.
- Regulatory design: Regulators can move from coarse, precautionary restrictions to proportionate, signal‑driven rules if runtime epistemic traces are available and reliable.
Liability, insurance, and risk pricing:
- Insurability: Clearer runtime traces and model‑side self‑monitoring create observable metrics for underwriting (e.g., frequency of triggered ECL interventions), improving risk pricing for liability and malpractice insurance.
- Allocation of responsibility: Layered traces can disentangle user misuse vs interface‑induced overreliance vs model drift, changing contract design and liability exposure between providers, integrators, and users.
Market structure and competition:
- First‑mover advantages: Firms that implement human‑side scaffolds and ECLs may be able to obtain regulatory approvals, certifications, or preferred procurement status—raising entry barriers.
- Product differentiation: Stability/traceability become sellable attributes; markets may bifurcate into “fluent‑only” products and “stabilised, auditable” products, with different pricing and allowed uses.
Investment and cost‑benefit tradeoffs:
- Implementation costs (interface redesign, telemetry, ECL engineering, evaluation) must be weighed against reduced compliance costs, lower liability, and increased willingness of high‑stakes clients to adopt AI.
- Public good rationale: Because epistemic failures have systemic externalities (e.g., in public health, financial stability), there is an economic argument for standards, subsidies, or mandates to internalise social benefits of stabilisation.
Policy and procurement implications:
- Procurement specifications should require runtime epistemic traces, uncertainty reporting, and human‑in‑the‑loop scaffolds for high‑stakes deployments.
- Regulators can incentivise adoption via certification frameworks that evaluate both model‑side monitoring and interface‑level epistemic protections.
Measurement and metrics for economists:
- New KPIs are needed beyond accuracy: frequency/quality of ECL interventions, coverage and fidelity of epistemic traces, reduction in post‑deployment incident investigatory costs, and measures of user reliance calibrated by interface scaffolding.
- Empirical work options: economists can estimate social value of stabilisation by measuring reductions in error externalities, liability payouts, audit costs, and adoption thresholds for critical sectors.
Systemic risk & competition policy:
- If stabilisation is costly, market concentration around a few providers offering certified stable systems may increase systemic concentration risk; regulators should weigh competition policy alongside safety certification.
- Cross‑platform evidence standards: Diverging evidentiary definitions by platform could create “evidence arbitrage” (the paper cites concern about AI systems fragmenting what counts as valid evidence), affecting adjudication and market dynamics.
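
Two of the KPIs mentioned under measurement above, the frequency of ECL interventions and the coverage of epistemic traces, could be computed from interaction logs roughly as follows. This is a sketch under assumptions: the record fields and metric names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class InteractionRecord:
    query_id: str
    ecl_triggered: bool  # hypothetical flag: the control loop intervened
    trace_logged: bool   # hypothetical flag: an auditable trace was stored


def stabilisation_kpis(records: Iterable[InteractionRecord]) -> Dict[str, float]:
    """Return the intervention rate and trace coverage as fractions of queries."""
    rows = list(records)
    if not rows:
        raise ValueError("no interaction records supplied")
    total = len(rows)
    return {
        "intervention_rate": sum(r.ecl_triggered for r in rows) / total,
        "trace_coverage": sum(r.trace_logged for r in rows) / total,
    }


# Illustrative log entries, not real data.
log = [
    InteractionRecord("q1", ecl_triggered=False, trace_logged=True),
    InteractionRecord("q2", ecl_triggered=True, trace_logged=True),
    InteractionRecord("q3", ecl_triggered=False, trace_logged=False),
    InteractionRecord("q4", ecl_triggered=False, trace_logged=True),
]
kpis = stabilisation_kpis(log)
```

Reporting both fractions side by side matters because a low intervention rate means little if most interactions leave no trace to audit.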

Actionable priorities for economists and policymakers:
- Fund validation studies that quantify how Layers 1–2 affect downstream enforcement costs, error rates in high‑stakes decisions, and insurance premiums.
- Update procurement specs to require auditable epistemic traces and human‑side scaffolding for mission‑critical deployments.
- Develop KPIs and reporting standards around epistemic control interventions to enable market comparisons and third‑party audits.
- Consider subsidies, standards, or certification to avoid winner‑take‑all outcomes and to internalise public‑good benefits of deployed stabilisation.

Limitations to note:
- Core claims are conceptual and partly hypothesis‑driven; empirical validation of the ECL and epistemic signals is ongoing and necessary before operational adoption.
- Signals proposed (sample divergence, perturbation sensitivity, hidden‑state patterns) are model‑ and task‑dependent and will require standardisation for regulatory use.

Assessment

Paper Type: theoretical
Evidence Strength: n/a. Paper is conceptual and argumentative rather than empirical; it proposes frameworks and design principles without presenting quantitative or causal evidence.
Methods Rigor: n/a. No empirical methods, data collection, or formal identification strategy are used; the contribution is theoretical and design-oriented rather than methodological or experimental.
Sample: No empirical sample; the paper is a conceptual/theoretical treatment that synthesises observed properties of large language models and human cognitive biases and outlines a two-layer approach (human-side mechanisms and a model-side Epistemic Control Loop) for stabilising reasoning.
Themes: human_ai_collab, governance, adoption, org_design, productivity
Generalizability:
- No empirical validation: effectiveness across real-world settings and user populations is uncertain.
- Implementation feasibility varies by domain (healthcare, law, finance, government) and by available system integration capabilities.
- Human factors (training, culture, incentives) differ across organizations and may limit adoption or change outcomes.
- Regulatory and institutional environments differ across jurisdictions, affecting applicability of governance recommendations.
- Technical heterogeneity of LLMs (architectures, access, latency) may alter how Epistemic Control Loop mechanisms perform.

Claims (8)

Claim	Category	Direction	Confidence	Outcome	Details
Large language models are increasingly integrated into decision-making in areas such as healthcare, law, finance, engineering, and government.	Adoption Rate	positive	high	integration/adoption of LLMs into decision-making	0.12
LLMs produce fluent outputs even when their internal reasoning has drifted; a confident answer can conceal uncertainty, speculation, or inconsistency, and small changes in phrasing can lead to different conclusions.	Decision Quality	negative	high	reliability/consistency of model outputs (decision quality)	0.12
Humans often mistake fluency for reliability: when a model responds smoothly, users tend to trust it, even when both model and user are drifting together.	Decision Quality	negative	high	user trust in model outputs	0.12
This paper is the first in a five-paper research series on stabilising human-AI reasoning that proposes a two-layer approach: Parts II–IV introduce human-side mechanisms (uncertainty cues, conflict surfacing, auditable reasoning traces) and Part V develops a model-side Epistemic Control Loop (ECL) that detects instability and modulates generation.	Governance And Regulation	positive	high	proposal of methodological architecture for stabilising human-AI reasoning	0.2
Together, these layers form a missing operational substrate for governance by increasing signal-to-noise at the point of use.	Governance And Regulation	positive	high	signal-to-noise ratio of reasoning outputs at point of use (informational quality for governance)	0.02
Stabilising interaction makes uncertainty and drift visible before enforcement is applied, enabling more precise capability governance.	Governance And Regulation	positive	high	visibility of uncertainty/drift and precision of capability governance	0.02
This approach aligns with emerging compliance expectations, including the EU AI Act and ISO/IEC 42001, by making reasoning processes traceable under real conditions of use.	Governance And Regulation	positive	high	alignment with regulatory/compliance requirements (traceability of reasoning)	0.06
Fluency is not reliability: without structures that stabilise both human and model reasoning, AI cannot be trusted or governed where it matters most.	Governance And Regulation	negative	high	trustworthiness/governability of AI in high-stakes contexts	0.02