Bigger isn't always better: institutional fitness peaks at an environment-specific model scale, and well-orchestrated domain-specific systems can outperform frontier generalist models in their native institutions.
Classical scaling laws model AI performance as monotonically improving with model size. We challenge this assumption by deriving the Institutional Scaling Law, which shows that institutional fitness (jointly measuring capability, trust, affordability, and sovereignty) is non-monotonic in model scale, with an environment-dependent optimum N*(ε). Our framework extends the Sustainability Index of Han et al. (2025) from hardware-level to ecosystem-level analysis, proving that capability and trust formally diverge beyond a critical scale (Capability–Trust Divergence). We further derive a Symbiogenetic Scaling correction demonstrating that orchestrated systems of domain-specific models can outperform frontier generalists in their native deployment environments. These results are contextualized within a formal evolutionary taxonomy of generative AI spanning five eras (1943–present), with analysis of frontier lab dynamics, sovereign AI emergence, and the evolution of post-training alignment from RLHF through GRPO. The Institutional Scaling Law predicts that the next phase transition will be driven not by larger models but by better-orchestrated systems of domain-specific models adapted to specific institutional niches.
Summary
Main Finding
The paper introduces the Institutional Scaling Law: institutional fitness, an ecosystem-level scalar combining capability, trust, affordability, and sovereignty, is non-monotonic in model scale, and there exists an environment-dependent optimal model size N*(ε). Beyond a critical scale, capability continues to increase while trust declines (and cost/quantization penalties grow), producing a Capability–Trust Divergence. A Symbiogenetic Scaling correction shows that orchestrated systems of smaller domain-specific models (multi-agent, tool-aware) can outperform larger generalist frontier models in their native institutional environments.
Key Points
- Institutional Fitness Manifold: expands Han et al.’s Sustainability Index to a 4‑dimensional institutional fitness vector f = (Capability, Trust(ε), Affordability, Sovereignty(ε)), aggregated with environment‑specific weights w(ε) into scalar F(θ,ε).
- Institutional Scaling Law (informal): F(N,ε) is a weighted sum of a Kaplan-style capability term, a trust term that decays beyond a critical N, an affordability/cost term (including quantization penalties), and a sovereignty/compliance term. F(N,ε) is generally unimodal with a distinct optimum N*(ε).
- Capability–Trust Divergence (Theorem): ∂C/∂N > 0 but ∂T/∂N < 0 past a threshold ⇒ ∂F/∂N can flip sign; thus larger models may reduce institutional fitness.
- Sequential Trust Degradation: deployment hops compound trust erosion multiplicatively, producing exponential decline with number of contexts/hops.
- Symbiogenetic Scaling (multi-agent correction): F_agent = F · (1 + η·ρ(G)/√K). A well-orchestrated set of K specialized agents (high effective communication density ρ(G) and orchestration efficiency η) can exceed a frontier generalist's institutional fitness even if each agent is much smaller.
- Convergence‑Orchestration Threshold: when marginal capability gains from scale fall below a threshold, investment in orchestration/topology yields higher returns than adding scale.
- Empirical/real-world anchors: example optima (startup N* ≈ 140B vs. regulated EU N* ≈ 45B); DeepSeek R1 (Jan 2025) as a "DeepSeek Moment" punctuated-equilibrium event (a reported ~$6M training cost; a single-day NVIDIA market-cap loss of $589B).
- Evolutionary taxonomy: five eras (Abiogenesis → Paleozoic → Mesozoic → Cenozoic → Generative AI) and epochs within Generative AI culminating in a Symbiogenesis epoch where tooling, memory, and models fuse.
- Post‑training evolution: trends from RLHF → DPO → GRPO with implications for training methodology and verifiable rewards; GRPO and multi‑step reasoning interact badly with low‑precision quantization (the “quantization trap”).
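The shape of the Institutional Scaling Law and its environment-dependent optimum can be sketched numerically. In the minimal Python sketch below, every functional form and constant (the trust-decay shape, `N_crit`, the budget and sovereignty terms, the weight vectors) is an illustrative assumption rather than the paper's calibration; only the Kaplan exponent α ≈ 0.076 comes from the text. Sequential Trust Degradation is included as the one-line exponential it reduces to.

```python
import numpy as np

ALPHA = 0.076  # Kaplan-style capability exponent cited by the paper

def fitness(N, w, N_crit=8e10, gamma=1.5, N_budget=1e12):
    """Institutional fitness F(N, eps) = w . (C, T, A, S) for a model of N params.
    All component forms and constants are illustrative assumptions."""
    C = (N / 1e9) ** ALPHA                    # capability: slow power-law growth
    T = 1.0 / (1.0 + (N / N_crit) ** gamma)   # trust: decays past a critical scale
    A = 1.0 - N / N_budget                    # affordability: linear budget penalty
    S = 1.0 / (1.0 + N / 2e11)                # sovereignty: big models resist self-hosting
    return float(np.dot(w, [C, T, A, S]))

def n_star(w, grid=np.logspace(9, 12.5, 4000)):
    """Grid-search the environment-dependent optimum N*(eps)."""
    return grid[int(np.argmax([fitness(N, w) for N in grid]))]

def trust_after_hops(T0, r, k):
    """Sequential Trust Degradation: each deployment hop retains a fraction r
    of trust, so trust declines exponentially with hop count k."""
    return T0 * r ** k

# w(eps) = weights on (capability, trust, affordability, sovereignty)
startup   = np.array([0.6, 0.1, 0.2, 0.1])   # capability-hungry environment
regulated = np.array([0.2, 0.4, 0.2, 0.2])   # trust/sovereignty-heavy environment

print(f"N*(startup)   ~ {n_star(startup) / 1e9:.0f}B params")
print(f"N*(regulated) ~ {n_star(regulated) / 1e9:.0f}B params")
```

Under these toy constants the capability-hungry environment peaks at a scale well above the regulated one, mirroring (not reproducing) the paper's 140B-vs-45B example; the Capability–Trust Divergence appears as `fitness` turning downward once the trust and affordability terms dominate the α-power capability gains.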
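The Symbiogenetic Scaling correction is equally direct to state in code. In this sketch the values of η, ρ(G), K, and the normalized fitness scores are made-up illustrative numbers, not figures from the paper:

```python
from math import sqrt

def symbiogenetic_fitness(F_base, eta, rho, K):
    """Symbiogenetic correction: F_agent = F * (1 + eta * rho / sqrt(K)),
    where eta is orchestration efficiency, rho the effective communication
    density of the agent graph G, and K the number of specialized agents."""
    return F_base * (1.0 + eta * rho / sqrt(K))

F_generalist = 1.00   # frontier generalist's institutional fitness (normalized)
F_specialist = 0.80   # a single small domain model scores lower on its own

# a well-orchestrated 9-agent system: dense communication, efficient routing
F_orchestrated = symbiogenetic_fitness(F_specialist, eta=0.9, rho=3.0, K=9)
print(f"orchestrated: {F_orchestrated:.2f} vs generalist: {F_generalist:.2f}")
```

With these numbers the multiplier is 1 + 0.9·3.0/3 = 1.9, lifting the orchestrated system above the generalist despite each member being far smaller, which is the Convergence–Orchestration point: once scale gains flatten, η and ρ(G) are the levers worth buying.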
Data & Methods
- Methods: theoretical derivations and propositions built on:
  - extension of prior scaling laws (Kaplan et al.) and Han et al.'s Sustainability Index to an institution-level fitness manifold;
  - formal definitions, theorems, and first-order optimality conditions for N*(ε);
  - a multi-agent/topology correction borrowing from recent multi-agent routing results (Lu et al.);
  - analytical models for trust decay, affordability (cost/quantization), and sovereignty as environment-dependent terms.
- Empirical inputs & anchors:
  - literature on scaling laws, quantization, RLHF/DPO/GRPO, multi-agent routing, and historical model releases (GPT series, o1, DeepSeek R1);
  - observed events and market responses (user adoption rates, the lab ecosystem map, the DeepSeek Moment, sovereign AI program investments);
  - parameter values referenced qualitatively or by analogy (e.g., Kaplan α ≈ 0.076, example N* comparisons).
- Validation proposals: estimate w(ε) from procurement data; measure T(·) from safety incidents and audits; estimate entropy/phase transitions from deployment surveys; benchmark orchestration gains via multi-agent experiments.
- Limitations acknowledged: primarily theoretical framing and retrospective taxonomy; parameters must be estimated before the framework yields concrete predictions; evolutionary metaphors are imperfect for deliberately designed technologies.
Implications for AI Economics
- Investment allocation: marginal returns shift from raw scale/compute toward tooling, orchestration, and specialized model systems once capability growth saturates. Capital should reallocate from pure compute to orchestration, integration, and domain‑specific data pipelines.
- Market structure & firm strategy:
  - Frontier compute/hardware firms face higher volatility: punctuation events (e.g., DeepSeek) can rapidly collapse perceived hardware moats and market capitalization.
  - Niche/specialist vendors and orchestration middleware become more valuable where sovereignty, auditability, or cost constraints dominate.
  - Open-source and algorithmic efficiency can displace hardware advantage, lowering barriers and accelerating fragmentation/speciation by jurisdiction.
- Sovereign AI & procurement economics:
  - Environment-dependent optima N*(ε) imply different procurement strategies: regulated institutions (high weight on trust/sovereignty) will rationally favor smaller, auditable models or orchestrated domain systems, reducing demand for the largest frontier models.
  - National investment (sovereign AI programs) creates local ecosystems and demand for compliant stacks, producing segmented global markets and opportunities for domestic suppliers.
- Pricing, total addressable market, and forecasts:
  - The paper cites macro projections and signals (McKinsey, WEF/Bain) implying multi-hundred-billion- to trillion-dollar markets for sovereign infrastructure and applications, but value capture will shift toward orchestration, auditing/verification, and specialized models.
  - Cost collapse (algorithmic efficiency, smaller competent models) compresses pricing for compute-intensive services and may reduce incremental hardware spend per unit of capability.
- Risk and regulatory externalities:
  - Capability–Trust Divergence raises systemic risk for large model deployments (auditability, emergent behaviors); regulators and institutional buyers will internalize these risks and discount large-scale deployments, affecting valuation and adoption curves.
  - Quantization trade-offs (energy vs. reasoning trust) create deployment constraints and economic trade-offs for cloud providers and edge deployments.
- Labor and adoption economics:
  - "GenAI Divide": fast technical innovation but slow institutional absorption suggests continued shadow use of non-procured tools, misalignment of training/procurement incentives, and uneven productivity gains.
  - Demand for specialized human roles (orchestrators, auditors, compliance engineers) rises as orchestration becomes economically central.
- Winners & opportunities:
  - Firms enabling orchestration, provenance/verification, auditability, and sovereign compliance stand to capture an increasing share of institutional spending.
  - Domain-specific compressed models that run on commodity hardware are high-value assets for institutional budgets sensitive to trust, sovereignty, and cost.
Short summary: From an economics standpoint, the paper argues the next major value shift in AI will be away from raw model scale and toward environment‑aligned fitness—auditability, sovereignty, orchestration, and cost efficiency. This implies capital reallocation (less emphasis on frontier compute alone), market fragmentation by jurisdiction, and new sources of value in orchestration, verification, and domain specialization.
Assessment
Claims (7)
| Claim | Type | Direction | Confidence | Outcome measured | Details |
|---|---|---|---|---|---|
| Classical scaling laws model AI performance as monotonically improving with model size. | Other | null_result | high | AI performance as a function of model size | 0.12 |
| The Institutional Scaling Law shows that institutional fitness (jointly measuring capability, trust, affordability, and sovereignty) is non-monotonic in model scale, with an environment-dependent optimum N*(ε). | Adoption Rate | mixed | high | institutional fitness (composite of capability, trust, affordability, sovereignty); optimal scale N*(ε) (symbolic; no numeric value provided) | 0.12 |
| Capability and trust formally diverge beyond a critical scale (Capability–Trust Divergence). | Adoption Rate | mixed | high | capability and trust as functions of model scale | 0.12 |
| A Symbiogenetic Scaling correction demonstrates that orchestrated systems of domain-specific models can outperform frontier generalists in their native deployment environments. | Output Quality | positive | high | performance of orchestrated domain-specific model systems versus frontier generalist models in native deployment environments | 0.12 |
| The framework extends the Sustainability Index of Han et al. (2025) from hardware-level to ecosystem-level analysis. | Other | null_result | high | scope/level of the Sustainability Index (hardware-level → ecosystem-level) | 0.06 |
| The paper presents a formal evolutionary taxonomy of generative AI spanning five eras (1943–present) and analyzes frontier lab dynamics, sovereign AI emergence, and post-training alignment evolution from RLHF through GRPO. | Other | null_result | high | evolutionary taxonomy and contextual analysis of generative AI eras and dynamics | 0.06 |
| The Institutional Scaling Law predicts that the next phase transition will be driven not by larger models but by better-orchestrated systems of domain-specific models adapted to specific institutional niches. | Innovation Output | positive | high | drivers of the next phase transition in AI (orchestration of domain-specific systems versus scaling up model size) | 0.02 |