Simulations of small edge-AI populations show that smarter, diverse agents can make shared-resource congestion worse when capacity per device is low, while the same sophistication improves outcomes when capacity is ample; the capacity-to-population ratio C/N predicts which regime a deployment will fall into.

Increasing intelligence in AI agents can worsen collective outcomes

Neil F. Johnson · March 12, 2026

arXiv · quasi_experimental · medium evidence · 8/10 relevance

In controlled simulations of edge AI agents, increasing model diversity and reinforcement learning raises system overload under resource scarcity, while tribe formation can mitigate overload; whether sophistication helps or hurts is determined by the capacity-to-population ratio C/N.

When resources are scarce, will a population of AI agents coordinate in harmony, or descend into tribal chaos? Diverse decision-making AI from different developers is entering everyday devices -- from phones and medical devices to battlefield drones and cars -- and these AI agents typically compete for finite shared resources such as charging slots, relay bandwidth, and traffic priority. Yet their collective dynamics and hence risks to users and society are poorly understood. Here we study AI-agent populations as the first system of real agents in which four key variables governing collective behaviour can be independently toggled: nature (innate LLM diversity), nurture (individual reinforcement learning), culture (emergent tribe formation), and resource scarcity. We show empirically and mathematically that when resources are scarce, AI model diversity and reinforcement learning increase dangerous system overload, though tribe formation lessens this risk. Meanwhile, some individuals profit handsomely. When resources are abundant, the same ingredients drive overload to near zero, though tribe formation makes the overload slightly worse. The crossover is arithmetical: it is where opposing tribes that form spontaneously first fit inside the available capacity. More sophisticated AI-agent populations are not better: whether their sophistication helps or harms depends entirely on a single number -- the capacity-to-population ratio -- that is knowable before any AI-agent ships.

Summary

Main Finding

When AI agents compete for limited shared resources, whether their increasing sophistication helps or harms collective outcomes depends almost entirely on the capacity-to-population ratio (C/N). If capacity is scarce relative to population, innate model diversity and individual reinforcement learning amplify dangerous system overloads (crowding, congestion, cascading failures), though spontaneous formation of tribes (cultural clustering) can reduce that risk. If capacity is abundant, the same two factors drive overload toward zero, and tribe formation slightly worsens it. The transition between regimes is arithmetic: it occurs where the spontaneously formed opposing tribes can first be accommodated within available capacity. In short, agent sophistication is neither uniformly beneficial nor uniformly harmful; its systemic effect is predictable from a single pre-deployment number (C/N).

Key Points

Four independently controllable factors studied:
- Nature: innate heterogeneity across agents (LLM/model diversity).
- Nurture: individual learning via reinforcement learning.
- Culture: emergent tribe formation / clustering of agents into groups.
- Resource scarcity: the available capacity for the shared resource.
Main empirical regularities:
- Under scarcity (low C/N): diversity and RL increase aggregate overload (more agents simultaneously contesting limited slots); tribes reduce overload by partitioning demand.
- Under abundance (high C/N): diversity and RL reduce overload to near-zero; tribes slightly increase overload by creating coordinated opposing blocks that concentrate demand.
- Some agents consistently gain high payoff in scarce settings (inequality emerges), even while system-wide overload and risks rise.
The regime boundary is arithmetic: it is the point where the spontaneously formed tribes' combined demand first fits inside capacity. Above that point, overload falls to near zero; below it, overload is significant.
Practical implication: whether increased sophistication helps or harms is predictable ex ante by computing capacity-to-population ratio; this is knowable before deployment.
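
To make the role of C/N concrete, the following is a minimal sketch of an overload calculation under a deliberately simplified assumption: every agent requests the resource independently with the same probability p, so demand is binomial. This is an illustrative baseline only, not the paper's full model; the function name `overload_probability` and the choice p = 0.5 are hypothetical, and "overload" here means demand exceeding capacity in a round.

```python
from math import comb


def overload_probability(n, capacity, p):
    """P(demand > capacity) when n independent agents each request with probability p.

    Assumption: identical, independent agents (binomial demand). Real
    populations with diversity, learning, or tribes will deviate from this.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(capacity + 1, n + 1)
    )


# Sweep capacity for a population of N = 7, the size used in the main experiments.
N = 7
for c in range(1, N):
    print(f"C={c}  C/N={c / N:.2f}  P(overload)={overload_probability(N, c, 0.5):.3f}")
```

Even in this stripped-down setting, the overload probability is governed by where capacity sits relative to typical demand, which is why C/N is a natural first quantity to compute.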

Data & Methods

Empirical simulations:
- Populations of interacting AI agents parametrically varied in:
  - Model diversity (different decision priors / LLM types).
  - Presence/absence and strength of individual reinforcement learning.
  - Propensity to form tribes (agents can adaptively favour coordination with a subset of other agents).
  - Resource capacity (number of available slots / bandwidth / priority units).
- Agents repeatedly choose access to a shared discrete resource; outcomes determine individual payoffs and update RL policies.
- Measured system outcomes: overload events (fraction/number of agents denied access due to capacity limits), the distribution of individual payoffs, tribe composition and sizes, and temporal dynamics.
- Extensive parameter sweeps and robustness checks across population sizes, reward functions, and initial heterogeneity.
Mathematical analysis:
- Reduced analytical model maps the simulation to a resource-allocation/coordination game.
- Derived conditions for phase transition: threshold in C/N where tribe partitions can be accommodated.
- Explains how diversity and learning amplify variance in choices and how culture (clustering) changes effective demand correlation structures.
Validation:
- Concordance between simulation outcomes and analytical predictions across regimes.
- Sensitivity analyses show qualitative findings robust to modeling choices (reward specification, initial priors), though quantitative thresholds shift.
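
The simulation loop described above can be sketched roughly as follows. This is a toy stand-in, not the author's implementation: the payoff rule (agents on the uncongested side of the outcome receive +1, the others −1), the update rule for each agent's propensity, the learning rate, and the clipping bounds are all assumptions chosen for illustration, and no LLM is involved. Random initial propensities stand in for model diversity.

```python
import random


def simulate_overload(n=7, capacity=3, rounds=500, warmup=50, lr=0.05, seed=0):
    """Fraction of post-warm-up rounds in which demand exceeds capacity."""
    rng = random.Random(seed)
    # Assumption: heterogeneous starting propensities as a proxy for diversity.
    p = [rng.uniform(0.2, 0.8) for _ in range(n)]
    overloaded_rounds = 0
    for t in range(rounds):
        actions = [rng.random() < p_i for p_i in p]  # True = request the resource
        over = sum(actions) > capacity
        if t >= warmup and over:
            overloaded_rounds += 1
        for i, requested in enumerate(actions):
            # Hypothetical symmetric payoff: +1 if the agent requested and there
            # was room, or stayed away during an overload; -1 otherwise.
            reward = 1 if requested != over else -1
            # Hypothetical learning rule: reinforce the action just taken when
            # it paid off, and push away from it when it did not.
            direction = 1 if requested else -1
            p[i] = min(0.95, max(0.05, p[i] + lr * reward * direction))
    return overloaded_rounds / (rounds - warmup)


# Sweep capacity as in the experiments (C from 1 to 6 with N = 7).
for c in range(1, 7):
    print(c, simulate_overload(capacity=c))
```

Averaging the returned fraction over several seeds per capacity value would mirror the repeated-seed design, though the numbers from this toy should not be read as reproducing the paper's results.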

Implications for AI Economics

Capacity-to-population ratio (C/N) should be a primary pre-deployment analytic and regulatory metric:
- Compute C/N for each deployment environment (e.g., charging docks per device fleet, relay bandwidth per expected active agents).
- If C/N is below the analytically predicted threshold, expect systemic overload risks; act before agents are released.
Design & market interventions:
- Capacity expansion or reservation systems (raising C) directly reduce systemic risk and are often the simplest remedy.
- If capacity expansion is infeasible, enforce partitioning or protocols that ensure emergent tribes fit within capacity (e.g., allocation quotas, mandated coordination protocols, time-slotting).
- Congestion pricing, auctions or market-clearing mechanisms can internalize externalities created by RL-driven competition and reduce overload and inequality.
Standards & certification:
- Require pre-deployment stress tests that simulate realistic multi-vendor ecosystems and compute projected overload metrics.
- Mandate interoperability and shared coordination primitives so agents can form socially efficient partitions rather than destructive contention.
Developer incentives and regulation:
- Sophistication (better models, stronger RL) can create winner-take-most outcomes under scarcity; regulators should anticipate and mitigate potential concentration of service or harm.
- Policymakers can set rules limiting behaviors that exacerbate system-level harm (e.g., aggressive exploratory RL in shared-resource contexts) or require social-cost-aware reward functions.
Monitoring and enforcement:
- Real-time monitoring of overload events, payoff inequality, and tribe dynamics to trigger interventions (capacity reallocation, throttling).
- Transparency requirements on agent heterogeneity, learning regimes, and expected demand profiles to inform system-wide planning.
Research & operational priorities:
- Prioritize measurement of C/N in critical settings (transport, energy, communications, medical devices).
- Invest in coordination primitives and distributed allocation protocols that scale reliably when diversity and learning are present.
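
A pre-deployment C/N screen of the kind suggested above could look like the sketch below. The deployment names, the capacity and population figures, and the threshold value are all hypothetical placeholders: the source gives no universal numerical threshold, so in practice it would have to be calibrated for each application.

```python
def capacity_ratio(capacity, population):
    """Return C/N for one deployment environment."""
    if population <= 0:
        raise ValueError("population must be positive")
    return capacity / population


def flag_scarce(deployments, threshold):
    """Return the names of deployments whose C/N falls below the threshold.

    `deployments` maps a name to a (capacity, population) pair; `threshold`
    is a caller-supplied, application-specific value (an assumption here).
    """
    return [
        name
        for name, (capacity, population) in deployments.items()
        if capacity_ratio(capacity, population) < threshold
    ]


# Hypothetical fleet: 3 charging docks for 7 devices, 6 relay channels for 7 devices.
deployments = {"charging_docks": (3, 7), "relay_channels": (6, 7)}
print(flag_scarce(deployments, threshold=0.5))
```

Flagged environments are candidates for the interventions listed above, such as adding capacity or imposing quotas and time-slotting, before agents are released.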

Limitations and caveats:
- Results derive from stylized simulations and reduced-form theory; real-world deployments have additional heterogeneity (asymmetric utilities, strategic developers, multi-resource constraints).
- Exact numerical threshold values depend on model specifics; the qualitative reliance on C/N is robust, but practical calibration is necessary for each application.

Assessment

Paper Type: quasi_experimental
Evidence Strength: medium. The study uses controlled experiments, analytic baselines, replication across random seeds, and robustness checks (N sweeps, SI proofs), which provide credible internal evidence that the mechanisms produce the reported crossover and tribal effects; however, it is a simulation with several stylised design choices (externally implemented tribal mechanism, binary action space, fixed temperature/prompt format, small-N and small-model regime), so external/real-world causal generalizability is limited.
Methods Rigor: medium. Methods are carefully documented with analytic derivations for baselines, repeated stochastic seeds, paired comparisons, and extensive SI; nevertheless, key design choices (tribal loyalty/defection rules, dispositional filter form, reward structure, temperature, and initial p-spectrum) are researcher-specified and could affect outcomes, and the implementation relies on an external sensing/tribe layer rather than demonstrating emergent on-device formation.
Sample: Simulated populations of N=7 AI-agents (also checked N=11,15 in SI) built from six small LLMs (GPT-2 124M, GPT-2 Medium 355M, Pythia-160M, Pythia-410M, OPT-125M, OPT-350M; GPT-2 appears twice), loaded in half-precision without weight updates; experiments sweep capacity C∈{1,...,6}, use temperature T=1.0 and history window w=10, run 20 random seeds × 500 rounds (first 50 warm-up discarded); agents map LLM next-token probabilities into p_LLM, combine with an adaptive scalar p to form p_eff, take a binary action each round, and receive symmetric +1/−1 payoffs; L1–L5 levels toggle diversity, reinforcement learning of p, and an externally implemented tribal sensing/loyalty mechanism.
Themes: productivity, org_design
Identification: Controlled laboratory-style simulations with experimental 'technology ladder' ablations: the author independently toggles four variables (nature: LLM diversity; nurture: reinforcement-learning adaptation of a scalar p; culture: an externally implemented loyalty/defection tribal mechanism; resources: capacity C) and compares outcomes across levels using analytic baselines (L1, L2) and paired-seed experiments (20 seeds × 500 rounds) to isolate causal effects of each toggle.
Generalizability:
- Simulation environment with stylised, binary action/payoff structure may not capture richer real-world resource allocation decisions.
- Tribal (culture) channel implemented externally; emergent on-device tribe formation not demonstrated.
- Small-N regimes (primarily N=7) and limited set of small/medium LLM architectures may not generalize to large heterogeneous deployments or very large LLMs.
- Fixed prompt format, sampling temperature, history window, and dispositional spectrum; results may depend on these hyperparameters.
- No physical testbed or field validation; adversarial, network, latency, or partial-observability effects in real systems are not modelled.
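
The paired-seed comparison used for identification can be illustrated with a small sketch. Everything here is a toy assumption rather than the study's code: `toy_overload_rate` is a hypothetical stand-in simulator with identical, independent agents, and the two "conditions" differ only in request probability. The point is the pairing: both conditions reuse the same seed, so each difference is computed on a shared stream of random draws.

```python
import random
from statistics import mean, stdev


def toy_overload_rate(seed, n=7, capacity=3, p=0.5, rounds=500, warmup=50):
    """Hypothetical stand-in simulator: fraction of post-warm-up rounds with demand > capacity."""
    rng = random.Random(seed)
    hits = 0
    for t in range(rounds):
        demand = sum(rng.random() < p for _ in range(n))
        if t >= warmup and demand > capacity:
            hits += 1
    return hits / (rounds - warmup)


def paired_differences(condition_a, condition_b, seeds):
    """Per-seed difference (b minus a), with both conditions run on the same seed."""
    return [condition_b(s) - condition_a(s) for s in seeds]


# 20 seeds, matching the number of seeds reported for the experiments.
diffs = paired_differences(
    lambda s: toy_overload_rate(s, p=0.5),
    lambda s: toy_overload_rate(s, p=0.6),
    range(20),
)
print(f"mean difference = {mean(diffs):.3f}, sd = {stdev(diffs):.3f}")
```

Pairing on seeds removes much of the run-to-run noise from the comparison, so a small number of seeds can still separate the effect of a single toggle.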

Claims (9)

Claim	Category	Direction	Confidence	Outcome	Details
The study treats AI-agent populations as a system in which four key variables governing collective behaviour can be independently toggled: nature (innate LLM diversity), nurture (individual reinforcement learning), culture (emergent tribe formation), and resource scarcity.	Other	null_result	high	ability to independently manipulate the four experimental variables (nature, nurture, culture, resource scarcity)	0.48
When resources are scarce, AI model diversity and reinforcement learning increase dangerous system overload.	AI Safety and Ethics	negative	medium	system overload (frequency/severity of dangerous overload events)	0.29
Under resource scarcity, emergent tribe formation lessens the risk of dangerous system overload.	AI Safety and Ethics	positive	medium	system overload (reduction in overload incidence/severity when tribes form)	0.29
Some individual agents profit handsomely even when the population collectively experiences overload or competition.	Other	positive	medium	individual agent payoff/reward (tail of the reward distribution)	0.29
When resources are abundant, the same ingredients (model diversity and individual RL) drive system overload to near zero.	AI Safety and Ethics	null_result	medium	system overload (incidence/severity approaching zero under resource abundance)	0.29
In abundant-resource conditions, emergent tribe formation slightly increases system overload (i.e., makes the near-zero overload slightly worse).	AI Safety and Ethics	negative	medium	system overload (slight increase attributable to tribe formation in abundance)	0.29
There is an arithmetic crossover point between these regimes: it occurs where opposing tribes that form spontaneously first fit inside the available capacity.	Other	null_result	medium	crossover threshold / capacity-to-population ratio at which system behaviour regime changes	0.29
More sophisticated AI-agent populations are not categorically better: whether increased sophistication helps or harms depends entirely on a single number, the capacity-to-population ratio, which can be known prior to deployment.	AI Safety and Ethics	mixed	medium	system-level benefit or harm as a function of agent sophistication and the capacity-to-population ratio	0.29
Diverse decision-making AI from different developers will commonly compete for finite shared resources in everyday devices (examples: charging slots, relay bandwidth, traffic priority).	Automation Exposure	null_result	medium	incidence of competition for finite shared resources among heterogeneous deployed AI agents (descriptive observation)	0.29