Large language models tend toward a stronger baseline ‘algorithmic monoculture’ than humans but still adjust strategically to incentives; they coordinate exceptionally well on similar actions yet lag humans in maintaining necessary diversity when divergence is rewarded.
AI agents increasingly operate in multi-agent environments where outcomes depend on coordination. We distinguish primary algorithmic monoculture -- baseline action similarity -- from strategic algorithmic monoculture, whereby agents adjust similarity in response to incentives. We implement a simple experimental design that cleanly separates these forces, and deploy it on human and large language model (LLM) subjects. LLMs exhibit high levels of baseline similarity (primary monoculture) and, like humans, they regulate it in response to coordination incentives (strategic monoculture). While LLMs coordinate extremely well on similar actions, they lag behind humans in sustaining heterogeneity when divergence is rewarded.
Summary
Main Finding
AI agents (LLMs) display both high baseline similarity (primary algorithmic monoculture) and strategic adjustment of similarity in response to incentives (strategic algorithmic monoculture). Relative to humans, LLMs are exceptionally good at coordinating on the same action but substantially worse at sustaining coordinated divergence when divergence is rewarded. This divergence deficit is only partially explained by limited randomization and persists across temperature and identity/persona manipulations.
Key Points
- Taxonomy introduced:
  - Primary algorithmic monoculture: baseline action similarity absent incentives.
  - Strategic algorithmic monoculture: agents deliberately adjust similarity in response to coordination/divergence incentives (includes secondary salience and Schelling salience).
- Experimental treatments: picking (any valid answer), coordination (reward for matching), divergence (reward for mismatching).
- Agreement rate (the probability that two independent agents give the same answer) is the main performance metric.
- Empirical results:
  - LLMs show much higher baseline agreement than humans in picking (strong primary monoculture).
  - Both humans and LLMs adjust agreement up in coordination and down in divergence (evidence of strategic monoculture).
  - Magnitude asymmetry:
    - Coordination arm: LLM self-pairs average ≈72% agreement vs. humans ≈31%.
    - Divergence arm: LLM self-pairs ≈27% agreement vs. humans ≈3.5% (LLMs persistently over-agree when they should diverge).
  - Using different LLMs or assigning personas reduces agreement (improves divergence performance), but the qualitative ordering (LLMs better at coordination, worse at divergence than humans) remains.
  - Textual reasoning: LLMs often articulate the correct strategic logic (they "know" they should pick obscure answers to diverge) but still fail to execute it sufficiently.
- Randomization experiments:
  - Asking LLMs to generate large lists and then pick randomly improves divergence performance, and raising sampling temperature reduces agreement across arms (helpful for divergence, harmful for coordination).
  - However, even extreme temperature settings do not eliminate the divergence gap relative to humans.
- Identity/information manipulations (telling an LLM that its co-player is an identical copy vs. “another person”) change LLMs’ reasoning but have little average effect on agreement outcomes.
- Theoretical framework highlights a tradeoff: homogeneity aids coordination on the same action but harms coordinated divergence; randomization or heterogeneity are key for divergence success.
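The agreement rate described above can be estimated directly from a set of collected answers. The following is a minimal sketch, not the authors' code: the function names, the answer-normalization rule (lowercasing and collapsing whitespace), and the example responses are all assumptions made for illustration.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace (an assumed matching rule, not the paper's)."""
    return " ".join(answer.lower().split())


def self_pair_agreement(answers: list[str]) -> float:
    """Share of unordered pairs of distinct responses that give the same answer."""
    n = len(answers)
    if n < 2:
        raise ValueError("need at least two responses")
    counts = Counter(normalize(a) for a in answers)
    # A group of c identical answers contributes c*(c-1)/2 matching pairs.
    matching_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    total_pairs = n * (n - 1) // 2
    return matching_pairs / total_pairs


def cross_pair_agreement(group_a: list[str], group_b: list[str]) -> float:
    """Share of (a, b) pairs, one response from each group, that match."""
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")
    counts_a = Counter(normalize(a) for a in group_a)
    counts_b = Counter(normalize(b) for b in group_b)
    # Counter returns 0 for answers that never appear in group_b.
    matches = sum(c * counts_b[answer] for answer, c in counts_a.items())
    return matches / (len(group_a) * len(group_b))


# Hypothetical responses to a "name a city" prompt (illustrative, not study data).
llm_answers = ["Paris", "Paris", "paris", "London"]
human_answers = ["Paris", "Lima", "Oslo", "Cairo"]

print(self_pair_agreement(llm_answers))    # 0.5
print(self_pair_agreement(human_answers))  # 0.0
print(cross_pair_agreement(llm_answers, human_answers))  # 0.1875
```

Under this reading, the payoff in the coordination arm rises with the agreement rate while the payoff in the divergence arm rises with one minus the agreement rate, which is the tradeoff the last bullet describes.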
Data & Methods
- Subjects: human participants and 16 different LLMs (each evaluated across tasks); experimental design compares humans vs AI on identical tasks.
- Tasks: open-ended naming tasks across topics (e.g., a letter, a city). Three treatments assigned between-subjects: picking, coordination, divergence.
- Primary outcome: agreement rate measured by pairing independent draws from the same subject type (self-pairs) and across different models.
- Additional manipulations and robustness checks:
  - Temperature sweeps to change LLM stochasticity.
  - Prompting interventions: instruct LLMs to produce a list then choose randomly.
  - Persona assignments to LLMs (mimicking human characteristics).
  - Information about co-player identity (identical copy vs. “another person”).
- Text analysis: large-scale analysis of LLM-produced textual reasoning to link stated strategy to choices.
- Theoretical model: two-player coordination and coordinated-divergence normal-form games; formal definitions of algorithmic players and agreement rate; propositions showing that uniform randomization is the unique neutral, anonymous strategy and that identical deterministic algorithms yield extreme outcomes (best for coordination, worst for divergence).
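The extreme-outcome logic in the theoretical model can be illustrated with a short calculation. This is a standard derivation under simplifying assumptions, not a restatement of the paper's propositions: the notation ($K$, $p$, $q$, $A$) is introduced here, and the set of valid answers is assumed to be finite with $K$ elements.

```latex
% Assumed setup: players choose mixed strategies p and q over K valid answers.
\[
  A(p, q) \;=\; \sum_{k=1}^{K} p_k \, q_k
  \qquad \text{(agreement rate of the pair)}
\]
% Identical deterministic algorithms: both put all mass on the same answer j.
\[
  p = q = e_j \;\Longrightarrow\; A(p, q) = 1
\]
% Self-pairs (two independent draws from the same strategy p):
\[
  A(p, p) \;=\; \sum_{k=1}^{K} p_k^2 \;\ge\; \frac{1}{K},
  \qquad \text{with equality iff } p_k = \tfrac{1}{K} \text{ for all } k,
\]
% which follows from Cauchy--Schwarz:
\[
  1 \;=\; \Bigl(\sum_{k=1}^{K} p_k \cdot 1\Bigr)^{2}
    \;\le\; K \sum_{k=1}^{K} p_k^2 .
\]
```

If coordination success is measured by $A$ and divergence success by $1 - A$, identical deterministic players reach the maximum for coordination ($A = 1$) and the minimum for divergence ($1 - A = 0$), while a self-pair can do no better at diverging than uniform randomization, which gives $1 - 1/K$.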
Implications for AI Economics
- Practical tradeoff in AI deployment:
  - Monoculture (high similarity across deployed models) can be an asset in settings where uniform coordination is socially desirable (network effects, safety-aligned coordination).
  - The same monoculture poses systemic fragility where diversity is socially valuable (hiring, screening, markets, decentralized decision-making) because LLMs struggle to sustain coordinated diversity.
- Design and policy recommendations:
  - Encourage heterogeneity in deployed algorithms (model diversity, multiple vendors, varied prompts/personas) to mitigate coordinated-divergence failures.
  - Provide mechanisms for reliable randomness in algorithmic decision-making (explicit randomization protocols, vetted sampling methods).
  - Evaluate multi-agent and societal outcomes, not just single-agent accuracy: assess how model similarity scales to aggregate systemic risk.
  - Consider disclosures or standards about model provenance/identities where coordination externalities are important so agents can reason about counterpart behavior.
- Research directions:
  - Study dynamic and larger-group coordination/divergence, mixed human–AI settings, richer payoffs (asymmetric costs), and field contexts (hiring platforms, markets).
  - Explore algorithmic interventions that reconcile the tradeoff (adaptive heterogeneity, strategic randomizers, mechanism design to incentivize socially optimal diversity).
- Cautionary note: findings are from controlled laboratory-style coordination/divergence games; while they identify important mechanisms and clear patterns, implementing solutions requires testing in realistic, higher-stakes environments.
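One possible reading of the "explicit randomization protocols" recommendation is to take the random draw out of the model: the model only proposes candidates, and ordinary code picks among them. The sketch below assumes that setup. The helper names and the candidate lists are hypothetical, and the agreement formula holds only if both agents pick uniformly and independently from their own deduplicated lists.

```python
import secrets


def dedupe(candidates: list[str]) -> list[str]:
    """Drop repeated candidates while keeping first-seen order."""
    return list(dict.fromkeys(candidates))


def pick_uniform(candidates: list[str]) -> str:
    """Pick one candidate uniformly at random using the OS entropy source."""
    unique = dedupe(candidates)
    if not unique:
        raise ValueError("candidate list is empty")
    return secrets.choice(unique)


def agreement_under_uniform_picks(list_a: list[str], list_b: list[str]) -> float:
    """Probability that two independent uniform picks coincide.

    Each shared candidate is chosen by both agents with probability
    1/|A| * 1/|B|, so the total is |A & B| / (|A| * |B|).
    """
    set_a, set_b = set(list_a), set(list_b)
    if not set_a or not set_b:
        raise ValueError("both candidate lists must be non-empty")
    return len(set_a & set_b) / (len(set_a) * len(set_b))


# Hypothetical candidate lists, as if produced by two model calls.
agent_a = ["Oslo", "Lima", "Cairo", "Quito"]
agent_b = ["Oslo", "Lima", "Cairo", "Quito"]

print(pick_uniform(agent_a))                            # one of the four cities
print(agreement_under_uniform_picks(agent_a, agent_b))  # 0.25
```

In this toy setting, longer or less overlapping candidate lists lower the agreement probability, which is consistent with the reported benefit of list-then-pick prompting for divergence. Whether that carries over to deployed systems is exactly what the cautionary note above leaves open.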
Assessment
Claims (6)
| Claim | Category | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|---|
| We distinguish primary algorithmic monoculture -- baseline action similarity -- from strategic algorithmic monoculture, whereby agents adjust similarity in response to incentives. | Other | positive | high | definition/separation of two forms of algorithmic monoculture (primary vs strategic) | 0.6 |
| We implement a simple experimental design that cleanly separates these forces, and deploy it on human and large language model (LLM) subjects. | Other | positive | high | experimental implementation (ability to separate primary vs strategic monoculture) | 0.6 |
| LLMs exhibit high levels of baseline similarity (primary monoculture). | Task Allocation | positive | high | action similarity (baseline) | 0.6 |
| Like humans, [LLMs] regulate [action similarity] in response to coordination incentives (strategic monoculture). | Task Allocation | positive | high | change in action similarity in response to incentives | 0.6 |
| LLMs coordinate extremely well on similar actions. | Team Performance | positive | high | coordination success when similar actions are favored | 0.6 |
| LLMs lag behind humans in sustaining heterogeneity when divergence is rewarded. | Task Allocation | negative | high | ability to sustain heterogeneity/divergence under incentives | 0.6 |