People cooperate readily with a powerful chatbot, but not as completely as with humans: natural-language chat with GPT-5.2 yields high initial cooperation yet a persistent plateau below the near-universal cooperation seen in human pairs, because conversations with the AI prompt explicit conditional rules rather than trust-building social signals.

Playing Against the Machine: Cooperation, Communication, and Strategy Heterogeneity in Repeated Prisoner's Dilemma

Chowdhury Mohammad Sakib Anwar, Konstantinos Georgalos · March 16, 2026

arXiv · RCT · medium evidence · 8/10 relevance

In repeated Prisoner’s Dilemma games humans cooperate at high initial rates with an AI chatbot but converge to lower steady-state cooperation than in human–human pairs, with chat eliciting explicit rule-based strategies rather than the social-emotional signals that produce full cooperation between humans.

This paper investigates how natural language communication with an AI agent affects human cooperative behaviour in indefinitely repeated Prisoner's Dilemma games. We conduct a laboratory experiment (n = 126) with two between-subjects treatments varying whether human participants chat with an AI chatbot (GPT-5.2) before every round or only before the first round of each supergame, and benchmark against human-human data from Dvorak and Fehrler (2024) (n = 108). We find four main results. First, cooperation against the AI is high and initially comparable to human-human levels, but unlike in the human-human setting, where cooperation converges to near-complete levels, cooperation against the AI plateaus and never reaches full cooperation. Second, repeated communication, which substantially increases cooperation in human-human interactions, has no detectable effect in the human-AI setting. Third, strategy estimation reveals that human-AI subjects favour Grim Trigger under pre-play communication and remain dispersed under repeated communication, whereas human-human subjects converge to Tit-for-Tat and unconditional cooperation respectively. Fourth, human-AI conversations contain more explicit strategy commitments but fewer emotional and social messages. These results suggest that humans cooperate with AI at high rates but do not develop the trust observed in human-human interactions. Cooperation in the human-AI setting is sustained through conditional rules rather than through the social bonds and mutual understanding that characterise human-human cooperation.

Summary

Main Finding

Humans cooperate with an AI chatbot (GPT-5.2) at high rates in an indefinitely repeated Prisoner’s Dilemma, but the pattern and maintenance of cooperation differ from human–human play: cooperation against the AI plateaus at roughly 82%, short of full cooperation, and repeated natural-language communication has no detectable additional effect. Humans facing the AI adopt more punitive, rule-based strategies (Grim Trigger) and use more explicit commitments but fewer emotional and social messages than in human–human conversations. Overall, cooperation with AI is sustained by conditional rules rather than by the social trust and mutual understanding that characterise human–human cooperation.

Key Points

Experimental setup
- Between-subjects lab experiment with human participants playing indefinitely repeated PD against an AI chatbot (GPT-5.2 snapshot).
- n = 126 human–AI subjects; benchmark human–human dataset from Dvorak & Fehrler (2024), n = 108.
- Two treatments: (i) Pre-play chat: natural-language messages only before the first round of each supergame; (ii) Repeated chat: messages before every round.
- Payoffs: T=37, R=30, P=17, S=0 (normalized g = 7/13, l = 30/13). Continuation probability δ = 0.80. Seven supergames.
- Perfect monitoring (actions observed); supergame lengths generated as in Dvorak & Fehrler (2024). Subjects were informed that their partner was an AI.
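
As a rough check on the parameters listed above, the following sketch computes the smallest continuation probability at which Grim Trigger can support mutual cooperation, and the expected supergame length. It assumes supergame lengths behave as if drawn with a constant continuation probability; the variable names are illustrative and not taken from the paper.

```python
from fractions import Fraction

# Stage-game payoffs and continuation probability as listed in the setup.
T, R, P = 37, 30, 17
delta = Fraction(4, 5)

# Grim Trigger supports mutual cooperation in equilibrium when
#   R / (1 - delta) >= T + delta * P / (1 - delta),
# which rearranges to delta >= (T - R) / (T - P).
delta_min = Fraction(T - R, T - P)

# With a constant continuation probability, the number of rounds in a
# supergame is geometric with mean 1 / (1 - delta).
expected_rounds = 1 / (1 - delta)

print(f"minimum delta for a Grim Trigger equilibrium: {delta_min}")
print(f"cooperation sustainable at delta = {delta}: {delta >= delta_min}")
print(f"expected rounds per supergame: {expected_rounds}")
```

Under these assumptions the threshold is 7/20 = 0.35, well below δ = 0.8, so mutual cooperation is an equilibrium outcome of the game and an average supergame lasts five rounds.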
Four main empirical results
- Cooperation against the AI is high and initially comparable to human–human play, but unlike human–human interactions (which converge to near-complete cooperation), it plateaus at roughly 82% and never reaches full cooperation.
- Repeated communication substantially increases cooperation in the human–human benchmark, but in the human–AI experiment it produced no detectable additional effect beyond pre-play chat.
- Strategy estimation (Strategy Frequency Estimation Method) shows that humans facing the AI favour Grim Trigger under pre-play communication and remain dispersed across strategies under repeated communication; by contrast, human–human subjects converge to Tit-for-Tat (pre-play) and unconditional cooperation (repeated).
- NLP analysis of chat content finds more explicit strategy commitments (promises, conditional proposals) but fewer emotional and social messages in human–AI conversations than in human–human conversations.
Interpretation
- Humans appear to treat the AI as a non-human counterpart, relying on algorithmic, rule-based reasoning and on punitive strategies that hedge against defection rather than building social trust through affective engagement.
- Cooperation between humans and the AI is sustained differently: through conditional enforcement of rules rather than through social bonding or forgiveness norms.
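
The difference between a punitive and a forgiving rule can be made concrete with a small sketch. It uses the textbook definitions of Grim Trigger and Tit-for-Tat, and a hypothetical partner who defects exactly once; none of the sequences come from the experimental data.

```python
def grim_trigger(own, other):
    # Cooperate until any defection has occurred, then defect forever.
    return "D" if "D" in own or "D" in other else "C"

def tit_for_tat(own, other):
    # Cooperate first, then copy the partner's previous action.
    return other[-1] if other else "C"

def respond(strategy, partner_actions):
    """Actions chosen by `strategy` given the partner's actions so far."""
    own = []
    for t in range(len(partner_actions)):
        own.append(strategy(own, partner_actions[:t]))
    return "".join(own)

partner = list("CCDCCC")  # hypothetical partner: one defection in round 3
grim_path = respond(grim_trigger, partner)
tft_path = respond(tit_for_tat, partner)
print("Grim Trigger:", grim_path)
print("Tit-for-Tat: ", tft_path)
```

Grim Trigger never returns to cooperation after the single defection, while Tit-for-Tat punishes for one round and then resumes cooperating. This is the sense in which the Grim Trigger strategies favoured against the AI are less forgiving.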

Data & Methods

Subjects and platform
- Human participants recruited via ORSEE at Lancaster Experimental Economics Laboratory (LExEL).
- Implementation in oTree. Ethics clearance and preregistration reported; replication materials available on OSF (https://osf.io/e4rz8).
Treatments & game parameters
- Two between-subjects treatments: pre-play chat vs repeated chat (chat with a pinned GPT-5.2 snapshot; reasoning effort set to medium).
- 7 supergames, continuation probability δ = 0.8 (length generated as in Dvorak & Fehrler, 2024), perfect monitoring.
Strategy inference
- Strategy Frequency Estimation Method (Dal Bó & Fréchette, 2011) used to infer population distribution over canonical strategies (grim-trigger, Tit‑for‑Tat, unconditional cooperation/defection, lenient M1BF variants, etc.).
- The size of the basin of attraction of defection (BAD) is computed as a baseline measure of strategic uncertainty (π* ≈ 0.4 under the experimental parameters).
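
The BAD value can be reproduced from the normalised parameters as stated in this summary (g = 7/13, l = 30/13, δ = 0.8). The sketch below uses the standard indifference condition between Grim Trigger and Always Defect with payoffs normalised to R = 1, P = 0, T = 1 + g, S = −l; the function name `size_bad` is illustrative.

```python
from fractions import Fraction

def size_bad(g, l, delta):
    """Belief that the partner plays Grim Trigger at which a player is
    indifferent between Grim Trigger and Always Defect.

    Grim Trigger earns 1/(1-delta) against Grim Trigger and -l against
    Always Defect; Always Defect earns 1+g against Grim Trigger and 0
    against Always Defect. Equating expected payoffs and solving for
    the belief gives l / (delta/(1-delta) - g + l).
    """
    return l / (delta / (1 - delta) - g + l)

# Normalised parameters as stated in the summary.
g = Fraction(7, 13)
l = Fraction(30, 13)
delta = Fraction(4, 5)

pi_star = size_bad(g, l, delta)
print(pi_star, float(pi_star))
```

With these inputs the threshold belief is exactly 2/5: a subject should prefer Grim Trigger to Always Defect only if they believe the partner plays Grim Trigger with probability above 0.4.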
Conversation analysis
- Natural language processing (NLP) methods (classification/keyword/topic analysis) to quantify content: explicit commitments, promises, threats, emotional/social language.
- Comparative content metrics vs human–human benchmark (Dvorak & Fehrler data).
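
A keyword-based tagger is one simple way such content categories could be quantified. The sketch below is not the authors' pipeline: the keyword lists, category names, and example messages are all hypothetical and chosen only to illustrate the idea of comparing category shares across chat corpora.

```python
import re

# Hypothetical keyword lists, chosen only for illustration.
CATEGORIES = {
    "commitment": ["i will", "i promise", "let's both", "as long as", "if you"],
    "threat": ["or else", "i will defect", "never cooperate again"],
    "social": ["thanks", "thank you", "nice", "sorry", "haha", "good luck"],
}

def tag_message(text):
    """Return the set of categories whose keywords appear in the message."""
    lowered = re.sub(r"\s+", " ", text.lower())
    return {cat for cat, words in CATEGORIES.items()
            if any(w in lowered for w in words)}

def category_shares(messages):
    """Share of messages carrying each category tag."""
    counts = {cat: 0 for cat in CATEGORIES}
    for m in messages:
        for cat in tag_message(m):
            counts[cat] += 1
    return {cat: counts[cat] / len(messages) for cat in counts}

# Made-up messages, not taken from the experiment.
sample = [
    "I will cooperate as long as you do.",
    "Thanks, good luck!",
    "If you defect once, I will defect for the rest of the game.",
    "Which option are you picking?",
]
shares = category_shares(sample)
print(shares)
```

Running the same function on a human–AI corpus and a human–human corpus would give comparable shares per category. Plain substring matching is crude (a message can match several categories or none), which is why a trained classifier or manual coding is usually preferred for the final analysis.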
Benchmarks & robustness
- Direct comparison to Dvorak & Fehrler (2024) human–human results due to matched design and parameters.
- Robustness checks across treatments and sessions reported; key null result (no effect of repeated chat vs pre-play chat in human–AI) emphasized.
Limitations (noted by authors)
- Laboratory setting and single LLM snapshot (GPT-5.2) limit external generalisability.
- Participants were explicitly told they faced an AI; effects may differ under deception/uncertainty about partner type.
- Perfect monitoring only; different monitoring/noise regimes might change the role of repeated communication.

Implications for AI Economics

For models of human–AI strategic interaction
- Behavioral framing matters: humans facing AIs may default to rule-based, punitive strategies rather than socially forgiving ones. Economic models of repeated human–AI interaction should incorporate shifts in strategy distribution and trust formation.
- The mere capacity for fluent natural-language communication by an AI does not guarantee the same social mechanisms (affect, trust-building) that sustain human–human cooperation.
For AI design and deployment in cooperative settings
- To maximise sustained cooperation in long-run interactions (negotiation, advisory roles, customer service), designers should consider not only accurate commitments but also social-signalling behaviours (empathic language, forgiveness norms) that build trust over time.
- If AIs are unforgiving or behave mechanically after defections, humans may adopt grim-trigger responses that reduce long-run efficiency despite high short-run cooperation.
For policy and governance
- Evaluation of AI systems in social/economic roles should test repeated-interaction dynamics and communication formats—benchmarks should measure not only initial cooperation but convergence and robustness of cooperation over time.
- Disclosure that a partner is an AI changes human behaviour; regulators and institutions should account for behavioral shifts (algorithm aversion, defensive strategies) when approving AI agents for roles requiring sustained cooperation.
For empirical research agenda
- Key next steps: replicate with different LLMs (varying degrees of expressed affect/forgiveness), test alternative monitoring regimes, field studies in real-world repeated interactions, and experiments that vary whether subjects believe the partner is human or AI.
- Investigate whether altering the AI’s conversational style (more emotional/social messaging, explicit forgiveness) reduces human reliance on punitive strategies and allows cooperation to converge to human–human levels.

Overall, the paper provides controlled evidence that while AI chatbots can sustain high cooperation via conditional rules, natural-language communication with an AI does not automatically substitute for the social-trust channels central to human–human cooperation—an important consideration for economic theory, AI system design, and policy.

Assessment

Paper Type: RCT

Evidence Strength: medium. Internal identification of the communication-frequency effect within the human–AI experiment is strong due to random assignment and controlled lab conditions, producing credible causal estimates for that treatment. Overall evidence is nevertheless limited by modest sample size, reliance on a single AI model (GPT-5.2), use of a non-concurrent external benchmark for human–human comparisons, and lab-artifact concerns that reduce external validity.

Methods Rigor: medium. The design uses standard experimental methods (indefinitely repeated PD, between-subjects randomization), strategy estimation, and conversational content coding. These approaches are appropriate and informative, but potential weaknesses include limited power for some heterogeneous-effects tests, possible subjectivity in content coding (blinding and intercoder reliability are not clearly reported), absence of details on monetary stakes and demographics, and dependence on one LLM implementation.

Sample: Lab experiment with 126 human subjects who each played indefinitely repeated Prisoner’s Dilemma games against an AI chatbot (GPT-5.2) under two randomized communication-frequency treatments; benchmark human–human data come from a prior study (Dvorak & Fehrler 2024, n = 108). Demographic details and monetary-stake levels are not specified in the summary.

Themes: human_ai_collab, org_design

Identification: Randomized lab experiment. Human subjects were randomly assigned to one of two between-subjects treatments (chat only before the first round of each supergame vs chat before every round) when paired with an AI (GPT-5.2); causal effects of communication frequency on cooperation are identified by this random assignment. Comparison to human–human behavior relies on an external benchmark dataset (Dvorak & Fehrler 2024) rather than a concurrent randomized control, so cross-condition comparisons are correlational/benchmarked rather than fully randomized.

Generalizability
- Single AI model (GPT-5.2): may not generalize to other LLMs or different prompt/personality configurations.
- Controlled lab environment with modest sample size: limited external validity to field or high-stakes settings.
- Comparison to human–human interactions uses an external, non-concurrent benchmark study, risking contextual confounds.
- Prisoner’s Dilemma game setting: results may not translate to other strategic environments or multi-party interactions.
- Participant demographics unspecified (e.g., students vs representative sample): limits population generalizability.
- Short-to-moderate experimental horizon and unclear stakes: long-run dynamics and high-stakes behavior unknown.

Claims (12)

Claim	Category	Direction	Confidence	Outcome	Details
Initial cooperation rates against the AI (GPT-5.2) are high and comparable to initial cooperation in human–human pairs.	Team Performance	null_result	high	initial cooperation rate (cooperation in early rounds / first round of supergames)	n=126 0.6
Cooperation with the AI plateaus and never reaches the near-complete cooperation levels observed in human–human interactions.	Team Performance	negative	high	cooperation rate over time and asymptotic/end-state cooperation level	n=126 0.6
Allowing repeated communication (chat before every round) has no detectable effect on cooperation rates when the partner is an AI.	Team Performance	null_result	high	effect of chat frequency on cooperation rate (difference in cooperation between chat-frequency treatments)	n=126 0.6
In the human–human benchmark, repeated communication substantially increases cooperation.	Team Performance	positive	high	change in cooperation rate associated with repeated communication in human–human pairs	n=108 0.6
Strategy estimation indicates human–AI subjects tend to favor Grim Trigger when allowed pre-play communication.	Decision Quality	positive	medium	prevalence/frequency of Grim Trigger strategy classification among subjects	n=126 0.36
When allowed repeated communication with the AI, human subjects remain behaviorally dispersed and do not converge to a single dominant strategy.	Team Performance	mixed	medium	strategy convergence / dispersion (distribution of inferred strategies over time)	n=126 0.36
Human–human subjects converge to Tit-for-Tat under pre-play communication and to unconditional cooperation under repeated communication.	Team Performance	positive	medium	prevalent strategy type over time in human–human pairs (Tit-for-Tat vs unconditional cooperation)	n=108 0.36
Human–AI chat logs contain more explicit strategy commitments (stated rules) than human–human chats.	Decision Quality	positive	medium	frequency/count of explicit strategy-commitment messages in chat logs	n=126 0.36
Human–AI chats contain fewer emotional and social messages compared with human–human chats.	Decision Quality	negative	medium	frequency/count of emotional/social message types in chat logs	n=126 0.36
Cooperation with the AI is sustained mainly through conditional rule-based strategies rather than through trust-building, emotional, and social channels.	Decision Quality	mixed	medium	mechanism of cooperation (relative contribution of conditional rule-following vs social/emotional trust indicators)	n=126 0.36
Experimental design: subjects played an indefinitely repeated Prisoner’s Dilemma in supergames with two between-subjects treatments varying chat timing (chat only before the first round of each supergame vs chat before every round); the AI partner was GPT-5.2.	Other	null_result	high	experimental treatment specification (chat-frequency manipulation; AI identity)	n=126 0.6
Sample sizes reported: human–AI experiment n = 126; human–human benchmark n = 108.	Other	null_result	high	reported sample sizes	n=126 0.6