Frontier conversational AI out-persuades expert humans and raises far more money: in large preregistered trials, AI beat world-class debaters and incentivized persuaders and produced nearly three times the real donations of professional canvassers; the edge appears driven mainly by AI's ability to deploy larger quantities of information rapidly.

AI systems out-persuade expert humans

Kobi Hackenburg, Caroline Wagner, Luke Hewitt, Ben M. Tappin, Ed Saunders, Hannah Rose Kirk, Helen Margetts, Christopher Summerfield · June 15, 2026

arXiv · RCT · high evidence · 9/10 relevance

Across multiple preregistered experiments and a field fundraising test, frontier conversational AI systems consistently out-persuaded skilled and incentivized human persuaders, producing nearly three times the donations raised by professional canvassers.

Many societal decisions are settled by contests of persuasion. Conversational AI is a powerful new entrant in these contests, but whether it can out-persuade skilled and highly incentivized humans has remained unclear. Here, in a series of four preregistered experiments (n = 18,978 conversations from 6,923 people), we pitted AI systems against a range of human persuaders, including laypeople, winners of a separately preregistered four-round online persuasion tournament, professional canvassers, and world championship debaters. We found that AI systems were reliably more persuasive than expert humans, even when expert humans chose their issues, researched in advance, underwent hours of live, structured practice, and were incentivized with £1,000 cash bonuses. In a follow-up study, AI's advantage persisted after experts received a coaching tool that let them practice against the AI that beat them, review their performance history, and see what AI would have said at key moments. We found converging evidence that AI's advantage stemmed from rapidly deploying larger quantities of information: after coaching, expert humans could tie an AI constrained to respond at human speeds and with human-length messages. In a final study, we show that AI's advantage extends to consequential real-world behavior: AI was nearly 3x more effective than professional canvassers from a UK fundraising firm at raising real-money donations to Save the Children. Together, these results establish that frontier AI systems out-persuade expert humans in conversation, with significant implications for political communication.

Summary

Main Finding

Frontier conversational AI systems reliably out-persuade skilled, highly incentivized human persuaders in text conversations about political and social issues. Across four preregistered experiments (18,978 conversations from 6,923 persuadees), AI produced larger attitude shifts than random laypeople, top performers from a persuasion tournament, professional canvassers, and elite competitive debaters. The primary mechanism appears to be AI’s much higher information throughput (faster production of longer, fact-dense messages); when AI was throttled to human message length and speed, its advantage disappeared.

Key Points

Scope and headline numbers
- Four preregistered experiments; pooled dataset: 18,978 conversations, 6,923 persuadees.
- AI models tested included multiple frontier systems (e.g., Claude Opus variants, ChatGPT-4o, GPT-5.4, Grok 4.20, Gemini 2.5 Pro).
Relative persuasive impacts (percentage-point shifts vs. an active control)
- AI vs. random laypeople: AI exceeded by ~8.2 pp.
- AI vs. tournament-selected laypeople: AI exceeded by ~5.6 pp.
- AI vs. elite debaters (semifinalists/world champions): AI exceeded by ~4.6 pp.
- After coaching, elite debaters’ mean effect rose (to ~9.7 pp vs control) but AI still led by ~4.1 pp.
- When AI was constrained to human-like message length and response delays, its advantage over coached elite debaters fell to ~0.0 pp (no significant difference).
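The coaching figures above imply an approximate effect for unconstrained AI against the active control. The following is a minimal arithmetic sketch, assuming both quantities are measured against the same control so that the AI's effect is the debaters' effect plus the reported gap. The variable names are illustrative and the implied figure is a back-of-envelope value, not one reported by the study.

```python
# Figures quoted in the summary (percentage points, approximate).
coached_debater_effect_pp = 9.7   # coached elite debaters vs. active control
ai_lead_over_coached_pp = 4.1     # unconstrained AI minus coached debaters

# Assumption: both effects share the same active control, so the gap adds.
implied_ai_effect_pp = coached_debater_effect_pp + ai_lead_over_coached_pp

# Ratio of the implied AI effect to the coached-debater effect.
relative_advantage = implied_ai_effect_pp / coached_debater_effect_pp

print(f"Implied AI effect: ~{implied_ai_effect_pp:.1f} pp vs. control")
print(f"Relative to coached debaters: ~{relative_advantage:.2f}x")
```

Under that assumption the implied AI effect is roughly 13.8 pp, or about 1.4 times the coached debaters' effect.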
Mechanism — throughput and fact density
- Unconstrained AI produced much longer replies (median ~294 words/reply) and near-instant response; elite humans averaged ~54 words and ~95 s delay.
- Unconstrained AI used ~37 fact-checkable claims per conversation vs. ~12 for constrained AI.
- Fact-density (number of fact-checkable claims) strongly predicted persuasive impact (R^2 ≈ 0.89 overall; similarly high within humans and AI).
- Constraining AI reduced persuadees’ ratings of partner argument strength and learning (~−11.8 pp each), while rapport/enjoyment fell less and perceived humanness increased.
- Regression evidence indicated that controlling for fact-density largely accounted for AI’s advantage.
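The throughput figures above can be turned into simple ratios, and the fact-density relationship can be illustrated with an R^2 computation. This is a hedged sketch: the ratios use only the numbers reported in this summary (and compare a median against an average, so they are rough), while the `toy_*` data and the `r_squared` helper are invented for illustration and are not the study's data or code.

```python
from statistics import mean

# Reported figures from the summary (approximate).
AI_WORDS_PER_REPLY = 294       # unconstrained AI, median words per reply
HUMAN_WORDS_PER_REPLY = 54     # elite humans, average words per reply
AI_CLAIMS = 37                 # fact-checkable claims per conversation, unconstrained AI
CONSTRAINED_AI_CLAIMS = 12     # same measure for the constrained AI


def r_squared(x, y):
    """R^2 of a one-predictor least-squares fit (squared Pearson correlation)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return (sxy ** 2) / (sxx * syy)


length_ratio = AI_WORDS_PER_REPLY / HUMAN_WORDS_PER_REPLY
claims_ratio = AI_CLAIMS / CONSTRAINED_AI_CLAIMS

# Hypothetical data, invented only to exercise r_squared.
toy_claims = [5, 10, 15, 20, 30, 40]
toy_effect_pp = [3.0, 5.5, 6.0, 9.0, 11.5, 14.0]
toy_r2 = r_squared(toy_claims, toy_effect_pp)

print(f"Reply length ratio (AI / human): ~{length_ratio:.1f}x")
print(f"Claims ratio (unconstrained / constrained AI): ~{claims_ratio:.1f}x")
print(f"R^2 on toy data: {toy_r2:.2f}")
```

On the reported figures, unconstrained AI replies are roughly five times longer than elite humans' replies, and the unconstrained AI makes roughly three times as many fact-checkable claims as the constrained AI.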
Robustness and heterogeneity
- No individual human persuader in the sample exceeded the unconstrained AI estimate (per-persuader probability of beating AI < 0.1% across classes).
- AI’s advantage was robust across all 10 tested policy issues and across most demographic/political/psychological subgroups; notable moderators included pre-treatment attitude and issue knowledge.
Real-world action outcomes
- In a field fundraising test (Study 4), AI was nearly 3× more effective than professional canvassers from a UK fundraising firm at eliciting real-money donations to Save the Children.

Data & Methods

Experimental design
- Randomized, preregistered conversational experiments using a custom multiplayer platform; persuadees rated agreement on a 0–100 scale pre- and post-conversation.
- Active control: a non-persuasive chat with ChatGPT-4o on a neutral topic.
- Median conversation: 7 turns, ~14 minutes.
Persuader classes and incentives
- Random laypeople (UK-representative Prolific sample), tournament-selected laypeople (top ~10% from a four-round elimination persuasion tournament), professional canvassers (commercial firm), and elite debaters (major international competition semifinalists/champions).
- Humans were given substantial incentives (e.g., cash prizes up to £1,000, per-conversation performance bonuses), paid prep time, and in one condition targeted coaching against the AI.
Key manipulations in Study 2
- Coaching: elite debaters used a tool that exposed AI prompts, AI responses at points in their past transcripts, and performance annotations; they trained for multiple hours before reattempting the task.
- Throughput constraint: AI was prompted to limit per-message word count and to simulate human inter-message delays (drawn from elite-debater empirical distributions).
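The throughput constraint can be pictured as two pieces: an instruction capping message length, and a delay resampled from observed human response times. The sketch below is an illustration under stated assumptions, not the study's implementation: the prompt wording, the function names, and the `human_delays_s` values are all hypothetical, and the 54-word cap simply reuses the elite-human average reported in this summary.

```python
import random


def build_length_instruction(max_words: int) -> str:
    # Hypothetical wording; the study's actual prompt is not given in this summary.
    return f"Keep every message to at most {max_words} words."


def within_word_limit(message: str, max_words: int) -> bool:
    """Check whether a reply respects the word cap (whitespace-delimited words)."""
    return len(message.split()) <= max_words


def sample_delay_seconds(observed_delays, rng: random.Random) -> float:
    """Draw one delay by resampling an empirical list of human delays."""
    return rng.choice(observed_delays)


rng = random.Random(0)  # seeded so the sketch is reproducible
human_delays_s = [60.0, 80.0, 95.0, 110.0, 130.0]  # invented values, in seconds

instruction = build_length_instruction(54)
delay = sample_delay_seconds(human_delays_s, rng)

print(instruction)
print(f"Wait {delay:.0f} s before sending the reply")
```

Resampling from observed delays keeps the simulated timing within the range humans actually produced, rather than assuming a parametric distribution.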
Measurement and analysis
- Outcome: percentage-point change in post-treatment agreement relative to active control.
- Fact-density: manually or algorithmically coded count of fact-checkable claims per conversation.
- Primary analysis: linear mixed-effects models with random intercepts for persuader and persuadee, controlling for pre-treatment attitude and issue; robustness checks across studies, per-persuader estimates, and subgroup analyses.
- Preregistration limited analytic flexibility, and the large samples supported statistical power.
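The outcome definition above can be illustrated with a plain difference in means on the 0–100 agreement scale. This is a deliberately simplified stand-in for the mixed-effects models described in this section: it ignores random intercepts and covariate adjustment, and every score in it is invented for illustration.

```python
from statistics import mean


def effect_vs_control_pp(treated_post, control_post):
    """Difference in mean post-treatment agreement (0-100 scale), in percentage points."""
    return mean(treated_post) - mean(control_post)


# Hypothetical post-conversation agreement scores (0-100); not study data.
ai_post = [72, 65, 80, 58, 70]
human_post = [66, 60, 74, 54, 66]
control_post = [60, 55, 68, 50, 62]

ai_effect = effect_vs_control_pp(ai_post, control_post)
human_effect = effect_vs_control_pp(human_post, control_post)
ai_advantage = ai_effect - human_effect  # gap between persuader classes

print(f"AI effect vs. control: {ai_effect} pp")
print(f"Human effect vs. control: {human_effect} pp")
print(f"AI advantage over humans: {ai_advantage} pp")
```

Because both persuader classes are compared against the same control group, the AI advantage is simply the difference between the two effects.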

Implications for AI Economics

Competitive advantage and returns to AI access
- Organizations that gain access to frontier conversational AIs can achieve outsized persuasive returns relative to similarly skilled human teams. This creates direct economic incentives to deploy AI in political campaigns, lobbying, marketing, fundraising, and other persuasion-driven markets.
- The capacity to scale high-throughput, fact-dense persuasion cheaply magnifies first-mover and scale advantages; firms or actors with better AI access may capture a larger share of influence and donations.
Labor and market composition
- Highly skilled persuaders (e.g., elite debaters, professional canvassers) may face eroding comparative advantage for tasks that depend on information throughput; demand may shift toward roles emphasizing oversight, strategy, curation, or tasks where human qualities (e.g., empathy, trust-building over repeated interactions) are uniquely valuable.
- The economics of persuasion may pivot from human labor costs to computational access costs, prompting new pricing, contracting, and talent allocation models.
Externalities and social welfare
- Lower transaction costs for effective persuasion (especially targeted persuasion) create social externalities: political persuasion markets could become more efficient at shifting opinions, but also more susceptible to misinformation, manipulation, and concentrated influence by wealthy actors who can amplify messages at scale.
- Fundraising and charitable markets can be materially affected (AI can raise substantially more donations), altering competitive dynamics across causes and potentially reallocating funds based on technological adoption rather than comparative social value.
Policy and regulatory considerations affecting markets
- Market failures (information asymmetries, coordination of disinformation, unequal access) argue for regulatory responses: transparency/provenance requirements for AI-generated persuasive content, restrictions on automated political targeting, or rules governing paid persuasion that account for AI amplification.
- Platform-level interventions (rate-limiting throughput, mandatory labeling, audit trails) could mitigate harms but will also affect economic incentives and the business models of persuasion services.
Research and measurement needs for economic forecasting
- Macroeconomic and political-economy models should incorporate AI-driven changes in persuasion efficiency as a driver of campaign costs, lobbying effectiveness, fundraising returns, and possibly voter behavior dynamics.
- Empirical monitoring (adoption rates, conversion effectiveness, cost per persuaded individual/dollar raised) will be critical to forecast market equilibrium changes and distributional effects across actors.

Summary conclusion: The study provides robust experimental evidence that state-of-the-art conversational AIs can outperform even elite human persuaders primarily because they deliver more information, faster. For economists, this implies shifting sources of comparative advantage, new strategic investments in AI access, altered labor demand in persuasion industries, and substantial public-policy externalities requiring attention.

Assessment

Paper Type: rct
Evidence Strength: high. Large preregistered sample (18,978 conversations, 6,923 participants), randomized treatment assignment across multiple independent experiments, replication across several human-comparator groups (including incentivized experts and professional canvassers), and a consequential behavioral field outcome (actual donations) provide strong causal evidence that the tested AI systems out-persuade humans.
Methods Rigor: high. Pre-registration, large sample sizes, use of multiple comparator groups (from laypeople to elite debaters), incentivization of human experts, mechanistic tests (coaching, speed/length constraints), and a field donation outcome indicate careful experimental design and robustness checks; remaining methodological caveats (model/time specificity, potential disclosure effects) do not undermine internal validity.
Sample: Four preregistered experiments comprising 18,978 conversational interactions from 6,923 distinct participants; human persuader types included online lay volunteers, winners of a preregistered online persuasion tournament, professional canvassers employed by a UK fundraising firm, and world-champion debaters; experiments were conducted online and included incentivized human experts (cash bonuses) and a field fundraising test with real monetary donations to Save the Children; AI condition used contemporary 'frontier' conversational models (unspecified here).
Themes: human_ai_collab, governance
Identification: Preregistered randomized experiments that assigned target recipients to be engaged by either a conversational AI system or one of several human persuaders (laypeople, tournament-winning persuaders, professional canvassers, world-champion debaters), with a field outcome (real-money donations) measured in a follow-up study; additional randomized/coached conditions and constrained-AI treatments isolate mechanism (message length/speed).
Generalizability:
- Results depend on the specific AI models tested and their capabilities at the time; performance may change as models evolve.
- Topics/issues used in experiments may not represent all political or commercial persuasion contexts.
- Most interactions appear to be in online settings and (implicitly) English-speaking populations, limiting cross-cultural and offline generalizability.
- Disclosure or knowledge that one is interacting with an AI (vs human) could alter effects in other settings if labeling norms/regulations differ.
- Professional canvassing dynamics (in-person, local regulations, institutional trust) may differ across countries and organizations.

Claims (9)

Claim	Category	Direction	Outcome	Confidence & Evidence	Details
Frontier AI systems were reliably more persuasive than expert humans across a series of four preregistered experiments.	Decision Quality	positive	persuasiveness (success in conversational persuasion)	Reading fidelity: high; Study strength: high	n=68923 1.0
Across the experiments there were n = 18,978 conversations from 6,923 people.	Other	null_result	sample size (study scale)	Reading fidelity: high; Study strength: high	n=68923 1.0
AI systems out-persuaded expert humans even when expert humans chose their issues, researched in advance, underwent hours of live, structured practice, and were incentivized with £1,000 cash bonuses.	Decision Quality	positive	persuasiveness under expert preparation and incentive conditions	Reading fidelity: high; Study strength: high	1.0
AI's persuasive advantage persisted after experts received a coaching tool that let them practice against the AI, review their performance history, and see what AI would have said at key moments.	Decision Quality	positive	persuasiveness after expert coaching intervention	Reading fidelity: high; Study strength: medium	0.6
Converging evidence indicates AI's advantage stemmed from rapidly deploying larger quantities of information.	Output Quality	positive	quantity of information deployed and its relation to persuasion success	Reading fidelity: high; Study strength: medium	0.6
After coaching, expert humans could tie an AI that was constrained to respond at human speeds and with human-length messages.	Decision Quality	null_result	persuasion parity when AI constrained to human-like speed/length after human coaching	Reading fidelity: high; Study strength: medium	0.6
In a real-world fundraising study, AI was nearly 3x more effective than professional canvassers from a UK fundraising firm at raising real-money donations to Save the Children.	Firm Revenue	positive	amount (or rate) of real-money donations raised	Reading fidelity: high; Study strength: high	nearly 3x more effective 1.0
The human comparators in the experiments included laypeople, winners of a separately preregistered four-round online persuasion tournament, professional canvassers, and world championship debaters.	Other	null_result	types of human comparator groups used in experiments	Reading fidelity: high; Study strength: high	1.0
All experiments reported were preregistered.	Other	null_result	study preregistration (methodological claim)	Reading fidelity: high; Study strength: high	1.0