Instant AI help backfires: brief interactions with assistants boost short-term accuracy but make people more likely to give up and perform worse without the AI. The effect appears rapidly across reasoning and comprehension tasks, suggesting current assistants may undermine the learning persistence needed for long-term skill acquisition.

AI Assistance Reduces Persistence and Hurts Independent Performance

Grace Liu, Brian Christian, Tsvetomira Dumbalska, Michiel A. Bakker, Rachit Dubey · April 06, 2026

arXiv · RCT · medium evidence · 7/10 relevance

Randomized trials show that brief AI assistance improves immediate task performance but reduces people’s persistence and impairs subsequent unassisted performance across tasks, with effects appearing after only about 10 minutes of interaction.

People often optimize for long-term goals in collaboration: A mentor or companion doesn't just answer questions, but also scaffolds learning, tracks progress, and prioritizes the other person's growth over immediate results. In contrast, current AI systems are fundamentally short-sighted collaborators - optimized for providing instant and complete responses, without ever saying no (unless for safety reasons). What are the consequences of this dynamic? Here, through a series of randomized controlled trials on human-AI interactions (N = 1,222), we provide causal evidence for two key consequences of AI assistance: reduced persistence and impairment of unassisted performance. Across a variety of tasks, including mathematical reasoning and reading comprehension, we find that although AI assistance improves performance in the short-term, people perform significantly worse without AI and are more likely to give up. Notably, these effects emerge after only brief interactions with AI (approximately 10 minutes). These findings are particularly concerning because persistence is foundational to skill acquisition and is one of the strongest predictors of long-term learning. We posit that persistence is reduced because AI conditions people to expect immediate answers, thereby denying them the experience of working through challenges on their own. These results suggest the need for AI model development to prioritize scaffolding long-term competence alongside immediate task completion.

Summary

Main Finding

Short, targeted exposure to an AI assistant (∼10–15 minutes) improves immediate task performance but causally reduces subsequent unassisted performance and persistence. Across randomized controlled trials (N ≈ 1,222 total), participants who used an AI that provided direct solutions solved fewer problems and were more likely to give up when the AI was removed. Effects are strongest for participants who used the AI to obtain direct answers rather than hints.

Key Points

Causal design: Three randomized controlled experiments comparing an AI-assisted condition to an unassisted control.
Tasks: Fraction arithmetic (Experiments 1 & 2) and reading comprehension (Experiment 3).
Short-term vs downstream outcome:
- During assisted trials, AI users performed better.
- When assistance was removed, AI users had lower solve rates and (in several tests) higher skip rates than controls.
Persistence measured by "skip" choices: AI users—especially those who requested direct answers—were more likely to skip problems when unassisted.
Usage heterogeneity: In Experiment 2, 61% reported using the AI to obtain direct answers, 27% to get hints/clarifications, 12% did not use the AI. The reduced persistence and unassisted performance were concentrated in the direct-answer subgroup.
Effect sizes and significance (examples reported):
- Experiment 1 (final test problems): solve rate AI = 0.57 vs control = 0.73; t(305) = −3.64, p < 0.001; Cohen’s d = −0.42. Skip rate AI = 0.20 vs control = 0.11; t(305) = 2.16, p = 0.031; d = 0.25.
- Experiment 2 (replication): solve rate AI = 0.71 vs control = 0.77; t(583) = −2.33, p = 0.020; d = −0.19. The aggregate skip-rate difference was not significant; elevated skipping was concentrated among direct-answer users.
Mechanism hypothesis: AI conditions users to expect immediate solutions and reduces opportunities to practice overcoming difficulty, diminishing the disposition to persist—an ability tightly linked to long-term learning.
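
The effect sizes quoted above can be sanity-checked against their t statistics. The sketch below is a rough consistency check, not the authors' analysis: it assumes the two arms were equal in size (the per-arm Ns are not given in this summary) and takes the total sample as df + 2, in which case d = t · sqrt(4 / N). The function name `cohens_d_from_t` is illustrative.

```python
import math

def cohens_d_from_t(t: float, df: int) -> float:
    """Approximate Cohen's d from an independent-samples t statistic.

    Assumption: two equal-sized groups, so the total N is df + 2 and
    d = t * sqrt(1/n1 + 1/n2) = t * sqrt(4 / N).
    """
    n_total = df + 2
    return t * math.sqrt(4 / n_total)

# (label, t, df, reported d), copied from the figures quoted in this summary.
reported = [
    ("Experiment 1 solve rate", -3.64, 305, -0.42),
    ("Experiment 1 skip rate", 2.16, 305, 0.25),
    ("Experiment 2 solve rate", -2.33, 583, -0.19),
]

for label, t, df, d_reported in reported:
    d_est = cohens_d_from_t(t, df)
    print(f"{label}: estimated d = {d_est:.3f}, reported d = {d_reported:.2f}")
```

Under the equal-groups assumption, each estimate falls within 0.01 of the reported value, so the reported t statistics and effect sizes are mutually consistent.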

Data & Methods

Overall sample: N = 1,222 participants enrolled across three experiments (Experiment 1: initial N = 354, final N = 307 after exclusions; Experiment 2: initial N = 667, final N = 585; Experiment 3: initial N = 201). The 1,222 total is the sum of the initial Ns (354 + 667 + 201).
Recruitment: US-based participants via Prolific; brief online tasks (≈13–15 minutes); monetary compensation.
Experimental structure (common elements):
- Random assignment to AI or control.
- Pretest (some experiments) → learning/assisted phase (AI available only in AI condition) → unexpected removal of AI → unassisted test phase (identical problems across conditions).
- AI condition: sidebar AI (GPT-5 in these studies) pre-prompted with each problem and solution; participants could request answers, hints, or clarifications.
- Control condition: same problems with no AI. In Experiment 2, a neutral sidebar with previously seen worked solutions was included to equalize interface changes.
- Outcome measures: solve rate (proportion of correct responses), skip rate (proportion of problems the participant chose to skip), and change from pretest to test. Skipping is interpreted as a measure of persistence/motivation because wrong answers carried no penalty.
Exclusions: failed attention checks; inability to solve basic pretest items (to control for baseline ability). Analyses use t-tests and ANOVA and report Cohen’s d effect sizes and confidence intervals.
Robustness: Experiment 2 addresses confounds from Experiment 1 (pretest-based exclusions and interface parity). Experiment 3 tests cross-domain generality (reading comprehension).
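
As an illustration of the outcome measures described above, the following sketch computes per-participant solve and skip rates from trial-level responses and compares the two arms. Everything here is an assumption made for illustration: the response coding, the helper names, the toy data, and the choice of a pooled-variance (Student) t statistic, since the summary does not say which t-test variant was used.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

def participant_rates(trials):
    """Return {participant: (condition, solve_rate, skip_rate)}.

    `trials` is an iterable of (participant, condition, response) tuples,
    where response is "correct", "wrong", or "skip" (hypothetical coding).
    """
    responses = defaultdict(list)
    condition = {}
    for pid, cond, resp in trials:
        responses[pid].append(resp)
        condition[pid] = cond
    return {
        pid: (condition[pid],
              r.count("correct") / len(r),
              r.count("skip") / len(r))
        for pid, r in responses.items()
    }

def pooled_t_and_d(a, b):
    """Pooled-variance t statistic, degrees of freedom, and Cohen's d (a vs b)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    d = (mean(a) - mean(b)) / math.sqrt(pooled_var)
    t = d / math.sqrt(1 / na + 1 / nb)
    return t, na + nb - 2, d

def expand(pid, cond, codes):
    """Turn a string like "CCWS" into trial tuples (C=correct, W=wrong, S=skip)."""
    names = {"C": "correct", "W": "wrong", "S": "skip"}
    return [(pid, cond, names[c]) for c in codes]

# Made-up toy data, for illustration only (not the study's data).
trials = (
    expand("p1", "ai", "CWSS") + expand("p2", "ai", "CCWS") + expand("p3", "ai", "CCCW")
    + expand("p4", "control", "CCCW") + expand("p5", "control", "CCCC")
    + expand("p6", "control", "CCWS")
)

rates = participant_rates(trials)
ai_solve = [solve for cond, solve, _ in rates.values() if cond == "ai"]
control_solve = [solve for cond, solve, _ in rates.values() if cond == "control"]

t_stat, df, d = pooled_t_and_d(ai_solve, control_solve)
print(f"t({df}) = {t_stat:.2f}, d = {d:.2f}")
```

A negative t and d here mean the AI arm solved a smaller share of unassisted problems than the control arm, which is the sign convention of the results quoted in this summary. The same function can be applied to the skip rates.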

Implications for AI Economics

Human capital and deskilling risk:
- Short-term productivity gains from AI assistance may come at the cost of reduced future human capability—an erosion of persistence and independent problem-solving that undermines human capital accumulation.
- Standard productivity accounting that focuses on immediate output could overstate long-run gains if it ignores downstream declines in worker skill and motivation.
Substitution vs complementarity:
- Effects differ by use-style: when AI is used for direct answers, it behaves more like a substitute that can degrade worker capability; when used as hints/scaffolds, it may be more of a complement. Incentive structures matter a great deal.
Firm-level decisions and market externalities:
- Firms optimizing for short-run metrics may favor assistant designs that maximize immediate throughput (direct answers), thereby creating negative externalities on broader workforce skill formation and future productivity.
- There may be a collective action problem: individual firms reap immediate benefits while societal human capital suffers, suggesting a potential role for regulation, standards, or industry best practices.
Product design and platform incentives:
- To align AI with long-run human capital creation, assistant designs should prioritize scaffolding: hints, graded feedback, delayed disclosure of full solutions, or refusal when appropriate.
- Platform/business models should consider features that encourage learning (e.g., hint-first defaults, progressive disclosure, practice modes) rather than immediate full solutions.
Policy and measurement implications:
- Policymakers and economists should incorporate downstream effects of AI assistance into evaluations of labor market impacts, training returns, and education policy.
- Longitudinal data is needed to estimate cumulative effects on skills, wages, and occupational mobility; short-run RCTs show causal mechanisms but cannot yet quantify long-term macro impacts.
Research & monitoring priorities:
- Longitudinal studies to measure whether losses in persistence and skill accumulate over time.
- Field experiments in workplaces and schools to test external validity.
- Research on interface/prompting interventions that preserve or enhance persistence (e.g., scaffolded help, forced effort before answer, metacognitive prompts).
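
The design ideas named above (hint-first defaults, progressive disclosure, forced effort before an answer) could be prototyped as a simple help policy. The sketch below is hypothetical and is not taken from the paper: the class name `HelpPolicy`, the thresholds, and the response labels are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class HelpPolicy:
    """Hypothetical hint-first policy: escalate help only after learner effort."""
    min_attempts_before_answer: int = 2  # assumed threshold, not from the paper
    max_hints: int = 2                   # assumed cap on hints before a full solution

    def respond(self, attempts_made: int, hints_given: int) -> str:
        """Decide what kind of help to give for the current problem."""
        if attempts_made == 0:
            # Forced effort: ask for an attempt before giving any help.
            return "prompt_attempt"
        if hints_given < self.max_hints:
            # Progressive disclosure: hints come before the full solution.
            return "hint"
        if attempts_made >= self.min_attempts_before_answer:
            return "full_solution"
        # Hints are exhausted but effort is still below the threshold.
        return "prompt_attempt"

policy = HelpPolicy()
for attempts, hints in [(0, 0), (1, 0), (1, 1), (1, 2), (2, 2)]:
    print(attempts, hints, policy.respond(attempts, hints))
```

Whether a policy like this actually preserves persistence is exactly the open question the research priorities raise; the sketch only shows how such an intervention might be specified for testing.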

Limitations (relevant for interpretation):
- Short-duration lab-style online tasks: uncertain generalizability to long-term, repeated real-world use.
- Single AI model (GPT-5) and specific UI; effects may vary with different assistant behaviors or design choices.
- Self-reported usage categories subject to reporting error; usage was not fully instrumented in all cases.
- Need for longitudinal evidence to determine whether brief effects compound, attenuate, or reverse over time.

Bottom line: These RCTs provide causal evidence that current answer-providing AI assistants can impair downstream independent performance and reduce persistence after only brief exposure—an economically meaningful risk to human capital formation that should shape product design, firm incentives, and policy.

Assessment

Paper Type: RCT
Evidence Strength: medium. Random assignment and a large pooled sample (N=1,222) give credible causal identification for short-term effects across multiple tasks, but external validity is limited by brief exposure (~10 minutes), lab-style tasks (math reasoning and reading comprehension), unspecified participant recruitment/demographics, and no long-term follow-up to show persistence of effects.
Methods Rigor: high. Use of randomized trials, multiple tasks, and clear pre/post measurement of unassisted performance and persistence indicates rigorous design; however, missing details (e.g., recruitment source, blinding, robustness checks, range of AI models, and longer-term follow-up) constrain confidence in broader claims.
Sample: Pooled sample of N = 1,222 human participants enrolled in a series of randomized controlled trials involving short (~10 minute) interactions with AI across a variety of tasks including mathematical reasoning and reading comprehension; further sample recruitment and demographic details are not specified in the summary.
Themes: human_ai_collab, skills_training, productivity
Identification: Randomized controlled trials. Participants were randomly assigned to receive AI assistance or a control condition, and causal effects on persistence and subsequent unassisted performance were estimated by comparing outcomes across arms.
Generalizability:
- Short, one-off exposure (~10 minutes): unclear if effects persist or accumulate over repeated interactions.
- Laboratory-style tasks (math and reading): may not generalize to complex, real-world work tasks.
- Participant recruitment and demographics unspecified: limits applicability across populations and labor contexts.
- Single or limited AI model/interaction designs: results may depend on specific assistant behavior.
- Measured outcomes are persistence and immediate unassisted performance, not long-term skill acquisition, wages, or firm-level productivity.

Claims (8)

Claim	Category	Direction	Confidence	Outcome	Details
Through a series of randomized controlled trials on human-AI interactions (N = 1,222), we provide causal evidence that AI assistance reduces persistence.	Skill Acquisition	negative	high	persistence (willingness to continue working on tasks without AI)	n=1222 1.0
AI assistance impairs unassisted performance: although AI improves short-term performance, people perform significantly worse without AI after interacting with it.	Output Quality	negative	high	unassisted task performance (accuracy/quality when working without AI after prior AI assistance)	n=1222 1.0
AI assistance improves short-term performance on tasks (people do better while using the AI).	Output Quality	positive	high	short-term task performance (immediate accuracy/quality while assisted by AI)	n=1222 1.0
People are more likely to give up after interacting with AI (increased likelihood of quitting tasks unassisted).	Task Completion Time	negative	high	likelihood of giving up / task abandonment	n=1222 1.0
These negative effects (reduced persistence and impaired unassisted performance) emerge after only brief interactions with AI (approximately 10 minutes).	Skill Acquisition	negative	high	onset/time to observable effect (persistence and unassisted performance after ~10 minutes of AI use)	n=1222 0.6
These effects are observed across a variety of tasks, including mathematical reasoning and reading comprehension.	Output Quality	mixed	high	task-specific performance and persistence across task types (math reasoning, reading comprehension, etc.)	n=1222 0.6
We posit that persistence is reduced because AI conditions people to expect immediate answers, denying them the experience of working through challenges on their own.	Skill Acquisition	negative	high	mechanistic explanation for reduced persistence (expectation of immediate answers)	n=1222 0.1
These results suggest the need for AI model development to prioritize scaffolding long-term competence alongside immediate task completion.	Governance And Regulation	mixed	high	recommendation for AI development priorities (design objective, not an empirical outcome)	n=1222 0.1