Instant AI help backfires: brief interactions with assistants boost short-term accuracy but make people more likely to give up and perform worse without the AI. The effect appears rapidly across reasoning and comprehension tasks, suggesting current assistants may undermine the learning persistence needed for long-term skill acquisition.
People often optimize for long-term goals in collaboration: A mentor or companion doesn't just answer questions, but also scaffolds learning, tracks progress, and prioritizes the other person's growth over immediate results. In contrast, current AI systems are fundamentally short-sighted collaborators - optimized for providing instant and complete responses, without ever saying no (unless for safety reasons). What are the consequences of this dynamic? Here, through a series of randomized controlled trials on human-AI interactions (N = 1,222), we provide causal evidence for two key consequences of AI assistance: reduced persistence and impairment of unassisted performance. Across a variety of tasks, including mathematical reasoning and reading comprehension, we find that although AI assistance improves performance in the short-term, people perform significantly worse without AI and are more likely to give up. Notably, these effects emerge after only brief interactions with AI (approximately 10 minutes). These findings are particularly concerning because persistence is foundational to skill acquisition and is one of the strongest predictors of long-term learning. We posit that persistence is reduced because AI conditions people to expect immediate answers, thereby denying them the experience of working through challenges on their own. These results suggest the need for AI model development to prioritize scaffolding long-term competence alongside immediate task completion.
Summary
Main Finding
Short, targeted exposure to an AI assistant (∼10–15 minutes) improves immediate task performance but causally reduces subsequent unassisted performance and persistence. Across three randomized controlled trials (N = 1,222 in total before exclusions), participants who used an AI that provided direct solutions solved fewer problems and were more likely to give up once the AI was removed. Effects are strongest for participants who used the AI to obtain direct answers rather than hints.
Key Points
- Causal design: Three randomized controlled experiments comparing an AI-assisted condition to an unassisted control.
- Tasks: Fraction arithmetic (Experiments 1 & 2) and reading comprehension (Experiment 3).
- Short-term vs downstream outcome:
  - During assisted trials, AI users performed better.
  - When assistance was removed, AI users had lower solve rates and (in several tests) higher skip rates than controls.
- Persistence measured by "skip" choices: AI users—especially those who requested direct answers—were more likely to skip problems when unassisted.
- Usage heterogeneity: In Experiment 2, 61% reported using the AI to obtain direct answers, 27% to get hints/clarifications, 12% did not use the AI. The reduced persistence and unassisted performance were concentrated in the direct-answer subgroup.
- Effect sizes and significance (examples reported):
  - Experiment 1 (final test problems): solve rate AI = 0.57 vs control = 0.73; t(305) = −3.64, p < 0.001; Cohen’s d = −0.42. Skip rate AI = 0.20 vs control = 0.11; t(305) = 2.16, p = 0.031; d = 0.25.
  - Experiment 2 (replication): solve rate AI = 0.71 vs control = 0.77; t(583) = −2.33, p = 0.020; d = −0.19. Aggregate skip-rate difference not significant, but concentrated in the direct-answer users.
- Mechanism hypothesis: AI conditions users to expect immediate solutions and reduces opportunities to practice overcoming difficulty, diminishing the disposition to persist—an ability tightly linked to long-term learning.
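The reported effect sizes can be sanity-checked against the reported t statistics. The sketch below uses the standard conversion d = t · sqrt(1/n1 + 1/n2) for an independent-samples t-test with pooled variance. The per-condition group sizes are not given in this summary, so the helper `split_evenly` is an assumption: it treats each experiment as split near-evenly between the AI and control arms, with total N = df + 2.

```python
import math
from typing import Tuple


def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Convert an independent-samples t statistic to Cohen's d (pooled SD)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def split_evenly(df: int) -> Tuple[int, int]:
    """ASSUMPTION: total N = df + 2, divided near-evenly across two arms."""
    n = df + 2
    return n // 2, n - n // 2


# (label, reported t, reported degrees of freedom) taken from the summary.
reported = [
    ("Experiment 1 solve rate", -3.64, 305),
    ("Experiment 1 skip rate", 2.16, 305),
    ("Experiment 2 solve rate", -2.33, 583),
]

for label, t, df in reported:
    n1, n2 = split_evenly(df)
    d = cohens_d_from_t(t, n1, n2)
    print(f"{label}: d = {d:.2f} (assumed n1={n1}, n2={n2})")
```

Under the even-split assumption, the recomputed values round to the reported d = −0.42, 0.25, and −0.19, which suggests the reported statistics are internally consistent. Unequal arms would shift the values slightly.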
Data & Methods
- Overall sample: N = 1,222 participants recruited across three experiments (Experiment 1: initial N = 354, final N = 307 after exclusions; Experiment 2: initial N = 667, final N = 585; Experiment 3: initial N = 201).
- Recruitment: US-based participants via Prolific; brief online tasks (≈13–15 minutes); monetary compensation.
- Experimental structure (common elements):
  - Random assignment to AI or control.
  - Pretest (some experiments) → learning/assisted phase (AI available only in AI condition) → unexpected removal of AI → unassisted test phase (identical problems across conditions).
  - AI condition: sidebar AI (GPT-5 in these studies) pre-prompted with each problem and solution; participants could request answers, hints, or clarifications.
  - Control condition: same problems with no AI. In Experiment 2, a neutral sidebar with previously seen worked solutions was included to equalize interface changes.
- Outcome measures: solve rate (proportion of correct responses), skip rate (proportion of problems the participant chose to skip), and change from pretest to test. Skipping is interpreted as a measure of persistence/motivation because there was no penalty for wrong answers.
- Exclusions: failed attention checks; inability to solve basic pretest items (to control for baseline ability). Analyses include t-tests, ANOVA, and reported Cohen’s d effect sizes and confidence intervals.
- Robustness: Experiment 2 addresses confounds from Experiment 1 (pretest-based exclusions and interface parity). Experiment 3 tests cross-domain generality (reading comprehension).
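The outcome measures and the between-condition comparison described above can be sketched as follows. The response data here are hypothetical and exist only to make the example runnable; they are not the study's data. The pooled-variance (Student's) t-test is an assumption, chosen because the reported degrees of freedom equal N − 2, which is consistent with that test.

```python
import math
from statistics import mean, variance

# HYPOTHETICAL test-phase outcomes: one inner list per participant,
# one entry per unassisted problem ("correct", "wrong", or "skip").
responses = {
    "ai": [
        ["correct", "skip", "wrong", "correct"],
        ["skip", "skip", "correct", "wrong"],
        ["correct", "correct", "wrong", "skip"],
    ],
    "control": [
        ["correct", "correct", "correct", "wrong"],
        ["correct", "correct", "correct", "correct"],
        ["correct", "correct", "skip", "correct"],
    ],
}


def rate(outcomes, label):
    """Fraction of a participant's problems with the given outcome."""
    return sum(o == label for o in outcomes) / len(outcomes)


def pooled_t_and_d(a, b):
    """Student's t (pooled variance) and Cohen's d for two groups."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    d = (mean(a) - mean(b)) / math.sqrt(pooled_var)
    t = d / math.sqrt(1.0 / n1 + 1.0 / n2)
    return t, d


for label in ("correct", "skip"):
    ai = [rate(p, label) for p in responses["ai"]]
    control = [rate(p, label) for p in responses["control"]]
    t, d = pooled_t_and_d(ai, control)
    df = len(ai) + len(control) - 2
    print(f"{label} rate: AI={mean(ai):.2f}, control={mean(control):.2f}, "
          f"t({df})={t:.2f}, d={d:.2f}")
```

A negative d on the "correct" rate and a positive d on the "skip" rate correspond to the direction of the effects reported for the AI condition.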
Implications for AI Economics
- Human capital and deskilling risk:
  - Short-term productivity gains from AI assistance may come at the cost of reduced future human capability: an erosion of persistence and independent problem-solving that undermines human capital accumulation.
  - Standard productivity accounting that focuses on immediate output could overstate long-run gains if it ignores downstream declines in worker skill and motivation.
- Substitution vs complementarity:
  - Effects differ by use-style: when AI is used for direct answers, it behaves more like a substitute that can degrade worker capability; when used for hints/scaffolds, it may act more as a complement. Incentive structures therefore matter.
- Firm-level decisions and market externalities:
  - Firms optimizing for short-run metrics may favor assistant designs that maximize immediate throughput (direct answers), thereby creating negative externalities on broader workforce skill formation and future productivity.
  - There may be a collective action problem: individual firms reap immediate benefits while societal human capital suffers, suggesting a potential role for regulation, standards, or industry best practices.
- Product design and platform incentives:
  - To align AI with long-run human capital creation, design incentives should prioritize scaffolding, delayed/full disclosure strategies, hints, graded feedback, or refusal when appropriate.
  - Platform/business models should consider features that encourage learning (e.g., hint-first defaults, progressive disclosure, practice modes) rather than immediate full solutions.
- Policy and measurement implications:
  - Policymakers and economists should incorporate downstream effects of AI assistance into evaluations of labor market impacts, training returns, and education policy.
  - Longitudinal data are needed to estimate cumulative effects on skills, wages, and occupational mobility; short-run RCTs show causal mechanisms but cannot yet quantify long-term macro impacts.
- Research & monitoring priorities:
  - Long-run, longitudinal studies to measure whether reduced persistence and skill erosion accumulate over time.
  - Field experiments in workplaces and schools to test external validity.
  - Research on interface/prompting interventions that preserve or enhance persistence (e.g., scaffolded help, forced effort before answer, metacognitive prompts).
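As an illustration of what "hint-first defaults", "progressive disclosure", and "forced effort before answer" could look like in an assistant, here is a minimal sketch. It is not the interface used in the experiments and was not evaluated there. The class `HelpPolicy`, its fields, and the `min_attempts` threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HelpPolicy:
    """HYPOTHETICAL hint-first policy: hints come before any full solution."""
    hints: List[str]
    solution: str
    min_attempts: int = 2  # assumed threshold, not taken from the study
    attempts: int = 0
    hints_given: int = 0

    def record_attempt(self) -> None:
        """Call whenever the learner submits their own answer."""
        self.attempts += 1

    def request_help(self) -> str:
        # Progressive disclosure: release hints one at a time.
        if self.hints_given < len(self.hints):
            hint = self.hints[self.hints_given]
            self.hints_given += 1
            return f"Hint: {hint}"
        # Forced effort: withhold the solution until enough attempts are made.
        if self.attempts < self.min_attempts:
            return "Try the problem once more before seeing the full solution."
        return f"Solution: {self.solution}"


policy = HelpPolicy(
    hints=["Find a common denominator.", "Rewrite 1/2 as 3/6 and 1/3 as 2/6."],
    solution="1/2 + 1/3 = 5/6",
)
print(policy.request_help())  # first hint
print(policy.request_help())  # second hint
print(policy.request_help())  # solution withheld: no attempts recorded yet
policy.record_attempt()
policy.record_attempt()
print(policy.request_help())  # full solution after two attempts
```

Whether a policy of this kind actually preserves persistence is an open empirical question of the sort listed under the research priorities above.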
Limitations (relevant for interpretation)
- Short-duration lab-style online tasks; uncertain generalizability to long-term, repeated real-world use.
- Single AI model (GPT-5) and specific UI; effects may vary with different assistant behaviors or design choices.
- Self-reported usage categories subject to reporting error; usage was not fully instrumented in all cases.
- Need for longitudinal evidence to determine whether brief effects compound, attenuate, or reverse over time.
Bottom line: These RCTs provide causal evidence that current answer-providing AI assistants can impair downstream independent performance and reduce persistence after only brief exposure—an economically meaningful risk to human capital formation that should shape product design, firm incentives, and policy.
Assessment
Claims (8)
| Claim | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|
| Through a series of randomized controlled trials on human-AI interactions (N = 1,222), we provide causal evidence that AI assistance reduces persistence. [Skill Acquisition] | negative | high | persistence (willingness to continue working on tasks without AI) | n=1222; 1.0 |
| AI assistance impairs unassisted performance: although AI improves short-term performance, people perform significantly worse without AI after interacting with it. [Output Quality] | negative | high | unassisted task performance (accuracy/quality when working without AI after prior AI assistance) | n=1222; 1.0 |
| AI assistance improves short-term performance on tasks (people do better while using the AI). [Output Quality] | positive | high | short-term task performance (immediate accuracy/quality while assisted by AI) | n=1222; 1.0 |
| People are more likely to give up after interacting with AI (increased likelihood of quitting tasks unassisted). [Task Completion Time] | negative | high | likelihood of giving up / task abandonment | n=1222; 1.0 |
| These negative effects (reduced persistence and impaired unassisted performance) emerge after only brief interactions with AI (approximately 10 minutes). [Skill Acquisition] | negative | high | onset/time to observable effect (persistence and unassisted performance after ~10 minutes of AI use) | n=1222; 0.6 |
| These effects are observed across a variety of tasks, including mathematical reasoning and reading comprehension. [Output Quality] | mixed | high | task-specific performance and persistence across task types (math reasoning, reading comprehension, etc.) | n=1222; 0.6 |
| We posit that persistence is reduced because AI conditions people to expect immediate answers, denying them the experience of working through challenges on their own. [Skill Acquisition] | negative | high | mechanistic explanation for reduced persistence (expectation of immediate answers) | n=1222; 0.1 |
| These results suggest the need for AI model development to prioritize scaffolding long-term competence alongside immediate task completion. [Governance And Regulation] | mixed | high | recommendation for AI development priorities (design objective, not an empirical outcome) | n=1222; 0.1 |