Users routinely lean on AI for trivial tasks that it does not meaningfully speed up, while underreporting how often they use it and overestimating time savings; prior exposure further entrenches reliance, risking an inefficient overreliance feedback loop.

The efficiency-gain illusion: People underestimate the rate of AI use and overestimate its benefits on simple tasks

Sunny Yu, Myra Cheng, Ahmad Jabbar, Ilia Sucholutsky, Katherine M. Collins, Dan Jurafsky, Robert D. Hawkins · May 21, 2026

arXiv · quasi-experimental · medium evidence · 7/10 relevance

Across three pre-registered studies, people often choose AI for trivial tasks even when it confers no meaningful time or effort savings, they underreport their AI use and overestimate the efficiency gains, and prior AI use increases subsequent adoption, creating a reinforcing miscalibration loop.

People are increasingly turning to AI assistance for simple tasks, e.g., arithmetic, spell-check, and answering simple questions. But does AI assistance actually save users time and effort? We investigate people's propensity to use AI for cognitively simple tasks and assess whether their reliance is well-calibrated. Across three pre-registered user studies (N = 2691), we find that people frequently choose to use AI even when doing so is inefficient (i.e. provides no meaningful time or effort savings). We identify systematic miscalibration at two levels: (1) a self-estimate miscalibration where people on average believe that they are using AI less than they actually are, and (2) efficiency-gain illusions where people overestimate how much time and effort savings AI use affords. We also identify a session-level carryover effect where a participant's prior AI use leads to further AI adoption and entrenches their miscalibration about time savings. Our results shed light on the mechanisms and biases underlying people's choice of whether to use AI as well as the risk of an overreliance feedback loop.

Summary

Main Finding

Across three pre-registered studies (N = 2,691), people use LLM-based AI on simple tasks substantially more than they predict, and they systematically overestimate how much time and mental effort AI will save. Short-term exposure to using AI increases subsequent AI adoption and worsens calibration about AI’s efficiency — creating a likely feedback loop toward overreliance despite little or no real efficiency gain for trivial tasks.

Key Points

Scope: 24 simple tasks (under 5 minutes each) spanning four categories — information seeking, information processing/synthesis, procedural guidance/execution, and content creation/transformation — with easy and difficult variants.
Usage gap (Study 1):
- Predicted population-level AI use = 33%; actual behavioral use = 47% (β = 1.07, p < 0.001).
- Gap larger on easy tasks: predicted 20% vs actual 38% (β = 1.42, p < 0.001).
- No overall time or effort savings from AI; participants who completed tasks independently sometimes reported lower effort.
Efficiency-gain illusions (Study 2):
- Participants predicted AI would save ~55.7 seconds on average but actual mean saving was ~7.5 seconds (β = −48.33, p < 0.001).
- Predicted independent completion time ≈ actual; the main miscalibration is underestimating the time required when using AI (predicted AI time 43.3s vs actual AI time 86.2s; β = 42.9, p < 0.001).
- For subjective effort (NASA-TLX): people overestimated effort for independent completion (predicted 2.66 vs actual 2.36), while predictions for AI-assisted effort were accurate (predicted 1.76 vs actual 1.74).
- On easy tasks AI actually slowed people by ~10 seconds (prompting/read-processing friction).
- Prompting costs dominate: prompt construction averaged ~48.7s vs response processing ~37.6s; 41% of prompts were copy-pasted from instructions.
Carryover / feedback (Study 3):
- Exposure to AI increased subsequent AI adoption: 44.5% of participants who completed the exposure tasks with AI later chose AI, vs 27.7% of those who completed them independently (β = 0.54, p < 0.001).
- Prior AI use increased agreement with the (illusory) belief that tasks are faster with AI.
Robustness: effects hold across task categories and difficulty levels and after controlling for Need For Cognition and other traits.
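The Study 2 figures above can be cross-checked with simple arithmetic. The sketch below is only a back-of-the-envelope check on the reported means, not the authors' analysis: the function name and dictionary keys are illustrative, and the implied independent-completion times rest on the assumption that the reported saving equals mean independent time minus mean AI-assisted time. Regression estimates such as β = −48.33 need not equal a raw difference of means.

```python
# Reported Study 2 means in seconds, copied from the key points above.
PREDICTED_SAVING_S = 55.7
ACTUAL_SAVING_S = 7.5
PREDICTED_AI_TIME_S = 43.3
ACTUAL_AI_TIME_S = 86.2
PROMPT_CONSTRUCTION_S = 48.7
RESPONSE_PROCESSING_S = 37.6


def study2_arithmetic() -> dict[str, float]:
    """Derive simple gaps and shares from the reported means (illustrative only)."""
    decomposed_ai_time = PROMPT_CONSTRUCTION_S + RESPONSE_PROCESSING_S
    return {
        # How far predicted savings exceed realized savings.
        "saving_overestimate_s": PREDICTED_SAVING_S - ACTUAL_SAVING_S,
        "saving_overestimate_ratio": PREDICTED_SAVING_S / ACTUAL_SAVING_S,
        # How much longer AI-assisted completion took than predicted.
        "ai_time_underestimate_s": ACTUAL_AI_TIME_S - PREDICTED_AI_TIME_S,
        # Sum of the two reported components of AI-assisted time.
        "decomposed_ai_time_s": decomposed_ai_time,
        "prompt_share_of_ai_time": PROMPT_CONSTRUCTION_S / decomposed_ai_time,
        # ASSUMPTION: saving = independent time - AI time, applied to means.
        "implied_predicted_independent_s": PREDICTED_AI_TIME_S + PREDICTED_SAVING_S,
        "implied_actual_independent_s": ACTUAL_AI_TIME_S + ACTUAL_SAVING_S,
    }


if __name__ == "__main__":
    for name, value in study2_arithmetic().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, predicted savings exceed realized savings by about 48.2 seconds (roughly 7.4 times the realized saving), and the 42.9-second gap in AI-assisted time matches the reported β. The two reported components sum to 86.3 seconds, close to the 86.2-second mean, with prompt construction accounting for about 56% of it. The implied independent times (99.0 seconds predicted vs 93.7 seconds actual) are consistent with the miscalibration lying mostly on the AI side.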

Data & Methods

Design: Three pre-registered, between-subjects experiments, used to avoid the learning and demand effects of within-subject comparisons.
- Study 1: Compares a prediction sample (participants state whether they would use AI on given tasks) with a completion sample (participants may choose to use an LLM chatbot while completing the tasks). The key contrast is the proportion choosing AI in each sample; sample sizes and β coefficients are reported.
- Study 2: Prediction sample reported expected completion time and subjective effort (NASA-TLX) for independent vs AI-assisted completions. Completion sample randomly assigned to independent or AI-assisted conditions; actual times and effort recorded and decomposed (prompt construction, model response time, reading/processing).
- Study 3: Exposure phase (two tasks completed either with AI or independently) followed by a test phase (choices for easy tasks) to measure carryover and calibration shifts.
Measures: objective completion time (seconds), subjective effort (NASA-TLX), binary AI-use choices, prompt decomposition, and questionnaire items (e.g., confidence/agreement). Statistical analyses reported with beta coefficients and p-values; robustness checks across tasks and participant traits included.

Implications for AI Economics

Productivity estimates are likely biased upward if based on user expectations or self-reports. Firms and policymakers should not rely on stated beliefs about AI time savings; measured, task-level productivity assessments are essential.
Transaction costs and UI friction matter. Prompting and interaction latency can erase or reverse productivity gains on low-complexity tasks. Economic models of AI adoption should explicitly include per-task interaction costs (prompting, verification, reading) rather than assuming negligible setup costs.
Adoption dynamics: short-term experience with AI can increase future use even when efficiency gains are small or negative. Diffusion models should incorporate behavioral miscalibration and positive feedback (carryover) leading to potential overuse/lock-in of AI tools.
Labor and skill effects: frequent delegation on trivial tasks may produce cognitive deskilling and reduce human task competency over time. Human capital models should account for endogenous skill depreciation from offloading and possible long-run productivity losses.
Welfare and investment decisions: organizations investing in AI (tools, subscriptions, integrations) should weigh low marginal gains for simple tasks against risks of overreliance and broader impacts on training, quality control, and task allocation. Cost–benefit analyses should include indirect costs (reduced learning, verification time, potential errors).
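The point about per-task interaction costs can be made concrete with a simple break-even rule: delegating a task to AI pays off only when the time to do it independently exceeds the combined cost of prompting, processing the response, and any verification. The sketch below is an illustrative model, not one proposed in the paper. The class and function names are hypothetical, the verification term and the example task durations (60 and 300 seconds) are assumptions, and the default costs reuse the Study 2 averages from the summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionCosts:
    """Per-task time costs of using AI, in seconds (hypothetical model)."""

    prompt_s: float              # writing or pasting the prompt
    processing_s: float          # waiting for and reading the response
    verification_s: float = 0.0  # ASSUMPTION: optional checking step, not reported in the summary

    @property
    def total_s(self) -> float:
        return self.prompt_s + self.processing_s + self.verification_s


def net_saving_s(independent_s: float, costs: InteractionCosts) -> float:
    """Seconds saved by delegating; negative means AI use is slower."""
    return independent_s - costs.total_s


def worth_delegating(
    independent_s: float, costs: InteractionCosts, min_saving_s: float = 0.0
) -> bool:
    """True when delegation saves more than the chosen threshold."""
    return net_saving_s(independent_s, costs) > min_saving_s


# Average interaction costs reported for Study 2 (prompting ~48.7s, processing ~37.6s).
STUDY2_AVERAGE_COSTS = InteractionCosts(prompt_s=48.7, processing_s=37.6)

if __name__ == "__main__":
    for task_s in (60.0, 300.0):  # hypothetical independent completion times
        saving = net_saving_s(task_s, STUDY2_AVERAGE_COSTS)
        verdict = worth_delegating(task_s, STUDY2_AVERAGE_COSTS)
        print(f"independent={task_s:.0f}s saving={saving:.1f}s delegate={verdict}")
```

With these inputs, a hypothetical one-minute task loses about 26 seconds when delegated, while a five-minute task still comes out ahead. Real interaction costs vary by task and user, so the averages serve only as placeholders.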
Policy/design recommendations to mitigate miscalibration and overuse:
- Measure realized time/effort impacts within organizations before wide deployment and automation of simple tasks.
- Reduce prompting friction (better UX, templates, autosuggestions) so that realized gains move closer to user expectations in cases where AI use is beneficial.
- Provide real-time calibration feedback to users (e.g., show measured time saved/consumed) to prevent inefficient offloading loops.
- Design nudges or default settings that discourage AI use on trivial tasks where it slows work or harms skill formation.
- Monitor adoption externalities and consider guidelines for when human-only completion is socially preferable (training, audits).
Research directions for AI economics:
- Quantify long-run effects of routine AI offloading on human capital formation and labor supply.
- Build diffusion/adoption models that endogenize misperception-driven feedback loops and UI transaction costs.
- Field experiments in organization settings to measure aggregate productivity, verification costs, and error externalities from widespread AI adoption.

Concluding note: users expect large efficiency wins from LLM assistance, but for many simple tasks those gains are minimal or absent; economists and decision-makers should measure realized effects, include interaction costs, and guard against miscalibration-driven overadoption.

Assessment

Paper Type: quasi_experimental
Evidence Strength: medium. Large, pre-registered multi-study design with direct behavioral measurements provides credible evidence about choices and miscalibration in the lab/online setting; however, outcomes are limited to simple tasks and short sessions, and the paper does not measure downstream economic outcomes (e.g., wages, firm productivity) or long-run adoption in real-world settings.
Methods Rigor: medium. Strengths include pre-registration, multiple independent studies, a large pooled sample, and direct observation of behavior versus self-report; limitations include probable reliance on online panel participants, artificial/simple tasks that may invite demand effects, possible unreported attrition or heterogeneous treatment compliance, and limited external validity beyond short-term simple-task contexts.
Sample: Pooled sample of N=2,691 participants across three pre-registered online user studies who completed cognitively simple tasks (e.g., arithmetic, spell-check, answering simple questions); participants were assigned to conditions that varied AI availability/exposure and reported estimates of their AI use and perceived time/effort savings (detailed demographics and recruitment platform not specified in the summary).
Themes: human_ai_collab, adoption, productivity
Identification: Pre-registered behavioral experiments (three studies, N=2,691) that observe participant choices to use AI assistance on simple tasks and compare actual usage to self-reports; causal leverage comes from experimental variation in AI availability/exposure and session-level prior-exposure contrasts to estimate carryover effects and treatment-on-the-treated style comparisons of efficiency gains.
Generalizability:
- Tasks are simple cognitive exercises (arithmetic, spell-check, simple Q&A) and may not generalize to complex, real-world, or workplace tasks.
- Participants were likely recruited from online panels (e.g., MTurk/Prolific) and are not nationally representative.
- Short-session laboratory/online experiments may not reflect long-run adoption, learning, or habituation in real settings.
- The specific AI interface and framing in the experiments may affect behavior differently than commercial products.
- Cultural and institutional context may limit transferability across countries or professional environments.

Claims (6)

Claim	Category	Direction	Confidence	Outcome	Details
People frequently choose to use AI even when doing so is inefficient (i.e., provides no meaningful time or effort savings).	Adoption Rate	positive	high	frequency_of_AI_use_when_AI_is_inefficient	n=2691 0.8
People exhibit self-estimate miscalibration: on average they believe they are using AI less than they actually are.	Adoption Rate	negative	high	discrepancy_between_self_reported_and_actual_AI_use	n=2691 0.8
People display 'efficiency-gain illusions': they overestimate how much time and effort savings AI use provides.	Task Completion Time	positive	high	perceived_time_and_effort_savings_vs_actual_time_and_effort_savings	n=2691 0.8
There is a session-level carryover effect: a participant's prior AI use leads to further AI adoption and entrenches their miscalibration about time savings.	Adoption Rate	positive	high	effect_of_prior_AI_use_on_subsequent_AI_adoption_and_miscalibration	n=2691 0.48
The paper's findings are based on three pre-registered user studies with a combined sample size of N = 2691.	Other	null_result	high	study_sample_description	n=2691 0.8
People are increasingly turning to AI assistance for simple tasks (e.g., arithmetic, spell-check, answering simple questions).	Adoption Rate	positive	medium	trend_in_AI_adoption_for_simple_tasks	0.14