The Commonplace

Fluent large language models can make users feel more competent than they are: the proposed 'LLM fallacy' argues that opaque, low-friction AI outputs encourage people to credit themselves for machine-produced work, risking miscalibrated hiring, promotion, and training decisions.

The LLM Fallacy: Misattribution in AI-Assisted Cognitive Workflows
Hyunwoo Kim, Harin Yu, Hanau Yi · April 16, 2026
arXiv · theoretical · evidence: n/a · relevance: 7/10
The paper proposes the 'LLM fallacy' — a cognitive attribution error where fluent, opaque LLM outputs lead users to overattribute competence to themselves, creating a systematic gap between perceived and actual ability with implications for education, hiring, and workplace evaluation.

The rapid integration of large language models (LLMs) into everyday workflows has transformed how individuals perform cognitive tasks such as writing, programming, analysis, and multilingual communication. While prior research has focused on model reliability, hallucination, and user trust calibration, less attention has been given to how LLM usage reshapes users' perceptions of their own capabilities. This paper introduces the LLM fallacy, a cognitive attribution error in which individuals misinterpret LLM-assisted outputs as evidence of their own independent competence, producing a systematic divergence between perceived and actual capability. We argue that the opacity, fluency, and low-friction interaction patterns of LLMs obscure the boundary between human and machine contribution, leading users to infer competence from outputs rather than from the processes that generate them. We situate the LLM fallacy within existing literature on automation bias, cognitive offloading, and human–AI collaboration, while distinguishing it as a form of attributional distortion specific to AI-mediated workflows. We propose a conceptual framework of its underlying mechanisms and a typology of manifestations across computational, linguistic, analytical, and creative domains. Finally, we examine implications for education, hiring, and AI literacy, and outline directions for empirical validation. We also provide a transparent account of human–AI collaborative methodology. This work establishes a foundation for understanding how generative AI systems not only augment cognitive performance but also reshape self-perception and perceived expertise.

Summary

Main Finding

The paper defines the "LLM fallacy": a cognitive attribution error in which users systematically misinterpret outputs co-produced with large language models (LLMs) as evidence of their own independent competence. The result is a persistent divergence between perceived ability (inferred from LLM-assisted outputs) and actual unaided capability. The fallacy arises from three LLM properties (fluency, pipeline opacity, and rapid interaction), which create attributional ambiguity and encourage cognitive outsourcing.

Key Points

  • Definition: The LLM fallacy = misattribution of LLM-assisted outputs to the user's own independent competence (an attributional, not purely epistemic, distortion).
  • Distinct from related concepts:
    • Not the same as hallucination (which is an output error), automation bias (over-reliance), or mere cognitive offloading; it specifically concerns self-assessment/skill attribution.
  • Core mechanisms:
    • Attribution ambiguity: iterative, underspecified prompts + polished model outputs make it hard to separate human vs. machine contribution.
    • Fluency illusion: grammatically/cohesively fluent outputs serve as heuristic cues of expertise.
    • Cognitive outsourcing: reliance on LLMs reduces engaged practice and internalization of skills.
    • Pipeline opacity: hidden intermediate computations prevent users from tracing how outputs were produced.
  • Formalization: capability divergence (ΔC = perceived − actual competence) emerges from interaction of system properties (opacity, fluency, immediacy) mediated by attribution ambiguity and outsourcing.
  • Multi-domain manifestations:
    • Computational: producing working code without understanding its architecture or how to debug it.
    • Linguistic: producing fluent text in a foreign language without true language ability.
    • Analytical: reproducing arguments and analyses without internalized reasoning.
    • Creative: attributing authorship and creativity to oneself despite the model's contribution.
    • Epistemic: equating access to summaries and explanations with deep understanding.
    • Professional signaling: presenting resumes and interview answers that reflect LLM-enabled outputs rather than transferable skill.
  • Empirical grounding: the paper is conceptual and observational. It collects recurring patterns from practice, cites prior empirical literature, and presents illustrative cases rather than controlled experiments.
  • Research agenda: proposes testable hypotheses and measurement constructs (e.g., ΔC), and calls for experiments, longitudinal studies, and evaluation frameworks to validate and quantify the effect.
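
The capability divergence named in the formalization point above can be made concrete with a small sketch. The summary only defines ΔC as perceived minus actual competence; the common 0 to 1 scoring scale, the field names, and the helper functions below are illustrative assumptions, not the authors' measurement instrument.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Participant:
    # Hypothetical fields: both scores are assumed to share a 0-1 scale.
    perceived: float  # self-rated competence after LLM-assisted work
    actual: float     # score on an unaided test of the same skill


def capability_divergence(p: Participant) -> float:
    """Delta C = perceived - actual; a positive value means overestimation."""
    return p.perceived - p.actual


def mean_divergence(sample: list[Participant]) -> float:
    """Average Delta C across a sample of participants."""
    return mean(capability_divergence(p) for p in sample)


# Made-up illustrative scores, not data from the paper.
sample = [
    Participant(perceived=0.8, actual=0.5),
    Participant(perceived=0.6, actual=0.6),
    Participant(perceived=0.9, actual=0.4),
]
print(round(mean_divergence(sample), 3))
```

Under this reading, a participant with ΔC near zero is well calibrated, while a large positive ΔC is the pattern the paper calls the LLM fallacy.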

Data & Methods

  • Approach: theoretical synthesis and conceptual modeling drawing on literature from automation bias, cognitive offloading, extended mind, agency, metacognition, and human–AI interaction.
  • Evidence type: cross-context observational patterns, illustrative cases (coding, education, hiring), and citations to related empirical work. The paper reports no new randomized controlled trials or large-scale quantitative datasets.
  • Proposed empirical strategies for validation (suggested by authors):
    • Lab experiments: task completions with/without LLM assistance followed by unaided transfer tests to measure ΔC.
    • Longitudinal studies: track skill acquisition and reliance patterns over time as LLM exposure increases.
    • Field/hiring audits: compare hiring outcomes and on-the-job unaided performance for LLM-assisted applicants.
    • Natural experiments: exploit policy or platform changes (e.g., mandated disclosure of AI assistance) to identify effects on signaling and hiring.
    • Metrics & econometric designs: operationalize capability divergence; measure interactional immediacy, fluency, and opacity; use RCTs, difference-in-differences, and mediation analysis to estimate causal pathways.
  • Practical measurement suggestions: pair observed output quality with independent tests of domain knowledge or transfer tasks; instrument for intensity/type of LLM use; measure downstream task robustness (e.g., debugging, extension, unaided problem solving).
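
One of the designs listed above, difference-in-differences, can be sketched in a few lines. This is a minimal illustration, assuming ΔC has already been measured for a group exposed to some intervention (for example, a disclosure mandate) and for a comparison group, before and after the change. The function name and all numbers are hypothetical and do not come from the paper.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up Delta C measurements (perceived minus actual competence).
treated_pre = [0.30, 0.40, 0.35]   # before the hypothetical intervention
treated_post = [0.20, 0.25, 0.15]  # after the hypothetical intervention
control_pre = [0.30, 0.35, 0.40]
control_post = [0.25, 0.30, 0.35]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"estimated effect on Delta C: {effect:+.2f}")
```

A negative estimate would suggest the intervention narrowed the gap between perceived and actual competence, but only under the usual parallel-trends assumption, which a real study would need to defend.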

Implications for AI Economics

  • Labor market signaling and hiring:
    • Output-based signals (e.g., take-home tests, portfolios) may be inflated by LLM assistance, degrading the informativeness of observable outputs as signals of skill.
    • Firms face higher screening costs and greater risk of hiring workers whose unaided productivity is lower than implied by submitted work.
    • Potential credential inflation: more applicants may claim "skills" producible primarily via LLMs, shifting equilibrium and raising recruiter reliance on costly screening or credentials.
  • Human capital formation and returns to skill:
    • Offloading to LLMs can reduce on-the-job and educational learning, potentially slowing accumulation of underlying human capital and changing the return profile of investments in training.
    • Skills that are frequently outsourced to LLMs may depreciate or fail to transfer, producing mismatches between nominal task performance and durable competence.
  • Productivity measurement and firm-level performance:
    • Measured productivity gains from LLM tools (output-per-worker) may overstate long-run productive capacity if gains reflect model assistance rather than improved worker skill.
    • Short-term efficiency improvements could mask long-term fragility: products that require human debugging or domain expertise may underperform if workers lack internalized skills.
  • Market design and equilibrium effects:
    • Adverse selection and moral hazard: workers may misrepresent unaided ability; employers may underinvest in training if outputs remain superficially acceptable via LLMs.
    • Platforms and credentialing institutions become more valuable as verifiers; markets for credible testing, proctored assessments, or verified skill badges may expand.
  • Policy and regulation:
    • Disclosure requirements for AI-assisted work (in applications, certifications, educational submissions) could reduce misattribution externalities and improve market efficiency.
    • Subsidies or standards for skill certification, and investment in measurement tools that distinguish unaided capability from assisted output, could mitigate misallocation.
    • Education policy: curriculum design should emphasize procedural knowledge, transfer tasks, and assessments that test unaided competence to prevent hollowed learning.
  • Research and measurement priorities for economists:
    • Quantify ΔC across tasks and populations; estimate effects on wages, hiring probability, training investments, and firm productivity.
    • Identify thresholds where LLM assistance is productive vs. where it generates harmful signaling distortions.
    • Evaluate interventions (disclosure mandates, proctored tests, on-the-job probation periods) via RCTs or natural experiments to measure improvements in matching and skill accumulation.

Overall, the paper highlights a potentially large, underrecognized market failure: when observed outputs no longer reliably signal underlying human capability, labor market matching, human capital accumulation, and productivity measurement can be distorted. Economists should prioritize measuring the magnitude of this misattribution, its persistence, and the effectiveness of institutional remedies (disclosure, testing, certification, platform design) to restore credible signals of skill.
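
The signaling concern can be illustrated with a toy simulation. It is a sketch under strong assumptions that are not from the paper: skill is uniform on [0, 1], unaided output equals skill plus noise, and LLM assistance lifts every output to at least a fixed quality floor. The function names and parameter values are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=2000, floor=0.8, noise=0.1, seed=0):
    """Return (unaided, assisted) correlations between skill and output quality."""
    rng = random.Random(seed)
    skill = [rng.random() for _ in range(n)]
    # Unaided output tracks skill plus evaluation noise.
    unaided = [s + rng.gauss(0, noise) for s in skill]
    # Assumption: LLM assistance lifts every output to at least `floor`.
    assisted = [max(s, floor) + rng.gauss(0, noise) for s in skill]
    return pearson(skill, unaided), pearson(skill, assisted)


r_unaided, r_assisted = simulate()
print(f"skill vs unaided output:  r = {r_unaided:.2f}")
print(f"skill vs assisted output: r = {r_assisted:.2f}")
```

In this toy model the assisted outputs of most workers cluster around the floor, so output quality tells an employer much less about underlying skill. How large the effect is in practice is exactly the empirical question the paper leaves open.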

Assessment

Paper Type: theoretical
Evidence Strength: n/a. The paper is conceptual and synthesizes prior literature without presenting original empirical tests or causal identification; claims are theoretical and intended to motivate future empirical validation.
Methods Rigor: n/a. No empirical methods or data analysis are reported; rigor pertains to theoretical clarity, literature synthesis, and conceptual framing rather than statistical or experimental design.
Sample: No empirical sample; the paper develops a conceptual framework and typology based on existing literature in automation bias, cognitive offloading, and human–AI collaboration, with illustrative examples across computational, linguistic, analytical, and creative domains.
Themes: human_ai_collab, skills_training, labor_markets, productivity
Generalizability:
  • No empirical validation; applicability depends on future studies across populations and contexts
  • May vary by user skill level (novices vs experts) and domain (coding, writing, analysis, creative work)
  • Likely sensitive to model characteristics (fluency, transparency, error rates) and interface design
  • Cultural and institutional differences in attribution, assessment, and accountability not addressed
  • Workplace incentives (performance metrics, supervision) and task structure could alter manifestations

Claims (8)

  • Claim 1: The rapid integration of large language models (LLMs) into everyday workflows has transformed how individuals perform cognitive tasks such as writing, programming, analysis, and multilingual communication.
    • Category: Organizational Efficiency
    • Direction: positive
    • Confidence: high
    • Outcome: how individuals perform cognitive tasks (writing, programming, analysis, multilingual communication)
    • Details: 0.02
  • Claim 2: Less attention has been given to how LLM usage reshapes users' perceptions of their own capabilities.
    • Category: Skill Acquisition
    • Direction: null_result
    • Confidence: high
    • Outcome: degree of prior research focus on users' self-perception following LLM use
    • Details: 0.06
  • Claim 3: The paper introduces the 'LLM fallacy,' a cognitive attribution error in which individuals misinterpret LLM-assisted outputs as evidence of their own independent competence, producing a systematic divergence between perceived and actual capability.
    • Category: Skill Acquisition
    • Direction: negative
    • Confidence: high
    • Outcome: divergence between perceived competence and actual competence when using LLM outputs
    • Details: 0.02
  • Claim 4: The opacity, fluency, and low-friction interaction patterns of LLMs obscure the boundary between human and machine contribution, leading users to infer competence from outputs rather than from the processes that generate them.
    • Category: Decision Quality
    • Direction: negative
    • Confidence: high
    • Outcome: user inference of competence (output-based vs process-based attribution)
    • Details: 0.02
  • Claim 5: The LLM fallacy is situated within existing literature on automation bias, cognitive offloading, and human–AI collaboration, but is distinguished as a form of attributional distortion specific to AI-mediated workflows.
    • Category: AI Safety and Ethics
    • Direction: null_result
    • Confidence: high
    • Outcome: conceptual distinctiveness of LLM fallacy relative to related constructs
    • Details: 0.06
  • Claim 6: The paper proposes a conceptual framework of the underlying mechanisms of the LLM fallacy and a typology of its manifestations across computational, linguistic, analytical, and creative domains.
    • Category: AI Safety and Ethics
    • Direction: positive
    • Confidence: high
    • Outcome: formal framework and typology coverage across domains
    • Details: 0.02
  • Claim 7: The LLM fallacy has implications for education, hiring, and AI literacy.
    • Category: Hiring
    • Direction: mixed
    • Confidence: high
    • Outcome: impacts on education practices, hiring decisions, and AI literacy needs
    • Details: 0.02
  • Claim 8: This work establishes a foundation for understanding how generative AI systems not only augment cognitive performance but also reshape self-perception and perceived expertise.
    • Category: Skill Acquisition
    • Direction: mixed
    • Confidence: high
    • Outcome: interaction between augmented cognitive performance and changes in self-perception/perceived expertise
    • Details: 0.06

Notes