The Commonplace

LLM assistance shortens programmers' idea-generation time and reduces observed creative moments, yet the resulting code contains about as many ideas as unassisted code and is more often correct.

"Like Taking the Path of Least Resistance": Exploring the Impact of LLM Interaction on the Creative Process of Programming
Zeinabsadat Saghi, Run Huang, Souti Chattopadhyay · May 13, 2026
arXiv · quasi-experimental · medium evidence · relevance 7/10
In a within-subject study of 20 programmers, LLM assistance shortened idea-generation time and reduced observed 'creative moments' but produced code with similar numbers of ideas and higher correctness/functionality than unassisted solutions.

Creativity is fundamentally human. As AI takes on more of the generative work that once required human imagination, despite documented limitations in creative ability, a critical question emerges: How does GenAI affect users' creativity? Through a within-subject study followed by retrospective interviews with (N=20) programmers, we investigated the impact of LLMs on participants' process of creative thinking in programming and the creativity of generated solutions. Across two conditions (LLM-assisted vs. unassisted), participants using LLMs had significantly shorter idea-generation periods (p=0.0004), leading to fewer creative moments (p=0.002). Qualitative analysis of participants' interactions and interviews revealed four different human-LLM collaboration modes supporting various problem-solving strategies. However, a comparative analysis of the generated solutions shows that while LLMs can help generate more correct and functional code, their solutions contain roughly the same number of ideas as participant-generated ones. Based on our findings, we discuss design implications and considerations for effectively using LLMs to support user creativity.

Summary

Main Finding

Using an LLM-based coding assistant (GitHub Copilot) speeds up implementation and produces more functionally correct code, but shortens idea-generation time and reduces the number of observed creative “aha” moments during programming. Despite the process-level change, the final solutions contain roughly the same number of distinct ideas as unassisted solutions. How the LLM is used (collaboration mode) strongly mediates its effect on creativity.

Key Points

  • Study design: within-subject, counterbalanced lab study with N=20 programmers; each completed four Python tasks under two conditions (LLM-assisted vs. unassisted).
  • Main quantitative results:
    • Idea-generation periods were significantly shorter with LLM assistance (p = 0.0004).
    • Participants experienced significantly fewer creative moments in LLM-assisted sessions (p = 0.002).
    • LLM-assisted code was functionally and syntactically better overall, but contained roughly the same number of unique ideas as participant-written solutions.
  • Four empirically observed human–LLM collaboration modes:
    • Brainstormer: LLM used to generate ideas.
    • Implementer: LLM used to implement a participant’s idea.
    • Verifier: LLM used to check/refine solutions.
    • Co-pilot: active back-and-forth assistance where LLM shares thinking agency.
  • Creative outcomes depend on mode: modes that preserve user ideation agency (implementer, verifier) produced more creative moments and diversity than modes where the user ceded thinking to the LLM (co-pilot/brainstormer).
  • Behavioral observation: participants often followed “the path of least resistance” — delegating ideation to LLMs rather than using freed time to explore alternatives.
  • Design note from pilots: aggressive autocompletion (Copilot) can bias or shortcut the user’s creative process and was controlled in the experiment.
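
The within-subject comparisons above can be illustrated with a paired test. The sketch below is a minimal exact sign-flip permutation test on per-participant differences. It is not the authors' analysis: the summary does not say which statistical test produced the reported p-values, and the timing values here are hypothetical placeholders rather than study data.

```python
from itertools import product
from statistics import mean


def paired_permutation_test(assisted, unassisted):
    """Exact two-sided sign-flip permutation test on paired differences.

    Returns (mean difference, p-value). Assumes one value per participant
    per condition, listed in the same participant order.
    """
    if len(assisted) != len(unassisted):
        raise ValueError("each participant needs one value per condition")
    diffs = [a - u for a, u in zip(assisted, unassisted)]
    observed = abs(mean(diffs))
    extreme = 0
    total = 0
    # Under the null hypothesis the condition labels are exchangeable within
    # a participant, so each difference is equally likely to have either sign.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = mean(s * d for s, d in zip(signs, diffs))
        if abs(flipped) >= observed - 1e-12:
            extreme += 1
    return mean(diffs), extreme / total


# Hypothetical idea-generation times in seconds for six participants
# (illustrative only, not data from the paper).
assisted_secs = [40, 55, 30, 62, 48, 35]
unassisted_secs = [90, 80, 75, 100, 70, 85]

mean_diff, p_value = paired_permutation_test(assisted_secs, unassisted_secs)
print(f"mean difference = {mean_diff:.1f} s, p = {p_value:.4f}")
```

With six pairs that all move in the same direction, the smallest two-sided p-value this test can return is 2/64, about 0.031. Full enumeration for 20 participants needs 2^20 sign patterns, which is still feasible but slow in pure Python; a random sample of sign patterns is a common substitute.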

Data & Methods

  • Participants: 20 (after excluding 3 for dropout or data loss), recruited from undergraduate and graduate students with Python and LLM exposure; varied experience levels.
  • Tasks: 4 programming tasks per participant (two algorithmic tasks with complexity constraints and two system-design tasks), implemented in Python inside VS Code.
  • Conditions:
    • Unassisted: no external resources allowed; only the provided IDE.
    • LLM-assisted: GitHub Copilot accessible; no other external resources allowed.
  • Procedure:
    • Within-subject, counterbalanced order of conditions and tasks.
    • Think-aloud protocol, screen/audio recording.
    • Pop-up self-report prompt every 5 minutes asking about “aha” moments to triangulate observed creative moments.
    • Post-session retrospective interviews probing strategy, LLM effects, and triggers for creative moments.
  • Analysis:
    • Mixed-methods: quantitative comparisons of idea-generation duration, number of creative moments, code quality (functional/syntactic correctness), and number of unique ideas in solutions; qualitative coding of interaction episodes and interviews to identify collaboration modes.
  • Limitations noted by authors:
    • Lab setting, limited sample size (N=20), participant pool skewed to students/researchers.
    • Single LLM tool (GitHub Copilot) and single language (Python) — limits generalizability.
    • Task lengths were short; autocompletion effects were controlled but may differ in real-world workflows.
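
The counterbalancing described under Procedure could be set up along the following lines. This is a sketch under stated assumptions, not the authors' scheme: the summary does not give the exact assignment, so the code assumes each participant does one algorithmic and one system-design task per condition. The task labels and participant IDs are hypothetical.

```python
from itertools import product

# Hypothetical labels: the summary only says there were two algorithmic
# and two system-design tasks.
ALGORITHMIC = ("algo_1", "algo_2")
DESIGN = ("design_1", "design_2")
CONDITIONS = ("llm_assisted", "unassisted")


def other(options, chosen):
    """Return the element of a two-item tuple that is not `chosen`."""
    return options[1 - options.index(chosen)]


def build_schedules():
    """Enumerate the 8 schedules: which condition comes first and which task
    of each type is done in that first condition."""
    schedules = []
    # The condition varies fastest, so consecutive participants alternate
    # between LLM-first and unassisted-first orders.
    for algo_first, design_first, first_cond in product(ALGORITHMIC, DESIGN, CONDITIONS):
        second_cond = other(CONDITIONS, first_cond)
        schedules.append([
            (first_cond, algo_first),
            (first_cond, design_first),
            (second_cond, other(ALGORITHMIC, algo_first)),
            (second_cond, other(DESIGN, design_first)),
        ])
    return schedules


def assign(participant_ids):
    """Cycle participants through the schedules in order."""
    schedules = build_schedules()
    return {pid: schedules[i % len(schedules)] for i, pid in enumerate(participant_ids)}


plan = assign([f"P{n:02d}" for n in range(1, 21)])
for pid in ("P01", "P02"):
    print(pid, plan[pid])
```

With 20 participants and 8 schedules, condition order balances exactly (10 start in each condition), but the task-to-condition pairing cannot balance perfectly, since 20 is not a multiple of 8.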

Implications for AI Economics

  • Productivity vs. Creative Process trade-off
    • Short-term productivity gains: LLMs raise implementation speed and reduce syntactic/functional errors, implying higher measured output per coder hour for routine implementation tasks.
    • Process externality: reduced ideation time and fewer creative moments imply potential long-run declines in individual creative skill maintenance and exploration efforts, which may lower the rate of novel innovations absent countervailing actions.
  • Labor demand and skill composition
    • Reallocation of tasks: demand likely shifts away from routine implementation toward higher-level design, evaluation, and orchestration tasks. Workers who can preserve or augment ideation capability (critical thinking, design framing) capture more value.
    • Human capital depreciation risk: widespread reliance on LLMs for ideation/creative thinking could erode developers’ creative problem-solving skills, lowering long-run human capital in creativity-intensive tasks and affecting wages for those who fail to adapt.
  • Product differentiation, market structure, and aggregate innovation
    • Homogenization risk: LLMs’ tendency to propose modal/common patterns can push outputs toward similar solutions across developers and firms, potentially reducing product differentiation and increasing competitive pressures based on speed/cost rather than novelty.
    • Innovation growth: if reduced process-level creativity translates into fewer truly novel solutions, aggregate innovation rates could slow — particularly in domains where creativity drives breakthrough products.
  • Returns to complementary investments
    • Complementarity with organization design and interfaces: firms that invest in workflows, tooling, and training that preserve human ideation (e.g., using LLMs as implementers/verifiers, not ghostwriters) will likely realize better long-term innovation returns.
    • Platform design matters: LLM providers and IDE vendors can influence economic outcomes by offering interaction modes that encourage user agency (e.g., delayed/conditional suggestions, scaffolded ideation tools) — affecting productivity, skill retention, and product diversity across the economy.
  • Policy and measurement suggestions for economists and firms
    • Track idea diversity and creative-process indicators, not just output quantity—e.g., measures of solution heterogeneity across teams, time spent in ideation, and frequency of novel approaches.
    • Monitor skill trajectories of workers exposed to LLMs longitudinally to detect human-capital depreciation in creative skills.
    • Consider incentives for maintaining creative engagement (training, rotation to ideation tasks, design reviews) to offset cognitive offloading externalities.
  • Practical recommendations
    • For firms: integrate LLMs as tools that augment implementation and verification while institutionalizing practices (code review, design workshops) that preserve developer ideation.
    • For platform designers: provide collaboration-mode controls (brainstorm vs. implement vs. verify), adjustable suggestion aggressiveness, and explicit scaffolds that nudge users to explore alternatives before accepting completions.
    • For policymakers: encourage transparency/reporting about how LLMs are used in creative work and support workforce upskilling programs focused on higher-order creative skills.
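
One way to make the "solution heterogeneity" measure suggested above concrete is the mean pairwise Jaccard distance between the sets of ideas found in each solution. The sketch below assumes each solution has already been hand-coded into a set of idea labels. The metric choice, the labels, and the two example teams are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_distance(idea_sets):
    """Average Jaccard distance over all pairs of solutions.

    0.0 means every solution uses the same ideas; 1.0 means no two
    solutions share any idea.
    """
    pairs = list(combinations(idea_sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical idea labels for three solutions per team (illustrative only).
team_a = [
    {"hash_map", "two_pointers"},
    {"hash_map", "sorting"},
    {"recursion", "memoization"},
]
team_b = [
    {"hash_map", "sorting"},
    {"hash_map", "sorting"},
    {"hash_map", "sorting", "caching"},
]

print(f"team A heterogeneity: {mean_pairwise_distance(team_a):.3f}")
print(f"team B heterogeneity: {mean_pairwise_distance(team_b):.3f}")
```

In this toy data, team B's solutions converge on the same pattern and score lower than team A's. Tracked over time, a falling score of this kind would be one possible signal of the homogenization risk discussed above.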

Overall, the paper shows LLMs improve implementation efficiency but can shorten the creative process unless interaction design and organizational practices deliberately preserve human ideation agency. Those dynamics have measurable implications for productivity, skill demand, innovation, and market outcomes in the AI-augmented economy.

Assessment

Paper Type: quasi_experimental

Evidence Strength: medium. The within-subject design gives reasonably strong internal comparison by controlling for individual heterogeneity and finds statistically significant differences, but the small sample (N=20), potential order/learning effects, and task/domain specificity limit robustness and external validity.

Methods Rigor: medium. The study combines quantitative within-subject tests and qualitative interviews, which is appropriate for the research question; however, the small convenience sample, limited information on randomization/counterbalancing, possible subjective measurement of 'creative moments', and no mention of pre-registration or inter-rater reliability reduce methodological rigor.

Sample: N=20 programmers participated in a within-subject lab study performing programming tasks under two conditions (LLM-assisted and unassisted); data include task timing, counts of identified 'creative moments', participants' code outputs (correctness/functionality and idea counts), interaction logs with the LLM, and retrospective interview transcripts.

Themes: human_ai_collab, productivity, skills_training

Identification: Within-subject experimental comparison: each participant completed programming tasks in two conditions (LLM-assisted vs. unassisted), so participants act as their own controls; quantitative differences in idea-generation time, counts of creative moments, and code correctness are tested with standard statistical tests (p-values), supplemented by retrospective qualitative interviews to triangulate processes.

Generalizability:
  • Small sample size (N=20) limits statistical power and representativeness.
  • Participants are programmers only; results may not generalize to other creative domains (writing, design, art).
  • Unclear participant demographics/experience distribution (convenience sampling likely).
  • Laboratory task setting and short-duration tasks may not reflect long-term, real-world workflows or complex projects.
  • Specific LLM model, interface, and prompt setups used may not generalize to other models or integration contexts.
  • Potential order/learning effects in within-subject design if not fully counterbalanced.

Claims (6)

  • Claim: We conducted a within-subject study followed by retrospective interviews with programmers (N=20).
    • Tag: Other · Direction: positive · Confidence: high
    • Outcome: study_design_and_sample
    • Details: n=20; 0.48
  • Claim: Participants using LLMs had significantly shorter idea-generation periods (p=0.0004).
    • Tag: Task Completion Time · Direction: negative · Confidence: high
    • Outcome: idea-generation period (time spent generating ideas)
    • Details: n=20; 0.48
  • Claim: Using LLMs led to fewer creative moments observed in participants (p=0.002).
    • Tag: Creativity · Direction: negative · Confidence: high
    • Outcome: count of creative moments
    • Details: n=20; 0.48
  • Claim: Qualitative analysis of participants' interactions and interviews revealed four different human-LLM collaboration modes supporting various problem-solving strategies.
    • Tag: Other · Direction: positive · Confidence: high
    • Outcome: types of collaboration modes
    • Details: n=20; 0.24
  • Claim: LLMs can help generate more correct and functional code compared to participant-generated solutions.
    • Tag: Output Quality · Direction: positive · Confidence: high
    • Outcome: correctness and functionality of generated code
    • Details: 0.48
  • Claim: LLM-generated solutions contain roughly the same number of ideas as participant-generated solutions.
    • Tag: Creativity · Direction: null_result · Confidence: high
    • Outcome: number of ideas per solution
    • Details: 0.48

Notes