In workplace use, generative AI tools routinely let contextual cues collapse into one another or rot over time, forcing users into ad hoc fixes; the paper argues firms should stop indiscriminately hoarding context data and instead design interactional practices that keep AI systems aligned with users' real-world contexts.

Context Collapse: Barriers to Adoption for Generative AI in Workplace Settings

Emanuel Moss, Elizabeth Watkins, Christopher Persaud, Dawn Nafus, Passant Karunaratne, Mona Sloane · April 06, 2026

Source: arXiv · Paper type: descriptive · Evidence: low · Relevance: 7/10

Through expert interviews, the paper shows that current GenAI tools systematically mishandle users' contexts and that users and developers adopt ad hoc interactional strategies to compensate, suggesting a shift from data collection toward embedding interactional practices.

As generative AI technologies are pressed into service in workplace settings, current approaches to account for the contexts in which such technologies are used fall short of users' expectations and needs. This paper empirically demonstrates, through expert interviews, both how these tools fail to account for users' context and how users deploy concrete strategies to address such failures. The paper analyzes how context is variously conceptualized by tool developers, users, and social scientists to identify specific pitfalls inherent in computational approaches to context. Multiple distinct contexts tend to collapse into one another or rot, degrading over time, reducing the utility of any efforts to account for context. The paper concludes with a provocation to shift from an indiscriminate collection of context-relevant data toward a more interactional set of practices to embed GenAI systems more appropriately into users' contexts of use.

Summary

Main Finding

Generative AI systems fail to reliably account for the social and professional contexts in which expert users operate. Attempts by developers to solve this by collecting ever more context data (larger context windows, RAG, metadata, user histories) produce limited gains and introduce new frictions (privacy, bias, “context rot” and “context collapse”). As a result, professionals spend extra effort (prompting, verification, workflow workarounds) to make outputs usable, reducing the promised productivity gains and slowing adoption. The authors argue for a shift away from indiscriminate data accumulation toward interactional design practices that embed GenAI into users’ actual contexts of use (what they call context engineering done through interaction, not just data aggregation).

Key Points

Definition gaps: “Context” means different things to model developers (tokens, context windows, retrieved documents, metadata) and to professionals (projects, clients, deliverables, tacit procedures, social relationships). This mismatch produces systematic failures.
Failure modes observed: generic outputs, persistent hallucinations, off-topic responses, and bias toward common representations that omit niche, domain-specific needs.
Developers’ current technical responses: expand context windows, retrieval-augmented generation (RAG), collect user metadata and histories, and append documents to prompts. These are treated as a route to better domain fit and a competitive “moat.”
Limits of the data-centric approach:
- Multiple distinct contexts collapse together (context collapse) or degrade over time (context rot), reducing relevance.
- Accumulating more data raises privacy, IP, and legal exposures and environmental/computational costs.
- Over-reliance on synthetic outputs risks “model collapse” where subsequent training degrades fidelity.
Users’ compensatory behaviors:
- Prompt engineering and careful seeding of context.
- Manual verification, multi-step workflows, and repeated clarification with the model.
- Tactical avoidance of certain tasks for which AI is unreliable.
Recommended conceptual shift: move from indiscriminate collection of context-relevant data to interactional practices that solicit and manage the precise, situational context needed for a task (e.g., ephemeral, project-scoped context, user-in-the-loop clarifying questions, workflows that integrate human expertise and checks).
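
The recommended shift can be illustrated with a minimal sketch. The paper does not prescribe an implementation, so everything here is an assumption made for illustration: the `ProjectContext` class, its field names, and the 30-day time-to-live are hypothetical. The sketch shows context that is scoped to one project, expires instead of accumulating, and triggers clarifying questions when something the task needs is missing.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ProjectContext:
    """Hypothetical container for context scoped to a single project."""
    project: str
    required_fields: tuple
    ttl: timedelta
    created: datetime
    facts: dict = field(default_factory=dict)

    def is_expired(self, now):
        # Expire the whole context after its time-to-live, so stale
        # details are re-confirmed rather than silently reused.
        return now - self.created > self.ttl

    def missing(self):
        # Required fields the user has not supplied yet.
        return [name for name in self.required_fields if name not in self.facts]

    def clarifying_questions(self):
        # Ask the user instead of guessing from accumulated history.
        return [f"What is the {name} for project '{self.project}'?"
                for name in self.missing()]

    def build_prompt(self, task, now):
        if self.is_expired(now):
            raise ValueError("context expired; ask the user to confirm it again")
        if self.missing():
            raise ValueError("context incomplete; ask clarifying questions first")
        lines = [f"{key}: {value}" for key, value in sorted(self.facts.items())]
        return "\n".join([f"Project: {self.project}", *lines, f"Task: {task}"])


# Illustrative usage with made-up project details.
start = datetime(2026, 1, 1)
ctx = ProjectContext("Q3 audit", ("client", "deliverable"), timedelta(days=30), start)
questions = ctx.clarifying_questions()
ctx.facts["client"] = "Acme"
ctx.facts["deliverable"] = "variance memo"
prompt = ctx.build_prompt("Draft an outline", start + timedelta(days=1))
```

The design choice worth noting is that the context refuses to produce a prompt when it is stale or incomplete, which pushes the gap back to the user as a question rather than filling it from unrelated history.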

Data & Methods

Study design: qualitative, semi-structured interviews plus demonstration/prompting exercises.
Sample: n = 15 professionals recruited from the U.S./Canada who regularly use GenAI in their work. Participants spanned domains such as medicine, law, software and semiconductor design, 3D and graphic design, animation, music production, accounting, and research.
Recruitment: screened respondents (n = 235) and selected participants to optimize domain diversity and frequency of AI use.
Data collection:
- Interviews with three sections: background/use patterns; open-ended exploration of interactions and frustrations; task demonstrations and prompting exercises (11/15 participants shared demos).
- Collected 29 real prompt descriptions and 22 descriptions of prompting approaches.
Analysis: grounded theory with iterative coding (open → axial → selective), producing a codebook and higher-order concepts from interview transcripts (ASR-corrected).
Limitations noted by authors: small, self-selected sample; geographic limitation (U.S./Canada); gender and demographic skew; qualitative design limits statistical generalizability—but yields rich, practice-oriented insights.

Implications for AI Economics

Realized productivity gains may be substantially lower than headline claims:
- Time/effort spent on verification, context-setting, and workflow adaptation offsets efficiency benefits; measured ROI will be lower and adoption slower.
Platform competition and switching costs:
- “Context moats” based on vast user-specific data may be less valuable if that data poorly captures actionable, project-specific context. Vendors that support interactional context capture and ephemeral/project-scoped context may win users despite smaller historical data stores.
Labor-market effects:
- Increased demand for verification, curation, and prompt-engineering work—complementary tasks that preserve expert gatekeeping rather than fully automating work.
- Need for upskilling: workers must learn to integrate GenAI safely and productively into workflows, changing task composition rather than eliminating roles immediately.
Cost structure and investment implications:
- Building systems that indiscriminately collect and store context is costly (privacy compliance, storage, compute, legal risk). Investment may shift toward UX/interaction design, integration with collaboration tools, and human-in-the-loop systems that deliver more usable outputs per unit of data.
Product-market opportunities:
- Tools that implement interactional context engineering (clarifying dialog, ephemeral project context, structured capture of tacit workflow rules) have potential market value.
- Domain-specific tooling that combines minimal, targeted data collection with high-quality interactional capture may yield higher effective productivity than large monolithic models alone.
Measurement & policy implications:
- Productivity studies should account for secondary labor (editing/verification) and context-management effort to avoid overstating AI gains.
- Policymakers and firms should weigh privacy/regulatory costs of aggressive context collection against modest improvements in output relevance.
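
The measurement point above can be made concrete with a small piece of arithmetic. This is a sketch under stated assumptions, not a model from the paper: the function `net_minutes_saved`, its parameters, and all the minute values are hypothetical and chosen only to show how secondary labor shrinks a headline figure.

```python
def net_minutes_saved(baseline, generation, context_setup, verification,
                      rework_rate=0.0):
    """Net minutes saved per task; a negative value means the AI path is slower.

    All inputs are hypothetical per-task averages in minutes, except
    rework_rate, the assumed share of tasks redone manually after
    failing verification.
    """
    ai_path = generation + context_setup + verification
    # Tasks that fail verification fall back to the full manual baseline.
    expected_ai = ai_path + rework_rate * baseline
    return baseline - expected_ai


# Headline figure: counts only generation time.
headline = net_minutes_saved(baseline=60, generation=5,
                             context_setup=0, verification=0)

# Adjusted figure: adds context-setting, verification, and occasional rework.
adjusted = net_minutes_saved(baseline=60, generation=5, context_setup=10,
                             verification=20, rework_rate=0.25)
```

With these illustrative numbers, the headline saving is 55 minutes per task, while the adjusted saving is 10 minutes once context-setting, verification, and rework are counted.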
Long-term model risks affecting economic value:
- Model collapse and degraded training fidelity from synthetic feedback loops can reduce long-term model reliability, increasing maintenance costs and potentially reducing trust and adoption across firms.

Overall, the paper implies a reorientation of commercial and research priorities: rather than investing primarily in larger data troves and broader context windows, firms and investors should prioritize interactional design, human-in-the-loop workflows, and targeted context capture to unlock real productivity improvements and sustainable economic value from GenAI in the workplace.

Assessment

Paper Type: descriptive
Evidence Strength: low. Findings are based on qualitative expert interviews which provide rich, interpretive evidence about failures and coping strategies but do not establish causal effects, representativeness, or magnitudes; results are hypothesis-generating rather than confirmatory.
Methods Rigor: medium. The study uses systematic expert interviews and comparative analysis across developer, user, and social-science perspectives, which is appropriate for unpacking conceptual issues; however, qualitative methods are sensitive to selection and interpretation bias, and the paper does not (in the summary) provide quantitative validation, sample size justification, or robustness checks that would raise rigor to high.
Sample: Qualitative data from semi-structured expert interviews with multiple stakeholder groups (tool developers, workplace users, and social scientists), analyzed to identify how context is conceptualized and managed in GenAI deployments (transcripts and thematic analysis of interview material).
Themes: human_ai_collab, org_design, adoption
Generalizability:
- Small, non-random expert sample limits representativeness of broader worker populations.
- Findings may reflect specific platforms, firms, or industries sampled and not generalize across sectors.
- Cultural and geographic biases likely if interviewees concentrated in particular regions.
- Rapidly evolving GenAI capabilities and deployment practices may outdate some observations.

Claims (6)

Claim	Category	Direction	Confidence	Outcome	Details
Current approaches to account for the contexts in which generative AI technologies are used fall short of users' expectations and needs.	Worker Satisfaction	negative	high	fit between system behavior and users' expectations/needs (contextual appropriateness)	0.18
Generative AI tools fail to account for users' context in workplace settings.	Output Quality	negative	high	degree to which tools incorporate relevant contextual factors	0.18
Users deploy concrete strategies to address failures of generative AI systems to account for context.	Task Allocation	positive	high	user practices and strategies for mitigating system-context misalignment	0.18
Tool developers, users, and social scientists conceptualize 'context' differently, and these divergent conceptualizations reveal specific pitfalls inherent in computational approaches to context.	AI Safety and Ethics	mixed	high	differences in conceptual definitions and the resulting pitfalls for computational design	0.18
Multiple distinct contexts tend to collapse into one another or 'rot', degrading over time and reducing the utility of efforts to account for context.	Organizational Efficiency	negative	high	durability and distinctness of contextual representations and their utility for system design	0.18
Rather than indiscriminate collection of context-relevant data, researchers and practitioners should adopt interactional practices to embed generative AI systems more appropriately into users' contexts of use.	Governance and Regulation	positive	high	recommended design and deployment practices for contextual integration	0.03