Evidence (5157 claims)
- Adoption: 7395 claims
- Productivity: 6507 claims
- Governance: 5877 claims
- Human-AI Collaboration: 5157 claims
- Innovation: 3492 claims
- Org Design: 3470 claims
- Labor Markets: 3224 claims
- Skills & Training: 2608 claims
- Inequality: 1835 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 609 | 159 | 77 | 736 | 1615 |
| Governance & Regulation | 664 | 329 | 160 | 99 | 1273 |
| Organizational Efficiency | 624 | 143 | 105 | 70 | 949 |
| Technology Adoption Rate | 502 | 176 | 98 | 78 | 861 |
| Research Productivity | 348 | 109 | 48 | 322 | 836 |
| Output Quality | 391 | 120 | 44 | 40 | 595 |
| Firm Productivity | 385 | 46 | 85 | 17 | 539 |
| Decision Quality | 275 | 143 | 62 | 34 | 521 |
| AI Safety & Ethics | 183 | 241 | 59 | 30 | 517 |
| Market Structure | 152 | 154 | 109 | 20 | 440 |
| Task Allocation | 158 | 50 | 56 | 26 | 295 |
| Innovation Output | 178 | 23 | 38 | 17 | 257 |
| Skill Acquisition | 137 | 52 | 50 | 13 | 252 |
| Fiscal & Macroeconomic | 120 | 64 | 38 | 23 | 252 |
| Employment Level | 93 | 46 | 96 | 12 | 249 |
| Firm Revenue | 130 | 43 | 26 | 3 | 202 |
| Consumer Welfare | 99 | 51 | 40 | 11 | 201 |
| Inequality Measures | 36 | 105 | 40 | 6 | 187 |
| Task Completion Time | 134 | 18 | 6 | 5 | 163 |
| Worker Satisfaction | 79 | 54 | 16 | 11 | 160 |
| Error Rate | 64 | 78 | 8 | 1 | 151 |
| Regulatory Compliance | 69 | 64 | 14 | 3 | 150 |
| Training Effectiveness | 81 | 15 | 13 | 18 | 129 |
| Wages & Compensation | 70 | 25 | 22 | 6 | 123 |
| Team Performance | 74 | 16 | 21 | 9 | 121 |
| Automation Exposure | 41 | 48 | 19 | 9 | 120 |
| Job Displacement | 11 | 71 | 16 | 1 | 99 |
| Developer Productivity | 71 | 14 | 9 | 3 | 98 |
| Hiring & Recruitment | 49 | 7 | 8 | 3 | 67 |
| Social Protection | 26 | 14 | 8 | 2 | 50 |
| Creative Output | 26 | 14 | 6 | 2 | 49 |
| Skill Obsolescence | 5 | 37 | 5 | 1 | 48 |
| Labor Share of Income | 12 | 13 | 12 | — | 37 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
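One way to read the matrix at a glance is to compute each outcome's share of positive versus negative findings. A minimal sketch in Python, using a few rows copied from the table above; note that some listed Totals exceed the sum of the four shown directions, so shares here are taken against the Total column as given.

```python
# Direction shares from selected rows of the evidence matrix above.
# Note: listed Totals can exceed the sum of the four shown directions,
# so shares are computed against the Total column as given.
rows = {
    # outcome: (positive, negative, mixed, null, total)
    "Firm Productivity":    (385, 46, 85, 17, 539),
    "Inequality Measures":  (36, 105, 40, 6, 187),
    "Job Displacement":     (11, 71, 16, 1, 99),
    "Task Completion Time": (134, 18, 6, 5, 163),
}

for outcome, (pos, neg, mixed, null, total) in rows.items():
    print(f"{outcome:22s} positive {pos/total:5.0%}  negative {neg/total:5.0%}")
```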
Filtered to: Human-AI Collaboration
AI capabilities can be copied, invoked, embedded in workflows, and scaled across institutions at low marginal cost.
Descriptive claim about AI technology characteristics made in the paper; supported by conceptual argument and examples rather than quantified empirical data.
Earlier high-risk technologies were slowed by capital intensity, physical bottlenecks, organizational inertia, and specialized supply chains.
Historical/analytic claim presented as background context in the paper; supported by conceptual comparison rather than a specific empirical study.
Scientific institutions, distinctively, manufacture legitimate judgment, so they do not merely adapt to AI; they compete with it for the same functional role.
Conceptual/theoretical assertion in the paper describing institutional roles; no empirical data or sample size provided in the excerpt.
No single governance setting dominates across all contexts; moderate governance becomes increasingly competitive as the learner accumulates experience within the governed action space.
Empirical finding reported from experiments with the contextual-bandit learner operating under different governance constraints and learning over time; comparative performance over learning horizon described in the paper. Sample size / trial counts not provided in the excerpt.
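The setup described here, a bandit learner choosing within a governed action space, can be sketched in code. The following is a hypothetical illustration, not the paper's implementation: the action names, governance tiers, and reward values are all assumptions, and context features are omitted for brevity.

```python
import random

# Hypothetical sketch: an epsilon-greedy bandit whose action space is
# restricted by a governance setting. Tighter governance removes more
# actions from consideration. All names and values are illustrative.
ACTIONS = ["auto_assign", "supervised_assign", "escalate", "defer"]
GOVERNANCE = {
    "loose":    ACTIONS,                                   # everything allowed
    "moderate": ["supervised_assign", "escalate", "defer"],
    "tight":    ["escalate", "defer"],
}
BASE_REWARD = {"auto_assign": 0.6, "supervised_assign": 0.55,
               "escalate": 0.4, "defer": 0.2}              # assumed values

def choose(q, allowed, epsilon=0.1):
    """Epsilon-greedy choice restricted to the governed action subset."""
    if random.random() < epsilon:
        return random.choice(allowed)
    return max(allowed, key=lambda a: q[a])

q = {a: 0.0 for a in ACTIONS}
n = {a: 0 for a in ACTIONS}
for _ in range(1000):                       # learning horizon
    a = choose(q, GOVERNANCE["moderate"])
    r = random.gauss(BASE_REWARD[a], 0.1)   # stand-in for task outcome
    n[a] += 1
    q[a] += (r - q[a]) / n[a]               # incremental mean update
```

Re-running the same loop under the "loose" or "tight" tier is the shape of the comparison the finding describes: how performance shifts over the learning horizon as the governed action space changes.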
This workload-buffering effect (governance improving performance while reducing fatigue) contradicts the usual framing of governance as pure overhead.
Interpretation and comparison of empirical manufacturing results against prior framing in literature (qualitative claim within the paper). No sample size provided.
Governance is not a binary switch but a tunable design variable: tighter constraints predictably convert autonomous AI assignments into supervised collaborations, with domain-specific costs and benefits.
Empirical finding reported from experiments using the HAAS benchmark across the two domains (software engineering and manufacturing); qualitative and/or quantitative comparisons of allocations under varying governance constraints. Paper does not state sample size in the provided text.
AI learns indiscriminately from implicit knowledge, acquiring both beneficial patterns and harmful biases.
Asserted in the paper as a conceptual point about training data and learned patterns; no empirical evaluation or quantified bias measures provided.
Whether the futures these configurations help create remain governable and worth inhabiting will depend on leaders who can see, early enough, where and how consequential decisions are actually being shaped.
Normative/prognostic claim linking future governability to leaders' detection capabilities (conceptual; no empirical test provided in the excerpt).
These configurations will shape how power, responsibility, and trust are distributed in organizational life.
Theoretical/prognostic claim in the paper linking configurations to distribution of power, responsibility, and trust (no empirical quantification in the excerpt).
Fluent users' failures occur alongside greater success on complex tasks.
Combined analysis of task complexity, success outcomes, and failure incidence in the 27K transcripts showing that fluent users both attempt and have greater success on complex tasks even while experiencing more failures.
Fluent users adopt a fundamentally different interactional mode: they iterate collaboratively with the AI, refining goals and critically assessing outputs, whereas novices take a passive stance.
Qualitative and quantitative analysis of the same 27,000 annotated WildChat transcripts, with annotations describing interactional mode and user behavior (iteration, goal refinement, critical assessment vs. passivity).
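The stratified comparison behind these two findings can be illustrated with a small sketch; the schema and rows below are synthetic stand-ins, not the WildChat annotations.

```python
# Synthetic rows (not the WildChat data): stratify annotated transcripts
# by user fluency and task complexity, then compare outcomes per group.
rows = [
    ("fluent", "complex", "success"), ("fluent", "complex", "success"),
    ("fluent", "complex", "success"), ("fluent", "complex", "success"),
    ("fluent", "complex", "failure"), ("fluent", "complex", "failure"),
    ("fluent", "complex", "failure"), ("fluent", "simple", "success"),
    ("novice", "simple", "success"), ("novice", "simple", "success"),
    ("novice", "simple", "failure"), ("novice", "complex", "failure"),
]

for group in ("fluent", "novice"):
    sub = [r for r in rows if r[0] == group]
    cx = [r for r in sub if r[1] == "complex"]
    cx_wins = sum(r[2] == "success" for r in cx)
    failures = sum(r[2] == "failure" for r in sub)
    print(f"{group}: complex share={len(cx)/len(sub):.0%}, "
          f"complex success={cx_wins/max(len(cx), 1):.0%}, "
          f"total failures={failures}")
```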
Augmentation is bounded rather than linear (i.e., human-AI augmentation shows diminishing or negative returns past a balanced zone).
Synthesis of interview themes across 34 cases producing the bounded-augmentation / curvilinear conceptualization.
Mediators such as trust, cohesion and accountability are reshaped when AI-generated contributions enter collaboration.
Thematic evidence from interviews indicating changes in trust, cohesion and accountability dynamics associated with the introduction of AI outputs into team collaboration.
Social (leadership engagement, trust, ownership, mediation and alignment) and technical (automation, creation, reliability, distraction and integration) subsystems combine to enable or erode team effectiveness, summarized in an e-leadership–AI orientation matrix.
Analytic synthesis from thematic coding (Gioia-informed) of interview data producing a conceptual matrix mapping social and technical factors to outcomes.
Analysis identifies a curvilinear pattern of bounded augmentation, where effectiveness peaks in a zone of balanced use but declines under under-use and over-reliance.
Thematic (Gioia-informed) analysis of 34 semi-structured interviews with project managers across five UK industries; pattern emerges from cross-case coding and synthesis.
Generative AI-powered tools like ChatGPT are reshaping labor-market skill demands while also offering new forms of on-demand learning support to meet those demands.
Framed in paper as background/motivation; asserted from prior literature and the paper's motivating claims rather than reported as a quantified result in this study.
Susceptibility to visual priming varies across state-of-the-art VLMs.
Comparative experiments run across multiple state-of-the-art vision-language models showing differential changes in IPD behavior when exposed to the same visual primes and color cues. (Paper notes variation in susceptibility and mitigation effectiveness across models; specific model list and per-model sample sizes not given in the abstract.)
Color-coded reward matrices alter VLM decision patterns.
Experimental condition varying the visual presentation of the IPD payoff matrix (color-coding of rewards) and measuring resulting decision patterns of multiple VLMs in IPD trials. (Reported as part of the experimental setup across models; exact counts not provided in abstract.)
VLM behavior can be influenced by image content depicting behavioral concepts (kindness/helpfulness vs. aggressiveness/selfishness).
Experimental manipulation in the Iterated Prisoner's Dilemma (IPD): VLMs were exposed to images labeled/connoting 'kindness/helpfulness' versus 'aggressiveness/selfishness' and subsequent choices in IPD rounds were recorded across multiple state-of-the-art VLMs. (Paper reports experiments across multiple VLMs; exact sample sizes per model/condition not stated in the abstract.)
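A harness for this kind of priming experiment might look like the sketch below, which measures cooperation rate per priming condition. `query_vlm`, the image filenames, and the always-cooperate opponent are placeholders, not details from the paper.

```python
import random

# Hypothetical harness (not the paper's code): run Iterated Prisoner's
# Dilemma rounds while a priming image is shown to a VLM before each move.
PAYOFFS = {  # (my move, opponent move) -> (my points, opponent points)
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def query_vlm(prime_image: str, history: list) -> str:
    """Stub so the sketch runs: replace with a real vision-language model
    call that receives the priming image plus game history and returns a
    parsed move ('C' to cooperate, 'D' to defect)."""
    return random.choice("CD")

def run_condition(prime_image: str, rounds: int = 10) -> float:
    """Play IPD against an always-cooperate opponent and return the
    model's cooperation rate under this priming condition."""
    history, cooperations = [], 0
    for _ in range(rounds):
        move = query_vlm(prime_image, history)
        history.append((move, "C"))
        cooperations += (move == "C")
    return cooperations / rounds

# Compare cooperation rates across conditions, e.g.:
# run_condition("kindness_prime.png") vs. run_condition("aggression_prime.png")
```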
AI adoption leads both to job displacement and job creation, including the emergence of new occupational categories.
Abstract states the review examines empirical evidence on both job displacement and creation and the emergence of new occupations; no numeric counts or sample sizes provided in abstract.
The study identifies short-term transitional risks and long-term productivity gains associated with AI integration in the workforce.
Abstract states the paper evaluates both short-term risks and long-term productivity gains from AI integration based on the reviewed literature; no empirical quantification given in abstract.
AI-driven automation and augmentation are reshaping employment landscapes, with emphasis on sector-level disruption, skill transformation, and socioeconomic consequences.
Abstract states this as a conclusion of the review drawing on interdisciplinary empirical literature; no specific studies or sample sizes cited in abstract.
The accelerating deployment of artificial intelligence across industries has fundamentally altered the structure of global labour markets.
Statement in abstract summarizing a systematic review of interdisciplinary literature (economics, computer science, organizational behaviour, public policy); no specific sample size reported in abstract.
Failures are structured by task family and execution surface, with HR, management, and multi-system business workflows as persistent bottlenecks and local workspace repair comparatively easier but unsaturated.
Error-mode analysis across the 105 tasks and evaluated models reported in experiments; authors identify task-family-level patterns (HR, management, multi-system workflows) and relative ease of local workspace repair.
Whether LLM-based assistants improve or degrade code quality remains unresolved: existing studies report contradictory outcomes contingent on context and evaluation criteria.
Review finds mixed/contradictory findings across included studies regarding code quality effects.
Differences between models are large enough to shape outcomes in practice, so reliability should be incorporated alongside average performance when assessing and deploying LLMs in high-stakes decision contexts.
Authors' interpretation of empirical differences in funding decisions, scores, confidence, and reliability across models in the controlled experiment; presented as an implication/recommendation.
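The recommendation is easy to operationalize: report dispersion across repeated runs alongside the mean. A minimal sketch with synthetic scores, chosen so that two models with the same mean differ sharply in run-to-run reliability:

```python
import statistics

# Synthetic repeated-run scores per model (illustration only).
runs = {
    "model_a": [0.78, 0.80, 0.79, 0.81, 0.80],   # stable
    "model_b": [0.95, 0.60, 0.92, 0.58, 0.93],   # volatile, identical mean
}

for name, scores in runs.items():
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    print(f"{name}: mean={mean:.2f}, sd={sd:.2f}")
# Ranking by mean alone hides that model_b's decisions swing widely across
# identical inputs, which matters in high-stakes deployment.
```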
Demographic characteristics intersect with AI exposure; that is, exposure varies across demographic groups.
Paper reports that it examines how demographic characteristics intersect with exposure based on recent empirical studies; no demographic breakdowns or sample sizes provided in the abstract.
Recent studies combine task-level exposure metrics with employment and usage data to assess AI exposure and impacts.
Paper notes that it draws on studies that use task-level exposure metrics alongside employment and usage data; methodological claim rather than a quantitative result.
Generative large language models (LLMs) present organizations with a transformative technology whose labor market implications remain nascent yet consequential.
Statement in paper synthesizing emerging empirical research; no specific study, method, or sample size reported in the abstract.
Objectives, constraints, and prompt guidance affect reliability and generalization.
Authors' analysis and discussion based on experiments and ablations described in the paper (qualitative/empirical observations about sensitivity to objectives, constraints, and prompts).
The architect's role is shifting, but the human remains central.
Authors' discussion and interpretive analysis about the role of humans in agentic AI-driven design processes.
Across evolved designs, components often correspond to known techniques; the novelty lies in how they are coordinated.
Authors' qualitative analysis of evolved architectures and components reported in the paper (design inspection and interpretation of evolved solutions).
We identify significant differences between human and AI negotiation behaviors, finding that humans favor lower-complexity deals and are significantly less reliable partners than LM-based agents.
Results from the user study comparing human vs LM-based agent negotiation behavior (statements in the results section).
The distribution of complementary (non-AI) skills across the workforce shapes whether AI improvements generate productivity bottlenecks or concentration-driven inequality.
Derived from the task-based model analysis described in the article; framed as a theoretical mechanism with reference to empirical patterns but without specific empirical study details in the excerpt.
The paper extends paradox theory to conceptualise the Creativity Paradox in the context of GenAI.
Theoretical extension and conceptual development within the paper (no empirical tests reported).
Delegating tasks to genAI can be individually beneficial in the short term even as widespread adoption degrades future model performance (creating a social dilemma).
Result of the paper's behavioral model showing an individual-level incentive to use genAI versus a collective cost from adoption (theoretical/model-based; no empirical sample reported in abstract).
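A toy payoff structure makes the dilemma concrete; the functional form and parameter values below are illustrative assumptions, not the paper's model.

```python
# Toy illustration (not the paper's model): each agent gains a private
# benefit from delegating to genAI, but aggregate delegation degrades
# future model quality for everyone. All values are assumptions.

def model_quality(adoption_rate: float, decay: float = 0.5) -> float:
    """Future model quality falls as adoption rises."""
    return 1.0 - decay * adoption_rate

def individual_payoff(delegates: bool, adoption_rate: float) -> float:
    private_gain = 0.3 if delegates else 0.0      # short-term benefit
    return private_gain + model_quality(adoption_rate)

for rate in (0.0, 0.5, 1.0):
    print(f"adoption={rate:.0%}: "
          f"delegator={individual_payoff(True, rate):.2f}, "
          f"abstainer={individual_payoff(False, rate):.2f}")
# Delegating dominates at any fixed adoption rate, yet universal adoption
# (0.80) leaves everyone worse off than universal abstention (1.00):
# the standard structure of a social dilemma.
```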
ASC (adaptive stopping criterion) halts harmful refinement but incurs a 3.8 pp confidence-elicitation cost.
Reported experiment with ASC showing that it prevents harmful iterative refinement yet causes a measured cost described as 3.8 percentage points due to confidence elicitation.
Only o3-mini (+3.4 pp, EIR = 0%), Claude Opus 4.6 (+0.6 pp, EIR ≈ 0.2%), and o4-mini (±0 pp) remain non-degrading under self-correction; GPT-5 degrades by -1.8 pp.
Reported measured changes in accuracy (percentage-point changes) and measured EIR values for the named models after applying iterative self-correction across the experiment suite.
Across 7 models and 3 datasets (GSM8K, MATH, StrategyQA), we find a sharp near-zero EIR threshold (≤ 0.5%) separating beneficial from harmful self-correction.
Empirical experiments reported across 7 LLMs and 3 benchmark datasets (GSM8K, MATH, StrategyQA) comparing outcomes of iterative self-correction as a function of measured EIR.
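Reading EIR as the rate at which a correction pass flips previously correct answers to incorrect (an assumption about the acronym), the threshold test can be sketched as follows; this is not the paper's ASC implementation.

```python
# Sketch under assumptions: EIR is read here as the fraction of items a
# self-correction pass flips from correct to incorrect. The 0.5% cutoff
# mirrors the reported near-zero threshold.

def eir(before: list, after: list) -> float:
    """Fraction of all items correct before self-correction but
    incorrect after it."""
    flipped = sum(1 for b, a in zip(before, after) if b and not a)
    return flipped / len(before)

def keep_refining(before: list, after: list, threshold: float = 0.005) -> bool:
    """Adaptive stop: continue only while measured EIR stays at or
    below the near-zero threshold."""
    return eir(before, after) <= threshold

before = [True] * 990 + [False] * 10
after  = [True] * 984 + [False] * 16   # 6 correct answers flipped
print(f"EIR = {eir(before, after):.2%}, continue = {keep_refining(before, after)}")
# Output: EIR = 0.60%, continue = False (above threshold, so refinement halts)
```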
AI influences innovation performance in organizations.
Discussion and synthesis of studies and reports on AI adoption and innovation performance presented in the review.
AI adoption is producing organizational implications, including changes in project management practices.
Findings synthesized from conference papers, case studies and industry reports included in the review.
Automation, generative AI, and intelligent systems are reshaping task structures, leading to both job displacement risks and the creation of new AI-driven roles.
Synthesis of empirical studies, conference findings, and industry reports reporting both displacement risks and new role emergence (review paper).
AI is rapidly transforming the nature of work, the demand for skills, and the professional roles of Information Technology (IT) practitioners.
Stated as a synthesis result from a narrative review of recent empirical studies, conference findings, and industry reports (review paper).
AIGC (AI-generated content) is reshaping the rights and obligations of platforms and workers.
Argument in the paper describing legal and practical impacts of AIGC on platform-worker relationships; based on doctrinal/legal analysis and discussion of platform practices rather than reported quantitative empirical data.
The study explores implications of algorithmic enterprises for competitive advantage, labour markets, and regulatory policy.
Declared scope of the paper in the abstract; exploration is conceptual and analytical rather than reporting empirical findings or quantified effects.
Survey evidence suggests public attitudes towards AI combine optimism with apprehension, and most respondents oppose granting AI systems final authority over hiring and dismissal decisions.
Review cites multiple public opinion and survey studies reporting mixed (optimistic and apprehensive) attitudes and opposition to AI final authority in employment decisions (survey evidence summarized).
There are important regional differences—especially in developing contexts—that necessitate context-specific approaches to improving women’s participation in AI-enabled work.
Observation reported in the review drawing on geographically diverse studies and policy analyses; the abstract does not quantify differences or report sample sizes for cross-region comparisons.
Social, cultural, and ethical considerations influence women’s engagement in AI-centric workplaces.
Claim made in the review, based on interdisciplinary literature that includes sociocultural analyses and ethical discussions; the abstract does not provide empirical effect estimates or sample sizes.
AI applications—ranging from recruitment algorithms to workplace automation—can either reinforce gender disparities or promote equitable employment outcomes.
Stated in the review based on collated findings from multiple studies and analyses that document both harms (e.g., biased recruitment algorithms) and potential benefits (e.g., tools designed to reduce bias); no single empirical study or pooled effect size provided in the abstract.
Artificial Intelligence (AI) is rapidly transforming workplaces across the globe, offering both novel opportunities and unique challenges for women in technology-driven industries.
Stated in the paper's introduction/abstract as a summary conclusion based on a narrative literature review of peer-reviewed studies, policy analyses, and preprint research; no specific sample size or primary empirical method reported in the abstract.