The Commonplace

Evidence (6507 claims)

Adoption
7395 claims
Productivity
6507 claims
Governance
5877 claims
Human-AI Collaboration
5157 claims
Innovation
3492 claims
Org Design
3470 claims
Labor Markets
3224 claims
Skills & Training
2608 claims
Inequality
1835 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 609 159 77 736 1615
Governance & Regulation 664 329 160 99 1273
Organizational Efficiency 624 143 105 70 949
Technology Adoption Rate 502 176 98 78 861
Research Productivity 348 109 48 322 836
Output Quality 391 120 44 40 595
Firm Productivity 385 46 85 17 539
Decision Quality 275 143 62 34 521
AI Safety & Ethics 183 241 59 30 517
Market Structure 152 154 109 20 440
Task Allocation 158 50 56 26 295
Innovation Output 178 23 38 17 257
Skill Acquisition 137 52 50 13 252
Fiscal & Macroeconomic 120 64 38 23 252
Employment Level 93 46 96 12 249
Firm Revenue 130 43 26 3 202
Consumer Welfare 99 51 40 11 201
Inequality Measures 36 105 40 6 187
Task Completion Time 134 18 6 5 163
Worker Satisfaction 79 54 16 11 160
Error Rate 64 78 8 1 151
Regulatory Compliance 69 64 14 3 150
Training Effectiveness 81 15 13 18 129
Wages & Compensation 70 25 22 6 123
Team Performance 74 16 21 9 121
Automation Exposure 41 48 19 9 120
Job Displacement 11 71 16 1 99
Developer Productivity 71 14 9 3 98
Hiring & Recruitment 49 7 8 3 67
Social Protection 26 14 8 2 50
Creative Output 26 14 6 2 49
Skill Obsolescence 5 37 5 1 48
Labor Share of Income 12 13 12 37
Worker Turnover 11 12 3 26
Industry 1 1
Active filter: Productivity
Professional and Technical Services, Information, and Finance and Insurance account for approximately 86 percent of the base-case direct contribution.
Sectoral decomposition of base-case direct contribution in the model; paper explicitly reports the three sectors' combined share as ~86%.
high mixed AI Capex Is Justified: A Bottom-Up Sectoral Estimate of Arti... share of base-case direct GDP contribution by sector (three-sector concentration...
Subjectivity persisted in AI-powered recruitment decisions; human judgment remained an important factor.
Theme 2 (subjectivity in AI-powered recruitment) from interviews indicating retained human subjectivity and judgement in recruitment processes (n = 22).
high mixed The augmented recruiter: examining AI integration and decisi... degree_of_subjectivity_in_decision_making
Sensitivity analyses indicate the observed positive belief changes likely reflect recovery from carry-over effects rather than genuine training-induced shifts.
Authors' sensitivity analyses discussed in the paper that examined alternative explanations (e.g., carry-over effects) and concluded the belief-change result is likely due to recovery from such effects.
high mixed Scaffolding Human-AI Collaboration: A Field Experiment on Be... validity of belief-change effect (source attribution: training vs. carry-over re...
Simulations demonstrate that standard methods, such as principal components analysis and inverse covariance weighting, can generate spurious cross-study differences, whereas our approach recovers comparable latent treatment effects.
Simulation experiments reported in the paper comparing the proposed method to PCA and inverse covariance weighting; results show PCA and inverse-covariance-weighted estimators can produce spurious cross-study differences while the proposed method recovers comparable latent treatment effects (no simulation sample sizes provided in the abstract).
high mixed Nonparametric Identification and Estimation of Causal Effect... comparability/accuracy of estimated latent treatment effects across studies (sim...
Big data analytics (BDA) adoption is a risky strategy with potentially high rewards for start-ups.
Stated as a summary conclusion based on empirical analysis of a large sample of start-ups in Germany comparing adopters and non-adopters across multiple performance measures (survival, costs, sales, employee growth, access to financing).
high mixed Big data-based management decisions and start-up performance overall performance/risk–reward tradeoff
Bounded agents act as an amplifying but not necessary extension to the foundation-model stack for changing work coordination.
Conceptual argument within the paper distinguishing bounded agents from the core stack; no empirical comparison or measurement reported.
high mixed Remote-Capable Knowledge Work Should Default to AI-Enabled F... role of bounded agents in amplifying coordination impacts
The effects of generative AI on work and organisations are heterogeneous and context-dependent, shaped by job roles, skill levels, and institutional environments.
Synthesis across the included studies noting variation in outcomes conditional on role, skill, and institutional context.
high mixed Generative AI in the Workplace: A Systematic Review of Produ... heterogeneity of AI effects across roles/skills/institutions
The positive effect of big data applications on firms' markups exhibits heterogeneity across organizational, technological, and environmental dimensions.
Paper reports heterogeneity analysis showing variation in the magnitude of the positive markup effect across organizational, technological and environmental factors; based on model implications and empirical subgroup/interaction tests using micro-level firm data (sample size not reported).
high mixed Big data application and firm markups: evidence from China heterogeneity of the big-data → markup effect across organizational, technologic...
If employment losses are relatively small and productivity gains are realised, AI adoption could boost Exchequer revenues. But if job displacement is sizeable, tax receipts fall while welfare spending rises, resulting in potentially large pressures on the public finances.
Conditional fiscal scenarios simulated in the report combining employment, wage and benefit changes with the public finance implications (tax receipts and welfare spending); reported as scenario-based outcomes.
high mixed Artificial Intelligence and income inequality in Ireland Exchequer revenues / tax receipts and welfare spending
Ireland’s tax and welfare system absorbs most of the income loss for lower income households, and roughly half of the loss for households at the top of the income distribution.
Microsimulation using SWITCH to model taxes and transfers applied to simulated income changes across income groups; reported as a finding in the report.
high mixed Artificial Intelligence and income inequality in Ireland net income after taxes and transfers (absorption of income loss)
Qualitative results underscored both perceived benefits in comprehension and challenges when interpretations of gaze behaviors were inaccurate.
Qualitative analysis of participant feedback from the study (n=36) reporting themes of improved comprehension and occasional problems when the assistant misinterpreted gaze.
high mixed From Gaze to Guidance: Interpreting and Adapting to Users' C... participant-reported benefits and challenges (qualitative themes)
The productivity decomposition classifies deployments into five regimes that separate beneficial adoption from harmful adoption and identifies which deployments are vulnerable to the augmentation trap.
Model-based taxonomy produced from the analytical decomposition (classification into five regimes described in the paper).
high mixed The Augmentation Trap: AI Productivity and the Cost of Cogni... classification of AI deployment regimes (beneficial vs harmful, vulnerability to...
Small differences in managerial incentives can determine which skill path a worker takes (whether they realize full potential or deskill).
Comparative statics / theoretical sensitivity analysis in the dynamic model indicating tipping behavior based on managerial incentives.
high mixed The Augmentation Trap: AI Productivity and the Cost of Cogni... worker skill trajectory contingent on managerial incentives
Result 3: When AI productivity depends less on worker expertise, workers can permanently diverge in skill: experienced workers realize their full potential while less experienced workers deskill to zero.
Analytical result from the dynamic model showing path-dependent divergence in skill levels under particular parameterizations (lower dependence of AI on worker expertise).
high mixed The Augmentation Trap: AI Productivity and the Cost of Cogni... long-run worker skill distribution (experienced vs less experienced)
The rise of agentic AI development, where LLM-based agents autonomously read, write, navigate, and debug codebases, introduces a new primary consumer with fundamentally different constraints.
Conceptual claim argued in the paper; refers to the emergence of agentic LLM-based tools as new consumers of software artifacts rather than an empirical measurement; no sample size reported.
high mixed Beyond Human-Readable: Rethinking Software Engineering Conve... who/what is the primary consumer of software engineering artifacts (human develo...
Analysis uncovers dramatic asymmetries: inhibition 17.6% vs. preference 75.0%.
Paper reports specific aggregated percentages for two types of implicit effects (inhibition and preference) observed in their analysis; methodology context implies these are results from the benchmark evaluation (300 items / 17 models).
high mixed ImplicitMemBench: Measuring Unconscious Behavioral Adaptatio... rates of inhibition vs. preference effects (implicit memory outcomes)
These results suggest the need for AI model development to prioritize scaffolding long-term competence alongside immediate task completion.
Authors' policy/research recommendation based on experimental findings showing short-term gains but longer-term harms.
high mixed AI Assistance Reduces Persistence and Hurts Independent Perf... recommendation for AI development priorities (design objective, not an empirical...
These effects are observed across a variety of tasks, including mathematical reasoning and reading comprehension.
Trials included multiple task types (explicitly naming mathematical reasoning and reading comprehension); cross-task analysis reported.
high mixed AI Assistance Reduces Persistence and Hurts Independent Perf... task-specific performance and persistence across task types (math reasoning, rea...
Providing issue-specific design guidance reduces design violations, but substantial non-compliance remains.
Intervention experiments in paper: agents were given issue-specific design guidance and resulting patch compliance measured; reported reduction in violations but remaining non-compliance.
high mixed Does Pass Rate Tell the Whole Story? Evaluating Design Const... design violations / design satisfaction
Policy implication: encouraging public sharing of AI-assisted solutions offsets the decline associated with private diversion (flow margin) but cannot repair participation-driven deterioration in conditional resolution; the latter requires directly maintaining contributor engagement.
Prescriptive conclusion from the theoretical model comparing interventions: public-sharing encouragement helps with flow-margin diversion but not with supply-side contributor thinning.
high mixed When AI Improves Answers but Slows Knowledge Creation: Match... archive creation (via posted volume) and conditional resolution (via contributor...
Diagnostic prediction: in a congested regime, observing a joint decline in posted volume and conditional resolution implies supply-side pool thinning is quantitatively present; by contrast, volume decline with stable or rising resolution indicates private diversion (flow margin) alone is the dominant force.
Analytical diagnostic derived from the model that links empirical patterns (volume and conditional resolution) to underlying mechanisms; no empirical validation given in the excerpt.
high mixed When AI Improves Answers but Slows Knowledge Creation: Match... posted volume and conditional resolution probability (joint pattern)
For the short-run optimization problem of AI deployment given fixed job responsibilities and worker skill levels, the firm’s optimal strategy for an m-step job can be computed in time O(m^2) using dynamic programming; the long-run joint optimization including task assignment to workers can also be solved in polynomial time up to an arbitrarily small error term.
Algorithmic results and complexity analysis derived in the theoretical sections and appendices of the paper (dynamic programming construction and polynomial-time solution statements).
high mixed Chaining Tasks, Redefining Work: A Theory of AI Automation computational complexity (time complexity) of computing optimal AI deployment an...
Appending a neighboring step to an existing AI chain adds no additional human verification burden (verification is a fixed cost at the chain level), which can make appending steps to a chain optimal even if manual execution is individually preferable for the appended step.
Theoretical model setup and formal argument showing verification is incurred only at the last augmented step of a chain; illustrative examples (data scientist workflow) and comparative-cost reasoning in the paper.
high mixed Chaining Tasks, Redefining Work: A Theory of AI Automation marginal verification cost when extending AI chains
AI chaining can overturn standard comparative advantage logic in assignment: when multiple adjacent steps are executed as an AI chain, a step may be assigned to AI (as part of the chain) even if manual human execution would be preferred for that step in isolation.
Theoretical model of production as an ordered sequence of steps with firms endogenously bundling contiguous steps into tasks and jobs; formal comparative-static arguments and illustrative examples in the paper showing how fixed verification costs per chain change marginal assignment incentives.
high mixed Chaining Tasks, Redefining Work: A Theory of AI Automation assignment of individual steps to AI versus human execution
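The three "Chaining Tasks" claims above share a concrete algorithmic core: partition an ordered m-step job into manual steps and contiguous AI chains, where each chain incurs one fixed verification cost, and the optimum is computable in O(m^2) by dynamic programming. A minimal sketch of that recursion, with a hypothetical cost structure (per-step `manual` and `ai` costs and a per-chain verification cost `V` are assumptions for illustration, not the paper's notation):

```python
def optimal_deployment_cost(manual, ai, V):
    """Minimum cost of an m-step job where step i costs manual[i] if done
    by a human, or ai[i] if done by AI, and each contiguous AI chain
    incurs a single fixed verification cost V. O(m^2) dynamic program."""
    m = len(manual)
    INF = float("inf")
    dp = [0.0] + [INF] * m  # dp[i] = min cost of completing steps 1..i
    for i in range(1, m + 1):
        # option 1: step i is executed manually
        dp[i] = dp[i - 1] + manual[i - 1]
        # option 2: step i ends an AI chain that started at some step j <= i
        chain = 0.0
        for j in range(i, 0, -1):
            chain += ai[j - 1]
            dp[i] = min(dp[i], dp[j - 1] + chain + V)
    return dp[m]
```

The sketch also exhibits the appending effect claimed above: with `manual = [2, 1]`, `ai = [0.5, 0.8]`, `V = 1`, step 2 would not be given to AI in isolation (0.8 + 1 > 1), but appending it to step 1's existing chain adds only 0.8, so the two-step chain (total 2.3) beats the alternatives.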
Automation leads economic growth to accelerate, but the acceleration is remarkably slow because of the prominence of 'weak links' (an elasticity of substitution among tasks substantially less than one); even when most tasks are automated by rapidly-improving capital, output is constrained by the tasks performed by slowly-improving labor.
Theoretical mechanism from the task-based model (σ < 1 weak-links structure) combined with calibrated simulations that incorporate historical accounting results.
high mixed Past Automation and Future A.I.: How Weak Links Tame the Gro... rate and speed of acceleration of economic growth in response to automation
The general public supports both targeted programs and broader interventions (including job guarantees and UBI), contrasting with economists' preferences.
Survey comparisons across groups contrasting normative policy support (textual summary in Key Findings; exact public-group percentages not provided in excerpt).
high mixed Forecasting the Economic Effects of AI policy preferences of the general public vs. economists
Unconditional forecasts are relatively close to historical trends, but under the rapid scenario the range of plausible outcomes expands (greater uncertainty).
Comparison of unconditional (all-things-considered) survey forecasts to conditional rapid-scenario forecasts; dispersion metrics referenced qualitatively in Key Findings (detailed variance numbers not provided in excerpt).
high mixed Forecasting the Economic Effects of AI forecast dispersion/uncertainty across scenarios
Both rapid model improvement and benchmark quality issues contributed to underestimating agent capabilities.
Synthesis of results: improved LLM performance plus audit findings showing benchmark errors together explain the prior underestimation; based on the re-evaluation and audit described in the paper.
high mixed ELT-Bench-Verified: Benchmark Quality Issues Underestimate A... factors contributing to underestimation of agent capabilities (model improvement...
Developers actively manage the collaboration, externalizing plans into persistent artifacts, and negotiating AI autonomy through context injection and behavioral constraints.
Observed behaviors in chat transcripts and committed artifacts showing developers creating persistent plans, injecting context, and specifying constraints to shape AI behavior.
high mixed Programming by Chat: A Large-Scale Behavioral Analysis of 11... practices for managing AI collaboration (externalization of plans, context injec...
Developers redistribute cognitive work to AI, delegating diagnosis, comprehension, and validation rather than engaging with code and outputs directly.
Content and interaction analyses of chat sessions showing developer prompts delegating diagnosis, comprehension, and validation tasks to the AI assistants (Cursor and GitHub Copilot) across the dataset.
high mixed Programming by Chat: A Large-Scale Behavioral Analysis of 11... allocation of cognitive tasks (diagnosis, comprehension, validation) between dev...
Conversational programming operates as progressive specification, with developers iteratively refining outputs rather than specifying complete tasks upfront.
Qualitative/content analysis of the 74,998 messages across 11,579 sessions indicating patterns of iterative prompts and refinements rather than one-shot complete specifications.
high mixed Programming by Chat: A Large-Scale Behavioral Analysis of 11... mode of task specification (iterative refinement vs complete upfront specificati...
The influence of human capital (number of specialists in scientific and technological fields) on value added varies across sectors.
Number of specialists in scientific and technological fields included as a covariate in MMQR; reported heterogeneous effects across sectors/quantiles in the results section.
The influence of R&D expenditure on value added varies across sectors.
R&D expenditure included as a core explanatory variable in panel MMQR estimations; authors report differing coefficient sizes/signs across sectors/quantiles.
These AI capability improvements would affect the economy and labor market only as organizations adopt AI, a process that could unfold on a substantially longer timeline than the capability gains themselves.
Theoretical implication/interpretation by the authors (economic and labor market impact contingent on organizational adoption; timeline longer than capability improvements).
high mixed Crashing Waves vs. Rising Tides: Preliminary Findings on AI ... impact on economy and labor market (timing and magnitude of effects)
AI automation is a continuum between (i) crashing waves where AI capabilities surge abruptly over small sets of tasks, and (ii) rising tides where the increase in AI capabilities is more continuous and broad-based.
Conceptual framing proposed by the authors (theoretical proposition).
high mixed Crashing Waves vs. Rising Tides: Preliminary Findings on AI ... pattern of AI capability change across tasks (crashing waves vs rising tides)
This paper proposes three archetypal AI technology types: AI for effort reduction, AI to increase observability, and mechanism-level incentive change AI.
Conceptual taxonomy introduced by the authors (theoretical classification presented in the paper).
high mixed Incentives, Equilibria, and the Limits of Healthcare AI: A G... typology of AI technologies (categorical classification)
Big Data-based FinTech can contribute to financial stability only when its implementation is strategically justified, ethically grounded and supported by effective regulation, robust data governance and investment in human capital.
Normative conclusion drawn from systemic and structural analysis of literature and synthesis of empirical studies; no empirical test provided within the paper.
high mixed Implications of Big Data Technologies for the Resilience of ... contribution of Big Data-based FinTech to financial stability conditional on gov...
The effectiveness of Big Data solutions varies across the financial sphere and depends critically on data quality, regulatory alignment and organisational readiness.
Derived from comparative analysis of sector-specific applications and synthesis of findings in the reviewed literature; no quantified cross-sector sample reported.
high mixed Implications of Big Data Technologies for the Resilience of ... effectiveness of Big Data solutions
AI intensity and employment elasticity are linked by a U-shaped relationship.
Result reported by the paper based on the authors' empirical/econometric analysis of international datasets (OECD/ILO/World Bank).
high mixed Impact Of Artificial Intelligence (AI) On Employment employment elasticity (relationship to AI intensity)
The paper analyzes AI as a continuous process using data from the OECD, ILO, and the World Bank to study job displacement, creation, and reallocation.
Empirical analysis described in the paper using datasets from OECD, ILO, and World Bank; econometric approach implied.
high mixed Impact Of Artificial Intelligence (AI) On Employment job displacement, job creation, and job reallocation
AI is recognized as a primary change agent influencing economies worldwide, and thus it profoundly changes not only the number of jobs but also their quality.
Stated as a high-level conclusion in the paper's introduction/abstract; based on literature synthesis of studies from 2013-2025 and references to international sources (OECD, ILO, World Bank).
high mixed Impact Of Artificial Intelligence (AI) On Employment number of jobs and job quality (employment and quality of work)
AI plays a dual role by enhancing productivity while intensifying energy use in the short run.
Synthesis of empirical findings in the paper: documented short-run increase in electricity growth (energy use) following AI adoption alongside statements/evidence that AI enhances productivity (exact productivity measures and estimates not provided in the summary).
high mixed The Impact of AI Adoption on Electricity Output Growth Gap: ... productivity (improvement) and corporate electricity output growth gap (increase...
The four-variable account (produced output, underlying understanding, calibration accuracy, self-assessed ability) better explains phenomena like overconfidence, over- and under-reliance on AI, 'crutch' effects, and weak transfer than the simpler claim that generative AI merely amplifies the Dunning–Kruger effect.
Argumentative synthesis in the paper comparing explanatory power of the proposed four-variable framework against the more general Dunning–Kruger metaphor; draws on examples and empirical patterns from the reviewed literature rather than a single empirical test.
high mixed Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupli... explanatory fit for phenomena such as overconfidence, reliance patterns, crutch ...
A useful working model is 'AI-mediated metacognitive decoupling': LLM use widens the gap among produced output, underlying understanding, calibration accuracy, and self-assessed ability.
Conceptual synthesis and theoretical proposal grounded in reviewed empirical findings from multiple literatures (human–AI interaction, learning research, model evaluation); presented as the paper's working model rather than as a single empirical estimate.
high mixed Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupli... degree of alignment/decoupling between produced output, underlying understanding...
There is a fundamental trade-off between operational stability and theoretical deliberation across multi-agent coordination frameworks.
Empirical results from controlled benchmarks comparing agent architectures under fixed computational time budgets, as reported in the paper (no numeric sample size or statistical details provided in the abstract).
high mixed An Empirical Study of Multi-Agent Collaboration for Automate... operational stability versus depth/quality of theoretical deliberation
As technological progress devalues labor, the welfare benefits of steering are at first increased but, beyond a critical threshold, decline and optimal policy shifts toward greater redistribution.
Theoretical model extension analyzing planner's optimal choice as labor's economic value changes; the paper states a non-monotonic relationship with a critical threshold.
high mixed NBER WORKING PAPER SERIES welfare benefits of steering; optimal policy (steering vs redistribution)
Using pre-existing exposure as an instrument for ChatGPT adoption in a long-difference IV design, ChatGPT adoption causes households to spend more time on digital leisure activities while leaving total time spent on productive online activities unchanged.
IV long-difference empirical design: instrumenting household adoption with pre-ChatGPT exposure (2021 browsing); outcome measured as changes in categorized browsing durations (LLM-based classification into 'leisure' vs 'productive' sites); controls include demographic-by-region fixed effects and browsing composition controls.
high mixed https://arxiv.org/pdf/2603.03144 change in time spent on digital leisure activities and total time on productive ...
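The IV logic in this card can be illustrated with a minimal synthetic two-stage least squares sketch. All data-generating numbers below are hypothetical stand-ins, not the paper's data: pre-period exposure `z` instruments for adoption `x`, which is confounded with the outcome `y` through an unobserved factor `u`, and 2SLS recovers the true adoption effect where naive regression would not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
z = rng.binomial(1, 0.5, n).astype(float)   # pre-period exposure (instrument)
u = rng.normal(size=n)                      # unobserved confounder
# adoption depends on both the instrument and the confounder
x = (z + 0.3 * u + rng.normal(size=n) > 0.5).astype(float)
# outcome: true adoption effect of 1.0, plus confounding through u
y = 1.0 * x + 0.8 * u + rng.normal(size=n)

def two_stage_ls(y, x, z):
    """Plain two-stage least squares with an intercept."""
    Z = np.column_stack([np.ones(len(z)), z])
    X = np.column_stack([np.ones(len(x)), x])
    # first stage: fitted values of X from the instrument set Z
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # second stage: regress y on the fitted endogenous regressor
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

coef = two_stage_ls(y, x, z)
# coef[1] should be close to the true effect of 1.0 despite confounding
```

The actual study additionally uses a long-difference specification with demographic-by-region fixed effects and browsing-composition controls; this sketch shows only the instrumenting step.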
Once efficiency is made explicit, the main practical question becomes how many efficiency doublings are required to keep scaling productive despite diminishing returns.
Framing/forecasting claim in the paper presenting an operational research question (conceptual; no empirical sample in excerpt).
high mixed The Unreasonable Effectiveness of Scaling Laws in AI required number of efficiency doublings to sustain productive scaling
The practical burden of scaling depends on how efficiently real resources are converted into that (logical) compute.
Argument in the paper linking conceptual 'logical compute' to real-world conversion efficiency (qualitative claim; no empirical sample in excerpt).
high mixed The Unreasonable Effectiveness of Scaling Laws in AI efficiency of converting real resources into logical compute
The compute variable is best understood as logical compute, an implementation-agnostic notion of model-side work.
Conceptual argument presented in the paper reframing 'compute' as an abstract, implementation-agnostic quantity (no empirical sample provided).
high mixed The Unreasonable Effectiveness of Scaling Laws in AI definition/interpretation of the 'compute' variable