The Commonplace

Evidence (2608 claims)

Adoption: 7395 claims
Productivity: 6507 claims
Governance: 5877 claims
Human-AI Collaboration: 5157 claims
Innovation: 3492 claims
Org Design: 3470 claims
Labor Markets: 3224 claims
Skills & Training: 2608 claims
Inequality: 1835 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 609 159 77 736 1615
Governance & Regulation 664 329 160 99 1273
Organizational Efficiency 624 143 105 70 949
Technology Adoption Rate 502 176 98 78 861
Research Productivity 348 109 48 322 836
Output Quality 391 120 44 40 595
Firm Productivity 385 46 85 17 539
Decision Quality 275 143 62 34 521
AI Safety & Ethics 183 241 59 30 517
Market Structure 152 154 109 20 440
Task Allocation 158 50 56 26 295
Innovation Output 178 23 38 17 257
Skill Acquisition 137 52 50 13 252
Fiscal & Macroeconomic 120 64 38 23 252
Employment Level 93 46 96 12 249
Firm Revenue 130 43 26 3 202
Consumer Welfare 99 51 40 11 201
Inequality Measures 36 105 40 6 187
Task Completion Time 134 18 6 5 163
Worker Satisfaction 79 54 16 11 160
Error Rate 64 78 8 1 151
Regulatory Compliance 69 64 14 3 150
Training Effectiveness 81 15 13 18 129
Wages & Compensation 70 25 22 6 123
Team Performance 74 16 21 9 121
Automation Exposure 41 48 19 9 120
Job Displacement 11 71 16 1 99
Developer Productivity 71 14 9 3 98
Hiring & Recruitment 49 7 8 3 67
Social Protection 26 14 8 2 50
Creative Output 26 14 6 2 49
Skill Obsolescence 5 37 5 1 48
Labor Share of Income 12 13 12 37
Worker Turnover 11 12 3 26
Industry 1 1
Active filter: Skills & Training
Across heterogeneous learners, a common broadcast curriculum can be slower than personalized instruction by a factor linear in the number of learner types.
Theoretical comparative result in the model (analysis of broadcast vs personalized curricula across heterogeneous learner types; abstract states factor linear in number of types).
high negative A Mathematical Theory of Understanding speed of instruction / time to learn under broadcast curriculum vs personalized ...
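The linear-in-types slowdown can be illustrated with a toy sketch (our own illustration, not the paper's formal model): if a broadcast curriculum must cover the material for every learner type while personalized instruction delivers only the relevant unit, per-learner time grows linearly with the number of types.

```python
# Toy illustration (not the paper's model): under a common broadcast
# curriculum, each learner sits through k units of content, only one
# of which is tailored to their type.

def personalized_time(unit_time: float) -> float:
    """Time for a learner to finish under personalized instruction."""
    return unit_time

def broadcast_time(unit_time: float, num_types: int) -> float:
    """Time under a common broadcast curriculum covering all types."""
    return unit_time * num_types

for k in (2, 4, 8):
    slowdown = broadcast_time(1.0, k) / personalized_time(1.0)
    print(f"{k} learner types -> broadcast slower by factor {slowdown:.0f}")
```

The sketch only dramatizes the scaling claim; the paper's actual result is derived within its formal model of instruction.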
Significant limitations emerged in case law citations, with most cited cases being non-existent or incorrectly referenced.
Authors' review of the case citations produced by the four AI engines for the single transcript, finding many citations were fabricated or misreferenced.
high negative Robot Wingman: Using AI to Assess an Employment Termination accuracy of case law citations (error rate / hallucination rate)
Employees reported initial challenges in adapting to AI integration.
Participants in semi-structured interviews (n=12) reported initial difficulties adapting to AI tools; themes relating to early adaptation challenges were coded.
high negative AI-AUGMENTED WORKFORCE: THE IMPACT OF ARTIFICIAL INTELLIGENC... initial adaptation challenges to AI
There is a central design tension in human-AI systems: maximizing short-term hybrid capability does not necessarily preserve long-term human cognitive competence.
Conceptual/theoretical claim derived from the framework and discussion in the paper (argument and mathematical framing), no empirical sample or longitudinal data presented in the excerpt.
high negative Cognitive Amplification vs Cognitive Delegation in Human-AI ... long-term human cognitive competence
Rather than broad job losses, evidence points to a reallocation at the entry level: AI automates tasks typically assigned to junior staff, shifting the nature of entry-level roles.
Synthesis of firm- and task-level empirical studies reported in the brief documenting automation of routine/junior tasks and changes in job-task composition; specific sample sizes vary by cited study and are not provided in the brief.
high negative AI, Productivity, and Labor Markets: A Review of the Empiric... automation of entry-level/junior tasks and changes to entry-level job content
The gap between informal natural language requirements and precise program behavior (the 'intent gap') has always plagued software engineering, but AI-generated code amplifies it to an unprecedented scale.
Conceptual claim and argumentation in the paper; presented as an observed escalation in the scale of the existing 'intent gap' due to AI code generation. No quantitative evidence or sample size given in the excerpt.
high negative Intent Formalization: A Grand Challenge for Reliable Coding ... mismatch between intended and actual program behavior (intent gap) / resulting c...
Some declines (in self-efficacy and meaningfulness) from passive AI use persist after participants return to manual work.
Within-experiment assessment of outcomes after participants returned to manual (no-AI) tasks following the AI-use manipulation in the pre-registered experiment (N = 269); reported persistent reductions in self-efficacy and meaningfulness for the passive condition.
high negative Relying on AI at work reduces self-efficacy, ownership, and ... self-efficacy; perceived meaningfulness (measured post-return to manual work)
Passive use of AI reduces perceived meaningfulness of work.
Pre-registered experiment (N = 269) with self-reported measure of work meaningfulness; passive-copy condition showed lower meaningfulness ratings than No-AI and Active-collaboration conditions.
high negative Relying on AI at work reduces self-efficacy, ownership, and ... perceived meaningfulness of work
Passive use of AI reduces psychological ownership of the produced outputs.
Same pre-registered experiment (N = 269). Participants in the passive-copy AI condition reported lower psychological ownership of their outputs (self-report scales) relative to No-AI and Active-collaboration conditions.
high negative Relying on AI at work reduces self-efficacy, ownership, and ... psychological ownership of outputs
Passive use of AI (copying AI-generated output) reduces workers' self-efficacy.
Pre-registered between-subjects experiment (N = 269) using occupation-specific writing tasks. Participants assigned to a passive-copy AI condition reported lower self-efficacy (self-reported confidence to complete tasks without AI) compared to the No-AI (manual) and Active-collaboration conditions.
high negative Relying on AI at work reduces self-efficacy, ownership, and ... self-efficacy (confidence to complete tasks without AI)
Provider incentives may be misaligned (e.g., optimizing for engagement or test performance instead of durable learning), requiring contracts, regulation, or purchaser design to align incentives.
Consensus from interdisciplinary workshop (50 scholars) highlighting incentive risks and market-design considerations; descriptive, not empirical.
high negative The Future of Feedback: How Can AI Help Transform Feedback t... provider optimization metrics (engagement/test performance) vs. durable learning...
Extensive learner data needed to personalize AI feedback raises privacy and data-governance concerns (consent, storage, usage).
Qualitative consensus from workshop participants (50 scholars) noting data-collection requirements and governance risks; no empirical governance studies included.
high negative The Future of Feedback: How Can AI Help Transform Feedback t... volume/type of learner data collected; privacy risk indicators; compliance with ...
Automated feedback may not capture pedagogical nuances expert teachers use (motivation, socio-emotional cues, complex reasoning), limiting pedagogical fit.
Expert syntheses from the workshop of 50 scholars highlighting limits of automation relative to expert teacher judgment; no empirical comparisons presented.
high negative The Future of Feedback: How Can AI Help Transform Feedback t... coverage of socio-emotional and complex-reasoning cues in feedback; corresponden...
AI-generated feedback can be incorrect, misleading, or misaligned with learning objectives; assessing feedback quality is nontrivial.
Repeated concern raised across workshop participants (50 scholars) in qualitative synthesis; noted as a substantive risk and open challenge rather than empirically quantified here.
high negative The Future of Feedback: How Can AI Help Transform Feedback t... feedback factual correctness; alignment with stated learning objectives; rate of...
Adoption requires hardware (VR headsets, capable GPUs) and integration effort, implying upfront capital expenditure for labs/observatories.
Paper explicitly notes hardware requirements (VR headsets, capable GPUs) and integration effort as part of adoption considerations; common-sense assessment of required capital.
high negative iDaVIE v1.0: A virtual reality tool for interactive analysis... upfront capital expenditure and integration effort required for adoption
When identical replies are labeled as coming from AI rather than from a human, recipients report feeling less heard and less validated (an attribution effect).
Controlled attribution labeling experiment within the study: identical replies presented with different source labels (AI vs. human) and recipient-rated perceptions of being heard/validated measured.
high negative Practicing with Language Models Cultivates Human Empathic Co... recipient-rated feelings of being heard and validated
There are limited randomized controlled trials or longitudinal evaluations; few studies measure patient-relevant outcomes or economic impacts.
Literature synthesis noting scarcity of RCTs and long-term observational studies, and absence of widespread patient-outcome and cost-effectiveness evaluations in existing publications.
high negative Human-AI interaction and collaboration in radiology: from co... number of RCTs/longitudinal studies, frequency of patient outcome and economic o...
Many published studies focus on standalone algorithm accuracy rather than clinician–AI joint performance in routine workflows.
Review of the literature categorizing study designs (preponderance of algorithm development/validation studies, fewer reader-in-the-loop, simulation, or deployment studies).
high negative Human-AI interaction and collaboration in radiology: from co... proportion of studies reporting standalone algorithm metrics versus those report...
Ethical and legal issues—patient privacy, algorithmic bias, intellectual property, and equitable access—pose risks to AI deployment in drug development.
Ethics and legal analyses, policy reports, and documented case examples collated in the review that identify these recurring concerns.
high negative From Algorithm to Medicine: AI in the Discovery and Developm... ethical/legal risk incidence; privacy breaches; bias outcomes; access inequities
Regulatory uncertainty about validation standards and liability for AI tools raises investment risk and may slow deployment.
Regulatory and policy reports included in the narrative review describing evolving standards and open questions about validation, explainability, and liability for ML-based tools.
high negative From Algorithm to Medicine: AI in the Discovery and Developm... regulatory clarity; investment risk and deployment timelines
Adoption of AI in drug R&D requires high upfront investment in data curation, compute infrastructure, and specialized talent.
Industry reports and economic analyses summarized in the review reporting capital and operational needs for building AI capabilities; qualitative synthesis rather than quantitative costing across firms.
high negative From Algorithm to Medicine: AI in the Discovery and Developm... fixed upfront costs (data curation, compute, hiring/training)
Limited transparency and interpretability of many AI algorithms (black-box models) complicate clinical and regulatory trust and adoption.
Regulatory reports, methodological critiques, and case examples in the review highlighting interpretability concerns and their impact on clinical/regulatory acceptance.
high negative From Algorithm to Medicine: AI in the Discovery and Developm... clinical/regulatory acceptance, trust, and adoption rates; explainability metric...
Performance of AI models in drug R&D depends on large, high-quality, and representative biomedical datasets; dataset bias or gaps substantially undermine model performance and generalizability.
Methodological literature and case studies cited in the review documenting failures or limited generalization when training data are biased, sparse, or non-representative; thematic synthesis rather than pooled quantification.
high negative From Algorithm to Medicine: AI in the Discovery and Developm... model performance/generalizability across populations and contexts
Predictions from AI depend on data quality and coverage and still require experimental (wet-lab) validation.
Discussion of early failures and limits in case studies and expert observations within the narrative review; methodological argument about dependence of ML models on input data.
high negative Learning from the successes and failures of early artificial... predictive validity of computational models / need for experimental validation
High-quality, standardized, interoperable data (clean, annotated, connected across modalities) is a critical limiting factor for translating AI capability into sustained impact.
Conceptual emphasis and domain knowledge argument in the editorial; no empirical measurement of data quality's causal effect included.
high negative AI as the Catalyst for a New Paradigm in Biomedical Research ability to translate AI capability into sustained impact (dependent on data qual...
At the question level, incorrect chatbot suggestions substantially reduce caseworker accuracy, with a two-thirds reduction on easy questions where the control group performed best.
Question-level analysis from the randomized experiment comparing cases where chatbot suggestions were incorrect versus control; paper reports a ~66% reduction in accuracy on easy questions when chatbot suggestions were incorrect (exact denominators and statistics not provided in the excerpt).
high negative LLMs in social services: How does chatbot accuracy affect hu... caseworker accuracy on easy questions when presented with incorrect chatbot sugg...
Gaps in infrastructure readiness, digital awareness, and inclusive policy frameworks hinder equitable AI adoption among micro‑enterprises.
Cross‑study synthesis of barriers identified across the 55 included articles; infrastructural, awareness, and policy barriers are explicitly reported as recurring themes.
high negative Role of AI in Enhancing Work Efficiency and Opportunities fo... barriers to AI adoption (infrastructure readiness, digital awareness, policy inc...
Only 24.4% of at-risk workers have viable transition pathways, where 'viable' is defined as sharing at least 3 skills and achieving at least 50% skill transfer.
Analysis of job-to-job transitions on the validated knowledge graph using an operational definition of viable pathways (>=3 shared skills and >=50% skill transfer); proportion of at-risk workers meeting that criterion reported as 24.4% (underlying at-risk worker count not given in the excerpt).
high negative Graph-Based Analysis of AI-Driven Labor Market Transitions: ... percentage of at-risk workers with viable transition pathways (per defined thres...
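The operational definition of a viable pathway can be stated directly in code. This is a sketch under the excerpt's stated thresholds (≥3 shared skills, ≥50% skill transfer), assuming "skill transfer" means the share of source-job skills present in the target job; the function and variable names are ours, not the paper's.

```python
# Sketch of the stated viability criterion for a job-to-job transition:
# at least 3 shared skills AND at least 50% skill transfer.
# Names and the transfer-rate definition are illustrative assumptions.

def is_viable_transition(source_skills: set, target_skills: set,
                         min_shared: int = 3,
                         min_transfer: float = 0.5) -> bool:
    if not source_skills:
        return False
    shared = source_skills & target_skills
    transfer_rate = len(shared) / len(source_skills)  # share of source skills carrying over
    return len(shared) >= min_shared and transfer_rate >= min_transfer

# Example: 4 of 6 source skills carry over (4 shared, ~67% transfer) -> viable
src = {"sql", "excel", "reporting", "forecasting", "crm", "invoicing"}
dst = {"sql", "excel", "reporting", "forecasting", "python", "dashboards"}
print(is_viable_transition(src, dst))  # True under these thresholds
```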
20.9% of jobs in the dataset face high automation risk.
Risk classification applied to the jobs represented in the knowledge graph (sample size: 9,978 job postings); proportion of jobs labeled as 'high automation risk' is reported as 20.9%.
high negative Graph-Based Analysis of AI-Driven Labor Market Transitions: ... proportion of jobs classified as high automation risk
AI notably reduces customer stability in sports enterprises (SE).
Empirical estimation using the DML model on the same panel dataset of 45 Chinese listed SEs (2012–2023); authors report a statistically significant negative effect of AI on customer stability.
high negative Can Artificial Intelligence Enhance the Stability of Supply ... customer stability (component of supply chain stability)
The sample is limited to Chinese A-share-listed design enterprises (2014–2023), which may limit generalizability to small and medium-sized enterprises (SMEs) or firms in other countries/regions.
Study sample description: A-share-listed design-oriented enterprises in China between 2014 and 2023; authors explicitly note this as a limitation.
high negative AI-driven design management: enhancing organizational produc... External validity / generalizability of results
Using TFP as a proxy for project efficiency aggregates effects at the firm level and therefore lacks micro-level insight into specific project workflows or design iteration processes.
Methodological limitation acknowledged in the paper: TFP is used as a firm-level proxy and the dataset does not include micro-level project workflow or iteration logs.
high negative AI-driven design management: enhancing organizational produc... Granularity of project-efficiency measurement (limitation of TFP proxy)
Human judgment is constrained by bounded rationality, cognitive biases, and information-processing limitations.
Cited as established findings from prior research across decision sciences and related fields (extensive literature evidence referenced; no new empirical data in this paper's abstract).
high negative Reframing Organizational Decision-Making in the Age of Artif... human judgment accuracy/quality and cognitive processing capacity
Ireland exhibits the largest gender gap in advanced digital task use: approximately 44% of men versus 18% of women perform advanced digital tasks — a 26 percentage point gap, close to double the European average.
Country-level descriptive statistics from ESJS for Ireland reporting shares of men and women performing advanced digital tasks. (Exact Irish sample size not provided in the excerpt.)
high negative Squandered skills? Bridging the digital gender skills gap fo... Share (%) of men and women in Ireland performing advanced digital tasks; gender ...
Across Europe, women are around 15 percentage points less likely than men to perform advanced digital tasks in their jobs.
Empirical analysis of the European Skills and Jobs Survey (ESJS) (Cedefop, 2021) using regression-based estimates and descriptive statistics across European countries. (Exact sample size and country count not provided in the excerpt.)
high negative Squandered skills? Bridging the digital gender skills gap fo... Probability / share of workers performing advanced digital tasks (binary indicat...
AI substitutes many routine tasks, including both manual and cognitive/rule-based activities, disproportionately affecting middle-skill occupations.
Task-based substitution reasoning within SBTC framework and cross-sectoral task analysis. The paper provides conceptual synthesis rather than presenting new microdata or quantified task-level estimates.
high negative Artificial Intelligence, Automation, and Employment Dynamics... employment and wages in routine / middle-skill occupations; task displacement
Key implementation challenges include data quality and integration, model interpretability, cybersecurity and privacy, regulatory/compliance uncertainty, skills gaps among accounting professionals, and implementation costs.
Identified by the paper through literature review and practitioner reports; these are presented as recurring barriers rather than quantified with a specific sample.
high negative Role of Artificial Intelligence in the Accounting Sector incidence/severity of implementation barriers (data quality scores, integration ...
Nearby business closures increased perceived impediments to growth, amplifying pessimism via local exposure (social contagion effect).
Empirical comparison of perceived impediments to growth across variation in local exposure to nearby business closures (survey measures of local closures correlated with respondents' perceived impediments), using the cross-country survey sample.
high negative Peer Influence and Individual Motivations in Global Small Bu... perceived impediments to growth
Among the model's two regimes, an inequality-decreasing regime arises when AI behaves like a broadly available commodity technology or when labor-market institutions share rents widely (high ξ).
Model regime characterization and calibrated counterfactuals showing falling wage dispersion and ΔGini under commodity-like AI assumptions or higher rent-sharing elasticity.
high negative When AI Levels the Playing Field: Skill Homogenization, Asse... wage dispersion and aggregate inequality (ΔGini)
Generative AI compresses within-task skill differences (reduces dispersion of individual task performance).
Theoretical task-based model and calibrated quantitative simulations (Method of Simulated Moments matching six empirical moments) showing reductions in within-task performance dispersion after introducing AI technology.
high negative When AI Levels the Playing Field: Skill Homogenization, Asse... within-task performance dispersion (skill/ability variance within a task)
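The compression mechanism can be shown with a minimal numerical toy (our own sketch, not the paper's MSM-calibrated model): if AI pulls each worker's task performance part-way toward a common AI-assisted level, dispersion of performance within the task shrinks mechanically.

```python
# Toy mechanism (illustrative parameters, not the paper's calibration):
# with-AI performance = alpha * own skill + (1 - alpha) * common AI level,
# so the standard deviation of performance is scaled down by alpha.
import random
import statistics

random.seed(0)
baseline = [random.gauss(50, 10) for _ in range(10_000)]  # pre-AI skill spread

ai_level = 60.0   # common AI-assisted performance level (assumed)
alpha = 0.4       # weight workers retain on their own skill (assumed)
with_ai = [alpha * s + (1 - alpha) * ai_level for s in baseline]

print(statistics.stdev(baseline))  # ~10
print(statistics.stdev(with_ai))   # ~4: within-task dispersion compressed
```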
No evaluated program reported Kirkpatrick‑Barr level‑4 outcomes (organizational change, patient outcomes, or sustained metacognitive mastery).
Reviewers mapped reported outcomes from all 27 included programs and found none that demonstrated organizational-level impacts or patient‑level outcomes (level 4).
high negative Assessing the effectiveness of artificial intelligence educa... Kirkpatrick‑Barr level‑4 outcomes (organizational impact, patient outcomes, meta...
Because the design is cross-sectional and sampling purposive/geographically constrained, causal inference and generalizability are limited.
Authors' stated limitations in the summary: cross-sectional design and purposive, geographically constrained sample (Karnataka, India).
high negative AI-driven stress management and performance optimization: A ... generalizability / causal inference (methodological limitation)
Workplace stress is associated with lower employee retention.
PLS-SEM analysis on a cross-sectional survey of N = 350 pharmaceutical workers in Karnataka, India (purposive sampling). Reported direct path: Stress → Retention, β = 0.321, p < 0.001. (Note: the paper interprets this as stress reducing retention; sign/coding conventions of the variables are not detailed in the summary.)
high negative AI-driven stress management and performance optimization: A ... employee retention (retention intent/behavior)
Automated compliance and credentialing systems raise governance issues (auditability, appeals mechanisms) and risk incorrect automated deregistration if not properly governed.
Governance and algorithmic-risk discussion in the paper; logical argumentation rather than case-based evidence.
high negative Electrotechnical education, institutional complianc... rate of incorrect automated decisions, existence and effectiveness of appeal pro...
The paper models career progression as a continuous function and treats certification gaps as discontinuities that impede labour-market mobility.
Mathematical/conceptual modeling described in the methods (career-progression-as-continuous-function approach); this is a modeling choice reported in the paper rather than an empirical finding.
high negative Electrotechnical education, institutional complianc... labour-market mobility / continuity of career progression (in the conceptual mod...
There is limited long-term impact evidence and few system-level assessments of AI in developing-country agriculture.
Authors' methodological caveat based on the temporal scope and types of studies available in the >60-study review.
high negative A systematic review of the economic impact of artificial int... presence/absence of long-term impact evaluations and system-level assessments
The evidence base is skewed toward pilots and high‑performer contexts; there is a lack of long‑panel, multi‑project longitudinal studies to validate typical returns and scalability.
Authors' assessment of evidence types in the 160 studies: mix of conceptual papers, case studies, pilots, and only limited larger empirical evaluations.
high negative Digital Twins Across the Asset Lifecycle: Technical, Organis... representativeness and longitudinal robustness of evidence
Opacity, bias, and errors in AI systems demand auditing, standards, and governance (algorithmic accountability) to ensure trustworthy assessment.
Synthesis of literature on algorithmic bias and accountability plus policy analysis recommending audits and standards; supported by country cases that discuss governance concerns.
high negative The Future of Assessment: Rethinking Evaluation in an AI-Ass... algorithmic fairness, transparency, and reliability
Student data used by AI vendors raises risks around consent, reuse, commercial exploitation, and other data-privacy concerns.
Policy analysis and literature on data governance, privacy law debates; examples from national policy documents in the comparative cases. No original data on breaches or misuse presented.
high negative The Future of Assessment: Rethinking Evaluation in an AI-Ass... privacy risks and governance of student data
Limitations of the study include reliance on self-reported perceptions (subject to response and survivorship bias), lack of experimental/causal identification, potential non-representative sample, and cross-sectional design limiting inference about long-term productivity effects.
Authors' stated limitations in the paper summary.
high negative Artificial Intelligence as a Catalyst for Innovation in Soft... validity threats (self-report bias, lack of causal design) as reported by author...