The Commonplace
Home Dashboard Papers Evidence Syntheses Digests 🎲

Evidence (2608 claims)

Adoption
7395 claims
Productivity
6507 claims
Governance
5877 claims
Human-AI Collaboration
5157 claims
Innovation
3492 claims
Org Design
3470 claims
Labor Markets
3224 claims
Skills & Training
2608 claims
Inequality
1835 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 609 159 77 736 1615
Governance & Regulation 664 329 160 99 1273
Organizational Efficiency 624 143 105 70 949
Technology Adoption Rate 502 176 98 78 861
Research Productivity 348 109 48 322 836
Output Quality 391 120 44 40 595
Firm Productivity 385 46 85 17 539
Decision Quality 275 143 62 34 521
AI Safety & Ethics 183 241 59 30 517
Market Structure 152 154 109 20 440
Task Allocation 158 50 56 26 295
Innovation Output 178 23 38 17 257
Skill Acquisition 137 52 50 13 252
Fiscal & Macroeconomic 120 64 38 23 252
Employment Level 93 46 96 12 249
Firm Revenue 130 43 26 3 202
Consumer Welfare 99 51 40 11 201
Inequality Measures 36 105 40 6 187
Task Completion Time 134 18 6 5 163
Worker Satisfaction 79 54 16 11 160
Error Rate 64 78 8 1 151
Regulatory Compliance 69 64 14 3 150
Training Effectiveness 81 15 13 18 129
Wages & Compensation 70 25 22 6 123
Team Performance 74 16 21 9 121
Automation Exposure 41 48 19 9 120
Job Displacement 11 71 16 1 99
Developer Productivity 71 14 9 3 98
Hiring & Recruitment 49 7 8 3 67
Social Protection 26 14 8 2 50
Creative Output 26 14 6 2 49
Skill Obsolescence 5 37 5 1 48
Labor Share of Income 12 13 12 37
Worker Turnover 11 12 3 26
Industry 1 1
Clear
Skills Training Remove filter
The study was conducted by the Mohammed bin Rashid School of Government’s Future of Government Center, in collaboration with global AI pioneers.
Authorship and collaboration statement in the report.
high null result Charting AI Governance Future in the Arab Region: A Policy R... institutional authorship and collaboration on the study
The report highlights the key findings of a field study covering ten Arab countries to explore the realities and challenges of AI governance.
Report statement describing the geographic scope of the field study (explicitly: ten Arab countries).
high null result Charting AI Governance Future in the Arab Region: A Policy R... geographic coverage of the field study (number of countries)
The recommendations are based on regional research that included hundreds of leaders active in the AI domains, from the public and private sectors.
Report statement claiming participant base of the underlying research (described as 'hundreds of leaders').
high null result Charting AI Governance Future in the Arab Region: A Policy R... scope and participant coverage of the underlying research
Zero-shot baselines and standard retrieval stagnate around 50-60% accuracy across model generations on the graduate-level final exam.
Pilot study reported on a full graduate-level final exam comparing zero-shot and standard retrieval baselines across model generations; reported accuracy range given as ~50-60%. Exact number of exam questions or models compared not stated.
high null result From 50% to Mastery in 3 Days: A Low-Resource SOP for Locali... exam accuracy (percentage correct)
The cooperative video game KeyWe, with a scripted agent, served as a valid testbed for studying human-agent teamwork and the effects of the training intervention.
Methodological choice: KeyWe was used as the experimental environment and the agent behavior was scripted for consistency; all behavioral and performance measures were collected within this setting.
high null result Teaming Up With an AI Agent: Training Humans to Develop Huma... experimental_testbed_description
Half of the participants received the teamwork training and half did not (between-subjects comparison).
Experimental design description: participants were split into trained and untrained groups (50/50).
high null result Teaming Up With an AI Agent: Training Humans to Develop Huma... experimental_assignment (trained vs. untrained)
The model yields two limits on the speed of learning and adoption: a structural limit determined by prerequisite reachability and an epistemic limit determined by uncertainty about the target.
Theoretical result stated in the paper (model-derived identification of two distinct limiting factors on learning speed).
high null result A Mathematical Theory of Understanding speed of learning / adoption
Teaching is modeled as sequential communication with a latent target.
Modeling assumption explicitly stated in the paper (formalization of teaching in the theoretical framework).
high null result A Mathematical Theory of Understanding model specification (teaching process)
The paper models the learner as a mind: an abstract learning system characterized by a prerequisite structure over concepts.
Modeling assumption explicitly stated in the paper (definition of the 'mind' in the theoretical model).
high null result A Mathematical Theory of Understanding model specification (representation of learner)
This Article presents the results of an experiment in which a transcript of a hypothetical client interview involving potential disability discrimination, retaliation, and wrongful termination claims was submitted to each AI system, with prompts requesting identification and assessment of viable legal theories.
Methodological description of the experiment: one hypothetical client interview transcript fed to each of four AI engines with prompts to identify and assess legal theories.
high null result Robot Wingman: Using AI to Assess an Employment Termination experimental procedure (input and prompts)
Despite fears of mass unemployment, aggregate labor-market data through 2025 show limited labor-market disruption from generative AI.
Review of aggregate employment and labor-market studies and macro-level data through 2025 cited in the brief; methods include analyses of employment statistics and macro labor indicators (no single sample size reported).
high null result AI, Productivity, and Labor Markets: A Review of the Empiric... aggregate employment / labor-market disruption
Open research challenges that define the research agenda include scaling beyond benchmarks, achieving compositionality over changes, metrics for validating specifications, handling rich logics, and designing human-AI specification interactions.
Authors' explicit enumeration of open problems and a proposed multi-disciplinary research agenda; presented as expert opinion rather than empirical finding.
high null result Intent Formalization: A Grand Challenge for Reliable Coding ... progress on research questions (research agenda advancement)
Self-concordance did not mediate the AI-over-questionnaire effect on goal progress.
Preplanned mediation model reported in the paper found no evidence that self-concordance mediated the AI vs questionnaire effect on goal progress; reported as non-significant in the preregistered analysis.
high null result AI-Assisted Goal Setting Improves Goal Progress Through Soci... goal progress (mediator tested: self-concordance, self-report)
Compared with the matched written-reflection questionnaire, the AI did not significantly improve overall goal progress.
Preplanned comparison within the preregistered RCT; reported non-significant difference between AI and written-reflection condition on overall goal progress at two-week follow-up (no significant p-value reported in the summary).
high null result AI-Assisted Goal Setting Improves Goal Progress Through Soci... goal progress (self-reported goal progress at two-week follow-up)
We conducted a preregistered three-arm randomized controlled trial (RCT) comparing an AI career coach ('Leon,' powered by Claude Sonnet), a matched structured written questionnaire, and a no-support control.
Preregistered RCT reported in the paper; three arms as described; total sample size N = 517; participants randomized to AI coach, written-reflection questionnaire, or no-support control; outcomes assessed at two-week follow-up.
high null result AI-Assisted Goal Setting Improves Goal Progress Through Soci... trial design / allocation and follow-up measurement of goal-related outcomes at ...
Research agenda: empirical microdata on managerial time use, task-level automation, performance outcomes, and wage impacts are needed to quantify substitution versus complementarity and to evaluate human-in-the-loop designs' effects on firm performance and distributional outcomes.
Explicit methodological recommendation within the paper; identifies gaps due to the paper's conceptual (non-empirical) approach.
high null result Comparative analysis of strategic vs. computational thinking... availability and use of microdata on managerial tasks, automation, firm performa...
Practical recommendations for firms and policymakers include investing in training for AI curation/evaluation/coordination, experimenting with decentralised decision rights and governance safeguards, and monitoring competitive dynamics related to model/platform providers.
Policy and practitioner takeaways explicitly presented in the discussion/implications sections, deriving from the conceptual framework and mapped literature.
high null result Generative AI and the algorithmic workplace: a bibliometric ... recommended organisational and policy actions
The paper recommends a research agenda for AI economists: causal microeconometric studies (DiD, IVs, RCTs), structural models with hybrid human–AI agents, measurement work on GenAI use, distributional analysis and policy evaluation.
Explicit recommendations listed in the implications and research agenda sections; logical follow‑on from bibliometric findings about gaps in causal and measurement evidence.
high null result Generative AI and the algorithmic workplace: a bibliometric ... recommended methodological directions for future empirical and theoretical resea...
Bibliometric mapping profiles the intellectual structure and evolution of the field but does not establish causal effects of GenAI on organisational outcomes.
Methodological limitation explicitly stated in the paper; bibliometric approach (co‑word, citation, thematic mapping) is descriptive and historical in scope.
high null result Generative AI and the algorithmic workplace: a bibliometric ... methodological limitation (inability to infer causality from bibliometric mappin...
Co‑word and thematic analyses reveal six coherent conceptual clusters that bridge technical AI topics (e.g., LLMs, GANs) with managerial themes (e.g., autonomy, coordination, decision‑making).
Thematic mapping and co‑word network analysis performed on the 212‑paper corpus; identification of six clusters reported in results.
high null result Generative AI and the algorithmic workplace: a bibliometric ... number and thematic composition of conceptual clusters (six clusters linking tec...
Bibliometric and conceptual tools (VOSviewer, Bibliometrix) were used to identify performance trends, co‑word structures, thematic maps, and conceptual evolution in the GenAI–organisation literature.
Methods section: use of VOSviewer for network visualization and Bibliometrix for bibliometric statistics, co‑word analysis, thematic mapping and Sankey thematic evolution.
high null result Generative AI and the algorithmic workplace: a bibliometric ... types of bibliometric analyses applied (performance trends, co‑word structures, ...
The study analysed a corpus of 212 Scopus‑indexed publications covering 2018–2025 to map emergent literature on Generative AI and organisational change.
Bibliometric dataset constructed from Scopus; sample size = 212 peer‑reviewed articles; time window 2018–2025; analyses performed with Bibliometrix and VOSviewer.
high null result Generative AI and the algorithmic workplace: a bibliometric ... size and timeframe of bibliometric corpus (number of publications, 2018–2025)
Research agenda: causal studies (panel data, quasi-experiments) are needed to estimate effects of AI exposure on employment outcomes and to evaluate retraining/income-support interventions for pre-retirement populations.
Authors’ stated recommendation based on limits of cross-sectional regression results from the n=889 survey and the identified need to move from association to causation.
Study limitations: cross-sectional design, self-reported intentions, potential unobserved confounders, and limited generalizability to only three cities (Beijing, Guangzhou, Lanzhou).
Explicit methodological statements in the paper describing data and design: cross-sectional survey of 889 respondents from three cities and reliance on self-reported employment intentions.
Outcomes reported are primarily self-reported psychological measures rather than objective productivity metrics.
Paper reports measurement instruments focused on self-reported self-efficacy, psychological ownership, meaningfulness, and enjoyment/satisfaction; no primary objective productivity metrics reported.
high null result Relying on AI at work reduces self-efficacy, ownership, and ... measurement type (self-reported psychological outcomes)
The experiment was pre-registered, used occupation-specific writing tasks, and employed a between-subjects design with three conditions (No-AI, Passive AI, Active collaboration).
Study design reported in the paper: pre-registration statement, N = 269, between-subjects assignment to three conditions using occupation-specific writing tasks.
high null result Relying on AI at work reduces self-efficacy, ownership, and ... n/a (methodological claim)
Active, collaborative AI use preserves perceived meaningfulness of work at levels comparable to independent work and does not produce the lasting psychological costs seen with passive use.
Pre-registered experiment (N = 269) with post-manipulation and post-return measures; Active-collaboration condition matched No-AI on meaningfulness and showed no persistent declines after returning to manual tasks.
high null result Relying on AI at work reduces self-efficacy, ownership, and ... perceived meaningfulness of work (including post-return)
Active, collaborative AI use preserves psychological ownership of outputs at levels comparable to independent work.
Pre-registered experiment (N = 269); Active-collaboration condition reported ownership levels similar to No-AI condition on self-report scales.
high null result Relying on AI at work reduces self-efficacy, ownership, and ... psychological ownership of outputs
Active, collaborative AI use (human drafts first, then uses AI to refine) preserves self-efficacy at levels comparable to independent (no-AI) work.
Pre-registered experiment (N = 269) comparing Active-collaboration and No-AI conditions; no statistically meaningful differences in self-efficacy between them (self-reported measures).
high null result Relying on AI at work reduces self-efficacy, ownership, and ... self-efficacy (confidence to complete tasks without AI)
The work is qualitative and exploratory — presenting naturalistic phenomena rather than causal empirical estimates, and is intended to be hypothesis-generating rather than definitive.
Methodology explicitly stated: naturalistic, qualitative daily observations over one month across multiple platforms; comparative observational documentation without experimental manipulation or causal identification.
high null result When Openclaw Agents Learn from Each Other: Insights from Em... nature of evidence (qualitative/exploratory vs. causal inference)
Results are from role-play contexts and short-term interventions; economic estimates of benefit require validation in field settings, across diverse populations, and with different LLM models.
Authors' caveats and limitations stated in the paper noting external validity concerns and the experimental context (role-play, short-term follow-up).
high null result Practicing with Language Models Cultivates Human Empathic Co... generalizability/external validity (not directly measured)
Outcome measures included alignment to the normative taxonomy (coding/automated), recipient-rated perceptions of being heard/validated, and blinded empathy judgments.
Methods section description listing primary and secondary outcomes used in the trial and evaluations.
high null result Practicing with Language Models Cultivates Human Empathic Co... alignment metrics, recipient-rated perceptions, blinded empathy judgments
A data-driven taxonomy was derived mapping common idiomatic empathic moves (e.g., validation, perspective-taking, emotional labeling, offers of support) used in naturalistic support conversations.
Textual analysis of the collected corpus (33,938 messages) produced an operational taxonomy of idiomatic empathic expressions used in the role-play dialogues.
high null result Practicing with Language Models Cultivates Human Empathic Co... taxonomy of empathic communication moves (categorical coding scheme)
The Lend an Ear platform collected a large conversational corpus: 33,938 messages across 2,904 conversations with 968 participants.
Dataset description reported in the paper specifying counts of participants, conversations, and messages used to build and analyze communication patterns.
high null result Practicing with Language Models Cultivates Human Empathic Co... corpus size (number of messages, conversations, participants)
Key empirical metrics introduced and used are: AI adoption rates (sector-level intensity), Skill shift index, Hybrid job share, and employment levels/net changes by sector.
Methods description listing the constructed metrics used in the simulated dataset and subsequent analyses (definitions and calculation procedures provided in the paper).
high null result AI-Driven Transformation of Labor Markets: Skill Shifts, Hyb... Defined metrics (AI adoption rate, Skill shift index, Hybrid job share, Employme...
The study's main limitations include reliance on a simulated dataset rather than exhaustive administrative microdata, literature limited to selected publishers/years, and correlational (not causal) identification of some effects.
Authors' explicitly stated limitations in the paper's methods and discussion sections describing data choices (simulated dataset, selected publishers 2020–2024) and the observational/correlational nature of several analyses.
high null result AI-Driven Transformation of Labor Markets: Skill Shifts, Hyb... Study validity/generalizability limitations
This work is conceptual/theoretical and reports no original empirical dataset; it explicitly calls for mixed-methods empirical validation (case studies, field experiments, longitudinal studies), measurement development, and multi-level data collection.
Explicit methodological statement in the paper describing its nature as a theoretical synthesis and listing empirical needs; no empirical sample provided.
high null result Revolutionizing Human Resource Development: A Theoretical Fr... presence/absence of original empirical data in the paper (none)
Four autonomous agents were benchmarked on the same fresh CTF challenge set alongside human teams.
Benchmarking experiment described in the study: four autonomous AI agents evaluated on the identical fresh challenge set used in the live onsite CTF.
high null result Understanding Human-AI Collaboration in Cybersecurity Compet... agent performance metrics on the fresh CTF challenge set (success rates, traject...
The study's empirical base consists of 40 semi-structured interviews with cross-industry project practitioners in the UK, analyzed using thematic qualitative methods.
Stated data and methods in the paper: sample size (40), interview method, cross-industry sampling, and thematic analysis.
high null result AI in project teams: how trust calibration reconfigures team... study sample and methodology (empirical basis)
Limitation: Implementation heterogeneity — the costs and feasibility of the recommended HR changes vary by context and may affect generalisability.
Explicit limitation acknowledged in the paper; drawn from theoretical reasoning about contextual heterogeneity and practitioner variability.
high null result Symbiarchic leadership: leading integrated human and AI cybe... implementation costs; feasibility; effect on generalisability
Limitation: The framework is conceptual and requires empirical validation across sectors, firm sizes and AI‑intensity levels.
Explicit limitation acknowledged by the authors; based on the paper's method (theoretical synthesis, no original data).
high null result Symbiarchic leadership: leading integrated human and AI cybe... generalizability and empirical validity across contexts
The paper generates empirically testable propositions (e.g., how leader practices affect AI adoption speed, task reallocation, productivity, error rates, employee well‑being and turnover) and suggests natural‑experiment settings for evaluation.
Stated methodological output of the conceptual synthesis; the paper lists candidate empirical tests and research opportunities but contains no original empirical tests.
high null result Symbiarchic leadership: leading integrated human and AI cybe... AI adoption speed; task reallocation; productivity; error rates; employee well‑b...
The available evidence consists mainly of promising empirical studies and case studies, but there are few long-run, generalized ROI or productivity estimates; results are heterogeneous across therapeutic areas.
Self-described limitation of the narrative review: heterogeneity of study designs and outcomes precluded pooled quantitative estimates and long-run ROI assessment.
high null result From Algorithm to Medicine: AI in the Discovery and Developm... evidence quality (availability of long-run ROI/productivity estimates) and heter...
AI applications span the full drug development pipeline, including target discovery, in silico screening and de novo design, preclinical safety models, clinical trial design and patient selection/monitoring, and post-marketing surveillance.
Comprehensive literature synthesis across preclinical, clinical, and post-marketing sources in the narrative review summarizing documented uses across these stages.
high null result From Algorithm to Medicine: AI in the Discovery and Developm... coverage of pipeline stages by AI applications (scope)
Current evidence is illustrative rather than systematic; there is a lack of long-run, quantitative measures of AI’s effect on late-stage clinical outcomes in the literature reviewed.
Explicit methodological statement in the paper: study is an expert/opinion synthesis and narrative review with no new causal econometric estimates or primary experimental data.
high null result Learning from the successes and failures of early artificial... existence/availability of long-run quantitative measures linking AI adoption to ...
Suggested metrics for researchers and investors to monitor include R&D cycle time, cost per IND/NDA, proportion of projects using AI, success rates at development stages, market concentration measures, and investment flows into AI-enabled biotech vs incumbents.
Recommendations made in the Implications section as metrics to watch; no empirical tracking or baseline measures provided.
high null result AI as the Catalyst for a New Paradigm in Biomedical Research recommended monitoring metrics for AI impact in pharma/biotech
Limitations of the analysis include limited empirical validation of archetypes or impacts and potential selection bias toward prominent firms and technologies.
Explicit limitations stated in the Data & Methods section of the paper.
high null result AI as the Catalyst for a New Paradigm in Biomedical Research generalizability and representativeness of the paper's claims
The paper is an editorial/conceptual synthesis rather than a primary empirical study: it uses qualitative analysis and illustrative examples, and reports no new quantitative estimates.
Explicit statement in the Data & Methods section of the paper describing document type, approach, evidence base, and limitations.
high null result AI as the Catalyst for a New Paradigm in Biomedical Research empirical evidence provision (absence of new quantitative data)
Ethical oversight and governance (addressing bias, consent, downstream risks) are critical constraints that must be addressed for AI to generate sustained benefits.
Normative synthesis referencing common ethical concerns; no empirical evaluation of oversight mechanisms in the paper.
high null result AI as the Catalyst for a New Paradigm in Biomedical Research ethical acceptability and downstream risk mitigation
Transparency and auditability for model behavior, provenance, and decisions are essential for trustworthy deployment and regulatory acceptance.
Policy and governance synthesis drawing on regulatory dynamics; no empirical study of regulatory outcomes included.
high null result AI as the Catalyst for a New Paradigm in Biomedical Research trustworthiness/regulatory acceptability of models