The Commonplace

Evidence (1902 claims)

Adoption (5126 claims)
Productivity (4409 claims)
Governance (4049 claims)
Human-AI Collaboration (2954 claims)
Labor Markets (2432 claims)
Org Design (2273 claims)
Innovation (2215 claims)
Skills & Training (1902 claims)
Inequality (1286 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 369 105 58 432 972
Governance & Regulation 365 171 113 54 713
Research Productivity 229 95 33 294 655
Organizational Efficiency 354 82 58 34 531
Technology Adoption Rate 277 115 63 27 486
Firm Productivity 273 33 68 10 389
AI Safety & Ethics 112 177 43 24 358
Output Quality 228 61 23 25 337
Market Structure 105 118 81 14 323
Decision Quality 154 68 33 17 275
Employment Level 68 32 74 8 184
Fiscal & Macroeconomic 74 52 32 21 183
Skill Acquisition 85 31 38 9 163
Firm Revenue 96 30 22 148
Innovation Output 100 11 20 11 143
Consumer Welfare 66 29 35 7 137
Regulatory Compliance 51 61 13 3 128
Inequality Measures 24 66 31 4 125
Task Allocation 64 6 28 6 104
Error Rate 42 47 6 95
Training Effectiveness 55 12 10 16 93
Worker Satisfaction 42 32 11 6 91
Task Completion Time 71 5 3 1 80
Wages & Compensation 38 13 19 4 74
Team Performance 41 8 15 7 72
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 17 15 9 5 46
Job Displacement 5 28 12 45
Social Protection 18 8 6 1 33
Developer Productivity 25 1 2 1 29
Worker Turnover 10 12 3 25
Creative Output 15 5 3 1 24
Skill Obsolescence 3 18 2 23
Labor Share of Income 7 4 9 20
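The rows above can be read programmatically by peeling the trailing integers off each line. The following is a minimal sketch, not part of the dashboard itself: the names `parse_row` and `positive_share` are hypothetical, and rows that carry fewer than five numbers are flagged as incomplete rather than guessed at, because the flattened text does not show which column is blank.

```python
from typing import List, NamedTuple, Optional

COLUMNS = ("Positive", "Negative", "Mixed", "Null", "Total")


class Row(NamedTuple):
    outcome: str       # outcome category label
    counts: List[int]  # trailing integers, in the order they appear
    complete: bool     # True only when all five columns are present


def parse_row(line: str) -> Row:
    """Split a flattened matrix row into its label and trailing integers."""
    tokens = line.split()
    counts: List[int] = []
    # Peel integers off the right-hand side; whatever remains is the label.
    while tokens and tokens[-1].isdigit():
        counts.insert(0, int(tokens.pop()))
    return Row(" ".join(tokens), counts, len(counts) == len(COLUMNS))


def positive_share(row: Row) -> Optional[float]:
    """Positive claims as a fraction of the row total; None if a cell is missing."""
    if not row.complete:
        return None
    return row.counts[0] / row.counts[-1]


# Three rows copied from the matrix; the last one has only four numbers.
rows = [
    parse_row("Output Quality 228 61 23 25 337"),
    parse_row("Task Completion Time 71 5 3 1 80"),
    parse_row("Firm Revenue 96 30 22 148"),
]
for row in rows:
    print(row.outcome, row.counts, positive_share(row))
```

Returning `None` for incomplete rows keeps the ambiguity visible instead of silently assigning the three counts to the wrong directions.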
Filter: Skills & Training
Active, collaborative AI use preserves perceived meaningfulness of work at levels comparable to independent work and does not produce the lasting psychological costs seen with passive use.
Pre-registered experiment (N = 269) with post-manipulation and post-return measures; Active-collaboration condition matched No-AI on meaningfulness and showed no persistent declines after returning to manual tasks.
high | null result | Relying on AI at work reduces self-efficacy, ownership, and ... | perceived meaningfulness of work (including post-return)
Active, collaborative AI use preserves psychological ownership of outputs at levels comparable to independent work.
Pre-registered experiment (N = 269); Active-collaboration condition reported ownership levels similar to No-AI condition on self-report scales.
high | null result | Relying on AI at work reduces self-efficacy, ownership, and ... | psychological ownership of outputs
Active, collaborative AI use (human drafts first, then uses AI to refine) preserves self-efficacy at levels comparable to independent (no-AI) work.
Pre-registered experiment (N = 269) comparing Active-collaboration and No-AI conditions; no statistically meaningful differences in self-efficacy between them (self-reported measures).
high | null result | Relying on AI at work reduces self-efficacy, ownership, and ... | self-efficacy (confidence to complete tasks without AI)
The work is qualitative and exploratory: it presents naturalistic phenomena rather than causal empirical estimates and is intended to be hypothesis-generating rather than definitive.
Methodology explicitly stated: naturalistic, qualitative daily observations over one month across multiple platforms; comparative observational documentation without experimental manipulation or causal identification.
high | null result | When Openclaw Agents Learn from Each Other: Insights from Em... | nature of evidence (qualitative/exploratory vs. causal inference)
Results are from role-play contexts and short-term interventions; economic estimates of benefit require validation in field settings, across diverse populations, and with different LLM models.
Authors' caveats and limitations stated in the paper noting external validity concerns and the experimental context (role-play, short-term follow-up).
high | null result | Practicing with Language Models Cultivates Human Empathic Co... | generalizability/external validity (not directly measured)
Outcome measures included alignment to the normative taxonomy (coding/automated), recipient-rated perceptions of being heard/validated, and blinded empathy judgments.
Methods section description listing primary and secondary outcomes used in the trial and evaluations.
high | null result | Practicing with Language Models Cultivates Human Empathic Co... | alignment metrics, recipient-rated perceptions, blinded empathy judgments
A data-driven taxonomy was derived mapping common idiomatic empathic moves (e.g., validation, perspective-taking, emotional labeling, offers of support) used in naturalistic support conversations.
Textual analysis of the collected corpus (33,938 messages) produced an operational taxonomy of idiomatic empathic expressions used in the role-play dialogues.
high | null result | Practicing with Language Models Cultivates Human Empathic Co... | taxonomy of empathic communication moves (categorical coding scheme)
The Lend an Ear platform collected a large conversational corpus: 33,938 messages across 2,904 conversations with 968 participants.
Dataset description reported in the paper specifying counts of participants, conversations, and messages used to build and analyze communication patterns.
high | null result | Practicing with Language Models Cultivates Human Empathic Co... | corpus size (number of messages, conversations, participants)
Key empirical metrics introduced and used are: AI adoption rates (sector-level intensity), Skill shift index, Hybrid job share, and employment levels/net changes by sector.
Methods description listing the constructed metrics used in the simulated dataset and subsequent analyses (definitions and calculation procedures provided in the paper).
high | null result | AI-Driven Transformation of Labor Markets: Skill Shifts, Hyb... | Defined metrics (AI adoption rate, Skill shift index, Hybrid job share, Employme...
The study's main limitations include reliance on a simulated dataset rather than exhaustive administrative microdata, literature limited to selected publishers/years, and correlational (not causal) identification of some effects.
Authors' explicitly stated limitations in the paper's methods and discussion sections describing data choices (simulated dataset, selected publishers 2020–2024) and the observational/correlational nature of several analyses.
high | null result | AI-Driven Transformation of Labor Markets: Skill Shifts, Hyb... | Study validity/generalizability limitations
This work is conceptual/theoretical and reports no original empirical dataset; it explicitly calls for mixed-methods empirical validation (case studies, field experiments, longitudinal studies), measurement development, and multi-level data collection.
Explicit methodological statement in the paper describing its nature as a theoretical synthesis and listing empirical needs; no empirical sample provided.
high | null result | Revolutionizing Human Resource Development: A Theoretical Fr... | presence/absence of original empirical data in the paper (none)
Four autonomous agents were benchmarked on the same fresh CTF challenge set alongside human teams.
Benchmarking experiment described in the study: four autonomous AI agents evaluated on the identical fresh challenge set used in the live onsite CTF.
high | null result | Understanding Human-AI Collaboration in Cybersecurity Compet... | agent performance metrics on the fresh CTF challenge set (success rates, traject...
The study's empirical base consists of 40 semi-structured interviews with cross-industry project practitioners in the UK, analyzed using thematic qualitative methods.
Stated data and methods in the paper: sample size (40), interview method, cross-industry sampling, and thematic analysis.
high | null result | AI in project teams: how trust calibration reconfigures team... | study sample and methodology (empirical basis)
Limitation: Implementation heterogeneity — the costs and feasibility of the recommended HR changes vary by context and may affect generalisability.
Explicit limitation acknowledged in the paper; drawn from theoretical reasoning about contextual heterogeneity and practitioner variability.
high | null result | Symbiarchic leadership: leading integrated human and AI cybe... | implementation costs; feasibility; effect on generalisability
Limitation: The framework is conceptual and requires empirical validation across sectors, firm sizes and AI‑intensity levels.
Explicit limitation acknowledged by the authors; based on the paper's method (theoretical synthesis, no original data).
high | null result | Symbiarchic leadership: leading integrated human and AI cybe... | generalizability and empirical validity across contexts
The paper generates empirically testable propositions (e.g., how leader practices affect AI adoption speed, task reallocation, productivity, error rates, employee well‑being and turnover) and suggests natural‑experiment settings for evaluation.
Stated methodological output of the conceptual synthesis; the paper lists candidate empirical tests and research opportunities but contains no original empirical tests.
high | null result | Symbiarchic leadership: leading integrated human and AI cybe... | AI adoption speed; task reallocation; productivity; error rates; employee well‑b...
The available evidence consists mainly of promising empirical studies and case studies, but there are few long-run, generalized ROI or productivity estimates; results are heterogeneous across therapeutic areas.
Self-described limitation of the narrative review: heterogeneity of study designs and outcomes precluded pooled quantitative estimates and long-run ROI assessment.
high | null result | From Algorithm to Medicine: AI in the Discovery and Developm... | evidence quality (availability of long-run ROI/productivity estimates) and heter...
AI applications span the full drug development pipeline, including target discovery, in silico screening and de novo design, preclinical safety models, clinical trial design and patient selection/monitoring, and post-marketing surveillance.
Comprehensive literature synthesis across preclinical, clinical, and post-marketing sources in the narrative review summarizing documented uses across these stages.
high | null result | From Algorithm to Medicine: AI in the Discovery and Developm... | coverage of pipeline stages by AI applications (scope)
Current evidence is illustrative rather than systematic; there is a lack of long-run, quantitative measures of AI’s effect on late-stage clinical outcomes in the literature reviewed.
Explicit methodological statement in the paper: study is an expert/opinion synthesis and narrative review with no new causal econometric estimates or primary experimental data.
high | null result | Learning from the successes and failures of early artificial... | existence/availability of long-run quantitative measures linking AI adoption to ...
Suggested metrics for researchers and investors to monitor include R&D cycle time, cost per IND/NDA, proportion of projects using AI, success rates at development stages, market concentration measures, and investment flows into AI-enabled biotech vs incumbents.
Recommendations made in the Implications section as metrics to watch; no empirical tracking or baseline measures provided.
high | null result | AI as the Catalyst for a New Paradigm in Biomedical Research | recommended monitoring metrics for AI impact in pharma/biotech
Limitations of the analysis include limited empirical validation of archetypes or impacts and potential selection bias toward prominent firms and technologies.
Explicit limitations stated in the Data & Methods section of the paper.
high | null result | AI as the Catalyst for a New Paradigm in Biomedical Research | generalizability and representativeness of the paper's claims
The paper is an editorial/conceptual synthesis rather than a primary empirical study: it uses qualitative analysis and illustrative examples, and reports no new quantitative estimates.
Explicit statement in the Data & Methods section of the paper describing document type, approach, evidence base, and limitations.
high | null result | AI as the Catalyst for a New Paradigm in Biomedical Research | empirical evidence provision (absence of new quantitative data)
Ethical oversight and governance (addressing bias, consent, downstream risks) are critical constraints that must be addressed for AI to generate sustained benefits.
Normative synthesis referencing common ethical concerns; no empirical evaluation of oversight mechanisms in the paper.
high | null result | AI as the Catalyst for a New Paradigm in Biomedical Research | ethical acceptability and downstream risk mitigation
Transparency and auditability for model behavior, provenance, and decisions are essential for trustworthy deployment and regulatory acceptance.
Policy and governance synthesis drawing on regulatory dynamics; no empirical study of regulatory outcomes included.
high | null result | AI as the Catalyst for a New Paradigm in Biomedical Research | trustworthiness/regulatory acceptability of models
Rigorous model validation and reproducibility across datasets and settings are necessary constraints for successful AI deployment.
Normative claim in the editorial based on reproducibility concerns in ML and biomedical research; no reported validation trials within the paper.
high | null result | AI as the Catalyst for a New Paradigm in Biomedical Research | reliability and generalizability of AI models across settings
The paper is primarily discursive and invitational: it opens a dialogue and proposes a research agenda rather than providing definitive empirical answers.
Stated methodological stance and limits: conceptual/philosophical analysis, interdisciplinary literature synthesis, qualitative/illustrative examples, and explicit note of no systematic empirical evaluation.
high | null result | At the table with Wittgenstein: How language shapes taste an... | presence/absence of new empirical datasets or systematic experimental validation...
The paper identifies three core mechanisms underlying calibrated trust and complementarity: (1) calibrated trust balancing reliance and oversight, (2) complementarity–trust interaction for optimal performance, and (3) dynamic feedback loops producing reinforcing learning cycles.
Explicit identification of mechanisms claimed in the paper's synthesis; this is a descriptive claim about the paper's content rather than an empirical finding—no sample or empirical test reported in the abstract.
high | null result | Optimising Human–AI Decision Performance: A Trust and Cap... | n/a (identification of theoretical mechanisms)
It remains unclear how developers' general programming and security-specific experience, and the type of AI tool used (free vs. paid), affect the security of the resulting software — motivating this study.
Paper's stated research gap/motivation: the authors identify uncertainty in the literature regarding interactions between developer experience, AI tool tier (free vs. paid), and resulting code security.
high | null result | The Impact of AI-Assisted Development on Software Security: ... | the combined effect of developer experience and AI tool type on code security (i...
Participants were assigned a security-related programming task using either no AI tools, the free version, or the paid version of Gemini.
Experimental design described in the paper: random/conditional assignment of participants into three groups (no AI, free Gemini, paid Gemini) performing the same security-related programming task.
high | null result | The Impact of AI-Assisted Development on Software Security: ... | experimental condition (tool used) as it relates to subsequent code security out...
We conducted a quantitative programming study with software developers (n = 159) exploring the impact of Google's AI tool Gemini on code security.
Explicit methodological statement in the paper: a quantitative study with 159 participating software developers assigned to experimental conditions to evaluate Gemini's impact on security-related programming tasks.
high | null result | The Impact of AI-Assisted Development on Software Security: ... | impact of Gemini on code security (security of code produced in the study)
The article introduces a novel Bayesian Item Response Theory framework that quantifies human–AI synergy by separately estimating individual ability, collaborative ability, and AI model capability while controlling for task difficulty.
Methodological contribution described in the paper: development and application of a Bayesian Item Response Theory model that includes separate parameters for individual ability, collaborative ability, AI model capability, and task difficulty (method section of the paper).
high | null result | Quantifying and Optimizing Human-AI Synergy: Evidence-Based ... | estimated parameters for individual ability, collaborative ability, AI model cap...
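To make the separation of parameters concrete, here is a minimal sketch of a logistic item-response likelihood that uses the four quantities named in the claim. The additive form, the function names, and the example values are assumptions for illustration only; the excerpt does not give the paper's actual parameterization or priors, and this sketch shows only the success-probability term, not Bayesian inference.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def p_solo(individual_ability: float, task_difficulty: float) -> float:
    """Rasch-style probability that a person solves a task alone."""
    return sigmoid(individual_ability - task_difficulty)


def p_with_ai(
    individual_ability: float,
    collaborative_ability: float,
    ai_capability: float,
    task_difficulty: float,
) -> float:
    """ASSUMED additive form: abilities add on the logit scale, difficulty subtracts."""
    logit = individual_ability + collaborative_ability + ai_capability - task_difficulty
    return sigmoid(logit)


# Hypothetical values on the logit scale, invented for illustration.
theta, kappa, alpha, difficulty = 0.2, 0.5, 0.8, 1.0
solo = p_solo(theta, difficulty)
joint = p_with_ai(theta, kappa, alpha, difficulty)
print(f"solo={solo:.3f} with_ai={joint:.3f} lift={joint - solo:.3f}")
```

Keeping collaborative ability as its own term means two people with the same solo ability can still differ in how much they gain from working with the same model on the same task.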
A quantitative methodology was employed, utilizing a structured questionnaire administered to 400 small business owners.
Explicit methodological statement in the paper: structured questionnaire survey with sample size N=400 small business owners.
high | null result | The role of artificial intelligence in enhancing financial l... | method / sample (use of structured questionnaire; sample size = 400)
This research conducts a critical analysis of the ethical implications of artificial intelligence in terms of job displacement during the fifth industrial revolution.
Author-declared methodology: a literature-based critical analysis drawing on novel studies and the existing body of literature; no further methodological details (e.g., inclusion criteria, databases searched) provided in the excerpt.
high | null result | A Study on Work-Life Balance of Women Employees in the IT Se... | ethical implications of AI-related job displacement
This study analyzes comments and statements from party members in OECD countries from 2016 to 2025 through content analysis, examining media interviews, speeches, and debates.
Description of the study's data and method: content analysis of party member comments and statements drawn from media interviews, speeches, and debates across OECD countries over the 2016–2025 period (sample size and selection details not reported in the excerpt).
high | null result | Political Ideology, Artificial Intelligence (AI), and Labor ... | dataset composition and methodological approach (sources and timeframe of analyz...
The study contributes to the literature by integrating evidence across higher education, vocational training, and lifelong learning to emphasize the need for balanced policy approaches to skill formation.
Stated contribution in the paper: cross-pathway synthesis of existing empirical evidence and secondary data (methods described as comparative synthesis; no primary empirical contribution reported in the summary).
high | null result | Balancing Higher Education, Vocational Training, and Lifelon... | scholarly contribution / integrative synthesis
The study uses secondary data and comparative evidence from prior empirical studies to analyze relationships between higher education, vocational education, and lifelong learning.
Stated methodology in the paper: analysis of secondary data and synthesis of prior empirical/comparative studies (no primary data collection; no sample sizes reported).
high | null result | Balancing Higher Education, Vocational Training, and Lifelon... | methodological approach / data sources
The proposed framework draws on leadership theory, emotional intelligence research, and AI ethics.
Methodological/design statement in the paper describing its intellectual grounding; indicates literature-based synthesis rather than primary data collection.
high | null result | Deconstructing success: why being human still matters | sources informing the framework (theoretical influences)
Chatbot suggestions were artificially varied in aggregate accuracy across treatment conditions from low (53%) to high (100%).
Paper describes experimental manipulation of chatbot suggestion accuracy with aggregate accuracies ranging from 53% to 100%; manipulation method (how suggestions were generated or sampled) described in methods (not fully detailed in excerpt).
high | null result | LLMs in social services: How does chatbot accuracy affect hu... | manipulated chatbot suggestion accuracy (range 53%–100%)
Caseworkers in the control condition (no chatbot suggestions) had a mean accuracy of 49%.
Reported experimental outcome: mean accuracy for control group = 49%; based on the randomized experiment using the 770-question benchmark.
high | null result | LLMs in social services: How does chatbot accuracy affect hu... | caseworker accuracy (mean percent correct in control condition = 49%)
We conducted a randomized experiment with caseworkers recruited from nonprofit outreach organizations in Los Angeles.
Paper describes a randomized experiment recruiting caseworkers from nonprofit outreach organizations in Los Angeles; sample size and recruitment details not given in the excerpt.
high | null result | LLMs in social services: How does chatbot accuracy affect hu... | execution of a randomized experiment with nonprofit caseworker participants (loc...
The benchmark questions have corresponding expert-verified answers.
Paper states benchmark questions have expert-verified answers; verification method and number/credentials of experts not specified in the excerpt.
high | null result | LLMs in social services: How does chatbot accuracy affect hu... | availability of expert-verified reference answers for benchmark questions
We created a 770-question multiple-choice benchmark dataset of difficult but realistic questions that a caseworker might receive.
Paper reports creation of a benchmark dataset containing 770 multiple-choice questions described as difficult and realistic; questions and dataset construction described in methods (no sample-of-questions or external validation details provided in the excerpt).
high | null result | LLMs in social services: How does chatbot accuracy affect hu... | benchmark dataset size and content (770 multiple-choice questions)
The study's conclusions draw on three complementary evidence bases: (a) task-level evidence on what generative AI can already do in practice; (b) occupational exposure and complementarity analysis using Philippine labor force data; and (c) firm- and worker-level evidence on AI adoption.
Description of methods and data sources in the paper: task-level capability testing/assessment, analysis of national labor force/occupation data for exposure/complementarity, and firm/worker surveys or qualitative adoption evidence.
high | null result | Labor Futures Under Artificial Intelligence: Scenarios for t... | methodological integration of evidence bases (description of data/methods rather...
The review focuses on AI applications within small‑scale business environments, with a special focus on women‑owned micro firms in Jaipur, India.
Scope and aim articulated in the paper; geographic and demographic focus explicitly stated by the authors.
high | null result | Role of AI in Enhancing Work Efficiency and Opportunities fo... | scope of review (women‑owned micro firms in Jaipur; AI in micro‑enterprise conte...
The systematic review follows PRISMA 2020 guidelines.
Methodological statement in the paper indicating adherence to PRISMA 2020 for the review process.
high | null result | Role of AI in Enhancing Work Efficiency and Opportunities fo... | methodological adherence to PRISMA 2020 reporting standards
After screening and eligibility filtering, 55 open‑access journal articles were included for in‑depth analysis.
PRISMA‑guided screening and eligibility process reported in the review; final included sample explicitly stated as 55 open‑access journal articles.
high | null result | Role of AI in Enhancing Work Efficiency and Opportunities fo... | number of included articles for analysis (n = 55)
A Scopus search identified 265 records using keywords related to women’s entrepreneurship and AI.
Systematic literature search reported in the paper following PRISMA 2020; search executed in Scopus with specified keywords; initial yield stated as 265 records.
high | null result | Role of AI in Enhancing Work Efficiency and Opportunities fo... | number of records identified in database search (n = 265)
Viable transition pathways are operationally defined in this study as sharing at least 3 skills and achieving at least 50% skill transfer.
Methodological definition stated in the paper used to determine whether a job-to-job transition is considered viable.
high | null result | Graph-Based Analysis of AI-Driven Labor Market Transitions: ... | criteria thresholds for classifying transition viability (>=3 shared skills; >=5...
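The two viability thresholds can be expressed as a small predicate over skill sets. This is a minimal sketch under stated assumptions: the excerpt does not say how "skill transfer" is computed, so it is taken here to be the share of the target job's skills that the source job already has, and the job names and skill lists are hypothetical.

```python
from typing import Set

MIN_SHARED_SKILLS = 3      # threshold stated in the study
MIN_SKILL_TRANSFER = 0.50  # threshold stated in the study


def skill_transfer(source: Set[str], target: Set[str]) -> float:
    """ASSUMPTION: shared skills divided by the skills the target job requires."""
    if not target:
        return 0.0
    return len(source & target) / len(target)


def is_viable(source: Set[str], target: Set[str]) -> bool:
    """A transition is viable only if both thresholds are met."""
    shared = len(source & target)
    return (
        shared >= MIN_SHARED_SKILLS
        and skill_transfer(source, target) >= MIN_SKILL_TRANSFER
    )


# Hypothetical skill sets, invented for illustration only.
data_entry = {"spreadsheets", "typing", "record keeping", "email"}
office_admin = {"spreadsheets", "record keeping", "email", "scheduling", "invoicing"}
analyst = {"spreadsheets", "sql", "statistics", "dashboards", "python", "reporting"}

print(is_viable(data_entry, office_admin))  # 3 shared skills, 3/5 transfer
print(is_viable(data_entry, analyst))       # 1 shared skill
```

Under this reading the measure is directional: a move from job A to job B can pass the 50% test while the reverse move fails, because the denominator is the target job's skill count.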
We identified 4,534 feasible transitions between jobs in the dataset.
Count of feasible job-to-job transition pairs found in the knowledge graph analysis (4,534 transitions reported).
high | null result | Graph-Based Analysis of AI-Driven Labor Market Transitions: ... | number of feasible job-to-job transitions identified
We constructed and validated a knowledge graph of 9,978 Egyptian job postings, 19,766 skill activities, and 84,346 job-skill relationships with a 0.74% error rate.
Empirical construction and validation of a knowledge graph using a dataset of 9,978 job postings, 19,766 distinct skill/activity nodes, and 84,346 job–skill edges; reported overall error rate 0.74% (validation method not detailed in the excerpt).
high | null result | Graph-Based Analysis of AI-Driven Labor Market Transitions: ... | size and quality (error rate) of the knowledge graph (counts of postings, skills...
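The reported counts imply a few simple structural figures for the job-skill graph. The sketch below derives them by arithmetic from the three numbers in the claim; treating the graph as bipartite (postings on one side, skill activities on the other) is an assumption, and the variable names are hypothetical.

```python
# Counts as reported in the claim above.
job_postings = 9_978
skill_nodes = 19_766
job_skill_edges = 84_346

# Average number of skill links per posting, and postings per skill.
mean_skills_per_posting = job_skill_edges / job_postings
mean_postings_per_skill = job_skill_edges / skill_nodes

# ASSUMPTION: bipartite graph, so the maximum edge count is postings x skills.
density = job_skill_edges / (job_postings * skill_nodes)

print(f"mean skills per posting: {mean_skills_per_posting:.2f}")  # 8.45
print(f"mean postings per skill: {mean_postings_per_skill:.2f}")  # 4.27
print(f"bipartite density: {density:.6f}")
```

The 0.74% error rate is left out of the calculation because the excerpt does not say whether it applies to nodes, edges, or extracted records.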