The Commonplace

Evidence (14156 claims)

Adoption: 8625 claims
Productivity: 7686 claims
Governance: 6917 claims
Human-AI Collaboration: 6574 claims
Org Design: 4189 claims
Innovation: 4131 claims
Labor Markets: 3588 claims
Skills & Training: 2985 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 761 200 101 904 2020
Governance & Regulation 829 400 191 122 1566
Organizational Efficiency 784 193 125 84 1197
Technology Adoption Rate 637 236 124 97 1103
Research Productivity 431 131 58 340 972
Output Quality 481 183 59 47 770
Decision Quality 332 177 82 49 647
Firm Productivity 439 57 88 20 610
AI Safety & Ethics 218 279 66 33 602
Market Structure 181 170 123 24 503
Task Allocation 214 64 72 33 388
Skill Acquisition 174 62 62 17 315
Innovation Output 204 27 45 18 295
Employment Level 105 54 108 13 282
Fiscal & Macroeconomic 132 69 43 26 277
Consumer Welfare 117 63 42 11 233
Firm Revenue 154 48 26 3 231
Task Completion Time 173 31 8 12 225
Inequality Measures 44 123 50 6 223
Worker Satisfaction 89 65 22 12 188
Error Rate 71 92 10 2 175
Regulatory Compliance 77 69 14 5 165
Automation Exposure 58 56 26 13 156
Training Effectiveness 96 21 14 19 152
Wages & Compensation 77 37 25 6 145
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 81 21 1 115
Hiring & Recruitment 52 7 8 3 70
Creative Output 32 20 8 3 64
Skill Obsolescence 5 47 6 1 59
Social Protection 28 16 8 2 54
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
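The matrix rows above are flattened text: an outcome name followed by a run of integers. A minimal sketch of how such rows could be parsed and summarized follows. It assumes the integers appear in the column order of the header (Positive, Negative, Mixed, Null, Total). The names `parse_row` and `negative_share` are hypothetical helpers introduced here for illustration and are not part of the site. Rows with fewer than five numbers are skipped rather than guessed, because the flattened text does not show which cell was empty.

```python
import re

# A few rows copied verbatim from the matrix above.
ROWS = """\
Governance & Regulation 829 400 191 122 1566
Firm Productivity 439 57 88 20 610
Job Displacement 12 81 21 1 115
Skill Obsolescence 5 47 6 1 59
Labor Share of Income 17 19 17 53
"""

# Assumed column order, taken from the table header.
COLUMNS = ("positive", "negative", "mixed", "null", "total")


def parse_row(line):
    """Split a flattened row into (outcome, counts); return None if a cell is missing."""
    # Lazy name, then one or more whitespace-separated integers up to end of line.
    match = re.match(r"^(.*?)((?:\s+\d+)+)$", line.strip())
    if match is None:
        return None
    numbers = [int(n) for n in match.group(2).split()]
    if len(numbers) != len(COLUMNS):
        # The position of the empty cell is not recoverable, so do not guess.
        return None
    return match.group(1), dict(zip(COLUMNS, numbers))


def negative_share(counts):
    """Fraction of an outcome's claims that report a negative finding."""
    return counts["negative"] / counts["total"]


parsed = [p for p in map(parse_row, ROWS.splitlines()) if p is not None]
for outcome, counts in parsed:
    print(f"{outcome}: {negative_share(counts):.1%} negative")
```

The sketch divides by the Total column rather than by the sum of the four direction counts, because the two differ in some rows (for Firm Productivity, 439 + 57 + 88 + 20 = 604 against a stated total of 610).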
Rising costs and failure rates in the pharmaceutical R&D process have not fundamentally improved over the last decade.
Stated as a contextual observation in the paper's opening paragraph; presented as a summary of industry trends (no specific dataset, sample size, or citation included in the excerpt).
high negative Strategic Key Performance Indicators for AI in Lead Optimiza... cost and failure rates in pharmaceutical R&D
Without support, performance stays stable up to three issues but declines as additional issues increase cognitive load.
Empirical study / human-AI negotiation case study in a property rental scenario that varied the number of negotiated issues; the paper reports observed performance across different numbers of issues (no sample size for this specific comparison stated in the abstract).
high negative From Overload to Convergence: Supporting Multi-Issue Human-A... negotiation performance (ability to find good agreements) under increasing numbe...
Reliance on automated content generation introduces risks of cognitive overreliance, algorithmic bias, and strategic misalignment.
The paper articulates these risks as conceptual/qualitative concerns in its discussion; no quantitative estimates or empirical tests of these specific risks are reported in the provided excerpt.
high negative The Strategic Impact of Generative Artificial Intelligence o... risks to decision-making including cognitive overreliance, algorithmic bias, str...
Wide disagreement among AIs created confusion and undermined appropriate reliance on advice.
Reported experimental finding from the paper: manipulating within-panel disagreement across tasks produced wide disagreement conditions that, according to the abstract, led to confusion and reduced appropriate reliance. No quantitative metrics reported in abstract.
high negative More Isn't Always Better: Balancing Decision Accuracy and Co... appropriate reliance on advice / decision-making
High within-panel consensus fostered overreliance on AI advice.
Experimental manipulation of within-panel consensus across the three tasks; the abstract reports that high consensus increased participants' reliance on AI (interpreted as overreliance). Specific measures and sample size not provided in abstract.
high negative More Isn't Always Better: Balancing Decision Accuracy and Co... reliance on AI advice (overreliance)
Current (pay-upfront) models impose a financial barrier to entry for developers, limiting innovation and excluding actors from emerging economies.
Analytical argument in the paper based on cost-structure reasoning and literature on barriers to entry; no empirical sample or causal estimate provided.
high negative Revenue-Sharing as Infrastructure: A Distributed Business Mo... developer entry barriers / access to platform
Improvements in AI capability ('better' AI) also amplify the excess automation.
Model comparative statics: increased AI capabilities raise private incentives to automate, leading to more displacement than is socially optimal; theoretical analysis only.
high negative The AI Layoff Trap level of automation / worker displacement as a function of AI capability
More competition amplifies the excess automation (the automation arms race).
Comparative-statics result in the competitive task-based theoretical model showing increased competition raises firms' incentives to automate; no empirical sample.
high negative The AI Layoff Trap level of automation / worker displacement as a function of competition intensity
The resulting loss from excess automation harms both workers and firm owners.
Welfare comparisons from the model showing negative payoff changes for workers (lower wages/less employment) and reduced owner returns when automation is excessive; theoretical analysis, no empirical data.
high negative The AI Layoff Trap welfare/profits of workers and firm owners (losses caused by excess automation)
In a competitive task-based model, demand externalities trap rational firms in an automation arms race, displacing workers well beyond what is collectively optimal.
Formal equilibrium analysis in the paper's theoretical competitive task-based model; comparative statics and welfare analysis (no empirical sample).
high negative The AI Layoff Trap extent of worker displacement relative to social optimum
Knowing that AI-driven displacement can erode demand is not enough for firms to stop automating.
Analytical result from the paper's competitive task-based model showing firms' incentives do not internalize demand externalities; no empirical sample.
high negative The AI Layoff Trap firm automation decisions (propensity to automate) despite awareness of aggregat...
If AI displaces human workers faster than the economy can reabsorb them, it risks eroding the very consumer demand firms depend on.
Theoretical statement in the paper's motivating premise; no empirical sample reported (conceptual argument about aggregate demand effects when displacement outpaces reabsorption).
high negative The AI Layoff Trap consumer demand (aggregate demand) as affected by worker displacement
Fukui is Japan's least-visited prefecture.
Descriptive claim in the paper specifying the study site (Fukui) as the country's least-visited prefecture; no supporting national rankings provided in the excerpt.
We quantify an annual opportunity gap of 865,917 unrealized visits, equivalent to approximately 11.96 billion yen (USD 76.2 million) in lost revenue.
Model-based estimate produced by the DSS using the analyzed datasets and the DHDE-informed optimization; figure reported directly in the paper.
high negative Engineering Distributed Governance for Regional Prosperity: ... unrealized visits and lost revenue
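The three figures in the opportunity-gap claim above (865,917 visits, 11.96 billion yen, USD 76.2 million) imply a per-visit value and an exchange rate that the excerpt does not state. The sketch below derives them by simple division. The derived quantities are back-of-the-envelope values computed here, not figures reported by the paper, and the variable names are illustrative.

```python
# Figures quoted in the claim above.
unrealized_visits = 865_917
lost_revenue_jpy = 11.96e9   # approximately 11.96 billion yen
lost_revenue_usd = 76.2e6    # USD 76.2 million

# Derived values (assumption: revenue scales linearly with visits).
jpy_per_visit = lost_revenue_jpy / unrealized_visits
usd_per_visit = lost_revenue_usd / unrealized_visits
implied_jpy_per_usd = lost_revenue_jpy / lost_revenue_usd

print(f"Implied revenue per visit: {jpy_per_visit:,.0f} yen ({usd_per_visit:.2f} USD)")
print(f"Implied exchange rate: {implied_jpy_per_usd:.2f} yen per USD")
```

Under these assumptions each unrealized visit corresponds to roughly 13,800 yen (about USD 88), at an implied rate of about 157 yen per dollar.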
For regions experiencing demographic decline and structural stagnation, the primary risk is 'under-vibrancy', a condition where low visitor density suppresses economic activity and diminishes satisfaction.
Conceptual claim and problem framing provided by the authors (theoretical/qualitative argument in the paper).
high negative Engineering Distributed Governance for Regional Prosperity: ... economic activity and satisfaction (conceptual)
Most research in urban informatics and tourism focuses on mitigating overtourism in dense global cities.
Author statement in introduction positioning the paper relative to existing literature; no quantitative literature review or citation counts reported in the excerpt.
Developers and experts still lack a shared view, resulting in repeated coordination, clarification rounds, and error-prone handoffs.
Observational/qualitative claim in paper describing current MSD practice (no numeric sample reported).
high negative LLM-Powered Workflow Optimization for Multidisciplinary Soft... frequency of coordination rounds / error-prone handoffs
Even with AI coding assistants like GitHub Copilot, individual coding tasks are semi-automated, but the workflow connecting domain knowledge to implementation is not.
Qualitative observation/comparative statement in paper (no empirical sample reported).
high negative LLM-Powered Workflow Optimization for Multidisciplinary Soft... degree of automation of coding tasks vs. end-to-end workflow automation
Multidisciplinary Software Development (MSD) requires domain experts and developers to collaborate across incompatible formalisms and separate artifact sets.
Conceptual/argument in paper framing the problem (no empirical sample reported).
high negative LLM-Powered Workflow Optimization for Multidisciplinary Soft... collaboration/workflow efficiency between domain experts and developers
Strict data sovereignty laws fragment regional collaboration between African Union member states and hinder AI development.
Stated in the paper as a policy barrier; supported by the authors' policy review of data sovereignty rules and their implications for cross-border data sharing.
high negative Take the Train: Africa at the Crossroad of Modern AI regional collaboration for AI development
Restricted cloud access due to payment system mismatches and volatile exchange rates is a barrier to AI adoption in Africa.
Claim made in the paper as part of the list of barriers; based on the authors' qualitative and quantitative review and reference to policy/financial constraints across African countries.
high negative Take the Train: Africa at the Crossroad of Modern AI cloud access for AI developers
Important barriers include limited access to high-performance computing (HPC).
Paper identifies limited HPC access as a key barrier; supported by the authors' collection and consolidation of HPC availability data via the Africa AI Compute Tracker (ACT).
high negative Take the Train: Africa at the Crossroad of Modern AI access to high-performance computing (HPC)
Africa's participation in modern AI development is constrained by severe infrastructural and policy gaps.
Stated as a central argument in the paper; supported by the paper's synthesis of qualitative and quantitative evidence and reference to official declarations on AI adoption across the continent.
high negative Take the Train: Africa at the Crossroad of Modern AI Africa's participation in modern AI development
Only 12% of estimated AI market value maps to physical activities.
Descriptive aggregate: authors categorize and report that 12% of estimated AI market value maps to physical activities.
high negative Where can AI be used? Insights from a deep ontology of work ... share of AI market value by activity type (physical)
Off-the-shelf implementations of DRL have seen mixed success, often plagued by high sensitivity to the hyperparameters used during training.
Statement in the paper's abstract describing observed/prior performance issues with standard DRL implementations; implies literature/empirical experience but no specific experiment/sample given in the abstract.
high negative DeepStock: Reinforcement Learning with Policy Regularization... sensitivity of DRL performance to hyperparameter choices (resulting in mixed suc...
Coal-based energy consumption structure and a secondary-industry-dominated industrial structure significantly inhibit regional TFCP and have strong negative spatial spillovers.
Control-variable coefficients from Spatial Durbin Model on panel data (30 provinces, 2010–2023) showing statistically significant negative direct and indirect effects for coal-dominant energy structure and secondary-industry share.
high negative Study on the impact of industrial intelligence and the digit... total factor carbon productivity (TFCP)
Applying them to hardware-in-the-loop (HIL) embedded and Internet-of-Things (IoT) systems remains challenging due to the tight coupling between software logic and physical hardware behavior; code that compiles successfully may still fail when deployed on real devices because of timing constraints, peripheral initialization requirements, or hardware-specific behaviors.
Conceptual/engineering reasoning stated in the paper describing known HIL/IoT failure modes (no experimental quantification provided in this excerpt).
high negative Skilled AI Agents for Embedded and IoT Systems Development code failure / runtime correctness when deployed to hardware
The most vulnerable occupational groups to AI-driven transformation are office workers, data entry operators, call center workers, accountants, and administrative staff with routine analytical and administrative tasks.
Results of the envelope-model assessment for the sampled European Union countries that identify occupations with high exposure/vulnerability to AI-driven change; occupations are listed explicitly in the paper.
high negative Artificial intelligence as a driver of economic growth: Chal... vulnerability / exposure to AI-driven job displacement
AI appears to be a diffusing technology, not an emerging occupation.
Synthesis of empirical findings: presence of a shared vocabulary but lack of a coherent practitioner population in resume data, interpreted as diffusion of AI skills/vocabulary across existing roles.
high negative NLP Occupational Emergence Analysis: How Occupations Form an... status of AI as technology diffusion versus occupation formation
Across heterogeneous learners, a common broadcast curriculum can be slower than personalized instruction by a factor linear in the number of learner types.
Theoretical comparative result in the model (analysis of broadcast vs personalized curricula across heterogeneous learner types; abstract states factor linear in number of types).
high negative A Mathematical Theory of Understanding speed of instruction / time to learn under broadcast curriculum vs personalized ...
The findings provide evidence against cue-based accounts of lie detection more generally.
Authors' interpretation: because lie-detection accuracy did not decrease despite changes to visual cues (retouching, backgrounds, avatars), the results challenge theories that rely on superficial cues for lie detection.
high negative Through the Looking-Glass: AI-Mediated Video Communication R... validity of cue-based accounts of lie detection
Participants' confidence in their judgments declined in AI-mediated videos, particularly when some participants used avatars while others did not.
Experimental comparisons across conditions with varying levels of AI mediation; subgroup/condition contrast highlighting larger declines in mixed-avatar settings.
high negative Through the Looking-Glass: AI-Mediated Video Communication R... participants' confidence in their lie-detection judgments
Perceived trust in speakers declined in AI-mediated videos.
Experimental results from the two preregistered online experiments comparing perceived trust across varying levels of AI mediation (retouching, background replacement, avatars).
high negative Through the Looking-Glass: AI-Mediated Video Communication R... perceived trust in speakers
AI-based tools that mediate, enhance or generate parts of video communication may interfere with how people evaluate trustworthiness and credibility.
Motivating claim stated in the paper's introduction/abstract; not an empirical finding but a hypothesis motivating the experiments.
high negative Through the Looking-Glass: AI-Mediated Video Communication R... evaluation of trustworthiness and credibility (general)
AI adoption faces critical obstacles originating from digital illiteracy, poor Internet access, excessive application costs, and the rural-to-urban divide.
Survey findings and interview themes from the mixed-methods study (survey n=293; interviews n=12) identifying barriers to AI adoption.
Users still had concerns about how AI credit assessments and chatbots operate.
Qualitative interview data (n=12) and/or survey responses (n=293) reporting user concerns about AI credit scoring and chatbots.
high negative The Impact of Artificial Intelligence on Financial Inclusion... user concerns / trust regarding AI credit assessments and chatbots
Compositional spatial reasoning remains a formidable challenge for state-of-the-art VLMs (as revealed by our evaluation).
Empirical results from the evaluation of the 37 VLMs on the MultihopSpatial benchmark showing poor performance on multi-hop/compositional queries.
high negative MultihopSpatial: Multi-hop Compositional Spatial Reasoning B... performance on compositional/multi-hop spatial reasoning tasks
Existing benchmarks predominantly focus on elementary, single-hop relations and neglect multi-hop compositional spatial reasoning and precise visual grounding needed for real-world scenarios.
Literature/benchmark survey and motivation presented by the authors comparing characteristics of prior benchmarks vs. the proposed needs.
high negative MultihopSpatial: Multi-hop Compositional Spatial Reasoning B... scope/complexity of spatial reasoning tasks in existing benchmarks
Adoption barriers exist, particularly for small and medium-sized enterprises and firms in emerging economies, where capability and data constraints limit impact.
Findings reported from the systematic review and mixed-methods assessment (abstract references barriers observed across reviewed studies); number of studies reported in abstract is 104 for the systematic review.
high negative Artificial intelligence as a catalyst for the circular econo... adoption barriers / limitations to AI impact (capability and data constraints)
AI can initially exacerbate distributional injustice.
Dimension-level analysis indicating negative (or initially negative) effects of AI on the distributional component of the energy justice index.
high negative Artificial intelligence adoption for advancing energy justic... distributional justice component of energy justice index
There are few integrated frameworks (bridging ethics and technical controls) in the current AI governance landscape.
Result of the literature review and cluster analysis showing limited coverage of frameworks that integrate ethical principles with auditable technical controls.
high negative AI Governance Risk Tiering for Sustainable Digital Infrastru... prevalence of integrated governance frameworks
Findings reveal a fragmented landscape dominated by ethics/privacy-centric and compliance/risk-focused approaches.
Synthesis of the reviewed literature and results of PCA/k-means clustering indicate thematic dominance of ethics/privacy and compliance/risk orientations across frameworks.
high negative AI Governance Risk Tiering for Sustainable Digital Infrastru... dominant thematic focus of governance frameworks
Significant limitations emerged in case law citations, with most cited cases being non-existent or incorrectly referenced.
Authors' review of the case citations produced by the four AI engines for the single transcript, finding many citations were fabricated or misreferenced.
high negative Robot Wingman: Using AI to Assess an Employment Termination accuracy of case law citations (error rate / hallucination rate)
These findings uncover critical threats to judicial integrity and public trust and underscore the urgent need for robust safeguards against non-legal influences in AI legal systems.
Interpretation/conclusion drawn from the empirical results (observed deviations, sentiment amplification, and subgroup vulnerabilities).
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... potential impact on judicial integrity and public trust (qualitative/inferential...
These safety risks are compounded for emotionally charged topics.
Subgroup analyses where emotionally charged case topics showed larger deviations and stronger effects from injected sentiment.
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... change in deviation/amplification of model outputs for emotionally charged topic...
These safety risks are stronger for cases involving low-skilled occupational categories.
Subgroup analyses reported in the paper showing larger model deviations and/or greater sentiment amplification effects for cases involving low-skilled occupations.
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... interaction effect: deviation/amplification magnitude by occupational skill leve...
The sentiment-induced divergences lead to unstable and often inflated compensation predictions by the models.
Analysis of model-predicted compensation amounts under sentiment perturbations showing increased variability and upward bias compared to CJOL amounts.
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... predicted compensation amounts (inflation and instability) from LLMs versus CJOL...
Public opinion (social media sentiment) substantially amplifies deviations between LLM outputs and real rulings.
Stress-test experiments in which injected social media sentiment increased the divergence of model outputs from CJOL judgments across the sample.
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... change in deviation between LLM outputs and CJOL rulings when social media senti...
Models exhibit inherent deviations from real rulings.
Empirical comparison of LLM outputs to CJOL judgments showing systematic differences (based on the paper's reported comparisons across the dataset).
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... magnitude and frequency of deviations between LLM outputs and actual court judgm...
GDP growth is initially negatively affected by the ageing population.
Estimated negative association reported in panel threshold regressions using provincial panel data (31 provinces, 2000–2022); the primary specification uses an ageing measure, and the paper also tests the old-age dependency ratio.