The Commonplace

Evidence (2215 claims)

Adoption: 5126 claims
Productivity: 4409 claims
Governance: 4049 claims
Human-AI Collaboration: 2954 claims
Labor Markets: 2432 claims
Org Design: 2273 claims
Innovation: 2215 claims
Skills & Training: 1902 claims
Inequality: 1286 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 369 105 58 432 972
Governance & Regulation 365 171 113 54 713
Research Productivity 229 95 33 294 655
Organizational Efficiency 354 82 58 34 531
Technology Adoption Rate 277 115 63 27 486
Firm Productivity 273 33 68 10 389
AI Safety & Ethics 112 177 43 24 358
Output Quality 228 61 23 25 337
Market Structure 105 118 81 14 323
Decision Quality 154 68 33 17 275
Employment Level 68 32 74 8 184
Fiscal & Macroeconomic 74 52 32 21 183
Skill Acquisition 85 31 38 9 163
Firm Revenue 96 30 22 148
Innovation Output 100 11 20 11 143
Consumer Welfare 66 29 35 7 137
Regulatory Compliance 51 61 13 3 128
Inequality Measures 24 66 31 4 125
Task Allocation 64 6 28 6 104
Error Rate 42 47 6 95
Training Effectiveness 55 12 10 16 93
Worker Satisfaction 42 32 11 6 91
Task Completion Time 71 5 3 1 80
Wages & Compensation 38 13 19 4 74
Team Performance 41 8 15 7 72
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 17 15 9 5 46
Job Displacement 5 28 12 45
Social Protection 18 8 6 1 33
Developer Productivity 25 1 2 1 29
Worker Turnover 10 12 3 25
Creative Output 15 5 3 1 24
Skill Obsolescence 3 18 2 23
Labor Share of Income 7 4 9 20
Filter: Innovation
AI development proceeds not through smooth advancement but through extended periods of stasis interrupted by rapid phase transitions that reorganize the competitive landscape (punctuated equilibrium pattern).
Argument by analogy to punctuated equilibrium theory in evolutionary biology, supported by the paper's historical analysis, which identifies discrete transitions in AI history and classifies eras and events as evidence.
high negative Punctuated Equilibria in Artificial Intelligence: The Instit... pattern of AI development (stasis vs. phase transitions)
Progress in agentic AI systems that generate and optimize GPU kernels is constrained by benchmarks that reward speedup over software baselines rather than proximity to hardware-efficient execution.
Author argument/observation in paper (conceptual claim about limitations of existing benchmarks); no empirical sample or experiment reported in the provided text.
high negative SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... benchmark_alignment_with_hardware_efficiency
The gap between informal natural language requirements and precise program behavior (the 'intent gap') has always plagued software engineering, but AI-generated code amplifies it to an unprecedented scale.
Conceptual claim and argumentation in the paper; presented as an observed escalation in the scale of the existing 'intent gap' due to AI code generation. No quantitative evidence or sample size given in the excerpt.
high negative Intent Formalization: A Grand Challenge for Reliable Coding ... mismatch between intended and actual program behavior (intent gap) / resulting c...
The crowding-out effect of AI washing on green innovation is heterogeneous: private enterprises, small and medium-sized enterprises (SMEs), and firms in highly competitive sectors suffer more severe negative impacts.
Subgroup/heterogeneity analysis reported in the paper on the same sample of Chinese A-share listed companies (2006–2024); abstract identifies private firms, SMEs, and firms in highly competitive industries as more affected.
high negative The Spillover Effects of Peer AI Rinsing on Corporate Green ... green innovation (heterogeneous treatment effects across firm types and industri...
The negative relationship between AI washing and green innovation is transmitted through dual channels in both product and capital markets.
Mechanism analysis reported in the paper (presumably mediation or channel analysis) using the same dataset of Chinese A-share firms' annual reports and firm-level market data; abstract states product- and capital-market channels convey the crowding-out effect.
high negative The Spillover Effects of Peer AI Rinsing on Corporate Green ... green innovation (via product-market and capital-market channels)
Corporate AI washing exerts a significant crowding-out effect on green innovation.
Empirical analysis of Chinese A-share listed companies (2006–2024), using semantic measures of 'AI washing' derived from large language model (LLM) analysis of annual reports; the paper reports a statistically significant negative relationship between AI washing and firms' green innovation (the abstract does not detail the regression models).
The capital-output elasticity dropped significantly, from 0.42 in 2010–2015 to 0.35 in 2016–2022.
Estimated from an extended Cobb–Douglas production function applied to China's economy over 2010–2022, with period split 2010–2015 vs 2016–2022 (as reported in the study summary).
high negative Analysis of China's Economic Growth Drivers: An Empirical St... capital-output elasticity (elasticity of output with respect to capital)
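For reference, the capital-output elasticity in a Cobb–Douglas setting can be written as below. This is the textbook log-linear form; the summary does not give the study's "extended" specification, so any additional regressors are shown only as an unspecified term $Z_t$ (an assumption, taken here not to depend on $K_t$).

```latex
\ln Y_t = \ln A_t + \alpha \ln K_t + \beta \ln L_t + Z_t,
\qquad
\alpha = \frac{\partial \ln Y_t}{\partial \ln K_t}
```

Read this way, $\alpha = 0.42$ means a 1% increase in capital is associated with roughly 0.42% more output, other inputs held fixed, and $\alpha = 0.35$ with roughly 0.35%.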
Securitization of economic dependencies—especially in strategic sectors (semiconductors, telecoms, cloud)—frames partner states as security risks and exposes them to blacklists, de-risking campaigns, and sudden loss of market access.
Process tracing of export controls and blacklisting episodes; chronologies of sanction/policy actions affecting firms and partners; policy documents and public lists (e.g., export-control lists). (Data sources: export-control lists, sanction policy documents, corporate/access denials; sample sizes not specified.)
high negative China-US Trade War and the Challenges for Developing Countri... incidence of blacklisting/sanctions affecting partners, sudden changes in market...
Large-scale AI models have significant energy and resource costs, creating a notable environmental footprint that must be addressed.
Narrative integration of prior empirical studies measuring compute, energy consumption, and embodied emissions of large models (cited literature); the review does not present new quantitative measurements itself.
high negative The Evolution and Societal Impact of Artificial Intelligence... energy consumption, carbon emissions, and resource use associated with large-sca...
As AI is deployed in safety-critical domains, reliability, regulation, and human-oriented system design become essential to avoid harms.
Review of literature on safety-critical systems, human–machine interaction studies, and regulatory policy discussions; the paper reports this as a consensus implication rather than presenting new empirical tests.
high negative The Evolution and Societal Impact of Artificial Intelligence... system reliability/safety and risk of harm in safety-critical deployments
AI‑enabled platforms can magnify winner‑takes‑most dynamics in digital services trade, concentrating market power.
Theoretical and empirical literature on network effects and platform markets reviewed in the paper; illustrative examples (no novel empirical aggregation).
high negative Analysis of Digital Services Trade and Export Competitivenes... market concentration / competition in digital services
Current data governance regimes in China can impede cross‑border data flows.
Comparative policy analysis and literature documenting data localization and privacy/regulatory regimes that restrict flows (descriptive evidence in the review).
high negative Analysis of Digital Services Trade and Export Competitivenes... volume/feasibility of cross‑border data flows
Institutional barriers—fragmented international rules on data flows and privacy, regulatory divergence including data localization, weak participation in multilateral rule setting, and uneven domestic regulation of platforms—impede digital services trade.
Comparative policy analysis and literature review, supported by policy documents and case examples (qualitative evidence; no original econometric tests).
high negative Analysis of Digital Services Trade and Export Competitivenes... cross‑border digital services trade / export competitiveness
A key architectural risk is interoperability failure and fragmentation across vendors and protocols in agent ecosystems.
Comparative analysis with IoT and other platform histories showing vendor/protocol fragmentation; argument is conceptual and illustrative rather than empirically measured for future agent ecosystems.
high negative The Internet of Physical AI Agents: Interoperability, Longev... degree of interoperability and fragmentation across vendors/protocols
Affected domains such as disaster response, healthcare, industrial automation, and mobility are safety‑critical: failures there carry high social and economic costs.
Domain examples and policy reasoning; draws on general knowledge about those sectors and potential harms; no new empirical damage quantification provided in the paper.
high negative The Internet of Physical AI Agents: Interoperability, Longev... social and economic costs of failures in safety‑critical domains
IoT digitized perception at scale but exposed limitations such as fragmentation, weak security, limited autonomy, and poor sustainability.
Historical and comparative analysis of IoT deployments and literature cited illustratively in the paper; qualitative evidence from prior IoT incidents and ecosystem studies rather than new empirical data.
high negative The Internet of Physical AI Agents: Interoperability, Longev... levels of fragmentation, security robustness, autonomy, and sustainability in Io...
Adoption requires hardware (VR headsets, capable GPUs) and integration effort, implying upfront capital expenditure for labs/observatories.
Paper explicitly notes hardware requirements (VR headsets, capable GPUs) and integration effort as part of adoption considerations; common-sense assessment of required capital.
high negative iDaVIE v1.0: A virtual reality tool for interactive analysis... upfront capital expenditure and integration effort required for adoption
Current models heavily rely on large static datasets and batch training and exhibit poor lifelong/continual learning.
Synthesis of common practices in contemporary ML (supervised pretraining and offline training paradigms); no new experiments provided.
high negative Why AI systems don't learn and what to do about it: Lessons ... continual learning performance; dependence on dataset size and batch training
HindSight scores are negatively correlated with LLM-judged novelty (Spearman ρ = −0.29, p < 0.01), indicating LLM judges tend to overvalue novel-sounding ideas that do not materialize in the literature.
Reported Spearman correlation between HindSight scores and LLM-judged novelty across the generated ideas; ρ = −0.29 with p < 0.01. Interpretation that LLMs overvalue novel-sounding ideas is drawn from the negative correlation.
high negative HindSight: Evaluating LLM-Generated Research Ideas via Futur... Correlation between HindSight score (downstream impact) and LLM-judged novelty s...
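As a reminder of what the reported statistic measures, here is a minimal sketch of Spearman's ρ computed as the Pearson correlation of rank vectors. The function names and the toy scores are hypothetical and are not taken from the paper; the result on the toy data bears no relation to the reported ρ = −0.29.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical toy data: one downstream-impact score and one
# judged-novelty score per generated idea.
hindsight_scores = [0.9, 0.4, 0.7, 0.2, 0.5]
novelty_scores = [2, 4, 1, 5, 3]
rho = spearman_rho(hindsight_scores, novelty_scores)
```

A negative ρ means ideas ranked higher on one score tend to rank lower on the other; it says nothing by itself about significance, which the paper reports separately as p < 0.01.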
Barriers to adoption include toolchain cost, trace data storage/transfer demands, IP-security concerns when sharing traces, and organizational inertia.
Listed as practical caveats and limitations in the summary; based on the authors' experience and reasoning rather than a quantified study.
high negative ODIN-Based CPU-GPU Architecture with Replay-Driven Simulatio... adoption barriers (cost, storage, security, organizational factors)
Adoption requires up-front investment in tooling and infrastructure for deterministic capture/replay, plus management of large trace data and integration with existing validation/IP/security workflows.
Authors explicitly list these practical caveats in the summary: needs tooling/infrastructure, trace data management, and integration with validation flows and IP/security constraints. (Descriptive claim based on implementation experience; no cost figures provided.)
high negative ODIN-Based CPU-GPU Architecture with Replay-Driven Simulatio... required tooling/infrastructure and trace-data management burden
Static ACLs evaluate deterministic rules that ignore partial execution paths and therefore can only capture a subset of organizational constraints.
Formal argument and examples showing static ACLs map to Policy functions that do not depend on partial_path; illustrative limitations presented.
high negative Runtime Governance for AI Agents: Policies on Paths coverage of organizational constraints by static ACLs (proportion of constraints...
Runtime evaluation imposes additional compute, latency, logging, and engineering costs that increase the marginal cost of deploying agents.
Operational discussion in the paper outlining additional runtime compute and logging requirements; cost implications argued qualitatively; no empirical cost measurements provided.
high negative Runtime Governance for AI Agents: Policies on Paths marginal deployment cost (compute/latency/engineering overhead)
Prompt-level instructions and static access control lists (ACLs) are limited special cases of a more general runtime policy-evaluation framework and cannot, in general, enforce path-dependent rules.
Formalization showing prompt/system messages and static ACLs map to restricted forms of the Policy(agent_id, partial_path, proposed_action, org_state) function; logical proof/argument in the paper and illustrative counterexamples.
high negative Runtime Governance for AI Agents: Policies on Paths ability to detect/enforce path-dependent policy violations (yes/no / coverage of...
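The distinction can be illustrated with a small sketch built around the Policy(agent_id, partial_path, proposed_action, org_state) signature quoted above. Everything else here is an assumption made for illustration: the boolean return type, the `Action` record, the `org_state` layout, and both example rules are hypothetical and are not the paper's formalization.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass(frozen=True)
class Action:
    tool: str       # hypothetical: name of the tool the agent wants to call
    resource: str   # hypothetical: what the tool call touches

# Assumed shape: a policy returns True when the proposed action is allowed.
Policy = Callable[[str, Sequence[Action], Action, dict], bool]

def static_acl(agent_id, partial_path, proposed_action, org_state):
    """Static ACL: the decision ignores partial_path entirely."""
    allowed_tools = org_state["acl"].get(agent_id, set())
    return proposed_action.tool in allowed_tools

def no_send_after_pii_read(agent_id, partial_path, proposed_action, org_state):
    """Path-dependent rule: block outbound email once the path touched PII."""
    touched_pii = any(a.resource in org_state["pii_resources"] for a in partial_path)
    return not (touched_pii and proposed_action.tool == "send_email")

# Hypothetical organization state and execution path.
org_state = {
    "acl": {"agent-1": {"read_db", "send_email"}},
    "pii_resources": {"customers"},
}
path = [Action("read_db", "customers")]
send = Action("send_email", "external")

acl_decision = static_acl("agent-1", path, send, org_state)
path_decision = no_send_after_pii_read("agent-1", path, send, org_state)
fresh_decision = no_send_after_pii_read("agent-1", [], send, org_state)
```

In this toy setup the static ACL permits the email because both tools are individually allowed, while the path-dependent rule denies it only when the earlier read of a PII resource is on the path. That is the kind of constraint a rule with no access to `partial_path` cannot express.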
LLM-based agent behavior is non-deterministic and path-dependent: an agent's safety/compliance risk depends on the entire execution path, not just the current prompt or single action.
Formal/abstract execution model defined in the paper (states, actions, execution paths) and conceptual arguments/illustrative examples showing how earlier states/actions affect later behavior; no large-scale empirical dataset reported.
high negative Runtime Governance for AI Agents: Policies on Paths path-dependent compliance/safety risk (probability of policy violation condition...
Real-world deployment will require representative data coverage and online adaptation despite the method’s robustness mechanisms.
Authors' discussion/limitations section: theoretical requirements for persistently exciting/representative trajectories for DeePC and recommendation for online adaptation and continual data collection for deployment.
high negative Data-driven generalized perimeter control: Zürich case study data representativeness and need for online adaptation (deployment readiness/ris...
None of the 13 systems report end-to-end evaluation on real quantum hardware (Layer 3b).
Systematic check of reported experiments for each of the 13 systems found no documented real-device, end-to-end hardware execution results (explicit Layer 3b reporting was absent).
high negative Generative AI for Quantum Circuits and Quantum Code: A Techn... presence/absence of real-device end-to-end hardware execution reporting
Regulators and payers remain central bottlenecks—AI can accelerate discovery but cannot bypass clinical evidence requirements.
Policy discussion and regulatory analysis in the paper noting that approvals require clinical evidence independent of discovery modality.
high negative Has AI Reshaped Drug Discovery, or Is There Still a Long Way... regulatory and payer requirements as constraints on the impact of AI-driven disc...
Downstream clinical development costs and translational failure rates remain the major drivers of total R&D expenditure; early-stage AI savings may not translate into proportionate increases in approved drugs.
Economic analysis and discussion in the paper referencing known cost distributions in drug development and historical attrition rates in clinical phases.
high negative Has AI Reshaped Drug Discovery, or Is There Still a Long Way... contribution of clinical development costs and failure rates to total R&D expend...
Inherent biological complexity and translational gaps between in silico predictions, preclinical models, and human biology constrain downstream success rates.
Review of translational failures and literature cited in the paper demonstrating mismatch between preclinical signals and clinical outcomes; conceptual analysis of biological complexity.
high negative Has AI Reshaped Drug Discovery, or Is There Still a Long Way... translational success rate from preclinical predictions to clinical efficacy
Gaps exist between computational designs and chemical/experimental feasibility (e.g., synthetic accessibility and assay readiness), limiting the usefulness of some generative outputs.
Case studies and critiques in the paper showing generated molecules that are synthetically infeasible or incompatible with experimental constraints; discussion of missing integration of practical constraints in many generative models.
high negative Has AI Reshaped Drug Discovery, or Is There Still a Long Way... fraction of computationally designed molecules that are synthetically accessible...
Many models have limited interpretability and insufficient uncertainty quantification, hampering trust and decision-making.
Methodological analysis in the paper noting common deep-learning approaches lacking clear interpretability and uncertainty estimates; references to literature on model explainability and calibration gaps.
high negative Has AI Reshaped Drug Discovery, or Is There Still a Long Way... degree of model interpretability and presence/quality of uncertainty quantificat...
Poor data quality, fragmentation, and limited accessibility reduce model reliability and generalizability.
Survey of data characteristics and limitations presented in the paper; examples of biased or sparse datasets and the paper's discussion of impacts on model performance and transferability.
high negative Has AI Reshaped Drug Discovery, or Is There Still a Long Way... model reliability/generalizability as a function of data quality, coverage, and ...
AI remains an augmenting technology rather than a standalone solution: no AI-only originated drug has yet achieved regulatory approval.
Review of drug-approval records and company disclosures summarized in the paper; explicit statement that to date no entirely AI-originated molecule has received full regulatory approval.
high negative Has AI Reshaped Drug Discovery, or Is There Still a Long Way... regulatory approval status of AI-originated drug candidates (number of approvals...
Ethical and legal issues—patient privacy, algorithmic bias, intellectual property, and equitable access—pose risks to AI deployment in drug development.
Ethics and legal analyses, policy reports, and documented case examples collated in the review that identify these recurring concerns.
high negative From Algorithm to Medicine: AI in the Discovery and Developm... ethical/legal risk incidence; privacy breaches; bias outcomes; access inequities
Regulatory uncertainty about validation standards and liability for AI tools raises investment risk and may slow deployment.
Regulatory and policy reports included in the narrative review describing evolving standards and open questions about validation, explainability, and liability for ML-based tools.
high negative From Algorithm to Medicine: AI in the Discovery and Developm... regulatory clarity; investment risk and deployment timelines
Adoption of AI in drug R&D requires high upfront investment in data curation, compute infrastructure, and specialized talent.
Industry reports and economic analyses summarized in the review reporting capital and operational needs for building AI capabilities; qualitative synthesis rather than quantitative costing across firms.
high negative From Algorithm to Medicine: AI in the Discovery and Developm... fixed upfront costs (data curation, compute, hiring/training)
Limited transparency and interpretability of many AI algorithms (black-box models) complicate clinical and regulatory trust and adoption.
Regulatory reports, methodological critiques, and case examples in the review highlighting interpretability concerns and their impact on clinical/regulatory acceptance.
high negative From Algorithm to Medicine: AI in the Discovery and Developm... clinical/regulatory acceptance, trust, and adoption rates; explainability metric...
Performance of AI models in drug R&D depends on large, high-quality, and representative biomedical datasets; dataset bias or gaps substantially undermine model performance and generalizability.
Methodological literature and case studies cited in the review documenting failures or limited generalization when training data are biased, sparse, or non-representative; thematic synthesis rather than pooled quantification.
high negative From Algorithm to Medicine: AI in the Discovery and Developm... model performance/generalizability across populations and contexts
Predictions from AI depend on data quality and coverage and still require experimental (wet-lab) validation.
Discussion of early failures and limits in case studies and expert observations within the narrative review; methodological argument about dependence of ML models on input data.
high negative Learning from the successes and failures of early artificial... predictive validity of computational models / need for experimental validation
High-quality, standardized, interoperable data (clean, annotated, connected across modalities) is a critical limiting factor for translating AI capability into sustained impact.
Conceptual, domain-knowledge argument in the editorial; no empirical measurement of the causal effect of data quality is included.
high negative AI as the Catalyst for a New Paradigm in Biomedical Research ability to translate AI capability into sustained impact (dependent on data qual...
The paper's evidence base is limited: it draws on early-stage projects with little longitudinal outcome data and depends on publicly available project information, which may be incomplete or biased.
Methods and limitations explicitly stated in the paper (qualitative review; reliance on secondary sources; two case studies; absence of large-scale quantitative evaluation).
high negative Decentralized Autonomous Organizations in the Pharmaceutical... completeness and robustness of empirical evidence supporting claims about DAO ef...
Data protection and privacy (especially sensitive health data) complicate open-data DAO models.
Conceptual analysis referencing privacy/data-protection concerns for health data (e.g., GDPR-like regimes); no empirical evaluation of privacy breaches within DAOs provided.
high negative Decentralized Autonomous Organizations in the Pharmaceutical... data privacy risk level, feasibility of open-data sharing for clinical data
Significant barriers remain for DAOs in pharma: regulatory uncertainty about tokenized securities, IP fractionalization, and clinical data sharing.
Legal/regulatory analysis and literature synthesis highlighting unclear classifications and open regulatory questions; no new regulatory rulings provided.
high negative Decentralized Autonomous Organizations in the Pharmaceutical... regulatory clarity/status for tokenized securities and IP models; legal risk ind...
Pharmaceutical R&D faces rising costs, long approval timelines, supply-chain inefficiencies, and low patient involvement.
Literature review and synthesis of well-documented industry challenges cited in the paper (secondary sources); no new primary data presented in this study.
high negative Decentralized Autonomous Organizations in the Pharmaceutical... R&D cost per approved drug, average time-to-approval, supply-chain performance m...
The black-box nature of many deep learning models undermines scientific interpretability and experimental trust, limiting adoption in materials research.
Cited concerns and methodological papers advocating interpretable architectures and post hoc explanation methods reviewed in the paper; synthesis of community critique.
high negative Machine Learning-Driven R&D of Perovskites and Spinels: From... model interpretability and experimental adoption/trust
Insufficient attention to model reliability—particularly uncertainty miscalibration—reduces real-world utility because experimentalists need reliable confidence estimates, not only point predictions.
Survey of literature on uncertainty estimation and calibration (Bayesian NNs, ensembles, temperature scaling, conformal prediction) and papers reporting calibration issues; recommendations drawn from these sources.
high negative Machine Learning-Driven R&D of Perovskites and Spinels: From... calibration of predictive uncertainties (e.g., calibration error, coverage) and ...
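One of the calibration checks named above, interval coverage, can be sketched in a few lines: compare the fraction of measured values that fall inside the model's predicted intervals with the nominal level those intervals claim. The function name, the 90% nominal level, and the toy numbers are hypothetical and are not drawn from the reviewed papers.

```python
def empirical_coverage(y_true, lower, upper):
    """Fraction of true values lying inside their predicted [lower, upper] interval."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)

# Hypothetical toy data: measured property values and a model's
# nominal 90% prediction intervals for the same samples.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
interval_lower = [0.5, 1.8, 3.2, 3.0, 4.9]
interval_upper = [1.5, 2.2, 3.6, 4.5, 5.5]

nominal_level = 0.90
coverage = empirical_coverage(measured, interval_lower, interval_upper)
coverage_gap = coverage - nominal_level  # negative means intervals are too narrow
```

Here four of the five measurements fall inside their intervals, so empirical coverage sits below the nominal level. This is the miscalibration pattern the claim describes: the stated confidence is higher than the predictions warrant.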
Progress of DL-driven materials discovery is limited by scarcity of high-quality, diverse labeled datasets; small, noisy, or biased datasets limit model generalization.
Review and synthesis of empirical studies and methodological papers documenting dataset size/quality issues and their impact on model performance; no new dataset analysis in this paper.
high negative Machine Learning-Driven R&D of Perovskites and Spinels: From... model generalization / predictive performance on out-of-distribution materials o...
Traditional ESG ratings often suffered from data inconsistency, subjectivity and limited coverage of unstructured sustainability information.
Literature review and sources cited in the paper (e.g., Berg et al. 2022 and other ESG-rating divergence studies). This is presented as established background evidence rather than a new empirical finding in the study.
high negative Green Intelligence in Finance: Artificial Intelligence-Drive... Quality attributes of traditional ESG ratings: data consistency, subjectivity, c...
The article identifies and lays out several concerns regarding the government's approach to regulating AI.
Analytical critique presented in the paper (legal/policy analysis summarizing potential regulatory shortcomings). Based on the author's review and argumentation rather than primary empirical data.
high negative Regulation and governance of artificial intelligence in Indi... adequacy and risks of the government's AI regulatory approach