The Commonplace

Evidence (2340 claims)

Adoption: 5267 claims
Productivity: 4560 claims
Governance: 4137 claims
Human-AI Collaboration: 3103 claims
Labor Markets: 2506 claims
Innovation: 2354 claims
Org Design: 2340 claims
Skills & Training: 1945 claims
Inequality: 1322 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 378 106 59 455 1007
Governance & Regulation 379 176 116 58 739
Research Productivity 240 96 34 294 668
Organizational Efficiency 370 82 63 35 553
Technology Adoption Rate 296 118 66 29 513
Firm Productivity 277 34 68 10 394
AI Safety & Ethics 117 177 44 24 364
Output Quality 244 61 23 26 354
Market Structure 107 123 85 14 334
Decision Quality 168 74 37 19 301
Fiscal & Macroeconomic 75 52 32 21 187
Employment Level 70 32 74 8 186
Skill Acquisition 89 32 39 9 169
Firm Revenue 96 34 22 152
Innovation Output 106 12 21 11 151
Consumer Welfare 70 30 37 7 144
Regulatory Compliance 52 61 13 3 129
Inequality Measures 24 68 31 4 127
Task Allocation 75 11 29 6 121
Training Effectiveness 55 12 12 16 96
Error Rate 42 48 6 96
Worker Satisfaction 45 32 11 6 94
Task Completion Time 78 5 4 2 89
Wages & Compensation 46 13 19 5 83
Team Performance 44 9 15 7 76
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 18 17 9 5 50
Job Displacement 5 31 12 48
Social Protection 21 10 6 2 39
Developer Productivity 29 3 3 1 36
Worker Turnover 10 12 3 25
Skill Obsolescence 3 19 2 24
Creative Output 15 5 3 1 24
Labor Share of Income 10 4 9 23
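In several rows the four direction columns do not sum to the printed Total (e.g., Governance & Regulation: 379 + 176 + 116 + 58 = 729 against a Total of 739), which suggests the Total also counts claims whose direction is not shown in the matrix. A minimal Python check over a few rows transcribed from the table above:

```python
# Residual = Total minus the sum of the four listed direction counts.
# A nonzero residual suggests claims whose direction is not shown here.
rows = {
    # outcome: (positive, negative, mixed, null, total)
    "Governance & Regulation": (379, 176, 116, 58, 739),
    "Firm Productivity": (277, 34, 68, 10, 394),
    "Task Completion Time": (78, 5, 4, 2, 89),
}

def residual(counts):
    *directions, total = counts
    return total - sum(directions)

for outcome, counts in rows.items():
    print(f"{outcome}: residual {residual(counts)}")
```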
Active filter: Org Design
Gaps exist between computational designs and chemical/experimental feasibility (e.g., synthetic accessibility and assay readiness), limiting the usefulness of some generative outputs.
Case studies and critiques in the paper showing generated molecules that are synthetically infeasible or incompatible with experimental constraints; discussion of missing integration of practical constraints in many generative models.
Confidence: high · Direction: negative · Source: Has AI Reshaped Drug Discovery, or Is There Still a Long Way... · Outcome: fraction of computationally designed molecules that are synthetically accessible...
Many models have limited interpretability and insufficient uncertainty quantification, hampering trust and decision-making.
Methodological analysis in the paper noting common deep-learning approaches lacking clear interpretability and uncertainty estimates; references to literature on model explainability and calibration gaps.
Confidence: high · Direction: negative · Source: Has AI Reshaped Drug Discovery, or Is There Still a Long Way... · Outcome: degree of model interpretability and presence/quality of uncertainty quantificat...
Poor data quality, fragmentation, and limited accessibility reduce model reliability and generalizability.
Survey of data characteristics and limitations presented in the paper; examples of biased or sparse datasets and the paper's discussion of impacts on model performance and transferability.
Confidence: high · Direction: negative · Source: Has AI Reshaped Drug Discovery, or Is There Still a Long Way... · Outcome: model reliability/generalizability as a function of data quality, coverage, and ...
AI remains an augmenting technology rather than a standalone solution: no AI-only originated drug has yet achieved regulatory approval.
Review of drug-approval records and company disclosures summarized in the paper; explicit statement that to date no entirely AI-originated molecule has received full regulatory approval.
Confidence: high · Direction: negative · Source: Has AI Reshaped Drug Discovery, or Is There Still a Long Way... · Outcome: regulatory approval status of AI-originated drug candidates (number of approvals...
Predictions from AI depend on data quality and coverage and still require experimental (wet-lab) validation.
Discussion of early failures and limits in case studies and expert observations within the narrative review; methodological argument about dependence of ML models on input data.
Confidence: high · Direction: negative · Source: Learning from the successes and failures of early artificial... · Outcome: predictive validity of computational models / need for experimental validation
When incentive signals depend non-trivially on persistent environmental memory, the resulting dynamics generically cannot be reduced to a static global objective defined solely over the agent state space (i.e., no global potential function over agents exists in the generic case).
A genericity theorem/argument in the paper (mathematical demonstration showing that for nontrivial dependence on environmental memory the closed-loop vector field is, for a generic set of parameterizations, not gradient of any scalar function on agent space).
Confidence: high · Direction: negative · Source: How Intelligence Emerges: A Minimal Theory of Dynamic Adapti... · Outcome: non-existence of a static global objective (potential) over agent state space in...
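The no-potential claim rests on a standard integrability argument, sketched here in generic notation (the symbols x, m, F, G, Φ are illustrative, not the paper's own):

```latex
% Closed-loop dynamics with persistent environmental memory m:
\dot{x} = F(x, m), \qquad \dot{m} = G(x, m)
% A static global objective would require F(x,m) = -\nabla_x \Phi(x),
% which forces the Jacobian of F in x to be symmetric:
\frac{\partial F_i}{\partial x_j} \;=\; \frac{\partial F_j}{\partial x_i}
\quad \text{for all } i, j,
% a condition that fails for a generic parameterization once F depends
% non-trivially on the memory m, so no such \Phi exists generically.
```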
AI notably reduces customer stability in sports enterprises (SE).
Empirical estimation using the DML model on the same panel dataset of 45 Chinese listed SEs (2012–2023); authors report a statistically significant negative effect of AI on customer stability.
Confidence: high · Direction: negative · Source: Can Artificial Intelligence Enhance the Stability of Supply ... · Outcome: customer stability (component of supply chain stability)
The environmental footprint of healthcare systems is growing and persistent inequities in access and outcomes have intensified calls for procurement reform.
Contemporary literature review and synthesis of sector reports and studies documenting healthcare emissions/footprint and health inequities (no original empirical data reported in this paper).
Confidence: high · Direction: negative · Source: Greening the Medicaid Supply Chain: An ESG-Integrated Framew... · Outcome: environmental footprint of healthcare systems; inequities in access and health o...
Human judgment is constrained by bounded rationality, cognitive biases, and information-processing limitations.
Cited as established findings from prior research across decision sciences and related fields (extensive literature evidence referenced; no new empirical data in this paper's abstract).
Confidence: high · Direction: negative · Source: Reframing Organizational Decision-Making in the Age of Artif... · Outcome: human judgment accuracy/quality and cognitive processing capacity
Ireland exhibits the largest gender gap in advanced digital task use: approximately 44% of men versus 18% of women perform advanced digital tasks — a 26 percentage point gap, close to double the European average.
Country-level descriptive statistics from ESJS for Ireland reporting shares of men and women performing advanced digital tasks. (Exact Irish sample size not provided in the excerpt.)
Confidence: high · Direction: negative · Source: Squandered skills? Bridging the digital gender skills gap fo... · Outcome: Share (%) of men and women in Ireland performing advanced digital tasks; gender ...
Across Europe, women are around 15 percentage points less likely than men to perform advanced digital tasks in their jobs.
Empirical analysis of the European Skills and Jobs Survey (ESJS) (Cedefop, 2021) using regression-based estimates and descriptive statistics across European countries. (Exact sample size and country count not provided in the excerpt.)
Confidence: high · Direction: negative · Source: Squandered skills? Bridging the digital gender skills gap fo... · Outcome: Probability / share of workers performing advanced digital tasks (binary indicat...
Of the two regimes that emerge, the inequality-decreasing regime arises when AI behaves like a broadly available commodity technology or when labor-market institutions share rents widely (high ξ).
Model regime characterization and calibrated counterfactuals showing falling wage dispersion and ΔGini under commodity-like AI assumptions or higher rent-sharing elasticity.
Confidence: high · Direction: negative · Source: When AI Levels the Playing Field: Skill Homogenization, Asse... · Outcome: wage dispersion and aggregate inequality (ΔGini)
Generative AI compresses within-task skill differences (reduces dispersion of individual task performance).
Theoretical task-based model and calibrated quantitative simulations (Method of Simulated Moments matching six empirical moments) showing reductions in within-task performance dispersion after introducing AI technology.
Confidence: high · Direction: negative · Source: When AI Levels the Playing Field: Skill Homogenization, Asse... · Outcome: within-task performance dispersion (skill/ability variance within a task)
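Both claims from this paper concern dispersion, so a concrete ΔGini computation helps fix ideas. A minimal sketch; the wage arrays are invented for illustration and are not the paper's calibrated moments:

```python
def gini(values):
    """Gini coefficient via the sorted mean-absolute-difference formula."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    # For sorted data, sum_{i,j} |x_i - x_j| = 2 * sum_i (2i + 1 - n) * x_i
    # with 0-based i, so the coefficient reduces to:
    return sum((2 * i + 1 - n) * x for i, x in enumerate(xs)) / (n * total)

# Illustrative: AI compresses the wage distribution, so the Gini falls.
before = [20, 35, 50, 80, 140]
after = [40, 48, 55, 70, 90]
delta_gini = gini(after) - gini(before)  # negative: inequality decreased
```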
Regulatory uncertainty around blockchain/DeFi for corporate finance and cross-border data rules is a material risk to adoption.
Paper notes regulatory uncertainty as a risk; no jurisdictional legal analysis or compliance case studies provided in the summary.
Confidence: high · Direction: negative · Source: Developing Cloud-Based Financial Solutions for The Engineeri... · Outcome: regulatory clarity (existence of applicable rules, legal enforceability of on-ch...
Cybersecurity and data-privacy concerns arise from cloud provider centralization versus blockchain transparency.
Paper highlights this trade-off in its challenges section; discussion-based evidence rather than quantified security assessment in the summary.
Confidence: high · Direction: negative · Source: Developing Cloud-Based Financial Solutions for The Engineeri... · Outcome: data-privacy risk, exposure due to centralization, privacy vs transparency trade...
Integration complexity with legacy ERPs and heterogeneous vendor ecosystems is a significant implementation challenge.
Paper lists this as a challenge/limitation based on pilot experience and analysis. No quantified measure of integration effort is provided in the summary.
Confidence: high · Direction: negative · Source: Developing Cloud-Based Financial Solutions for The Engineeri... · Outcome: integration complexity (number/types of legacy systems, integration effort/time/...
EPC projects feature milestone-based payments, complex stakeholder flows, and large working-capital needs that strain traditional on-premise ERPs.
Problem context statement presented in the paper; consistent with commonly reported characteristics of EPC projects. The summary does not cite empirical industry-wide data.
Confidence: high · Direction: negative · Source: Developing Cloud-Based Financial Solutions for The Engineeri... · Outcome: operational complexity indicators (payment structure: milestone-based; stakehold...
Reproducibility and deployment gaps are widespread: missing code, inconsistent benchmarks, and insufficient productionization focus (monitoring, model updates, rollback).
Surveyed literature often lacks released code and consistent benchmarks; thematic analysis highlights absence of operational deployment practices.
Confidence: high · Direction: negative · Source: International Journal on Cybernetics & Informatics · Outcome: reproducibility indicators (code availability, benchmark consistency) and deploy...
Common ML pipeline pitfalls include overfitting, poor cross-validation practices, lack of real-time/online evaluation, and inadequate feature engineering.
Critical assessment of experimental practices in the surveyed literature identifying methodological shortcomings that can inflate reported performance.
Confidence: high · Direction: negative · Source: International Journal on Cybernetics & Informatics · Outcome: validity/reliability of reported model performance
There is a lack of large, labeled, realistic IoT datasets; class imbalance, concept drift, dataset bias, and synthetic datasets that poorly reflect real traffic are common problems.
Review of datasets (N-BaIoT, Bot-IoT, TON_IoT, UNSW-NB15, KDD variants, custom/synthetic datasets) and critical assessment of their limitations across studies.
Confidence: high · Direction: negative · Source: International Journal on Cybernetics & Informatics · Outcome: dataset quality and representativeness; labeling availability
Resource constraints (limited CPU, memory, energy, and network bandwidth on devices and edge nodes) significantly limit feasible ML model complexity and deployment choices.
Multiple surveyed studies report hardware constraints and evaluate runtime/memory/latency; survey synthesizes these resource limitations as a recurring challenge.
Confidence: high · Direction: negative · Source: International Journal on Cybernetics & Informatics · Outcome: resource usage (CPU, memory, energy) and feasible model complexity
Despite high reported detection accuracies in academic work, there is a shortage of production-grade, deployable ML-IDS for IoT.
Critical review of surveyed papers showing many report lab metrics but few report deployment case studies, production rollouts, or provide deployment artifacts (code, runtime/energy measurements).
Confidence: high · Direction: negative · Source: International Journal on Cybernetics & Informatics · Outcome: deployment readiness/production adoption
Limitations of the review include restricted sample size, Scopus-only coverage, emergent-literature timeframe, and heterogeneity in study designs and measures, which constrain generalizability.
Authors' limitations subsection explicitly listing these constraints from their SLR process.
Confidence: high · Direction: negative · Source: Pricing Strategy in Digital Marketing: A Systematic Review o... · Outcome: Generalisability and completeness of the review's conclusions
There has been insufficient attention in the literature to ethics, fairness, and consumer welfare in algorithmic pricing.
Persistent gap identified in the SLR—few or no included studies focused on ethics/fairness/welfare issues according to authors' coding.
Confidence: high · Direction: negative · Source: Pricing Strategy in Digital Marketing: A Systematic Review o... · Outcome: Coverage of ethics/fairness/consumer welfare topics in digital pricing literatur...
Existing empirical studies on digital VBP exhibit methodological limitations, including small/limited samples, short time windows, and inconsistent measures.
Authors' methodological critique from the SLR based on assessment of study designs and measures reported in the 30 articles.
Confidence: high · Direction: negative · Source: Pricing Strategy in Digital Marketing: A Systematic Review o... · Outcome: Methodological rigor and validity of existing digital VBP studies
The evidence base is skewed toward pilots and high‑performer contexts; there is a lack of long‑panel, multi‑project longitudinal studies to validate typical returns and scalability.
Authors' assessment of evidence types in the 160 studies: mix of conceptual papers, case studies, pilots, and only limited larger empirical evaluations.
Confidence: high · Direction: negative · Source: Digital Twins Across the Asset Lifecycle: Technical, Organis... · Outcome: representativeness and longitudinal robustness of evidence
Empirical evaluation of integrated defenses, quantitative cost/benefit analyses, and standardized threat models for VR are research gaps that remain unaddressed in the literature window surveyed (2023–2025).
Authors' stated limitations from their comparative literature review of 31 studies noting an absence of primary empirical validation and quantitative economic analyses in the reviewed corpus.
Confidence: high · Direction: negative · Source: Securing Virtual Reality: Threat Models, Vulnerabilities, an... · Outcome: presence/absence of empirical validation, cost‑benefit studies, and standard thr...
Immersive VR systems collect continuous multimodal signals (motion tracking, gaze, voice, biometrics) that enable novel inference, spoofing, and manipulation attacks beyond traditional IT threats.
Synthesis of threat descriptions across the 31 reviewed peer‑reviewed studies (2023–2025) documenting sensor modalities and attack vectors; qualitative comparative evaluation of attack surfaces.
Confidence: high · Direction: negative · Source: Securing Virtual Reality: Threat Models, Vulnerabilities, an... · Outcome: existence and extent of expanded attack surface due to multimodal signal collect...
Mean emotional self-alignment between poster and responder is 32.7%, indicating systematic affective mismatch rather than congruence.
Pairwise comparison of emotion labels across post–response pairs in the dataset; computation of mean percentage where poster and immediate responder share the same emotion (32.7%).
Confidence: high · Direction: negative · Source: What Do AI Agents Talk About? Emergent Communication Structu... · Outcome: percentage of post–response pairs with identical emotion labels (emotional self-...
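The self-alignment metric here is simply the share of post–response pairs whose emotion labels match. A minimal sketch with invented labels (not the Moltbook data):

```python
def emotional_self_alignment(pairs):
    """Fraction of (poster_emotion, responder_emotion) pairs that match."""
    if not pairs:
        return 0.0
    return sum(post == resp for post, resp in pairs) / len(pairs)

# Invented example: 1 of 4 responses mirrors the poster's emotion,
# so alignment is 0.25 (the paper reports 32.7% on its dataset).
pairs = [
    ("joy", "joy"),
    ("anger", "joy"),
    ("fear", "sadness"),
    ("joy", "surprise"),
]
alignment = emotional_self_alignment(pairs)
```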
Conversational coherence declines rapidly with thread depth, indicating shallow, weakly connected multi-turn exchanges.
Lexical-semantic coherence metrics (e.g., embedding-based similarity) computed across comment threads of varying depth in the Moltbook dataset; observed rapid decrease in coherence scores as thread depth increases.
Confidence: high · Direction: negative · Source: What Do AI Agents Talk About? Emergent Communication Structu... · Outcome: coherence (similarity) metric as a function of thread depth
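Depth-wise coherence can be sketched with any sentence-embedding similarity; here a bag-of-words cosine stands in for the embedding model, an assumption since the excerpt does not name the exact metric or model:

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def coherence_by_depth(thread):
    """Mean similarity to the root post, grouped by comment depth.

    `thread` is a list of (depth, text) tuples; depth 0 is the root.
    """
    root = Counter(thread[0][1].lower().split())
    scores = {}
    for depth, text in thread[1:]:
        scores.setdefault(depth, []).append(cosine(root, Counter(text.lower().split())))
    return {d: sum(v) / len(v) for d, v in sorted(scores.items())}

# Toy thread: replies drift off-topic, so coherence falls with depth.
thread = [
    (0, "agents debate memory architectures for planning"),
    (1, "memory architectures matter for planning agents"),
    (2, "planning is hard without memory"),
    (3, "i like pancakes"),
]
```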
When pipelines have cross-cutting ties, prices oscillate, allocation quality drops, and management becomes difficult.
Empirical simulation results from the ablation study: configurations with non-hierarchical, cross-cutting graph structures produced larger price volatility, frequent oscillations in price updates, and lower allocation value/throughput compared to hierarchical graphs (measured across many runs and random seeds within the 1,620-run experimental set).
Confidence: high · Direction: negative · Source: Real-Time AI Service Economy: A Framework for Agentic Comput... · Outcome: price volatility and oscillation frequency; allocation quality (value/throughput...
On the 22 postdating (contamination-free) incidents, no agent achieved end-to-end exploitation success across all 110 agent–incident pairs evaluated.
Empirical evaluation of 110 agent–incident pairs reported in the study (end-to-end exploit attempts on the 22 incidents).
Confidence: high · Direction: negative · Source: Re-Evaluating EVMBench: Are AI Agents Ready for Smart Contra... · Outcome: end_to_end_exploitation_success_rate (per_agent_per_incident)
The original EVMbench had a data contamination risk because it relied on audit-contest data published before every evaluated model's release, which could have been seen during model training.
Timing relationship between the audit-contest dataset used by EVMbench and the release dates of evaluated models (dataset predated model releases).
Confidence: high · Direction: negative · Source: Re-Evaluating EVMBench: Are AI Agents Ready for Smart Contra... · Outcome: dataset_contamination_risk (potential_training_data_leakage)
The original EVMbench evaluation was narrow: it evaluated 14 agent configurations and most models were tested only with their vendor-provided scaffold.
Description of the original EVMbench experimental setup (number of agent configurations and scaffold usage) cited in this study.
Confidence: high · Direction: negative · Source: Re-Evaluating EVMBench: Are AI Agents Ready for Smart Contra... · Outcome: evaluation_breadth (number_of_agent_configurations; scaffold_variety)
Limitations of the study include reliance on self-reported perceptions (subject to response and survivorship bias), lack of experimental/causal identification, potential non-representative sample, and cross-sectional design limiting inference about long-term productivity effects.
Authors' stated limitations in the paper summary.
Confidence: high · Direction: negative · Source: Artificial Intelligence as a Catalyst for Innovation in Soft... · Outcome: validity threats (self-report bias, lack of causal design) as reported by author...
The current bottleneck is that disparate quantum and classical resources operate in isolation, forcing manual job orchestration, inefficient scheduling, and data-movement overheads; the resulting slow iteration limits productivity and algorithmic exploration.
Use-case-driven analysis and observations from early hybrid deployments and literature; systems design decomposition highlighting latency and data-staging requirements; no quantitative benchmark data.
Confidence: high · Direction: negative · Source: Reference Architecture of a Quantum-Centric Supercomputer · Outcome: developer/researcher productivity, iteration latency, scheduling and data-transf...
Improving explainability can trade off with predictive performance, privacy, and robustness; these trade-offs must be managed rather than ignored.
Review aggregates technical literature and conceptual analyses documenting trade-offs reported by researchers (e.g., simpler interpretable models sometimes having lower predictive accuracy; disclosure risks to privacy; robustness concerns). No single causal estimate provided.
Confidence: high · Direction: negative · Source: Explainable AI in High-Stakes Domains: Improving Trust, Tran... · Outcome: predictive performance, privacy risk, model robustness
The evidence base presented is limited to a single SME pilot, so generalizability across sectors, firm sizes, and data regimes is untested and requires further research.
Explicit limitation noted in the paper and the fact that the pilot illustrated is a single case study (sample size = 1 SME pilot).
Confidence: high · Direction: negative · Source: ALGORITHM FOR IMPLEMENTING AI IN THE MANAGEMENT LOOP OF SMES... · Outcome: external validity / generalizability of results beyond the single pilot
Tasks that are routine, repetitive, or pattern‑based (e.g., boilerplate coding, refactoring, unit test generation, some accessibility fixes) will be increasingly automated by AI.
Task‑level decomposition and examples of current automation capabilities (code generation, test suggestion tools); conceptual projection rather than empirical measurement.
Confidence: high · Direction: negative · Source: How AI Will Transform the Daily Life of a Techie within 5 Ye... · Outcome: rate of automation for routine software development tasks (proportion of such ta...
Common barriers to effective RM implementation include siloed functions/weak coordination, limited resources or expertise, poor data quality/lack of metrics, and cultural resistance driven by short-term incentives.
Frequent identification of these barriers across the reviewed literature and practitioner sources synthesized via thematic analysis over the last ten years.
Confidence: high · Direction: negative · Source: The Role of Risk Management as an Organizational Management ... · Outcome: barriers to RM adoption/implementation; likelihood of successful RM
Hierarchy compresses: fewer organizational layers are needed for a given firm output as coordination costs fall.
Analytical proposition in the theoretical model and simulation results showing reduced number of layers under coordination compression.
Confidence: high · Direction: negative · Source: AI as Coordination-Compressing Capital: Task Reallocation, O... · Outcome: number of hierarchical layers per firm
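A stylized way to see the compression (an illustration of the general idea, not the paper's model): if each manager can effectively coordinate `span` direct reports, a firm of N workers needs roughly ceil(log_span N) management layers, so a larger feasible span (cheaper coordination) means fewer layers for the same output.

```python
from math import ceil, log

def layers_needed(n_workers, span):
    """Management layers above the worker level when each manager can
    coordinate `span` direct reports (stylized, not the paper's model)."""
    if n_workers <= 1:
        return 0
    return ceil(log(n_workers, span))

# As AI lowers coordination costs, the feasible span rises and the
# hierarchy compresses for the same firm size.
for span in (4, 8, 16):
    print(span, layers_needed(1000, span))
```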
Heterogeneity in study designs and contexts within the literature limits direct comparability and generalizability of findings.
Limitation noted in the paper based on the authors' assessment of diversity across the 103 reviewed studies (varying methods, contexts, metrics).
Confidence: high · Direction: negative · Source: Models, applications, and limitations of the responsible ado... · Outcome: comparability/generalizability of evidence across studies
Institutional inertia, fragmented governance structures, limited technical capacity, and weak data stewardship impede scale‑up of AI systems in the public sector.
Thematic synthesis of barriers reported across empirical studies and institutional reports within the systematic review (103 items).
Confidence: high · Direction: negative · Source: Models, applications, and limitations of the responsible ado... · Outcome: ability to scale AI systems / scale‑up rate
Low‑ and middle‑income contexts face persistent gaps—infrastructure, data ecosystems, and talent retention—that slow AI adoption in public governance.
Consistent findings across multiple studies in the 103‑item corpus reporting infrastructure deficits, weak data ecosystems, and brain drain/retention issues in LMIC settings.
Confidence: high · Direction: negative · Source: Models, applications, and limitations of the responsible ado... · Outcome: rate/extent of AI adoption in public governance in low- and middle‑income contex...
On-Premise RAG requires internal technical capabilities (MLOps, infrastructure engineers) to maintain and update the system.
Organizational evaluation and implementation discussion noting operational responsibilities and skill requirements for on-prem deployment.
Confidence: high · Direction: negative · Source: An Empirical Study on the Feasibility Analysis of On-Premise... · Outcome: need for technical staff / internal capabilities (MLOps, infra)
On-Premise RAG incurs higher latency compared with cloud RAG.
Technology evaluations included measured system latency comparisons between architectures; exact latency values and statistical details not provided in summary.
Confidence: high · Direction: negative · Source: An Empirical Study on the Feasibility Analysis of On-Premise... · Outcome: system latency (response time)
On-Premise RAG requires upfront capital expenditure (hardware) and ongoing maintenance (operations, model updates, staff).
Organizational evaluations / cost accounting and implementation discussion indicating hardware, operations, and personnel requirements for on-prem deployment; specific cost figures not provided in summary.
Confidence: high · Direction: negative · Source: An Empirical Study on the Feasibility Analysis of On-Premise... · Outcome: upfront capital expenditure and ongoing maintenance costs and staffing needs
The January 2026 DoD AI Strategy memorandum establishes a Barrier Removal Board that provides expanded authority to waive established governance controls.
Primary source analysis: close reading of the Department of Defense January 2026 AI Strategy memorandum and related policy text (policy language describing the Barrier Removal Board and its waiver authorities). No sample size required; based on document text.
Confidence: high · Direction: negative · Source: FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... · Outcome: existence and authority of the Barrier Removal Board (waiver authority over gove...
Significant financial and implementation barriers (infrastructure, staff, validation) risk worsening access inequities between well-resourced and low-resource providers.
Economic analyses, stakeholder surveys, and deployment trend reports synthesized in the paper showing higher upfront costs and validation burdens for adopters; no randomized trials.
Confidence: high · Direction: negative · Source: Framework for Government Policy on Agentic and Generative AI... · Outcome: access / equity disparities / adoption gap by resource level
Regulatory fragmentation and lack of harmonized standards increase compliance complexity for healthcare AI deployments.
Policy analyses, regulatory reviews, and industry reports synthesized in the paper describing divergent national/regional regulatory approaches and their operational consequences.
Confidence: high · Direction: negative · Source: Framework for Government Policy on Agentic and Generative AI... · Outcome: regulatory compliance complexity / administrative burden