The Commonplace

Evidence (4560 claims)

Adoption: 5267 claims
Productivity: 4560 claims
Governance: 4137 claims
Human-AI Collaboration: 3103 claims
Labor Markets: 2506 claims
Innovation: 2354 claims
Org Design: 2340 claims
Skills & Training: 1945 claims
Inequality: 1322 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 378 106 59 455 1007
Governance & Regulation 379 176 116 58 739
Research Productivity 240 96 34 294 668
Organizational Efficiency 370 82 63 35 553
Technology Adoption Rate 296 118 66 29 513
Firm Productivity 277 34 68 10 394
AI Safety & Ethics 117 177 44 24 364
Output Quality 244 61 23 26 354
Market Structure 107 123 85 14 334
Decision Quality 168 74 37 19 301
Fiscal & Macroeconomic 75 52 32 21 187
Employment Level 70 32 74 8 186
Skill Acquisition 89 32 39 9 169
Firm Revenue 96 34 22 152
Innovation Output 106 12 21 11 151
Consumer Welfare 70 30 37 7 144
Regulatory Compliance 52 61 13 3 129
Inequality Measures 24 68 31 4 127
Task Allocation 75 11 29 6 121
Training Effectiveness 55 12 12 16 96
Error Rate 42 48 6 96
Worker Satisfaction 45 32 11 6 94
Task Completion Time 78 5 4 2 89
Wages & Compensation 46 13 19 5 83
Team Performance 44 9 15 7 76
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 18 17 9 5 50
Job Displacement 5 31 12 48
Social Protection 21 10 6 2 39
Developer Productivity 29 3 3 1 36
Worker Turnover 10 12 3 25
Skill Obsolescence 3 19 2 24
Creative Output 15 5 3 1 24
Labor Share of Income 10 4 9 23
Filter: Productivity
If AI displaces human workers faster than the economy can reabsorb them, it risks eroding the very consumer demand firms depend on.
Theoretical statement in the paper's motivating premise; no empirical sample reported (conceptual argument about aggregate demand effects when displacement outpaces reabsorption).
high negative The AI Layoff Trap consumer demand (aggregate demand) as affected by worker displacement
Fukui is Japan's least-visited prefecture.
Descriptive claim in the paper specifying the study site (Fukui) as the country's least-visited prefecture; no supporting national rankings provided in the excerpt.
We quantify an annual opportunity gap of 865,917 unrealized visits, equivalent to approximately 11.96 billion yen (USD 76.2 million) in lost revenue.
Model-based estimate produced by the DSS using the analyzed datasets and the DHDE-informed optimization; figure reported directly in the paper.
high negative Engineering Distributed Governance for Regional Prosperity: ... unrealized visits and lost revenue
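As a rough check on the scale of these figures, the sketch below derives the revenue per unrealized visit and the yen-per-dollar rate implied by the reported totals. Only the three reported figures come from the claim; the variable names and the derived quantities are illustrative and are not reported in the paper.

```python
# Reported figures from the claim above.
unrealized_visits = 865_917
lost_revenue_jpy = 11.96e9   # approx. 11.96 billion yen
lost_revenue_usd = 76.2e6    # USD 76.2 million

# Derived quantities (illustrative; not reported in the paper).
jpy_per_visit = lost_revenue_jpy / unrealized_visits
implied_jpy_per_usd = lost_revenue_jpy / lost_revenue_usd

print(f"Revenue per unrealized visit: about {jpy_per_visit:,.0f} yen")
print(f"Implied exchange rate: about {implied_jpy_per_usd:.0f} yen per USD")
```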
For regions experiencing demographic decline and structural stagnation, the primary risk is 'under-vibrancy', a condition where low visitor density suppresses economic activity and diminishes satisfaction.
Conceptual claim and problem framing provided by the authors (theoretical/qualitative argument in the paper).
high negative Engineering Distributed Governance for Regional Prosperity: ... economic activity and satisfaction (conceptual)
Most research in urban informatics and tourism focuses on mitigating overtourism in dense global cities.
Author statement in introduction positioning the paper relative to existing literature; no quantitative literature review or citation counts reported in the excerpt.
Developers and experts still lack a shared view, resulting in repeated coordination, clarification rounds, and error-prone handoffs.
Observational/qualitative claim in paper describing current MSD practice (no numeric sample reported).
high negative LLM-Powered Workflow Optimization for Multidisciplinary Soft... frequency of coordination rounds / error-prone handoffs
Even with AI coding assistants like GitHub Copilot, individual coding tasks are semi-automated, but the workflow connecting domain knowledge to implementation is not.
Qualitative observation/comparative statement in paper (no empirical sample reported).
high negative LLM-Powered Workflow Optimization for Multidisciplinary Soft... degree of automation of coding tasks vs. end-to-end workflow automation
Multidisciplinary Software Development (MSD) requires domain experts and developers to collaborate across incompatible formalisms and separate artifact sets.
Conceptual/argument in paper framing the problem (no empirical sample reported).
high negative LLM-Powered Workflow Optimization for Multidisciplinary Soft... collaboration/workflow efficiency between domain experts and developers
Only 12% of AI market value is used in physical activities.
Descriptive aggregate: authors categorize and report that 12% of estimated AI market value maps to physical activities.
high negative Where can AI be used? Insights from a deep ontology of work ... share of AI market value by activity type (physical)
Off-the-shelf implementations of DRL have seen mixed success, often plagued by high sensitivity to the hyperparameters used during training.
Statement in the paper's abstract describing observed/prior performance issues with standard DRL implementations; implies literature/empirical experience but no specific experiment/sample given in the abstract.
high negative DeepStock: Reinforcement Learning with Policy Regularization... sensitivity of DRL performance to hyperparameter choices (resulting in mixed suc...
Coal-based energy consumption structure and a secondary-industry-dominated industrial structure significantly inhibit regional TFCP and have strong negative spatial spillovers.
Control-variable coefficients from Spatial Durbin Model on panel data (30 provinces, 2010–2023) showing statistically significant negative direct and indirect effects for coal-dominant energy structure and secondary-industry share.
high negative Study on the impact of industrial intelligence and the digit... total factor carbon productivity (TFCP)
Applying AI agents to hardware-in-the-loop (HIL) embedded and Internet-of-Things (IoT) systems remains challenging due to the tight coupling between software logic and physical hardware behavior; code that compiles successfully may still fail when deployed on real devices because of timing constraints, peripheral initialization requirements, or hardware-specific behaviors.
Conceptual/engineering reasoning stated in the paper describing known HIL/IoT failure modes (no experimental quantification provided in this excerpt).
high negative Skilled AI Agents for Embedded and IoT Systems Development code failure / runtime correctness when deployed to hardware
Compositional spatial reasoning remains a formidable challenge for state-of-the-art VLMs (as revealed by our evaluation).
Empirical results from the evaluation of the 37 VLMs on the MultihopSpatial benchmark showing poor performance on multi-hop/compositional queries.
high negative MultihopSpatial: Multi-hop Compositional Spatial Reasoning B... performance on compositional/multi-hop spatial reasoning tasks
Existing benchmarks predominantly focus on elementary, single-hop relations and neglect multi-hop compositional spatial reasoning and precise visual grounding needed for real-world scenarios.
Literature/benchmark survey and motivation presented by the authors comparing characteristics of prior benchmarks vs. the proposed needs.
high negative MultihopSpatial: Multi-hop Compositional Spatial Reasoning B... scope/complexity of spatial reasoning tasks in existing benchmarks
Adoption barriers exist, particularly for small and medium-sized enterprises and firms in emerging economies, where capability and data constraints limit impact.
Findings reported from the systematic review and mixed-methods assessment (the abstract references barriers observed across reviewed studies); the abstract reports 104 studies in the systematic review.
high negative Artificial intelligence as a catalyst for the circular econo... adoption barriers / limitations to AI impact (capability and data constraints)
Significant limitations emerged in case law citations, with most cited cases being non-existent or incorrectly referenced.
Authors' review of the case citations produced by the four AI engines for the single transcript, finding many citations were fabricated or misreferenced.
high negative Robot Wingman: Using AI to Assess an Employment Termination accuracy of case law citations (error rate / hallucination rate)
GDP growth is initially negatively affected by the ageing population.
Estimated negative association reported in panel threshold regressions using provincial panel data (31 provinces, 2000–2022); the primary specification uses an ageing measure, and the paper also tests the old-age dependency ratio.
Initial adaptation challenges to AI integration were identified among employees.
Participants in semi-structured interviews (n=12) reported initial difficulties adapting to AI tools; themes relating to early adaptation challenges were coded.
high negative AI-AUGMENTED WORKFORCE: THE IMPACT OF ARTIFICIAL INTELLIGENC... initial adaptation challenges to AI
Past machine learning applications to pricing have produced models that adapt slowly to real-time changes, depend heavily on historical data, and struggle to handle multi-agent scenarios.
Stated as literature/related-work critique in paper; no new empirical evidence or sample size provided in the excerpt.
high negative The Application of Adaptive Reinforcement Learning in Dynami... model adaptivity to real-time changes and capability in multi-agent scenarios
Traditional methods, such as rule-based algorithms and statistical scale forecasting, struggle to adapt to rapidly changing market conditions, competitive maneuvers, and evolving consumer strategies, leading to sub-optimal pricing and decreased profitability.
Paper asserts this as background/motivation; no detailed empirical study or sample size provided in the excerpt.
high negative The Application of Adaptive Reinforcement Learning in Dynami... adaptivity of pricing methods and resulting profitability (sub-optimal pricing, ...
In the short term, big data may inhibit welfare growth.
Theoretical comparative-static/dynamic analysis reported in the model showing that initial or short-run effects of increased data sharing can reduce welfare growth (no empirical/sample data).
high negative Study on the impact of big data sharing on individuals’ welf... short-term growth of individuals' welfare
There is a measurement asymmetry in standard LLM evaluation: unconstrained prompts can inflate constraint-adherence scores and mask the practical value of structured prompting.
Analysis of evaluation results from the controlled study showing that unconstrained (simple) prompts sometimes achieve high constraint-adherence scores, leading to misleading evaluation of structured prompts' benefits.
high negative Evaluating 5W3H Structured Prompting for Intent Alignment in... constraint_adherence_scores / evaluation_bias
There is a central design tension in human-AI systems: maximizing short-term hybrid capability does not necessarily preserve long-term human cognitive competence.
Conceptual/theoretical claim derived from the framework and discussion in the paper (argument and mathematical framing), no empirical sample or longitudinal data presented in the excerpt.
high negative Cognitive Amplification vs Cognitive Delegation in Human-AI ... long-term human cognitive competence
The interaction of artificial intelligence and environmental regulation produces a '1 + 1 < 2' crowding-out effect (their combined effect is less than the sum of individual effects).
Spatial Durbin model with interaction term between AI and environmental regulation as summarized in the abstract; reported as a crowding-out interaction.
high negative How artificial intelligence and environmental regulation inf... UCEE index (interaction effect of AI and environmental regulation)
Environmental regulation significantly inhibits local UCEE.
Spatial Durbin model results reported in the abstract indicating a significant negative local coefficient for environmental regulation.
high negative How artificial intelligence and environmental regulation inf... UCEE index (local/provincial effect of environmental regulation)
Artificial intelligence significantly inhibits local UCEE.
Spatial Durbin model results reported in the abstract indicating a significant negative local coefficient for artificial intelligence.
high negative How artificial intelligence and environmental regulation inf... UCEE index (local/provincial effect of AI)
Progress in agentic AI systems that generate and optimize GPU kernels is constrained by benchmarks that reward speedup over software baselines rather than proximity to hardware-efficient execution.
Author argument/observation in paper (conceptual claim about limitations of existing benchmarks); no empirical sample or experiment reported in the provided text.
high negative SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... benchmark_alignment_with_hardware_efficiency
Rather than broad job losses, evidence points to a reallocation at the entry level: AI automates tasks typically assigned to junior staff, shifting the nature of entry-level roles.
Synthesis of firm- and task-level empirical studies reported in the brief documenting automation of routine/junior tasks and changes in job-task composition; specific sample sizes vary by cited study and are not provided in the brief.
high negative AI, Productivity, and Labor Markets: A Review of the Empiric... automation of entry-level/junior tasks and changes to entry-level job content
AI-only baselines perform near or below the median of competition participants.
Comparison of AI-only baseline performance to the distribution of competition participant results reported in the paper (competition with 29 teams / 80 participants).
high negative AgentDS Technical Report: Benchmarking the Future of Human-A... relative performance rank of AI-only baselines vs participants
Our results show that current AI agents struggle with domain-specific reasoning.
Outcome of the competition reported in the paper comparing AI-only baselines to participant submissions across the AgentDS tasks (competition data from 29 teams / 80 participants); reported aggregate performance indicating AI weakness on domain-specific tasks.
high negative AgentDS Technical Report: Benchmarking the Future of Human-A... domain-specific reasoning performance
The gap between informal natural language requirements and precise program behavior (the 'intent gap') has always plagued software engineering, but AI-generated code amplifies it to an unprecedented scale.
Conceptual claim and argumentation in the paper; presented as an observed escalation in the scale of the existing 'intent gap' due to AI code generation. No quantitative evidence or sample size given in the excerpt.
high negative Intent Formalization: A Grand Challenge for Reliable Coding ... mismatch between intended and actual program behavior (intent gap) / resulting c...
The capital-output elasticity dropped significantly, from 0.42 in 2010–2015 to 0.35 in 2016–2022.
Estimated from an extended Cobb–Douglas production function applied to China's economy over 2010–2022, with period split 2010–2015 vs 2016–2022 (as reported in the study summary).
high negative Analysis of China's Economic Growth Drivers: An Empirical St... capital-output elasticity (elasticity of output with respect to capital)
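To make the size of this drop concrete, the sketch below assumes a plain Cobb–Douglas form Y = A * K^alpha * L^beta with A and L held fixed, and compares the output response to a 10% increase in capital under each reported elasticity. The study uses an extended specification that is not reproduced here; the function name and the 10% scenario are illustrative assumptions.

```python
def output_growth(capital_growth: float, elasticity: float) -> float:
    """Proportional change in output when capital grows by `capital_growth`,
    assuming Y = A * K**elasticity * L**beta with A and L held fixed."""
    return (1.0 + capital_growth) ** elasticity - 1.0

alpha_early = 0.42  # reported for 2010-2015
alpha_late = 0.35   # reported for 2016-2022

# Hypothetical scenario: capital stock rises by 10%.
growth_early = output_growth(0.10, alpha_early)
growth_late = output_growth(0.10, alpha_late)

print(f"2010-2015: {growth_early:.2%}")
print(f"2016-2022: {growth_late:.2%}")
```

Under these assumptions, the same capital increase yields a smaller output gain in the later period, which is the sense in which the elasticity dropped.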
Limitations include possible limited organizational generalizability due to a single Fortune 500 lab context; ABS results depend on model specification/calibration; and operational definitions of 'resilience' and 'planning cycle' require careful reading.
Authors' reported limitations based on study design: single lab context (n = 23), dependence of ABS on model choices, and nontrivial operational definitions.
high negative The Algorithmic Canvas: On the Autopoietic Redefinition of S... generalizability and robustness of study findings
Some declines (in self-efficacy and meaningfulness) from passive AI use persist after participants return to manual work.
Within-experiment assessment of outcomes after participants returned to manual (no-AI) tasks following the AI-use manipulation in the pre-registered experiment (N = 269); reported persistent reductions in self-efficacy and meaningfulness for the passive condition.
high negative Relying on AI at work reduces self-efficacy, ownership, and ... self-efficacy; perceived meaningfulness (measured post-return to manual work)
Passive use of AI reduces perceived meaningfulness of work.
Pre-registered experiment (N = 269) with self-reported measure of work meaningfulness; passive-copy condition showed lower meaningfulness ratings than No-AI and Active-collaboration conditions.
high negative Relying on AI at work reduces self-efficacy, ownership, and ... perceived meaningfulness of work
Passive use of AI reduces psychological ownership of the produced outputs.
Same pre-registered experiment (N = 269). Participants in the passive-copy AI condition reported lower psychological ownership of their outputs (self-report scales) relative to No-AI and Active-collaboration conditions.
high negative Relying on AI at work reduces self-efficacy, ownership, and ... psychological ownership of outputs
Passive use of AI (copying AI-generated output) reduces workers' self-efficacy.
Pre-registered between-subjects experiment (N = 269) using occupation-specific writing tasks. Participants assigned to a passive-copy AI condition reported lower self-efficacy (self-reported confidence to complete tasks without AI) compared to the No-AI (manual) and Active-collaboration conditions.
high negative Relying on AI at work reduces self-efficacy, ownership, and ... self-efficacy (confidence to complete tasks without AI)
Large-scale AI models have significant energy and resource costs, creating a notable environmental footprint that must be addressed.
Narrative integration of prior empirical studies measuring compute, energy consumption, and embodied emissions of large models (cited literature); the review does not present new quantitative measurements itself.
high negative The Evolution and Societal Impact of Artificial Intelligence... energy consumption, carbon emissions, and resource use associated with large-sca...
As AI is deployed in safety-critical domains, reliability, regulation, and human-oriented system design become essential to avoid harms.
Review of literature on safety-critical systems, human–machine interaction studies, and regulatory policy discussions; the paper reports this as a consensus implication rather than presenting new empirical tests.
high negative The Evolution and Societal Impact of Artificial Intelligence... system reliability/safety and risk of harm in safety-critical deployments
The current literature is skewed toward descriptive and engineering work; there is a lack of causal, field‑experimental evidence on NLP interventions' effects on customer behavior and firm profits.
Review coding of study types in the sample (engineering/descriptive vs. experimental/causal) showing few field experiments or causal designs.
high negative Natural language processing in bank marketing: a systematic ... presence vs. absence of causal/experimental studies measuring effects on custome...
Important gaps include customer acquisition, personalization at scale, use of external text sources (social media, news, reviews), operational process improvement, and cross‑channel integration.
Gap detection via low‑density regions in the UMAP thematic map of sentence‑transformer embeddings and manual review showing low article counts for these topics within the 109‑article sample.
high negative Natural language processing in bank marketing: a systematic ... topical coverage by customer journey stage and source type (acquisition, persona...
Existing literature on NLP in marketing is concentrated around customer retention tasks (e.g., churn prediction, complaint handling, relationship management).
Thematic clustering from sentence‑transformer embeddings of article text combined with UMAP visualization, and manual review of article topics and keywords identifying frequent retention‑related themes.
high negative Natural language processing in bank marketing: a systematic ... topical frequency/coverage by customer journey stage (retention)
NLP applications in bank marketing are severely under‑studied.
Descriptive result from the PRISMA review showing only 8/109 articles focused on NLP in bank marketing (≈7%), plus thematic mapping showing sparse coverage in bank‑marketing/NLP intersection.
high negative Natural language processing in bank marketing: a systematic ... proportion and absolute count of studies at the intersection of NLP and bank mar...
Jurisdictions are taking divergent policy approaches (e.g., U.S. emphasis on innovation/competition, EU emphasis on rights/standards like GDPR), producing fragmented digital trade rules.
Comparative legal and policy analysis of existing national/regional rules and international instruments (examples cited include GDPR and U.S. policy orientations); descriptive, with specific regulatory texts analyzed.
high negative Path Analysis of Digital Economy and Reconstruction of Inter... regulatory fragmentation / interoperability of digital trade rules
AI creates novel non-tariff frictions, e.g., pressures toward data localization and regulatory requirements for algorithmic transparency.
Comparative legal and policy analysis of emerging regulations (e.g., data localization laws, algorithmic regulation initiatives) and illustrative jurisdictional examples.
high negative Path Analysis of Digital Economy and Reconstruction of Inter... non-tariff regulatory frictions (data-flow restrictions, transparency/compliance...
Vietnam's civil-law features—statutory specificity, formal procedures, and constitutional principles like legal certainty and fairness—make straightforward AI deployment legally fraught.
Close textual analysis of Vietnam's statutes, constitutional provisions, and administrative procedures (doctrinal legal analysis); no quantitative sample.
high negative ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... legal compatibility of AI deployment (degree of legal obstacles to deployment)
Automated decisions complicate assigning responsibility and hinder judicial and administrative reviewability.
Doctrinal examination of accountability and review mechanisms in administrative law plus comparative institutional analysis of automated decision-making governance.
high negative ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... clarity of accountability (ability to assign responsibility) and effectiveness o...
Opaque AI models risk violating notice, reason-giving, and appeal rights protected under administrative due process.
Analysis of procedural due-process requirements (notice, reason-giving, appeal) in Vietnam's legal framework and assessment of opacity issues in algorithmic systems; qualitative reasoning, no empirical testing.
high negative ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... compliance with due-process requirements (notice, reasons, appealability)
Provider incentives may be misaligned (e.g., optimizing for engagement or test performance instead of durable learning), requiring contracts, regulation, or purchaser design to align incentives.
Consensus from interdisciplinary workshop (50 scholars) highlighting incentive risks and market-design considerations; descriptive, not empirical.
high negative The Future of Feedback: How Can AI Help Transform Feedback t... provider optimization metrics (engagement/test performance) vs. durable learning...
Extensive learner data needed to personalize AI feedback raises privacy and data-governance concerns (consent, storage, usage).
Qualitative consensus from workshop participants (50 scholars) noting data-collection requirements and governance risks; no empirical governance studies included.
high negative The Future of Feedback: How Can AI Help Transform Feedback t... volume/type of learner data collected; privacy risk indicators; compliance with ...