The Commonplace

Evidence (4114 claims)

Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
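
As a reading aid, the share of each direction within an outcome can be computed directly from the matrix. The sketch below copies three rows from the table; the helper name `direction_shares` is hypothetical. The Total column can exceed the sum of the four direction columns (Firm Productivity lists 435 + 57 + 88 + 20 = 600 against a total of 606), so the sketch reports that remainder separately instead of assuming the columns add up.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
ROWS = {
    "Firm Productivity": (435, 57, 88, 20, 606),
    "Job Displacement": (12, 80, 20, 1, 113),
    "Error Rate": (69, 92, 10, 2, 173),
}


def direction_shares(positive, negative, mixed, null, total):
    """Return each direction's share of the row total, plus the share of
    claims not covered by the four listed directions."""
    listed = positive + negative + mixed + null
    return {
        "positive": positive / total,
        "negative": negative / total,
        "mixed": mixed / total,
        "null": null / total,
        "unlisted": (total - listed) / total,
    }


for outcome, counts in ROWS.items():
    shares = direction_shares(*counts)
    print(f"{outcome}: {shares['positive']:.0%} positive, "
          f"{shares['negative']:.0%} negative")
```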
Filter: Innovation
Close coupling among Azure OpenAI Service, Azure Machine Learning, and cost governance tooling (FinOps) significantly decreases overall cost of ownership and enhances scalability and compliance.
Architectural analysis of Azure-native GenAI services and cost/governance tooling reported in the paper.
high positive Measuring Business ROI of Generative AI Adoption on Azure Cl... overall cost of ownership, scalability, compliance
Measurable ROI from GenAI on Azure is mainly driven by improvements in productivity, optimization of operational costs, faster decision making, and increased speed of innovation across business functions.
Reported results from the paper's mixed-method study combining quantitative ROI modelling and cost–benefit analysis plus qualitative synthesis of secondary enterprise case studies.
high positive Measuring Business ROI of Generative AI Adoption on Azure Cl... business Return on Investment (ROI) driven by productivity, cost optimization, d...
Microsoft Azure has become one of the first enterprise-scale platforms facilitating GenAI-driven change.
Statement in the paper's abstract asserting Azure's market position as an early enterprise-scale platform for GenAI.
high positive Measuring Business ROI of Generative AI Adoption on Azure Cl... enterprise-scale platform adoption
This synthesis bridges the gap between values and practice, offering a policy-ready model for secure and sustainable AI governance.
Authors' concluding claim that their integrated governance risk framework and risk-tiering matrix operationalize ethical principles into auditable technical controls and are policy-ready.
high positive AI Governance Risk Tiering for Sustainable Digital Infrastru... policy-readiness and practical applicability of the proposed model
The study aligns its integrated risk-tiering model with Sustainable Development Goal 9 on industry, innovation and infrastructure.
Authors state that the developed integrated risk-tiering model is aligned with SDG 9 as part of the study framing and intended policy relevance.
high positive AI Governance Risk Tiering for Sustainable Digital Infrastru... conceptual alignment of the model with SDG 9
The analysis produced a heat map of governance frameworks, a co-occurrence network of themes, a cluster analysis of framework coverage and an integrated governance risk framework supported by a risk-tiering matrix.
Authors report specific analytical outputs (heat map, co-occurrence network, cluster analysis) and that they developed an integrated governance risk framework with a risk-tiering matrix based on their analysis.
high positive AI Governance Risk Tiering for Sustainable Digital Infrastru... analytical outputs and resultant governance model
Our empirics demonstrate that self-evolving AI offers a scalable and interpretable paradigm.
Empirical results on the U.S. equity market are cited as evidence; the paper claims scalability and interpretability based on those empirical demonstrations and the architecture of the system.
high positive Beyond Prompting: An Autonomous Framework for Systematic Fac... scalability and interpretability of the AI-driven investing approach
Applying this methodology to the U.S. equity market, long-short portfolios formed on the simple linear combination of signals deliver a return of 59.53% (annualized).
Empirical backtest/application to the U.S. equity market reported in the paper; specific annualized return percentage is provided. Sample period, universe, and number of observations not stated in the excerpt.
high positive Beyond Prompting: An Autonomous Framework for Systematic Fac... annualized portfolio return
Applying this methodology to the U.S. equity market, long-short portfolios formed on the simple linear combination of signals deliver an annualized Sharpe ratio of 3.11.
Empirical backtest/application to the U.S. equity market reported in the paper; specific performance metric (annualized Sharpe) is provided. Sample period, universe, and number of observations not stated in the excerpt.
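
For readers unfamiliar with the two metrics above, an annualized Sharpe ratio is conventionally the mean excess return divided by its standard deviation, scaled by the square root of the number of periods per year. The sketch below assumes monthly returns and a zero risk-free rate. The return series is invented for illustration and is not the paper's data, and the excerpt does not state how the paper itself annualizes.

```python
import math
import statistics


def annualized_sharpe(period_returns, periods_per_year=12, risk_free_per_period=0.0):
    """Annualized Sharpe ratio from simple per-period returns."""
    excess = [r - risk_free_per_period for r in period_returns]
    mean = statistics.mean(excess)
    stdev = statistics.stdev(excess)  # sample standard deviation
    return (mean / stdev) * math.sqrt(periods_per_year)


def annualized_return(period_returns, periods_per_year=12):
    """Geometric annualized return from simple per-period returns."""
    growth = math.prod(1.0 + r for r in period_returns)
    return growth ** (periods_per_year / len(period_returns)) - 1.0


# Hypothetical monthly long-short returns, for illustration only.
monthly = [0.03, 0.05, -0.01, 0.04, 0.02, 0.06,
           0.01, 0.03, -0.02, 0.05, 0.04, 0.02]
print(f"Sharpe: {annualized_sharpe(monthly):.2f}")
print(f"Return: {annualized_return(monthly):.2%}")
```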
To mitigate data snooping biases, the closed-loop system imposes strict empirical discipline through out-of-sample validation and economic rationale requirements.
Description of model validation protocol in the paper (use of out-of-sample validation and economic rationale filters); supports claim that these steps are used to reduce data-snooping risk.
high positive Beyond Prompting: An Autonomous Framework for Systematic Fac... mitigation of data-snooping bias (robustness of signals)
The approach operationalizes the model as a self-directed engine that endogenously formulates interpretable trading signals (rather than relying on sequential manual prompts).
Methodological description and implementation details in the paper describing how the model generates signals autonomously and interpretable outputs; empirical example applied to U.S. equity market is referenced to illustrate operation.
high positive Beyond Prompting: An Autonomous Framework for Systematic Fac... interpretability and autonomy of generated trading signals
We develop an autonomous framework for systematic factor investing via agentic AI.
Statement of methodological contribution in the paper (framework description); no sample size or empirical test required for the descriptive claim.
high positive Beyond Prompting: An Autonomous Framework for Systematic Fac... autonomy of investment framework (methodological capability)
Through a comparative analysis of Pax Romana, Pax Britannica, Pax Americana, and the emerging U.S. techno-security architecture, the article demonstrates continuity in the logic of hegemonic control centered on infrastructures.
Comparative historical analysis of four hegemonic/regime examples as described in the paper; methodological approach is comparative and qualitative (no quantitative sample size given).
high positive The Logistics of Hegemony: Semiconductor Chokepoints, Global... continuity of hegemonic logic across historical regimes
Hegemonic orders can be conceptualized as historically specific logistical regimes — the material basis of hegemony evolves but the underlying logic remains constant: control over the infrastructures that organize global circulation.
Conceptual claim grounded in synthesis of structural power theory, global value chain analysis, and infrastructure studies and illustrated through comparative historical examples (Pax Romana, Pax Britannica, Pax Americana, emerging U.S. techno-security architecture).
high positive The Logistics of Hegemony: Semiconductor Chokepoints, Global... persistence of strategic logic (control over infrastructures) across historical ...
The article develops a theoretical framework of logistical hegemony to explain how infrastructures, chokepoints, and global production networks structure the exercise of power in the world economy.
Primary claim of the paper: theoretical development drawing on structural power theory, global value chain analysis, and infrastructure studies; conceptual/theoretical argumentation rather than empirical sample-based evidence.
high positive The Logistics of Hegemony: Semiconductor Chokepoints, Global... control over infrastructures and organization of global circulation
Experiments highlight a reward structure that balances income, profit, efficiency, fairness, and customer retention, moving beyond income-only goals.
Experimental design / reward engineering reported in paper; claim supported by experiments (no quantitative metrics or sample size given in excerpt).
high positive The Application of Adaptive Reinforcement Learning in Dynami... reward structure balancing multiple objectives (income, profit, efficiency, fair...
Training strength is validated by benchmarking against fixed, rule-based, and cost-plus pricing baselines in controlled experiments.
Paper reports controlled experiments benchmarking ARL models against fixed/rule-based and cost-plus baselines; specific experimental design and sample sizes not provided in excerpt.
high positive The Application of Adaptive Reinforcement Learning in Dynami... relative performance of ARL training vs. baselines (validation/benchmarking outc...
Inventory challenges are addressed by utilizing a curated dataset that has been enhanced through feature engineering, transformation, and systematic cleaning, providing reliable inputs for training.
Methodological claim about dataset curation and preprocessing used to train ARL agents; no dataset size or quantitative validation reported in excerpt.
high positive The Application of Adaptive Reinforcement Learning in Dynami... quality/reliability of training inputs with respect to inventory representation
Profitability in a dynamic marketplace is enhanced through an Adaptive Reinforcement Learning (ARL)-based pricing framework that utilizes Q-Learning and Deep Q-Networks (DQN) for real-time optimization in response to changing market conditions, competition, and inventory levels.
Paper proposes and experiments with an ARL-based pricing framework (methods include Q-Learning and DQN); validation claimed via benchmarking/controlled experimentation against baselines (details not provided in excerpt).
high positive The Application of Adaptive Reinforcement Learning in Dynami... profitability and pricing optimization in dynamic markets
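
To make the Q-Learning component of the claim above concrete, here is a minimal tabular sketch of a pricing agent. Everything in it is an assumption made for illustration: the price points, the sale probabilities, the inventory-only state, and the hyperparameters. It does not reproduce the paper's ARL framework, omits the DQN variant, and ignores competition.

```python
import random

PRICES = [8.0, 10.0, 12.0]   # hypothetical price points (the actions)
SALE_PROB = {8.0: 0.9, 10.0: 0.6, 12.0: 0.3}  # hypothetical demand curve
MAX_INVENTORY = 5            # hypothetical state: units left in stock


def step(inventory, price, rng):
    """Toy market: higher prices sell less often.

    Returns (reward, next_inventory); at most one unit sells per step."""
    sold = 1 if inventory > 0 and rng.random() < SALE_PROB[price] else 0
    return price * sold, inventory - sold


def train(episodes=5000, horizon=10, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning over (inventory, price index) pairs."""
    rng = random.Random(seed)
    actions = range(len(PRICES))
    q = {(s, a): 0.0 for s in range(MAX_INVENTORY + 1) for a in actions}
    for _ in range(episodes):
        inventory = MAX_INVENTORY
        for _ in range(horizon):
            # Epsilon-greedy: explore occasionally, otherwise pick the best-known price.
            if rng.random() < epsilon:
                action = rng.randrange(len(PRICES))
            else:
                action = max(actions, key=lambda a: q[(inventory, a)])
            reward, nxt = step(inventory, PRICES[action], rng)
            best_next = max(q[(nxt, a)] for a in actions)
            # Q-learning update toward reward plus discounted best next value.
            q[(inventory, action)] += alpha * (
                reward + gamma * best_next - q[(inventory, action)]
            )
            inventory = nxt
    return q


q_table = train()
for stock in range(1, MAX_INVENTORY + 1):
    best = max(range(len(PRICES)), key=lambda a: q_table[(stock, a)])
    print(f"inventory={stock}: preferred price {PRICES[best]}")
```

A fixed or cost-plus baseline of the kind the paper benchmarks against would correspond to always choosing the same index into `PRICES`.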
Dynamic pricing is crucial for maximizing revenue and maintaining competitiveness in markets with fluctuating demand, perishable goods, and diverse customer preferences.
Conceptual claim stated in paper's introduction/motivation; no empirical sample or experiment specified in the statement.
high positive The Application of Adaptive Reinforcement Learning in Dynami... maximizing revenue and maintaining competitiveness
In the long term, big data promotes sustained improvements in individuals’ welfare.
Theoretical long-run growth analysis in the model showing that sustained data sharing leads to long-run welfare improvements (analytic/model-based, no empirical/sample data).
high positive Study on the impact of big data sharing on individuals’ welf... long-term growth of individuals' welfare
There exists an optimal level of data (big data) sharing that achieves the best balance between economic development and privacy, thereby maximizing individuals' welfare.
Analytical optimization within the theoretical macro model: model yields an interior optimum for data-sharing intensity that trades off economic gains and privacy costs (derivation/analytical result; no empirical test).
high positive Study on the impact of big data sharing on individuals’ welf... individuals' welfare maximization via optimal data-sharing level
The Institutional Scaling Law predicts that the next phase transition will be driven not by larger models but by better-orchestrated systems of domain-specific models adapted to specific institutional niches.
Predictive conclusion derived from the Institutional Scaling Law and theoretical analysis in the paper. No empirical validation or sample size reported in the excerpt.
high positive The Institutional Scaling Law: Non-Monotonic Fitness, Capabi... drivers of the next phase transition in AI (orchestration of domain-specific sys...
A Symbiogenetic Scaling correction demonstrates that orchestrated systems of domain-specific models can outperform frontier generalists in their native deployment environments.
Theoretical correction/derivation and comparative analysis within the paper (no empirical sample or quantitative benchmark reported in the excerpt).
high positive The Institutional Scaling Law: Non-Monotonic Fitness, Capabi... performance of orchestrated domain-specific model systems versus frontier genera...
A mixed-methods empirical research agenda is presented, proposing a future PLS-SEM approach to test the mediating role of the cognitive flywheel and the moderating effect of fractal governance on organizational resilience.
Methodological proposal described in the paper (research design and proposed analytic approach); no executed empirical study or sample reported.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... organizational_resilience (as mediator/moderator relationships to be tested)
Fractal governance architecture is proposed to mitigate systemic vulnerabilities such as automation bias.
Conceptual proposal of a governance design in the paper; no empirical test or sample provided.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... reduction_in_automation_bias / improvement_in_decision_quality
The cognitive flywheel is the central mechanism of this dynamic capability, and the paper operationalizes it.
Theoretical operationalization within the paper (concept definition and proposed operational measures); no empirical measurement or sample reported.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... mechanism_operationalization (cognitive_flywheel)
The co-evolutionary dynamic is formalized using coupled non-linear differential equations and time decay integrals.
Mathematical formalization reported in the paper (modeling methods described); no empirical parameter estimation or sample provided.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... existence_of_mathematical_model/formal_framework
Dynamic cognitive advantage arises from the historical, recursive, structural coupling of human semantic intent and machine syntactic processing (a co-evolutionary dynamic).
Conceptual theory introduced and argued in the paper (mechanism-level proposition); formalization provided but no empirical validation.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... competitive_differentiation/innovation_output
Conceptualizing the enterprise as a complex adaptive system operating far from thermodynamic equilibrium provides a more appropriate framing for organizations integrating AI and enables the theory of dynamic cognitive advantage.
Theoretical development and conceptual argumentation within the paper; formal framing rather than empirical test; no sample reported.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... competitive_differentiation/innovation_output
We propose a multi-agent discussion framework wherein specialized agents collaboratively process extensive product information, distributing cognitive load to alleviate single-agent attention bottlenecks and capturing critical decision factors through structured dialogue.
Method description: multi-agent discussion architecture described and implemented; claimed to distribute cognitive load and reduce single-agent attention bottlenecks (design + reported behavior).
high positive MALLES: A Multi-agent LLMs-based Economic Sandbox with Consu... reduction of single-agent attention bottlenecks / distributed processing of prod...
To enhance simulation stability, we implement a mean-field mechanism designed to model the dynamic interactions between the product environment and customer populations, effectively stabilizing sampling processes within high-dimensional decision spaces.
Method description: implementation of a mean-field mechanism within the simulator; paper asserts this design stabilizes sampling in high-dimensional decision spaces (method + reported simulation behavior).
high positive MALLES: A Multi-agent LLMs-based Economic Sandbox with Consu... simulation stability / stabilized sampling processes
We introduce a preference learning paradigm in which LLMs are economically aligned via post-training on extensive, heterogeneous transaction records across diverse product categories.
Method description: post-training LLMs on heterogeneous transaction records across product categories to align preferences (methodological / training procedure described).
high positive MALLES: A Multi-agent LLMs-based Economic Sandbox with Consu... ability of models to internalize consumer preferences via post-training
This paper introduces a Multi-Agent Large Language Model-based Economic Sandbox (MALLES) as a unified simulation framework applicable to cross-domain and cross-category scenarios.
Paper description: design and implementation of MALLES, presented as a unified framework leveraging large-scale LLM generalization for cross-domain/cross-category simulation (methodological contribution).
high positive MALLES: A Multi-agent LLMs-based Economic Sandbox with Consu... existence and applicability of MALLES as a unified simulation framework
SOL-ExecBench reframes GPU kernel benchmarking from beating a mutable software baseline to closing the remaining gap to hardware Speed-of-Light.
Conceptual/positioning claim made by the authors about the intended shift in benchmarking perspective enabled by SOL-ExecBench.
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... benchmarking_objective_shift_toward_hardware_efficiency
To support robust evaluation of agentic optimizers, we provide a sandboxed harness with GPU clock locking, L2 cache clearing, isolated subprocess execution, and static analysis-based checks against common reward-hacking strategies.
Method/tool claim in paper describing the provided evaluation harness and its engineered controls (list of features included).
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... evaluation_robustness_and_integrity_of_benchmarking
We report a SOL Score that quantifies how much of the gap between a release-defined scoring baseline and the hardware SOL bound a candidate kernel closes.
Paper defines the SOL Score metric and states its interpretive meaning (fraction of gap closed between baseline and hardware SOL bound).
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... fraction_of_gap_closed_to_hardware_bound
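
The "fraction of gap closed" reading of the SOL Score can be sketched as a simple ratio on kernel runtimes. The linear form below is an assumption, since the excerpt does not give the paper's exact definition, and the function name and timings are hypothetical.

```python
def gap_closed(baseline_ms, candidate_ms, sol_ms):
    """Fraction of the baseline-to-SOL runtime gap closed by a candidate kernel.

    0.0 means no better than the scoring baseline; 1.0 means the candidate
    reaches the hardware SOL bound. Assumed linear form, not taken from the paper.
    """
    if not sol_ms < baseline_ms:
        raise ValueError("expected sol_ms < baseline_ms")
    return (baseline_ms - candidate_ms) / (baseline_ms - sol_ms)


# Hypothetical timings in milliseconds.
print(gap_closed(baseline_ms=2.0, candidate_ms=1.5, sol_ms=1.0))  # prints 0.5
```

Because the denominator depends only on the baseline and the analytic bound, the target stays fixed even as candidate kernels improve.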
SOL-ExecBench measures performance against analytically derived Speed-of-Light (SOL) bounds computed by SOLAR, our pipeline for deriving hardware-grounded SOL bounds, yielding a fixed target for hardware-efficient optimization.
Methodological claim: introduction of SOLAR pipeline to compute analytic hardware-grounded SOL bounds and use of those bounds as benchmark targets, as described in the paper.
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... proximity_to_hardware_speed_of_light_bounds
The benchmark covers forward and backward workloads across BF16, FP8, and NVFP4, including kernels whose best performance is expected to rely on Blackwell-specific capabilities.
Paper description of benchmark coverage (workload direction and data types; inclusion of kernels tied to Blackwell hardware features).
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... coverage_of_workloads_and_datatypes
We present SOL-ExecBench, a benchmark of 235 CUDA kernel optimization problems extracted from 124 production and emerging AI models spanning language, diffusion, vision, audio, video, and hybrid architectures, targeting NVIDIA Blackwell GPUs.
Paper reports construction of the benchmark with counts: 235 CUDA kernel problems and 124 source models; descriptive dataset claim in the manuscript.
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... benchmark_problem_count_and_coverage
End-to-end verified pipelines can produce provably correct code from informal specifications.
The paper surveys early research demonstrating pipelines that go from informal specifications to formally verified code; the provided text does not include experimental sample sizes or benchmarks.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... provable correctness of generated code
AI-generated postconditions catch real-world bugs missed by prior methods.
Surveyed early research asserted by the paper indicating empirical instances where AI-generated postconditions found bugs that other methods missed; no numeric details provided in the excerpt.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... bugs detected / error detection rate
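
A small illustration of why a postcondition can catch a bug that an example-based test misses. The function and the bug are hypothetical and hand-written for this sketch; they are not drawn from the paper or from any AI-generated specification.

```python
from collections import Counter


def top_k_buggy(items, k):
    """Intended: the k largest items in descending order.

    Bug: set() silently drops duplicates."""
    return sorted(set(items), reverse=True)[:k]


def top_k_postcondition(items, k, result):
    """Checkable specification for top-k, independent of any implementation."""
    expected_len = min(k, len(items))
    is_descending = all(a >= b for a, b in zip(result, result[1:]))
    # Every returned item must come from the input, respecting multiplicity.
    is_submultiset = not (Counter(result) - Counter(items))
    # No item left out may exceed the smallest item returned.
    leftover = Counter(items) - Counter(result)
    none_larger = all(x <= min(result) for x in leftover) if result else True
    return (len(result) == expected_len and is_descending
            and is_submultiset and none_larger)


# An example-based test without duplicates passes...
assert top_k_buggy([3, 1, 2], 2) == [3, 2]
# ...but the postcondition exposes the duplicate-handling bug.
print(top_k_postcondition([5, 5, 1], 2, top_k_buggy([5, 5, 1], 2)))  # prints False
```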
Interactive test-driven formalization improves program correctness.
Paper surveys early research that reportedly demonstrates this effect (described as 'interactive test-driven formalization that improves program correctness'); the excerpt does not include specific study details or sample sizes.
The central bottleneck is validating specifications: since there is no oracle for specification correctness other than the user, we need semi-automated metrics that can assess specification quality with or without code, through lightweight user interaction and proxy artifacts such as tests.
Analytical claim and research agenda item in the paper; motivates need for new metrics and interaction designs. No empirical validation or sample size reported in the excerpt.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... ability to validate specification correctness / specification quality
Intent formalization offers a tradeoff spectrum suitable to the reliability needs of different contexts: from lightweight tests that disambiguate likely misinterpretations, through full functional specifications for formal verification, to domain-specific languages from which correct code is synthesized automatically.
Conceptual framework proposed in the paper describing a spectrum of specification formality; presented as an argument rather than an empirical finding, with no sample sizes provided in the excerpt.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... suitability of specification approaches for reliability requirements
Intent formalization — translating informal user intent into checkable formal specifications — is the key challenge that will determine whether AI makes software more reliable or merely more abundant.
Normative argument presented by the authors as the central thesis of the paper; no empirical study or sample size cited in the provided text.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... software reliability (correctness relative to user intent)
Agentic AI systems can now generate code with remarkable fluency.
Authoritative assertion in the paper based on contemporary observations of large code-generating models; no empirical sample size or benchmark numbers reported in the text provided.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... code generation fluency / ability to produce code
This paper employs large language models to conduct semantic analysis on the text of annual reports from Chinese A-share listed companies from 2006 to 2024.
Methodological statement in the abstract describing use of LLM-based semantic analysis on annual report texts spanning 2006–2024.
high positive The Spillover Effects of Peer AI Rinsing on Corporate Green ... methodological approach (use of LLMs for semantic analysis)
The paper recommends that the government design targeted support tools to 'enhance market returns and alleviate financing constraints', adopt a differentiated regulatory strategy, and establish a disclosure mechanism combining 'professional identification and reputational sanctions' to curb peer AI washing behaviour.
Policy prescriptions derived from empirical findings and simulation results reported in the paper; presented as recommendations in the abstract.
high positive The Spillover Effects of Peer AI Rinsing on Corporate Green ... effectiveness of policy interventions in curbing AI washing and supporting green...
Simulation results indicate that a combination of policy tools can effectively improve market equilibrium (mitigating the negative effects of AI washing).
Simulation exercises reported in the paper (model specification not provided in abstract) testing policy tool combinations and their effects on market equilibrium.
high positive The Spillover Effects of Peer AI Rinsing on Corporate Green ... market equilibrium (improvement in market outcomes related to AI washing and gre...