Evidence (7198 claims)
Search and filter individual claims pulled from the papers. If you are looking for a specific finding ("what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other instead, use the Evidence Explorer.
The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).
Browse by theme
Nine broad, paper-level topics. Click one to filter the claims below.
| Theme | Claims |
|---|---|
| Adoption | 8921 |
| Productivity | 8002 |
| Governance (current filter) | 7198 |
| Human-AI Collaboration | 6864 |
| Org Design | 4398 |
| Innovation | 4286 |
| Labor Markets | 3629 |
| Skills & Training | 3001 |
| Inequality | 2141 |
Claims by outcome category
Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 790 | 208 | 103 | 950 | 2117 |
| Governance & Regulation | 869 | 411 | 195 | 126 | 1630 |
| Organizational Efficiency | 817 | 202 | 126 | 87 | 1243 |
| Technology Adoption Rate | 675 | 258 | 128 | 106 | 1178 |
| Research Productivity | 462 | 138 | 64 | 347 | 1023 |
| Output Quality | 501 | 193 | 61 | 52 | 807 |
| Decision Quality | 346 | 180 | 84 | 51 | 668 |
| AI Safety & Ethics | 235 | 285 | 70 | 34 | 630 |
| Firm Productivity | 452 | 58 | 91 | 20 | 627 |
| Market Structure | 184 | 171 | 123 | 24 | 507 |
| Task Allocation | 221 | 65 | 76 | 34 | 401 |
| Skill Acquisition | 176 | 62 | 62 | 17 | 317 |
| Innovation Output | 207 | 28 | 48 | 18 | 303 |
| Fiscal & Macroeconomic | 135 | 72 | 44 | 26 | 284 |
| Employment Level | 105 | 56 | 108 | 13 | 284 |
| Consumer Welfare | 121 | 67 | 45 | 11 | 244 |
| Firm Revenue | 160 | 50 | 28 | 4 | 242 |
| Task Completion Time | 182 | 33 | 10 | 13 | 239 |
| Inequality Measures | 45 | 126 | 50 | 6 | 227 |
| Worker Satisfaction | 94 | 73 | 23 | 12 | 202 |
| Error Rate | 76 | 98 | 11 | 4 | 189 |
| Regulatory Compliance | 81 | 73 | 17 | 7 | 178 |
| Automation Exposure | 61 | 59 | 26 | 14 | 163 |
| Training Effectiveness | 97 | 21 | 14 | 19 | 153 |
| Wages & Compensation | 78 | 37 | 25 | 6 | 146 |
| Developer Productivity | 105 | 18 | 14 | 6 | 144 |
| Team Performance | 87 | 17 | 28 | 10 | 143 |
| Job Displacement | 12 | 83 | 21 | 1 | 117 |
| Hiring & Recruitment | 52 | 8 | 8 | 3 | 71 |
| Social Protection | 39 | 17 | 8 | 2 | 66 |
| Creative Output | 32 | 20 | 8 | 3 | 64 |
| Skill Obsolescence | 5 | 49 | 6 | 1 | 61 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 15 | 14 | — | 3 | 32 |
| Industry | — | — | — | 1 | 1 |
Governance
The Pilot Zone policy has a more pronounced enabling effect on ESG performance for non-state-owned enterprises compared with state-owned enterprises.
Heterogeneity analysis by ownership type reported in the paper (comparison between state-owned vs. non-state-owned A-share listed manufacturing firms under DID specification).
Operational efficiencies significantly moderate the policy effect, further amplifying the Pilot Zone policy's positive impact on ESG performance.
Reported moderation/heterogeneity analysis indicating that firms with higher operational efficiency experience stronger positive policy effects on ESG performance.
Enterprise resource allocation significantly moderates the policy effect, amplifying the enabling effect of the Pilot Zone policy on ESG performance.
Reported moderation/heterogeneity analysis showing interaction effects between measures of enterprise resource allocation and the Pilot Zone policy on ESG outcomes in the DID framework.
The policy enhances manufacturing enterprises' ESG performance by strengthening environmental compliance pressures (regulatory/compliance channel).
Mechanism analysis reported in the paper identifying increased environmental compliance pressure as a transmission channel linking the Pilot Zone policy to improved ESG performance.
The policy primarily enhances manufacturing enterprises' ESG performance by raising R&D expenditure intensity (R&D investment channel).
Mechanism analysis reported in the paper identifying R&D expenditure intensity as a transmission channel between the Pilot Zone policy and firm ESG performance (presumably mediation/interaction tests within DID framework).
The positive effect of the Pilot Zone policy on manufacturing firms' ESG performance is robust to parallel trends tests, placebo tests, and multiple robustness checks.
Reported application of common DID robustness diagnostics: parallel trends test, placebo tests, and additional robustness checks (details not provided in abstract). Same sample frame: A-share listed manufacturing firms, 2010–2023.
The Artificial Intelligence Innovation and Development Pilot Zone policy exerts a significant positive effect on manufacturing enterprises' ESG performance.
Empirical analysis using a multi-period difference-in-differences (DID) model leveraging the establishment of National New-Generation Artificial Intelligence Innovation and Development Pilot Zones as a quasi-natural experiment; sample: A-share listed manufacturing enterprises on the Shanghai and Shenzhen Stock Exchanges, 2010–2023. Robustness checks reported (parallel trends, placebo tests, multiple robustness checks).
The classical First Fundamental Theorem of Welfare Economics is recovered as the low-autonomy limit of the autonomy-qualified model.
Analytical result in the paper showing limiting case of the model yields the classical theorem (theoretical/mathematical derivation).
Using a minimal general-equilibrium model with autonomy-conditioned welfare, welfare-status assignment, delegation accounting, and verification institutions, we set out conditions under which an autonomy-complete competitive equilibrium is autonomy-Pareto efficient.
Formal theoretical development and derivation in a minimal general-equilibrium model described in the paper (mathematical/modeling evidence; no empirical sample).
The First Fundamental Theorem ought to be subject to an autonomy qualification where the impact of changes in autonomy assumptions is incorporated.
Normative prescription based on the paper's conceptual critique and modeling agenda; supported by theoretical reasoning rather than empirical testing.
Government transfers become compelling when singularity-driven growth overwhelms deadweight costs.
Conditional policy conclusion stated in the abstract based on model comparison of welfare gains versus deadweight costs; no empirical calibration or data reported.
Market incompleteness creates a rationale for government transfers.
Normative/policy implication stated in the abstract, derived from the model's welfare comparisons; no empirical validation provided.
Because markets are incomplete (investors cannot trade private AI capital), AI stocks command a premium.
Theoretical result asserted in the paper's abstract, derived from the asset-pricing model under market incompleteness (no empirical data provided).
We develop an asset pricing model in which investors use AI stocks to hedge against an AI singularity that displaces their consumption.
Description of the paper's theoretical asset-pricing model and stated model mechanism in the abstract; no empirical test reported.
AI stocks trade at extraordinary valuations.
Explicit statement in the paper's abstract; no empirical data, sample, or statistical analysis reported.
The sustainability of the algorithmic state rests on a movement from technocratic secrecy to value-based transparency to ensure AI-human collaboration is founded on institutional accountability and algorithmic justice.
Authorial conclusion from the systematic review synthesis (2018–2026) advocating a policy/practice shift; presented as normative policy recommendation rather than quantified empirical finding.
Empirical evidence shows great gains in efficiency in fiscal forecasting.
Empirical studies included in the PRISMA-guided review (2018–2026) reporting improved fiscal forecasting outcomes; no quantitative effect sizes provided in abstract.
Empirical evidence shows great gains in efficiency at routinised administrative tasks.
Empirical studies reported in the systematic review (2018–2026); the abstract claims empirical evidence of efficiency gains but does not report specific study counts, sample sizes, or effect magnitudes.
This survey provides scholars and practitioners with a structured understanding of how agentic AI is reshaping financial markets and identifies critical research directions to ensure these systems enhance both operational efficiency and market resilience.
Statement of contribution in the paper; based on the paper's literature review, taxonomy, and identified research agenda.
Agentic AI offers substantial potential for enhanced market efficiency, liquidity provision, and risk management.
Survey synthesis of foundational research, market applications, and technical architectures suggesting potential benefits; no original empirical evaluation reported.
The emergence of agentic AI represents a fundamental transformation in financial markets, characterized by autonomous systems capable of reasoning, planning, and adaptive decision-making with minimal human intervention.
Conceptual claim stated in the survey's introduction and synthesis of recent advances; based on literature review and theoretical framing rather than new empirical data.
Countries around the world are rushing to encourage greater investment and growth in their domestic AI industries.
Statement/observation presented in the paper's introduction; based on the paper's descriptive overview of global policy activity (literature review / policy survey implied). No sample size reported.
When unfairness is driven by uncertainty (rather than incidental noise), accounting for uncertainty is essential to achieving fair and effective decision-making.
Synthesis/argument based on formalization and simulation experiments showing cases where uncertainty causes unfair outcomes and methods that account for uncertainty mitigate those outcomes.
The proposed framework can help practitioners diagnose, audit, and govern fairness risks in socio-technical decision systems.
Authors propose a diagnostic/audit/governance framework (conceptual contribution) and illustrate its use through examples and simulations; no field deployment evidence provided in the abstract.
Algorithmic examples in the paper demonstrate it is possible to reduce outcome variance for disadvantaged groups while preserving institutional objectives such as expected utility.
Algorithmic examples and simulation experiments reported in the paper demonstrating reductions in outcome variance for disadvantaged groups together with preserved expected utility (results from synthetic/simulated data and model runs).
The authors formalize model and feedback uncertainty using counterfactual logic and reinforcement learning.
Paper describes formalization/mathematical definitions linking counterfactual logic and reinforcement learning to model and feedback uncertainty (theoretical/methodological contribution).
This paper introduces a taxonomy of uncertainty in sequential decision-making consisting of three types: model uncertainty, feedback uncertainty, and prediction uncertainty.
Paper presents a conceptual taxonomy and names the three uncertainty types in the text/abstract; theoretical exposition in the methods/definitions sections (no external empirical sample required).
The emergence of 'Industry 4.0 Inc.' is likely to induce further collaboration among participating incumbents.
Authors' inference based on observed interconnections and overlapping investments in the M&A-based mapping (predictive/interpretive claim; no quantified projection provided in the excerpt).
One consequence of increased M&A activity and overlapping investments is the emergence of interconnections that have given rise to a new structure the authors term 'Industry 4.0 Inc.'
Network mapping of corporate linkages and overlapping investments derived from the M&A deal analysis spanning more than two decades (method: empirical mapping of inter-corporate ties); exact counts not provided in the excerpt.
Mergers and acquisitions are one of the principal tools industrial firms use to overcome this dual challenge.
Authors' argumentation supported by an empirical analysis of more than two decades of M&A deals (method: M&A deal analysis); exact sample size not stated in provided text.
When models err, their incorrect predictions disproportionately lean intervention-oriented.
Error analysis of model predictions showing that among incorrect predictions, a larger share favor intervention-oriented causal signs than market-oriented ones (directional skew in errors).
Across 18 of 20 models, accuracy is systematically higher when the empirically verified causal sign aligns with intervention-oriented expectations than with market-oriented ones.
Model-by-model accuracy comparison broken down by whether the empirically verified causal sign aligns with intervention-oriented vs market-oriented expectations; observed higher accuracy for intervention-aligned cases in 18/20 models.
Effective governance of AI as a dual-use technology will likely require a multilateral institutional architecture functionally analogous (though not identical) to the role performed by the IAEA in the nuclear domain, with explicit safeguards against co-option of hardware controls for domestic repression.
Normative institutional design argument and analogy to the IAEA presented in the paper (policy proposal; comparative institutional analysis).
Hardware-layer governance, including chip-level attestation mechanisms such as FlexHEG, trusted execution environments, confidential computing, and complementary software-layer safeguards, offers a defense-in-depth alternative to the current binary framing of openness vs restriction.
Proposed governance architecture and technical discussion in the paper citing concrete mechanisms (technical-proposal and conceptual analysis; no experimental or deployment data reported in the summary).
The global concentration of compute infrastructure makes open-weight models one of the most viable pathways to sovereign AI capacity in the Global South.
Analysis of global compute infrastructure concentration and pathway mapping in the paper (conceptual/structural analysis; no numerical sample provided in the summary).
The findings point to a staged progression of AI utility from low-consequence assistance toward higher-order automation, as trust, infrastructure, and verification mature.
Synthesis of interview responses (over 30) indicating current use cases are lower-risk assistance and that stakeholders expect (or prefer) gradual progression toward automation contingent on trust/infrastructure/verification improvements.
Reliability, verification, and auditability are central requirements for adoption, driving human-in-the-loop frameworks and governance aligned with existing engineering reviews.
Consistent themes from interviews (over 30) indicating stakeholders prioritize reliability, verifiability, and audit trails, leading to preference for human-in-the-loop designs integrated with current review processes.
Higher-value agentic gains come from orchestrating multi-step workflows across tools.
Observed and reported in interviews (over 30) with stakeholders in engineering and manufacturing workflows describing value from agentic orchestration across tools.
Near-term AI gains cluster around structured, repetitive work and data-intensive synthesis.
Qualitative findings from an exploratory state-of-practice study based on over 30 semi-structured interviews across four stakeholder groups (large enterprises, small/medium firms, AI developers, and CAD/CAM/CAE vendors).
‘Smarter’ AI agents are more profitable.
Measured profits earned by agents of different capability levels in the trading experiment and observed higher profits for higher-capability ('smarter') agents.
‘Smarter’ AI agents perform better at information aggregation.
Experimental comparison of AI agents with different capability levels ('smarter' vs. less smart) in the trading experiment; measured aggregation via log error of last price and found better performance for higher-capability agents.
Prediction markets are robust to cheap talk, market duration, initial price, and strategic prompting.
Synthesis of experimental results showing no change in aggregation performance across manipulations (cheap talk, duration, initial price, strategic prompting).
The median market is effective at aggregating information in the easy information structures.
Controlled laboratory experiment in which AI agents traded in prediction markets after receiving private signals; information aggregation measured by the log error of the last price; comparison across 'easy' information structures using median-market outcomes.
Because misalignment can occur along each axis, and affect stakeholders differently, alignment cannot be 'solved' through technical design alone, but must be managed through ongoing institutional processes that determine how objectives are set, how systems are evaluated, and how affected communities can contest or reshape those decisions.
Normative conclusion drawn from the three-axis framework and discussion of stakeholder impacts (conceptual policy prescription; no empirical testing reported).
Alignment is inherently pluralistic and context-dependent, and resolving misalignment involves trade-offs among competing values.
Theoretical and normative argument in the paper about pluralism and context-dependence of values (conceptual discussion; no empirical quantification).
The three-axis decomposition implies that alignment is fundamentally a problem of governance rather than engineering alone.
Logical inference from the proposed decomposition and normative argument in the paper (theoretical reasoning; no empirical evidence).
The three-axis framework provides a systematic way of diagnosing why misalignment arises in real-world systems and clarifies that alignment cannot be treated as a single technical property of models but is an outcome shaped by how objectives are specified, how information is distributed, and whose interests count in practice.
Conceptual argument and analytic claim about the explanatory utility of the proposed framework (theoretical demonstration; no empirical tests reported).
Misalignment can be reconceptualised as arising along three interacting axes: objectives, information, and principals (drawing on the principal–agent framework).
Theoretical framing using the principal–agent framework; conceptual decomposition proposed in the paper (no empirical validation reported).
The alignment problem is better understood as a structural question about governance: not whether an AI system is aligned in the abstract, but whether it is aligned enough, for whom, and at what cost.
Normative and conceptual argument presented by the author proposing a governance-focused reconceptualization (theoretical analysis; no empirical data).
Statelessness is the load-bearing property explaining enterprises' preference for weaker but replayable retrieval pipelines, and DPM demonstrates this property is attainable without the decisioning penalty retrieval pays.
Synthesis/conclusion based on theoretical argument and empirical results presented (architectural analysis + experiments showing DPM performance and auditability).