Evidence (14922 claims)
Search and filter individual claims pulled from the papers. If you are looking for a specific finding ("what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other, use the Evidence Explorer instead.
The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).
Browse by theme
Nine broad, paper-level topics. Click one to filter the claims below.
| Theme | Claims |
|---|---|
| Adoption | 9047 |
| Productivity | 8066 |
| Governance | 7278 |
| Human-AI Collaboration | 6912 |
| Org Design | 4439 |
| Innovation | 4359 |
| Labor Markets | 3652 |
| Skills & Training | 3018 |
| Inequality | 2160 |
Claims by outcome category
Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 795 | 210 | 105 | 955 | 2131 |
| Governance & Regulation | 886 | 414 | 197 | 126 | 1654 |
| Organizational Efficiency | 826 | 204 | 129 | 87 | 1257 |
| Technology Adoption Rate | 681 | 259 | 128 | 110 | 1189 |
| Research Productivity | 464 | 138 | 65 | 349 | 1028 |
| Output Quality | 503 | 196 | 61 | 53 | 813 |
| Decision Quality | 351 | 180 | 84 | 51 | 673 |
| AI Safety & Ethics | 238 | 288 | 71 | 34 | 637 |
| Firm Productivity | 455 | 58 | 92 | 20 | 631 |
| Market Structure | 186 | 172 | 123 | 25 | 511 |
| Task Allocation | 222 | 70 | 76 | 34 | 407 |
| Innovation Output | 238 | 28 | 48 | 18 | 334 |
| Skill Acquisition | 177 | 62 | 62 | 17 | 318 |
| Employment Level | 107 | 57 | 108 | 13 | 287 |
| Fiscal & Macroeconomic | 135 | 72 | 44 | 26 | 284 |
| Firm Revenue | 172 | 50 | 28 | 5 | 256 |
| Consumer Welfare | 121 | 68 | 45 | 12 | 246 |
| Task Completion Time | 183 | 33 | 10 | 13 | 240 |
| Inequality Measures | 45 | 126 | 50 | 6 | 227 |
| Worker Satisfaction | 95 | 74 | 23 | 12 | 204 |
| Error Rate | 77 | 98 | 11 | 4 | 190 |
| Regulatory Compliance | 84 | 73 | 17 | 7 | 181 |
| Automation Exposure | 61 | 61 | 27 | 14 | 166 |
| Training Effectiveness | 98 | 21 | 14 | 19 | 154 |
| Wages & Compensation | 78 | 37 | 25 | 6 | 146 |
| Developer Productivity | 105 | 18 | 14 | 6 | 144 |
| Team Performance | 87 | 17 | 28 | 10 | 143 |
| Job Displacement | 12 | 83 | 23 | 1 | 119 |
| Hiring & Recruitment | 53 | 8 | 8 | 3 | 72 |
| Social Protection | 39 | 17 | 8 | 2 | 66 |
| Creative Output | 32 | 20 | 8 | 3 | 64 |
| Skill Obsolescence | 5 | 50 | 6 | 1 | 62 |
| Labor Share of Income | 17 | 20 | 17 | — | 54 |
| Worker Turnover | 15 | 15 | — | 3 | 33 |
| Industry | — | — | — | 1 | 1 |
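The counts in the table above can be checked with a few lines of Python. The sketch below is illustrative only and not part of the site: it copies the rows as printed, treats a dash as zero (an assumption), and reports for each outcome the share of positive findings and the number of claims in Total that do not appear in any of the four direction columns. The page does not say what that remainder represents, so the name `unassigned` is a hypothetical label.

```python
# Rows copied from the outcome table: (outcome, positive, negative, mixed, null, total).
# Assumption: a dash in the table means zero claims, so it is entered as 0.
ROWS = [
    ("Other", 795, 210, 105, 955, 2131),
    ("Governance & Regulation", 886, 414, 197, 126, 1654),
    ("Organizational Efficiency", 826, 204, 129, 87, 1257),
    ("Technology Adoption Rate", 681, 259, 128, 110, 1189),
    ("Research Productivity", 464, 138, 65, 349, 1028),
    ("Output Quality", 503, 196, 61, 53, 813),
    ("Decision Quality", 351, 180, 84, 51, 673),
    ("AI Safety & Ethics", 238, 288, 71, 34, 637),
    ("Firm Productivity", 455, 58, 92, 20, 631),
    ("Market Structure", 186, 172, 123, 25, 511),
    ("Task Allocation", 222, 70, 76, 34, 407),
    ("Innovation Output", 238, 28, 48, 18, 334),
    ("Skill Acquisition", 177, 62, 62, 17, 318),
    ("Employment Level", 107, 57, 108, 13, 287),
    ("Fiscal & Macroeconomic", 135, 72, 44, 26, 284),
    ("Firm Revenue", 172, 50, 28, 5, 256),
    ("Consumer Welfare", 121, 68, 45, 12, 246),
    ("Task Completion Time", 183, 33, 10, 13, 240),
    ("Inequality Measures", 45, 126, 50, 6, 227),
    ("Worker Satisfaction", 95, 74, 23, 12, 204),
    ("Error Rate", 77, 98, 11, 4, 190),
    ("Regulatory Compliance", 84, 73, 17, 7, 181),
    ("Automation Exposure", 61, 61, 27, 14, 166),
    ("Training Effectiveness", 98, 21, 14, 19, 154),
    ("Wages & Compensation", 78, 37, 25, 6, 146),
    ("Developer Productivity", 105, 18, 14, 6, 144),
    ("Team Performance", 87, 17, 28, 10, 143),
    ("Job Displacement", 12, 83, 23, 1, 119),
    ("Hiring & Recruitment", 53, 8, 8, 3, 72),
    ("Social Protection", 39, 17, 8, 2, 66),
    ("Creative Output", 32, 20, 8, 3, 64),
    ("Skill Obsolescence", 5, 50, 6, 1, 62),
    ("Labor Share of Income", 17, 20, 17, 0, 54),
    ("Worker Turnover", 15, 15, 0, 3, 33),
    ("Industry", 0, 0, 0, 1, 1),
]


def summarize(row):
    """Return (positive share of Total, claims not in any direction column)."""
    _, pos, neg, mixed, null, total = row
    unassigned = total - (pos + neg + mixed + null)  # hypothetical label
    return pos / total, unassigned


# Sum of the Total column, for comparison with the count in the page header.
grand_total = sum(row[5] for row in ROWS)
print("claims across all outcomes:", grand_total)

for row in ROWS:
    share, unassigned = summarize(row)
    print(f"{row[0]:<28} positive {share:6.1%}   unassigned {unassigned}")
```

Run as a script, this prints one line per outcome. Comparing `grand_total` with the claim count in the page header is a quick consistency check, and a nonzero `unassigned` value flags rows where the four direction columns add up to less than Total.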
Embedding compliance features into automation can reduce regulatory fines and litigation risk, thereby affecting firm risk profiles and cost of capital.
Theoretical implication drawn from aligning governance with compliance objectives; the paper presents no empirical evidence linking the proposed pattern to reduced fines or changes in cost of capital.
The framework is applicable across multiple sectors and aligns with industry best practices; it is presented as a deployable pattern rather than a one-size-fits-all product.
Authors' assertion based on multi-sector practitioner examples and alignment with documented industry practices (qualitative). Details on sector coverage and case selection are limited.
The proposed governed hyperautomation pattern yields benefits including faster scaling of automation, reduced operational risk, maintained regulatory compliance, and preserved long-term system integrity.
Claim grounded in conceptual argument and practitioner case-based illustrations; no large-scale quantitative evaluation or causal inference provided in the paper.
Technical mitigations such as prompt/response attestation, watermarking, model output provenance, access controls, differential-design of prompts (few-shot safety), and monitoring tools can help detect or prevent prompt fraud.
Proposed technical controls and rationale derived from threat modeling and prior literature on provenance/watermarking; proposals are not empirically validated in the paper.
Targeted subsidies or support for SMEs to access SECaaS could accelerate secure AI adoption where scale barriers exist.
Economic rationale and proposed field-experiment designs; no empirical trial results presented in the chapter.
Clarifying liability and the shared responsibility model will better align incentives between providers and customers and improve security outcomes.
Policy and legal analysis; case studies of incidents where unclear responsibilities hampered response; recommended as an intervention rather than proven by causal evidence.
Promoting interoperable standards and certification can reduce lock-in and lower search costs for buyers, fostering competition in SECaaS markets.
Policy recommendation grounded in market-design theory and analogies to other standardization efforts; supporting case studies from other technology markets suggested but not empirically established here.
Open, linked phenomic–genomic datasets could inform policy and conservation markets (e.g., biodiversity credits) by improving monitoring and trait-based risk assessment models.
Policy implication advanced in the discussion; presented as potential application rather than demonstrated outcome.
Paired phenome–genome data increases the scientific and commercial value of the dataset for models predicting phenotype from genotype and vice versa.
Analytical argument in the implications section; no empirical demonstrations in the paper of improved model performance using these pairings.
Open, standardized 3D phenomic datasets reduce the need for individual labs/companies to finance expensive scanning campaigns and democratize access for academic groups and startups.
Argument in the paper's implications section based on the public release of a large standardized dataset; not an empirically tested economic outcome in the study.
Demand would grow for liability insurance tailored to EdTech, third‑party audits, fairness certifications, and specialized legal advisory services; these markets would affect costs and differential competitiveness.
Predictive market analysis and policy reasoning (no survey or market data presented).
Stricter legal exposure may slow some risky experimentation but encourage investment in fairness testing, robust evaluation, and explainability tools — potentially increasing the quality and trustworthiness of deployed AI in education.
Normative economic argumentation about incentives for R&D and testing; no empirical measurement of innovation rates provided.
Faster iterative experimental cycles enabled by LLM orchestration may increase returns to experimental R&D and change the optimal allocation between computation, instrumentation, and labor.
Economic argumentation about iterative cycles and returns to capital/labor; proposed rather than empirically demonstrated.
The method can identify frontier topics and cross-field convergence (e.g., methods migrating from NLP to vision) to inform assessments of comparative advantage and specialization across institutions/countries.
Proposed implication: using topic maps and cluster dynamics to detect frontier topics and cross-field migration; the summary presents no concrete empirical examples or validation beyond a general mapping claim on ICML/ACL abstracts.
The approach is scalable and model-agnostic: different LLMs and embedding models can be swapped into the pipeline without changing the overall method.
Claimed design property in the paper summary (asserted ability to substitute different LLMs/embedding models). No detailed cross-model robustness experiments or scalability benchmarks provided in the summary.
The paper provides an initial mapping from diagnosis to intervention strategies (therapeutics) — i.e., treatment planning for model dysfunctions.
Conceptual mapping and proposed intervention strategies documented in the therapeutics section (initial mappings; not claimed as exhaustive).
Policy recommendation: governments should shift from direct administrative provision toward a strategic purchaser role using digital platforms to foster inclusive labor market access.
Policy implication derived from empirical pattern of platform-mediated employment growth and the identified Fiscal-Digital Synergy; recommendation based on observed heterogeneity by digital infrastructure and procurement channels (280-city analysis).
Public cultural services can function as productive social infrastructure that advances SDG 8 (decent work) provided adequate digital capacity exists.
Interpretation of empirical results showing employment gains contingent on digital infrastructure; normative linkage to SDG 8 drawn by authors based on observed Fiscal-Digital Synergy effects (empirical sample: 280 cities, 2008–2021).
AI should serve precision and purpose in public policy — improving foresight, enabling better trade-offs, and preserving democratic accountability.
Normative policy prescription and conceptual argumentation in the book; no empirical testing or quantified outcomes reported.
AI-driven systems should empower people with knowledge and pathways to participate in global markets rather than concentrate gains.
Normative recommendation derived from policy analysis and value judgments in the book; not supported by empirical evidence in the blurb.
Algorithmic transparency and auditability can reduce systemic risk from opaque automated lending decisions and improve regulator oversight and macroprudential policy.
Conceptual/systemic-risk argument in the "Systemic risk & governance externalities" section; no empirical systemic-risk analysis provided.
Improved algorithmic transparency could reduce information asymmetries, lowering adverse selection and moral hazard over time and potentially expanding credit to underserved populations.
Conceptual economic argument in the "Credit allocation & pricing" section; based on theory rather than empirical testing.
If properly designed and enforced, the protocol measures can improve credit access for underserved populations and reduce biased exclusion, supporting inclusive growth.
Normative claim supported by doctrinal arguments, comparative regulatory literature and technical fairness literature synthesized in the audit (no controlled empirical evaluation reported).
Firms that effectively implement governed hyperautomation may realize sustainable efficiency and reliability advantages, potentially increasing market concentration in some sectors unless governance costs level the playing field.
Strategic and competitive-dynamics argument derived from case examples and best-practice synthesis; no sector-level empirical concentration measures presented.
Standardized governance patterns reduce information asymmetries, enabling insurers and regulators to better price and manage enterprise AI risks.
Policy implication argued from the existence of standardized governance artifacts (audit trails, certifications) and industry practice; conceptual, no empirical insurer/regulator data presented.
Embedding governance reduces downside risks (compliance fines, data breaches), improving expected net returns of automation investments and lowering the adoption threshold for risk-averse firms.
Conceptual cost-benefit argument and industry best-practice examples; lacking quantitative measurement of returns or threshold shifts.
High non-wage costs (NWC ≈ 51%) and a large formalization premium (CFIL ≈ +88%) increase the private incentive to substitute labor with capital, including AI/automation, especially for routine tasks.
Policy implication derived from the measured 2023 NWC and CFIL values for the 19-country sample, combined with economic substitution logic (cost of labor relative to capital/technology); the note presents no direct firm-level empirical evidence of automation responses.
VIS can be integrated into macro/meso AI-economics models (input–output general equilibrium, growth models) to capture embodied labor and capital effects and to enable counterfactual analysis of AI diffusion scenarios.
Authors propose methodological extensions and modeling directions that embed VIS-style accounting into larger economic models for scenario analysis (conceptual suggestion).
VIS metrics can inform policy decisions (workforce retraining, sectoral subsidies, taxation) by revealing where AI-induced productivity changes will propagate through supply chains.
Authors argue policy relevance based on VIS’s ability to map upstream/downstream labor effects; presented as an implication rather than empirically validated policy outcomes.
VIS-based measures can improve measurement of AI’s productivity impacts by better capturing indirect labor displacement or augmentation from AI-driven automation across supply chains.
Conceptual extension: VIS framework captures indirect labor effects that would matter when assessing AI-driven automation impacts; not empirically tested for AI within the paper.
Research should prioritize more granular skill-to-AI-capability mappings, longitudinal tracking of adoption vs. exposure, and integration of firm behavior and regulatory dynamics into agent-based models to move from exposure assessment toward outcome prediction.
Paper's recommendations for future work built on acknowledged limitations and the gap between capability exposure and realized outcomes.
Incentives for human‑augmenting AI (e.g., subsidies or tax incentives tied to task redesign and training) can promote inclusive adoption patterns.
Policy analysis and comparative case studies; theoretical models that predict firm adoption responses to incentives, but limited causal empirical evidence specific to AI-targeted incentives.
By synthesizing computer science, engineering, and financial policy insights, DRL should be viewed not merely as a mathematical tool but as a transformative agent within the global socio-technical infrastructure of capital markets.
High-level synthesis and interdisciplinary argumentation in the paper; no empirical evidence or longitudinal studies are cited in the excerpt to demonstrate systemic transformation.
Research agenda items include quantifying social returns to different alignment interventions, studying market equilibria under participatory vs. opaque strategies, and modeling optimal regulatory mixes under uncertainty about harms and capability growth.
Prescriptive research agenda derived from the paper's economic analysis and identified knowledge gaps; presented as proposed studies rather than completed research.
If conformal filtering produces vacuous outputs at factuality levels customers demand, adoption in knowledge-intensive domains may be limited until methods simultaneously provide robustness and informativeness; vendors using efficient verifiers and robust calibration may gain competitive advantage.
Paper's market/economic discussion drawing on empirical trade-offs (informativeness vs. factuality) and cost comparisons; this is an applied implication rather than a direct experimental result.
Authors propose the 'AI orchestra' concept: future development will involve coordinated ensembles of specialized AI agents (code generation, test generation, dependency analysis, security scanning) orchestrated by humans and higher-level controllers.
Theoretical/conceptual argument by the authors grounded in qualitative findings from Netlight (practitioner reports of multiple tools and coordination frictions); this is a forward-looking synthesis rather than an empirically established fact.
Modular and cell‑free platforms could enable decentralized, localized manufacturing of specialty compounds, potentially altering trade flows away from centralized petrochemical hubs.
Conceptual synthesis plus small-scale demonstrations of modular/cell-free units in the reviewed literature; limited pilot projects and discussion of potential scalability and portability.
Canvas Design Principles aimed at reducing algorithmic myopia matter for welfare and regulatory concerns: better adaptive behavior reduces mispricing/misattribution risks but raises questions about transparency, accountability, and systemic amplification of shocks.
Policy and governance implication inferred from the claimed reductions in algorithmic myopia and increased adaptivity; study does not report direct welfare/regulatory impact measurements.
Faster, more accurate identification of demand shifts can compress the window for first‑mover advantages, intensify competitive dynamics, and raise the premium on organizational agility and human–AI integration capabilities.
Theoretical implication derived from observed improvements in signal detection (~5.8×) and resilience; not directly measured as market‑level competitive outcomes in the study.
Product teams evaluating LLM-powered features rely on a spectrum of practices—from informal “vibe checks” to organizational meta-work—to cope with LLMs’ unpredictability.
Qualitative interview study with 19 practitioners; thematic coding of transcripts produced descriptions of a range of evaluation practices used by teams.
Platform design choices (property rights, portability, reputation, tokenization, escrowed memories) will shape incentives for contributions to shared knowledge and agent improvement.
Policy and mechanism-design implications drawn from observed phenomena (shared memories, contributions, and trust) in the qualitative dataset; recommendation rather than empirically tested claim.
Shared memory architectures create public-good–like externalities (knowledge diffusion and spillovers) that may be underprovided absent coordination or platform governance.
Qualitative observations of shared memories and diffusion patterns plus theoretical economic interpretation; no empirical quantification of spillover magnitudes provided.
Easier specification of constraints can reduce some harms (clear safety violations) but centralizes normative power (who defines constraints) and creates international/cultural externalities and risks of regulatory capture.
Normative and economic argument in the paper combining technical tractability of constraints with governance concerns; this is an inference about likely distributional effects rather than empirically established fact.
Adoption of C.A.P. may reduce demand for routine oversight/clarification roles and increase demand for higher-skill roles such as prompt/system designers and dialogue curators.
Labor demand and task composition analysis presented as a conceptual projection in the paper; no labor-market empirical study reported.
Because failure modes such as definition misalignment and hypothesis creep were observed, the authors argue for regulation/standards around disclosure of AI-assisted scientific claims and archival of verification artifacts.
Policy recommendation in the paper derived from the documented process-level failure modes in the single project; recommendation is prescriptive, not empirically validated beyond the project.
Lower data and compute requirements could decentralize innovation (reducing incumbent advantages tied to massive compute/data), but the complexity of embodied systems and real-world testing could create new specialized incumbents (robotics platforms, simulation providers).
Market-structure hypothesis based on trade-offs between resource needs and platform value; speculative and not empirically tested in the paper.
Improved recovery capability from LEAFE reduces brittle failure modes but may also enable more autonomous behavior in novel settings, increasing both benefits and potential misuse risks.
Safety/risk discussion in the paper linking enhanced recovery/autonomy to both reduced brittleness (benefit) and heightened autonomy-related risks; supported by observed improved recovery behavior in experiments and conceptual risk analysis.
Widespread adoption of LEAFE-like learning could accelerate diffusion of agentic automation across sectors, affecting wages, task allocation, and demand for complementary capital (tooling, monitoring, retraining systems).
High-level economic reasoning in Discussion/Implications section tying observed performance improvements and sample-efficiency gains to possible macroeconomic effects; no empirical macroeconomic data provided.
If smaller tuned models can capture most performance of much larger systems, market power may shift toward specialized, cheaper models plus toolchains, promoting niche competition and verticalized offerings.
Inference from empirical finding that a 7B tuned model achieves 91.2% of a larger model's quality; market-structure implication (theoretical/economic argument, not empirically tested).
Proprietary, high-quality surrogate models could create competitive advantage and barriers to entry, whereas open-source surrogates would democratize access.
This is an implication/policy argument in the paper's discussion about IP and market effects; it is a theoretical/qualitative claim rather than an empirical result from the experiments.