Evidence (14922 claims)

Search and filter individual claims pulled from the papers. If you are looking for a specific finding (for example, "what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other instead, use the Evidence Explorer.

The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).

Browse by theme

Nine broad, paper-level topics. Click one to filter the claims below.

Human-AI Collaboration

Claims by outcome category

Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.

Outcome	Positive	Negative	Mixed	Null	Total
Other	795	210	105	955	2131
Governance & Regulation	886	414	197	126	1654
Organizational Efficiency	826	204	129	87	1257
Technology Adoption Rate	681	259	128	110	1189
Research Productivity	464	138	65	349	1028
Output Quality	503	196	61	53	813
Decision Quality	351	180	84	51	673
AI Safety & Ethics	238	288	71	34	637
Firm Productivity	455	58	92	20	631
Market Structure	186	172	123	25	511
Task Allocation	222	70	76	34	407
Innovation Output	238	28	48	18	334
Skill Acquisition	177	62	62	17	318
Employment Level	107	57	108	13	287
Fiscal & Macroeconomic	135	72	44	26	284
Firm Revenue	172	50	28	5	256
Consumer Welfare	121	68	45	12	246
Task Completion Time	183	33	10	13	240
Inequality Measures	45	126	50	6	227
Worker Satisfaction	95	74	23	12	204
Error Rate	77	98	11	4	190
Regulatory Compliance	84	73	17	7	181
Automation Exposure	61	61	27	14	166
Training Effectiveness	98	21	14	19	154
Wages & Compensation	78	37	25	6	146
Developer Productivity	105	18	14	6	144
Team Performance	87	17	28	10	143
Job Displacement	12	83	23	1	119
Hiring & Recruitment	53	8	8	3	72
Social Protection	39	17	8	2	66
Creative Output	32	20	8	3	64
Skill Obsolescence	5	50	6	1	62
Labor Share of Income	17	20	17	—	54
Worker Turnover	15	15	—	3	33
Industry	—	—	—	1	1
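
The counts in the table above can be turned into per-outcome shares with a few lines of code. The sketch below is illustrative only: the helper names (`parse_row`, `summarize`) are hypothetical, the rows are copied from the table, and the dash placeholder is assumed to mean zero. It also reports any gap between Total and the four listed directions; the page does not say what that gap represents, so the sketch only surfaces it.

```python
# Illustrative sketch: rows copied from the table above (tab-separated).
# Assumption: the dash placeholder means no claims, so it is read as 0.
ROWS = """\
Wages & Compensation\t78\t37\t25\t6\t146
Job Displacement\t12\t83\t23\t1\t119
Labor Share of Income\t17\t20\t17\t—\t54
Other\t795\t210\t105\t955\t2131
"""


def parse_row(line):
    """Split one tab-separated table row into a dict of integer counts."""
    name, *cells = line.split("\t")
    counts = [0 if cell.strip() == "—" else int(cell) for cell in cells]
    positive, negative, mixed, null, total = counts
    return {
        "outcome": name,
        "positive": positive,
        "negative": negative,
        "mixed": mixed,
        "null": null,
        "total": total,
    }


def summarize(row):
    """Return the positive share and the count not covered by the four columns."""
    listed = row["positive"] + row["negative"] + row["mixed"] + row["null"]
    return {
        "positive_share": row["positive"] / row["total"],
        # Claims counted in Total but not in any of the four direction columns.
        "unlisted": row["total"] - listed,
    }


if __name__ == "__main__":
    for line in ROWS.strip().splitlines():
        row = parse_row(line)
        stats = summarize(row)
        print(row["outcome"], round(stats["positive_share"], 3), stats["unlisted"])
```

Dividing by Total rather than by the sum of the four columns keeps the shares comparable with the figures shown on the page, at the cost of letting shares in some rows sum to slightly less than one.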

Embedding compliance features into automation can reduce regulatory fines and litigation risk, thereby affecting firm risk profiles and cost of capital.

Theoretical implication drawn from aligning governance with compliance objectives; no empirical evidence linking the proposed pattern to reduced fines or changes in cost of capital in the paper.

low positive Governed Hyperautomation for CRM and ERP: A Reference Patter... regulatory fines/litigation incidents; firm risk profile; cost of capital (hypot...

The framework is applicable across multiple sectors and aligns with industry best practices; it is presented as a deployable pattern rather than a one-size-fits-all product.

Authors' assertion based on multi-sector practitioner examples and alignment with documented industry practices (qualitative). Details on sector coverage and case selection are limited.

low positive Governed Hyperautomation for CRM and ERP: A Reference Patter... cross-sector applicability and alignment with best practices (qualitative/applic...

The proposed governed hyperautomation pattern yields benefits including faster scaling of automation, reduced operational risk, maintained regulatory compliance, and preserved long-term system integrity.

Claim grounded in conceptual argument and practitioner case-based illustrations; no large-scale quantitative evaluation or causal inference provided in the paper.

low positive Governed Hyperautomation for CRM and ERP: A Reference Patter... automation deployment speed; operational risk incidents; regulatory compliance i...

Technical mitigations such as prompt/response attestation, watermarking, model output provenance, access controls, differential-design of prompts (few-shot safety), and monitoring tools can help detect or prevent prompt fraud.

Proposed technical controls and rationale derived from threat modeling and prior literature on provenance/watermarking; proposals are not empirically validated in the paper.

low positive Prompt Engineering or Prompt Fraud? Governance Challenges fo... effectiveness of specific technical mitigations in detecting/preventing prompt f...

Targeted subsidies or support for SMEs to access SECaaS could accelerate secure AI adoption where scale barriers exist.

Economic rationale and proposed field-experiment designs; no empirical trial results presented in the chapter.

low positive Security-as-a-service: enhancing cloud security through m... SME SECaaS adoption rates, AI adoption by SMEs

Clarifying liability and the shared responsibility model will better align incentives between providers and customers and improve security outcomes.

Policy and legal analysis; case studies of incidents where unclear responsibilities hampered response; recommended as an intervention rather than proven by causal evidence.

low positive Security-as-a-service: enhancing cloud security through m... alignment of incentives, incident response effectiveness, legal clarity

Promoting interoperable standards and certification can reduce lock-in and lower search costs for buyers, fostering competition in SECaaS markets.

Policy recommendation grounded in market-design theory and analogies to other standardization efforts; supporting case studies from other technology markets suggested but not empirically established here.

low positive Security-as-a-service: enhancing cloud security through m... buyer switching costs, market competition indicators

Open, linked phenomic–genomic datasets could inform policy and conservation markets (e.g., biodiversity credits) by improving monitoring and trait-based risk assessment models.

Policy implication advanced in the discussion; presented as potential application rather than demonstrated outcome.

low positive High-throughput phenomics of global ant biodiversity potential influence on policy and conservation market analytics (projected)

Paired phenome–genome data increases the scientific and commercial value of the dataset for models predicting phenotype from genotype and vice versa.

Analytical argument in the implications section; no empirical demonstrations in the paper of improved model performance using these pairings.

low positive High-throughput phenomics of global ant biodiversity value for phenotype–genotype predictive modeling (projected)

Open, standardized 3D phenomic datasets reduce the need for individual labs/companies to finance expensive scanning campaigns and democratize access for academic groups and startups.

Argument in the paper's implications section based on the public release of a large standardized dataset; not an empirically tested economic outcome in the study.

low positive High-throughput phenomics of global ant biodiversity reduction in data-acquisition costs/barriers for downstream users (projected)

Demand would grow for liability insurance tailored to EdTech, third‑party audits, fairness certifications, and specialized legal advisory services; these markets would affect costs and differential competitiveness.

Predictive market analysis and policy reasoning (no survey or market data presented).

low positive Civil Rights and the EdTech Revolution size/growth of insurance and certification markets and effect on vendor costs/co...

Stricter legal exposure may slow some risky experimentation but encourage investment in fairness testing, robust evaluation, and explainability tools — potentially increasing the quality and trustworthiness of deployed AI in education.

Normative economic argumentation about incentives for R&D and testing; no empirical measurement of innovation rates provided.

low positive Civil Rights and the EdTech Revolution innovation behavior (risk‑taking vs. investment in fairness/testing) and resulti...

Faster iterative experimental cycles enabled by LLM orchestration may increase returns to experimental R&D and change the optimal allocation between computation, instrumentation, and labor.

Economic argumentation about iterative cycles and returns to capital/labor; proposed rather than empirically demonstrated.

low positive ChatMicroscopy: A Perspective Review of Large Language Model... returns to experimental R&D and allocation of spending across computation, instr...

The method can identify frontier topics and cross-field convergence (e.g., methods migrating from NLP to vision) to inform assessments of comparative advantage and specialization across institutions/countries.

Proposed implication: using topic maps and cluster dynamics to detect frontier topics and cross-field migration; no concrete empirical examples or validation presented in summary beyond general mapping claim on ICML/ACL abstracts.

low positive Soft-Prompted Semantic Normalization for Unsupervised Analys... detection of frontier topics and cross-field convergence

The approach is scalable and model-agnostic: different LLMs and embedding models can be swapped into the pipeline without changing the overall method.

Claimed design property in the paper summary (asserted ability to substitute different LLMs/embedding models). No detailed cross-model robustness experiments or scalability benchmarks provided in the summary.

low positive Soft-Prompted Semantic Normalization for Unsupervised Analys... pipeline compatibility across different LLMs/embedding models and computational ...

The paper provides an initial mapping from diagnosis to intervention strategies (therapeutics) — i.e., treatment planning for model dysfunctions.

Conceptual mapping and proposed intervention strategies documented in the therapeutics section (initial mappings; not claimed as exhaustive).

low positive Model Medicine: A Clinical Framework for Understanding, Diag... Existence of a proposed mapping from diagnostic categories to candidate interven...

Policy recommendation: governments should shift from direct administrative provision toward a strategic purchaser role using digital platforms to foster inclusive labor market access.

Policy implication derived from empirical pattern of platform-mediated employment growth and the identified Fiscal-Digital Synergy; recommendation based on observed heterogeneity by digital infrastructure and procurement channels (280-city analysis).

low positive Redefining Policy Effectiveness in the Digital Era: From Cor... policy effectiveness for inclusive labor market access (inferred from employment...

Public cultural services can function as productive social infrastructure that advances SDG 8 (decent work) provided adequate digital capacity exists.

Interpretation of empirical results showing employment gains contingent on digital infrastructure; normative linkage to SDG 8 drawn by authors based on observed Fiscal-Digital Synergy effects (empirical sample: 280 cities, 2008–2021).

low positive Redefining Policy Effectiveness in the Digital Era: From Cor... alignment with SDG 8 (decent work) inferred from cultural-sector employment effe...

AI should serve precision and purpose in public policy — improving foresight, enabling better trade-offs, and preserving democratic accountability.

Normative policy prescription and conceptual argumentation in the book; no empirical testing or quantified outcomes reported.

low positive Governing The Future policy foresight quality, decision trade-off management, and preservation of dem...

AI-driven systems should empower people with knowledge and pathways to participate in global markets rather than concentrate gains.

Normative recommendation derived from policy analysis and value judgments in the book; not supported by empirical evidence in the blurb.

low positive Governing The Future distribution of economic gains and levels of participation in global markets

Algorithmic transparency and auditability can reduce systemic risk from opaque automated lending decisions and improve regulator oversight and macroprudential policy.

Conceptual/systemic-risk argument in the "Systemic risk & governance externalities" section; no empirical systemic-risk analysis provided.

low positive Diego Saucedo Portillo Sauceport Research systemic risk indicators related to automated lending (e.g., correlated default ...

Improved algorithmic transparency could reduce information asymmetries, lowering adverse selection and moral hazard over time and potentially expanding credit to underserved populations.

Conceptual economic argument in the "Credit allocation & pricing" section; based on theory rather than empirical testing.

low positive Diego Saucedo Portillo Sauceport Research levels of information asymmetry, incidence of adverse selection/moral hazard, an...

If properly designed and enforced, the protocol measures can improve credit access for underserved populations and reduce biased exclusion, supporting inclusive growth.

Normative claim supported by doctrinal arguments, comparative regulatory literature and technical fairness literature synthesized in the audit (no controlled empirical evaluation reported).

low positive Diego Saucedo Portillo Sauceport Research credit access for underserved populations; incidence of biased exclusion

Firms that effectively implement governed hyperautomation may realize sustainable efficiency and reliability advantages, potentially increasing market concentration in some sectors unless governance costs level the playing field.

Strategic and competitive-dynamics argument derived from case examples and best-practice synthesis; no sector-level empirical concentration measures presented.

low positive Governed Hyperautomation for CRM and ERP: A Reference Patter... firm-level efficiency/reliability gains and sector market concentration

Standardized governance patterns reduce information asymmetries, enabling insurers and regulators to better price and manage enterprise AI risks.

Policy implication argued from the existence of standardized governance artifacts (audit trails, certifications) and industry practice; conceptual, no empirical insurer/regulator data presented.

low positive Governed Hyperautomation for CRM and ERP: A Reference Patter... ability of insurers/regulators to assess/price/manage enterprise AI risk

Embedding governance reduces downside risks (compliance fines, data breaches), improving expected net returns of automation investments and lowering the adoption threshold for risk-averse firms.

Conceptual cost-benefit argument and industry best-practice examples; lacking quantitative measurement of returns or threshold shifts.

low positive Governed Hyperautomation for CRM and ERP: A Reference Patter... expected net returns on automation investments and adoption threshold for firms

High non-wage costs (NWC ≈ 51%) and a large formalization premium (CFIL ≈ +88%) increase the private incentive to substitute labor with capital, including AI/automation, especially for routine tasks.

Policy implication derived from the measured 2023 NWC and CFIL values for the 19-country sample combined with economic substitution logic (cost of labor relative to capital/technology); no direct empirical firm-level evidence of automation responses presented in the note.

low positive Salaried Labor Costs in Latin America and the Caribbean: A T... Incentive/probability of firm-level substitution of labor with capital/automatio...

VIS can be integrated into macro/meso AI-economics models (input–output general equilibrium, growth models) to capture embodied labor and capital effects and to enable counterfactual analysis of AI diffusion scenarios.

Authors propose methodological extensions and modeling directions that embed VIS-style accounting into larger economic models for scenario analysis (conceptual suggestion).

low positive Measuring labor productivity dynamics in U.S. industrial and... feasibility of integrating VIS into macro/meso models for counterfactual AI diff...

VIS metrics can inform policy decisions (workforce retraining, sectoral subsidies, taxation) by revealing where AI-induced productivity changes will propagate through supply chains.

Authors argue policy relevance based on VIS’s ability to map upstream/downstream labor effects; presented as an implication rather than empirically validated policy outcomes.

low positive Measuring labor productivity dynamics in U.S. industrial and... policy-relevant insights on propagation of productivity changes across supply ch...

VIS-based measures can improve measurement of AI’s productivity impacts by better capturing indirect labor displacement or augmentation from AI-driven automation across supply chains.

Conceptual extension: VIS framework captures indirect labor effects that would matter when assessing AI-driven automation impacts; not empirically tested for AI within the paper.

low positive Measuring labor productivity dynamics in U.S. industrial and... comprehensiveness/accuracy of measured AI-induced labor productivity changes (di...

Research should prioritize more granular skill-to-AI-capability mappings, longitudinal tracking of adoption vs. exposure, and integration of firm behavior and regulatory dynamics into agent-based models to move from exposure assessment toward outcome prediction.

Paper's recommendations for future work built on acknowledged limitations and the gap between capability exposure and realized outcomes.

low positive The Iceberg Index: Measuring Workforce Exposure in the AI Ec... proposed research directions (not an empirical measurement)

Incentives for human‑augmenting AI (e.g., subsidies or tax incentives tied to task redesign and training) can promote inclusive adoption patterns.

Policy analysis and comparative case studies; theoretical models that predict firm adoption responses to incentives, but limited causal empirical evidence specific to AI-targeted incentives.

low positive Intelligence and Labor Market Transformation: A Critical Ana... patterns of AI adoption (augmenting vs. substituting) and associated worker outc...

By synthesizing computer science, engineering, and financial policy insights, DRL should be viewed not merely as a mathematical tool but as a transformative agent within the global socio-technical infrastructure of capital markets.

High-level synthesis and interdisciplinary argumentation in the paper; no empirical evidence or longitudinal studies are cited in the excerpt to demonstrate systemic transformation.

low speculative Deep Reinforcement Learning for Dynamic Portfolio Optimizati... transformative impact on socio-technical structures of capital markets (institut...

Research agenda items include quantifying social returns to different alignment interventions, studying market equilibria under participatory vs. opaque strategies, and modeling optimal regulatory mixes under uncertainty about harms and capability growth.

Prescriptive research agenda derived from the paper's economic analysis and identified knowledge gaps; presented as proposed studies rather than completed research.

low speculative LLM Alignment should go beyond Harmlessness–Helpfulness and ... evidence produced by future studies quantifying returns, market equilibria, and ...

If conformal filtering produces vacuous outputs at factuality levels customers demand, adoption in knowledge-intensive domains may be limited until methods simultaneously provide robustness and informativeness; vendors using efficient verifiers and robust calibration may gain competitive advantage.

Paper's market/economic discussion drawing on empirical trade-offs (informativeness vs. factuality) and cost comparisons; this is an applied implication rather than a direct experimental result.

low speculative Is Conformal Factuality for RAG-based LLMs Robust? Novel Met... market adoption likelihood, product reliability vs. cost (qualitative)

Authors propose the 'AI orchestra' concept: future development will involve coordinated ensembles of specialized AI agents (code generation, test generation, dependency analysis, security scanning) orchestrated by humans and higher-level controllers.

Theoretical/conceptual argument by the authors grounded in qualitative findings from Netlight (practitioner reports of multiple tools and coordination frictions); this is a forward-looking synthesis rather than an empirically established fact.

low speculative Rethinking How IT Professionals Build IT Products with Artif... anticipated architecture of AI tool ecosystems (multiple specialized agents coor...

Modular and cell‑free platforms could enable decentralized, localized manufacturing of specialty compounds, potentially altering trade flows away from centralized petrochemical hubs.

Conceptual synthesis plus small-scale demonstrations of modular/cell-free units in the reviewed literature; limited pilot projects and discussion of potential scalability and portability.

low speculative Harnessing Microbial Factories: Biotechnology at the Edge of... feasibility metrics for localized production (unit throughput, cost per unit at ...

Canvas Design Principles aimed at reducing algorithmic myopia matter for welfare and regulatory concerns: better adaptive behavior reduces mispricing/misattribution risks but raises questions about transparency, accountability, and systemic amplification of shocks.

Policy and governance implication inferred from the claimed reductions in algorithmic myopia and increased adaptivity; study does not report direct welfare/regulatory impact measurements.

speculative mixed The Algorithmic Canvas: On the Autopoietic Redefinition of S... algorithmic governance externalities (mispricing risk, transparency, accountabil...

Faster, more accurate identification of demand shifts can compress the window for first‑mover advantages, intensify competitive dynamics, and raise the premium on organizational agility and human–AI integration capabilities.

Theoretical implication derived from observed improvements in signal detection (~5.8×) and resilience; not directly measured as market‑level competitive outcomes in the study.

speculative mixed The Algorithmic Canvas: On the Autopoietic Redefinition of S... market dynamics (first‑mover window, competitive intensity) — theoretical implic...

Product teams evaluating LLM-powered features rely on a spectrum of practices—from informal “vibe checks” to organizational meta-work—to cope with LLMs’ unpredictability.

Qualitative interview study with 19 practitioners; thematic coding of transcripts produced descriptions of a range of evaluation practices used by teams.

medium-high mixed Results-Actionability Gap: Understanding How Practitioners E... types of evaluation practices used by product teams

Platform design choices (property rights, portability, reputation, tokenization, escrowed memories) will shape incentives for contributions to shared knowledge and agent improvement.

Policy and mechanism-design implications drawn from observed phenomena (shared memories, contributions, and trust) in the qualitative dataset; recommendation rather than empirically tested claim.

speculative mixed When Openclaw Agents Learn from Each Other: Insights from Em... rate/distribution of contributions to shared knowledge and agent improvement as ...

Shared memory architectures create public-good–like externalities (knowledge diffusion and spillovers) that may be underprovided absent coordination or platform governance.

Qualitative observations of shared memories and diffusion patterns plus theoretical economic interpretation; no empirical quantification of spillover magnitudes provided.

speculative mixed When Openclaw Agents Learn from Each Other: Insights from Em... degree of knowledge diffusion / presence of public-good spillovers from shared m...

Easier specification of constraints can reduce some harms (clear safety violations) but centralizes normative power (who defines constraints) and creates international/cultural externalities and risks of regulatory capture.

Normative and economic argument in the paper combining technical tractability of constraints with governance concerns; this is an inference about likely distributional effects rather than empirically established fact.

speculative mixed Via Negativa for AI Alignment: Why Negative Constraints Are ... measured reduction in certain harms (e.g., illegal instructions) and concentrati...

Adoption of C.A.P. may reduce demand for routine oversight/clarification roles and increase demand for higher-skill roles such as prompt/system designers and dialogue curators.

Labor demand and task composition analysis presented as a conceptual projection in the paper; no labor-market empirical study reported.

speculative mixed A Context Alignment Pre-processor for Enhancing the Coherenc... employment/demand changes by role/skill level, hours of human oversight required

Because failure modes such as definition misalignment and hypothesis creep were observed, the authors argue for regulation/standards around disclosure of AI-assisted scientific claims and archival of verification artifacts.

Policy recommendation in the paper derived from the documented process-level failure modes in the single project; recommendation is prescriptive, not empirically validated beyond the project.

speculative mixed Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau E... policy recommendation presence (advocacy for disclosure/archival standards) base...

Lower data and compute requirements could decentralize innovation (reducing incumbent advantages tied to massive compute/data), but the complexity of embodied systems and real-world testing could create new specialized incumbents (robotics platforms, simulation providers).

Market-structure hypothesis based on trade-offs between resource needs and platform value; speculative and not empirically tested in the paper.

speculative mixed Why AI systems don't learn and what to do about it: Lessons ... market concentration metrics; emergence of specialized incumbents; level of dece...

Improved recovery capability from LEAFE reduces brittle failure modes but may also enable more autonomous behavior in novel settings, increasing both benefits and potential misuse risks.

Safety/risk discussion in the paper linking enhanced recovery/autonomy to both reduced brittleness (benefit) and heightened autonomy-related risks; supported by observed improved recovery behavior in experiments and conceptual risk analysis.

speculative mixed Internalizing Agency from Reflective Experience System brittleness and autonomy-related risk potential (qualitative; no direct e...

Widespread adoption of LEAFE-like learning could accelerate diffusion of agentic automation across sectors, affecting wages, task allocation, and demand for complementary capital (tooling, monitoring, retraining systems).

High-level economic reasoning in Discussion/Implications section tying observed performance improvements and sample-efficiency gains to possible macroeconomic effects; no empirical macroeconomic data provided.

speculative mixed Internalizing Agency from Reflective Experience Macro-level economic outcomes (productivity, wages, task allocation) — not direc...

If smaller tuned models can capture most performance of much larger systems, market power may shift toward specialized, cheaper models plus toolchains, promoting niche competition and verticalized offerings.

Inference from empirical finding that a 7B tuned model achieves 91.2% of a larger model's quality; market-structure implication (theoretical/economic argument, not empirically tested).

speculative mixed Learning to Present: Inverse Specification Rewards for Agent... Market-structure shifts and competitive dynamics (speculative, not directly meas...

Proprietary, high-quality surrogate models could create competitive advantage and barriers to entry, whereas open-source surrogates would democratize access.

This is an implication/policy argument in the paper's discussion about IP and market effects; it is a theoretical/qualitative claim rather than an empirical result from the experiments.

speculative mixed Deep Learning-Driven Black-Box Doherty Power Amplifier with ... market competitive advantage / barriers to entry arising from control of surroga...
