Evidence (7395 claims)
Adoption: 7395 claims
Productivity: 6507 claims
Governance: 5877 claims
Human-AI Collaboration: 5157 claims
Innovation: 3492 claims
Org Design: 3470 claims
Labor Markets: 3224 claims
Skills & Training: 2608 claims
Inequality: 1835 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 609 | 159 | 77 | 736 | 1615 |
| Governance & Regulation | 664 | 329 | 160 | 99 | 1273 |
| Organizational Efficiency | 624 | 143 | 105 | 70 | 949 |
| Technology Adoption Rate | 502 | 176 | 98 | 78 | 861 |
| Research Productivity | 348 | 109 | 48 | 322 | 836 |
| Output Quality | 391 | 120 | 44 | 40 | 595 |
| Firm Productivity | 385 | 46 | 85 | 17 | 539 |
| Decision Quality | 275 | 143 | 62 | 34 | 521 |
| AI Safety & Ethics | 183 | 241 | 59 | 30 | 517 |
| Market Structure | 152 | 154 | 109 | 20 | 440 |
| Task Allocation | 158 | 50 | 56 | 26 | 295 |
| Innovation Output | 178 | 23 | 38 | 17 | 257 |
| Skill Acquisition | 137 | 52 | 50 | 13 | 252 |
| Fiscal & Macroeconomic | 120 | 64 | 38 | 23 | 252 |
| Employment Level | 93 | 46 | 96 | 12 | 249 |
| Firm Revenue | 130 | 43 | 26 | 3 | 202 |
| Consumer Welfare | 99 | 51 | 40 | 11 | 201 |
| Inequality Measures | 36 | 105 | 40 | 6 | 187 |
| Task Completion Time | 134 | 18 | 6 | 5 | 163 |
| Worker Satisfaction | 79 | 54 | 16 | 11 | 160 |
| Error Rate | 64 | 78 | 8 | 1 | 151 |
| Regulatory Compliance | 69 | 64 | 14 | 3 | 150 |
| Training Effectiveness | 81 | 15 | 13 | 18 | 129 |
| Wages & Compensation | 70 | 25 | 22 | 6 | 123 |
| Team Performance | 74 | 16 | 21 | 9 | 121 |
| Automation Exposure | 41 | 48 | 19 | 9 | 120 |
| Job Displacement | 11 | 71 | 16 | 1 | 99 |
| Developer Productivity | 71 | 14 | 9 | 3 | 98 |
| Hiring & Recruitment | 49 | 7 | 8 | 3 | 67 |
| Social Protection | 26 | 14 | 8 | 2 | 50 |
| Creative Output | 26 | 14 | 6 | 2 | 49 |
| Skill Obsolescence | 5 | 37 | 5 | 1 | 48 |
| Labor Share of Income | 12 | 13 | 12 | — | 37 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
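
As a reading aid, the matrix rows can be converted from raw counts into direction shares. The sketch below is illustrative only: the function name `direction_shares` is hypothetical, and it divides by the sum of the four listed direction columns rather than by the Total column, because in some rows (for example, Other) the listed Total is larger than that sum.

```python
# Rows copied from the matrix above: (positive, negative, mixed, null).
ROWS = {
    "Job Displacement": (11, 71, 16, 1),
    "Task Completion Time": (134, 18, 6, 5),
    "Error Rate": (64, 78, 8, 1),
}

LABELS = ("positive", "negative", "mixed", "null")


def direction_shares(counts):
    """Return each direction's share of the four listed counts.

    Assumption: shares are taken over the sum of the four direction
    columns, not over the table's Total column.
    """
    total = sum(counts)
    return {label: n / total for label, n in zip(LABELS, counts)}


if __name__ == "__main__":
    for outcome, counts in ROWS.items():
        shares = direction_shares(counts)
        summary = ", ".join(f"{label} {share:.0%}" for label, share in shares.items())
        print(f"{outcome}: {summary}")
```

Shares make rows of very different sizes comparable, but they say nothing about the strength of the underlying evidence, which the evidence notes in the claim list describe separately.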
Adoption
Better contestability may reduce litigation and regulatory frictions if decisions are transparently defensible.
Speculative legal-economic claim; no case studies or empirical legal analysis provided.
New service layers may emerge (argumentation-as-a-service, audit firms, explanation certification, human-in-the-loop orchestration platforms).
Speculative market/industry evolution claim based on analogous tech-service creations; no empirical evidence.
Collaborative VR features can change team workflows (remote, synchronous inspection sessions), potentially lowering coordination costs across geographically distributed teams.
Paper lists collaborative multi-user sessions as a planned capability and posits organizational effects; no user studies or measurements of coordination cost savings presented.
Public funding for shared VR-capable data-exploration infrastructure could yield high leverage by improving returns on large observational investments.
Policy recommendation deriving from the platform and ROI arguments in the paper; no cost-benefit analysis or quantified ROI provided.
Using iDaVIE increases the usable fraction of large observational datasets by improving QC and annotation throughput, thereby raising returns to telescope investments and downstream AI efforts.
This is an inferred implication in the paper (returns-to-scale/platform effects) based on improved QC/annotation throughput; no empirical measurement of usable-fraction increases provided.
Higher-quality labels produced via immersive inspection can reduce label noise and lower required training-data sizes for a target ML performance level.
Paper presents this as an implication/expected outcome based on improved annotation quality from immersive inspection; no empirical ML training experiments or quantitative reductions reported.
iDaVIE demonstrably reduces cognitive load for multidimensional-data tasks compared with 2D-slice inspection.
Paper asserts reduced cognitive load and faster, more intuitive exploration as an aim and reported outcome; no formal user-study metrics, sample size, or statistical analysis provided.
The inverse-specification reward offers a domain-agnostic, holistic metric for fidelity to user intent and is recommended for measurement of model value/service quality.
Method introduces inverse-specification reward and asserts domain-agnostic applicability; recommendation based on its conceptual ability to recover briefs as fidelity measure (not necessarily validated across many domains).
High-quality automated slide generation has potential to reduce time spent on business presentation creation and produce productivity gains with partial substitution of routine creative/knowledge-worker tasks.
Empirical demonstration of near-SOTA automated slide generation capability on 48 briefs; domain-level economic implication extrapolated from performance improvements.
Economic agents and risk models that integrate LLM outputs should weight inferences more heavily in structured domains (capacity estimates, trade flows, sanctions impact) and downweight or cross-validate politically ambiguous predictions.
Implication drawn from domain heterogeneity in model performance observed in the study (better structured-domain performance, weaker political forecasting).
Deploying BATQuant with reliable 4-bit weight/activation quantization for MXFP-capable accelerators reduces memory footprint and memory-bandwidth pressure, enabling higher throughput and lower per-token inference costs.
Argumentative / economic analysis in the paper linking reduced precision and parameter storage to lower memory/bandwidth requirements and inferred throughput/cost improvements; not presented as a direct empirical measurement of cost per token in production environments in the summary.
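
The memory-footprint part of this argument can be illustrated with back-of-the-envelope arithmetic. The sketch below is not taken from the paper: the 8-billion-parameter model size is hypothetical, the 16-bit baseline is an assumption, and the MXFP4 layout (4-bit elements plus one 8-bit shared scale per 32-element block) is assumed here rather than confirmed by the summary. It covers weight storage only and does not model activations, throughput, or per-token cost.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Hypothetical model size, chosen only for illustration.
N_PARAMS = 8e9

# Assumed 16-bit baseline (for example BF16 or FP16 weights).
BASELINE_BITS = 16

# Assumption: 4-bit elements plus one 8-bit scale shared by a block of 32,
# giving 4 + 8/32 = 4.25 effective bits per weight.
MXFP4_BITS = 4 + 8 / 32

if __name__ == "__main__":
    baseline = weight_memory_gb(N_PARAMS, BASELINE_BITS)
    quantized = weight_memory_gb(N_PARAMS, MXFP4_BITS)
    print(f"16-bit weights: {baseline:.2f} GB")
    print(f"4-bit block-scaled weights: {quantized:.2f} GB")
    print(f"reduction factor: {baseline / quantized:.2f}x")
```

Under these assumptions the weight footprint shrinks by a little under 4x. Whether that translates into proportionally lower per-token cost depends on hardware and workload details that the summary does not report.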
Investment in multimodal continual learning, scalable and reliable knowledge-editing methods, and retrieval architectures that guarantee cross-modal consistency is economically justified.
Research/prioritization recommendations based on empirical benchmark findings showing current gaps; argumentation for R&D focus areas.
The findings argue for policies requiring disclosure of training-data timeframes and robust monitoring for time-sensitive factual accuracy in deployed systems.
Policy recommendations in the paper drawing on benchmark results and identified failure modes; prescriptive argumentation rather than empirical policy evaluation.
Models and platforms that offer transparent update mechanisms (frequent data updates, reliable RAG pipelines, clear training snapshot metadata) will have competitive advantages in the market.
Economic and market analysis in implications section recommending transparency and update mechanisms as differentiators; speculative/business-analytical evidence rather than experimental.
Design choices and open-weight availability are intended to align with EU AI Act expectations for regional sovereignty and compliance.
Stated intent in the paper: the authors explicitly frame design and release strategy as aiming to align with EU AI Act regulatory expectations. The summary notes this intention but provides no technical compliance proof or audits.
EngGPT2 requires substantially less inference compute than comparable dense models—reported as roughly 20%–50% of the inference compute used by dense 8B–16B models.
Paper reports relative inference compute reductions (1/5–1/2). The summary states these percentages but no supporting FLOP counts, latency measurements, hardware, batching conditions, or benchmark-query workloads are provided.
Embedding culturally aligned moderation and multi-layer safety orchestration can reduce regulatory frictions and increase adoption in conservative or tightly regulated markets.
Paper claims regulatory and safety economics implications from their safety/moderation architecture; this is an asserted implication rather than an empirically validated outcome in the summary.
The methods used (data quality focus, continual pre-training, model merging, modular product stacks) are potentially transferable to other underrepresented/low-resource languages, lowering barriers to regional AI competitiveness.
Paper posits this policy/transferability implication as an argument in the 'Implications for AI Economics' section; no cross-language experimental evidence provided in the summary.
Fanar 2.0 demonstrates that targeted data curation, continual pre-training, and model-merging can be a viable alternative to the raw-scale pre-training arms race for language-specific competitiveness.
Paper argues this implication based on achieving benchmark gains on Arabic and English using curated data (120B tokens), continual pre-training, model-merging, and a 256 H100 GPU training budget rather than massively larger-scale pre-training.
Oryx provides Arabic-aware image/video understanding and culturally grounded image generation.
Paper identifies Oryx as the vision component with Arabic-aware understanding and culturally grounded generation; no benchmark metrics are provided in the summary.
Exchanging generative modules (rather than raw data) and enabling modular unlearning improves auditability and aligns better with privacy/regulatory compliance than raw-data sharing.
Argument in the paper that module exchange and deterministic module deletion are more compatible with data sovereignty and regulatory requirements; no formal legal validation or compliance testing reported in the summary.
FederatedFactory enables new economic opportunities (module marketplaces, synthetic-data services) and affects incentives by shifting value toward modular generative assets and orchestration rather than raw centralized datasets.
Conceptual and economic discussion in the paper about potential implications; not based on empirical market data—presented as analysis and hypotheses about economic impact.
The single-round exchange decreases communication rounds and associated coordination/network costs compared to typical iterative federated learning.
Protocol design: single exchange of generative modules vs. typical multi-round weight-aggregation loops in standard FL; paper argues reduced networking/coordination cost. (No quantitative network-cost measurements provided in the summary.)
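
Since the summary reports no network-cost measurements, the comparison can only be sketched with a toy cost model. Everything below is an assumption made for illustration: the function names, the client count, the round count, and the payload sizes are hypothetical, and the model ignores compression, partial participation, and latency.

```python
def iterative_fl_bytes(clients, rounds, model_bytes):
    """Traffic for a multi-round weight-aggregation loop.

    Assumption: in every round each client downloads the global model
    and uploads an update of the same size.
    """
    return clients * rounds * 2 * model_bytes


def single_exchange_bytes(clients, module_bytes, recipients=1):
    """Traffic for a one-shot exchange of generative modules.

    Assumption: each client sends its module once to `recipients`
    parties; the summary does not specify the actual topology.
    """
    return clients * recipients * module_bytes


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the paper.
    clients, rounds = 10, 100
    model_bytes = 100_000_000    # 100 MB model update
    module_bytes = 500_000_000   # 500 MB generative module

    multi = iterative_fl_bytes(clients, rounds, model_bytes)
    single = single_exchange_bytes(clients, module_bytes)
    print(f"iterative FL:    {multi / 1e9:.1f} GB over {rounds} rounds")
    print(f"single exchange: {single / 1e9:.1f} GB in 1 round")
```

The point of the sketch is structural: the iterative cost scales with the number of rounds, while the one-shot cost does not, so the comparison can favor either protocol depending on module size and fan-out.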
Investment in data quality and feature engineering yields tangible predictive gains for workforce performance models.
Paper emphasizes use of engineered features capturing engagement dynamics and learning trends and reports better model performance relative to baseline; however, no isolated ablation study quantifying the sole contribution of data-quality investments is reported in the summary.
Tools that improve detection or quantification may reduce downstream costs from missed diagnoses or unnecessary follow-ups, improving cost-effectiveness in some scenarios.
Economic modeling and limited observational analyses that extrapolate diagnostic improvements to downstream resource use; direct empirical cost-effectiveness studies are scarce.
The metacognitive reliability metric can reduce adoption risk for purchasers by providing transparent error-risk assessments and enabling performance-based autonomy thresholds.
Conceptual claim supported by the existence of an empirical confidence metric from the recursive meta-model and discussion of procurement/decision-making implications; not empirically tested with purchasers or procurement outcomes.
HACL/CS supports human trust and situational awareness.
Human factors measured with trust and situational awareness questionnaires in the simulation; summary reports supportive effects on trust and situational awareness but lacks sample-size/statistical detail.
Intelligent turn-level assignment can reduce costly human attention to only high-value moments, improving overall system productivity.
Conceptual implication from the assignment-layer design and empirical trade-offs reported; presented as an advantage in the paper rather than a directly measured economic productivity study.
HADT demonstrates a concrete way to substitute expensive human diagnostic labor with AI assistance while preserving high accuracy, implying reductions in marginal cost per consultation.
Inference drawn in the paper's implications section based on reported reductions in required human effort and maintained diagnostic accuracy (economic claim extrapolating from experimental results; not directly measured as cost in experiments).
Organizational norms and UX influence adoption rates and diffusion of AI: social calibration processes at the team level matter for adoption beyond individual cost–benefit calculations.
Reported by interviewees (N=40) as factors shaping whether and how teams incorporated AI into routines; integrated into theoretical implications for diffusion modeling.
Well-calibrated trust tends to encourage AI being used as a complement to human labor (augmentation), increasing effective productivity; miscalibration (over- or under-trust) can lead to productivity losses.
Inferential claim drawn from interviewees' accounts of when teams appropriately relied on AI (augmentation) versus when inappropriate reliance or avoidance occurred; supported by thematic interpretation rather than quantitative measurement.
Policymakers should support standards for auditability, human‑in‑the‑loop thresholds and training subsidies to reduce coordination failures and make the social benefits of AI adoption more widely shared.
Normative policy recommendation derived from the paper’s analysis of risks, governance needs and distributional concerns; not empirically validated within the paper.
Organisations will invest more in training for AI‑related sensemaking, trust calibration and governance competencies; returns to such training should be evaluated relative to investments in model quality.
Prescriptive inference from the framework and human‑capital theory; supported by referenced literature but not empirically tested in this paper.
Explicit comparative‑advantage allocation will shift the composition of tasks across humans and AI, altering demand for routine versus non‑routine skills and potentially increasing demand for high‑level judgement, oversight and sensemaking skills.
Projected labour‑market implication based on theoretical reasoning and prior literature on task‑based skill demand; not empirically estimated in the paper.
Operationalising the four symbiarchic practices through updated HR systems lets firms capture AI‑enabled productivity gains without eroding trust, ethics or employee well‑being.
Normative claim based on theoretical synthesis and managerial prescription; no empirical testing or field data presented in the paper.
Public data sharing, reproducibility standards, and shared benchmarks could raise the floor of AI utility across the industry.
Policy implication grounded in arguments about data quality, coverage, and generalizability from the narrative review; speculative recommendation rather than evidence-backed empirical claim.
There is potential for consolidation as firms acquire data, talent, or validated AI-driven assets.
Industry-structure implication drawn from economics of complementary assets and observed M&A activity patterns; presented as a likely trend rather than demonstrated empirically in the paper.
AI startups that demonstrate validated, reproducible wet-lab outcomes and access to high-quality data are more likely to command premium valuations.
Argument from observed market behavior and economics of complementary assets presented in the narrative; no systematic valuation analysis included.
Investors should recalibrate expectations: greater value accrues to firms that integrate AI with experimental pipelines and proprietary data assets rather than firms that only possess AI capability.
Economics-focused implications drawn from thematic analysis of heterogeneity in firm outcomes and integration requirements; market-practice inference rather than empirical valuation study.
AI tools complement sensory expertise and design thinking, shifting skill demand toward interdisciplinary competencies (e.g., computational rheology, psychophysics, cultural analytics).
Reasoned inference from technology literature and skill-complementarity theory; literature synthesis but no labor-market empirical analysis provided.
The paper offers emerging economies a differentiated-path reference for coping with technological nationalism.
Claim about the paper's contribution; based on authors' proposed policy framework and recommendations derived from literature review and theoretical analysis; not empirically validated for emerging economies in the excerpt.
The narrowing of the AI model performance gap between China and the United States to single digits highlights a new trend in technology competition.
Empirical/observational claim stated in the paper; no information in the excerpt about the benchmark metric used for model performance, measurement methodology, time frame, or data sources; 'single digits' not numerically specified.
By integrating psychological trust factors with cognitive capability optimisation, this model offers actionable insights for knowledge management practitioners implementing AI‑augmented decision systems while advancing theoretical understanding of human–AI collaboration effectiveness.
Integrative theoretical claim based on combining constructs from psychological trust research and cognitive/capability literature via systematic synthesis; no empirical evaluation reported in the abstract.
The framework provides practical guidance for executives designing human–AI teams, developing trust calibration training, and establishing performance metrics.
Prescriptive recommendations derived from the proposed model and literature synthesis; the abstract does not report empirical testing of the recommended interventions or their effects.
Supportive regulatory frameworks and digital infrastructure development are important for leveraging AI technologies to improve global trade efficiency.
Study recommendation derived from empirical findings and discussion; this is a policy implication rather than a directly tested empirical claim (no policy evaluation data provided in the summary).
The study provides empirical support for digital transformation theories within financial intermediation.
Authors interpret quantitative results as empirical evidence consistent with digital transformation theories; specific theoretical tests, model fit statistics, and sample information are not included in the summary.
AI-enhanced compliance systems increased regulatory transparency.
Study reports improvements in regulatory transparency as part of operational efficiency gains attributed to AI-driven compliance systems in the quantitative analysis; precise transparency metrics and sample details not provided.
The system demonstrates 100% alignment with GAAP/IFRS regulatory compliance.
Reported regulatory compliance assessment or stakeholder validation claiming full alignment with GAAP/IFRS. (Summary lacks details on the compliance assessment method, criteria, or independent verification; sample/coverage not specified.)
AI has increased the accuracy of patient selection to 80–90%.
Stated performance range for AI-enabled patient selection in the review. The excerpt does not specify the datasets, evaluation metrics (e.g., accuracy vs. AUC), clinical contexts, or sample sizes used to obtain these numbers.
AI-driven ESG analytics strengthened the financial relevance of sustainability integration and supported better-informed investment decision-making.
Study conclusion synthesizing empirical findings (portfolio outperformance and regression results). This is a normative/concluding statement rather than a directly measured outcome; the summary does not quantify decision-making improvements or measure investor behavior.