Evidence (5126 claims)
Adoption: 5126 claims
Productivity: 4409 claims
Governance: 4049 claims
Human-AI Collaboration: 2954 claims
Labor Markets: 2432 claims
Org Design: 2273 claims
Innovation: 2215 claims
Skills & Training: 1902 claims
Inequality: 1286 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 369 | 105 | 58 | 432 | 972 |
| Governance & Regulation | 365 | 171 | 113 | 54 | 713 |
| Research Productivity | 229 | 95 | 33 | 294 | 655 |
| Organizational Efficiency | 354 | 82 | 58 | 34 | 531 |
| Technology Adoption Rate | 277 | 115 | 63 | 27 | 486 |
| Firm Productivity | 273 | 33 | 68 | 10 | 389 |
| AI Safety & Ethics | 112 | 177 | 43 | 24 | 358 |
| Output Quality | 228 | 61 | 23 | 25 | 337 |
| Market Structure | 105 | 118 | 81 | 14 | 323 |
| Decision Quality | 154 | 68 | 33 | 17 | 275 |
| Employment Level | 68 | 32 | 74 | 8 | 184 |
| Fiscal & Macroeconomic | 74 | 52 | 32 | 21 | 183 |
| Skill Acquisition | 85 | 31 | 38 | 9 | 163 |
| Firm Revenue | 96 | 30 | 22 | — | 148 |
| Innovation Output | 100 | 11 | 20 | 11 | 143 |
| Consumer Welfare | 66 | 29 | 35 | 7 | 137 |
| Regulatory Compliance | 51 | 61 | 13 | 3 | 128 |
| Inequality Measures | 24 | 66 | 31 | 4 | 125 |
| Task Allocation | 64 | 6 | 28 | 6 | 104 |
| Error Rate | 42 | 47 | 6 | — | 95 |
| Training Effectiveness | 55 | 12 | 10 | 16 | 93 |
| Worker Satisfaction | 42 | 32 | 11 | 6 | 91 |
| Task Completion Time | 71 | 5 | 3 | 1 | 80 |
| Wages & Compensation | 38 | 13 | 19 | 4 | 74 |
| Team Performance | 41 | 8 | 15 | 7 | 72 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 17 | 15 | 9 | 5 | 46 |
| Job Displacement | 5 | 28 | 12 | — | 45 |
| Social Protection | 18 | 8 | 6 | 1 | 33 |
| Developer Productivity | 25 | 1 | 2 | 1 | 29 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Skill Obsolescence | 3 | 18 | 2 | — | 23 |
| Labor Share of Income | 7 | 4 | 9 | — | 20 |
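The counts in the matrix are easier to compare across outcomes once converted to shares. The following is a minimal sketch, not part of the source dashboard: it copies four rows from the table and divides each cell by the sum of that row's four directional cells. The row sum is used rather than the Total column because the two differ in some rows (for Firm Productivity the directional cells sum to 384 against a stated total of 389). Treating a blank cell as 0 is an assumption, and the names `ROWS`, `DIRECTIONS`, and `direction_shares` are hypothetical.

```python
# Rows copied from the matrix above as (positive, negative, mixed, null).
# A blank cell in the table is entered as 0 here, which is an assumption.
ROWS = {
    "Task Completion Time": (71, 5, 3, 1),
    "Developer Productivity": (25, 1, 2, 1),
    "Job Displacement": (5, 28, 12, 0),
    "Skill Obsolescence": (3, 18, 2, 0),
}
DIRECTIONS = ("positive", "negative", "mixed", "null")


def direction_shares(counts):
    """Return each direction's share of the row's directional claims."""
    total = sum(counts)
    return {name: count / total for name, count in zip(DIRECTIONS, counts)}


shares = {outcome: direction_shares(counts) for outcome, counts in ROWS.items()}

for outcome, row in shares.items():
    summary = ", ".join(f"{name} {share:.0%}" for name, share in row.items())
    print(f"{outcome}: {summary}")
```

Shares describe how the extracted claims are distributed by direction; they say nothing about effect sizes or study quality.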
Adoption
The methodological landscape of the evidence base is heterogeneous, consisting of cross-sectional surveys, case studies, quasi-experimental designs, and a limited number of longitudinal analyses.
Study design information was extracted from the 145 included studies, revealing a mix of designs with relatively few longitudinal or experimental studies.
Human factors (training, trust calibration, workflows) determine whether clinicians accept, override, or ignore GenAI suggestions.
Qualitative and quantitative human-AI interaction studies and pilot deployments discussed in the paper, which does not report specific sample sizes or effect sizes.
Safety and net benefit of GenAI CDS hinge on deployment details: user interface, real-time feedback, uncertainty quantification, calibration, and how recommendations are presented (strong vs. suggestive).
Human factors and implementation studies referenced; early A/B tests and human-AI interaction research suggest interface and presentation affect acceptance and error rates; no large-scale standardized implementation trial data cited.
Reimbursement models (fee-for-service vs. capitation) will influence whether cost savings from GenAI are realized or offset by increased service volume.
Economic incentive framework and prior health-economics literature cited; the paper does not provide direct empirical tests but references plausible incentive channels.
RL and adaptive methods are well suited to real-time adaptation but can be myopic, require large amounts of interaction data, and struggle to incorporate long-term preference structure and ethical constraints.
Surveyed properties of reinforcement learning and adaptive methods in HRI/RS literature; no new empirical evaluation in this paper.
Key tradeoffs in contemporary financing models include speed/flexibility versus regulatory coverage and long‑term cost, and data reliance versus privacy/fairness.
Multi‑criteria comparative evaluation and conceptual analysis across financing models; synthesis draws on regulatory context and observed product features rather than primary quantitative tradeoff estimation.
Performance of structure prediction models scales with data, model size, and compute; there are tradeoffs between accuracy and inference speed/simplicity.
Paper explicitly states scaling behavior and tradeoffs in 'Compute and training' and 'Representative models' sections; no precise scaling curves or thresholds are provided in the text.
Important tradeoffs exist (privacy vs. utility; centralized vs. federated data architectures; automated moderation vs. freedom of expression; cost/complexity of secure hardware) that must be balanced in VR security design.
Comparative evaluation across the reviewed corpus (31 studies) identifying recurring ethical and technical tradeoffs; authors discuss these qualitatively.
Across the EU, Algeria, and Pakistan there is convergent recognition of dual‑use risks, increasing use of export controls, and interest in developing domestic AI capacity.
Cross‑jurisdictional synthesis of national/supranational legal texts, export‑control policies, and policy documents showing discussion of dual‑use issues and capacity building.
The community knowledge functions both as practical how-to guidance and as collective experimentation with platform rules and revenue mechanisms.
Observed dual nature in the 377-video corpus: instructional workflows alongside demonstrations/testing of platform-tailored monetization tactics and workarounds.
Typical practices emphasized by creators include rapid mass production of content, productizing prompt engineering, repurposing existing material via synthesis/localization, and packaging AI outputs as sellable creative services or assets.
Recurring practices surfaced through qualitative coding of workflows, tools, and pipelines described in the 377 videos.
Across the 377 videos, creators converge on a set of repeatable use cases and platform‑tailored monetization tactics.
Thematic coding of 377 videos produced a catalog of recurring use cases and tactics; the paper reports convergence across that sample.
YouTube creators have collectively constructed and circulated a practical knowledge repository about how to monetize GenAI-driven creative work.
Systematic qualitative content analysis (thematic coding) of 377 publicly available YouTube videos in which creators promote GenAI workflows and monetization strategies.
Citation counts across repeated samples follow a power-law (heavy-tailed) distribution: a few domains are cited often while many domains are cited rarely.
Empirical distributional analysis of citation counts from repeated samples collected across the three platforms and three topics (multi-day and high-frequency regimes); observed heavy-tailed / power-law fit to citation-count distribution.
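To make the heavy-tail claim concrete, here is a minimal sketch of how a power-law exponent can be estimated from a sample. It is not the paper's analysis. It uses synthetic continuous data (the paper's citation counts are not available here), a known lower cutoff `x_min`, and the standard maximum-likelihood estimator for a continuous power law, alpha = 1 + n / sum(ln(x_i / x_min)). The function names, the true exponent of 2.5, and the sample size are all assumptions chosen for illustration.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n values from a continuous power law p(x) ~ x**(-alpha), x >= x_min."""
    # Inverse-transform sampling; 1 - u lies in (0, 1], so the power is defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(values, x_min):
    """Maximum-likelihood exponent for a continuous power law above x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + len(tail) / log_sum


def top_share(values, fraction):
    """Share of the total held by the largest `fraction` of observations."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = sample_power_law(20_000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat = fit_alpha(samples, x_min=1.0)
share = top_share(samples, 0.01)

print(f"estimated exponent: {alpha_hat:.2f}")
print(f"share of total held by the top 1%: {share:.1%}")
```

Real citation counts are discrete and the cutoff is usually unknown, so an actual analysis would need a discrete estimator and a goodness-of-fit check against alternatives such as the log-normal.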
The topology of service-dependency graphs (modelled as DAGs of compute stages) is a first-order determinant of whether decentralised, price-based resource allocation will be stable and scalable.
Systematic ablation study using simulation: 1,620 runs total across six experiment types, sweeping graph topology (hierarchical vs cross-cutting), load, hybrid integrator presence, and governance constraints; metrics included price convergence/volatility and allocation throughput/quality. Effect sizes reported in the paper show topology had the largest impact on price stability and scalability.
Choice of scaffold materially affects outcomes: an open-source scaffold outperformed vendor-provided scaffolds by up to approximately 5 percentage points.
Comparative experiments across three scaffolding approaches (vendor scaffolds and at least one open-source scaffold) showing up to ~5 percentage point differences in measured outcomes.
Absence of irreducibility, positive recurrence, or aperiodicity in the state dynamics can produce non-ergodic reward behavior.
Theoretical argument and examples in the paper illustrating how breakdowns of these chain conditions lead to multiple invariant measures or absorbing regimes; analysis-based evidence.
Standard Markov chain ergodicity conditions (irreducibility, positive recurrence, aperiodicity) imply ergodic reward processes when rewards depend only on the chain state.
Formal mapping in the paper between Markov-chain ergodicity properties and reward-process ergodicity; theoretical derivation (no empirical sample).
Non-ergodic processes admit path-dependent long-run behavior (e.g., absorbing sets, multiple invariant measures, path-dependent reinforcement), so different runs with the same policy can have different long-run averages.
Analytic discussion of Markov-chain examples and theory plus the paper's illustrative constructed example showing path-dependent locking into regimes; theoretical and example-driven evidence.
Ergodic reward processes are those where time averages along almost every long trajectory converge to the same value as the ensemble average.
Formal definition and discussion in the paper mapping ergodicity concepts from stochastic processes to reward processes; theoretical exposition.
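The ergodicity claims above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's constructed example: both transition matrices and both reward vectors are hypothetical. The first chain is irreducible and aperiodic with stationary distribution (2/3, 1/3), so every long run should give a time-average reward near 1/3. The second chain starts in a transient state and falls into one of two absorbing states, so the long-run average depends on which one a given run happens to hit.

```python
import random


def time_average_reward(transitions, rewards, start, steps, rng):
    """Average reward along one trajectory of a finite Markov chain."""
    states = range(len(rewards))
    state, total = start, 0.0
    for _ in range(steps):
        total += rewards[state]
        # Sample the next state from the current state's transition row.
        state = rng.choices(states, weights=transitions[state])[0]
    return total / steps


def run_many(transitions, rewards, start, steps, runs, seed):
    """Repeat the simulation with one seeded generator; return all time averages."""
    rng = random.Random(seed)
    return [
        time_average_reward(transitions, rewards, start, steps, rng)
        for _ in range(runs)
    ]


# Hypothetical ergodic chain: two states that communicate, reward 1 in state 1.
ERGODIC = [[0.9, 0.1], [0.2, 0.8]]
ERGODIC_REWARDS = [0.0, 1.0]

# Hypothetical non-ergodic chain: state 0 is transient, states 1 and 2 absorb.
ABSORBING = [[0.0, 0.5, 0.5], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ABSORBING_REWARDS = [0.0, 0.0, 1.0]

ergodic_runs = run_many(ERGODIC, ERGODIC_REWARDS, 0, 20_000, 5, seed=1)
absorbing_runs = run_many(ABSORBING, ABSORBING_REWARDS, 0, 2_000, 20, seed=1)

print("ergodic time averages:  ", [round(x, 3) for x in ergodic_runs])
print("absorbing time averages:", [round(x, 3) for x in absorbing_runs])
```

In the second chain the ensemble average over many runs is about 0.5, yet no single trajectory ever averages 0.5, which is the gap between time averages and ensemble averages that the definition describes.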
The model explicitly separates competition into two stages: discovery (first-passage to resource patches) and monopolization (local takeover and stabilization).
Model specification in the paper: stochastic, spatially-structured population model with distinct discovery and monopolization dynamics; this is a modeling assumption/structure rather than empirical measurement.
Two qualitatively distinct mechanisms underlie observed dominance: (1) extreme-event-mediated lucky discovery (transient), and (2) mechanistic asymmetries (non-reciprocal biases) that convert lucky discovery into permanent dominance.
Conceptual separation in the model structure (discovery vs monopolization phases), analytic results on first-passage extreme events, and absorbing-state analysis showing necessity of asymmetry for permanence; supported by simulations demonstrating the two-stage behavior. The claim is theoretical.
Explanations change workflows, shift responsibilities between humans and machines, and can reshape power dynamics—creating both opportunities (better oversight) and risks (over-reliance, gaming).
Qualitative and conceptual studies synthesized in the review, including socio-technical analyses and case studies reporting observed or theorized workflow and responsibility shifts; no meta-analytic causal estimate.
Explanations increase user trust principally when they are understandable, actionable, and aligned with users’ domain knowledge; opaque or overly technical explanations can fail to build trust or even decrease it.
Thematic synthesis of empirical and conceptual studies in the reviewed literature reporting conditional effects of explanation form and comprehensibility on trust; review notes heterogeneity in study designs and contexts.
Explainability improves perceived legitimacy, user trust, and organizational accountability only when technical transparency is paired with human-centered explanation design and governance mechanisms.
Synthesis of studies from the reviewed literature showing conditional effects of algorithmic interpretability combined with explanation design and governance; derived via thematic coding across technical and social-science sources (no new primary experimental data reported).
Explainability is a necessary but not sufficient condition for trustworthy AI in high-stakes domains.
Systematic literature review (thematic coding and synthesis) of interdisciplinary scholarship (peer-reviewed research, technical reports, policy documents); the paper synthesizes conceptual and empirical studies rather than presenting new primary data. Emphasis on high-stakes domains (healthcare, finance, public sector).
Some patients value human contact for sensitive cases; automated interactions can feel impersonal.
Semi-structured interviews with patients/staff and open-ended survey responses documenting preferences for human interaction in sensitive/complex complaints.
The benefits of FDI (jobs, productivity, skills) are uneven and often conditional on institutional quality, labor regulation, and sectoral composition of investments.
Mechanism mapping and thematic synthesis linking heterogeneous empirical findings to contextual moderators (governance, regulation, sector); review emphasizes consistent role of these moderators across studies.
FDI’s effects on employment, wages, and income distribution in Sub‑Saharan Africa are mixed and highly context‑dependent.
Conceptual literature review synthesizing theoretical frameworks and empirical findings across micro, firm, sectoral, and macro studies; no new primary data. Review notes heterogeneous identification strategies and results across studies and contexts.
India’s reported post-harvest loss is relatively low (3.2%) despite poor food-security outcomes (Global Hunger Index rank 111/125).
Reported statistics cited in the paper (FAO/Kaggle for post-harvest loss; Global Hunger Index ranking referenced).
Data‑driven policies can either amplify or mitigate inequalities depending on data representativeness, model design, and deployment governance.
Multiple empirical examples and theoretical analyses in the review highlighting cases of both harm (bias amplification) and mitigation, identified across the 103 items.
Citizen acceptance, transparency, and perceived fairness strongly shape adoption trajectories and the political feasibility of AI tools in government.
Repeated empirical findings in the reviewed literature linking public trust, transparency measures, and fairness perceptions to successful or failed deployments (drawn from multiple case studies in the 103 items).
Adoption of AI and data-driven governance is highly uneven across jurisdictions and sectors, driven by institutional capacity, governance frameworks, and public trust.
Cross‑regional and cross‑sector comparisons in the review corpus (103 items) showing varying maturity levels and repeated identification of institutional capacity, governance arrangements, and trust factors as determinants.
Governance approaches are emerging at global, regional and national levels; they vary widely across sectors and jurisdictions, creating opportunities for regulatory experimentation but also risks of fragmentation and regulatory arbitrage.
Cross-jurisdictional comparison of existing global, regional, and national governance instruments and sectoral guidance; gap analysis highlighting heterogeneity.
Weak formal institutions often coexist with strong informal institutions in African contexts, shaping governance, trust, and enforcement mechanisms in supply chains.
Cross-disciplinary literature review presented in the paper; conceptual argumentation rather than primary empirical analysis.
Technology effectiveness depends on institutional support (extension, property rights), finance, and local knowledge; technologies alone are not a silver bullet.
Conceptual frameworks and comparative analysis in the review; supporting case studies and program evaluations linking adoption and impact to institutional factors (extension reach, tenure security, access to credit).
Existing evidence is time-sensitive and heterogeneous: rapidly evolving models, heterogeneous study designs, and many short-term lab/microtask studies limit direct comparability and long-run inference.
Meta-observation from the review: documented methodological limitations across the literature (variation in models, tasks, metrics; prevalence of short-term studies).
Real‑time and LLM‑based methods improve responsiveness but raise governance, transparency, and reproducibility challenges that BLS must manage (audit trails, uncertainty communication).
Operational tradeoff discussion in the paper identifying governance risks; no case studies or incident analyses provided.
Distinguishing automation versus augmentation using causal methods changes policy responses (e.g., income support versus reskilling).
Policy implication drawn from conceptual separation of substitution and complementarity effects; logical inference rather than empirical demonstration in the paper.
The authors were able to fully reproduce the reported results for 49% of CHI papers that had publicly shared study data and analysis code.
Empirical reproduction attempts performed by the authors on all CHI papers that had publicly shared study data and analysis code; the exact number of papers and the time window are not specified in the summary.
Evaluation of the equivalency system should use metrics such as concordance between claimed competencies and verified inputs, predictive validity versus labor-market integration outcomes, and false positive/negative rates in automated decisions.
Methodological recommendation in the paper outlining specific evaluation metrics; this is a prescriptive claim (no empirical implementation reported).
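Two of the recommended metrics, false positive/negative rates and a simple agreement rate, can be computed from paired records of automated decisions and verified outcomes. The sketch below assumes binary decisions and uses a small made-up dataset; the paper reports no implementation, so the function name, the data, and the choice of plain agreement as a stand-in for concordance are all hypothetical.

```python
def decision_metrics(decisions, verified):
    """Compare automated accept/reject decisions against verified outcomes.

    Both arguments are equal-length lists of booleans (True = competency
    accepted or confirmed).
    """
    pairs = list(zip(decisions, verified))
    tp = sum(1 for d, v in pairs if d and v)
    fp = sum(1 for d, v in pairs if d and not v)
    fn = sum(1 for d, v in pairs if not d and v)
    tn = sum(1 for d, v in pairs if not d and not v)
    return {
        # Share of unconfirmed competencies that were wrongly accepted.
        "false_positive_rate": fp / (fp + tn),
        # Share of confirmed competencies that were wrongly rejected.
        "false_negative_rate": fn / (fn + tp),
        # Plain agreement between decision and verification.
        "agreement": (tp + tn) / len(pairs),
    }


# Made-up records for illustration only.
decisions = [True, True, False, True, False, False, True, False]
verified = [True, False, False, True, True, False, True, False]

metrics = decision_metrics(decisions, verified)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Predictive validity against labor-market integration outcomes would need follow-up data on each applicant and is not covered by this sketch.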
Despite laboratory and pilot successes, many engineered bioprocesses remain at bench or pilot scale and require techno‑economic validation before industrial competitiveness can be established.
Review aggregate noting scale and validation status of case studies (many reported at lab or pilot fermenter scale) and explicit references to the need for TEA and LCA for industrial assessment.
Results and implications are limited by the sample and context: evidence comes from law students on a single issue-spotting exam using one brief training intervention, so generalizability to experienced professionals, other tasks, or other models is untested.
Authors’ reported sample (164 law students) and explicit caution about generalizability in the study summary; the intervention and outcome are specific to one exam and one ~10-minute training.
Some mechanism-specific estimates are imprecise due to the sample size; confidence intervals for those estimates are wide.
Authors report wide confidence intervals for mechanism decomposition (principal stratification) results based on the randomized sample of 164 students.
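A back-of-the-envelope calculation shows why estimates on subsets of a 164-person sample are imprecise. The sketch below is not the authors' principal-stratification analysis. It uses a normal approximation for a difference in means between two equal arms, with the outcome standardized to a standard deviation of 1. The even 82/82 split and the subgroup size of 20 per arm are assumptions made for illustration.

```python
import math


def ci_half_width(n_per_arm, sd=1.0, z=1.96):
    """Approximate 95% CI half-width for a difference in means, equal arms."""
    standard_error = sd * math.sqrt(2.0 / n_per_arm)
    return z * standard_error


full_sample = ci_half_width(82)  # assumes 164 students split evenly
subgroup = ci_half_width(20)     # hypothetical stratum of 20 per arm

print(f"full sample: +/- {full_sample:.2f} SD")
print(f"subgroup:    +/- {subgroup:.2f} SD")
```

Because the half-width scales with one over the square root of the arm size, an estimate that relies on a quarter of the sample has an interval roughly twice as wide.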
Overall, the protocol reframes AI governance in finance as a rights‑centered institutional design problem with direct economic consequences for market structure, credit allocation, compliance costs, and incentives shaping AI model development.
High-level synthesis claim made by the author, supported by the corpus audit (~4,200 texts), 12 years of legal research, doctrinal/comparative analysis, and the economics implications section.
Machine learning, recommender systems, NLP, computer vision, causal inference, reinforcement learning, federated learning/differential privacy/secure computation, and algorithmic governance tools are co-deployed in modern ad-tech.
Technical methods inventory drawn from literature and industry reports; no new experimental sample reported.
Personalization now spans data infrastructures, real-time bidding markets, recommender systems, creative generation, attribution pipelines, privacy tools, and governance regimes, all tightly coupled.
Survey of technical components and industry practice (system-analysis level); descriptive synthesis of common ad-tech stacks and interdependencies; no single-sample empirical audit provided.
AI has transformed personalized digital advertising from a narrow prediction task into a complex socio-technical infrastructure.
System-level conceptual analysis and literature synthesis presented in the paper; no single empirical dataset or sample size reported (review of industry components such as RTB, recommender systems, identity graphs).
There is no consensus in the literature on net job effects: studies diverge on whether AI produces net job gains.
Direct finding from the review: the 17 peer‑reviewed studies produce heterogeneous results on net employment impacts (some positive, some negative, some neutral).
Effects of AI adoption are heterogeneous across industries, firm sizes, regions, and worker characteristics (education, experience, occupation).
Microdata and firm-level studies exploiting cross-sectional and panel variation, quasi-experimental designs leveraging differential adoption across firms/regions, and comparative institutional analyses showing variation by context.