Evidence (13870 claims)
Adoption
8467 claims
Productivity
7558 claims
Governance
6805 claims
Human-AI Collaboration
6363 claims
Org Design
4132 claims
Innovation
4065 claims
Labor Markets
3526 claims
Skills & Training
2945 claims
Inequality
2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 749 | 196 | 98 | 892 | 1984 |
| Governance & Regulation | 817 | 394 | 188 | 121 | 1544 |
| Organizational Efficiency | 771 | 189 | 124 | 83 | 1177 |
| Technology Adoption Rate | 627 | 233 | 123 | 96 | 1088 |
| Research Productivity | 411 | 123 | 56 | 332 | 933 |
| Output Quality | 467 | 178 | 59 | 47 | 751 |
| Decision Quality | 320 | 174 | 75 | 42 | 618 |
| Firm Productivity | 435 | 55 | 88 | 20 | 604 |
| AI Safety & Ethics | 214 | 276 | 65 | 33 | 593 |
| Market Structure | 178 | 167 | 122 | 24 | 496 |
| Task Allocation | 207 | 64 | 71 | 32 | 379 |
| Skill Acquisition | 165 | 59 | 60 | 17 | 301 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 52 | 107 | 13 | 279 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 116 | 63 | 42 | 11 | 232 |
| Firm Revenue | 150 | 48 | 26 | 3 | 227 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Task Completion Time | 169 | 29 | 8 | 12 | 219 |
| Worker Satisfaction | 89 | 63 | 20 | 12 | 184 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 76 | 68 | 14 | 5 | 163 |
| Training Effectiveness | 93 | 21 | 13 | 19 | 148 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Automation Exposure | 51 | 54 | 22 | 12 | 142 |
| Team Performance | 86 | 17 | 27 | 9 | 140 |
| Developer Productivity | 94 | 17 | 14 | 6 | 132 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 51 | 7 | 8 | 3 | 69 |
| Creative Output | 31 | 17 | 7 | 3 | 59 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 17 | 17 | — | 51 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
The dataset spans 212 genera and 792 species.
Reported counts of taxa included in the dataset as stated in the paper.
The Antscan project produced 2,193 whole-body 3D ant datasets (scans).
Reported dataset size in the paper: 2,193 whole-body 3D volumes/meshes produced via the described scanning pipeline.
The United States manages the openness–security trade-off through decentralized, rights‑based coordination that emphasizes procedural transparency and public accountability.
Qualitative content analysis of national‑level policy texts: 18 U.S. policy documents coded across the same four analytical dimensions.
If companies are treated as recipients, they would be required to comply with nondiscrimination obligations (e.g., Title VI, Title IX, Section 504) in education contexts and may be subject to enforcement actions, corrective requirements, and private suits where applicable.
Interpretation of recipient obligations under existing civil‑rights statutes and enforcement mechanisms; doctrinal analysis and illustrative case law.
Systems biology, constraint‑based metabolic modeling (e.g., FBA), kinetic modeling, and hybrid models are effective tools to predict fluxes and identify metabolic bottlenecks.
Discussion and aggregation of modeling studies using COBRA/OptFlux frameworks, FBA simulations, and kinetic/dynamic modeling applied to engineered strains to predict flux changes and suggest genetic interventions; validated in multiple reported DBTL cycles.
Engineered microorganisms are maturing into modular, programmable “microbial factories” capable of producing complex chemicals, specialty compounds, and next‑generation biofuels.
Synthesis of multiple experimental case studies reported in the literature (bench and pilot scale fermentations) demonstrating microbial production of natural products, specialty chemicals, and biofuel molecules using engineered strains and heterologous pathways; methods include pathway assembly, enzyme engineering, and fermentation optimization.
Cluster-level interpretation can be performed via LLM-based semantic decoding to generate concise human-readable labels and descriptions for discovered themes.
Pipeline step implemented: use of an LLM to decode cluster content and produce labels/descriptions; reported in experimental workflow on ICML and ACL abstracts.
Normalized representations can be embedded into a continuous vector space and then clustered using density-based clustering to identify latent themes without pre-specifying the number of topics.
Methodological pipeline: embedding model applied to normalized representations followed by density-based clustering (algorithmic property: density-based methods do not require pre-specified cluster count). Demonstrated in experiments on ICML and ACL 2025 abstracts.
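As a rough illustration of the algorithmic property noted above, the sketch below runs a minimal DBSCAN-style clustering on toy 2-D vectors standing in for embeddings. The toy points, the `eps` and `min_pts` values, and the function name `dbscan` are assumptions made for illustration; the source does not say which embedding model or density-based algorithm the authors used.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point (-1 means noise)."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be relabeled as a border point
            continue
        cluster += 1                   # a new cluster is discovered, not pre-specified
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:   # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical "embeddings": two dense groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
          (10.0, 0.0)]
labels = dbscan(points, eps=0.5, min_pts=3)
n_clusters = len({label for label in labels if label != -1})
print(labels, n_clusters)
```

The number of clusters falls out of the density parameters rather than being supplied up front, which is the property the evidence line refers to.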
The authors introduce clinical-model instruments such as the Model Temperament Index (behavioral profiling), Model Semiology (structured symptom lexicon), and M-CARE (standardized case reporting).
Proposed indices and reporting formats presented in the methods and applied in demonstrations/cases within the paper.
The paper proposes a five-layer diagnostic framework: staged assessment from symptom description to mechanistic localization and prognosis.
Framework design documented in the paper and applied in case demonstrations (descriptive pipeline combining symptom elicitation, profiling, semiology, imaging/localization, and reporting).
Neural MRI (Model Resonance Imaging) maps five medical neuroimaging modalities to corresponding AI interpretability techniques (e.g., structural → weight-space maps, functional → activation dynamics, connectivity → representational similarity).
Methodological mapping and toolkit design described in the paper (conceptual mapping and implemented open-source toolkit).
The authors present a discipline taxonomy comprising 15 subdisciplines grouped into four divisions: Basic Model Sciences, Clinical Model Sciences, Model Public Health, and Model Architectural Medicine.
Taxonomic synthesis produced by the authors from interpretability, reliability, governance, and architecture literatures (documented taxonomy in the paper).
The paper defines 'Model Medicine' as a unified research program treating AI models like organisms with diagnosable, classifiable, and treatable states.
Conceptual framing and theoretical synthesis presented in the paper (literature-driven argumentation; no empirical sample required).
China’s National Public Cultural Service System Demonstration Zone program raised employment in the cultural sector.
Multi-period difference-in-differences (DID) analysis exploiting staggered adoption of the Demonstration Zone designation across 280 prefecture-level Chinese cities, 2008–2021; primary outcome measured: city-level cultural-sector employment; models include city and year fixed effects.
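A staggered-adoption design of this kind is commonly written as a two-way fixed-effects regression. The specification below is a generic sketch rather than the paper's reported equation, and all symbols are assumptions introduced for illustration.

```latex
Y_{ct} = \beta \, D_{ct} + \mu_c + \lambda_t + \varepsilon_{ct}
```

Here $Y_{ct}$ is cultural-sector employment in city $c$ and year $t$, $D_{ct}$ equals 1 once city $c$ has received the Demonstration Zone designation and 0 otherwise, $\mu_c$ and $\lambda_t$ are city and year fixed effects, and $\beta$ is the difference-in-differences estimate of the program effect.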
Training improved exam scores by 0.27 grade points relative to optional access without training (p = 0.027).
Intent-to-treat comparison between the optional-access-with-training arm and the optional-access-without-training arm in the randomized trial (n = 164); reported effect size = +0.27 grade points and p-value = 0.027.
A brief, targeted training increased voluntary LLM use from 26% (optional access without training) to 41% (optional access with training).
Randomized experiment with 164 law students assigned to three arms (no access, optional access, optional access + ~10-minute training). Observed adoption rates in the two optional-access arms were 26% (untrained) vs. 41% (trained).
A one standard deviation increase in regional AI exposure raises total factor energy efficiency (TFEE) by about 3.2% in Chinese cities.
Panel analysis of 274 Chinese cities over 2007–2021 using an AI exposure index and TFEE as outcome; causal estimation relies on an instrumental-variables strategy (instruments: U.S. robot-adoption patterns and geographic proximity to external AI clusters).
A research agenda prioritizing empirical evaluation, model transparency, and rigorous impact assessment is required to translate conceptual promise into measurable public value.
Explicit recommendation in the blurb identifying research priorities; not an empirical claim but a proposed course of action.
Illustrative vignettes show AI in action: logistics optimization for trade, AI models for national fiscal decision-making, and algorithmic job-acceleration for individual labor market navigation.
Reference to specific case vignettes contained in the book; these are illustrative scenarios rather than empirical case studies with measured outcomes.
Ten defining policy questions structure the book’s approach, turning abstract AI capabilities into operational policy choices.
Descriptive claim about the book's organization; verifiable by inspecting the book's table of contents (no external empirical data).
International comparability in these analyses is achieved using PPP adjustments for monetary measures and standardized occupation/task classifications (ISCO/ISCO-08) with harmonized baseline years and variable definitions.
Described data harmonization procedures across multi-country firm and worker datasets, including PPP adjustments and use of ISCO classification for occupations.
Adoption of advanced AI tools (especially generative AI) raises firm-level productivity on average.
Meta-analysis of firm-level panel studies using administrative tax and manufacturing surveys and proprietary AI-usage logs; difference-in-differences and event-study estimates comparing adopters vs non-adopters with firm fixed effects and robustness checks.
The compendium issues specific policy-design recommendations for economic policymakers: deploy proportional compliance obligations and regulatory sandboxes, subsidize or certify third‑party auditors, monitor credit availability and pricing post‑implementation, and coordinate cross‑border standards.
Explicit policy recommendations listed in the "Policy design recommendations" subsection; derived from the paper's interdisciplinary analysis.
The protocol has been prepared/indexed across 15 strategic languages to facilitate international diffusion and comparative uptake.
Stated multilingual/global indexing claim in the compendium (15 languages).
The paper implements a "White Box" regulatory protocol for AI in Mexico's financial sector requiring algorithmic transparency, auditability, explainability, and non‑discrimination standards for credit/FinTech algorithms.
Output of the technical protocol described in the compendium; developed from a forensic audit of source materials and legal-methodological synthesis (doctrinal/comparative analysis).
The compendium proposes recognizing "Digital Sovereignty" as a new fundamental human right that protects individuals’ autonomy, data sovereignty, due process, and non-discrimination in algorithmic financial decision‑making.
Normative definitional claim in the protocol; grounded in the author's doctrinal and comparative legal analysis across 12 years (2014–2026).
Recommended policy approach: run pilots to empirically measure trade‑offs, combine obligations with capacity building (technical assistance, shared datasets, sandboxes), harmonize with international frameworks, and use staged implementation with cost‑benefit analyses.
Policy recommendations derived from the compendium’s interdisciplinary synthesis and economic/policy analysis (prescriptive, not empirically validated within the paper).
Policy operationalization should include algorithmic impact assessments, audit logs, disclosure regimes to regulators/judiciary, redress/grievance mechanisms, and governance principles (open, transparent, accountable).
Prescriptive policy instruments and standards proposed in the compendium based on the forensic audit and normative design work; descriptive claim about the protocol’s recommended instruments.
There is a need for standardized metrics to quantify benefits and costs of governed hyperautomation (e.g., ROI adjusted for compliance risk, incident rate per automation scale, oversight hours per automated transaction, model drift frequency and remediation cost).
Paper's recommendations and research agenda calling for standardized metrics and empirical studies; prescriptive statement rather than empirical finding.
Researchers and policymakers should promote auditable, privacy-preserving attribution standards and independent audits while supporting randomized trials and field experiments under privacy constraints.
Policy/actionable takeaways informed by methodological challenges and literature on randomized trials and privacy-preserving methods; prescriptive guidance rather than an empirically tested program.
There is a need for standardized benchmarks and privacy-preserving shared datasets to enable independent economic evaluation of ad-tech.
Methodological recommendation informed by stated data access asymmetries and reproducibility concerns; not accompanied by a new benchmark in the paper.
Antitrust analysis of ad-tech should incorporate algorithmic effects such as endogenous use of ML to entrench platform position and data network effects.
Theoretical and policy argument drawing on platform economics and ML scale advantages; recommendation rather than empirical finding.
Combining secure aggregation and differential privacy can materially reduce centralized custody risks.
Conceptual systems design and analytical discussion combining cryptographic and statistical privacy mechanisms; threat model argues joint effect reduces reconstruction and limits leakage. No field measurements of residual risk provided.
Secure aggregation protocols (cryptographic aggregation, MPC) can prevent reconstruction of individual updates and thus materially reduce risk of exposing raw behavioral logs to centralized custodians.
Systems design and threat modeling mapping secure aggregation techniques to privacy risk reduction; references to standard cryptographic protocols. Empirical support limited to conceptual mapping and prototype/simulation; no deployment measurements.
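The cancellation idea behind mask-based secure aggregation can be sketched in a few lines. This is a toy under stated assumptions: the single seeded generator stands in for pairwise shared secrets, the integer updates and the names `masked_updates` and `MODULUS` are hypothetical, and real protocols add key agreement and dropout handling that are omitted here.

```python
import random

MODULUS = 2**32  # all arithmetic is done modulo a fixed integer

def masked_updates(updates, seed=0):
    """Add pairwise masks that cancel in the sum (toy secure aggregation)."""
    rng = random.Random(seed)  # assumption: stand-in for pairwise shared secrets
    n = len(updates)
    masked = list(updates)
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.randrange(MODULUS)
            masked[i] = (masked[i] + mask) % MODULUS  # party i adds the mask
            masked[j] = (masked[j] - mask) % MODULUS  # party j subtracts it
    return masked

updates = [12, 7, 30]                 # hypothetical integer-encoded model updates
masked = masked_updates(updates)      # what the aggregator would receive
aggregate = sum(masked) % MODULUS     # masks cancel, leaving only the total
print(aggregate)  # 49
```

The aggregator recovers the sum of the updates while each individual masked value reveals nothing useful on its own, which is the sense in which reconstruction of individual updates is prevented.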
Model training can occur locally on devices/publishers/advertiser endpoints such that only model updates (not raw behavior logs) are shared and aggregated to produce cross-platform personalization.
Architectural description and conceptual design of a federated advertising paradigm (multi-layer architecture); prototype/simulation examples illustrating update-only aggregation. No real-world deployment data.
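The update-only pattern described above can be illustrated with a minimal federated-averaging loop. Everything here is an assumption for illustration: the one-parameter model `y = w * x`, the toy client datasets, the learning rate, and the function names are not taken from the paper.

```python
def local_update(w, data, lr=0.1):
    """One gradient step on local (x, y) pairs for y = w * x; returns only the weight delta."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return -lr * grad

def federated_round(w, clients):
    """Average the clients' deltas; raw (x, y) records never leave the client."""
    deltas = [local_update(w, data) for data in clients]
    return w + sum(deltas) / len(deltas)

# Hypothetical local datasets, all generated by the relation y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(1.0, 2.0), (4.0, 8.0)],
]

w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
print(round(w, 6))
```

Only the scalar deltas cross the client boundary in this sketch; in a fuller design they would additionally be masked or noised before aggregation.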
The positive effect of digital rural development on AGTFP is robust to alternative variable constructions, sample adjustments, and endogeneity treatments (e.g., instrumental-variable/other methods).
Robustness exercises reported in the paper: re-specification of the digitalization measure, re-sampling/alternative sample specifications, and use of instrumental/other methods to address endogeneity.
Digital rural development in China significantly increases agricultural green total factor productivity (AGTFP).
Fixed-effects panel regression using provincial panel data for 30 Chinese provinces from 2012–2022 (≈330 province-year observations), with reported significance and robustness checks (alternative measures, sample adjustments, and endogeneity tests).
VIS produces interrelated metrics that explicitly include indirect labor embodied throughout the supply chain rather than only direct labor employed in a reported sector.
Computation of vertically integrated sector vectors from input–output matrices and allocation of upstream labor inputs to final-sector output; reported construction of VIS-based labor input metrics.
Adapting Pasinetti’s vertically integrated sectors framework enables production of time-series productivity measures at the subsystem level.
Methodological adaptation described and applied to annual data (2014–2023) to produce VIS-based productivity time series for subsystems (e.g., electric generation subsystem).
The VIS approach captures both direct and indirect (upstream) labor effects by attributing upstream labor requirements to final-sector outputs using Leontief-type inverses / vertically integrated sector vectors.
Methodology constructs annual input–output matrices (BEA + IMPLAN mapping) and computes Leontief-type inverses/vertically integrated sector vectors to allocate direct and indirect requirements; upstream labor is attributed to final output using BLS employment/hours data.
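The attribution of upstream labor through a Leontief-type inverse can be sketched on a two-sector toy economy. The coefficient matrix `A`, the direct labor coefficients, and all names below are hypothetical values chosen for illustration, not figures from the BEA, IMPLAN, or BLS data the paper uses.

```python
def inverse_2x2(m):
    """Closed-form inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# A[i][j]: input from sector i required per unit of sector j's output (hypothetical).
A = [[0.2, 0.3],
     [0.1, 0.4]]
# Direct labor hours per unit of gross output in each sector (hypothetical).
direct_labor = [0.5, 0.2]

identity_minus_A = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(2)]
                    for i in range(2)]
leontief = inverse_2x2(identity_minus_A)  # (I - A)^-1

# Vertically integrated labor coefficients: v = l (I - A)^-1.
total_labor = [sum(direct_labor[i] * leontief[i][j] for i in range(2))
               for j in range(2)]
indirect_labor = [total_labor[j] - direct_labor[j] for j in range(2)]
print(total_labor, indirect_labor)
```

Each entry of `total_labor` is the direct plus upstream labor needed per unit of final output, and the gap between it and `direct_labor` is the indirect component that a direct-labor-only measure would miss.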
There is a widespread consensus across the reviewed literature on the need for worker upskilling, active labor‑market policies, and targeted support for displaced workers.
Policy recommendations recurring in the majority of the 17 peer‑reviewed papers synthesized in the review.
The framework supports counterfactual scenario simulations that vary capability diffusion, adoption rates, policy interventions, and firm behavior to explore how exposures might translate into outcomes.
Description of scenario and simulation capabilities in the methods: agent-based experiments are run with parameterized counterfactuals for diffusion, adoption, and policy levers.
Alternative training channels (self-education and professional retraining) are nontrivial contributors to the AI workforce supply.
Comparative analysis showing inclusion of self-education and retraining contributions in the aggregate coverage estimate (the 43.9% figure explicitly includes these channels); descriptive counts/estimates of non-degree trained entrants.
A subset of universities performs markedly better on employment effectiveness, graduate wages, and placement into popular AI roles (i.e., identifiable high-performing institutions).
Comparative analysis across the 191 universities, including employment rates, observed wage outcomes, and placement distributions; identification and reporting of key/high-performing institutions and their metrics.
Russian universities that run AI-related educational programs are contributing substantially to the national AI workforce supply.
Institutional-level monitoring data from n = 191 universities showing program enrollments, graduate counts and graduate employment into AI-related roles (descriptive analysis of supply from degree programs).
AI complements high-skill labor and raises returns to advanced cognitive and creative skills.
Microdata wage analyses and task-complementarity mappings that link AI-exposed tasks with skill groups, supported by panel regressions showing higher wages/earnings growth for higher-skill workers and by theoretical task-based models predicting complementarity.
Environmental gains materialise where oversight intensity, data quality, and targeted use cases align — governance quality conditions the conversion of adoption into credible emissions reductions.
Case-level comparisons and cross-case synthesis from interviews, surveys, and document analysis suggesting that alignment of oversight, data quality and use-case targeting is associated with measurable environmental outcomes in some cases. (Sample size not reported; no quantified emissions effects provided.)
Leadership practices alone may be insufficient to improve Employee Productivity unless supported by appropriate technological infrastructure and organizational flexibility.
Interpretation by the authors based on the non-significant direct effect of Digital Leadership on Employee Productivity (reported β and p-value) and theoretical reasoning within the paper.
These empirical findings quantify the verifiability-usefulness tension identified theoretically by Leinweber et al.
Comparison of the paper's empirical measurements (verification acceptance of random work, lack of useful inference, economic and market impacts) to the previously-theorized verifiability-usefulness tension.
Anthropic's asymmetry is consistent with its heavier reliance on retrieval-unattributed generation: 43-52% of its recommendations lack observed retrieval-layer evidence, versus 8-29% for OpenAI (documented in Jack 2026).
Reported proportions of recommendations lacking observed retrieval-layer evidence attributed to Anthropic vs OpenAI; citation to Jack 2026 for documentation.