Evidence (2215 claims)

Claims by topic area: Adoption (5126), Productivity (4409), Governance (4049), Human-AI Collaboration (2954), Labor Markets (2432), Org Design (2273), Innovation (2215), Skills & Training (1902), Inequality (1286).
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 369 | 105 | 58 | 432 | 972 |
| Governance & Regulation | 365 | 171 | 113 | 54 | 713 |
| Research Productivity | 229 | 95 | 33 | 294 | 655 |
| Organizational Efficiency | 354 | 82 | 58 | 34 | 531 |
| Technology Adoption Rate | 277 | 115 | 63 | 27 | 486 |
| Firm Productivity | 273 | 33 | 68 | 10 | 389 |
| AI Safety & Ethics | 112 | 177 | 43 | 24 | 358 |
| Output Quality | 228 | 61 | 23 | 25 | 337 |
| Market Structure | 105 | 118 | 81 | 14 | 323 |
| Decision Quality | 154 | 68 | 33 | 17 | 275 |
| Employment Level | 68 | 32 | 74 | 8 | 184 |
| Fiscal & Macroeconomic | 74 | 52 | 32 | 21 | 183 |
| Skill Acquisition | 85 | 31 | 38 | 9 | 163 |
| Firm Revenue | 96 | 30 | 22 | — | 148 |
| Innovation Output | 100 | 11 | 20 | 11 | 143 |
| Consumer Welfare | 66 | 29 | 35 | 7 | 137 |
| Regulatory Compliance | 51 | 61 | 13 | 3 | 128 |
| Inequality Measures | 24 | 66 | 31 | 4 | 125 |
| Task Allocation | 64 | 6 | 28 | 6 | 104 |
| Error Rate | 42 | 47 | 6 | — | 95 |
| Training Effectiveness | 55 | 12 | 10 | 16 | 93 |
| Worker Satisfaction | 42 | 32 | 11 | 6 | 91 |
| Task Completion Time | 71 | 5 | 3 | 1 | 80 |
| Wages & Compensation | 38 | 13 | 19 | 4 | 74 |
| Team Performance | 41 | 8 | 15 | 7 | 72 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 17 | 15 | 9 | 5 | 46 |
| Job Displacement | 5 | 28 | 12 | — | 45 |
| Social Protection | 18 | 8 | 6 | 1 | 33 |
| Developer Productivity | 25 | 1 | 2 | 1 | 29 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Skill Obsolescence | 3 | 18 | 2 | — | 23 |
| Labor Share of Income | 7 | 4 | 9 | — | 20 |
Filter: Innovation. Each claim below is paired with a note describing its evidentiary basis.
**Claim:** This study uses panel data from 30 Chinese provinces (2011–2022) and estimates a spatial simultaneous equations model using the Generalized Spatial Three-Stage Least Squares (GS3SLS) approach.
**Evidence:** Methodology described in the paper: a panel covering 30 provinces over 2011–2022 (12 years), with spatial simultaneous equations estimated by GS3SLS.
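To make the estimation strategy concrete, here is a minimal, hypothetical sketch of the spatial two-stage least squares core that GS3SLS builds on: the spatial lag Wy is endogenous and is instrumented with spatially lagged exogenous regressors (WX, W²X). The weights matrix, variable names, and data below are synthetic stand-ins, not the paper's.

```python
# Hypothetical sketch: the spatial-2SLS core of GS3SLS for one equation.
import numpy as np
import pandas as pd
from linearmodels.iv import IV2SLS

rng = np.random.default_rng(0)
n_prov, n_years = 30, 12

W = rng.random((n_prov, n_prov))          # synthetic spatial weights
np.fill_diagonal(W, 0)
W /= W.sum(axis=1, keepdims=True)         # row-standardize

x = rng.normal(size=(n_years, n_prov))
y = np.stack([np.linalg.solve(np.eye(n_prov) - 0.4 * W,
                              2.0 * xt + rng.normal(size=n_prov))
              for xt in x])               # DGP with true spatial-lag rho = 0.4

df = pd.DataFrame({
    "y": y.ravel(), "x": x.ravel(),
    "Wy": (y @ W.T).ravel(),              # endogenous spatial lag of y
    "Wx": (x @ W.T).ravel(),              # instruments: spatial lags of x
    "W2x": (x @ (W @ W).T).ravel(),
    "const": 1.0,
})
res = IV2SLS(df["y"], df[["const", "x"]], df[["Wy"]], df[["Wx", "W2x"]]).fit()
print(res.params["Wy"])                   # estimate of rho, should be near 0.4
```

Full GS3SLS additionally stacks the equations of the system and applies a generalized-moments step for spatially correlated errors; the sketch shows only the single-equation IV building block.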
**Claim:** Personal data are nonrivalrous and highly replicable, so selling data does not follow ordinary scarcity logic.
**Evidence:** Analytic claim about the economic properties of digital information; supported by conceptual definitions and common technical facts about data replication; no empirical sampling needed.

**Claim:** The empirical approach measured and compared expectation formation, innovation responses, and pipeline outcomes across local exposure to closures and across distinct entrepreneurial identity groups.
**Evidence:** Methodological description: a survey-based, cross-country quantitative approach using measures of local exposure (nearby closures), identity classification (family/purpose-driven vs. wealth-driven), and outcomes (expectations, perceived impediments, self-reported innovation, pipeline transitions) in a sample of more than 27,000.

**Claim:** The study analyzes a cross-country sample of more than 27,000 entrepreneurs across 43 countries (survey-based, comparative).
**Evidence:** Descriptive claim about the dataset: a survey-based sample of more than 27,000 entrepreneurs spanning 43 countries, as reported in Data & Methods.

**Claim:** The empirical approach tests for common long-run relationships across patenting series and identifies structural breaks concentrated after 2010.
**Evidence:** Description of the empirical strategy: time-series econometric analysis of patent-filing series (1980–2019), including tests for common long-run relationships (cointegration) and structural-break detection; the paper reports the presence or absence of common trends and the timing of breaks.
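An illustrative sketch (not the paper's code) of the two test families this strategy combines: an Engle-Granger cointegration test for a common long-run relationship between two synthetic patent series, and a hand-rolled Chow test for a structural break at 2010.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(1)
years = np.arange(1980, 2020)
trend = np.cumsum(rng.normal(size=years.size))           # shared stochastic trend
series_a = trend + rng.normal(scale=0.5, size=years.size)
series_b = 0.8 * trend + rng.normal(scale=0.5, size=years.size)

t_stat, p_value, _ = coint(series_a, series_b)           # H0: no cointegration
print(f"Engle-Granger p-value: {p_value:.3f}")

# Chow test: does the a ~ b relationship shift after 2010?
def ssr(y, X):
    return sm.OLS(y, sm.add_constant(X)).fit().ssr

pre, post = years <= 2010, years > 2010
ssr_pooled = ssr(series_a, series_b)
ssr_split = ssr(series_a[pre], series_b[pre]) + ssr(series_a[post], series_b[post])
k = 2                                                    # constant + slope
f = ((ssr_pooled - ssr_split) / k) / (ssr_split / (years.size - 2 * k))
print(f"Chow F-statistic at 2010 break: {f:.2f}")
```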
**Claim:** Empirical validation of the integrated Kondratieff–Schumpeter–Mandel framework requires firm-level adoption and profitability data, sectoral investment series, and cross-country comparisons using panel methods and identification strategies (e.g., difference-in-differences, IV).
**Evidence:** Recommendation from the methods/limitations section, which explicitly states that no single micro-econometric identification strategy was reported and outlines the required data and methods.

**Claim:** The three frameworks (Kondratieff, Schumpeter, Mandel) are complementary: Kondratieff frames periodicity, Schumpeter supplies micro-mechanisms of innovation-driven change, and Mandel foregrounds socio-political constraints and distributional outcomes.
**Evidence:** Conceptual integration and comparative theoretical analysis (qualitative synthesis).

**Claim:** Kondratieff's framework is useful for identifying broad periodicities (recurring phases of expansion and stagnation) in capitalist development but is less specific about microeconomic mechanisms.
**Evidence:** Theoretical review of the Kondratieff literature and conceptual assessment (qualitative).

**Claim:** No new laboratory measurements or datasets are reported in the paper; the approach is methodological and conceptual rather than empirical.
**Evidence:** Methods section and explicit statements within the paper noting the absence of new data; verifiable by reading the paper.

**Claim:** These operators are presented as conceptual/theoretical bridges rather than immediately quantifiable laboratory units.
**Evidence:** Explicit methodological statement in the paper emphasizing interpretive/theoretical intent; no empirical operationalization reported.

**Claim:** The paper calls for subsequent quantitative validation (using task-based, matched employer-employee, and provider-level panel data) to estimate causal impacts on productivity, health outcomes, wages, and employment composition across the three interaction levels.
**Evidence:** Stated research agenda and measurement recommendations in the paper's discussion section.

**Claim:** The study is qualitative and small-sample (four cases), and therefore interpretive and illustrative rather than statistically generalizable.
**Evidence:** Explicit methodological statement in the paper: design = qualitative multiple case study; sample = four AI healthcare applications.

**Claim:** The study identifies a three-level taxonomy of human–AI interaction in healthcare: AI-assisted, AI-augmented, and AI-automated.
**Evidence:** Conceptual taxonomy derived from multiple qualitative case studies (n=4) using cross-case comparison and Bolton et al.'s (2018) three-dimensional service-innovation framework.

**Claim:** The paper's scope is primarily conceptual/theoretical and literature-based rather than an empirical case study or large-scale data experiment; it emphasizes the need for future empirical validation.
**Evidence:** Explicit methodological description within the paper stating reliance on literature review and conceptual development; no empirical sample or case study.

**Claim:** Typical evaluation metrics reported are accuracy, precision, recall, F1-score, AUC, detection rate, false positive rate, latency, and computational cost.
**Evidence:** Survey of evaluation practices across the reviewed papers, listing the metrics authors commonly report.
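For reference, most of these metrics are one-liners in scikit-learn; the sketch below computes them on hypothetical IDS predictions (latency and computational cost would instead be measured at inference time).

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

# Hypothetical labels and scores for a binary IoT intrusion detector.
y_true  = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.8, 0.7, 0.3, 0.2, 0.9, 0.6])
y_pred  = (y_score >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("accuracy            ", accuracy_score(y_true, y_pred))
print("precision           ", precision_score(y_true, y_pred))
print("recall/detection rate", recall_score(y_true, y_pred))
print("F1-score            ", f1_score(y_true, y_pred))
print("AUC                 ", roc_auc_score(y_true, y_score))
print("false positive rate ", fp / (fp + tn))
```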
**Claim:** Emerging approaches in the literature include federated learning, online/streaming learning, and transfer learning for cross-device generalization.
**Evidence:** Trend analysis across recent papers indicating adoption of federated and continual-learning paradigms and transfer-learning techniques.

**Claim:** Unsupervised and semi-supervised methods (clustering, one-class classifiers, autoencoder-based anomaly detectors) are commonly employed to handle unlabeled or anomalous IoT traffic.
**Evidence:** Synthesis of studies using anomaly-detection paradigms and unsupervised techniques reported in the reviewed papers.

**Claim:** Deep learning approaches used include CNNs, RNNs/LSTMs for sequence and traffic analysis, and autoencoders for anomaly detection.
**Evidence:** Surveyed literature and taxonomy noting multiple studies that apply convolutional and recurrent architectures and autoencoders to network-traffic data.

**Claim:** Common ML approaches reported for IoT intrusion detection systems (IDS) include supervised models (random forest, SVM, gradient boosting, neural networks).
**Evidence:** Taxonomy and literature synthesis showing frequent use of classical supervised classifiers in the surveyed papers and experiments.
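A compact sketch of the two paradigms these claims contrast: a supervised random forest trained on labeled flows, and an unsupervised IsolationForest fitted only to benign traffic. The synthetic features are placeholders for real flow statistics, not data from any surveyed study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, IsolationForest
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
normal = rng.normal(0, 1, size=(950, 8))                 # benign flow features
attack = rng.normal(3, 1, size=(50, 8))                  # anomalous flows
X = np.vstack([normal, attack])
y = np.array([0] * 950 + [1] * 50)

# Supervised paradigm: classifier trained on labeled traffic.
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xtr, ytr)
print("supervised accuracy:", clf.score(Xte, yte))

# Unsupervised paradigm: anomaly detector fitted to benign traffic only.
iso = IsolationForest(contamination=0.05, random_state=0).fit(normal)
flags = iso.predict(attack)                              # -1 = flagged anomaly
print("attacks flagged unsupervised:", (flags == -1).mean())
```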
**Claim:** This work is a conceptual framework and design proposal synthesizing methods from recommender systems and human–robot interaction (HRI), rather than a report of novel empirical experiments.
**Evidence:** Explicit statement in the paper's Data & Methods section.

**Claim:** The study's empirical identification relies on longitudinal variation with city and time fixed effects, plus non-linear/threshold identification via a quadratic (DE²) term and threshold regression using green-technology innovation as the threshold variable.
**Evidence:** Description of the empirical strategy: panel fixed-effects models (controlling for time-invariant city heterogeneity and common time shocks), mediating-effect models for channel tests, and threshold-regression models for regime-dependent effects, applied to the 278-city panel for 2011–2022.
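As an illustration of the core specification (not the paper's code or data), a two-way fixed-effects panel with a quadratic digital-economy term can be written with linearmodels; the threshold regression would additionally split regimes on the green-technology-innovation variable. Column names (DE, DE2, outcome) are hypothetical.

```python
import numpy as np
import pandas as pd
from linearmodels.panel import PanelOLS

rng = np.random.default_rng(3)
idx = pd.MultiIndex.from_product([range(278), range(2011, 2023)],
                                 names=["city", "year"])
df = pd.DataFrame({"DE": rng.random(len(idx))}, index=idx)
df["DE2"] = df["DE"] ** 2
df["outcome"] = 1.5 * df["DE"] - 0.9 * df["DE2"] + rng.normal(scale=0.1, size=len(df))

# Two-way fixed effects: absorbs city heterogeneity and common time shocks.
res = PanelOLS(df["outcome"], df[["DE", "DE2"]],
               entity_effects=True, time_effects=True).fit()
print(res.params)   # positive DE with negative DE2 implies an inverted-U effect
```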
**Claim:** Research recommendation: invest in longer-run, rigorous impact evaluations (RCTs, panel studies) and system-level assessments to capture spillovers and sustainability outcomes.
**Evidence:** Authors' stated research agenda, based on methodological gaps identified in the review (limited long-term and system-level evidence).

**Claim:** The evidence base varies in study design and quality (RCTs, quasi-experimental studies, observational case studies, pilots).
**Evidence:** Methodological caveats noted by the authors summarizing the diversity of designs across the reviewed studies.

**Claim:** The review used a structured literature review with thematic synthesis and a comparative effect-size analysis to quantify ranges for yield, cost, and efficiency outcomes.
**Evidence:** Authors' description of the review approach and analytical methods in the Data & Methods section.
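The effect-size step can be illustrated with a standard pooled-standard-deviation Cohen's d; the yield figures below are invented for the example and are not from the review.

```python
import numpy as np

def cohens_d(treated, control):
    """Standardized mean difference with pooled sample variance."""
    n1, n2 = len(treated), len(control)
    pooled_var = ((n1 - 1) * np.var(treated, ddof=1) +
                  (n2 - 1) * np.var(control, ddof=1)) / (n1 + n2 - 2)
    return (np.mean(treated) - np.mean(control)) / np.sqrt(pooled_var)

yield_ai   = np.array([4.1, 4.4, 3.9, 4.6, 4.2])   # t/ha with AI advisory (toy)
yield_base = np.array([3.6, 3.8, 3.5, 3.9, 3.7])   # t/ha baseline (toy)
print(f"Cohen's d: {cohens_d(yield_ai, yield_base):.2f}")
```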
**Claim:** The evidence base reviewed comprises more than 60 peer-reviewed articles and institutional reports from 2020–2025, focusing primarily on Sub-Saharan Africa.
**Evidence:** Statement in the paper's Data & Methods section describing the scope and composition of the review sample.

**Claim:** Effect sizes and impacts vary substantially across contexts, by crop, farm size, and institutional setting.
**Evidence:** Comparative synthesis across studies showing heterogeneity in reported outcomes, with the authors' methodological caveats highlighting context dependence.

**Claim:** Technologies assessed in the review include predictive analytics, digital advisory systems, smart irrigation, pest/disease detection, and precision fertilization.
**Evidence:** Descriptive synthesis of the AI and digital technologies evaluated across the >60 reviewed articles and reports (2020–2025).

**Claim:** The reported quantitative performance figures come from case-level, high-performer pilots and should not be treated as typical industry benchmarks.
**Evidence:** Authors' caveat based on the composition of the evidence base (skewed toward pilots and selected advanced implementations; limited longitudinal and multi-project empirical studies).

**Claim:** Inter-rater reliability for study selection and coding was Cohen's κ = 0.83 (almost-perfect agreement on the Landis–Koch scale).
**Evidence:** Reported inter-rater reliability statistic from the review's quality-control step (Cohen's κ = 0.83).
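For readers unfamiliar with the statistic, κ is computed from two raters' decisions and corrects raw agreement for chance agreement; a toy computation with scikit-learn (labels invented, not the review's data):

```python
from sklearn.metrics import cohen_kappa_score

rater_a = ["include", "exclude", "include", "include", "exclude", "include"]
rater_b = ["include", "exclude", "include", "exclude", "exclude", "include"]
# 1.0 = perfect agreement, 0 = chance-level agreement
print(cohen_kappa_score(rater_a, rater_b))
```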
**Claim:** The review screened 463 Scopus records (2018–2026) and selected 160 peer-reviewed studies using a PRISMA-guided process.
**Evidence:** Systematic literature review as described in the paper: Scopus search (2018–2026) with PRISMA screening and eligibility filtering; initial n = 463, final n = 160.

**Claim:** Experimental structure determination (X-ray crystallography, NMR, cryo-EM) remains the gold standard but is slow, costly, and low-throughput.
**Evidence:** The paper explicitly calls experimental methods the "gold standard" and characterizes them as slow, costly, and low-throughput; the PDB is cited as the source of structural ground truth.

**Claim:** The authors did not perform primary empirical validation or simulation of TVR-Sec across real VR deployments.
**Evidence:** The methods and limitations sections explicitly state that no original empirical experiments or simulations were conducted; the analysis is conceptual and qualitative.

**Claim:** The paper's scope comprised a comparative literature review and conceptual integration of 31 peer-reviewed studies published between 2023 and 2025.
**Evidence:** Authors' methods description specifying the sample size and publication window: 31 peer-reviewed studies (2023–2025).

**Claim:** The study is descriptive and comparative rather than quantitative; it relies on available policy documents and secondary literature rather than original field interviews or measured outcomes.
**Evidence:** Explicit methodological statement in the paper listing qualitative document analysis, comparative literature review, and policy commentary; the limitation is acknowledged by the authors.

**Claim:** A research agenda for AI economics should include: formalizing consent as a transaction/contracting problem; empirical RCTs and natural experiments measuring the effects of consent designs; mechanism design for privacy-preserving data sharing; and policy evaluation of consent regulations.
**Evidence:** Research directions explicitly listed in the workshop outputs and position papers; these are proposed next steps rather than empirical findings.

**Claim:** Follow-up empirical methods should include qualitative interviews, focus groups, usability studies, field experiments (A/B tests), and policy/legal-technical assessments.
**Evidence:** Research methods enumerated in the workshop outputs and position papers; these are proposed future methods rather than findings from conducted studies.

**Claim:** The Futures Design Toolkit (scenario planning, persona generation, speculative design) was used as a primary method in the workshop.
**Evidence:** Methodological description in the workshop summary listing the Futures Design Toolkit and associated activities; a procedural claim rather than an empirical one.

**Claim:** The workshop identifies specific research directions for AI economics: cost–benefit and ROI analyses of shared infrastructure; market design for procurement of co-designed systems; models of innovation incentives under different IP/data-governance regimes; labor-market impact assessments; and empirical studies of how validation ecosystems affect adoption rates and pricing.
**Evidence:** Research directions explicitly listed in the workshop summary and roadmap produced by consensus at the NSF workshop (Sept 26–27, 2024).

**Claim:** The workshop's findings are based on qualitative synthesis of expert judgment and stakeholder inputs rather than primary empirical data or controlled experiments.
**Evidence:** Explicitly stated in the Data & Methods section of the workshop summary; methods: expert panels, thematic breakout sessions, cross-disciplinary discussions, and consensus-building.

**Claim:** The workshop convened researchers, clinicians, and industry leaders to address co-design across four thematic areas: teleoperations/telehealth/surgical operations; wearable and implantable medicine; home ICU/hospital systems/elderly care; and medical sensing/imaging/reconstruction.
**Evidence:** Workshop agenda and participant list from the two-day NSF workshop (Sept 26–27, 2024); methods included thematic breakout sessions on these four areas. Documentation at https://sites.google.com/view/nsfworkshop.

**Claim:** The paper uses a mixed-methods approach combining a systematic literature review with an empirical practitioner survey to assess perceptions, adoption, and impact of AI-driven tools.
**Evidence:** Methodological statement in the paper; the survey covers tool usage, perceived benefits, challenges, and expectations.

**Claim:** The approach shifts some computational burden to obtaining MCMC samples of the parameter posterior, requiring access to (or the ability to compute) MCMC samples before surrogate training.
**Evidence:** Method description: training data are MCMC-drawn parameter vectors; the paper notes this practical requirement and the trade-off (MCMC cost versus avoiding repeated expensive forward-model evaluations).
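A minimal sketch of this workflow, with a toy forward model standing in for the expensive simulator and posterior-like draws standing in for actual MCMC output (in practice these would come from a sampler such as NUTS). All names are hypothetical.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
# Stand-in for pre-computed MCMC draws of a 2-D parameter posterior.
theta_mcmc = rng.normal(loc=[0.5, -1.0], scale=0.2, size=(2000, 2))

def forward_model(theta):
    """Toy stand-in for an expensive simulator, evaluated once per draw."""
    return np.sin(theta[:, 0]) + theta[:, 1] ** 2

y = forward_model(theta_mcmc)
surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                         random_state=0).fit(theta_mcmc, y)

# The surrogate is cheap to evaluate on posterior-typical inputs.
theta_new = rng.normal(loc=[0.5, -1.0], scale=0.2, size=(5, 2))
print(surrogate.predict(theta_new))
print(forward_model(theta_new))
```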
**Claim:** More theoretical work is needed to establish guarantees (consistency, asymptotic behavior, and frequentist coverage) for these networks when applied in economic settings.
**Evidence:** Stated research need/caveat in the paper; no new theoretical proofs are provided to establish these properties.

**Claim:** The Boson Sampling Born Machine (BSBM) is a generative model whose model distribution is the output probability distribution of a linear-optical (bosonic-mode) circuit.
**Evidence:** Definition and constructive specification in the paper: the model architecture is described as linear-optical circuits whose outputs are given by bosonic-mode measurement probabilities. The claim is definitional/theoretical (no empirical sample size).
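For concreteness, the standard boson-sampling output probability that such a model distribution takes (the Aaronson–Arkhipov form; the paper's exact parameterization may differ):

$$
p_\theta(S) \;=\; \frac{\bigl|\operatorname{Per}\bigl(U_\theta[S \mid T]\bigr)\bigr|^2}{s_1!\cdots s_m!\; t_1!\cdots t_m!},
$$

where $T=(t_1,\dots,t_m)$ is the input photon pattern, $S=(s_1,\dots,s_m)$ the measured output pattern, and $U_\theta[S \mid T]$ the submatrix of the circuit unitary formed by taking column $j$ of $U_\theta$ $t_j$ times and row $i$ $s_i$ times. For collision-free patterns the factorial terms are all 1.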
**Claim:** Because this is a conceptual/systems-architecture paper, it does not present new empirical performance benchmarks.
**Evidence:** Explicit statement in the paper's Data & Methods section that no new empirical benchmarks are presented.

**Claim:** DPS was empirically evaluated across diverse reasoning domains (mathematical reasoning, planning, and visual geometry) to test generality.
**Evidence:** The paper reports experiments on these three task categories, listed as the evaluated tasks in the methods/experiments section.

**Claim:** DPS uses the inferred per-prompt state distributions as a predictive prior to select the prompts estimated to be most informative, avoiding exhaustive candidate rollouts for filtering.
**Evidence:** Method and selection mechanism described in the paper: predictive-prior ranking/filtering replaces rollout-heavy candidate evaluation; empirical comparisons are reported.

**Claim:** Dynamics-Predictive Sampling (DPS) models each prompt's "extent of solving" under the current policy as a latent state in a dynamical system (a hidden Markov model) and performs online Bayesian inference on historical rollout reward signals to estimate that state.
**Evidence:** Methodological description in the paper: DPS uses an HMM representation of per-prompt solving progress and applies online Bayesian updates using past rollout rewards. (No sample size applies to this modeling claim.)
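A schematic of the kind of update this describes, under assumed parameter values that are not the paper's: a small discrete state space of candidate solve probabilities, a sticky transition matrix, and a Bernoulli likelihood on each binary rollout reward.

```python
import numpy as np

states = np.array([0.05, 0.25, 0.5, 0.75, 0.95])   # candidate P(solve) levels
T = 0.9 * np.eye(5) + 0.1 / 5                       # sticky transition kernel
belief = np.full(5, 1 / 5)                          # uniform prior per prompt

def update(belief, reward):
    """One online Bayesian step given a binary rollout reward."""
    pred = T.T @ belief                             # predict: state may drift
    like = states if reward == 1 else 1 - states    # Bernoulli likelihood
    post = pred * like
    return post / post.sum()

for r in [0, 0, 1, 0, 1, 1]:                        # one prompt's reward history
    belief = update(belief, r)

expected_solve = belief @ states                    # feeds the selection prior
print(belief.round(3), expected_solve)
```

Under this sketch, a selector could rank prompts by the resulting posterior, e.g., favoring those with expected solve rates near 0.5 or with high posterior uncertainty; the paper's exact ranking criterion is not reproduced here.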
**Claim:** Measuring AI's economic impact requires new metrics that account for decision-value uplift, reduced tail-risk exposure, and dynamic gains from continuous learning; causal identification will require experiments or staggered rollouts.
**Evidence:** Methodological recommendation backed by a conceptual discussion of measurement challenges; no implementation of such measurement approaches is reported in the paper.

**Claim:** Performance and evaluation should be measured using forecast accuracy, decision lift/value added, latency, and false positive/negative rates.
**Evidence:** Paper-prescribed evaluation metrics, presented as recommended practice rather than derived from empirical testing within the paper.
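These metrics are straightforward to operationalize; the toy sketch below computes forecast accuracy, false positive/negative rates, and one possible decision-lift figure against a naive baseline policy. The data and per-decision payoff values are invented assumptions, not the paper's.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])   # realized outcomes
model  = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])   # model-driven decisions
base   = np.ones_like(y_true)                        # naive "always act" baseline

accuracy = (model == y_true).mean()
fp = ((model == 1) & (y_true == 0)).sum()
fn = ((model == 0) & (y_true == 1)).sum()
fpr = fp / (y_true == 0).sum()
fnr = fn / (y_true == 1).sum()

def value(pred):
    """Toy payoff: +10 per correct action, -5 per false alarm."""
    return 10 * ((pred == 1) & (y_true == 1)).sum() - 5 * ((pred == 1) & (y_true == 0)).sum()

decision_lift = value(model) - value(base)           # value added vs. baseline
print(accuracy, fpr, fnr, decision_lift)
```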