Evidence (13870 claims)
| Topic | Claims |
|---|---|
| Adoption | 8467 |
| Productivity | 7558 |
| Governance | 6805 |
| Human-AI Collaboration | 6363 |
| Org Design | 4132 |
| Innovation | 4065 |
| Labor Markets | 3526 |
| Skills & Training | 2945 |
| Inequality | 2066 |
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 749 | 196 | 98 | 892 | 1984 |
| Governance & Regulation | 817 | 394 | 188 | 121 | 1544 |
| Organizational Efficiency | 771 | 189 | 124 | 83 | 1177 |
| Technology Adoption Rate | 627 | 233 | 123 | 96 | 1088 |
| Research Productivity | 411 | 123 | 56 | 332 | 933 |
| Output Quality | 467 | 178 | 59 | 47 | 751 |
| Decision Quality | 320 | 174 | 75 | 42 | 618 |
| Firm Productivity | 435 | 55 | 88 | 20 | 604 |
| AI Safety & Ethics | 214 | 276 | 65 | 33 | 593 |
| Market Structure | 178 | 167 | 122 | 24 | 496 |
| Task Allocation | 207 | 64 | 71 | 32 | 379 |
| Skill Acquisition | 165 | 59 | 60 | 17 | 301 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 52 | 107 | 13 | 279 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 116 | 63 | 42 | 11 | 232 |
| Firm Revenue | 150 | 48 | 26 | 3 | 227 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Task Completion Time | 169 | 29 | 8 | 12 | 219 |
| Worker Satisfaction | 89 | 63 | 20 | 12 | 184 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 76 | 68 | 14 | 5 | 163 |
| Training Effectiveness | 93 | 21 | 13 | 19 | 148 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Automation Exposure | 51 | 54 | 22 | 12 | 142 |
| Team Performance | 86 | 17 | 27 | 9 | 140 |
| Developer Productivity | 94 | 17 | 14 | 6 | 132 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 51 | 7 | 8 | 3 | 69 |
| Creative Output | 31 | 17 | 7 | 3 | 59 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 17 | 17 | — | 51 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
The paper models interactions among AI capital, physical capital, and labor using a Lotka–Volterra (predator–prey type) system adapted to include self-limiting (saturation) terms.
Model specification described in Methods: deterministic Lotka–Volterra system with added self-limitation terms for three stocks (AI capital, physical capital, labor).
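As a rough illustration of the model class named above, the sketch below integrates a three-stock Lotka–Volterra system with self-limitation terms using a fixed-step Euler scheme. The growth rates `r`, the interaction matrix `a`, the initial stocks, and the step size are hypothetical placeholders chosen only so the example runs; the paper's actual equations and parameter values are not reproduced here.

```python
def simulate(x0, r, a, dt=0.01, steps=20000):
    """Euler integration of dx_i/dt = x_i * (r_i + sum_j a[i][j] * x_j).

    Negative diagonal entries a[i][i] play the role of the
    self-limiting (saturation) terms.
    """
    x = list(x0)
    path = [tuple(x)]
    n = len(x)
    for _ in range(steps):
        dx = [x[i] * (r[i] + sum(a[i][j] * x[j] for j in range(n)))
              for i in range(n)]
        # Stocks cannot go negative, so clamp at zero.
        x = [max(x[i] + dt * dx[i], 0.0) for i in range(n)]
        path.append(tuple(x))
    return path


# Hypothetical parameters (NOT taken from the paper).
# Stock order: AI capital, physical capital, labor.
r = [0.5, 0.3, 0.2]
a = [
    [-0.5, 0.1, 0.1],   # AI capital: self-limited, helped by the other stocks
    [0.1, -0.4, 0.1],   # physical capital
    [-0.1, 0.1, -0.3],  # labor: reduced by AI capital in this toy setup
]

path = simulate([0.1, 1.0, 1.0], r, a)
final = path[-1]
print("stocks at end of run:", [round(v, 3) for v in final])
```

With these placeholder values the self-limitation terms dominate the cross-effects, so the stocks settle toward a positive steady state instead of cycling as in the classic two-species predator–prey case.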
Key methodological details are missing or not reported: training/test split, cross-validation scheme, hyperparameter tuning, treatment of confounders/endogeneity, exact definition/measurement of the outcome, and whether results were validated out-of-sample or in field trials.
Summary lists these specific missing methodological elements as not provided in the paper.
The paper does not report (or the summary omits) the sample size and full provenance of the Indian farm dataset.
Summary explicitly states that sample size and full provenance of the Indian dataset are not reported.
Data sources used are FAO and Kaggle datasets for global context and a proprietary/field Indian farm dataset for modeling.
Paper cites FAO and Kaggle for global context and uses a proprietary Indian farm-level dataset for the core modeling work (summary notes that full provenance is not reported).
The chosen ML technique is gradient boosting regression.
Explicit statement in the methods section that gradient-boosting regression was used for modeling.
Features used in modeling include pesticide/fertilizer use, farm size, crop type, harvest date, and climatic variables.
Listed predictor variables in the paper's modeling/methods section.
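To make the technique concrete, here is a minimal from-scratch sketch of gradient boosting regression with depth-1 trees (stumps) and squared-error loss. It is not the paper's pipeline. The three numeric features (fertilizer use, farm size, rainfall), the outcome values, and the hyperparameters are invented toy inputs. Crop type and harvest date are left out because they would need an encoding step the paper does not describe.

```python
def fit_stump(X, residuals):
    """Best single-feature threshold split under squared error."""
    overall = sum(residuals) / len(residuals)
    # Fallback when no split is possible: predict the overall mean.
    best = (float("inf"), 0, float("inf"), overall, overall)
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if sse < best[0]:
                best = (sse, f, thr, lm, rm)
    return best[1:]  # (feature index, threshold, left value, right value)


def predict_stump(stump, row):
    f, thr, left_value, right_value = stump
    return left_value if row[f] <= thr else right_value


def fit_gbr(X, y, n_rounds=50, learning_rate=0.1):
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def mse(model, X, y):
    return sum((predict(model, row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)


# Toy data (invented): [fertilizer kg/ha, farm size ha, rainfall mm] -> outcome
X = [[50, 1.0, 600], [80, 2.0, 700], [120, 1.5, 800],
     [60, 3.0, 650], [100, 2.5, 900], [140, 4.0, 750]]
y = [2.0, 2.8, 3.9, 2.3, 3.6, 4.1]

model = fit_gbr(X, y)
baseline = (sum(y) / len(y), 0.1, [])  # mean-only predictor, no stumps
print("baseline MSE:", mse(baseline, X, y), "boosted MSE:", mse(model, X, y))
```

The comparison is in-sample only. Without a held-out split it says nothing about out-of-sample accuracy, one of the details the summary flags as unreported.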
Instrumental-variable (IV) estimation is used to address endogeneity of AI adoption and to identify causal effects on employment and wages.
Paper states IV identification strategy applied to the 38-country panel; robustness checks and alternative specifications reported (paper refers to instrument details in full text).
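The logic of IV estimation can be shown with a small simulation. The sketch below assumes a single instrument and a single endogenous regressor. The simulated variables (`z` as instrument, `x` as adoption, `y` as the outcome, `u` as an unobserved confounder) and the true effect of 1.0 are hypothetical. The paper's actual instrument and specification are not stated in the summary.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def ols_slope(x, y):
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Two-stage least squares with one instrument.

    Stage 1: regress x on z. Stage 2: regress y on the fitted x.
    The intercepts do not affect the slope, so they are omitted.
    """
    first_stage = ols_slope(z, x)
    x_hat = [first_stage * zi for zi in z]
    return ols_slope(x_hat, y)


# Simulated data (hypothetical): u drives both x and y, so OLS is biased.
rng = random.Random(42)
true_effect = 1.0
z, x, y = [], [], []
for _ in range(5000):
    zi = rng.gauss(0, 1)   # instrument: affects y only through x
    ui = rng.gauss(0, 1)   # unobserved confounder
    xi = zi + ui + rng.gauss(0, 1)
    yi = true_effect * xi + 2.0 * ui + rng.gauss(0, 1)
    z.append(zi)
    x.append(xi)
    y.append(yi)

ols_estimate = ols_slope(x, y)
iv_estimate = iv_slope(z, x, y)
print("OLS:", round(ols_estimate, 3), "IV:", round(iv_estimate, 3))
```

In this setup the OLS slope is pushed upward by the confounder, while the IV slope should land close to the true effect. The result depends on the instrument being unrelated to `u`, which is true here by construction but has to be argued in any real application.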
The AI Adoption Index is constructed as a composite measure combining enterprise investment in AI, AI-related patent filings, and workforce/firm surveys on AI use across 38 OECD countries (2019–2025).
Paper's methodological description of the index construction; data sources enumerated as investment, patenting, and survey measures over the panel period.
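One common way to build a composite of this kind is to standardize each component across countries and take a weighted average. The sketch below shows that approach as an assumption, since the summary does not say how the paper normalizes or weights its components. The component values, the four placeholder countries, and the equal weights are all hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def composite_index(components, weights=None):
    """Weighted average of z-scored components; equal weights by default."""
    names = list(components)
    if weights is None:
        weights = {name: 1 / len(names) for name in names}
    standardized = {name: zscores(components[name]) for name in names}
    n_units = len(components[names[0]])
    return [sum(weights[name] * standardized[name][i] for name in names)
            for i in range(n_units)]


# Hypothetical component values for four placeholder countries.
countries = ["A", "B", "C", "D"]
components = {
    "investment": [1.0, 2.0, 3.0, 4.0],   # e.g. enterprise AI investment
    "patents": [40, 10, 30, 20],          # e.g. AI-related patent filings
    "survey_use": [0.2, 0.1, 0.4, 0.3],   # e.g. share of firms reporting AI use
}

index = composite_index(components)
for country, score in zip(countries, index):
    print(country, round(score, 3))
```

Because every component is z-scored, the index is unitless and averages to zero across countries. Other weighting choices, such as principal-component weights, would give different rankings.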
The paper is entirely theoretical/analytical and does not report an empirical dataset.
Paper methodology section and abstract state primary tool is an analytical economic model; no empirical data or sample sizes are reported.
The same formal framework can be interpreted as a firm-level model where human skill investment maps onto AI/chatbot investment decisions.
Paper provides an alternative interpretation and formally maps agent skill-investment choices into an analogous firm R&D/AI-capital decision problem within the same mathematical framework.
There is a need for standardized metrics and measurement protocols for public-sector productivity and non-market outcomes (service quality, processing time, cost per transaction, transparency, trust).
Methodological critique within the review pointing to heterogeneity of outcome measures across studies and calling for standardized metrics; based on synthesis of reviewed literature.
Much of the literature on public-sector digital/AI interventions is descriptive or case-based; causal, quantitative evidence on net productivity effects is limited and context-dependent.
Methodological assessment within the review noting heterogeneous study designs, reliance on secondary sources, and a lack of randomized or quasi-experimental studies; the review explicitly states this limitation.
Research and monitoring priorities for economists include task-level analyses of substitutability/complementarity, modeling adoption as a function of regulatory costs and reimbursement incentives, and evaluating long-run welfare and distributional effects.
Explicit research recommendations stated in the narrative review, based on gaps identified in the literature and evolving empirical questions.
Policymakers and payers should consider liability reform, reimbursement models that reward safe human–AI collaboration, funding for independent clinical validation, and measures to prevent market concentration.
Policy recommendations and implications derived from the narrative review's synthesis of regulatory, economic, and implementation challenges.
Research priorities include causal studies on AI’s impacts on SME productivity, employment and inequality in LMICs; cost–benefit analyses of financing and policy interventions; evaluation of data governance models; and development of metrics/monitoring systems for inclusive adoption.
Authors' identification of evidence gaps from the structured literature review highlighting areas with insufficient causal or evaluative research.
Empirical causal evidence on long-run welfare, distributional outcomes, and labor effects of AI in LMIC SMEs remains thin.
Gap identified through the structured review: few causal studies (e.g., RCTs, natural experiments) addressing long-run effects in LMIC SME contexts.
Heterogeneity in SME types and sectors limits the generalizability of findings about AI adoption and impacts.
Authors' methodological limitation noted in the review: the evidence base spans diverse firm sizes, sectors, and contexts, constraining broad generalization.
Theoretical framing integrates Resource-Based View (RBV), Dynamic Capabilities (DC), Technology–Organization–Environment (TOE), and Diffusion of Innovation (DOI) to explain how firm resources, learning capacity, organizational and environmental factors shape AI adoption.
Conceptual synthesis performed as part of the literature review; integration based on existing theoretical literature rather than primary empirical testing.
The systematic review followed PRISMA protocol and analyzed a corpus of 103 items (peer‑reviewed articles and institutional reports) published 2010–2024.
Explicit methodological statement in the paper describing PRISMA use and corpus size/timeframe.
Further longitudinal cost-benefit studies, scalability benchmarks, and cross-domain trials are needed to determine when on-prem RAG is the dominant economic choice.
Paper's research & evaluation recommendations calling for additional longitudinal and cross-domain empirical work; presented as a recommendation rather than an empirical finding.
Human-in-the-loop judgments were central to the paper's relevance/usefulness claims rather than relying solely on synthetic benchmarks.
Methods description explicitly states human evaluation by domain experts was used alongside quantitative benchmarks.
Research gaps remain: quantifying welfare gains from specific AI applications in extraction (productivity, safety, emissions), evaluating cost-effectiveness of policy bundles, and estimating dynamic returns to data ecosystems and human capital.
Identification of gaps from literature and data coverage in the comparative analysis; calls for future empirical and modelling work.
The study is limited to a single-country case, so contextual factors (regulatory regime, infrastructure capacity, procurement practices) may limit generalizability. It also emphasizes institutional and ethical analysis rather than quantitative measurement of economic impacts.
Explicit limitations reported in the paper summarizing scope and emphasis.
Methods used include qualitative interviews with researchers and administrators, observation/documentation of tool use, mapping of data flows and third‑party dependencies, and normative/legal analysis contrasting local practices with GDPR principles.
Methods section of the paper as reported in the provided summary.
The study's empirical basis is a qualitative case study centered on environmental science research in Chile that adopts the GDPR as an organizing normative framework.
Paper description of study scope and normative framing (methods and focus described in Data & Methods).
There is a need for validated administrative and firm-level data on AI adoption, workplace monitoring, and worker outcomes, and for evaluation of policy interventions (mandated impact assessments, transparency requirements, worker representation rules) using randomized or quasi-experimental designs where feasible.
Research and measurement priorities set out in the commentary based on identified gaps; prescriptive recommendation rather than evidence-based finding.
The paper is a policy and legal commentary/synthesis and not an empirical causal study; it does not provide microdata on employment or wage effects but identifies plausible channels and institutional dynamics.
Author-stated methodology and limitations section describing type of study and data sources; explicitly reports lack of primary empirical data.
The federal U.S. approach to AI governance combines export controls for key AI hardware/software with a relatively permissive domestic regulatory stance that relies on executive guidance, voluntary standards, and sector-specific measures rather than comprehensive federal worker protections.
Comparative policy and legal review of federal-level instruments (export control lists, executive orders, agency guidance, proposed/final rules) described in the commentary; no primary empirical data or sample size.
The report has limited primary quantitative impact evaluation and relies on policy texts and secondary sources rather than large-scale empirical measurement of AI’s economic effects.
Explicit limitations section in the report describing methods and data constraints.
The paper's empirical and policy conclusions are limited by its jurisdictional sample size (eleven) and reliance on available empirical/operational data, which the authors note is increasingly patchy due to declining transparency.
Methods and limitations sections explicitly noting sample size (eleven jurisdictions) and data availability constraints.
Methodological needs for AI-era labor models include dynamic skill taxonomies, high-frequency labor data (job postings, firm-level automation measures), and uncertainty quantification.
Paper's Research & policy recommendations and Methodological needs section (explicit recommendations).
The scenario analysis framework varies economic growth, automation rates, policy interventions, and investment to produce probabilistic demand–supply gaps.
Methods description of scenario analysis components and the variables varied in scenario experiments (explicit in Data & Methods).
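A scenario framework of this kind can be approximated with a simple Monte Carlo loop. The sketch below draws the four scenario variables at random and reports quantiles of the resulting demand–supply gap. The distributions, the functional forms for demand and supply, the baseline levels, and the five-year horizon are all illustrative assumptions, not the paper's model.

```python
import random


def simulate_gap(rng, base_demand=1000.0, base_supply=950.0, years=5):
    """One scenario draw; returns demand minus supply at the horizon."""
    growth = rng.gauss(0.03, 0.01)       # annual economic growth
    automation = rng.uniform(0.0, 0.04)  # annual demand reduction from automation
    policy = rng.uniform(0.0, 0.02)      # annual supply boost from policy
    investment = rng.uniform(0.0, 0.02)  # annual supply boost from investment
    demand = base_demand * (1 + growth - automation) ** years
    supply = base_supply * (1 + policy + investment) ** years
    return demand - supply


def gap_quantiles(n_draws=10000, seed=0):
    """Empirical 5th, 50th and 95th percentiles of the simulated gap."""
    rng = random.Random(seed)
    gaps = sorted(simulate_gap(rng) for _ in range(n_draws))

    def pick(q):
        return gaps[int(q * (n_draws - 1))]

    return {"p05": pick(0.05), "p50": pick(0.50), "p95": pick(0.95)}


quantiles = gap_quantiles()
print({name: round(value, 1) for name, value in quantiles.items()})
```

Reporting a range of quantiles instead of a single point estimate is what makes the gap "probabilistic": each quantile corresponds to a different combination of growth, automation, policy, and investment outcomes.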
Intended users of the Hub are organizations, educational institutions, and policymakers, who can use it to inform reskilling/education strategies, regional economic policy, and labor-market interventions.
Explicit statement of target users and use cases in the Key Points / Implications sections.
The system produces interpretable outputs for stakeholders: demand–supply trend analysis, geospatial hotspot maps, skill-gap radar charts, and policy simulation dashboards.
Paper's description of outputs and interactive visual analytics (listed output modalities).
The core modeling approach uses probabilistic growth modeling combined with intelligent skill synthesis to estimate future workforce requirements under alternative economic and policy scenarios.
Methods section describing the modeling components: probabilistic growth modeling and intelligent skill synthesis (architectural description).
The platform integrates multiple indicators such as regional economic growth projections, automation velocity, policy intervention strength, investment intensity, and market volatility (macro- and micro-level indicators).
List of input indicators given in the Data & Methods section of the paper (explicit enumeration of macro and micro variables).
Significant empirical gaps remain on long-term impacts (wage trajectories, employment composition, firm-level returns), verification/remediation cost quantification, and public-good risks of insecure code proliferation.
Cross-study synthesis explicitly identifying missing longitudinal and firm-level empirical research in the reviewed literature.
The paper's conclusions are limited by reliance on secondary sources, heterogeneous cross‑study comparisons, limited causal identification of long‑run macro effects, and measurement challenges for AI‑driven intangible capital.
Authors' stated limitations section summarizing the nature of evidence used (qualitative literature review, secondary macro indicators, sectoral examples); this is an explicit self‑reported methodological limitation rather than an external empirical finding.
Methodology used in the paper is a narrative review relying on secondary sources (literature, legal cases, policy reports, empirical perception studies) and conceptual synthesis; no new primary data were collected.
Paper's Data & Methods section explicitly states narrative review and secondary-data analysis.
Important empirical research gaps remain (consumer willingness-to-pay for authenticated vs. synthetic content, labor-displacement elasticities, market concentration dynamics, and cost–benefit evaluations of regulatory options).
Explicit statement of limitations and research needs in the paper, based on the authors' narrative review and absence of primary empirical studies within the paper.
The paper's methodology is a secondary-data, narrative (qualitative) literature review; it contains no original empirical data or primary quantitative analysis.
Explicit methodological statement in the paper describing secondary data analysis and narrative synthesis; absence of primary datasets or statistical analyses.
This paper is conceptual/theoretical and does not conduct primary empirical data collection.
Explicit methodological statement in the paper's Data & Methods section.
More granular firm- and household-level panel data are needed to empirically validate the dissertation's theoretical predictions about nonlinear effects and causal channels.
Author recommendation based on limitations noted in Essay 3 (no primary empirical estimation) and the conditional/simulation-based nature of other essays; this is a methodological claim about future research needs rather than an empirical result.
Further causal, experimental research (randomized deployments) is needed to precisely quantify net productivity and labor reallocation effects of AI agents.
Paper's stated research priorities and explicit acknowledgement of limitations from observational design; no randomized trials reported in the study.
Quality-adjusted productivity is hard to measure: errors and downstream effects may reduce the net benefits of agent automation and are under-measured in the study.
Authors' noted limitations and concerns about quality-adjusted productivity measurement (error rates, downstream externalities) based on observational deployment experience; no formal measurement of downstream costs reported.
Small-scale, domain-specific deployments of Alfred AI limit external validity to other industries or larger firms.
Deployment context described as small-scale e-commerce; authors note generalizability limitations stemming from domain- and scale-specific nature of the experiments.
Because the study is observational and non-randomized, causal claims about the effect of AI agents on productivity and labor are limited.
Study design explicitly described as applied experimentation and observational deployments (non-randomized); potential confounding and selection biases acknowledged by the authors.
Researchers and firms should measure generation throughput, verification throughput, defect accumulation rates, mean time to detection/fix, costs per incident, and the marginal value of additional verification capacity to evaluate the framework's claims.
Prescriptive measurement priorities listed in the paper as recommendations for empirical validation.
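Several of the listed quantities can be computed from basic operational records. The sketch below is one possible way to do so. The record layout (hours at which a defect was introduced, detected, and fixed), the counts, and the definition of backlog growth as generation throughput minus verification throughput are assumptions for illustration, not definitions taken from the paper.

```python
from statistics import mean


def throughput(count, hours):
    """Items handled per hour."""
    return count / hours


def backlog_growth(generated_per_hour, verified_per_hour):
    """Unverified items accumulating per hour (never negative)."""
    return max(generated_per_hour - verified_per_hour, 0.0)


def mean_time_to_detection(incidents):
    return mean(i["detected"] - i["introduced"] for i in incidents)


def mean_time_to_fix(incidents):
    return mean(i["fixed"] - i["detected"] for i in incidents)


# Hypothetical records; times are hours since the start of observation.
incidents = [
    {"introduced": 0, "detected": 5, "fixed": 9},
    {"introduced": 2, "detected": 4, "fixed": 10},
    {"introduced": 3, "detected": 12, "fixed": 13},
]

generation_rate = throughput(120, 8)    # 120 changes generated in 8 hours
verification_rate = throughput(80, 8)   # 80 changes verified in 8 hours
growth = backlog_growth(generation_rate, verification_rate)
mttd = mean_time_to_detection(incidents)
mttf = mean_time_to_fix(incidents)
print(generation_rate, verification_rate, growth, mttd, mttf)
```

Costs per incident and the marginal value of extra verification capacity need cost data that these toy records do not contain, so they are left out.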
The abstract reports no empirical tests, simulations, or field experiments; empirical validation of the framework is left for future work.
Direct observation of the paper's abstract and methods description indicating lack of empirical validation.
The paper's contribution is primarily conceptual/architectural rather than empirical.
Explicit statement in the paper and absence of reported empirical tests, simulations, or field experiments in the abstract and methods section.