Evidence (1902 claims)
| Topic | Claims |
|---|---|
| Adoption | 5126 |
| Productivity | 4409 |
| Governance | 4049 |
| Human-AI Collaboration | 2954 |
| Labor Markets | 2432 |
| Org Design | 2273 |
| Innovation | 2215 |
| Skills & Training | 1902 |
| Inequality | 1286 |
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 369 | 105 | 58 | 432 | 972 |
| Governance & Regulation | 365 | 171 | 113 | 54 | 713 |
| Research Productivity | 229 | 95 | 33 | 294 | 655 |
| Organizational Efficiency | 354 | 82 | 58 | 34 | 531 |
| Technology Adoption Rate | 277 | 115 | 63 | 27 | 486 |
| Firm Productivity | 273 | 33 | 68 | 10 | 389 |
| AI Safety & Ethics | 112 | 177 | 43 | 24 | 358 |
| Output Quality | 228 | 61 | 23 | 25 | 337 |
| Market Structure | 105 | 118 | 81 | 14 | 323 |
| Decision Quality | 154 | 68 | 33 | 17 | 275 |
| Employment Level | 68 | 32 | 74 | 8 | 184 |
| Fiscal & Macroeconomic | 74 | 52 | 32 | 21 | 183 |
| Skill Acquisition | 85 | 31 | 38 | 9 | 163 |
| Firm Revenue | 96 | 30 | 22 | — | 148 |
| Innovation Output | 100 | 11 | 20 | 11 | 143 |
| Consumer Welfare | 66 | 29 | 35 | 7 | 137 |
| Regulatory Compliance | 51 | 61 | 13 | 3 | 128 |
| Inequality Measures | 24 | 66 | 31 | 4 | 125 |
| Task Allocation | 64 | 6 | 28 | 6 | 104 |
| Error Rate | 42 | 47 | 6 | — | 95 |
| Training Effectiveness | 55 | 12 | 10 | 16 | 93 |
| Worker Satisfaction | 42 | 32 | 11 | 6 | 91 |
| Task Completion Time | 71 | 5 | 3 | 1 | 80 |
| Wages & Compensation | 38 | 13 | 19 | 4 | 74 |
| Team Performance | 41 | 8 | 15 | 7 | 72 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 17 | 15 | 9 | 5 | 46 |
| Job Displacement | 5 | 28 | 12 | — | 45 |
| Social Protection | 18 | 8 | 6 | 1 | 33 |
| Developer Productivity | 25 | 1 | 2 | 1 | 29 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Skill Obsolescence | 3 | 18 | 2 | — | 23 |
| Labor Share of Income | 7 | 4 | 9 | — | 20 |
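As a reading aid, the share of positive findings per outcome can be computed directly from the matrix. The sketch below is illustrative only: it copies four rows from the table, uses hypothetical helper names (`positive_share`, `unclassified`), and counts a dash cell as zero, which is an assumption because the page does not say whether a dash means zero or not reported. It also reports how many claims fall outside the four listed directions, since in some rows the four columns do not add up to the stated total.

```python
# Four rows copied from the evidence matrix:
# (positive, negative, mixed, null, total). None marks a dash cell.
ROWS = {
    "Other": (369, 105, 58, 432, 972),
    "Firm Productivity": (273, 33, 68, 10, 389),
    "Developer Productivity": (25, 1, 2, 1, 29),
    "Job Displacement": (5, 28, 12, None, 45),
}


def positive_share(row):
    """Fraction of all claims in the row that report a positive finding."""
    positive, total = row[0], row[-1]
    return positive / total


def unclassified(row):
    """Total minus the four listed directions (assumption: dash cells count as zero)."""
    *directions, total = row
    return total - sum(d or 0 for d in directions)


for name, row in ROWS.items():
    print(f"{name}: {positive_share(row):.1%} positive, "
          f"{unclassified(row)} claims outside the four columns")
```

A nonzero second figure (for example in the "Other" row) suggests the total includes claims whose direction is not shown in the table; the page itself does not explain the difference.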
Skills & Training
Instrumental-variable (IV) estimation is used to address endogeneity of AI adoption and to identify causal effects on employment and wages.
The paper states that an IV identification strategy is applied to the 38-country panel and reports robustness checks and alternative specifications; it refers readers to the full text for instrument details.
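To make the logic of IV estimation concrete, here is a minimal sketch on simulated data. It does not reproduce the paper's model: the instrument, the coefficients, and all names (`simulate`, `iv_slope`, `ols_slope`) are hypothetical, and the entry above does not describe the actual instrument. With one endogenous regressor and one instrument, two-stage least squares reduces to the ratio cov(z, y) / cov(z, x), which is what the code computes.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Naive regression slope of y on x (biased when x is endogenous)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Single-instrument IV estimate: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


def simulate(n=5000, true_effect=-0.5, seed=0):
    """Simulated data in which an unobserved factor drives both x and y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                                 # unobserved confounder
        zi = rng.gauss(0, 1)                                # instrument, independent of u
        xi = 0.8 * zi + u + rng.gauss(0, 1)                 # "adoption" (endogenous)
        yi = true_effect * xi + 1.5 * u + rng.gauss(0, 1)   # outcome
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
ols_estimate = ols_slope(x, y)
iv_estimate = iv_slope(z, x, y)
print(f"true effect: -0.50  OLS: {ols_estimate:.2f}  IV: {iv_estimate:.2f}")
```

In this simulation the confounder pushes the OLS slope away from the true effect, while the IV estimate stays close to it. That only holds because the simulated instrument is, by construction, unrelated to the confounder; in real data that exclusion restriction has to be argued, not assumed.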
The AI Adoption Index is constructed as a composite measure combining enterprise investment in AI, AI-related patent filings, and workforce/firm surveys on AI use across 38 OECD countries (2019–2025).
Paper's methodological description of the index construction; data sources enumerated as investment, patenting, and survey measures over the panel period.
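One common way to build a composite index of this kind is to standardize each component and take a weighted average. The sketch below assumes z-score standardization and equal weights; the entry above does not state the paper's normalization or weighting, so both are assumptions, and the three-country figures are invented toy values rather than data from the panel.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def composite_index(components, weights=None):
    """Weighted average of standardized components, one value per country.

    components: dict mapping component name -> list of raw values.
    weights: dict mapping component name -> weight (assumption: equal if omitted).
    """
    names = list(components)
    if weights is None:
        weights = {name: 1 / len(names) for name in names}
    standardized = {name: zscores(components[name]) for name in names}
    n_units = len(components[names[0]])
    return [
        sum(weights[name] * standardized[name][i] for name in names)
        for i in range(n_units)
    ]


# Toy values for three hypothetical countries (not from the paper).
toy_components = {
    "ai_investment": [1.0, 2.0, 3.0],
    "ai_patents": [10.0, 20.0, 60.0],
    "survey_ai_use": [0.2, 0.3, 0.5],
}
index = composite_index(toy_components)
print([round(v, 2) for v in index])
```

Standardizing first keeps a component measured in large units (such as patent counts) from dominating the index; the choice of weights remains a judgment call that a real index would need to document.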
There is a need for standardized metrics and measurement protocols for public-sector productivity and non-market outcomes (service quality, processing time, cost per transaction, transparency, trust).
Methodological critique within the review pointing to heterogeneity of outcome measures across studies and calling for standardized metrics; based on synthesis of reviewed literature.
Much of the literature on public-sector digital/AI interventions is descriptive or case-based; causal, quantitative evidence on net productivity effects is limited and context-dependent.
Methodological assessment within the review noting heterogeneous study designs, reliance on secondary sources, and a lack of randomized or quasi-experimental studies; the review explicitly states this limitation.
Research priorities include causal studies on AI’s impacts on SME productivity, employment and inequality in LMICs; cost–benefit analyses of financing and policy interventions; evaluation of data governance models; and development of metrics/monitoring systems for inclusive adoption.
Authors' identification of evidence gaps from the structured literature review highlighting areas with insufficient causal or evaluative research.
Empirical causal evidence on long-run welfare, distributional outcomes, and labor effects of AI in LMIC SMEs remains thin.
Gap identified through the structured review: few causal studies (e.g., RCTs, natural experiments) addressing long-run effects in LMIC SME contexts.
Heterogeneity in SME types and sectors limits the generalizability of findings about AI adoption and impacts.
Authors' methodological limitation noted in the review: the evidence base spans diverse firm sizes, sectors, and contexts, constraining broad generalization.
Theoretical framing integrates Resource-Based View (RBV), Dynamic Capabilities (DC), Technology–Organization–Environment (TOE), and Diffusion of Innovation (DOI) to explain how firm resources, learning capacity, organizational and environmental factors shape AI adoption.
Conceptual synthesis performed as part of the literature review; integration based on existing theoretical literature rather than primary empirical testing.
The systematic review followed PRISMA protocol and analyzed a corpus of 103 items (peer‑reviewed articles and institutional reports) published 2010–2024.
Explicit methodological statement in the paper describing PRISMA use and corpus size/timeframe.
Methodological needs for AI-era labor models include dynamic skill taxonomies, high-frequency labor data (job postings, firm-level automation measures), and uncertainty quantification.
Paper's Research & policy recommendations and Methodological needs section (explicit recommendations).
The scenario analysis framework varies economic growth, automation rates, policy interventions, and investment to produce probabilistic demand–supply gaps.
Methods description of scenario analysis components and the variables varied in scenario experiments (explicit in Data & Methods).
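A scenario analysis of this kind can be sketched as a small Monte Carlo simulation: draw the scenario variables, compute a demand and a supply figure for each draw, and summarize the resulting distribution of gaps. Everything specific below is an assumption made for illustration, including the functional form, the parameter ranges, the baseline figures, and the function names; the entry above only lists which variables are varied.

```python
import random
from statistics import quantiles


def simulate_gap(rng, base_demand=1000.0, base_supply=950.0):
    """One scenario draw of the demand minus supply gap (hypothetical model)."""
    growth = rng.gauss(0.03, 0.01)         # economic growth rate
    automation = rng.uniform(0.00, 0.05)   # share of demand automated away
    policy = rng.uniform(0.00, 0.02)       # supply boost from policy intervention
    investment = rng.uniform(0.00, 0.03)   # supply boost from training investment
    demand = base_demand * (1 + growth) * (1 - automation)
    supply = base_supply * (1 + policy + investment)
    return demand - supply


def gap_distribution(n_draws=10_000, seed=0):
    """Repeat the scenario draw to obtain a distribution of gaps."""
    rng = random.Random(seed)
    return [simulate_gap(rng) for _ in range(n_draws)]


gaps = gap_distribution()
cuts = quantiles(gaps, n=20)   # 19 cut points at 5%, 10%, ..., 95%
p5, p50, p95 = cuts[0], cuts[9], cuts[18]
print(f"gap (workers): 5% {p5:.0f}, median {p50:.0f}, 95% {p95:.0f}")
```

Reporting a band (here the 5th to 95th percentile) rather than a single number is what makes the output probabilistic in the sense used above.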
Intended users of the Hub include organizations, educational institutions, and policymakers to inform reskilling/education strategies, regional economic policy, and labor-market interventions.
Explicit statement of target users and use cases in the Key Points / Implications sections.
The system produces interpretable outputs for stakeholders: demand–supply trend analysis, geospatial hotspot maps, skill-gap radar charts, and policy simulation dashboards.
Paper's description of outputs and interactive visual analytics (listed output modalities).
The core modeling approach uses probabilistic growth modeling combined with intelligent skill synthesis to estimate future workforce requirements under alternative economic and policy scenarios.
Methods section describing the modeling components: probabilistic growth modeling and intelligent skill synthesis (architectural description).
The platform integrates multiple indicators such as regional economic growth projections, automation velocity, policy intervention strength, investment intensity, and market volatility (macro- and micro-level indicators).
List of input indicators given in the Data & Methods section of the paper (explicit enumeration of macro and micro variables).
Significant empirical gaps remain on long-term impacts (wage trajectories, employment composition, firm-level returns), verification/remediation cost quantification, and public-good risks of insecure code proliferation.
Cross-study synthesis explicitly identifying missing longitudinal and firm-level empirical research in the reviewed literature.
The paper's conclusions are limited by reliance on secondary sources, heterogeneous cross‑study comparisons, limited causal identification of long‑run macro effects, and measurement challenges for AI‑driven intangible capital.
Authors' stated limitations section summarizing the nature of evidence used (qualitative literature review, secondary macro indicators, sectoral examples); this is an explicit self‑reported methodological limitation rather than an external empirical finding.
Researchers and firms should measure generation throughput, verification throughput, defect accumulation rates, mean time to detection/fix, costs per incident, and the marginal value of additional verification capacity to evaluate the framework's claims.
Prescriptive measurement priorities listed in the paper as recommendations for empirical validation.
The abstract reports no empirical tests, simulations, or field experiments; empirical validation of the framework is left for future work.
Direct observation of the paper's abstract and methods description indicating lack of empirical validation.
The paper's contribution is primarily conceptual/architectural rather than empirical.
Explicit statement in the paper and absence of reported empirical tests, simulations, or field experiments in the abstract and methods section.
There are limited standardized measures of 'AI capital,' scarce data on firm-level AI investment and implementation quality, and few long-run causal estimates of AI’s effects on managerial productivity and labor outcomes.
Gap analysis based on literature review and methodological discussion within the book; observation about the state of available empirical evidence.
There is a lack of large‑scale causal evidence on generative AI’s effects; the paper recommends RCTs, difference‑in‑differences, matched employer–employee panels, and longitudinal studies to fill empirical gaps.
Methodological critique and research agenda provided in the review; observation based on the authors' survey of the literature.
Policy interventions are needed for data protection, bias mitigation, model transparency, accountability, and public investments in workforce retraining to smooth transitions and reduce inequality.
Normative policy recommendations grounded in the review's synthesis of risks and distributional concerns; not an empirical claim but a recommendation.
New productivity metrics are needed to capture AI impacts, including time‑use changes, quality‑adjusted output, and accounting for intangible AI capital.
Methodological recommendation from the conceptual synthesis, motivated by limitations of existing measures discussed in the paper.
Static equilibrium and representative-agent models neglect dynamic reallocation, task re-bundling, and firm-level heterogeneity, limiting their realism for forecasting labour outcomes under AI adoption.
Theoretical critique offered in the paper and referenced critiques in the literature; evidence is conceptual and based on model assumptions identified across studies.
Common empirical strategies (cross-sectional exposure correlations and panel-difference analyses) often lack strong causal identification due to endogeneity of adoption and unobserved confounders.
Surveyed analytical strategies and explicit critique in the paper noting endogeneity and confounding; evidence is methodological critique grounded in the literature's reliance on observational exposure measures.
Researchers construct AI exposure indices at the task level to indicate susceptibility to AI automation or augmentation.
Cited examples (Felten et al., 2023; Eloundou et al., 2023) that develop task-level scores; evidence basis is methodological papers that publish indices and mapping procedures (often using O*NET tasks, expert labeling, or model-based scoring).
Commonly used data sources for measuring AI exposure include job postings and descriptions, occupational task databases (O*NET-style), employer/household surveys, administrative payroll data, and firm-level productivity measures.
List of data sources compiled in the paper; evidence is a methodological summary of datasets used across the cited literature rather than novel data collection.
Many studies rely on static assumptions (fixed comparative advantage, no adaptation) and theoretical models, which limits causal inference and makes projections model-dependent.
Methodological critique cited in the paper (e.g., critique of Acemoglu & Restrepo, 2022; Webb, 2020) and the paper's survey of common modeling choices (static equilibrium or representative-agent models); evidence basis is theoretical critique and literature review rather than new causal estimates.
Task-level approaches capture within-occupation heterogeneity in automation and augmentation risk that occupation-level analyses miss.
Empirical and methodological work cited (Felten et al., 2023; Eloundou et al., 2023) that constructs task-level exposure indices and shows variation across tasks within the same occupation; the evidence is based on task mappings from O*NET-style databases and job descriptions.
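The point about within-occupation heterogeneity can be illustrated with a toy example. The scores and occupation names below are hypothetical and are not taken from the cited indices; the sketch only shows that two occupations with the same average exposure can differ sharply at the task level.

```python
from statistics import mean

# Hypothetical task-level exposure scores on a 0-1 scale (not real index values).
TASK_EXPOSURE = {
    "occupation_a": {"task_1": 0.50, "task_2": 0.50, "task_3": 0.50},
    "occupation_b": {"task_1": 0.25, "task_2": 0.50, "task_3": 0.75},
}


def occupation_summary(task_scores):
    """Occupation-level mean plus the spread that the mean hides."""
    scores = list(task_scores.values())
    return {"mean": mean(scores), "spread": max(scores) - min(scores)}


summaries = {occ: occupation_summary(tasks) for occ, tasks in TASK_EXPOSURE.items()}
for occ, summary in summaries.items():
    print(f"{occ}: mean exposure {summary['mean']:.2f}, spread {summary['spread']:.2f}")
```

An occupation-level analysis would treat the two occupations as identical, while the task-level view shows that only the second contains both low-exposure and high-exposure tasks.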
Recent research in AI–labor economics has shifted from occupation-level analysis to task-level analysis, mapping task-by-task exposure to AI.
Synthesis of recent literature cited in the paper (e.g., Felten et al., 2023; Eloundou et al., 2023), which develops task-level exposure mappings using occupational task databases (O*NET-style) and job-posting text; the evidence is bibliographic and methodological rather than a single new empirical dataset.
Further quantitative research is needed to measure task‑level productivity effects, skill‑depreciation trajectories, and market impacts of differential GenAI adoption; structural models could incorporate TGAIF to predict labor demand and wage effects.
Authors' stated research agenda and limitations acknowledged in the paper; this is a call for future empirical work rather than an empirical claim.
ChatGPT was used as the generative engine for the MLLM in the system implementation described in the paper.
Methods section: integration of AR overlays with an MLLM, with ChatGPT used as the generative engine (explicit in the summary).
Further quantitative and comparative research is needed to measure net productivity effects, skill trajectories, and generalizability across firm types and industries.
Authors' methodological assessment and limitations section noting single-firm qualitative design (Netlight) and rapidly evolving toolchains; recommendation for future empirical work.
Another important gap is quantifying complementarities between AI and different skill types (evaluative vs. generative tasks).
Review observation that existing empirical work has not systematically quantified how AI productivity gains vary with worker skill composition and complementary roles.
Key research gaps include a lack of long-run causal evidence on the effects of LLMs on firm-level innovation rates, business formation, and industry structure.
Explicit identification of gaps in the literature within the nano-review; the review states that most studies are short-term, task-level, or descriptive.
Study limitations include reliance on perceptual measures (rather than solely objective performance), heterogeneity across institutional samples, and identification that is likely correlational rather than strictly causal.
Authors' own noted limitations in the paper's methods section: mixed-methods design using perceptions from questionnaires and interviews, sample heterogeneity across multinational institutions, and quantitative analyses that are associative rather than strictly causal.
Measurement and research gaps (data scarcity, informality) complicate robust economic assessment of AI impacts; improved metrics, granular labour and firm‑level data, and mixed‑methods evaluation are required.
Methodological critique based on reviewed literature and identified gaps; no new data collection in the paper.
There is a lack of causal evidence on the long-run impacts of AI-driven HRM on employment, wages, and firm survival—this is a key research gap identified by the review.
Explicitly stated research gap in the review based on assessment of methodologies and findings across the 47 included studies.
A systematic review following PRISMA identified 47 peer-reviewed studies (2012–2024) on data-driven HRM and workforce resilience from Scopus, Web of Science, and Google Scholar.
Explicit review protocol and search/screening results reported by the paper (PRISMA-based), final sample size = 47 studies.
Recommended research designs to estimate impacts include RCTs, quasi-experimental methods (difference-in-differences, regression discontinuity, matching), and longitudinal cohort tracking.
Paper explicitly lists these evaluation designs as appropriate methods for causal inference and long-term outcomes measurement. This is a methodological recommendation rather than an empirical claim.
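Of the designs listed, difference-in-differences has the simplest arithmetic, so a minimal two-group, two-period sketch is given here. The numbers and the function name are hypothetical, and the estimate is only interpretable as causal under the usual parallel-trends assumption, which the code cannot check.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical group means of an outcome before and after a program.
estimate = diff_in_diff(
    treated_pre=10.0, treated_post=14.0,
    control_pre=9.0, control_post=11.0,
)
print(f"difference-in-differences estimate: {estimate:.1f}")
```

The control group's change (2.0 here) stands in for what would have happened to the treated group without the program, so only the remaining 2.0 is attributed to it.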
There is a need for causal, longitudinal studies on how AI‑enabled fintech affects women's portfolio outcomes and on algorithmic interventions designed to reduce gender gaps.
Explicit statement in the paper noting limitations of existing literature (heterogeneity, limited longitudinal causal evidence, possible platform sample selection).
Analyses were conducted as intent-to-treat comparisons across arms, with hypothesis tests reported (including p-values) and principal stratification used for mechanism decomposition.
Methods statement: intent-to-treat comparisons, reported p-values for score differences, and use of principal stratification for separating total effect into adoption and effectiveness channels in the randomized trial (n = 164).
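An intent-to-treat comparison groups participants by the arm they were assigned to, regardless of whether they actually used the tool. The sketch below shows that logic on invented records; the scores, arm labels, and function names are hypothetical and are not the study's data. The permutation test is a stand-in chosen for simplicity, since the entry above does not say which test produced the reported p-values, and principal stratification is not attempted here.

```python
import random
from statistics import mean

# Hypothetical records: (assigned_arm, used_llm, exam_score). Not study data.
RECORDS = [
    ("control", False, 3.0), ("control", False, 3.25),
    ("control", False, 2.75), ("control", False, 3.0),
    ("access", True, 3.5), ("access", False, 3.0),
    ("access", True, 3.75), ("access", False, 3.25),
]


def itt_difference(records, treated="access", control="control"):
    """Mean score by assigned arm; actual LLM use is deliberately ignored."""
    treated_scores = [score for arm, _, score in records if arm == treated]
    control_scores = [score for arm, _, score in records if arm == control]
    return mean(treated_scores) - mean(control_scores)


def permutation_p_value(records, n_perm=2000, seed=0):
    """Two-sided p-value from reshuffling arm labels across participants."""
    rng = random.Random(seed)
    observed = abs(itt_difference(records))
    arms = [arm for arm, _, _ in records]
    extreme = 0
    for _ in range(n_perm):
        shuffled = arms[:]
        rng.shuffle(shuffled)
        permuted = [(arm, used, score)
                    for arm, (_, used, score) in zip(shuffled, records)]
        if abs(itt_difference(permuted)) >= observed:
            extreme += 1
    return extreme / n_perm


itt = itt_difference(RECORDS)
p_value = permutation_p_value(RECORDS)
print(f"ITT difference: {itt:.3f} grade points, permutation p = {p_value:.3f}")
```

Because two of the four "access" participants in the toy data never used the LLM, the ITT difference mixes adoption and effectiveness; separating those channels is the role the entry assigns to principal stratification.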
The primary outcomes analyzed were LLM adoption (use), exam score (grade points), and answer length.
Study’s stated primary outcomes in methods: adoption indicator, exam score on an issue-spotting exam, and answer length (measured). Sample size n = 164.
The study used a randomized controlled design with three arms: no LLM access, optional LLM access, and optional LLM access plus brief training.
Study methods description: randomized assignment of 164 law students to three experimental conditions as listed.
The intervention consisted of roughly a ten-minute training focused on how to use the LLM effectively.
Study description of the intervention in the randomized experiment (three-arm design with one arm receiving ~10-minute targeted training).
Empirical validation of the book’s proposals would require complementary case studies, model documentation, and outcome measurements.
Author/reviewer recommendation in the blurb about methodological limitations and next steps; not an empirical finding.
The book is predominantly conceptual and policy-analytic and uses illustrative case vignettes rather than presenting a single empirical study.
Explicit methodological description in the Data & Methods blurb: synthesis of technical ideas, governance requirements, and illustrative vignettes; no empirical sample or experimental protocol described.
Limitations of the review include the small sample of studies, uneven geographic coverage, heterogeneity in methods across studies, and limited long‑run evidence (especially on generative AI), which complicate causal aggregation.
Author-reported limitations based on the meta-assessment of the 17 included studies (variation in methods, contexts, and time horizons).
Design of this work: a systematic literature review and meta‑synthesis of empirical findings from peer‑reviewed journals (2020–2025), based on 17 publications.
Stated methods and inclusion criteria of the paper: systematic review of peer‑reviewed literature (sample = 17).