Evidence (1902 claims)

Claim counts by topic:

- Adoption: 5126 claims
- Productivity: 4409 claims
- Governance: 4049 claims
- Human-AI Collaboration: 2954 claims
- Labor Markets: 2432 claims
- Org Design: 2273 claims
- Innovation: 2215 claims
- Skills & Training: 1902 claims
- Inequality: 1286 claims
Evidence Matrix
Claim counts by outcome category and direction of finding; row totals can exceed the sum of the listed directions where some claims have an uncoded direction.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 369 | 105 | 58 | 432 | 972 |
| Governance & Regulation | 365 | 171 | 113 | 54 | 713 |
| Research Productivity | 229 | 95 | 33 | 294 | 655 |
| Organizational Efficiency | 354 | 82 | 58 | 34 | 531 |
| Technology Adoption Rate | 277 | 115 | 63 | 27 | 486 |
| Firm Productivity | 273 | 33 | 68 | 10 | 389 |
| AI Safety & Ethics | 112 | 177 | 43 | 24 | 358 |
| Output Quality | 228 | 61 | 23 | 25 | 337 |
| Market Structure | 105 | 118 | 81 | 14 | 323 |
| Decision Quality | 154 | 68 | 33 | 17 | 275 |
| Employment Level | 68 | 32 | 74 | 8 | 184 |
| Fiscal & Macroeconomic | 74 | 52 | 32 | 21 | 183 |
| Skill Acquisition | 85 | 31 | 38 | 9 | 163 |
| Firm Revenue | 96 | 30 | 22 | — | 148 |
| Innovation Output | 100 | 11 | 20 | 11 | 143 |
| Consumer Welfare | 66 | 29 | 35 | 7 | 137 |
| Regulatory Compliance | 51 | 61 | 13 | 3 | 128 |
| Inequality Measures | 24 | 66 | 31 | 4 | 125 |
| Task Allocation | 64 | 6 | 28 | 6 | 104 |
| Error Rate | 42 | 47 | 6 | — | 95 |
| Training Effectiveness | 55 | 12 | 10 | 16 | 93 |
| Worker Satisfaction | 42 | 32 | 11 | 6 | 91 |
| Task Completion Time | 71 | 5 | 3 | 1 | 80 |
| Wages & Compensation | 38 | 13 | 19 | 4 | 74 |
| Team Performance | 41 | 8 | 15 | 7 | 72 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 17 | 15 | 9 | 5 | 46 |
| Job Displacement | 5 | 28 | 12 | — | 45 |
| Social Protection | 18 | 8 | 6 | 1 | 33 |
| Developer Productivity | 25 | 1 | 2 | 1 | 29 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Skill Obsolescence | 3 | 18 | 2 | — | 23 |
| Labor Share of Income | 7 | 4 | 9 | — | 20 |
Skills & Training
Long-term evidence on generative AI’s structural labor‑market effects is scarce; few longitudinal studies exist.
Assessment of study horizons and methods among the 17 papers indicates limited long-run and longitudinal analyses specifically on generative AI impacts.
Empirical coverage is limited for low‑income countries; evidence from such settings is scarce.
Geographic distribution of the 17 reviewed studies shows concentration in advanced economies with few or no studies focused on low-income countries.
The literature shows a surge in research activity on AI and labor markets in 2023–2025 and a concentration of studies in advanced economies.
Meta-analytic summary of the publication years and geographic focus among the 17 selected publications (temporal and geographic count of included studies).
Results depend on accurate skill extraction from vacancy texts and valid measures of occupational exposure/complementarity; causal interpretation of diffusion effects may be limited by endogeneity (e.g., technology adoption responding to labor-market conditions).
Authors' stated methodological limitations: reliance on text-analysis identification of skills and on constructed measures of exposure/complementarity; acknowledgement of endogeneity concerns limiting causal claims.
The paper proposes two conceptual models (AI/ML‑Driven Labor Market Transformation Model and Sectoral Impact and Resilience Model) to organize heterogeneous findings and generate testable hypotheses about how AI reshapes labor across sectors and skill levels.
Conceptual synthesis integrating Technological Determinism, Socio‑Technical Systems Theory (STS), and Skill‑Biased Technological Change (SBTC); the models are theoretical outputs of the review used to map mechanisms and heterogeneity rather than empirical findings.
There are substantial measurement and identification gaps in the literature: heterogeneity in measuring 'AI adoption', limited long‑run causal evidence, and geographic bias toward advanced economies.
Methodological assessment within the review noting variability across studies in AI measures (patents, investment, task exposure proxies), paucity of long‑run causal designs, and concentration of empirical studies in advanced economies; this is a meta‑evidence limitation statement.
The study maps employment channels for AI-competent graduates and documents the most frequent job titles/roles and associated wage levels.
Descriptive analysis of employer channels, occupational role frequencies, and wage data compiled in the monitoring dataset covering graduates and alternative-route entrants.
Quasi-experimental designs (difference-in-differences, instrumental variables, event studies) and panel regressions are useful methods for identifying causal effects of AI adoption where plausibly exogenous variation exists.
Methodological summary in the paper listing common empirical strategies used in the literature to estimate causal impacts of technology adoption.
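As a toy illustration of the simplest of these designs, a two-period difference-in-differences effect is just the treated group's before-after change minus the control group's change; all numbers below are synthetic and illustrate the arithmetic only, not any study's data:

```python
# Minimal two-period difference-in-differences sketch with synthetic data.
# Groups: firms that adopted AI (treated) vs. non-adopters (control).

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """DiD effect = (treated change) - (control change)."""
    def mean(xs):
        return sum(xs) / len(xs)
    return (mean(treated_post) - mean(treated_pre)) - (
        mean(control_post) - mean(control_pre)
    )

# Synthetic productivity indices before/after adoption.
treated_pre  = [100, 102, 98, 101]
treated_post = [110, 113, 108, 112]
control_pre  = [99, 100, 101, 98]
control_post = [103, 104, 105, 102]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
# Treated rose 10.5 on average, controls rose 4.0, so the DiD effect is 6.5.
```

The design's identifying assumption (parallel trends absent treatment) is exactly what the "plausibly exogenous variation" caveat above refers to.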
Current research is limited by measurement challenges in capturing AI capabilities and firm-level adoption, and by a lack of longitudinal worker-firm data and causal identification in many settings.
Explicit limitations noted by the paper: gaps in task measures, scarce longitudinal linked datasets, and methodological challenges in causal inference.
This paper's approach is qualitative and based on secondary literature synthesis; it does not collect primary survey, experimental, or administrative data.
Explicit statement in the Data & Methods section of the paper.
Key empirical gaps remain: better measurement of K_T (AI/software capital), more granular matched employer‑employee and wealth data, and improved estimates of task-substitution elasticities are required to precisely quantify incidence and policy impacts.
Authors’ stated research agenda and limitations section, including sensitivity analyses showing outcome variation with parameter choices and measurement uncertainty.
The framework provides a roadmap for coordinated response across educational institutions, government agencies, and industry to ensure workforce resilience and domestic leadership in the emerging agentic finance era.
Authors' proposed integrated roadmap (prescriptive recommendation; no empirical testing or outcome measurement reported in the provided text).
We develop a comprehensive government policy framework including: 1) Federal AI literacy mandates for post-secondary business education; 2) Department of Labor workforce retraining programs with income support for displaced financial professionals; 3) SEC and Treasury regulatory innovations creating market incentives for workforce development; 4) State-level workforce partnerships implementing regional transition support; and 5) Enhanced social safety nets for workers navigating career transitions during the estimated 5-15 year transformation period.
Author-presented policy framework and recommendations (policy design proposals and an asserted 5–15 year transformation timeframe; no empirical evaluation reported).
We propose a multi-layered integration strategy for higher education encompassing: 1) Foundational AI literacy modules for all business students; 2) A specialized "Agentic Financial Planning" course with hands-on labs; 3) AI-augmented redesign of core courses (Investments, Portfolio Management, Ethics); 4) Interdisciplinary project-based learning with Computer Science; and 5) A governance and policy module addressing regulatory compliance (NIST AI RMF, SEC regulations).
Proposed curricular framework presented by the authors (recommendation/proposal, not empirically tested within the paper).
Empirical findings demonstrate that digitalization significantly boosts efficiency and competitiveness of industrial production.
Correlation and regression analyses reported in the study linking digitalization measures to indicators of efficiency and competitiveness across levels of analysis.
Digital technologies (automation, IIoT, ERP systems, AI applications) reduce nonproductive costs, increase per-worker output, and improve the cost-efficiency of production in Kazakhstani enterprises.
Case studies and real examples from named enterprises (Asia Auto, Karaganda Foundry and Engineering Plant, Eurasian Resources Group) presented in the article.
The number of employees and working time have a positive but limited effect on labor productivity.
Results from the study's correlation and regression analysis comparing labor input measures (employee count and working time) with productivity outcomes.
Digitalization is the key driver of labor productivity growth in Kazakhstan.
Empirical correlation and regression analysis reported in the study across enterprise, industry, and national economy levels.
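The kind of analysis described, regressing a productivity outcome on a digitalization index alongside labor inputs, can be sketched with synthetic data; none of the study's actual data or variable definitions are used here, and the coefficients are assumptions built into the simulated data-generating process:

```python
import numpy as np

# Synthetic regression in the spirit of the study's analysis: productivity
# on a digitalization index, employee count, and working time.
rng = np.random.default_rng(0)
n = 200
digital = rng.uniform(0, 1, n)          # digitalization index
workers = rng.uniform(50, 500, n)       # employee count
hours   = rng.uniform(1600, 2000, n)    # annual working time

# Assumed data-generating process: digitalization dominates, labor inputs
# contribute positively but weakly (mirroring the reported pattern).
productivity = (5.0 * digital + 0.002 * workers + 0.001 * hours
                + rng.normal(0, 0.1, n))

X = np.column_stack([np.ones(n), digital, workers, hours])
beta, *_ = np.linalg.lstsq(X, productivity, rcond=None)
# beta[1] (digitalization) recovers a coefficient near 5, dwarfing the
# labor-input coefficients beta[2] and beta[3].
```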
Investments in education and training are crucial for mitigating AI-induced employment disruptions and enhancing workforce adaptability.
Policy recommendation drawn from the paper's empirical findings (PLS-SEM, n = 351) and discussion.
Job displacement intensifies the demand for new skills, highlighting the need for reskilling and upskilling initiatives.
Finding reported from the study's PLS-SEM analysis of survey responses (n = 351).
AI has also fostered employment growth in emerging industries.
Empirical finding reported from the study's analysis of survey data (PLS-SEM, n = 351).
These results provide a mechanistic account of how humans adapt their trust in AI confidence signals through experience.
Combined behavioral evidence (N = 200) and computational modeling (LLO + Rescorla–Wagner) presented in the paper.
The model indicates that humans adapt by updating two components: baseline trust and confidence sensitivity, and they use asymmetric learning rates that prioritize the most informative errors.
Parameter recovery / model-fitting results reported in the paper showing updates to baseline trust and sensitivity parameters and asymmetric learning-rate estimates.
A computational model using a linear-in-log-odds (LLO) transformation combined with a Rescorla–Wagner learning rule explains the observed learning dynamics.
Modeling analysis reported in the paper fitting an LLO + Rescorla–Wagner model to participants' behavioral data (N = 200).
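A minimal sketch of the two model components, assuming a standard LLO parameterization (slope `gamma` for confidence sensitivity, anchor `p0` for baseline trust) and a standard asymmetric Rescorla–Wagner rule; the paper's exact parameterization may differ:

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def llo(confidence, gamma, p0=0.5):
    """Linear-in-log-odds (LLO) distortion of an AI confidence signal:
    the logit of perceived confidence is linear in the logit of displayed
    confidence, with slope gamma (sensitivity) and anchor p0 (baseline)."""
    return sigmoid(gamma * logit(confidence) + (1.0 - gamma) * logit(p0))

def rescorla_wagner(belief, outcome, alpha_pos, alpha_neg):
    """Asymmetric Rescorla-Wagner update of a trust belief: a different
    learning rate applies depending on the sign of the prediction error,
    letting the learner weight the more informative errors more heavily."""
    error = outcome - belief          # outcome: 1 if AI correct, else 0
    alpha = alpha_pos if error > 0 else alpha_neg
    return belief + alpha * error
```

With `gamma = 1` the LLO transform is the identity (perfectly trusted confidence); `gamma < 1` compresses confidence toward the anchor, which is how miscalibration compensation can be expressed.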
Humans can compensate for monotonic miscalibration (overconfidence and underconfidence) through repeated experience.
Behavioral experiment results showing participants adapted successfully in overconfidence and underconfidence conditions (N = 200, 50 trials).
Robust learning occurred across all calibration conditions (standard, overconfidence, underconfidence, reverse) with participants improving accuracy, discrimination, and calibration.
Behavioral experiment (N = 200) reporting consistent learning improvements across the four experimental conditions over 50 trials.
Participants significantly improved their calibration alignment (alignment between their confidence predictions and actual AI correctness) over 50 trials.
Behavioral experiment (N = 200) reporting improvements in calibration alignment metrics across trials.
Participants significantly improved their discrimination (ability to distinguish correct vs. incorrect AI outputs) over 50 trials.
Behavioral experiment (N = 200) reporting improved discrimination metrics across repeated trials.
Participants significantly improved their prediction accuracy of the AI's correctness over 50 trials.
Behavioral experiment (N = 200), longitudinal measurement across 50 trials reporting statistically significant improvement in accuracy.
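The three outcome measures (accuracy, discrimination, calibration alignment) can be illustrated with simple assumed definitions; the paper's exact metric formulas are not specified above, so these are generic stand-ins:

```python
def accuracy(preds, outcomes):
    """Fraction of trials where the participant's binary prediction
    (confidence > 0.5) matches the AI's actual correctness (0/1)."""
    hits = sum((p > 0.5) == bool(o) for p, o in zip(preds, outcomes))
    return hits / len(preds)

def discrimination(preds, outcomes):
    """Mean stated confidence on trials where the AI was correct minus
    mean stated confidence on trials where it was wrong."""
    correct = [p for p, o in zip(preds, outcomes) if o]
    wrong   = [p for p, o in zip(preds, outcomes) if not o]
    return sum(correct) / len(correct) - sum(wrong) / len(wrong)

def calibration_gap(preds, outcomes):
    """Absolute gap between mean stated confidence and the AI's actual
    accuracy (a coarse calibration-alignment measure)."""
    return abs(sum(preds) / len(preds) - sum(outcomes) / len(outcomes))

# A well-calibrated, discriminating predictor over four trials:
preds, outcomes = [0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]
```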
This regional research yields a multi-dimensional policy roadmap that assesses the region’s current capabilities and the hurdles it faces in catching up with the AI revolution from a governance and policy perspective, presented as a practical framework for public-sector leaders.
Report summary claiming that the study's results produce a comprehensive roadmap and practical framework (content description).
This executive report provides a roadmap for establishing an AI governance infrastructure through a set of strategic policy recommendations across seven key pillars.
Document assertion describing the content and structure of the report (authors' deliverable).
The reality of limited AI governance capacity calls for a series of policy interventions at both local and regional levels to empower the AI ecosystem in the Arab region.
Authors' policy recommendation derived from the regional study and synthesis of findings.
A policy of 20% mandatory practice preserves 92% more capability than the simulation baseline (baseline includes a 5% background AI-failure rate).
Simulation comparing baseline (5% background AI-failure rate) to a counterfactual with 20% mandatory practice; reported 92% relative preservation of capability.
The model predicts that periodic AI failures improve human capability 2.7-fold (relative improvement reported in simulations).
Simulation experiments comparing scenarios with/without periodic AI failures; reported fold-change in capability of 2.7×.
Validated against PISA data from 15 countries (102 data points), the model achieves R^2 = 0.946 with 3 parameters and attains the lowest BIC among compared specifications.
Empirical validation using PISA dataset covering 15 countries and 102 data points; reported fit statistics (R^2, number of parameters, BIC).
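The reported fit statistics follow standard definitions; a minimal sketch, assuming Gaussian errors for the BIC (an assumption the summary does not state):

```python
import math

def r_squared(y, yhat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ybar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - ybar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def bic_gaussian(n, k, ss_res):
    """BIC under Gaussian errors, up to an additive constant:
    n*ln(RSS/n) + k*ln(n). Lower is better; the k*ln(n) term is why a
    3-parameter model can beat richer specifications at similar fit."""
    return n * math.log(ss_res / n) + k * math.log(n)
```

For example, at equal residual error over the 102 data points, a 3-parameter model has a strictly lower BIC than a 5-parameter one.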
The model was calibrated to four domains: education, medicine, navigation, and aviation.
Model calibration procedures applied separately to four named domains reported in the paper.
We present a two-variable dynamical systems model coupling capability (H) and delegation (D), grounded in three axioms: learning requires capability, learning requires practice, and disuse causes forgetting.
Model specification and theoretical construction described in the paper (two-variable dynamical system; three axioms).
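The paper's exact equations are not reproduced above, so the following is only an illustrative two-variable system that encodes the three axioms plus a drift toward increasing delegation; the functional forms and parameters are assumptions, and the sketch is not calibrated to reproduce the 2.7x or 92% figures:

```python
# Illustrative capability (H) / delegation (D) dynamics, Euler-integrated.
# Axioms encoded: learning requires capability (H factor) and practice
# (1 - D factor); disuse causes forgetting (the -forget * D * H term).

def simulate(steps=2000, dt=0.05, failure_period=None,
             learn=0.6, forget=0.15, drift=0.4):
    H, D = 0.5, 0.0
    for t in range(1, steps + 1):
        if failure_period and t % failure_period == 0:
            D = 0.0  # a periodic AI failure forces the human to practice
        practice = 1.0 - D
        dH = learn * practice * H * (1.0 - H) - forget * D * H
        dD = drift * (1.0 - D)  # delegation drifts upward when AI works
        H = min(1.0, max(0.0, H + dt * dH))
        D = min(1.0, max(0.0, D + dt * dD))
    return H

with_failures = simulate(failure_period=200)
without_failures = simulate(failure_period=None)
# Without failures, delegation saturates, practice vanishes, and capability
# decays; periodic failures interrupt the decay and preserve capability.
```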
This work offers a cost-effective, scientifically grounded blueprint for ubiquitous AI education.
Authors' concluding statement based on the SOP, low labor/hardware claims, and the pilot exam results showing high accuracy with the Shadow Agent in newer 32B models.
This suggests that structured reasoning guidance (as implemented by the Shadow Agent) is the key to unlocking the latent power of modern small language models.
Interpretive claim based on the pilot study's observed large gains for newer 32B models when using Shadow Agent guidance versus smaller gains for older models and stagnation in baselines.
In contrast, older models see only modest gains (~10%) from the Shadow Agent guidance.
Same pilot study reporting that older (unspecified) model generations showed only about a ~10% improvement when using the Shadow Agent versus baseline. No exact accuracy numbers, sample size, or model names provided.
The Shadow Agent, which provides structured reasoning guidance, triggers a massive capability surge in newer 32B models, boosting performance from 74% (Naive RAG) to mastery level (90%).
Pilot study on a full graduate-level final exam reported comparisons between Naive RAG (74% accuracy) and the Shadow Agent (90% accuracy) for newer 32B models. Specific number of exam items or statistical testing not stated.
We used a Vision-Language Model data cleaning strategy and a novel Shadow-RAG architecture as core technical components of the localization pipeline.
Methodological description in the practitioner report; the paper explicitly names these two techniques as the data-cleaning and architectural contributions used to create the tutor.
Using a Vision-Language Model data cleaning strategy and a novel Shadow-RAG architecture, we localized a graduate-level Applied Mathematics tutor using only 3 person-days of non-expert labor and open-weights 32B models deployable on a single consumer-grade GPU.
Practitioner report describing a replicable Standard Operating Procedure (SOP); method claims include Vision-Language Model data cleaning and Shadow-RAG; deployment described as using open-weight 32B models on a single consumer GPU; labor reported as '3 person-days of non-expert labor'. No sample size or independent replication reported in text.
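The Shadow-RAG architecture itself is not specified in detail above; the toy sketch below merely contrasts a naive RAG prompt with one that prepends structured reasoning guidance, using an invented token-overlap retriever and invented prompt text (no real retriever, model, or API from the report):

```python
# Toy contrast: naive RAG prompt vs. a "shadow"-style prompt that adds
# structured reasoning guidance before the retrieved context.

def retrieve(query, corpus, k=2):
    """Rank documents by simple token overlap with the query (a stand-in
    for whatever retriever the real pipeline uses)."""
    def score(doc):
        q, d = set(query.lower().split()), set(doc.lower().split())
        return len(q & d)
    return sorted(corpus, key=score, reverse=True)[:k]

def naive_rag_prompt(query, corpus):
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

def shadow_prompt(query, corpus):
    guidance = (
        "Before answering: (1) restate the problem formally; "
        "(2) identify the relevant theorem or method in the context; "
        "(3) derive the answer step by step; (4) verify the result."
    )
    return guidance + "\n\n" + naive_rag_prompt(query, corpus)

corpus = [
    "The Laplace transform converts a differential equation into an algebraic one.",
    "Gradient descent minimizes a loss by stepping against the gradient.",
    "A Markov chain is memoryless: the next state depends only on the current state.",
]
p = shadow_prompt(
    "How does the Laplace transform help solve a differential equation?", corpus)
```

The reported gap (74% vs. 90% for newer 32B models) is attributed to this kind of reasoning scaffold, not to the retriever.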
Human-replacing technologies have a strategic role in enhancing industrial productivity and ensuring the long-term resilience of Ukraine’s mining and metallurgical sector amid workforce shortages and structural labour-market changes due to war and demographic decline.
Integrated sectoral assessment in the paper combining current context (workforce shortages, structural changes), literature on technology-driven productivity/resilience, and industry-specific considerations; presented as a high-level conclusion.
Integrating ergonomic assessments and human–systems–interaction approaches into automation projects is important to prevent cognitive overload, occupational stress and operational risks for control‑room operators.
Recommendation and emphasis in the paper, supported by references to ergonomics and human-factors literature; presented as a preventive/mitigative approach rather than a quantified empirical result for the sector.
Successful technological modernization requires continuous investment in human capital, reskilling and the development of digital and engineering competencies.
Policy/recommendation based on the paper's synthesis of the sector analysis and literature on skill requirements and technology adoption; not presented as an original empirical estimate in the summary.
Higher robot density is associated with productivity gains, particularly in low-robotized sectors such as Ukraine’s mining and metallurgical industry.
Empirical evidence cited from international and industry-specific studies reviewed in the paper (literature review/meta-analytic style evidence); no Ukraine-specific causal estimate with sample size reported in the summary.
Human-replacing technologies also have an indirect impact on productivity by increasing total factor productivity (TFP).
Analytical argumentation in the paper supported by references to empirical studies showing TFP effects of automation/digitalization; literature synthesis rather than a new econometric estimate presented for Ukraine.
Human-replacing technologies (mechanization, automation, robotization, digitalization and AI-augmentation) make a direct contribution to labour productivity growth in Ukraine's mining and metallurgical sector.
Sectoral analysis and synthesis in the paper drawing on empirical international and industry-specific studies; literature review of productivity impacts of mechanization/automation/robotization/digitalization/AI in industrial contexts.
Trained participants more often assigned tasks to the agent by defining strategies compared to participants who did not receive teamwork training.
Behavioral measure in experiment (frequency of assigning tasks using defined strategies) comparing trained vs. untrained participants in the KeyWe game with a scripted agent.