Evidence (2608 claims)
Adoption (7395 claims)
Productivity (6507 claims)
Governance (5877 claims)
Human-AI Collaboration (5157 claims)
Innovation (3492 claims)
Org Design (3470 claims)
Labor Markets (3224 claims)
Skills & Training (2608 claims)
Inequality (1835 claims)
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 609 | 159 | 77 | 736 | 1615 |
| Governance & Regulation | 664 | 329 | 160 | 99 | 1273 |
| Organizational Efficiency | 624 | 143 | 105 | 70 | 949 |
| Technology Adoption Rate | 502 | 176 | 98 | 78 | 861 |
| Research Productivity | 348 | 109 | 48 | 322 | 836 |
| Output Quality | 391 | 120 | 44 | 40 | 595 |
| Firm Productivity | 385 | 46 | 85 | 17 | 539 |
| Decision Quality | 275 | 143 | 62 | 34 | 521 |
| AI Safety & Ethics | 183 | 241 | 59 | 30 | 517 |
| Market Structure | 152 | 154 | 109 | 20 | 440 |
| Task Allocation | 158 | 50 | 56 | 26 | 295 |
| Innovation Output | 178 | 23 | 38 | 17 | 257 |
| Skill Acquisition | 137 | 52 | 50 | 13 | 252 |
| Fiscal & Macroeconomic | 120 | 64 | 38 | 23 | 252 |
| Employment Level | 93 | 46 | 96 | 12 | 249 |
| Firm Revenue | 130 | 43 | 26 | 3 | 202 |
| Consumer Welfare | 99 | 51 | 40 | 11 | 201 |
| Inequality Measures | 36 | 105 | 40 | 6 | 187 |
| Task Completion Time | 134 | 18 | 6 | 5 | 163 |
| Worker Satisfaction | 79 | 54 | 16 | 11 | 160 |
| Error Rate | 64 | 78 | 8 | 1 | 151 |
| Regulatory Compliance | 69 | 64 | 14 | 3 | 150 |
| Training Effectiveness | 81 | 15 | 13 | 18 | 129 |
| Wages & Compensation | 70 | 25 | 22 | 6 | 123 |
| Team Performance | 74 | 16 | 21 | 9 | 121 |
| Automation Exposure | 41 | 48 | 19 | 9 | 120 |
| Job Displacement | 11 | 71 | 16 | 1 | 99 |
| Developer Productivity | 71 | 14 | 9 | 3 | 98 |
| Hiring & Recruitment | 49 | 7 | 8 | 3 | 67 |
| Social Protection | 26 | 14 | 8 | 2 | 50 |
| Creative Output | 26 | 14 | 6 | 2 | 49 |
| Skill Obsolescence | 5 | 37 | 5 | 1 | 48 |
| Labor Share of Income | 12 | 13 | 12 | — | 37 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
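The four direction columns do not always sum to the Total column (for the Other row they sum to 1581 against a listed total of 1615), so any share computed from this table should state its denominator. A minimal sketch using three rows copied from the matrix; the function names and the choice to divide by the sum of the four listed directions are illustrative assumptions, not part of the source:

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
ROWS = {
    "Other": (609, 159, 77, 736, 1615),
    "Job Displacement": (11, 71, 16, 1, 99),
    "Task Completion Time": (134, 18, 6, 5, 163),
}


def positive_share(row):
    """Share of positive findings among the four listed directions (assumed denominator)."""
    positive, negative, mixed, null, _total = row
    listed = positive + negative + mixed + null
    return positive / listed


def unlisted_claims(row):
    """Claims counted in Total but not in any of the four direction columns."""
    *directions, total = row
    return total - sum(directions)


for name, row in ROWS.items():
    print(f"{name}: positive share {positive_share(row):.1%}, unlisted {unlisted_claims(row)}")
```

Dividing by the listed directions rather than by Total keeps the shares comparable across rows where some claims carry no listed direction.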
Skills & Training
Existing evidence is time-sensitive and heterogeneous: rapidly evolving models, heterogeneous study designs, and many short-term lab/microtask studies limit direct comparability and long-run inference.
Meta-observation from the review: documented methodological limitations across the literature (variation in models, tasks, metrics; prevalence of short-term studies).
Real‑time and LLM‑based methods improve responsiveness but raise governance, transparency, and reproducibility challenges that BLS must manage (audit trails, uncertainty communication).
Operational tradeoff discussion in the paper identifying governance risks; no case studies or incident analyses provided.
Distinguishing automation versus augmentation using causal methods changes policy responses (e.g., income support versus reskilling).
Policy implication drawn from conceptual separation of substitution and complementarity effects; logical inference rather than empirical demonstration in the paper.
Methodological caveats across the literature (heterogeneity of tasks/measures, publication bias, short-term studies) limit the generalizability of current findings.
Meta-level critique within the synthesis noting study heterogeneity, likely publication/short-term biases, and variable domain-specific performance dependent on user expertise and workflows.
Standard productivity metrics are likely to undercount the value generated by AI-augmented ideation; quality-adjusted measures of creative output are required.
Measurement critique based on the mismatch between existing productivity statistics and the kinds of upstream idea-generation gains observed in empirical studies; supported by the review's methodological discussion.
Evaluation of the equivalency system should use metrics such as concordance between claimed competencies and verified inputs, predictive validity versus labor-market integration outcomes, and false positive/negative rates in automated decisions.
Methodological recommendation in the paper outlining specific evaluation metrics; this is a prescriptive claim (no empirical implementation reported).
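Two of the metrics named above, the false positive and false negative rates, can be computed once automated decisions are paired with verified outcomes. A minimal sketch with made-up labels; the function name `error_rates` and the example data are hypothetical and do not come from the paper, which reports no empirical implementation:

```python
def error_rates(automated, verified):
    """Return (false_positive_rate, false_negative_rate) for paired boolean decisions.

    automated[i] is the system's decision (True = competency recognized);
    verified[i] is the reference outcome for the same case.
    """
    if len(automated) != len(verified):
        raise ValueError("inputs must have the same length")
    # False positive: system recognized a competency that verification rejected.
    fp = sum(a and not v for a, v in zip(automated, verified))
    # False negative: system rejected a competency that verification confirmed.
    fn = sum(v and not a for a, v in zip(automated, verified))
    negatives = sum(not v for v in verified)
    positives = sum(bool(v) for v in verified)
    fpr = fp / negatives if negatives else float("nan")
    fnr = fn / positives if positives else float("nan")
    return fpr, fnr


# Hypothetical example: 6 cases, purely illustrative.
automated = [True, True, False, False, True, False]
verified = [True, False, False, True, True, False]
fpr, fnr = error_rates(automated, verified)
print(f"FPR={fpr:.2f}, FNR={fnr:.2f}")
```

Reporting the two rates separately matters here because wrongly recognizing a competency and wrongly rejecting one are different kinds of error.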
Results and implications are limited by the sample and context: evidence comes from law students on a single issue-spotting exam using one brief training intervention, so generalizability to experienced professionals, other tasks, or other models is untested.
Authors’ reported sample (164 law students) and explicit caution about generalizability in the study summary; the intervention and outcome are specific to one exam and one ~10-minute training.
Some mechanism-specific estimates are imprecise due to the sample size; confidence intervals for those estimates are wide.
Authors report wide confidence intervals for mechanism decomposition (principal stratification) results based on the randomized sample of 164 students.
There is no consensus in the literature on net job effects — studies diverge on whether AI produces net job gains.
Direct finding from the review: the 17 peer‑reviewed studies produce heterogeneous results on net employment impacts (some positive, some negative, some neutral).
Effects of AI adoption are heterogeneous across industries, firm sizes, regions, and worker characteristics (education, experience, occupation).
Microdata and firm-level studies exploiting cross-sectional and panel variation, quasi-experimental designs leveraging differential adoption across firms/regions, and comparative institutional analyses showing variation by context.
The effects of adopting technological capital (K_T) are heterogeneous across industries, firms, countries, and cohorts: early adopters and capital-rich firms and countries gain most, which implies important transition dynamics for political economy.
Cross-country comparisons, industry- and firm-level panel heterogeneity analyses, and case studies demonstrating variation in adoption timing and gains; model simulations emphasizing transition path dependence.
Aggregate productivity (output per worker or per unit of inputs) can rise while labor’s share and employment decline due to substitution toward K_T.
Macro growth-accounting exercises decomposing output growth into contributions from labor, traditional capital, and technological capital; model simulations showing productivity gains coexisting with falling labor shares under substitution elasticities.
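The coexistence of rising productivity and a falling labor share can be read off a standard growth-accounting identity. The following is a sketch under assumptions not stated in the source: constant returns to scale, factor shares s_L + s_K + s_T = 1, g_X denoting the growth rate of X, K denoting traditional capital, and A denoting residual productivity.

```latex
\begin{align}
g_Y &= g_A + s_L\, g_L + s_K\, g_K + s_T\, g_{K_T}, \\
g_Y - g_L &= g_A + s_K\,(g_K - g_L) + s_T\,(g_{K_T} - g_L).
\end{align}
```

The second line follows from the first by subtracting g_L and using s_L + s_K + s_T = 1. Output per worker grows whenever its right-hand side is positive, which can hold even when g_L is negative and s_L is shrinking as s_T rises.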
Rather than restoring stability, this cycle intensifies anxiety, undermines mastery, and erodes professional confidence.
Theoretical claim about psychological outcomes from the conceptual reskilling loop; paper provides argumentation but no empirical measurements.
Based on Job Demands–Resources (JD-R) theory and Conservation of Resources (COR) theory, the paper conceptualizes an AI-induced reskilling loop in which ongoing technological change leads to skill erosion, continuous reskilling demands, cognitive and emotional depletion, and reinforced learning as a defensive response to perceived obsolescence.
Theoretical model/loop derived from applying JD-R and COR frameworks; no empirical test or sample reported in the paper.
The paper introduces the concept of 'reskilling fatigue' to explain the human consequences of persistent skill volatility among Established Knowledge Professionals (EKPs).
Conceptual/theoretical contribution presented by the authors; definition and argumentation rather than empirical validation.
Continuous reskilling is widely promoted as a solution to AI-driven disruption, but little attention has been paid to its cumulative psychological costs.
Argument from literature review/observation in the paper; no empirical measurement or sample reported in the paper.
Employees experience technostress, anxiety and micro-political negotiation around AI tools in everyday work.
Reported experiences from semi-structured interviews with 28 managers/professionals across 12 organizations; thematic analysis highlighting technostress and anxiety as themes.
Increased levels of AI assistance may degrade productivity, leading to potentially significant shortfalls under the model's identified conditions.
Model-based comparative-statics and steady-state analysis showing scenarios where marginal increases in AI assistance reduce expected task output; examples/parameter illustrations provided in the paper (theoretical, no empirical sample).
Introducing AI unreliability (errors/noise in AI outputs) in the model can also generate a productivity paradox: greater AI assistance may lower productivity.
Analytical/theoretical model incorporating AI unreliability; model derivations and examples demonstrating conditions under which unreliability leads to reduced productivity (no empirical data).
Making skill development endogenous in the model can induce a productivity paradox in which increased AI assistance reduces productivity.
Analytical/theoretical model of human-AI interaction with utility-maximizing human agents and endogenous skill development; steady-state and comparative-static analysis reported in the paper (no empirical sample).
AI integration also raises workers' concerns about skill obsolescence by 33%.
Reported as a survey/result in the paper; the study includes surveys of 800 marketers (self-reported concerns about skill obsolescence are likely derived from that survey sample).
Rising data velocity renders legacy systems obsolete—threatening approximately $3.4 trillion in global marketing spending.
Paper reports an estimate/claim about threatened global marketing spending tied to legacy systems becoming obsolete (derivation likely from the study's quantitative analysis or economic estimate described in the paper).
62% of teams suffer from "AI paralysis," unable to scale pilot initiatives beyond isolated implementations.
Reported as a finding in the paper's mixed-methods study (paper states AI adoption audits of 120 organizations and surveys of 800 marketers as part of the study).
Using LLMs led to fewer creative moments observed in participants (p=0.002).
Within-subject comparison between LLM-assisted and unassisted conditions with reported p-value p=0.002. Study sample N=20.
Participants using LLMs had significantly shorter idea-generation periods (p=0.0004).
Within-subject comparison between LLM-assisted and unassisted conditions reported in paper; p-value reported as p=0.0004. Sample size N=20.
AI-assisted engineering teams also face a 19% risk of skills obsolescence.
Empirical finding reported by the study, presumably based on the mixed-methods data (survey/Delphi/case studies) described in the abstract.
Forecasts indicate that automation may supplant as much as 45% of traditional tasks by 2030.
Statement in the paper referencing external forecasts (no specific source or sample reported in the abstract).
Credential erosion is evident in the aggregate pattern (credentials losing signaling value relative to AI-augmented skill demonstrations).
Synthesis statement from included studies noting credential erosion alongside skill signaling changes; not quantified in the excerpt.
Developing economies reliant on cognitive services outsourcing face disproportionate disruption through both direct exposure and indirect demand-erosion channels.
Preliminary empirical evidence across included studies indicating larger negative impacts for economies dependent on cognitive-services exports; described as preliminary but material.
Observable labor market data already document patterns consistent with AI-driven displacement rather than mere transformation—concentrated among routine cognitive tasks and junior roles.
Synthesis of observed labor market indicators from retained empirical studies since 2020 showing concentration of declines in routine cognitive tasks and junior roles.
Evidence from online labor markets shows a 2%–21% reduction in posting volumes for automatable creative tasks following ChatGPT's release.
Empirical analyses of online labor market posting volumes reported in multiple studies included in the review; range reported across studies.
Across synthesized studies, postings for entry- and mid-level software development and content-creation roles in high-income economies fell by 14% to 41% between 2022 and 2024 (range across individual studies; median decline: 23%).
Synthesis of empirical studies retained in the systematic review (numerical range and median reported across non-overlapping study designs and geographies); no pooled meta-analytic estimate provided.
Without parallel investment in digital literacy, organizational culture, and inter-firm networks, AI will reproduce rather than reduce employment inequalities.
Authors' conclusion drawn from thematic analysis of interviews and conceptual framing; predictive statement based on qualitative findings.
AI adoption in peripheral economies is not a purely technological or financial challenge but a social and human capital challenge, embedded in a biocultural environment shaped by brain drain, institutional thinness, and weak civic intermediation.
Synthesis of interview findings using Bitsani's Biocultural City framework; qualitative evidence from 12 interviews supports this argument.
Knowledge deficits and financial constraints emerge as primary barriers [to AI adoption].
Thematic analysis of the twelve semi-structured interviews reporting these themes as primary barriers.
Firms do not internalize the congestion externality they impose on the retraining queue, the irreversibility of permanent exit, or the wage depression borne by non-routine incumbents — explaining why market adoption speed exceeds the social optimum.
Model-based mechanism: normative/comparative analysis showing omitted externalities in firm-level optimization relative to social planner, leading to divergence between private and social adoption speeds.
Social welfare is strictly concave in adoption speed and is maximized at an interior optimum below the market rate of adoption.
Analytical welfare optimization in the theoretical model: social-welfare function as a function of adoption speed yields strict concavity and an interior social optimum; comparison with market equilibrium adoption speed indicates market rate exceeds social optimum.
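The ordering of the market and socially optimal adoption speeds follows from strict concavity in a few steps. A sketch under assumed notation, none of which appears in the source: v is adoption speed, Π(v) is firms' private payoff with an interior maximum at the market speed v^m, E(v) is the external cost that firms ignore, with E'(v) > 0, and W(v) = Π(v) − E(v) is social welfare.

```latex
\begin{align}
\Pi'(v^{m}) &= 0, \\
W'(v^{m}) &= \Pi'(v^{m}) - E'(v^{m}) = -E'(v^{m}) < 0, \\
W'(v^{*}) &= 0, \quad W''(v) < 0 \;\Rightarrow\; v^{*} < v^{m}.
\end{align}
```

Strict concavity makes W' strictly decreasing, so the point where W' equals zero must lie below the market speed, where W' is already negative.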
Faster adoption causes a sustained compression of the labor share throughout the transition window.
Model result showing time-path of labor's income share under varying adoption speeds in the theoretical framework.
Faster adoption produces a steeper and more persistent decline in labor force participation.
Dynamic model trajectories and comparative statics showing time path of labor force participation under different adoption-speed parameters.
Faster adoption overwhelms the retraining pipeline and generates permanent labor-force exit through worker discouragement.
Model mechanism: finite-capacity retraining queue in the dynamic model leads to queue congestion, producing a discouraged stock of permanently exited workers (analytical result in the theoretical model).
Current AI development trajectory reflects value choices that prioritize conversational generality over domain specificity, accountability, and long-term social sustainability.
Normative/critical analysis in the paper highlighting design priorities and trade-offs; no empirical measurement provided.
Sustained investment in large-scale chatbot infrastructures increases environmental costs.
Paper asserts environmental impacts from infrastructure investment (energy, resource use) as part of systemic critique; no quantified environmental measurements or sample size reported.
Chatbot-driven AI development contributes to concentration of economic power.
Argumentation about industry dynamics and infrastructure centralization in the paper; no empirical market-concentration metrics or sample provided.
The normalization of chatbots contributes to labor displacement.
Theoretical argument linking widespread chatbot adoption to changes in work and employment; no empirical displacement estimates provided.
Normalization of chatbot-mediated interaction alters patterns of work, learning, and decision-making, contributing to deskilling, homogenization of knowledge, and shifting expectations of expertise.
Analytical reasoning and literature-informed claims in the paper; no quantitative measurement or sample reported.
Chatbot-based systems often fail to adequately meet user needs, particularly in complex or high-stakes contexts, while projecting confidence and authority.
Qualitative argumentation and illustrative examples in the paper; no reported controlled empirical study or sample size.
The chatbot paradigm is not a neutral interface choice, but a dominant sociotechnical configuration whose widespread adoption reshapes social, economic, legal, and environmental systems.
Conceptual argument and synthesis in the paper (theoretical analysis); no empirical sample or quantitative data reported.
There is an absence of agreed-upon benchmarks for evaluating AI systems.
Introductory chapter notes lack of standardized evaluation benchmarks as a cross-cutting concern; presented as an analytical observation by the task force.
AI systems exhibit bias.
Introductory chapter points to bias in AI systems as a recurring theme; supported by the broader literature cited in the report (no numerical sample reported in the introduction).
AI model outputs are often opaque and non-replicable.
Introductory chapter identifies opacity and non-replicability of AI outputs as a cross-cutting theme; claim is based on literature synthesis and conceptual critique in the report.