Evidence (3231 claims)
Adoption
7395 claims
Productivity
6507 claims
Governance
5921 claims
Human-AI Collaboration
5192 claims
Org Design
3497 claims
Innovation
3492 claims
Labor Markets
3231 claims
Skills & Training
2608 claims
Inequality
1842 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 609 | 159 | 77 | 738 | 1617 |
| Governance & Regulation | 671 | 334 | 160 | 99 | 1285 |
| Organizational Efficiency | 626 | 147 | 105 | 70 | 955 |
| Technology Adoption Rate | 502 | 176 | 98 | 78 | 861 |
| Research Productivity | 349 | 109 | 48 | 322 | 838 |
| Output Quality | 391 | 121 | 45 | 40 | 597 |
| Firm Productivity | 385 | 46 | 85 | 17 | 539 |
| Decision Quality | 277 | 145 | 63 | 34 | 526 |
| AI Safety & Ethics | 189 | 244 | 59 | 30 | 526 |
| Market Structure | 152 | 154 | 109 | 20 | 440 |
| Task Allocation | 158 | 50 | 56 | 26 | 295 |
| Innovation Output | 178 | 23 | 38 | 17 | 257 |
| Skill Acquisition | 137 | 52 | 50 | 13 | 252 |
| Fiscal & Macroeconomic | 120 | 64 | 38 | 23 | 252 |
| Employment Level | 93 | 46 | 96 | 12 | 249 |
| Firm Revenue | 130 | 43 | 26 | 3 | 202 |
| Consumer Welfare | 99 | 51 | 40 | 11 | 201 |
| Inequality Measures | 36 | 106 | 40 | 6 | 188 |
| Task Completion Time | 134 | 18 | 6 | 5 | 163 |
| Worker Satisfaction | 79 | 54 | 16 | 11 | 160 |
| Error Rate | 64 | 79 | 8 | 1 | 152 |
| Regulatory Compliance | 69 | 66 | 14 | 3 | 152 |
| Training Effectiveness | 82 | 16 | 13 | 18 | 131 |
| Wages & Compensation | 70 | 25 | 22 | 6 | 123 |
| Team Performance | 74 | 16 | 21 | 9 | 121 |
| Automation Exposure | 41 | 48 | 19 | 9 | 120 |
| Job Displacement | 11 | 71 | 16 | 1 | 99 |
| Developer Productivity | 71 | 14 | 9 | 3 | 98 |
| Hiring & Recruitment | 49 | 7 | 8 | 3 | 67 |
| Social Protection | 26 | 14 | 8 | 2 | 50 |
| Creative Output | 26 | 14 | 6 | 2 | 49 |
| Skill Obsolescence | 5 | 37 | 5 | 1 | 48 |
| Labor Share of Income | 12 | 13 | 12 | — | 37 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
Labor Markets
Remove filter
Falsifiability condition for intermediation-collapse: If intermediary margins remain stable despite measurable declines in information frictions, the intermediation-collapse mechanism is falsified.
Stated empirical test in the paper that compares measured intermediary markups/margins to proxies for information frictions and AI-driven automation across affected sectors.
Falsifiability condition for Ghost GDP: If monetary velocity does not decline (or instead rises) as the labor share falls, the Ghost GDP channel is unsupported by the data.
Explicit falsification condition provided in the paper based on the model link labor share -> velocity -> consumption; suggested empirical test using monetary-velocity proxies and labor-share series from FRED.
Empirically, top-quintile households account for roughly 47–65% of U.S. consumption.
Calibration and reported quantitative scenarios in the paper using U.S. consumption concentration data (constructed from U.S. consumption/income micro- and macro-data sources referenced in the methods section).
The paper uses a mixed-methods approach combining a systematic literature review with an empirical practitioner survey to assess perceptions, adoption, and impact of AI-driven tools.
Methodological statement in the paper; survey design covers tool usage, perceived benefits, challenges, and expectations.
Empirical work (experiments and measurements) is needed to quantify how much value interpretive traces add to downstream outputs, how RATs affect platform incentives, and what governance frameworks fairly allocate resulting rents.
Concluding recommendation in the paper stating the research gaps; not an empirical claim but a stated need.
The current presentation of RATs is speculative and illustrative; empirical validation, scalability, and ethical safeguards remain to be developed.
Limitations section of the paper explicitly states the speculative nature and lack of empirical evaluation.
Implementation of RATs requires instrumentation at the browser/platform level or via plugins and must address privacy/consent, storage/ownership, sharing controls, and interoperable trace formats.
Design and implementation considerations enumerated in the paper; this is a requirements statement rather than an empirical claim.
Analytical approaches compatible with RATs include sequence/trajectory mining, network analysis of associations/co-read graphs, embedding/clustering of trajectories, qualitative inspection of reflections, and experimental (A/B or RCT) evaluation of downstream effects.
Methods section of the paper listing suggested analytical techniques; these are proposed methods rather than applied analyses.
The paper is primarily theoretical and prescriptive: it synthesizes literature and proposes a framework and design guidelines rather than reporting large-scale empirical datasets or causal identification of economic outcomes.
Meta-claim about the paper's methods explicitly stated in the Data & Methods summary; based on the paper's methodological description.
Key measurable outcomes to assess Human–AI teams include accuracy/efficiency, robustness to novel cases, decision consistency, trust/misuse rates, training costs, and inequity indicators.
Prescriptive list of metrics offered by the authors as part of the research agenda and evaluation guidance; not empirically derived from a dataset in the paper.
Empirical evaluation strategies for Human–AI teams should include randomized interventions, field trials, lab experiments, phased rollouts (difference-in-differences), and structural models that allow interaction terms between human skill and AI quality.
Methodological recommendation in the paper; suggested study designs rather than implemented analyses.
Research priorities include empirical measurement of task‑level automation rates, firm and industry productivity effects, wage impacts across occupations, and diffusion patterns.
Paper's stated research agenda and identification of measurement gaps; based on methodological critique of current evidence base.
Measuring these productivity gains will be challenging because quality improvements, faster iteration, and creative outputs are harder to price/observe than lines of code.
Methodological argument about measurement difficulty; based on conceptual considerations, not empirical validation.
Measuring AI's economic impact requires new metrics that account for decision-value uplift, reduced tail-risk exposures, and dynamic gains from continuous learning; causal identification will require experiments or staggered rollouts.
Methodological recommendation backed by conceptual discussion of measurement challenges; no implementation of such measurement approaches is reported in the paper.
Performance and evaluation should be measured using forecast accuracy, decision lift/value added, latency, and false positive/negative rates.
Paper-prescribed evaluation metrics; presented as recommended practice rather than derived from empirical testing within the paper.
Core AI techniques for these frameworks include supervised/unsupervised ML, NLP for unstructured text, anomaly detection for control/transaction monitoring, and reinforcement/prescriptive models for recommendations.
Methodological claim listing standard ML/NLP/anomaly-detection techniques and prescriptive approaches; statement of methods rather than an empirical comparison of alternatives.
Next‑gen frameworks use large-scale structured (transactions, ledgers, KPIs) and unstructured sources (reports, news, contracts, call transcripts) to power models.
Descriptive claim listing data types the paper recommends; presented as design input requirements rather than empirically validated data-integration projects.
There is a need for quantitative studies and microdata on firm-level RM practices, AI adoption, and performance outcomes to measure effect sizes and causal pathways.
Stated research gaps and limitations in the review (lack of primary empirical quantification; heterogeneity across contexts).
The review's conclusions are limited by reliance on published literature (potential bias toward successful implementations), lack of primary empirical quantification (no effect sizes), and heterogeneity across organizational contexts limiting direct generalizability.
Explicit limitations stated in the paper summarizing scope and method (qualitative literature review, secondary evidence only).
Heterogeneity in system designs and deployment contexts complicates cross-site comparisons.
Limitations section and observed variation in platform architectures, degrees of automation, and governance across sites reported via descriptive data and interviews.
Non-random selection of institutions limits causal inference and external generalizability of the study's findings.
Study limitations explicitly state non-random site selection and heterogeneous deployments; methodological note that causal claims are constrained.
Estimation/calibration, stability assessment, and global sensitivity methods used: parameters calibrated/estimated on 2016–2023 data; equilibrium located; Jacobian eigenvalues computed for local stability; variance-based global sensitivity analysis performed over parameter space.
Methods section: description of parameter estimation/calibration, equilibrium computation, Jacobian-based stability analysis, and variance-based global sensitivity analysis.
The main empirical conclusions are based on a short annual panel (2016–2023) and a stylized aggregate interaction model; results should be interpreted with caution due to potential omitted variables, aggregation bias, and limited sample size.
Explicit limitations listed in the paper: short time series (eight annual observations), national aggregate data, simplified model structure, no firm/sector heterogeneity, possible endogeneity/measurement issues.
The empirical analysis uses annual, national-level aggregate Chinese series for 2016–2023 as proxies for AI capital, physical capital stock, and labor compensation (wage bill).
Data description in Data & Methods: annual Chinese aggregate series 2016–2023. Implied sample length: 2016–2023 inclusive (8 annual observations); national-level aggregates, no firm-level heterogeneity modeled.
The paper models interactions among AI capital, physical capital, and labor using a Lotka–Volterra (predator–prey type) system adapted to include self-limiting (saturation) terms.
Model specification described in Methods: deterministic Lotka–Volterra system with added self-limitation terms for three stocks (AI capital, physical capital, labor).
Instrumental-variable (IV) estimation is used to address endogeneity of AI adoption and to identify causal effects on employment and wages.
Paper states IV identification strategy applied to the 38-country panel; robustness checks and alternative specifications reported (paper refers to instrument details in full text).
The AI Adoption Index is constructed as a composite measure combining enterprise investment in AI, AI-related patent filings, and workforce/firm surveys on AI use across 38 OECD countries (2019–2025).
Paper's methodological description of the index construction; data sources enumerated as investment, patenting, and survey measures over the panel period.
The paper is entirely theoretical/analytical and does not report an empirical dataset.
Paper methodology section and abstract state primary tool is an analytical economic model; no empirical data or sample sizes are reported.
The same formal framework can be interpreted as a firm-level model where human skill investment maps onto AI/chatbot investment decisions.
Paper provides an alternative interpretation and formally maps agent skill-investment choices into an analogous firm R&D/AI-capital decision problem within the same mathematical framework.
Research and monitoring priorities for economists include task-level analyses of substitutability/complementarity, modeling adoption as a function of regulatory costs and reimbursement incentives, and evaluating long-run welfare and distributional effects.
Explicit research recommendations stated in the narrative review, based on gaps identified in the literature and evolving empirical questions.
Policymakers and payers should consider liability reform, reimbursement models that reward safe human–AI collaboration, funding for independent clinical validation, and measures to prevent market concentration.
Policy recommendations and implications derived from the narrative review's synthesis of regulatory, economic, and implementation challenges.
There is a need for validated administrative and firm-level data on AI adoption, workplace monitoring, and worker outcomes, and for evaluation of policy interventions (mandated impact assessments, transparency requirements, worker representation rules) using randomized or quasi-experimental designs where feasible.
Research and measurement priorities set out in the commentary based on identified gaps; prescriptive recommendation rather than evidence-based finding.
The paper is a policy and legal commentary/synthesis and not an empirical causal study; it does not provide microdata on employment or wage effects but identifies plausible channels and institutional dynamics.
Author-stated methodology and limitations section describing type of study and data sources; explicitly reports lack of primary empirical data.
The federal U.S. approach to AI governance combines export controls for key AI hardware/software with a relatively permissive domestic regulatory stance that relies on executive guidance, voluntary standards, and sector-specific measures rather than comprehensive federal worker protections.
Comparative policy and legal review of federal-level instruments (export control lists, executive orders, agency guidance, proposed/final rules) described in the commentary; no primary empirical data or sample size.
The report has limited primary quantitative impact evaluation and relies on policy texts and secondary sources rather than large-scale empirical measurement of AI’s economic effects.
Explicit limitations section in the report describing methods and data constraints.
Methodological needs for AI-era labor models include dynamic skill taxonomies, high-frequency labor data (job postings, firm-level automation measures), and uncertainty quantification.
Paper's Research & policy recommendations and Methodological needs section (explicit recommendations).
The scenario analysis framework varies economic growth, automation rates, policy interventions, and investment to produce probabilistic demand–supply gaps.
Methods description of scenario analysis components and the variables varied in scenario experiments (explicit in Data & Methods).
Intended users of the Hub include organizations, educational institutions, and policymakers to inform reskilling/education strategies, regional economic policy, and labor-market interventions.
Explicit statement of target users and use cases in the Key Points / Implications sections.
The system produces interpretable outputs for stakeholders: demand–supply trend analysis, geospatial hotspot maps, skill-gap radar charts, and policy simulation dashboards.
Paper's description of outputs and interactive visual analytics (listed output modalities).
The core modeling approach uses probabilistic growth modeling combined with intelligent skill synthesis to estimate future workforce requirements under alternative economic and policy scenarios.
Methods section describing the modeling components: probabilistic growth modeling and intelligent skill synthesis (architectural description).
The platform integrates multiple indicators such as regional economic growth projections, automation velocity, policy intervention strength, investment intensity, and market volatility (macro- and micro-level indicators).
List of input indicators given in the Data & Methods section of the paper (explicit enumeration of macro and micro variables).
Significant empirical gaps remain on long-term impacts (wage trajectories, employment composition, firm-level returns), verification/remediation cost quantification, and public-good risks of insecure code proliferation.
Cross-study synthesis explicitly identifying missing longitudinal and firm-level empirical research in the reviewed literature.
The paper's conclusions are limited by reliance on secondary sources, heterogeneous cross‑study comparisons, limited causal identification of long‑run macro effects, and measurement challenges for AI‑driven intangible capital.
Authors' stated limitations section summarizing the nature of evidence used (qualitative literature review, secondary macro indicators, sectoral examples); this is an explicit self‑reported methodological limitation rather than an external empirical finding.
Methodology used in the paper is a narrative review relying on secondary sources (literature, legal cases, policy reports, empirical perception studies) and conceptual synthesis; no new primary data were collected.
Paper's Data & Methods section explicitly states narrative review and secondary-data analysis.
Important empirical research gaps remain (consumer willingness-to-pay for authenticated vs. synthetic content, labor-displacement elasticities, market concentration dynamics, and cost–benefit evaluations of regulatory options).
Explicit statement of limitations and research needs in the paper, based on the authors' narrative review and absence of primary empirical studies within the paper.
The paper's methodology is a secondary-data, narrative (qualitative) literature review; it contains no original empirical data or primary quantitative analysis.
Explicit methodological statement in the paper describing secondary data analysis and narrative synthesis; absence of primary datasets or statistical analyses.
This paper is conceptual/theoretical and does not conduct primary empirical data collection.
Explicit methodological statement in the paper's Data & Methods section.
The paper is primarily conceptual/architectural and does not present large empirical studies quantifying the phenomenon across firms or repositories.
Explicit methodological statement in the paper describing its use of thought experiments, mechanism reasoning, and illustrative examples rather than empirical datasets.
The paper's conclusions are drawn from a mix of evidence types including literature review, surveys/interviews, case studies, usage-log or publication-metric analyses, and controlled experiments—although the abstract does not specify which of these were actually used or the sample sizes.
Explicitly noted in the Data & Methods summary as the likely underlying evidence types; the paper's abstract itself does not document original data or detailed methods.
There is a lack of large‑scale causal evidence on generative AI’s effects; the paper recommends RCTs, difference‑in‑differences, matched employer–employee panels, and longitudinal studies to fill empirical gaps.
Methodological critique and research agenda provided in the review; observation based on the authors' survey of the literature.