Evidence (7953 claims)
| Topic | Claims |
|---|---|
| Adoption | 5539 |
| Productivity | 4793 |
| Governance | 4333 |
| Human-AI Collaboration | 3326 |
| Labor Markets | 2657 |
| Innovation | 2510 |
| Org Design | 2469 |
| Skills & Training | 2017 |
| Inequality | 1378 |
Evidence Matrix
Claim counts by outcome category and direction of finding. A dash (—) marks zero claims; row totals can exceed the sum of the four listed directions, which suggests some claims carry direction labels outside these four categories.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 402 | 112 | 67 | 480 | 1076 |
| Governance & Regulation | 402 | 192 | 122 | 62 | 790 |
| Research Productivity | 249 | 98 | 34 | 311 | 697 |
| Organizational Efficiency | 395 | 95 | 70 | 40 | 603 |
| Technology Adoption Rate | 321 | 126 | 73 | 39 | 564 |
| Firm Productivity | 306 | 39 | 70 | 12 | 432 |
| Output Quality | 256 | 66 | 25 | 28 | 375 |
| AI Safety & Ethics | 116 | 177 | 44 | 24 | 363 |
| Market Structure | 107 | 128 | 85 | 14 | 339 |
| Decision Quality | 177 | 76 | 38 | 20 | 315 |
| Fiscal & Macroeconomic | 89 | 58 | 33 | 22 | 209 |
| Employment Level | 77 | 34 | 80 | 9 | 202 |
| Skill Acquisition | 92 | 33 | 40 | 9 | 174 |
| Innovation Output | 120 | 12 | 23 | 12 | 168 |
| Firm Revenue | 98 | 34 | 22 | — | 154 |
| Consumer Welfare | 73 | 31 | 37 | 7 | 148 |
| Task Allocation | 84 | 16 | 33 | 7 | 140 |
| Inequality Measures | 25 | 77 | 32 | 5 | 139 |
| Regulatory Compliance | 54 | 63 | 13 | 3 | 133 |
| Error Rate | 44 | 51 | 6 | — | 101 |
| Task Completion Time | 88 | 5 | 4 | 3 | 100 |
| Training Effectiveness | 58 | 12 | 12 | 16 | 99 |
| Worker Satisfaction | 47 | 32 | 11 | 7 | 97 |
| Wages & Compensation | 53 | 15 | 20 | 5 | 93 |
| Team Performance | 47 | 12 | 15 | 7 | 82 |
| Automation Exposure | 24 | 22 | 9 | 6 | 62 |
| Job Displacement | 6 | 38 | 13 | — | 57 |
| Hiring & Recruitment | 41 | 4 | 6 | 3 | 54 |
| Developer Productivity | 34 | 4 | 3 | 1 | 42 |
| Social Protection | 22 | 10 | 6 | 2 | 40 |
| Creative Output | 16 | 7 | 5 | 1 | 29 |
| Labor Share of Income | 12 | 5 | 9 | — | 26 |
| Skill Obsolescence | 3 | 20 | 2 | — | 25 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
Deterministic automated verifiers provide objective pass/fail checks for task success.
Methods section: verifiers are deterministic and automated, enabling objective evaluation of whether an agent's trajectory accomplished the task.
Scale of experiments: seven agent–model configurations and 7,308 execution trajectories were used to compute pass rates and deltas.
Reported experimental scale in Methods: 7 agent–model configurations and a total of 7,308 agent execution traces were collected and analyzed across tasks and conditions.
Each task was evaluated under three conditions: (1) no Skills, (2) curated (human-authored) Skills, and (3) self-authored (model-generated) Skills.
Experimental protocol described in Methods: three-arm evaluation per task across the SkillsBench benchmark.
SkillsBench benchmark: evaluates 86 tasks spanning 11 domains with deterministic, automated verifiers.
Dataset and benchmark description in the paper: SkillsBench contains 86 tasks across 11 domains and uses deterministic pass/fail verifiers for objective evaluation.
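As a concrete illustration of how pass rates and deltas across the three arms could be computed from verifier verdicts, here is a minimal sketch; the record layout, field names, and condition labels are hypothetical, not SkillsBench's actual schema.

```python
from collections import defaultdict

# Hypothetical trajectory records: (task_id, condition, passed), where `passed`
# is the deterministic verifier's pass/fail verdict for one execution trajectory.
trajectories = [
    ("task_01", "no_skills", False),
    ("task_01", "curated_skills", True),
    ("task_01", "self_authored_skills", True),
    # ... 7,308 such records in the reported experiments
]

# Aggregate verifier verdicts into per-condition pass rates.
totals, passes = defaultdict(int), defaultdict(int)
for _task, condition, passed in trajectories:
    totals[condition] += 1
    passes[condition] += passed

rates = {c: passes[c] / totals[c] for c in totals}
baseline = rates.get("no_skills", 0.0)
for condition, rate in rates.items():
    # Delta is measured against the no-Skills arm.
    print(f"{condition}: pass rate {rate:.1%} (delta {rate - baseline:+.1%})")
```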
Research should prioritize dynamic, task-based models that include transitional frictions, heterogeneous agents, and sectoral structure to better measure AI exposure and impacts.
Methodological recommendation grounded in the paper's theoretical critique of static occupation-level automation metrics and noted empirical gaps.
Timing uncertainty and measurement challenges make forecasting the pace and scale of AI-induced employment change inherently uncertain.
Methodological limitations section noting uncertainty in AI adoption speed and difficulties mapping capabilities to tasks and predicting new occupation emergence.
Research agenda: there is a need for causal studies on AI’s impact on accounting labor demand and firm performance, analyses of distributional effects across firm sizes and industries, and evaluation of regulatory frameworks for reliable, interpretable AI in financial reporting.
Author-stated research priorities drawn from gaps identified in the literature review; not an empirical finding.
Policy implications include workforce retraining, standards for AI auditability and transparency, and regulation balancing innovation and controls (privacy, fraud prevention).
Policy recommendations based on identified risks and barriers discussed in the paper rather than empirical policy evaluation.
For stronger causal evidence, recommended empirical methods include difference-in-differences on adopting firms vs. controls, matched samples, and randomized pilots for particular tools, supplemented by qualitative interviews.
Methodological recommendations stated in the paper (not an empirical finding); no implementation/sample reported in the abstract.
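A hedged sketch of the recommended difference-in-differences comparison, using statsmodels; the toy firm panel and column names (`adopter`, `post`, `outcome`) are illustrative assumptions, not data from the paper.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative firm-period panel: `adopter` marks firms that adopt the tool,
# `post` marks periods after adoption, `outcome` is the performance measure.
df = pd.DataFrame({
    "firm":    ["a", "a", "b", "b", "c", "c", "d", "d"],
    "post":    [0, 1, 0, 1, 0, 1, 0, 1],
    "adopter": [1, 1, 1, 1, 0, 0, 0, 0],
    "outcome": [1.0, 1.8, 0.9, 1.6, 1.1, 1.2, 1.0, 1.1],
})

# The adopter:post interaction coefficient is the DiD estimate of the
# adoption effect, valid under the parallel-trends assumption.
model = smf.ols("outcome ~ adopter * post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["firm"]}
)
print(model.summary().tables[1])
```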
Actionable research priorities include running larger-scale field trials linking game use to observed land-use and economic outcomes, developing validation protocols for game-backed models against empirical on-farm data, studying heterogeneity of impacts, and designing incentive mechanisms that leverage game-demonstrated profitability co-benefits.
Synthesis-driven recommendations based on identified evidence gaps—specifically the predominance of small-scale/qualitative studies and lack of long-term/causal evidence.
Rigorous economic evaluation (RCTs, quasi-experiments) is needed to quantify how game-enhanced DSTs affect investment, land-use choices, emissions outcomes, and farm incomes.
Chapter recommendation grounded in observed gaps: the literature lacks sufficiently rigorous causal impact evaluations; current evidence is largely qualitative or observational.
Personal data are nonrivalrous and highly replicable, so selling data does not follow ordinary scarcity logic.
Analytic/property claim about the economic characteristics of digital information; supported by conceptual definitions and common technical facts about data replication; no empirical sampling needed.
Empirical approach measured and compared expectation formation, innovation responses, and pipeline outcomes across local exposure to closures and across distinct entrepreneurial identity groups.
Methodological description: survey-based, cross-country quantitative approach using measures of local exposure (nearby closures), identity classification (family/purpose-driven vs. wealth-driven), and outcomes (expectations, perceived impediments, self-reported innovation, pipeline transitions) in a sample >27,000.
The study analyzes a cross-country sample of more than 27,000 entrepreneurs across 43 countries (survey-based, comparative).
Descriptive claim about the dataset used in the paper: survey-based sample size >27,000 spanning 43 countries as reported in Data & Methods.
The paper's evidence is policy‑oriented, qualitative, and analytical; it does not report causal estimates from new field data, instead producing testable propositions and an empirical agenda.
Explicit methods statement in the paper: structured desk review, corridor process mapping, governance gap analysis; absence of field experiments or causal quantitative analysis.
Framing claim: Ideological contests typically produce opposing normative visions (e.g., collectivized command economies vs. market democracies), which makes it puzzling that Western economic theories emerged portraying markets and democracy as dysfunctional.
Framing and motivation provided in the paper's introduction and background sections; synthesis of conventional expectations about ideological contest outcomes.
The paper uses a qualitative case‑study approach (archival and textual analysis, contextualization, interpretive synthesis) rather than attempting exhaustive quantitative causal identification.
Explicit methods description in the paper: in‑depth historical/institutional examination, archival/textual work, and interpretive synthesis.
The empirical strategy uses baseline panel regressions with standard controls (e.g., firm size, performance, leverage) and fixed effects to estimate the AI → pay relationship.
Methods section describing regression specifications including firm controls and fixed effects applied to the A-share firm panel.
Data consist of a panel of Chinese A-share listed companies covering 2007–2023.
Data description in the paper specifying the sample period and population (A-share listed firms, 2007–2023).
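A minimal sketch of the baseline fixed-effects specification described above, with firm and year fixed effects and firm-clustered standard errors; the synthetic panel and variable names are placeholders for the paper's actual measures.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for a firm-year panel over 2007-2023 (placeholder data).
rng = np.random.default_rng(0)
panel = pd.DataFrame(
    [(f, y) for f in range(50) for y in range(2007, 2024)],
    columns=["firm", "year"],
)
panel["ai"] = rng.random(len(panel))             # AI application intensity
panel["size"] = rng.normal(10, 1, len(panel))    # control: log assets
panel["leverage"] = rng.random(len(panel))       # control
panel["pay"] = 0.3 * panel["ai"] + 0.1 * panel["size"] + rng.normal(0, 1, len(panel))

# Two-way fixed effects absorb time-invariant firm traits and common year
# shocks; standard errors are clustered by firm.
fe = smf.ols("pay ~ ai + size + leverage + C(firm) + C(year)", data=panel).fit(
    cov_type="cluster", cov_kwds={"groups": panel["firm"]}
)
print(fe.params["ai"])  # the estimated AI -> pay coefficient
```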
The firm-level AI application indicator is constructed via textual analysis of corporate disclosures (e.g., filings/annual reports) to capture AI application intensity.
Methodological description in the paper describing text-based construction of an AI application indicator from corporate disclosures for listed firms in the 2007–2023 sample.
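A hedged sketch of one common way such a text-based indicator is built (keyword frequency in disclosure text); the term dictionary and normalization here are illustrative assumptions, not the paper's actual lexicon.

```python
import re

# Illustrative AI-term dictionary; the paper's actual word list is not reported here.
AI_TERMS = [
    "artificial intelligence", "machine learning", "deep learning",
    "neural network", "natural language processing",
]

def ai_intensity(disclosure_text: str) -> float:
    """AI-term frequency per 1,000 words of a corporate disclosure."""
    text = disclosure_text.lower()
    hits = sum(len(re.findall(re.escape(term), text)) for term in AI_TERMS)
    n_words = max(len(text.split()), 1)
    return 1000 * hits / n_words

sample = "The firm expanded its machine learning platform and deep learning research."
print(ai_intensity(sample))
```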
Calibration via Method of Simulated Moments (MSM) matches six empirical moments to discipline mechanism magnitudes.
Model calibration procedure reported in the paper: MSM matching six chosen empirical moments that summarize key pre/post-AI patterns (paper states six moments were used).
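A schematic of method-of-simulated-moments calibration: choose parameters that minimize a weighted distance between simulated and empirical moments. The toy simulator and the six target moments below are placeholders, not the paper's model.

```python
import numpy as np
from scipy.optimize import minimize

# Six illustrative empirical target moments (placeholders, not the paper's).
empirical_moments = np.array([2.0, 5.0, 1.5, 0.6, 0.2, 1.0])
W = np.eye(6)  # identity weighting matrix, for simplicity

def simulate_moments(theta, n=20_000, seed=1):
    """Toy lognormal simulator; a fixed seed (common random numbers) keeps
    the objective smooth across evaluations."""
    mu, sigma = theta
    rng = np.random.default_rng(seed)
    x = np.exp(mu + sigma * rng.standard_normal(n))
    return np.array([x.mean(), x.var(), np.median(x),
                     (x > 1).mean(), (x > 3).mean(), np.log1p(x).mean()])

def msm_objective(theta):
    g = simulate_moments(theta) - empirical_moments  # moment gaps
    return g @ W @ g                                 # quadratic-form distance

result = minimize(msm_objective, x0=np.array([0.0, 0.5]), method="Nelder-Mead")
print("calibrated (mu, sigma):", result.x)
```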
The empirical approach tests for common long-run relationships across patenting series and identifies structural breaks concentrated after 2010.
Description of empirical strategy: time-series econometric analysis of patent filing series (1980–2019) including tests for common long-run relationships (cointegration) and structural break detection. The paper reports results of these tests (presence/absence of common trends and timing of breaks).
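A hedged sketch of the two test families described: an Engle–Granger cointegration test via statsmodels and change-point detection via the ruptures package. The synthetic series, injected break, and penalty value are illustrative only.

```python
import numpy as np
import ruptures as rpt
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(2)
years = np.arange(1980, 2020)

# Two illustrative patent-filing series sharing a common stochastic trend.
trend = np.cumsum(rng.normal(0.5, 1.0, len(years)))
series_a = trend + rng.normal(0, 0.5, len(years))
series_b = 0.8 * trend + rng.normal(0, 0.5, len(years))

# Engle-Granger test for a common long-run (cointegrating) relationship.
stat, pvalue, _ = coint(series_a, series_b)
print(f"Engle-Granger cointegration p-value: {pvalue:.3f}")

# Change-point detection (PELT) to locate structural breaks; a level shift
# is injected after 2010 so the demo has a break to find.
shifted = series_a + 8.0 * (years >= 2010)
breaks = rpt.Pelt(model="rbf").fit(shifted.reshape(-1, 1)).predict(pen=5)
print("estimated break years:", [int(years[i]) for i in breaks[:-1]])
```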
The paper highlights governance risks requiring transparency about LLM-derived mappings, mitigation of model biases, privacy-preserving data practices, and careful communication of uncertainty to avoid overconfident policy recommendations.
Explicit discussion of risks and governance considerations in the paper; this is an acknowledgment rather than an empirical claim. No implementation or audit evidence is provided.
Backtesting the architecture on historical automation waves and recent AI introductions would validate model design and calibration.
Paper explicitly proposes backtesting and holdout validation using historical automation episodes and recent AI adoption events; does not report completed backtests or empirical sample sizes.
Empirical validation of the integrated Kondratieff–Schumpeter–Mandel framework requires firm-level adoption and profitability data, sectoral investment series, and cross-country comparisons using panel methods and identification strategies (e.g., diff-in-diff, IV).
Methods/limitations section recommendation (explicitly states no single micro-econometric identification strategy was reported and outlines required data/methods).
The three frameworks (Kondratieff, Schumpeter, Mandel) are complementary: Kondratieff frames periodicity, Schumpeter provides micro-mechanisms of innovation-driven change, and Mandel foregrounds socio-political constraints and distributional outcomes.
Conceptual integration and comparative theoretical analysis (qualitative synthesis).
Kondratieff's framework is useful for identifying broad periodicities (recurring phases of expansion and stagnation) in capitalist development but is less specific about microeconomic mechanisms.
Theoretical review of Kondratieff literature and conceptual assessment (qualitative).
No new laboratory measurements or datasets are reported in the paper; the approach is methodological and conceptual rather than empirical.
Methods section and explicit statements within the paper noting absence of new data; verifiable by reading the paper.
These operators are presented as conceptual/theoretical bridges rather than immediately quantifiable laboratory units.
Explicit methodological statement in the paper emphasizing interpretive/theoretical intent; no empirical operationalization reported.
The literature is heterogeneous (different LLM families/sizes, prompting techniques, participant persona modeling, environments, and evaluation protocols), which impedes general conclusions about when LLMs reliably mimic humans.
Review notes wide variation across study designs and methods in the 182 studies; inability to produce a single performance estimate motivated unified conceptual framing.
Paired-game design (baseline and matched decoy-enabled game per interaction) enables direct, causal measurement of deception impact.
Methodological design described in the paper: each interaction modeled as a paired-game enabling direct comparison of equilibrium outcomes (theoretical/method section).
Equilibrium outcomes are linked to an information-theoretic uncertainty construct (entropy-like) that captures residual attacker uncertainty after observation.
Theoretical construction and formal connection drawn in the paper between equilibrium utilities and an entropy-style measure (analytical derivation).
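One standard way to formalize such a residual-uncertainty measure (an assumption on our part; the paper's exact construct may differ) is the conditional Shannon entropy of the attacker's posterior over the true system configuration θ after observing signal o:

```latex
H(\theta \mid o) = -\sum_{\theta} p(\theta \mid o)\,\log p(\theta \mid o),
\qquad
\bar{H} = \mathbb{E}_{o}\!\left[ H(\theta \mid o) \right].
```

Deception benefits the defender insofar as it keeps this posterior entropy high after the attacker's observation.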
Defender-optimal deception allocations are characterized analytically (closed-form/structural characterization of optimal resource allocation under constraints).
Analytical derivation/proofs in the paper producing defender-optimal strategy characterizations under resource/budget constraints.
The paper introduces two operational metrics: (1) value of deception (change in defender equilibrium utility attributable to deception relative to baseline) and (2) price of transparency (marginal loss in deception value induced by increased observability).
Formal definitions and mathematical expressions in the theoretical model section of the paper (analytical definitions/proofs).
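In notation consistent with these verbal definitions (the symbols here are illustrative, not taken from the paper), the two metrics can be written as:

```latex
\mathrm{VoD} = U_D^{\mathrm{dec}}\!\left(\sigma^{*}_{\mathrm{dec}}\right)
             - U_D^{\mathrm{base}}\!\left(\sigma^{*}_{\mathrm{base}}\right),
\qquad
\mathrm{PoT}(\tau) = -\frac{\partial\,\mathrm{VoD}(\tau)}{\partial \tau},
```

where U_D is the defender's equilibrium utility in each game of the pair, σ* the corresponding equilibrium profile, and τ a parameter indexing attacker observability.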
The paper provides a principled, game-theoretic framework to measure and compare the operational value of cyber deception relative to a matched non-deceptive baseline.
Analytical/modeling contribution: paired strategic-game construction (baseline vs deception) with formal definitions and equilibrium analysis presented in the paper (theoretical derivation/proofs).
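A toy numerical instance of the paired-game comparison (not the paper's model), using the nashpy package for equilibrium computation; the 2×2 payoff matrices are invented for illustration.

```python
import numpy as np
import nashpy as nash

# Invented defender payoff matrices (zero-sum): rows = defender, columns = attacker.
base = np.array([[-3.0, -1.0],
                 [-2.0, -4.0]])   # baseline game, no deception
dec = np.array([[-1.5, -1.0],
                [-2.0, -2.0]])    # matched game with a decoy deployed

def defender_equilibrium_value(payoffs):
    """Defender's expected utility at a Nash equilibrium of the zero-sum game."""
    game = nash.Game(payoffs)
    sigma_d, sigma_a = next(game.support_enumeration())
    return float(sigma_d @ payoffs @ sigma_a)

# Value of deception: change in defender equilibrium utility vs. the baseline.
vod = defender_equilibrium_value(dec) - defender_equilibrium_value(base)
print(f"value of deception: {vod:+.2f}")  # +1.00 in this toy instance
```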
Evaluations reporting outcomes predominantly relied on learner surveys, knowledge/skill tests, or self‑reported behavior change measures.
Methods of evaluation extracted from the included studies: most used surveys, tests, or self-report measures to assess Kirkpatrick‑Barr levels 1–3.
The study used a cross-sectional quantitative survey (purposive sampling) of pharmaceutical-sector employees in Karnataka, India (N = 350) and analyzed relationships using PLS-SEM (SmartPLS 4.0).
Study design and methods as reported in the paper summary: cross-sectional survey, purposive sampling, N = 350, analysis via Partial Least Squares Structural Equation Modeling (SmartPLS 4.0).
Policy recommendations include: invest in open metadata standards; fund pilot programs to evaluate ROI (earnings, placement, employer satisfaction); require model governance and periodic external audits for AI-assisted curriculum tools; and support smaller providers via shared infrastructure or accreditation hubs.
Explicit policy recommendations in paper (prescriptive).
Careful attention is needed to validity/reliability of assessments and to selection bias in employment outcome measurement.
Paper's methodological caveat (prescriptive); no empirical bias analysis provided.
Suggested evaluation metrics include placement rates, wage premiums, competency attainment, compliance scores, cost per qualification, and update latency.
Paper's recommended evaluation metrics (prescriptive).
Implementation requires integration with information systems for documentation, versioning, metadata, and audit trails, and benefits from continuous monitoring dashboards.
Paper's technical implementation recommendations (prescriptive).
Recommended analysis methods are qualitative (semi-structured interviews, focus groups, document review) and quantitative (surveys, competency mapping, statistical analysis of outcomes), plus systematic audit methods including traceability checks.
Paper's methods section (methodological specification).
Data inputs for the framework should include competency taxonomies, labor-market signals, regulatory requirements, learner assessment results, and stakeholder interviews.
Paper's data-input specification (descriptive).
Management principles emphasized are transparency, traceability of outcomes, IT integration for documentation, and continuous monitoring/evaluation.
Explicit management principles in paper (prescriptive).
Research and audit should emphasize validity, reliability, and compliance using mixed methods (qualitative interviews/focus groups; quantitative surveys/statistics) and systematic curriculum audits.
Recommended research & audit approach in paper (methodological guidance).
Tools recommended include logigrams (visual decision/compliance flows) and algorigrams (algorithmic step-flows for planning, assessment, and audit).
Tool definitions and recommendations in paper (descriptive).
Core components of the framework are inputs (learner needs, industry requirements, regulatory standards), processes (curriculum mapping, competency alignment, career assessment), and outputs (structured lesson plans, compliance-ready frameworks, career-path documentation).
Framework component list provided in paper (descriptive).
Scope of the program includes curriculum design, organizational management, career alignment, and audit/compliance processes.
Explicit scope statement in paper (descriptive).
The framework foregrounds logical modelling (logigrams, algorigrams) and mixed-methods data analysis to support design, auditability, and alignment with industry and regulatory standards.
Paper's methodological design and tool recommendations (conceptual). No empirical implementation data reported.
The program offers a comprehensive curriculum-engineering framework linking organizational orientation, management systems, lesson planning, and career assessment into traceable, compliance-ready curriculum products.
Paper's program description and framework specification (conceptual); no empirical evaluation or sample size reported.