Evidence (1902 claims)

Topic categories and claim counts:
- Adoption: 5126
- Productivity: 4409
- Governance: 4049
- Human-AI Collaboration: 2954
- Labor Markets: 2432
- Org Design: 2273
- Innovation: 2215
- Skills & Training: 1902
- Inequality: 1286
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 369 | 105 | 58 | 432 | 972 |
| Governance & Regulation | 365 | 171 | 113 | 54 | 713 |
| Research Productivity | 229 | 95 | 33 | 294 | 655 |
| Organizational Efficiency | 354 | 82 | 58 | 34 | 531 |
| Technology Adoption Rate | 277 | 115 | 63 | 27 | 486 |
| Firm Productivity | 273 | 33 | 68 | 10 | 389 |
| AI Safety & Ethics | 112 | 177 | 43 | 24 | 358 |
| Output Quality | 228 | 61 | 23 | 25 | 337 |
| Market Structure | 105 | 118 | 81 | 14 | 323 |
| Decision Quality | 154 | 68 | 33 | 17 | 275 |
| Employment Level | 68 | 32 | 74 | 8 | 184 |
| Fiscal & Macroeconomic | 74 | 52 | 32 | 21 | 183 |
| Skill Acquisition | 85 | 31 | 38 | 9 | 163 |
| Firm Revenue | 96 | 30 | 22 | — | 148 |
| Innovation Output | 100 | 11 | 20 | 11 | 143 |
| Consumer Welfare | 66 | 29 | 35 | 7 | 137 |
| Regulatory Compliance | 51 | 61 | 13 | 3 | 128 |
| Inequality Measures | 24 | 66 | 31 | 4 | 125 |
| Task Allocation | 64 | 6 | 28 | 6 | 104 |
| Error Rate | 42 | 47 | 6 | — | 95 |
| Training Effectiveness | 55 | 12 | 10 | 16 | 93 |
| Worker Satisfaction | 42 | 32 | 11 | 6 | 91 |
| Task Completion Time | 71 | 5 | 3 | 1 | 80 |
| Wages & Compensation | 38 | 13 | 19 | 4 | 74 |
| Team Performance | 41 | 8 | 15 | 7 | 72 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 17 | 15 | 9 | 5 | 46 |
| Job Displacement | 5 | 28 | 12 | — | 45 |
| Social Protection | 18 | 8 | 6 | 1 | 33 |
| Developer Productivity | 25 | 1 | 2 | 1 | 29 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Skill Obsolescence | 3 | 18 | 2 | — | 23 |
| Labor Share of Income | 7 | 4 | 9 | — | 20 |
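For readers who want to reproduce this kind of breakdown from their own claim records, a minimal Python sketch is shown below. The record schema, field names, and example rows are illustrative assumptions, not the project's actual data model.

```python
from collections import Counter

# Illustrative claim records only; the real corpus schema is an assumption here.
claims = [
    {"outcome": "Firm Productivity", "direction": "Positive"},
    {"outcome": "Firm Productivity", "direction": "Mixed"},
    {"outcome": "Error Rate", "direction": "Negative"},
]

DIRECTIONS = ["Positive", "Negative", "Mixed", "Null"]

# Tally (outcome, direction) pairs, then emit one markdown row per outcome,
# matching the shape of the matrix above.
counts = Counter((c["outcome"], c["direction"]) for c in claims)
outcomes = sorted({c["outcome"] for c in claims})

print("| Outcome | " + " | ".join(DIRECTIONS) + " | Total |")
print("|---" * (len(DIRECTIONS) + 2) + "|")
for outcome in outcomes:
    row = [counts.get((outcome, d), 0) for d in DIRECTIONS]
    print("| " + outcome + " | " + " | ".join(str(n) for n in row) + f" | {sum(row)} |")
```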
Active filter: Skills & Training
Platforms emphasize local-language expertise and culturally grounded sourcing as a strategy to improve verification and credibility.
Observed practices and platform guidelines derived from document analysis and staff interviews describing the use of local-language expertise and sourcing.
Generative AI functions as a socio‑technical intermediary that facilitates interpretation, coordination, and decision support rather than merely automating discrete tasks.
Thematic analysis and co‑word linkage between terms related to interpretative work, coordination, and decision‑support and technical GenAI terms within the corpus.
The literature indicates a managerial shift away from hierarchical command‑and‑control toward guide‑and‑collaborate paradigms, where managers curate, guide, and coordinate AI‑augmented teams rather than micro‑manage tasks.
Synthesis of themes from the 212‑paper corpus (co‑word and thematic analyses) showing recurrent managerial/behavioural concepts such as autonomy, coordination, and decision‑support tied to GenAI discussions.
Higher educational attainment is positively associated with greater willingness to keep working before retirement.
Multivariate regression analysis of the cross-sectional survey (n=889) using education level as a key explanatory variable.
Male gender is positively associated with higher willingness to remain employed before retirement.
Multivariate regression on the survey sample (n=889) including gender as an explanatory variable, controlling for demographic and socioeconomic covariates.
Design and policy interventions that encourage active human contributions (e.g., draft-first workflows, co-creation interfaces, training) can help preserve worker agency and mitigate psychological costs.
Recommendation based on experimental evidence that active collaboration preserved psychological outcomes relative to passive use; presented as a policy/design prescription rather than a directly tested intervention at scale.
A complementary real-world survey (N = 270) across diverse tasks reproduced the experimental pattern, suggesting external validity beyond the lab writing tasks.
Cross-sectional survey of N = 270 respondents reporting on their AI use across multiple task types; reported patterns consistent with the experiment (passive use associated with lower efficacy/ownership/meaningfulness; active collaborative use did not).
Adoption of AI feedback could lower marginal costs of delivering high-quality feedback and change fixed vs. variable cost structures for instruction delivery.
Economic implication discussed by workshop participants (50 scholars) as a theoretical possibility; no quantitative cost estimates in the report.
Generative AI can enable new feedback modalities (text, hints, worked examples, formative prompts) adaptable to content and learner needs.
Thematic conclusions from the interdisciplinary meeting of 50 scholars, describing possible modality generation capabilities of current generative models; no empirical modality-comparison data provided.
Immediate AI-generated feedback may sustain learner momentum and improve formative assessment cycles (timeliness & engagement).
Expert-opinion synthesis from structured workshop (50 scholars) identifying timely feedback as a potential pedagogical benefit; no empirical trials reported.
Large language and generative models can tailor explanations, scaffolding, and practice to learners' current states and preferences (personalization).
Workshop expert consensus and thematic synthesis from 50 interdisciplinary scholars; illustrative examples discussed rather than empirical evaluation.
Generative AI can produce real-time, individualized feedback at scale, potentially reducing per-student feedback costs and increasing feedback frequency.
Synthesis of expert perspectives from an interdisciplinary workshop of 50 scholars (educational psychology, computer science, learning sciences); qualitative small-group activities and thematic extraction. No primary experimental or quantitative cost data presented.
Agents learn from one another without curricula (agent-to-agent learning occurs organically in the ecosystem).
Naturalistic daily observations across platforms noting peer-to-peer agent interactions and apparent transfer of behaviors/knowledge; no controlled tests of learning or counterfactuals.
Agents form idea cascades and quality hierarchies without any centrally designed curriculum or intervention (emergent peer learning and spontaneous knowledge diffusion).
Observed interaction patterns across platforms showing cascades, hierarchies, and diffusion among agents in the qualitative dataset; documentation is comparative and observational rather than experimental.
A rapidly growing ecosystem of autonomous AI agents is producing organic, multi-agent learning dynamics that go beyond dyadic human–AI interactions.
Naturalistic, qualitative daily observations over one month across multiple agent platforms (reported platforms: Moltbook, The Colony, 4claw); reported coverage of more than 167,000 agents interacting as peers; comparative observational documentation rather than controlled experimentation.
Open-source orchestration and evaluation harnesses plus a self-contained evaluation pipeline improve reproducibility for the Speedrunning Track.
Paper claims and documents the release of orchestration and evaluation code and describes the self-contained pipeline designed for deterministic reproducible evaluation.
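The paper's harness is not reproduced here, but the reproducibility idea it describes (a self-contained, deterministic evaluation run) can be sketched as follows. The `run_eval` function, config fields, and scoring stub are illustrative assumptions rather than the released code.

```python
import hashlib
import json
import random

def run_eval(config: dict) -> dict:
    """Self-contained, deterministic evaluation stub: same config -> same result.

    The scoring logic is a placeholder; only the seed-pinning and
    config-hashing pattern is the point of this sketch.
    """
    config_bytes = json.dumps(config, sort_keys=True).encode()
    config_hash = hashlib.sha256(config_bytes).hexdigest()
    rng = random.Random(int(config_hash, 16) % (2**32))  # seed derived from config

    score = rng.random()  # stand-in for the real task metric
    return {"config_hash": config_hash[:12], "score": round(score, 4)}

# Re-running with the same config reproduces the same output exactly.
print(run_eval({"track": "speedrunning", "model": "baseline", "max_steps": 100}))
print(run_eval({"track": "speedrunning", "model": "baseline", "max_steps": 100}))
```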
Version 1.0 marks integration into operational workflows and establishes a base for future capabilities.
Authors report that v1.0 has been used in verification and mask-refinement loops for real datasets (MeerKAT, ASKAP, APERTIF); no detailed deployment metrics provided.
Immersive inspection tools like iDaVIE are complements to automated ML pipelines by helping generate higher-quality labels and curated training examples.
Paper argues conceptual complementarity and cites iDaVIE's use for mask refinement and curated subcube export; no experimental comparison of label quality or downstream ML performance provided.
iDaVIE accelerates inspection-driven parts of astronomy workflows (e.g., mask refinement, verification).
Reported use cases where iDaVIE was used to refine masks and verify sources in real datasets; no measured time-per-task or throughput statistics provided.
iDaVIE has already been integrated into real pipelines (MeerKAT, ASKAP, APERTIF) and used to improve quality control, refine detection masks, and identify new sources.
Author statement of integration and use cases citing verification of HI data cubes from MeerKAT, ASKAP and APERTIF; no quantitative deployment counts or independent validation provided in the text.
The taxonomy and measurement approach provide operational metrics to quantify empathic communication for economic analyses (productivity, customer satisfaction, retention).
Authors propose that their data-driven taxonomy and automated/coding measures can be used as metrics; the paper demonstrates derivation and use in trial outcomes but does not present direct economic outcome measurements.
LLM-generated responses frequently score as more empathic than human-written responses in blinded evaluations.
Blinded evaluations comparing LLM-generated replies to human-written replies using recipient/judge ratings of perceived empathy, as described in the paper's blinded tests. Exact blinded-test sample sizes are not specified in the summary; the comparison is derived from the study's evaluation procedures.
Employers are increasingly demanding digital literacy, basic data competencies, and stronger communication and interpersonal skills.
Employer survey analysis tracking changes in required skills; descriptive summary of survey frequencies and employer-reported skill priorities. Survey sample size and representativeness not specified in summary.
Some occupations experience efficiency and productivity gains where AI complements tasks, implying complementarity effects for those jobs.
Qualitative case studies of firms and employer survey reports documenting productivity/efficiency improvements in certain roles following AI adoption; descriptive analysis of sectoral/occupational outcomes. Quantitative magnitude not specified.
Policy implication: prioritize large-scale, targeted reskilling and lifelong learning programs to enable workforce adaptability and capture AI complementarity gains.
Policy recommendations derived from the paper's findings (association between AI adoption and skill shifts, heterogeneous sectoral impacts) and the literature synthesis that links reskilling interventions to better labor outcomes; recommendation is prescriptive rather than empirically tested within the study.
The paper provides empirical support for the complementarity hypothesis: AI tends to reconfigure jobs and create hybrid roles rather than eliminate employment wholesale.
Convergence of simulated sectoral employment patterns (some sectors showing net gains and hybrid-role growth), the strong correlation between AI adoption and skill shifts (r = 0.71), and corroborating studies from the literature synthesis emphasizing augmentation and hybridization mechanisms.
Institutional reskilling programs and governance frameworks markedly moderate labor-market outcomes: better frameworks correlate with more complementarities and lower net job loss.
Integration of literature-derived mechanisms with simulated empirical patterns; paper reports correlations/moderation-style comparisons across simulated sector-year cases incorporating policy/institutional variables (described in methods), supported by studies in the systematic review linking policy interventions to labor outcomes.
Healthcare and IT Services experienced net employment gains consistent with AI complementarity (augmented tasks and creation of new hybrid roles).
Simulated sectoral employment trends and net-change metrics for Healthcare and IT Services (2020–2024) presented in the paper, supported by literature synthesis examples showing human–AI complementarities in these sectors.
The largest rises in hybrid jobs occurred in IT Services and Healthcare.
Sectoral decomposition of hybrid job share trends in the simulated dataset across the seven industries (2020–2024) and supporting qualitative/quantitative findings from the literature synthesis focused on IT Services and Healthcare.
Hybrid human–AI jobs increased substantially across all seven analyzed sectors between 2020 and 2024.
Descriptive trend analysis of the simulated dataset's hybrid job share metric (fraction of roles reclassified as human–AI hybrid) for the seven industries over 2020–2024, combined with corroborating examples from the literature synthesis (selected ACM/IEEE/Springer studies 2020–2024).
Responsible implementation requires legal/liability clarity, continuous monitoring for performance drift and distributional shifts, usable explanations, baseline AI literacy for clinicians, and co-design with frontline radiology teams.
Synthesis of governance literature, implementation best-practice reports, and recommendations from usability and deployment studies.
Triage and automation can shorten time-to-diagnosis, increase throughput, and reduce time spent on repetitive tasks.
Observational deployment reports and simulation studies that measured time-to-report or throughput improvements in pilot settings (evidence heterogeneous and context-dependent).
Integration points for AI across the imaging pathway include acquisition (image quality/protocol selection), triage (prioritization), interpretation/reporting (detection, quantification, report pre-population), and post-interpretation (teaching, QA, model improvement loops).
Descriptive synthesis of reported implementations and proposed use cases in the literature and deployment reports across multiple institutions.
Human-AI collaboration can produce synergistic gains (diagnostic complementarity) when errors are uncorrelated and tasks are allocated to leverage comparative strengths.
Theoretical/analytical models of error complementarity and empirical reader studies showing instances where combined readings outperform either agent alone (evidence drawn from multiple small-to-moderate reader studies and simulations).
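A quick way to see the complementarity argument is to simulate two readers whose errors are independent: the chance that at least one is correct exceeds either reader's accuracy alone. The accuracies, case count, and combination rule below are illustrative assumptions, not values from the cited studies.

```python
import random

random.seed(0)
N = 100_000
ACC_HUMAN, ACC_AI = 0.85, 0.85  # per-case accuracies; independence is the key assumption

either_correct = 0
for _ in range(N):
    human_ok = random.random() < ACC_HUMAN
    ai_ok = random.random() < ACC_AI
    either_correct += human_ok or ai_ok

# With independent errors, P(at least one correct) = 1 - 0.15 * 0.15 = 0.9775,
# which upper-bounds what a well-designed combination rule could recover;
# the gain shrinks as the two readers' errors become correlated.
print(f"single reader accuracy:      {ACC_HUMAN:.3f}")
print(f"at least one reader correct: {either_correct / N:.3f}")
```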
AI in radiology has clear potential to improve diagnostic performance and workflow efficiency.
Narrative synthesis of laboratory evaluation studies, reader/comparison studies, and a limited number of observational deployment reports showing improved algorithm accuracy and some improvements in measured throughput or time-to-review in pilots (study sizes and settings heterogeneous; few large-scale RCTs).
Cognitive Shadow supports real-time model updates based on immediate user feedback, enabling iterative improvement and continuous alignment with human decision patterns.
Described human-in-the-loop interaction loop where CS captures human decisions, provides recommendations, receives immediate feedback, and updates models dynamically in the simulation environment (implementation detail).
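The underlying implementation is not described in detail, but the interaction loop (recommend, receive the human decision as feedback, update immediately) can be sketched with an online learner. The feature layout, labels, and `SGDClassifier` stand-in below are assumptions, not the Cognitive Shadow's actual models.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Incremental learner standing in for the Cognitive Shadow policy model;
# the synthetic decision stream below is purely illustrative.
model = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # e.g., 0 = routine contact, 1 = contact needing escalation

rng = np.random.default_rng(0)
for step in range(200):
    x = rng.normal(size=(1, 4))                  # one observed situation (4 features)
    human_decision = int(x[0, 0] + x[0, 1] > 0)  # stand-in for the expert's call

    if step > 0:
        recommendation = model.predict(x)[0]     # recommend before the human decides
    # The human decision arrives as immediate feedback; update the model right away.
    model.partial_fit(x, [human_decision], classes=classes)
```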
HACL/CS reduces omission rates (missed detections) in the simulated scenarios.
Omission/error rates were tracked and compared between conditions in the simulated testbed; the summary reports a reduction in omissions with HACL assistance but does not provide numeric effect sizes or significance tests.
HACL/CS reduces time-to-decision in the simulated maritime surveillance tasks.
Measured time-to-classify in simulation under human-alone vs HACL-assisted conditions; summary indicates reductions in time-to-decision but lacks detailed statistics in the provided description.
In the simulated Canadian Arctic maritime surveillance domain, HACL/CS shows promise for improving classification accuracy.
Performance comparison between human-alone and HACL-assisted conditions in the maritime surveillance simulation measuring classification accuracy; summary reports improvement but does not provide sample size or significance levels.
Adjustable autonomy via self-confidence thresholds enables the system to act autonomously on high-certainty predictions and defer to humans on low-certainty cases.
System design feature of Cognitive Shadow implemented in simulation: autonomy decision rule based on meta-model confidence thresholds; behavior demonstrated in human-in-the-loop scenarios.
The Cognitive Shadow toolkit quantifies AI reliability with an empirical (0–1) confidence metric produced by a recursive meta-model.
Design and implementation detail: primary supervised models are paired with a recursive meta-model that predicts the primary model's reliability per situation and outputs a 0–1 empirical confidence score; applied in the simulated testbed.
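A minimal sketch of the adjustable-autonomy rule implied by the two entries above: act autonomously when the meta-model's confidence clears a threshold, otherwise defer to the human. The threshold value, function name, and return format are assumptions.

```python
def act_or_defer(prediction: str, confidence: float, threshold: float = 0.9) -> dict:
    """Adjustable-autonomy rule sketched from the entries above.

    `confidence` stands in for the recursive meta-model's 0-1 reliability
    estimate; the threshold and output shape are illustrative assumptions.
    """
    if confidence >= threshold:
        return {"mode": "autonomous", "action": prediction}
    return {"mode": "defer_to_human", "action": None,
            "reason": f"confidence {confidence:.2f} below threshold {threshold}"}

print(act_or_defer("classify contact as cargo vessel", confidence=0.96))
print(act_or_defer("classify contact as cargo vessel", confidence=0.55))
```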
Implementing an adaptive command-and-control process augmented by AI metacognition (the Cognitive Shadow toolkit) aligns AI judgments with expert human decision patterns.
Cognitive Shadow (CS) implemented as supervised ML models trained to mimic expert human decisions in the simulated maritime scenarios; alignment assessed by comparing model outputs to human expert decisions during human-in-the-loop interaction (implementation validated in simulation).
Human-AI co-learning (HACL) improves human-autonomy teaming (HAT) effectiveness.
Evaluated in a simulated Canadian Arctic maritime surveillance testbed using human-in-the-loop experiments comparing human-alone vs HACL-assisted conditions; exact participant sample size and statistical details not provided in the summary.
Self-directed autonomous agents (those that autonomously generated prompts and selected tools) bypassed human prompting failures and outperformed most human teams on the challenge set.
Comparative analysis of the four autonomous agents' trajectories, tool use, and success rates versus the 41 human participants/teams on the same fresh challenges; observed correlation between autonomous self-direction and higher success relative to most teams.
Trust in AI should be conceptualized as a socio-technical, team-level mechanism (trust calibration) that mediates between AI design/enablers and downstream collaboration and performance, rather than an individual-level stable attitude.
Theoretical synthesis combining findings from the thematic analysis of 40 interviews with socio-technical systems theory (STS) and adaptive structuration theory (AST) to propose an initial and revised conceptual model linking enablers → trust-calibration practices → collaboration dynamics → performance.
Five enablers support effective trust calibration: transparency/explainability, clear role definitions, good user experience (UX), supportive cultural norms, and timely system feedback.
Synthesized from recurring themes in the interview data (N=40) where respondents identified these factors as facilitating appropriate reliance on AI in project settings; coded and aggregated through thematic analysis.
Performance and reward structures must be redesigned to value oversight, hypothesis testing, escalation and governance behaviours that mitigate model risk but may not immediately increase output.
Managerial recommendation derived from the framework and organizational reward literature; no empirical evaluation provided.
Firms need new metrics to decompose value created by humans, AI, and their interaction (to distinguish complementarities versus substitution).
Analytic implication derived from the framework and literature on productivity measurement; presented as a recommendation for empirical work rather than tested evidence.
Symbiarchic leadership is a practical, HR‑oriented framework for leading integrated human–AI “cyber teams,” specifying four linked leadership practices that make AI a co‑actor in knowledge work while preserving human judgement, accountability and organizational legitimacy.
Paper's central proposition based on theoretical synthesis of academic literature on human–AI collaboration, hybrid teams and digital‑era leadership plus illustrative practitioner examples; no original empirical data or experiments.
Recommendations for policy include investing in public data infrastructure and standards, promoting regulatory clarity for AI validation, and supporting equitable access to AI-driven innovations.
Policy recommendations derived from synthesis of challenges and potential remedies presented in the narrative review; based on conceptual policy analysis and examples rather than empirical testing of interventions.