Evidence (4560 claims)

Claim counts by category:

- Adoption: 5267 claims
- Productivity: 4560 claims
- Governance: 4137 claims
- Human-AI Collaboration: 3103 claims
- Labor Markets: 2506 claims
- Innovation: 2354 claims
- Org Design: 2340 claims
- Skills & Training: 1945 claims
- Inequality: 1322 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 378 | 106 | 59 | 455 | 1007 |
| Governance & Regulation | 379 | 176 | 116 | 58 | 739 |
| Research Productivity | 240 | 96 | 34 | 294 | 668 |
| Organizational Efficiency | 370 | 82 | 63 | 35 | 553 |
| Technology Adoption Rate | 296 | 118 | 66 | 29 | 513 |
| Firm Productivity | 277 | 34 | 68 | 10 | 394 |
| AI Safety & Ethics | 117 | 177 | 44 | 24 | 364 |
| Output Quality | 244 | 61 | 23 | 26 | 354 |
| Market Structure | 107 | 123 | 85 | 14 | 334 |
| Decision Quality | 168 | 74 | 37 | 19 | 301 |
| Fiscal & Macroeconomic | 75 | 52 | 32 | 21 | 187 |
| Employment Level | 70 | 32 | 74 | 8 | 186 |
| Skill Acquisition | 89 | 32 | 39 | 9 | 169 |
| Firm Revenue | 96 | 34 | 22 | — | 152 |
| Innovation Output | 106 | 12 | 21 | 11 | 151 |
| Consumer Welfare | 70 | 30 | 37 | 7 | 144 |
| Regulatory Compliance | 52 | 61 | 13 | 3 | 129 |
| Inequality Measures | 24 | 68 | 31 | 4 | 127 |
| Task Allocation | 75 | 11 | 29 | 6 | 121 |
| Training Effectiveness | 55 | 12 | 12 | 16 | 96 |
| Error Rate | 42 | 48 | 6 | — | 96 |
| Worker Satisfaction | 45 | 32 | 11 | 6 | 94 |
| Task Completion Time | 78 | 5 | 4 | 2 | 89 |
| Wages & Compensation | 46 | 13 | 19 | 5 | 83 |
| Team Performance | 44 | 9 | 15 | 7 | 76 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 18 | 17 | 9 | 5 | 50 |
| Job Displacement | 5 | 31 | 12 | — | 48 |
| Social Protection | 21 | 10 | 6 | 2 | 39 |
| Developer Productivity | 29 | 3 | 3 | 1 | 36 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Skill Obsolescence | 3 | 19 | 2 | — | 24 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Labor Share of Income | 10 | 4 | 9 | — | 23 |
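The matrix above invites simple derived statistics. Below is a minimal sketch (hypothetical helper name, three rows transcribed from the table) for computing the share of positive findings per outcome. Note that for some rows the Total column exceeds the sum of the four listed directions (e.g. Firm Productivity: 277 + 34 + 68 + 10 = 389 vs. a printed total of 394), suggesting additional uncategorized claims, so shares here are computed over the four listed directions only.

```python
# Sketch: share of positive findings per outcome, from the evidence
# matrix above. Counts are (positive, negative, mixed, null); missing
# cells ("—") are treated as zero.
rows = {
    "Firm Productivity": (277, 34, 68, 10),
    "AI Safety & Ethics": (117, 177, 44, 24),
    "Job Displacement": (5, 31, 12, 0),
}

def positive_share(counts):
    """Fraction of directionally categorized claims that are positive."""
    total = sum(counts)
    return counts[0] / total if total else 0.0

for outcome, counts in rows.items():
    print(f"{outcome}: {positive_share(counts):.0%} positive")
```

The same helper applied across all rows would surface the asymmetry visible in the table, e.g. Firm Productivity skews positive while Job Displacement skews negative.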
Productivity

Claims in this category, each followed by its evidence basis:
- **Claim:** Task-level approaches capture within-occupation heterogeneity in automation and augmentation risk that occupation-level analyses miss.
  **Evidence:** Cited empirical and methodological work (Felten et al., 2023; Eloundou et al., 2023) that constructs task-level exposure indices and shows variation across tasks within the same occupation; evidence is based on task mappings from O*NET-style databases and job descriptions.
- **Claim:** Recent research in AI–labor economics has shifted from occupation-level to task-level analysis, mapping task-by-task exposure to AI.
  **Evidence:** Synthesis of recent literature cited in the paper (e.g., Felten et al., 2023; Eloundou et al., 2023) that develops task-level exposure mappings using occupational task databases (O*NET-style) and job-posting text; the evidence is bibliographic and methodological rather than a single new empirical dataset.
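The aggregation step behind such indices can be illustrated with a toy example. This is an invented sketch, not the actual Felten et al. (2023) or Eloundou et al. (2023) procedure: hypothetical task-level exposure scores are combined with time-share weights into an occupation-level index, while the spread across tasks captures the within-occupation heterogeneity the claim refers to.

```python
# Illustrative only: aggregate made-up task-level AI exposure scores
# into an occupation-level index and measure within-occupation spread.
from statistics import pstdev

# Hypothetical tasks for one occupation: (exposure in [0,1], time share).
tasks = {
    "draft routine reports": (0.9, 0.4),
    "negotiate with clients": (0.2, 0.3),
    "schedule site visits": (0.6, 0.3),
}

# Occupation index = time-share-weighted mean of task exposures.
exposure = sum(e * w for e, w in tasks.values())
# Spread across tasks = the heterogeneity an occupation-level
# average hides.
spread = pstdev([e for e, _ in tasks.values()])

print(f"occupation exposure index: {exposure:.2f}")
print(f"within-occupation spread:  {spread:.2f}")
```

The point of the claim is visible in the numbers: the single index (a moderate 0.6 here) conceals that one task is highly exposed while another is barely exposed at all.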
- **Claim:** Further quantitative research is needed to measure task-level productivity effects, skill-depreciation trajectories, and market impacts of differential GenAI adoption; structural models could incorporate TGAIF to predict labor demand and wage effects.
  **Evidence:** Authors' stated research agenda and limitations acknowledged in the paper; this is a call for future empirical work rather than an empirical claim.
- **Claim:** ChatGPT was used as the generative engine for the MLLM in the system implementation described in the paper.
  **Evidence:** Methods section: integration of AR overlays with an MLLM, with ChatGPT used as the generative engine (explicit in the summary).
- **Claim:** This paper is a narrative review synthesizing heterogeneous studies and case reports rather than providing meta-analytic estimates of effect sizes.
  **Evidence:** Methods statement in the paper describing the review type as narrative synthesis and noting limitations (no meta-analysis).
- **Claim:** The paper proposes measurable metrics such as projection congruence indices, alignment persistence measures, monitoring/oversight burden, and outcome variability/tail risks attributable to agentic autonomy.
  **Evidence:** Explicit metric proposals in the methods and metrics section of the paper; presented as part of a research agenda rather than empirically implemented.
- **Claim:** The paper proposes specific empirical and analytic follow-ups (multi-agent simulations, lab experiments with humans and adaptive agents, field case studies, econometric analyses, and formal economic models) to test the conceptual claims.
  **Evidence:** Explicit methods and research agenda listed in the paper; these are recommended future methods, not evidence.
- **Claim:** Agentic AI is characterized by three properties that drive structural uncertainty: open-ended action trajectories, generative representations/outputs, and evolving objectives.
  **Evidence:** Definitions and taxonomy developed in the paper based on conceptual synthesis; presented as framing rather than empirically measured properties.
- **Claim:** The framework provides sector-specific implementation guidance tailored to healthcare and public administration, accounting for existing governance and regulatory structures.
  **Evidence:** Case/sector guidance sections offering practical recommendations and considerations for deployment in those sectors; design-oriented, not empirically piloted in the paper.
- **Claim:** DAR identifies four trigger classes that govern transitions between authority states: data superiority, contextual judgment requirements, risk thresholds, and ethics/legal overrides.
  **Evidence:** Conceptual derivation and classification in the framework; mapping of trigger types to transition rules. Theoretical, with no empirical data.
- **Claim:** The Dynamic Authority Reversal (DAR) framework formalizes four discrete intra-episode authority states: Human-Leader/AI-Follower (HL), AI-Leader/Human-Follower (AL), Co-Leadership (CO), and Mutual Override (MO).
  **Evidence:** Formal conceptual specification and modeling within the paper; definitions of the four states and their roles. No empirical sample; a theoretical/design artifact.
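The four states and four trigger classes lend themselves to a state-machine reading. In the sketch below, the state and trigger names come from the paper, but the specific transition table is invented for illustration; the paper's actual transition rules are not reproduced here.

```python
# Hedged sketch of DAR as a finite state machine. States and trigger
# classes follow the paper's taxonomy; the TRANSITIONS table is an
# invented example, not the paper's rules.
from enum import Enum

class State(Enum):
    HL = "Human-Leader/AI-Follower"
    AL = "AI-Leader/Human-Follower"
    CO = "Co-Leadership"
    MO = "Mutual Override"

class Trigger(Enum):
    DATA_SUPERIORITY = "data superiority"
    CONTEXTUAL_JUDGMENT = "contextual judgment requirement"
    RISK_THRESHOLD = "risk threshold crossed"
    ETHICS_LEGAL = "ethics/legal override"

# Invented example transitions, for illustration only.
TRANSITIONS = {
    (State.HL, Trigger.DATA_SUPERIORITY): State.AL,
    (State.AL, Trigger.CONTEXTUAL_JUDGMENT): State.HL,
    (State.AL, Trigger.RISK_THRESHOLD): State.CO,
    (State.CO, Trigger.ETHICS_LEGAL): State.MO,
}

def step(state: State, trigger: Trigger) -> State:
    """Return the next authority state; remain in place if no rule fires."""
    return TRANSITIONS.get((state, trigger), state)
```

Framing DAR this way makes the paper's "intra-episode" qualifier concrete: authority can flip several times within a single task episode as triggers fire.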
- **Claim:** Further quantitative and comparative research is needed to measure net productivity effects, skill trajectories, and generalizability across firm types and industries.
  **Evidence:** Authors' methodological assessment and limitations section noting the single-firm qualitative design (Netlight) and rapidly evolving toolchains; recommendation for future empirical work.
- **Claim:** Long-term effects of adaptive marketing (habit formation, churn, lifetime value) are important for welfare and valuation but are harder to measure and require longitudinal or structural economic models.
  **Evidence:** Conceptual claim in the measurement-challenges discussion; argues that short-horizon A/B tests may miss long-run harms or benefits and recommends longitudinal studies and structural models; no empirical long-term study presented.
- **Claim:** Offline evaluation metrics (intent/sentiment classification accuracy, human-rated generation quality and factuality, simulated policy evaluation) are useful for pipeline development but do not fully capture online performance.
  **Evidence:** The paper contrasts offline metrics with online A/B testing and notes the need for online experiments; this is a methodological claim supported by the described evaluation pipeline rather than a presented empirical study.
- **Claim:** Another important gap is quantifying complementarities between AI and different skill types (evaluative vs. generative tasks).
  **Evidence:** Review observation that existing empirical work has not systematically quantified how AI productivity gains vary with worker skill composition and complementary roles.
- **Claim:** Key research gaps include a lack of long-run causal evidence on the effects of LLMs on firm-level innovation rates, business formation, and industry structure.
  **Evidence:** Explicit identification of gaps in the literature within the nano-review; the review states that most studies are short-term, task-level, or descriptive.
- **Claim:** High-priority research includes randomized controlled trials on hybrid vs. automated routing, long-run studies of labor markets in service sectors, and models quantifying trust externalities and governance costs.
  **Evidence:** The paper's stated research agenda, based on identified evidence gaps and limitations (lack of randomized long-run studies).
- **Claim:** Current evidence is promising but early: case studies, pilot deployments, and short-run experiments dominate; long-run causal evidence on labor and welfare effects is limited.
  **Evidence:** Explicit methodological assessment in the paper noting source types (deployments, pilots, vendor reports, short-run experiments) and limitations (heterogeneity, lack of randomized controls, short horizons).
- **Claim:** Study limitations include reliance on perceptual measures (rather than solely objective performance), heterogeneity across institutional samples, and likely correlational rather than strictly causal identification.
  **Evidence:** Limitations noted by the authors in the methods section: mixed-methods design using perceptions from questionnaires and interviews, sample heterogeneity across multinational institutions, and quantitative analyses that are associative rather than strictly causal.
- **Claim:** Statistical analyses reportedly showed improvements across metrics, but specific effect sizes and detailed statistical results were not provided in the summary.
  **Evidence:** The summary indicates that statistical analyses were performed and improvements were reported, while also noting that specific effect sizes were not included.
- **Claim:** Measurement and research gaps (data scarcity, informality) complicate robust economic assessment of AI impacts; improved metrics, granular labour and firm-level data, and mixed-methods evaluation are required.
  **Evidence:** Methodological critique based on reviewed literature and identified gaps; no new data collection in the paper.
- **Claim:** There is a lack of causal evidence on the long-run impacts of AI-driven HRM on employment, wages, and firm survival; this is a key research gap identified by the review.
  **Evidence:** Explicitly stated research gap in the review, based on assessment of methodologies and findings across the 47 included studies.
- **Claim:** A systematic review following PRISMA identified 47 peer-reviewed studies (2012–2024) on data-driven HRM and workforce resilience from Scopus, Web of Science, and Google Scholar.
  **Evidence:** Explicit review protocol and search/screening results reported by the paper (PRISMA-based); final sample size = 47 studies.
- **Claim:** Recommended research designs to estimate impacts include RCTs, quasi-experimental methods (difference-in-differences, regression discontinuity, matching), and longitudinal cohort tracking.
  **Evidence:** The paper explicitly lists these evaluation designs as appropriate methods for causal inference and long-term outcome measurement. This is a methodological recommendation rather than an empirical claim.
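To make the difference-in-differences recommendation concrete, here is a minimal sketch with invented numbers: two groups (adopters vs. non-adopters) observed before and after adoption, where the estimate is the treated group's change net of the control group's change.

```python
# Difference-in-differences on made-up group means (illustration only).
# Rows: (pre-period mean, post-period mean) of some outcome, e.g. output
# per worker, for a treated (adopting) and a control (non-adopting) group.
pre_treated, post_treated = 10.0, 14.0
pre_control, post_control = 10.0, 11.0

# DiD estimate: treated change minus control change. The control group's
# change nets out the common time trend.
did = (post_treated - pre_treated) - (post_control - pre_control)
print(did)  # 3.0 with these made-up numbers
```

The same logic extends to regression form (outcome on treated, post, and their interaction), which is what firm-level panel applications of the design typically estimate.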
- **Claim:** There is a need for causal, longitudinal studies quantifying the economic returns of ERP-AI integration and for measurement frameworks for quality-adjusted decision improvements.
  **Evidence:** Stated limitation and research opportunity in the review; the reviewers found a scarcity of longitudinal causal studies in the 2020–2025 literature.
- **Claim:** There is a need for empirical research to quantify net economic impact (productivity gains vs. governance costs), effects on employment composition and wages, and market outcomes from alternative governance architectures.
  **Evidence:** Explicit research gaps listed in the paper; recommendation of future empirical strategies (difference-in-differences, event studies, randomized pilots, instrumental variables) and suggested data sources.
- **Claim:** The article's evidence is predominantly practitioner-driven and illustrative, relying on qualitative case evidence rather than systematic quantitative causal estimates.
  **Evidence:** Explicit statement in the paper's Data & Methods section describing the nature of the evidence and its limitations; methods listed include synthesis, comparative analysis, illustrative architectures, and anecdotal cases.
- **Claim:** Key technical components of the pattern include low-code platforms for rapid governed app development, RPA for deterministic process automation and legacy integration, and generative AI for document understanding, conversational interfaces, and decision support, all with guardrails.
  **Evidence:** The paper's component list and rationale, based on practitioner experience and multi-sector examples; presented as recommended components in the reference architecture, with no experimental validation of component selection.
- **Claim:** The proposed layered deployment pattern integrates organizational governance (roles, policies, decision rights), technical architecture (platforms, APIs, data flows), and AI risk management (controls, monitoring, human-in-the-loop).
  **Evidence:** Design and architectural proposal within the paper, described via illustrative deployment patterns and reference architectures. This is a descriptive claim about the proposed pattern rather than an empirical effect.
- **Claim:** There is a need for empirical research: studies quantifying prompt-fraud incidents and losses, field experiments comparing control portfolios, and economic models of optimal investment in AI controls.
  **Evidence:** Explicit research agenda and limitations acknowledged by the authors, noting the lack of empirical prevalence data and the need for operational validation.
- **Claim:** Recommended next steps for validation include controlled pilots, before-after studies on operational metrics, and cross-firm panel analyses to estimate economic impacts and risk reductions.
  **Evidence:** Authors' explicit recommendations for empirical validation in the Data & Methods and Implications sections.
- **Claim:** There is no reported large-scale quantitative evaluation (e.g., productivity gains, cost-benefit metrics, or causal impact estimates) supporting the framework in the paper.
  **Evidence:** Explicit limitation noted by the authors stating the absence of large-scale quantitative evaluation.
- **Claim:** The evidence base for the paper is qualitative: a synthesis of industry best practices and lessons from multi-sector enterprise implementations; methods used include conceptual framework development, architecture design, and case-based illustration.
  **Evidence:** Explicit methodological statement in the Data & Methods section of the paper.
- **Claim:** The article is largely qualitative and prescriptive rather than empirical; it does not provide systematic incidence estimates or large-scale measured losses from prompt fraud, and it identifies empirical validation as needed.
  **Evidence:** Authors' stated methods and limitations: conceptual analysis, threat modeling, literature review, illustrative vignettes; explicit note of absent systematic empirical data.
- **Claim:** SECaaS offerings commonly include threat intelligence, managed detection & response (MDR), endpoint protection, IAM, CASB, security orchestration/automation, and compliance-as-a-service.
  **Evidence:** Survey of SECaaS product categories in industry reports and vendor catalogs; technical benchmarks describing typical feature sets.
- **Claim:** Achieving CIA in the cloud requires technical controls (encryption, access controls, IAM, MFA, zero-trust), resilience measures (backups, redundancy, DR/BCP), and continuous monitoring (logging, SIEM, EDR/XDR).
  **Evidence:** Synthesis of technical best practices and vendor/industry guidance; supported by technical evaluations and case studies in the literature.
- **Claim:** Core cloud security goals remain confidentiality, integrity, and availability (CIA).
  **Evidence:** Canonical security literature and standards cited in the chapter; general consensus across technical controls and industry best-practice frameworks (e.g., NIST, ISO).
- **Claim:** Commonly reported evaluation methods include visual inspection by researchers/clinicians, correlation with known biomarkers/frequency bands, and ablation/perturbation faithfulness tests; few studies report standardized quantitative metrics for robustness, stability, or neuroscientific fidelity.
  **Evidence:** Survey of evaluation practices across the literature compiled in the review.
- **Claim:** Modeling approaches in the literature include end-to-end deep models operating on raw or time–frequency representations, recurrent architectures for temporal dynamics, attention mechanisms, and hybrid feature-based classifiers.
  **Evidence:** Summary of modeling choices described across reviewed studies.
- **Claim:** Typical datasets used in EEG XAI research include public collections such as the TUH EEG Corpus, BCI Competition datasets, PhysioNet sleep databases, and CHB-MIT for pediatric seizures, as well as many small clinical cohorts.
  **Evidence:** Listing of commonly referenced datasets across the surveyed literature.
- **Claim:** A common taxonomy in EEG XAI work distinguishes local vs. global explanations, model-specific vs. model-agnostic methods, and post-hoc vs. intrinsically interpretable models.
  **Evidence:** Conceptual organization presented in the review, synthesizing taxonomic distinctions commonly used by authors in the field.
- **Claim:** XAI methods applied to EEG in the literature include gradient-based saliency methods, Integrated Gradients, layer-wise relevance propagation (LRP), CAM/Grad-CAM, occlusion/perturbation analyses, LIME, SHAP, TCAV, and counterfactual explanations.
  **Evidence:** Cataloging of explanation techniques reported across surveyed EEG papers.
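Of the listed techniques, occlusion/perturbation analysis is the simplest to sketch. The toy example below, with an invented synthetic signal and a stand-in "model" (not any reviewed study's pipeline), zeroes successive windows of a 1-D signal and ranks windows by how much the model score drops:

```python
# Toy occlusion analysis on a synthetic 1-D "EEG" trace. The model and
# signal are invented stand-ins to show the mechanics of the method.
import numpy as np

rng = np.random.default_rng(0)
signal = rng.standard_normal(256)
signal[100:120] += 3.0  # planted "relevant" burst

def model_score(x):
    # Stand-in model: responds only to energy in samples 100-120.
    return float(np.abs(x[100:120]).sum())

base = model_score(signal)
window = 32
importance = []
for start in range(0, len(signal), window):
    occluded = signal.copy()
    occluded[start:start + window] = 0.0  # zero out one window
    importance.append(base - model_score(occluded))  # score drop

# The window spanning samples 96-128 (index 3) covers the burst, so it
# should carry nearly all of the attributed importance.
print(int(np.argmax(importance)))
```

Because the method only queries the model's outputs, it is model-agnostic, which is why it recurs across otherwise very different EEG pipelines in the surveyed literature.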
- **Claim:** Models used in EEG XAI work include deep learning architectures (CNNs, RNNs, attention/transformers), classical machine learning, and hybrid pipelines combining feature extraction with classifiers.
  **Evidence:** Summary of modeling approaches reported across reviewed studies.
- **Claim:** The literature on EEG XAI covers tasks including seizure detection, sleep staging, brain–computer interfaces (BCI), cognitive/emotional state recognition, and diagnostic/supportive tools.
  **Evidence:** Descriptive review of topical coverage across surveyed papers; specific task categories are enumerated in the review.
- **Claim:** Limitation: the study analyzes national-level formal policy texts only and does not measure enforcement, implementation outcomes, or public reactions.
  **Evidence:** Author-stated limitations in the paper specifying a scope restricted to formal policy documents and the absence of empirical enforcement/compliance data.
- **Claim:** The paper uses qualitative content analysis, coding documents against the four analytical dimensions to generate a comparative typology of coordination approaches.
  **Evidence:** Method description: manual qualitative coding of the 36 documents into the specified dimensions, producing the typology distinguishing Chinese and U.S. approaches.
- **Claim:** The study's empirical basis comprises 36 national-level policy documents (18 from China; 18 from the United States) focused on scientific data governance.
  **Evidence:** Author-reported dataset and sampling description in the Data & Methods section.
- **Claim:** The comparative analysis is organized across four dimensions: coordination objectives, institutional actors, governance mechanisms, and stakeholder legitimacy.
  **Evidence:** Methodological design reported in the paper; documents were coded against these four analytic categories.
- **Claim:** The authors recommend empirical approaches for future work, including randomized controlled trials in labs, before-after adoption studies, and collection of microdata on instrument usage, model versions, and provenance to measure impacts.
  **Evidence:** Explicit methodological recommendations in the measurement and empirical research agenda section; these are proposals rather than executed studies.
- **Claim:** There is a need for rigorous evaluation metrics and benchmarks for safety and reproducibility, and for empirical studies quantifying the productivity or scientific impact of LLM-driven instrument control.
  **Evidence:** Identified research gaps and a recommended empirical research agenda described by the authors; these are recommendations rather than empirical findings.