Evidence (7953 claims)
Adoption (5539 claims)
Productivity (4793 claims)
Governance (4333 claims)
Human-AI Collaboration (3326 claims)
Labor Markets (2657 claims)
Innovation (2510 claims)
Org Design (2469 claims)
Skills & Training (2017 claims)
Inequality (1378 claims)
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 402 | 112 | 67 | 480 | 1076 |
| Governance & Regulation | 402 | 192 | 122 | 62 | 790 |
| Research Productivity | 249 | 98 | 34 | 311 | 697 |
| Organizational Efficiency | 395 | 95 | 70 | 40 | 603 |
| Technology Adoption Rate | 321 | 126 | 73 | 39 | 564 |
| Firm Productivity | 306 | 39 | 70 | 12 | 432 |
| Output Quality | 256 | 66 | 25 | 28 | 375 |
| AI Safety & Ethics | 116 | 177 | 44 | 24 | 363 |
| Market Structure | 107 | 128 | 85 | 14 | 339 |
| Decision Quality | 177 | 76 | 38 | 20 | 315 |
| Fiscal & Macroeconomic | 89 | 58 | 33 | 22 | 209 |
| Employment Level | 77 | 34 | 80 | 9 | 202 |
| Skill Acquisition | 92 | 33 | 40 | 9 | 174 |
| Innovation Output | 120 | 12 | 23 | 12 | 168 |
| Firm Revenue | 98 | 34 | 22 | — | 154 |
| Consumer Welfare | 73 | 31 | 37 | 7 | 148 |
| Task Allocation | 84 | 16 | 33 | 7 | 140 |
| Inequality Measures | 25 | 77 | 32 | 5 | 139 |
| Regulatory Compliance | 54 | 63 | 13 | 3 | 133 |
| Error Rate | 44 | 51 | 6 | — | 101 |
| Task Completion Time | 88 | 5 | 4 | 3 | 100 |
| Training Effectiveness | 58 | 12 | 12 | 16 | 99 |
| Worker Satisfaction | 47 | 32 | 11 | 7 | 97 |
| Wages & Compensation | 53 | 15 | 20 | 5 | 93 |
| Team Performance | 47 | 12 | 15 | 7 | 82 |
| Automation Exposure | 24 | 22 | 9 | 6 | 62 |
| Job Displacement | 6 | 38 | 13 | — | 57 |
| Hiring & Recruitment | 41 | 4 | 6 | 3 | 54 |
| Developer Productivity | 34 | 4 | 3 | 1 | 42 |
| Social Protection | 22 | 10 | 6 | 2 | 40 |
| Creative Output | 16 | 7 | 5 | 1 | 29 |
| Labor Share of Income | 12 | 5 | 9 | — | 26 |
| Skill Obsolescence | 3 | 20 | 2 | — | 25 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
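The balance of findings in the matrix can be read as a share per row. The sketch below is a minimal illustration using three rows copied from the table; the helper name `direction_shares` is hypothetical, and the "—" cell for Job Displacement is treated as 0, which is an assumption. Shares are computed over the four listed directions rather than the Total column, because the listed counts do not always sum to the Total (for example, the Other row).

```python
# Rows copied from the Evidence Matrix; a "—" cell is treated as 0 (an assumption).
ROWS = {
    "Task Completion Time": {"positive": 88, "negative": 5, "mixed": 4, "null": 3},
    "Job Displacement": {"positive": 6, "negative": 38, "mixed": 13, "null": 0},
    "Inequality Measures": {"positive": 25, "negative": 77, "mixed": 32, "null": 5},
}


def direction_shares(counts):
    """Return each direction's share of the listed claims for one outcome."""
    listed = sum(counts.values())
    return {direction: n / listed for direction, n in counts.items()}


for outcome, counts in ROWS.items():
    shares = direction_shares(counts)
    print(f"{outcome}: " + ", ".join(f"{d} {s:.0%}" for d, s in shares.items()))
```

Reading shares rather than raw counts makes small categories such as Job Displacement comparable with large ones.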
Task-level approaches capture within-occupation heterogeneity in automation and augmentation risk that occupation-level analyses miss.
Cited empirical and methodological work (Felten et al., 2023; Eloundou et al., 2023) constructs task-level exposure indices and shows variation across tasks within the same occupation; the evidence is based on task mappings from O*NET-style databases and job descriptions.
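To illustrate why an occupation-level average can hide within-occupation variation, the sketch below compares two made-up occupations with the same mean task exposure but different spread. All occupation names, task names, and scores are hypothetical and are not taken from the cited indices.

```python
from statistics import mean, pstdev

# Hypothetical task-level exposure scores in [0, 1]; not from any cited index.
TASK_EXPOSURE = {
    "occupation_a": {"draft reports": 0.9, "client meetings": 0.2, "site inspection": 0.1},
    "occupation_b": {"data entry": 0.5, "scheduling": 0.4, "filing": 0.3},
}


def summarize(tasks):
    """Occupation-level mean plus within-occupation spread of task exposure."""
    scores = list(tasks.values())
    return {
        "mean": mean(scores),
        "spread": pstdev(scores),
        "range": max(scores) - min(scores),
    }


for occupation, tasks in TASK_EXPOSURE.items():
    print(occupation, summarize(tasks))
```

Both occupations average the same exposure, so an occupation-level index would treat them identically, while the task-level view shows that one of them contains a single highly exposed task.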
Recent research in AI–labor economics has shifted from occupation-level analysis to task-level analysis, mapping task-by-task exposure to AI.
Synthesis of recent literature cited in the paper (e.g., Felten et al., 2023; Eloundou et al., 2023), which develops task-level exposure mappings from occupational task databases (O*NET-style) and job-posting text; the evidence is bibliographic and methodological rather than a single new empirical dataset.
Further quantitative research is needed to measure task-level productivity effects, skill-depreciation trajectories, and market impacts of differential GenAI adoption; structural models could incorporate TGAIF to predict labor demand and wage effects.
Authors' stated research agenda and limitations acknowledged in the paper; this is a call for future empirical work rather than an empirical claim.
ChatGPT was used as the generative engine for the MLLM in the system implementation described in the paper.
Methods section: integration of AR overlays with an MLLM, with ChatGPT used as the generative engine (explicit in the summary).
This paper is a narrative review synthesizing heterogeneous studies and case reports rather than providing meta-analytic estimates of effect sizes.
Methods statement in the paper describing review type as narrative synthesis and noting limitations (no meta-analysis).
The paper proposes measurable metrics such as projection congruence indices, alignment persistence measures, monitoring/oversight burden, and outcome variability/tail risks attributable to agentic autonomy.
Explicit metric proposals in the methods and metrics section of the paper; presented as part of a research agenda rather than empirically implemented.
The paper proposes specific empirical and analytic follow-ups — multi-agent simulations, lab experiments with humans and adaptive agents, field case studies, econometric analyses, and formal economic models — to test the conceptual claims.
Explicit methods and research agenda listed in the paper; these are recommended future methods, not evidence.
Agentic AI is characterized by three properties that drive structural uncertainty: open-ended action trajectories, generative representations/outputs, and evolving objectives.
Definitions and taxonomy developed in the paper based on conceptual synthesis; presented as framing rather than empirically measured properties.
The framework provides sector-specific implementation guidance tailored to healthcare and public administration, accounting for existing governance and regulatory structures.
Case/sector guidance sections offering practical recommendations and considerations for deployment in those sectors; design-oriented, not empirically piloted in the paper.
DAR identifies four trigger classes that govern transitions between authority states: data superiority, contextual judgment requirements, risk thresholds, and ethics/legal overrides.
Conceptual derivation and classification in the framework; mapping of trigger types to transition rules. Theoretical, no empirical data.
The Dynamic Authority Reversal (DAR) framework formalizes four discrete intra-episode authority states: Human-Leader/AI-Follower (HL), AI-Leader/Human-Follower (AL), Co-Leadership (CO), and Mutual Override (MO).
Formal conceptual specification and formal modeling within the paper; definitions of the four states and their roles. No empirical sample; theoretical/design artifact.
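The four authority states and four trigger classes can be sketched as a small state machine. The state and trigger names come from the claims above; the mapping from each trigger to a target state is a hypothetical assumption made for illustration, since this summary does not give the framework's transition rules.

```python
from enum import Enum


class AuthorityState(Enum):
    HL = "Human-Leader/AI-Follower"
    AL = "AI-Leader/Human-Follower"
    CO = "Co-Leadership"
    MO = "Mutual Override"


class Trigger(Enum):
    DATA_SUPERIORITY = "data superiority"
    CONTEXTUAL_JUDGMENT = "contextual judgment requirements"
    RISK_THRESHOLD = "risk thresholds"
    ETHICS_LEGAL_OVERRIDE = "ethics/legal overrides"


# Hypothetical transition rules: the source names the states and trigger
# classes but not this mapping, so the targets below are assumptions.
TARGET_STATE = {
    Trigger.DATA_SUPERIORITY: AuthorityState.AL,
    Trigger.CONTEXTUAL_JUDGMENT: AuthorityState.HL,
    Trigger.RISK_THRESHOLD: AuthorityState.CO,
    Trigger.ETHICS_LEGAL_OVERRIDE: AuthorityState.MO,
}


def run_episode(start, triggers):
    """Apply triggers in order and return the sequence of states visited."""
    path = [start]
    for trigger in triggers:
        path.append(TARGET_STATE[trigger])
    return path


episode = run_episode(
    AuthorityState.HL,
    [Trigger.DATA_SUPERIORITY, Trigger.ETHICS_LEGAL_OVERRIDE],
)
print([state.name for state in episode])
```

In this simplified version the target state depends only on the trigger; a fuller model would presumably condition the transition on the current state as well.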
Further quantitative and comparative research is needed to measure net productivity effects, skill trajectories, and generalizability across firm types and industries.
Authors' methodological assessment and limitations section noting single-firm qualitative design (Netlight) and rapidly evolving toolchains; recommendation for future empirical work.
Long-term effects of adaptive marketing (habit formation, churn, lifetime value) are important for welfare and valuation but are harder to measure and require longitudinal or structural economic models.
Conceptual claim in measurement challenges; argues that short-horizon A/B tests may miss long-run harms or benefits, recommending longitudinal studies and structural models; no empirical long-term study presented.
Offline evaluation metrics (intent/sentiment classification accuracy, human-rated generation quality and factuality, simulated policy evaluation) are useful for pipeline development but do not fully capture online performance.
Paper contrasts offline metrics with online A/B testing and notes the need for online experiments; this is a methodological claim supported by the described evaluation pipeline rather than a presented empirical study.
Another important gap is quantifying complementarities between AI and different skill types (evaluative vs. generative tasks).
Review observation that existing empirical work has not systematically quantified how AI productivity gains vary with worker skill composition and complementary roles.
Key research gaps include a lack of long-run causal evidence on the effects of LLMs on firm-level innovation rates, business formation, and industry structure.
Explicit identification of gaps in the literature within the nano-review; the review states that most studies are short-term, task-level, or descriptive.
High-priority research includes randomized controlled trials on hybrid vs. automated routing, long-run studies on labor markets in service sectors, and models quantifying trust externalities and governance costs.
Paper's stated research agenda based on identified evidence gaps and limitations (lack of randomized long-run studies).
Current evidence is promising but early: case studies, pilot deployments, and short-run experiments dominate; long-run causal evidence on labor and welfare effects is limited.
Explicit methodological assessment in the paper noting source types (deployments, pilots, vendor reports, short-run experiments) and limitations (heterogeneity, lack of randomized controls, short horizons).
The authors elicited additional insights via a survey of paper authors plus follow-up interviews to collect self-assessments of reproducibility and qualitative explanations for obstacles and motivations.
Methods section describing the mixed-methods approach: empirical reproduction attempts triangulated with surveys and interviews of original authors.
Reproducibility (as used in this study) is defined as producing the reported results from the shared data and analysis code, as distinct from replicability, which involves independent re-collection of data.
Authors' definitional statement in the paper clarifying reproducibility vs. replicability.
Study limitations include reliance on perceptual measures (rather than solely objective performance), heterogeneity across institutional samples, and likely correlational rather than strictly causal identification.
Authors' own noted limitations in the paper's methods section: mixed-methods design using perceptions from questionnaires and interviews, sample heterogeneity across multinational institutions, and quantitative analyses that are associative rather than strictly causal.
Statistical analyses reported improvements across metrics, but specific effect sizes and detailed statistical results were not provided in the summary.
The summary indicates that statistical analyses were performed and improvements reported, but it omits specific effect sizes and detailed results.
Measurement and research gaps (data scarcity, informality) complicate robust economic assessment of AI impacts; improved metrics, granular labour and firm-level data, and mixed-methods evaluation are required.
Methodological critique based on reviewed literature and identified gaps; no new data collection in the paper.
There is a lack of causal evidence on the long-run impacts of AI-driven HRM on employment, wages, and firm survival—this is a key research gap identified by the review.
Explicitly stated research gap in the review based on assessment of methodologies and findings across the 47 included studies.
A systematic review following PRISMA identified 47 peer-reviewed studies (2012–2024) on data-driven HRM and workforce resilience from Scopus, Web of Science, and Google Scholar.
Explicit review protocol and search/screening results reported by the paper (PRISMA-based), final sample size = 47 studies.
Recommended research designs to estimate impacts include RCTs, quasi-experimental methods (difference-in-differences, regression discontinuity, matching), and longitudinal cohort tracking.
Paper explicitly lists these evaluation designs as appropriate methods for causal inference and long-term outcomes measurement. This is a methodological recommendation rather than an empirical claim.
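As a worked illustration of one of the listed designs, the sketch below computes a basic two-group, two-period difference-in-differences estimate. The outcome values are hypothetical, and the estimate has a causal reading only under the usual parallel-trends assumption, which this toy example does not test.

```python
from statistics import mean

# Hypothetical outcomes (e.g., output per worker); numbers are illustrative only.
OUTCOMES = {
    ("treated", "pre"): [10.0, 12.0, 11.0],
    ("treated", "post"): [15.0, 16.0, 14.0],
    ("control", "pre"): [9.0, 10.0, 11.0],
    ("control", "post"): [11.0, 12.0, 13.0],
}


def did_estimate(outcomes):
    """2x2 difference-in-differences: treated change minus control change."""
    treated_change = mean(outcomes[("treated", "post")]) - mean(outcomes[("treated", "pre")])
    control_change = mean(outcomes[("control", "post")]) - mean(outcomes[("control", "pre")])
    return treated_change - control_change


print(did_estimate(OUTCOMES))
```

Subtracting the control group's change removes the shared time trend, so the remainder is attributed to treatment.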
There is a need for causal, longitudinal studies quantifying economic returns of ERP-AI integration and for measurement frameworks for quality-adjusted decision improvements.
Stated limitation and research opportunity in the review; reviewers found scarcity of longitudinal causal studies in the 2020–2025 literature.
There is a need for causal, longitudinal studies on how AI-enabled fintech affects women's portfolio outcomes and on algorithmic interventions designed to reduce gender gaps.
Explicit statement in the paper noting limitations of existing literature (heterogeneity, limited longitudinal causal evidence, possible platform sample selection).
Empirical validation on experimental or field data is needed to fully establish k-QREM's practical applicability; current results are based on numerical examples and simulations.
Paper's methodology and validation section: validation confined to two numerical example datasets and simulation studies; authors acknowledge lack of real experimental/field validation and propose it as future work.
Extensions such as Bayesian hierarchical estimation and integration with multi-agent reinforcement learning are promising future directions but not implemented in the paper.
Authors' discussion of future work and limitations; no empirical or methodological implementation presented for these extensions in the current paper.
k-QREM explicitly models heterogeneity both across cognitive levels (different proportions of players at each level) and within levels (stochastic variability among players assigned to the same level).
Model specification: the paper defines level-specific quantal response functions and allows distributions over player types within each level (theoretical/modeling choices demonstrated in equations and architecture).
k-QREM is a hierarchical quantal-response model that nests the Cognitive Hierarchy Model (CHM) and Quantal Response Equilibrium (QRE) as special or limiting cases.
Analytical model construction in the paper: k-level hierarchical formulation showing CHM (discrete levels, deterministic best-response limit) and QRE (single-level stochastic best-response) arise as special/limiting parameterizations of k-QREM (model derivation/proofs provided).
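The deterministic best-response limit mentioned above can be illustrated with a standard logit quantal response. This summary does not reproduce the paper's level-specific response functions, so the logit form, the function name `logit_response`, and the payoff values below are assumptions chosen for illustration.

```python
import math


def logit_response(payoffs, precision):
    """Logit quantal response: probabilities proportional to exp(precision * payoff)."""
    top = max(payoffs)
    # Subtract the maximum payoff before exponentiating for numerical stability.
    weights = [math.exp(precision * (u - top)) for u in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


payoffs = [1.0, 2.0, 4.0]  # hypothetical expected payoffs for three actions
for precision in (0.0, 1.0, 50.0):
    print(precision, [round(p, 3) for p in logit_response(payoffs, precision)])
```

With precision 0 the choice is uniformly random; as precision grows, probability mass concentrates on the highest-payoff action, which is the best-response limit.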
There is a need for empirical research to quantify net economic impact (productivity gains vs governance costs), effects on employment composition and wages, and market outcomes from alternative governance architectures.
Explicit research gaps listed in the paper; recommendation for future empirical strategies (difference-in-differences, event studies, randomized pilots, instrumental variables) and suggested data sources.
The article’s evidence is predominantly practitioner-driven and illustrative, relying on qualitative case evidence rather than systematic quantitative causal estimates.
Explicit statement in the paper’s Data & Methods section describing nature of evidence and limitations; methods listed include synthesis, comparative analysis, illustrative architectures, and anecdotal cases.
Key technical components of the pattern include low-code platforms for rapid governed app development, RPA for deterministic process automation and legacy integration, and generative AI for document understanding, conversational interfaces, and decision support — with guardrails.
Paper’s component list and rationale based on practitioner experience and multi-sector examples; presented as recommended components in the reference architecture; no experimental validation of component selection given.
The proposed layered deployment pattern integrates organizational governance (roles, policies, decision rights), technical architecture (platforms, APIs, data flows), and AI risk management (controls, monitoring, human-in-the-loop).
Design and architectural proposal within the paper; described via illustrative deployment patterns and reference architectures. This is a descriptive claim about the proposed pattern rather than an empirical effect.
Empirical research is needed: studies quantifying prompt-fraud incidents and losses, field experiments comparing control portfolios, and economic models of optimal investment in AI controls.
Explicit research agenda and limitations acknowledged by the authors noting lack of empirical prevalence data and need for operational validation.
Recommended next steps for validation include controlled pilots, before-after studies on operational metrics, and cross-firm panel analyses to estimate economic impacts and risk reductions.
Authors' explicit recommendations for empirical validation in the Data & Methods and Implications sections.
There is no reported large-scale quantitative evaluation (e.g., productivity gains, cost-benefit metrics, or causal impact estimates) supporting the framework in the paper.
Explicit limitation noted by the authors stating absence of large-scale quantitative evaluation.
The evidence base for the paper is qualitative: a synthesis of industry best practices and lessons from multi-sector enterprise implementations; methods used include conceptual framework development, architecture design, and case-based illustration.
Explicit methodological statement in the Data & Methods section of the paper.
The article is largely qualitative and prescriptive rather than empirical; it does not provide systematic incidence estimates or large-scale measured losses from prompt fraud and identifies empirical validation as needed.
Authors' stated methods and limitations: conceptual analysis, threat modeling, literature review, illustrative vignettes; explicit note of absent systematic empirical data.
SECaaS offerings commonly include threat intelligence, managed detection & response (MDR), endpoint protection, IAM, CASB, security orchestration/automation, and compliance-as-a-service.
Survey of SECaaS product categories in industry reports and vendor catalogs; technical benchmarks describing typical feature sets.
Achieving CIA in the cloud requires technical controls (encryption, access controls, IAM, MFA, zero-trust), resilience measures (backups, redundancy, DR/BCP), and continuous monitoring (logging, SIEM, EDR/XDR).
Synthesis of technical best practices and vendor/industry guidance; supported by technical evaluations and case studies in the literature.
Core cloud security goals remain confidentiality, integrity, and availability (CIA).
Canonical security literature and standards cited in the chapter; general consensus across technical controls and industry best-practice frameworks (e.g., NIST, ISO).
Evaluation methods reported commonly include visual inspection by researchers/clinicians, correlation with known biomarkers/frequency bands, and ablation/perturbation faithfulness tests; few studies report standardized quantitative metrics for robustness, stability, or neuroscientific fidelity.
Survey of evaluation practices across the literature compiled in the review.
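A perturbation faithfulness test of the kind mentioned above can be sketched as a deletion check: remove the channels an explanation ranks as most important and measure how much the model's score drops. The linear "model", its weights, the four-channel signal, and the input-times-weight attribution are all hypothetical stand-ins, not a method taken from the reviewed studies.

```python
# Hypothetical linear "model" over four EEG channels; weights are illustrative.
WEIGHTS = [0.1, 2.0, 0.3, 1.0]


def model_score(signal):
    """Score a signal as the weighted sum of its channel values."""
    return sum(w * x for w, x in zip(WEIGHTS, signal))


def deletion_drop(signal, ranking, k):
    """Zero out the k top-ranked channels and return the drop in model score."""
    perturbed = list(signal)
    for channel in ranking[:k]:
        perturbed[channel] = 0.0
    return model_score(signal) - model_score(perturbed)


signal = [1.0, 1.0, 1.0, 1.0]
# Input-times-weight attribution, which is exact for a linear model.
attribution = [w * x for w, x in zip(WEIGHTS, signal)]
by_attribution = sorted(range(len(signal)), key=lambda i: attribution[i], reverse=True)
reversed_ranking = by_attribution[::-1]

print(deletion_drop(signal, by_attribution, 2))
print(deletion_drop(signal, reversed_ranking, 2))
```

A faithful ranking should produce a larger score drop than a reversed or random ranking when the same number of channels is removed.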
Modeling approaches in the literature include end-to-end deep models operating on raw or time–frequency representations, recurrent architectures for temporal dynamics, attention mechanisms, and hybrid feature-based classifiers.
Summary of modeling choices described across reviewed studies.
Typical datasets used in EEG XAI research include public collections such as the TUH EEG Corpus, BCI Competition datasets, PhysioNet sleep databases, CHB-MIT for pediatric seizures, as well as many small/clinical cohorts.
Listing of commonly referenced datasets across the surveyed literature.
A common taxonomy emphasized in EEG XAI work distinguishes local vs global explanations, model-specific vs model-agnostic methods, and post-hoc vs intrinsically interpretable models.
Conceptual organization presented in the review synthesizing common taxonomic distinctions used by authors in the field.
XAI methods applied to EEG in the literature include gradient-based saliency methods, Integrated Gradients, layer-wise relevance propagation (LRP), CAM/Grad-CAM, occlusion/perturbation analyses, LIME, SHAP, TCAV, and counterfactual explanations.
Cataloging of explanation techniques reported across surveyed EEG papers.
Models used in EEG XAI work include deep learning architectures (CNNs, RNNs, attention/transformers), classical machine learning, and hybrid pipelines combining feature extraction with classifiers.
Summary of modeling approaches reported across reviewed studies.