Evidence (8486 claims)

- Adoption: 5821 claims
- Productivity: 5033 claims
- Governance: 4561 claims
- Human-AI Collaboration: 3600 claims
- Labor Markets: 2749 claims
- Innovation: 2687 claims
- Org Design: 2648 claims
- Skills & Training: 2107 claims
- Inequality: 1429 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 440 | 117 | 68 | 507 | 1148 |
| Governance & Regulation | 458 | 216 | 125 | 67 | 883 |
| Research Productivity | 270 | 101 | 34 | 303 | 713 |
| Organizational Efficiency | 441 | 105 | 76 | 43 | 669 |
| Technology Adoption Rate | 346 | 130 | 76 | 45 | 602 |
| Firm Productivity | 322 | 38 | 72 | 13 | 450 |
| Output Quality | 272 | 75 | 27 | 30 | 404 |
| AI Safety & Ethics | 122 | 188 | 46 | 27 | 385 |
| Market Structure | 119 | 134 | 86 | 14 | 358 |
| Decision Quality | 182 | 79 | 41 | 20 | 326 |
| Fiscal & Macroeconomic | 95 | 58 | 34 | 22 | 216 |
| Employment Level | 78 | 37 | 80 | 9 | 206 |
| Skill Acquisition | 102 | 37 | 41 | 9 | 189 |
| Innovation Output | 124 | 12 | 26 | 13 | 176 |
| Firm Revenue | 99 | 37 | 24 | — | 160 |
| Consumer Welfare | 77 | 38 | 37 | 7 | 159 |
| Task Allocation | 93 | 17 | 36 | 8 | 156 |
| Inequality Measures | 29 | 81 | 33 | 6 | 149 |
| Regulatory Compliance | 54 | 61 | 13 | 3 | 131 |
| Task Completion Time | 92 | 8 | 4 | 3 | 107 |
| Error Rate | 45 | 53 | 6 | — | 104 |
| Worker Satisfaction | 48 | 36 | 12 | 8 | 104 |
| Training Effectiveness | 59 | 13 | 12 | 16 | 101 |
| Wages & Compensation | 56 | 16 | 20 | 5 | 97 |
| Team Performance | 50 | 13 | 15 | 8 | 87 |
| Automation Exposure | 28 | 29 | 12 | 7 | 79 |
| Job Displacement | 7 | 45 | 13 | — | 65 |
| Hiring & Recruitment | 40 | 4 | 7 | 3 | 54 |
| Developer Productivity | 38 | 4 | 4 | 3 | 49 |
| Social Protection | 22 | 12 | 7 | 2 | 43 |
| Creative Output | 17 | 8 | 6 | 1 | 32 |
| Skill Obsolescence | 3 | 25 | 2 | — | 30 |
| Labor Share of Income | 12 | 7 | 10 | — | 29 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
The paper sets out a research agenda for economists: empirically, develop instruments linking first‑person temporal reports to behavioral and neural proxies; theoretically, incorporate subjective temporality into models of utility, human capital, attention economics, and platform competition; and, on the policy side, evaluate interventions while accounting for temporal‑experience externalities.
Explicitly stated research agenda and methodological recommendations in the paper; no empirical follow‑up included.
Economists will need new empirical measures: validated instruments that translate phenomenological constructs (e.g., Chronons) into observable proxies or composite indices for welfare and labor studies; standardization and cross‑study comparability remain open challenges.
Methodological recommendation and discussion in the paper; no empirical measure development or validation reported.
The paper proposes candidate mappings from subjective reports to neural/behavioral signatures (e.g., neural markers of attentional episodes, temporal binding windows) and suggests experimental paradigms to operationalize temporal units.
Methodological proposals and suggested experimental agendas in the paper; no implemented experiments or sample sizes reported.
The framework situates itself at the intersection of neurophenomenology, computational phenomenology, brain–computer interfaces, and human–AI teaming research.
Cross-disciplinary literature synthesis and conceptual mapping in the paper; descriptive claim with no empirical sampling (N/A).
The paper introduces symbolic operators—Chronons, Hexachronons, Metachronos—as theoretical units intended to bridge first-person phenomenology of temporal experience with third‑person neurotechnology descriptions.
Theoretical proposal and definitional introduction within the paper (conceptual development); no experimental validation or sample (N/A).
XChronos is a philosophical-epistemological framework arguing that transhumanism must place subjective temporality (lived time, presence, attention, meaning) at the center of design and evaluation.
Conceptual/philosophical analysis and literature synthesis presented in the paper; no empirical sample or dataset (N/A).
A Random Survival Forest built on curated cancer‑death‑related genes (CDRG‑RSF) achieved the best long‑term prognostic performance among 14 tested ML algorithms for pancreatic cancer, with 3‑ and 5‑year AUCs > 0.7.
Comparison of 14 ML survival algorithms on curated prognostic genes; Random Survival Forest (CDRG‑RSF) reported superior 3‑ and 5‑year AUCs exceeding 0.7 (exact sample sizes/cohort details not provided in summary).
Experimental knockdown of PSME3 reduced proliferation and invasion and increased apoptosis in LUAD cells, implicating the PI3K/AKT/Bcl‑2 pathway as a mediator.
Functional assays (gene knockdown experiments) reported in the PIGRS study showing decreased proliferation/invasion and increased apoptosis after PSME3 knockdown, with pathway analyses implicating PI3K/AKT/Bcl‑2.
Deep neural networks (DNNs) captured cross‑study differential expression (DEA) signals better than sparse linear models (LASSO) when predicting miRNA from mRNA; for HIV, the cross‑study log2 fold‑change (log2FC) correlation for the DNN approach was approximately R ≈ 0.59.
Analysis on seven paired viral infection datasets (including WNV and HIV); compared DNNs vs. LASSO for mRNA→miRNA prediction; reported cross‑study log2FC correlation R ≈ 0.59 for HIV for the DNNs. Methods included differential expression signal recovery across studies.
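The cross‑study agreement metric named above (correlation of log2 fold‑change vectors between two studies) can be sketched as follows. This is a minimal illustration on synthetic expression matrices, not the paper's pipeline; the matrix names and pseudocount are assumptions.

```python
import numpy as np

def log2fc(case, control):
    """Per-feature log2 fold change between mean case and control expression.
    A small pseudocount avoids log of zero for unexpressed features."""
    eps = 1e-6
    return np.log2(case.mean(axis=0) + eps) - np.log2(control.mean(axis=0) + eps)

rng = np.random.default_rng(0)
# Hypothetical miRNA expression matrices (samples x features) from two studies.
study_a_case, study_a_ctrl = rng.gamma(2.0, 1.0, (20, 50)), rng.gamma(2.0, 1.0, (20, 50))
study_b_case, study_b_ctrl = rng.gamma(2.0, 1.0, (15, 50)), rng.gamma(2.0, 1.0, (15, 50))

fc_a = log2fc(study_a_case, study_a_ctrl)
fc_b = log2fc(study_b_case, study_b_ctrl)

# Cross-study agreement: Pearson correlation of the two log2FC vectors.
r = np.corrcoef(fc_a, fc_b)[0, 1]
print(round(float(r), 3))
```

In the paper's setting, `fc_b` would come from model predictions rather than a second measured study, and R ≈ 0.59 would correspond to this correlation for HIV.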
An AI‑powered pipeline (EPheClass) produced a parsimonious saliva microbiome classifier for periodontal disease with AUC = 0.973 using 13 features.
EPheClass pipeline using ensemble ML (kNN, RF, SVM, XGBoost, MLP), centred log‑ratio (CLR) transform and Recursive Feature Elimination (RFE); reported performance AUC = 0.973 for periodontal disease model with 13 features (sample size not specified in summary).
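Two steps named in the pipeline summary — the centred log‑ratio (CLR) transform and recursive feature elimination down to a small signature — can be sketched as below. Synthetic counts and a logistic model stand in for the real data and ensemble; only the 13‑feature target mirrors the reported model.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

def clr(counts, pseudo=0.5):
    """Centred log-ratio transform: log of each component minus the
    per-sample mean log, applied to pseudocount-adjusted counts."""
    x = np.log(counts + pseudo)
    return x - x.mean(axis=1, keepdims=True)

rng = np.random.default_rng(1)
# Hypothetical taxa count table (samples x taxa) and disease labels.
counts = rng.poisson(5, (60, 40)).astype(float)
labels = rng.integers(0, 2, 60)

X = clr(counts)
# Recursive feature elimination down to 13 features, mirroring the
# parsimonious signature size reported for the periodontal model.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=13).fit(X, labels)
picked = np.flatnonzero(selector.support_)
print(len(picked))  # 13 retained taxa
```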
Recommendation: Treat synthetic participants as heuristic tools (supplemental roles) rather than replacements; use hybrid designs, validate against held-out human samples, pre-register synthetic-data usage, and adopt transparency and reproducibility practices (document prompts, model versions, seeds, fine-tuning).
Authors' recommendations drawn from the systematic review of 182 studies and the identified failure modes and risks.
Approximation guarantees are provided that justify scalable allocation rules and heuristics, i.e., provable performance bounds relative to the optimal solution.
Analytical approximation-theory results in the paper showing performance ratios/guarantees for specific heuristics relative to the optimal solution.
A systematic review of 27 evaluated AI education/training programs for the healthcare workforce was conducted following PRISMA guidance and a PROSPERO-registered protocol.
Systematic review design reported in the paper: PRISMA-guided review, protocol registered in PROSPERO, searches of five databases (PubMed, Scopus, CINAHL, Embase, ERIC) on 20 Aug 2024; 27 programs met inclusion criteria.
Higher job performance is positively associated with greater employee retention.
PLS-SEM analysis, N = 350. Reported direct path: Performance → Retention, β = 0.348, p < 0.001.
About 78% of the included studies document productivity increases related to digital transformation initiatives.
Quantitative summary across the 145 included studies indicating the proportion reporting productivity gains (~78%).
A systematic review of 145 empirical studies (published 2020–2025) finds a consistent positive association between digital transformation and work productivity.
Systematic review following PRISMA 2020 of 145 included empirical studies identified and screened from searches (see Methods); inclusion period 2020–2025; productivity outcomes extracted from each study.
The paper identifies gaps and recommends that economists conduct randomized evaluations and quasi-experimental studies to estimate causal effects of interventions (hands-on labs, instructor training, compute subsidies) on competencies and earnings.
Policy and research agenda section of the paper arguing for randomized/quasi-experimental methods; no such causal interventions were implemented in this study.
The study conducted a cross-sectional online survey of more than 600 higher-education students and educators from multiple world regions.
Cross-sectional online survey; sample size reported as >600 participants; recruitment targeted a mix of disciplines and institution types; survey mapped to UNESCO 2024 AI competency frameworks.
A PaaS layer enables industry-specific customization (complex contract logic, milestone handling, multi-entity consolidation).
Paper's architectural proposal; described as the role of PaaS in the hybrid framework. This is a design claim, not a measured outcome in the summary.
A SaaS layer should provide standardized accounting, invoicing, and reporting workflows for the EPC industry.
Architectural proposition in the paper: design recommendation rather than an empirically isolated test. The claim is descriptive of the proposed architecture.
Core supply‑chain management challenges targeted by simulation are production layout, product strategy, and managing volume and variety.
Survey and critique of simulation applications presented in the paper; conceptual taxonomy of application areas.
The paper proposes a 'manufacturing operation tree'—an organizationally structured framework—to guide development of more realistic, validated, and industry‑relevant simulation models.
Conceptual/modeling output in the paper (diagram and explanation of the manufacturing operation tree); theoretical development rather than empirical testing.
Standardizing datasets, benchmarks, and evaluation protocols (including real-time metrics and resource/latency measurements) is necessary to improve comparability and deployment relevance.
Surveyed inconsistencies and methodological shortcomings motivate the recommendation for standardization; many papers call for better benchmarks.
Hybrid architectures combining rule-based filters with ML classifiers and ensembles are used to improve detection performance and reduce false positives.
Comparative analysis and examples from the literature where multi-stage or hybrid pipelines are proposed and evaluated.
Econometric and causal-inference tools (difference-in-differences, instrumental variables, randomized encouragement designs) are needed to estimate long-term effects of personalized robot interventions.
Recommended methodological agenda for AI economists in the paper; no applied causal studies presented.
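Of the causal‑inference tools named in the claim above, difference‑in‑differences is the simplest to illustrate. The sketch below implements the canonical 2x2 estimator on synthetic data with a known treatment effect; the effect size and data‑generating process are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)

def did_estimate(y, treated, post):
    """Canonical 2x2 difference-in-differences: (treated post - treated pre)
    minus (control post - control pre) in mean outcomes."""
    t, p = treated.astype(bool), post.astype(bool)
    return ((y[t & p].mean() - y[t & ~p].mean())
            - (y[~t & p].mean() - y[~t & ~p].mean()))

# Hypothetical panel: common time trend of +1.0, true treatment effect of +2.0.
n = 400
treated = rng.integers(0, 2, n)
post = rng.integers(0, 2, n)
y = 5.0 + 1.0 * post + 2.0 * treated * post + rng.normal(0, 0.5, n)

effect = did_estimate(y, treated, post)
print(round(float(effect), 2))  # should land near the true effect of 2.0
```

The common‑trend subtraction is what distinguishes this from a naive pre/post comparison, which here would be biased upward by the +1.0 time trend.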
Research and deployment will require new datasets: longitudinal multimodal interaction logs, user preference surveys, simulated user populations, and ethically annotated datasets for fairness and safety evaluation.
Data & Methods recommendations based on identified empirical needs; no dataset release or analysis in this paper.
Measuring welfare impact of personalized robots requires going beyond engagement to include non-market outcomes such as well-being, autonomy, and mental health.
Methodological recommendation in the implications and evaluation sections; no empirical measures provided.
A/B testing and longitudinal field studies are necessary for real-world validation of robot personalization, and metrics should include welfare-oriented outcomes (well-being, trust) in addition to engagement.
Recommended evaluation strategy drawing from HRI and RS experimental standards; no field trials reported in this work.
Prior to live trials, offline RS evaluation metrics (precision/recall, NDCG), counterfactual/off-policy estimators, and simulated users should be used to validate personalization policies.
Methodological recommendation based on RS evaluation practices; no empirical comparison with live trials in robots presented.
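Of the offline metrics listed above, NDCG is the least self‑explanatory; a minimal from‑scratch version is sketched below. The relevance grades are hypothetical logged judgments, not data from any study discussed here.

```python
import numpy as np

def dcg(relevances, k):
    """Discounted cumulative gain of the top-k items in ranked order."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))  # log2(rank + 1)
    return float((rel / discounts).sum())

def ndcg(ranked_relevances, k):
    """NDCG@k: DCG of the given ranking divided by DCG of the ideal ranking."""
    ideal = sorted(ranked_relevances, reverse=True)
    best = dcg(ideal, k)
    return dcg(ranked_relevances, k) / best if best > 0 else 0.0

# Hypothetical relevance grades in the order a personalization policy ranked items.
print(ndcg([3, 2, 0, 1], k=4))   # below 1.0: the last item should outrank the third
print(ndcg([3, 2, 1, 0], k=4))   # exactly 1.0: already the ideal ordering
```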
Contextual bandits and counterfactual/off-policy learning can enable safe exploration and off-policy evaluation when adapting robot interactions from logged data.
Methodological synthesis referencing contextual bandit and counterfactual learning techniques from RS and causal inference; no robotic implementation experiments reported.
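The off‑policy evaluation idea referenced above is commonly realized with an inverse‑propensity‑scoring (IPS) estimator. A minimal sketch on a synthetic two‑action bandit log follows; the reward rates, logging probabilities, and clipping threshold are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def ips_value(rewards, logged_probs, target_probs, clip=10.0):
    """Inverse-propensity-scoring estimate of a target policy's value from
    logged interactions; weight clipping bounds the estimator's variance."""
    w = np.minimum(target_probs / logged_probs, clip)
    return float(np.mean(w * rewards))

# Hypothetical bandit log: logging policy plays action 1 with probability 0.3.
n = 5000
actions = (rng.random(n) < 0.3).astype(int)
rewards = np.where(actions == 1,
                   rng.random(n) < 0.7,      # action 1 pays off 70% of the time
                   rng.random(n) < 0.4).astype(float)
logged_p = np.where(actions == 1, 0.3, 0.7)
# Target policy to evaluate: always play action 1.
target_p = np.where(actions == 1, 1.0, 0.0)

est = ips_value(rewards, logged_p, target_p)
print(round(est, 2))  # should be near 0.7, the true value of always playing action 1
```

The reweighting lets the logged data stand in for data the target policy never collected, which is exactly the "safe exploration from logged data" property the claim refers to.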
Sequence-aware recommenders (RNNs, Transformers, Markov/session-based models) are suitable for modeling session dynamics and short-term preference shifts in robot interactions.
Survey of sequence/temporal RS models and their typical use cases; conceptual recommendation only.
RS tooling covers long-term user profiles, short-term/session signals, context-awareness, multi-objective ranking, and evaluation methods suited for personalization at scale.
Review of recommender-systems methods and tooling in the literature; conceptual synthesis without empirical new data.
Recommender systems are specialized in representing, predicting, and ranking user preferences across time and contexts (e.g., collaborative filtering, content-based models, sequential/session models).
Established RS literature surveyed and cited as the basis for the claim; conceptual argument, no new experiments.
Perceived customer value is the core determinant of value-based pricing (VBP) decisions in digital marketing.
Systematic Literature Review (SLR) of 30 scholarly articles (Scopus, 2020–2025) coded into thematic categories; multiple included studies emphasize perceived value as central to pricing decisions.
Digital trade development raises city-level house prices in China; the estimated relationship is linear and robust across specifications.
City-level panel regressions using a constructed digital trade index (entropy-TOPSIS aggregation of multiple indicators). Authors report tests for nonlinearity (none found) and multiple robustness checks. Sample: Chinese cities (years and exact sample size not specified in the summary).
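The entropy‑weighting step of the entropy‑TOPSIS index construction mentioned above can be sketched as follows. The city‑by‑indicator matrix is synthetic and all indicators are assumed benefit‑type; the paper's actual indicator set is not reproduced.

```python
import numpy as np

def entropy_weights(X):
    """Entropy-weight step of entropy-TOPSIS index construction: indicators
    that vary more across cities carry more information and get larger weights."""
    Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))  # benefit-type min-max
    P = Z / Z.sum(axis=0)                         # column-wise proportions
    P = np.where(P > 0, P, 1e-12)                 # guard against log(0)
    k = 1.0 / np.log(X.shape[0])
    e = -k * (P * np.log(P)).sum(axis=0)          # per-indicator entropy in [0, 1]
    d = 1.0 - e                                   # information utility
    return d / d.sum()

rng = np.random.default_rng(4)
X = rng.random((30, 5))    # hypothetical city-by-indicator matrix
X[:, 0] *= 100             # raw scale is irrelevant after normalization
w = entropy_weights(X)
print(w.round(3))          # weights sum to 1
```

A composite digital trade index is then the weighted sum of normalized indicators per city, which feeds the panel regressions.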
Breakthroughs in structure prediction arise from end‑to‑end deep models that combine evolutionary information (MSAs, coevolutionary signals), geometric constraints and equivariant architectures, and large‑scale pretraining on sequence databases.
Paper describes methodological components: end‑to‑end architectures using MSAs, SE(3)/E(3)-equivariant layers, transformer‑based pretraining on UniRef/UniProt/metagenomic catalogs; no quantitative ablation studies are provided in the text.
Canada emphasizes teacher-led assessment, cautious regulation, and a focus on equity and professional development in responding to AI-related assessment issues.
Country case study based on Canadian policy documents and secondary sources highlighting teacher-led approaches and regulatory caution; illustrative description.
Algeria’s national approach centers on capacity building and technological independence as central security priorities in its AI strategy.
Analysis of Algeria’s national AI and security documents and related policy texts cited in the comparative case review.
The EU has developed a detailed, rights‑protective regulatory framework that includes procedural safeguards and explicit risk prohibitions for AI.
Qualitative document analysis of EU regulatory acts and strategies (e.g., bloc‑level AI regulatory proposals and legal texts) and comparative literature review.
Practical takeaway: economists should treat consent design as a lever that changes data availability and incorporate consent frictions into demand and production-side models; they should collaborate with HCI and legal scholars to design experiments capturing behavioral and welfare effects.
Recommendation from the workshop summary intended for economists; based on interdisciplinary discussions and agendas rather than tested interventions.
The workshop produced interdisciplinary outputs including personas, prototypes, and a research agenda to better align user capabilities and values with data-driven AI systems.
Documented workshop activities (Futures Design Toolkit, co-design, position papers) and stated expected deliverables in the workshop summary; these are reported outputs rather than evaluated outcomes.
Creators explicitly name advertising, direct sales, affiliate marketing, and revenue-sharing models as common monetization channels for GenAI-enabled content.
Explicit references to these monetization channels appeared repeatedly across the 377 videos and were extracted during thematic coding.
Practical measurement guidance: researchers and practitioners should use repeated sampling (high-frequency and multi-day), compute bootstrap confidence intervals for citation shares and prevalence, run rank-stability analyses, and determine required sample size empirically via pilots.
Methodological recommendations grounded in the paper's empirical findings (non-determinism, heavy tails, wide bootstrap CIs) and demonstrated use of repeated sampling and bootstrap/resampling techniques in the study.
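The bootstrap‑CI recommendation above can be sketched as a percentile bootstrap over repeated‑sampling runs. The citation indicator data below is synthetic; the 35% rate and run count are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)

def bootstrap_ci(samples, stat, n_boot=2000, alpha=0.05):
    """Percentile bootstrap confidence interval for any statistic of the
    repeated-sampling runs (e.g., a source's citation share)."""
    idx = rng.integers(0, len(samples), (n_boot, len(samples)))
    boots = np.array([stat(samples[i]) for i in idx])
    return np.quantile(boots, [alpha / 2, 1 - alpha / 2])

# Hypothetical repeated-sampling runs: 1 if a given source was cited in a run.
cited = (rng.random(200) < 0.35).astype(float)
lo, hi = bootstrap_ci(cited, np.mean)
print(f"citation share {cited.mean():.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Wide intervals from such resampling are precisely the non‑determinism evidence motivating high‑frequency, multi‑day sampling.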
XAI analyses (e.g., SHAP / feature importance) indicate that forecasted features are among the top contributors to model predictions.
Feature attribution experiments described in the paper using SHAP or similar methods showing high importance scores for TSFM-generated forecasted features in the downstream regression.
The forecasted features produced by a frozen TSFM drive most of the predictive gains.
Ablation studies reported in the paper that remove forecasted features and measure performance degradation, plus XAI analyses (feature importance / SHAP) showing forecasted features rank highly.
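The ablation logic described above — drop the forecasted features and measure the performance loss — can be sketched on synthetic data where, by construction, the forecast‑style features carry most of the signal. The feature split and effect sizes are assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
# Hypothetical downstream regression: base features plus forecasted features
# (standing in for frozen-TSFM outputs) that dominate the target signal.
n = 500
base = rng.normal(size=(n, 3))
forecast = rng.normal(size=(n, 2))
y = 0.2 * base[:, 0] + 2.0 * forecast[:, 0] + 1.5 * forecast[:, 1] + rng.normal(0, 0.3, n)

X_full = np.hstack([base, forecast])
Xb_tr, Xb_te, Xf_tr, Xf_te, y_tr, y_te = train_test_split(base, X_full, y, random_state=0)

r2_full = LinearRegression().fit(Xf_tr, y_tr).score(Xf_te, y_te)
r2_ablate = LinearRegression().fit(Xb_tr, y_tr).score(Xb_te, y_te)
print(round(r2_full, 2), round(r2_ablate, 2))  # large R^2 drop without forecasted features
```

A large gap between the two R^2 values is the ablation evidence; SHAP‑style attributions on the full model would complement it by ranking the forecasted features highly.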
The THETA project provides an interactive, reproducible analysis platform and open-source code (https://github.com/CodeSoul-co/THETA).
Explicit statement and URL in paper; code and platform availability claimed for reproducibility and interactive use.
THETA wraps modeling in an AI Scientist Agent framework (Data Steward, Modeling Analyst, Domain Expert) that simulates grounded-theory judgment and iterative refinement.
Detailed description of a three-role agent workflow in the methods section: Data Steward (ingestion/preprocessing), Modeling Analyst (modeling/hyperparameter tuning), Domain Expert (qualitative assessment/constant comparison).
THETA uses hybrid textual embeddings that combine pretrained foundation-model semantic structure with DAFT adaptations to better capture latent, domain-relevant meanings.
Method description of 'textual hybrid embeddings' combining base foundation encoders and DAFT-tuned parameters; asserted benefit for capturing latent domain meanings (no quantitative ablation reported in summary).
THETA adapts foundation embedding models to domain language using parameter-efficient LoRA fine-tuning (Domain-Adaptive Fine-Tuning, DAFT), avoiding full model retraining.
Method description: LoRA applied to foundation embedding models as the DAFT procedure; claim of parameter-efficient fine-tuning rather than end-to-end retraining (no compute benchmarks in summary).
Over 56% of comments were classified as formulaic, implying patterned, low-information responses dominate agent interaction.
Lexical-structural analysis and pattern detection (embedding/lexical measures) applied to ~2.8M comments; classification operationalized as 'formulaic comments' based on repetitive lexical/structural features, yielding >56% of comments labeled formulaic.
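A minimal version of the lexical‑structural "formulaic comment" operationalization described above might look like the sketch below: collapse each comment to a template by masking variable tokens, then flag comments whose template recurs. The masking rules and threshold are assumptions; the paper's embedding‑based measures are not reproduced.

```python
import re
from collections import Counter

def template(comment):
    """Collapse a comment to a lexical template: lowercase, mask numbers
    and @-mentions so near-duplicates share the same key."""
    t = comment.lower()
    t = re.sub(r"@\w+", "<user>", t)
    t = re.sub(r"\d+", "<num>", t)
    return re.sub(r"\s+", " ", t).strip()

def formulaic_share(comments, min_count=3):
    """Share of comments whose template recurs at least min_count times."""
    counts = Counter(template(c) for c in comments)
    flagged = sum(1 for c in comments if counts[template(c)] >= min_count)
    return flagged / len(comments)

comments = (["Great post!"] * 4 + ["Thanks for sharing @bob"] * 3
            + ["I disagree because the premise conflates two claims.",
               "Here is a counterexample from 2019.",
               "@ann 42"])
share = formulaic_share(comments)
print(round(share, 2))  # → 0.7
```

At the scale reported (~2.8M comments), the same counting logic applies; only the template function would need the richer lexical and structural features the study used.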