Evidence (4114 claims)
Adoption (8570 claims)
Productivity (7631 claims)
Governance (6869 claims)
Human-AI Collaboration (6491 claims)
Org Design (4175 claims)
Innovation (4114 claims)
Labor Markets (3566 claims)
Skills & Training (2966 claims)
Inequality (2066 claims)
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 758 | 199 | 100 | 900 | 2007 |
| Governance & Regulation | 826 | 400 | 191 | 122 | 1563 |
| Organizational Efficiency | 777 | 193 | 124 | 84 | 1189 |
| Technology Adoption Rate | 635 | 233 | 124 | 97 | 1098 |
| Research Productivity | 422 | 128 | 57 | 336 | 954 |
| Output Quality | 476 | 179 | 59 | 47 | 761 |
| Decision Quality | 328 | 177 | 81 | 47 | 640 |
| Firm Productivity | 435 | 57 | 88 | 20 | 606 |
| AI Safety & Ethics | 218 | 277 | 65 | 33 | 599 |
| Market Structure | 180 | 170 | 123 | 24 | 502 |
| Task Allocation | 213 | 64 | 72 | 33 | 387 |
| Skill Acquisition | 170 | 61 | 61 | 17 | 309 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 54 | 107 | 13 | 281 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 117 | 63 | 42 | 11 | 233 |
| Firm Revenue | 153 | 48 | 26 | 3 | 230 |
| Task Completion Time | 173 | 31 | 8 | 12 | 225 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Worker Satisfaction | 89 | 65 | 22 | 12 | 188 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 77 | 69 | 14 | 5 | 165 |
| Automation Exposure | 56 | 56 | 26 | 13 | 154 |
| Training Effectiveness | 94 | 21 | 13 | 19 | 149 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Team Performance | 86 | 17 | 27 | 10 | 141 |
| Developer Productivity | 95 | 17 | 14 | 6 | 133 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 52 | 7 | 8 | 3 | 70 |
| Creative Output | 31 | 18 | 8 | 3 | 61 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
Innovation
The interaction between AI and trade openness is positive and significant, underscoring the role of international trade in technological diffusion and competitiveness as drivers of growth.
GMM interaction models on panel data (19 G20 countries, 2005–2023); reported AI × trade openness interaction coefficient is positive and statistically significant.
The interaction between AI and financial innovation has a positive and significant impact on economic growth, indicating that innovative finance mediates AI's technological potential into tangible economic gains.
GMM models with interaction terms using panel data of 19 G20 countries (2005–2023); reported AI × financial innovation interaction coefficient is positive and statistically significant.
AI-related innovation has a positive and significant effect on economic growth (linear model, GMM).
Panel analysis of 19 G20 countries (2005–2023) using the Generalized Method of Moments (GMM) linear model; reported positive and statistically significant coefficient for AI-related innovation.
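The interaction claims above rest on a regression that includes a product term such as AI × trade openness. The following is a minimal sketch of how such a term is built and read, assuming plain OLS on synthetic data as a stand-in; it is not the paper's GMM estimator, and the variable names and coefficients are invented for illustration.

```python
import numpy as np

def fit_interaction(y, x, z):
    """OLS of y on [1, x, z, x*z]; returns coefficients in that order."""
    X = np.column_stack([np.ones_like(x), x, z, x * z])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def marginal_effect(beta, z_value):
    """Effect of x on y at a given level of the moderator z."""
    return beta[1] + beta[3] * z_value

# Synthetic, noiseless data (hypothetical values, not from the paper),
# so the fit recovers the generating coefficients up to rounding error.
rng = np.random.default_rng(0)
ai = rng.normal(size=500)
trade = rng.normal(size=500)
growth = 1.0 + 0.5 * ai + 0.3 * trade + 0.2 * ai * trade

beta = fit_interaction(growth, ai, trade)
print(beta)                        # intercept, AI, trade, AI x trade
print(marginal_effect(beta, 1.0))  # effect of AI when trade openness = 1
```

A positive coefficient on the product term means the estimated effect of AI rises with the moderator, which is the pattern the claims describe.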
The paper proposes a technical and regulatory pivot: bounding the evidentiary weight of behavioral evidence in legal text and extending voluntary pre-deployment access with mechanistic-evidence classes (specifically linear probes, activation patching, and before/after-training comparisons).
Policy and technical recommendations presented in the paper (proposal, not empirical test).
We introduce the concept of 'fragile assurance' to describe cases where the evidential structure does not support the asserted safety claim.
Paper's conceptual contribution defining 'fragile assurance' and illustrating the notion with argumentation/examples.
We formalize the structural mismatch between required and achievable verification access as the 'audit gap'.
Paper introduces a formal definition and conceptual framing called the 'audit gap'.
AI governance frameworks enacted between 2019 and early 2026 require reviewable evidence of properties such as the absence of hidden objectives, resistance to loss-of-control precursors, and bounded catastrophic capability.
Paper's review of AI governance frameworks enacted between 2019 and early 2026 (policy/literature review as reported in the paper).
Task complexity positively moderates the relationships between GenAI usage patterns and knowledge integration capability.
Moderation analysis using three-wave lagged survey data from 381 matched employees in knowledge-intensive firms in China; interaction terms between task complexity and GenAI usage patterns reported to have positive effects on knowledge integration capability.
Employees' knowledge integration capability plays a critical complementary mediating role in the relationships between GenAI usage patterns (exploitative and exploratory) and creativity.
Mediation analysis conducted on three-wave lagged survey data from 381 matched employees in knowledge-intensive firms in China; knowledge integration capability measured and tested as mediator between GenAI usage patterns and creativity outcomes.
Exploratory GenAI use is more strongly positively associated with radical creativity than incremental creativity.
Three-wave lagged survey design; 381 valid matched employees from knowledge-intensive firms in China; statistical analysis comparing associations of exploratory GenAI use with radical vs. incremental creativity (mediation and moderation models reported in paper).
Exploitative GenAI use is more strongly positively associated with incremental creativity than radical creativity.
Three-wave lagged survey design; 381 valid matched employees from knowledge-intensive firms in China; statistical analysis comparing associations of exploitative GenAI use with incremental vs. radical creativity (mediation and moderation models reported in paper).
Policy options should centre on building institutional capacity for AGI situational awareness, strengthening Europe's position in the AI value chain, and developing frameworks for international stability in an era of increasingly capable AI systems.
Paper's recommended policy agenda derived from its assessment of risks and gaps (as stated in abstract); the abstract does not report empirical testing of these options or quantified expected effects.
These findings point to a need for a coordinated European preparedness agenda.
Paper's synthesis and policy recommendation based on the identified capability and governance gaps (as stated in abstract); recommendation not supported by quantified impact estimates in the abstract.
A plausible window for AGI emergence falls between 2030 and 2040, or potentially earlier, though substantial uncertainty remains.
Paper's synthesis of empirical trends in AI capabilities, expert forecasting surveys, and policy analysis (as stated in abstract). No specific sample size or survey details provided in the abstract.
Using FraudBench, we evaluate MLLMs, specialized AI-generated image detectors, and human participants under the same settings.
Experimental evaluation section comparing performance of MLLMs, specialized detectors, and human participants on the benchmark.
We synthesized fake-damaged evidence from genuine undamaged reference images using six state-of-the-art image editing and generation models.
Dataset augmentation methodology using six SOTA image editing/generation models to produce fake-damaged images.
We curated real evidence images together with their associated review and product metadata, and identified genuine damaged and undamaged evidence through MLLM-assisted filtering and human annotation.
Data curation pipeline combining multimodal large language model (MLLM) filtering and human annotation as described in the methods.
FraudBench is constructed from real-world user-review evidence across e-commerce, food delivery, and travel-service scenarios.
Dataset construction procedure described in the paper specifying source domains (e-commerce, food delivery, travel services).
We introduce FraudBench, a multimodal benchmark for detecting AI-generated fraudulent refund evidence.
Methodological contribution described in the paper: design and release of a benchmark dataset (FraudBench).
Team-based ventures are increasingly dominant in the top tiers of platform rankings.
Ranking-tier analysis in the Product Hunt dataset showing an increasing share of team-founded launches among top-tier (highest-ranked) products over the study period.
The increase in entrepreneurial entry was driven disproportionately by solo entrepreneurs.
Analysis of launch ownership structure in the Product Hunt dataset (>160,000 launches), showing a larger post-release increase in launches by solo founders than by teams.
Entrepreneurial entry increased sharply following the public release of ChatGPT-3.5.
Analysis of over 160,000 product launches on Product Hunt comparing entry rates before and after the public release of ChatGPT-3.5 (event-study / pre-post comparison across the platform).
AI is the most important predictive factor for Lae (based on artificial neural network analysis).
Artificial neural network (ANN) predictive modeling on composite indices for AI and Lae using panel data from 2012–2022 across 30 provincial regions; variable importance ranking from ANN indicates AI as top predictor.
An exogenous shock test using the Big Data Pilot Zone policy further confirms the robustness of the AI–Lae relationship findings.
Policy shock (Big Data Pilot Zone) robustness test performed on the same panel of 30 provincial regions (2012–2022); described as an exogenous shock test corroborating the main results.
Industrial robots influence global value chain length primarily through technological innovation.
Mechanism analysis in the paper linking robot adoption to technological innovation measures and then to GVC length, based on the IFR and 14-subsector panel data; exact innovation indicators and estimation details not provided in the abstract.
Industrial robots influence global value chain length primarily through human capital upgrading.
Mechanism analysis reported in the paper linking robot adoption to changes in human capital (upgrading) and then to changes in GVC length using the same IFR and panel data; specific tests/mediation approaches not detailed in the abstract.
Industrial robots promote participation in global production networks within capital-intensive industries (i.e., they increase global value chain length for capital-intensive sectors).
Subsample or heterogeneous-effects analysis across capital-intensive vs. labor-intensive sub-sectors using the panel of 14 Chinese manufacturing sub-sectors; results reported for capital-intensive industries as positive effect on GVC participation/length.
The application of industrial robots significantly extends the length of global value chains in manufacturing.
Empirical analysis using IFR robot data and panel data on 14 manufacturing sub-sectors; significance reported in paper (panel regression results). Exact model specifications and significance levels not provided in the abstract.
Regulatory modernisation, secure national data infrastructure and targeted digital training are essential to enable sustainable innovation in valuation practice.
Policy and practitioner recommendations derived from interview data and thematic analysis; synthesis into prescriptive recommendations.
Deep learning models (particularly LSTM and Transformer) exhibit stronger tail-risk control than traditional benchmark models.
Empirical risk analysis reported in the paper (tail-risk metrics/comparisons) indicating better tail-risk outcomes for LSTM and Transformer relative to linear and tree-based benchmarks.
Deep learning models (especially LSTM and Transformer) produce more stable WEI scores than traditional benchmarks.
Empirical comparison of WEI (the paper's proposed weighted evaluation index) across model types showing LSTM and Transformer with more stable (less variable/improved) WEI over the evaluation period.
Deep learning models, particularly LSTM and Transformer, deliver superior prediction accuracy compared to traditional benchmarks (linear and tree-based models).
Empirical model comparison using rolling-window forecasts on A-share data (2013–2024) across the listed factors; accuracy metrics reported in the paper (e.g., RMSE or similar) show better performance for deep learning models, with LSTM and Transformer highlighted.
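The rolling-window comparison described above can be sketched as follows. This is only an illustration of the evaluation loop, assuming one-step-ahead forecasts and RMSE as the accuracy metric; the two toy predictors and the toy series are hypothetical stand-ins, not the paper's models or A-share data.

```python
import numpy as np

def rolling_forecast_rmse(series, window, predict):
    """One-step-ahead forecasts from a fixed-length rolling window; returns RMSE."""
    series = np.asarray(series, dtype=float)
    errors = []
    for t in range(window, len(series)):
        history = series[t - window:t]  # only observations before time t
        errors.append(series[t] - predict(history))
    return float(np.sqrt(np.mean(np.square(errors))))

def last_value(history):
    """Naive benchmark: repeat the most recent observation."""
    return history[-1]

def window_mean(history):
    """Benchmark: average of the window."""
    return history.mean()

# Toy linear trend 0, 1, ..., 9 (hypothetical data).
trend = np.arange(10, dtype=float)
rmse_naive = rolling_forecast_rmse(trend, window=3, predict=last_value)
rmse_mean = rolling_forecast_rmse(trend, window=3, predict=window_mean)
print(rmse_naive, rmse_mean)
```

Because each forecast sees only the preceding window, every model is scored on data it has not been fitted to, which is what makes the accuracy comparison across model families meaningful.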
Hallucinated references disproportionately assign credit to already prominent and male scholars, suggesting LLM-generated errors may reinforce existing inequities in scientific recognition.
Analysis linking hallucinated citations to characteristics of the (intended or assigned) cited authors, including measures of prominence and inferred gender, showing over-representation of prominent and male scholars among hallucinated attributions.
Hallucinated references are especially pronounced among small and early-career author teams.
Analysis of hallucination prevalence by author-team characteristics (team size and author career stage) within the audited dataset.
Hallucinated references are especially pronounced in manuscripts with linguistic signatures of AI-assisted writing.
Classification of manuscripts by linguistic features (signatures) indicative of AI-assistance and comparison of hallucination prevalence between groups.
These errors are diffusely embedded across many papers but especially pronounced in fields with rapid AI uptake.
Cross-field comparison within the audited dataset showing higher rates of non-existent references in fields identified as having rapid AI adoption.
We provide a conservative estimate of 146,932 hallucinated citations in 2025 alone.
Quantitative extrapolation/estimation from the audit of references in the dataset, producing an annualized (2025) conservative count.
We find a sharp rise in non-existent references following widespread LLM adoption.
Temporal analysis of the audited references comparing prevalence of non-existent (hallucinated) citations before and after the period of widespread LLM adoption across the 111M-reference dataset.
The paper extends the TOE (Technology-Organization-Environment) framework by identifying an optimal AI adoption range and empirically validating the homogenization trap.
Theoretical contribution claimed in discussion linking empirical inverted-U and homogenization findings back to TOE framework.
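One common way an inverted-U relationship implies an optimal adoption level is a quadratic specification. The derivation below assumes that functional form; the symbols are illustrative, and the summary above does not confirm the paper's exact model.

```latex
\begin{aligned}
\text{Innovation}_{it} &= \beta_0 + \beta_1\,\mathrm{AI}_{it} + \beta_2\,\mathrm{AI}_{it}^{2} + \varepsilon_{it},
\qquad \beta_1 > 0,\ \beta_2 < 0, \\
\frac{\partial\,\text{Innovation}_{it}}{\partial\,\mathrm{AI}_{it}} &= \beta_1 + 2\beta_2\,\mathrm{AI}_{it} = 0
\;\Longrightarrow\; \mathrm{AI}^{*} = -\frac{\beta_1}{2\beta_2}.
\end{aligned}
```

Under this assumed form, the marginal effect of AI on innovation is positive below $\mathrm{AI}^{*}$ and negative above it, so an "optimal range" would be an interval around the turning point.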
AI’s enabling effect on innovation is more sustainable in high-technology firms (relative to low-tech firms).
Heterogeneity analyses by firm technology intensity (high-tech vs. others) showing more sustained positive AI effects in high-tech firms.
AI’s enabling effect on innovation is more sustainable in non-state-owned firms (compared to state-owned firms).
Heterogeneity analyses by ownership type reported in the paper showing stronger/sustained positive AI–innovation effects for non-state-owned firms.
Firm absorptive capacity partially mediates the AI–innovation relationship.
Bootstrap mediation analysis performed on the sample indicating a partial mediation effect of absorptive capacity between AI and innovation.
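A bootstrap mediation test of the kind mentioned above is often run as a percentile bootstrap of the indirect effect a × b. The sketch below assumes that variant with simple OLS paths on synthetic data; the variable names, effect sizes, and number of resamples are hypothetical and not taken from the paper.

```python
import numpy as np

def ols_coefs(y, *columns):
    """OLS coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def indirect_effect(x, m, y):
    """Product of path a (x -> m) and path b (m -> y, controlling for x)."""
    a = ols_coefs(m, x)[1]
    b = ols_coefs(y, x, m)[2]
    return a * b

def bootstrap_ci(x, m, y, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    draws = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        draws[i] = indirect_effect(x[idx], m[idx], y[idx])
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Synthetic data with a direct path (0.3) and an indirect path (0.6 * 0.5).
rng = np.random.default_rng(1)
n = 400
ai = rng.normal(size=n)
absorptive = 0.6 * ai + rng.normal(scale=0.5, size=n)
innovation = 0.3 * ai + 0.5 * absorptive + rng.normal(scale=0.5, size=n)

point = indirect_effect(ai, absorptive, innovation)
lo, hi = bootstrap_ci(ai, absorptive, innovation)
print(point, (lo, hi))
```

An interval that excludes zero supports an indirect path; mediation is called partial when the direct coefficient on the predictor also remains nonzero once the mediator is included.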
The positive effect of GGFs on digital–intelligent transformation is particularly strong for firms with robust dynamic capabilities.
Heterogeneity analysis reported in the paper comparing effects across firms with differing levels of dynamic capabilities using the DID sample of Chinese A-share listed firms (2012–2024).
The positive effect of GGFs on digital–intelligent transformation is particularly strong for firms operating in high-tech industries.
Heterogeneity analysis reported in the paper comparing effects across industries (high-tech vs. others) using the DID sample of Chinese A-share listed firms (2012–2024).
The positive effect of GGFs on digital–intelligent transformation is particularly strong in firms with high-quality internal controls.
Heterogeneity analysis reported in the paper comparing effects across firms with different internal control quality using the DID sample of Chinese A-share listed firms (2012–2024).
GGFs promote firms’ digital–intelligent transformation by encouraging knowledge spillovers.
Mechanism analysis reported in the paper that identifies knowledge spillovers as a channel from GGFs to firm-level digital–intelligent transformation, using the DID framework on Chinese A-share listed firms (2012–2024).
GGFs promote firms’ digital–intelligent transformation by transmitting policy guidance.
Mechanism analysis reported in the paper indicating a pathway from GGFs to firm transformation via policy guidance channels, based on the DID sample of Chinese A-share listed firms (2012–2024).
GGFs promote firms’ digital–intelligent transformation by easing firms' financing constraints.
Mechanism analysis reported in the paper (mediation / pathway analysis tied to the DID framework) using the same sample of Chinese A-share listed firms (2012–2024).
Government-guided funds (GGFs) significantly promote firms’ digital–intelligent transformation.
Difference-in-differences (DID) analysis applied to Chinese A-share listed firms over 2012–2024, as reported in the paper's main empirical results.
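The logic of a difference-in-differences estimate can be sketched with the simplest two-group, two-period case. This is an illustration only: the paper's specification (for example fixed effects, controls, or staggered treatment timing) is not given here, and the numbers below are made up.

```python
import numpy as np

def did_estimate(outcome, treated, post):
    """2x2 DID: (treated post - treated pre) - (control post - control pre)."""
    outcome = np.asarray(outcome, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    post = np.asarray(post, dtype=bool)

    def cell_mean(is_treated, is_post):
        return outcome[(treated == is_treated) & (post == is_post)].mean()

    treated_change = cell_mean(True, True) - cell_mean(True, False)
    control_change = cell_mean(False, True) - cell_mean(False, False)
    return treated_change - control_change

# Hypothetical transformation scores for eight firm-period observations.
outcome = [1.0, 1.2, 2.0, 2.2, 1.0, 1.4, 1.5, 1.9]
treated = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = firm received GGF investment
post    = [0, 0, 1, 1, 0, 0, 1, 1]  # 1 = period after the investment

effect = did_estimate(outcome, treated, post)
print(effect)
```

Subtracting the control group's change removes trends common to both groups, so the remainder is attributed to the treatment, provided the two groups would otherwise have moved in parallel.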
Broader equity markets, proxied by the S&P 500, remain the dominant source of spillovers throughout the sample period.
Directional spillover results from the TVP-VAR indicating the S&P 500 has the largest and persistent net outward spillover contributions over the full sample.