Evidence (4114 claims)
Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 758 | 199 | 100 | 900 | 2007 |
| Governance & Regulation | 826 | 400 | 191 | 122 | 1563 |
| Organizational Efficiency | 777 | 193 | 124 | 84 | 1189 |
| Technology Adoption Rate | 635 | 233 | 124 | 97 | 1098 |
| Research Productivity | 422 | 128 | 57 | 336 | 954 |
| Output Quality | 476 | 179 | 59 | 47 | 761 |
| Decision Quality | 328 | 177 | 81 | 47 | 640 |
| Firm Productivity | 435 | 57 | 88 | 20 | 606 |
| AI Safety & Ethics | 218 | 277 | 65 | 33 | 599 |
| Market Structure | 180 | 170 | 123 | 24 | 502 |
| Task Allocation | 213 | 64 | 72 | 33 | 387 |
| Skill Acquisition | 170 | 61 | 61 | 17 | 309 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 54 | 107 | 13 | 281 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 117 | 63 | 42 | 11 | 233 |
| Firm Revenue | 153 | 48 | 26 | 3 | 230 |
| Task Completion Time | 173 | 31 | 8 | 12 | 225 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Worker Satisfaction | 89 | 65 | 22 | 12 | 188 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 77 | 69 | 14 | 5 | 165 |
| Automation Exposure | 56 | 56 | 26 | 13 | 154 |
| Training Effectiveness | 94 | 21 | 13 | 19 | 149 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Team Performance | 86 | 17 | 27 | 10 | 141 |
| Developer Productivity | 95 | 17 | 14 | 6 | 133 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 52 | 7 | 8 | 3 | 70 |
| Creative Output | 31 | 18 | 8 | 3 | 61 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
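The counts above can be turned into direction shares with a few lines of arithmetic. The sketch below is illustrative only: it copies three rows from the matrix and computes the positive and negative shares over the four listed directions. Note that in some rows the Total column exceeds the sum of the four direction columns (Firm Productivity lists 606, while the listed directions sum to 600). The page does not explain the gap, so the sketch reports it as "unclassified" rather than guessing; that label and the helper name `summarize` are assumptions made for the example.

```python
from typing import Dict, Tuple

# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
ROWS: Dict[str, Tuple[int, int, int, int, int]] = {
    "Firm Productivity": (435, 57, 88, 20, 606),
    "Job Displacement": (12, 80, 20, 1, 113),
    "Error Rate": (69, 92, 10, 2, 173),
}


def summarize(positive: int, negative: int, mixed: int, null: int, total: int) -> dict:
    """Compute direction shares over the four listed directions.

    'unclassified' is a hypothetical label for the difference between the
    reported total and the sum of the listed directions.
    """
    listed = positive + negative + mixed + null
    return {
        "listed": listed,
        "unclassified": total - listed,
        "share_positive": positive / listed,
        "share_negative": negative / listed,
    }


for outcome, counts in ROWS.items():
    s = summarize(*counts)
    print(
        f"{outcome}: positive {s['share_positive']:.1%}, "
        f"negative {s['share_negative']:.1%}, "
        f"unclassified {s['unclassified']}"
    )
```

Shares are computed over the listed directions rather than the reported total so that rows with an unexplained gap remain comparable with rows without one.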
Innovation
Existing approaches remain fragmented across formal verification, runtime assurance, neuro-symbolic reasoning and trustworthy Artificial Intelligence (AI) research communities.
Author claim about the state of the research landscape; asserted fragmentation without bibliometric or survey data provided in excerpt.
Current reasoning systems still suffer from hidden logical inconsistencies, hallucinated symbolic transitions, unsupported theorem applications, and limited reliability guarantees.
Author assertion identifying failure modes of current reasoning systems; presented qualitatively without quantitative error rates or experimental sample sizes in the excerpt.
Translators have functioned as 'invisible teachers' of AI—through the construction of translation memories, post-editing, and quality assessment—without recognition as teachers of models.
Conceptual framing and synthesis of workflow practices (TM construction, post-editing, QA) and their role as supervision for ML; qualitative argument and illustrative examples in the paper. No quantitative sample reported.
Translators' renditions have been bought as deliverables under contract, segmented as technical objects, and processed as 'information analysis' data under copyright law—resulting in the loss of moral, creative, and economic attribution to the translators who produced them.
Comparative reading of contract practices and copyright treatment (legal/contractual analysis across jurisdictions), descriptive examples of how translations are delivered, segmented, and processed; qualitative argumentation in the paper. No quantitative sample reported.
Existing legal perspectives on the intellectual property of AI-generated works and related enforcement challenges are inadequately addressed under current frameworks.
Analytic review of legal perspectives and enforcement issues presented in the paper; conclusion based on the author's analysis rather than quantitative data.
The current Iranian legal framework contains significant regulatory gaps with respect to intellectual property protection for AI-generated works.
Comparative legal analysis of Iranian statutes (1969 Law for the Protection of Authors, Composers, and Artists Rights and the Patent and Trademark Registration Law) against other legal systems (European Union, United Kingdom, United States); the paper's findings are based on legal/textual analysis rather than empirical sampling.
The most critical intellectual property issue raised by AI-generated outputs is ownership of moral and economic rights in the absence of a human creator.
Theoretical discussion and literature review presented in the paper identifying legal and doctrinal questions around authorship and ownership when no human creator is involved (no empirical sample size).
There is an urgent question of how humans can effectively supervise and control an economy operated by AI agents when this system may expand beyond the capacity of traditional governance.
Framed as a central research/policy concern in the paper's abstract; conceptual argument rather than empirical finding.
The Agent Economy raises new regulatory challenges concerning data privacy, security, ethics, and the risk of job displacement.
Stated in paper abstract as identified risks; based on literature synthesis and comparative policy analysis approach (method described), but no empirical incidence metrics reported.
Under water-constrained conditions, the framework achieves reductions of approximately 3-5% in generation-related freshwater withdrawals.
Quantitative results from simulation case studies on the IEEE test systems (reported percentage reduction ~3-5%); sample context: water-constrained simulation scenarios on IEEE 30-bus and 118-bus systems (sample_size = 2 test systems).
Because they are decoupled from the optimization process, static statistical accounting approaches are incapable of guiding workload relocation or power dispatch to mitigate water stress.
Argumentative claim in paper about limitations of static accounting methods with respect to guiding operational decisions (methodological critique).
Existing approaches typically rely on static statistical accounting to quantify these water footprints, but such static methods fail to capture how dispatch optimization and workload relocation dynamically affect water withdrawals.
Critical assessment in paper contrasting prior static statistical accounting approaches with dynamic needs; presented as methodological critique (no particular empirical sample in excerpt).
As these systems scale, the bottleneck shifts away from raw model capability toward coordination.
Analytical/argumentative claim in the paper framing a shift in primary constraint; no empirical study or quantified benchmark reported.
Current systems still struggle with evidence preservation, reproducibility, weak-direction rejection, provenance tracking, cross-domain robustness, and accountable scientific closure.
Survey-identified recurring failure modes and limitations reported in literature and system descriptions; qualitative synthesis.
Current systems remain fragmented, differing in autonomy, domain scope, execution environment, validation mechanism, and human oversight.
Survey of existing systems and categorization across the listed dimensions; descriptive synthesis rather than an empirical meta-analysis.
AI power demand is growing at an unprecedented rate while power grids are often ailing and struggle to keep up.
Statement in paper's motivation/background; no empirical method or sample size reported in the abstract.
Monotonic baselines collapse when extrapolating beyond the training regime (e.g., predicting a 12B model up to 307B tokens) whereas the Shannon Scaling Law remains predictive.
Empirical comparison on the held-out 12B extrapolation: authors report collapse/failure of monotonic baseline scaling laws in that regime contrasted with Shannon law's successful prediction (pooled R^2 reported).
This Shannon perspective reveals a fundamental Shannon capacity for LLMs: scaling model size or data without preserving a sufficient signal-to-noise ratio (SNR) inevitably amplifies noise, inducing a transition from monotonic improvement to U-shaped performance degradation.
Theoretical argument derived from the Shannon-Hartley based formulation plus supporting empirical examples claimed in the paper showing non-monotonic (U-shaped) loss/accuracy behavior when SNR is insufficient.
Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute.
Author assertion based on literature/contextual observation and motivating examples (catastrophic overtraining, quantization-induced degradation) referenced in the paper; no specific numeric sample provided in the excerpt.
Commercial or dual-use AI models and semiconductors do not meet the security exception criteria under GATT Article XXI(b), so security interests should be interpreted narrowly.
Legal argument and interpretive analysis in the paper contending that the GATT Article XXI(b) security exception does not encompass routine commercial or dual-use AI models and semiconductors; doctrinal legal reasoning rather than empirical measurement.
Overusing export controls can complicate dispute resolution and hinder AI progress.
Normative and legal-political argument in the paper: overuse raises legal disputes (e.g., WTO litigation) and may slow cross-border AI development and diffusion (qualitative reasoning).
Overly strict or arbitrary controls may violate WTO obligations.
Legal analysis in the paper arguing that some export controls could conflict with WTO law (GATT) depending on scope and justification; interpretive legal reasoning cited.
The long-term effectiveness of export controls is questionable.
Paper's argumentative assessment drawing on historical examples and theoretical considerations (qualitative reasoning rather than quantitative causal inference).
China responded with export curbs on critical minerals and filed a WTO complaint against the U.S. under GATT.
Factual claim citing China's counter-measures (export curbs) and legal action (WTO complaint under GATT) as described in the paper.
Technical bottlenecks (cross-border data compliance, algorithm interpretability) and ethical challenges (algorithmic bias, privacy infringement, cultural conflicts) are intertwined impediments to intelligent international marketing.
Synthesis of challenges identified across the reviewed literature (systematic review and content analysis, 2010–2025) as reported in the paper.
Traditional international marketing theories, constrained by static assumptions and linear logic, struggle to explain intelligent contexts.
Conclusion from the paper's systematic review and content analysis of core literature (2010–2025); no quantitative test or sample size reported in the summary.
Because contracts are negotiated by legal departments alone, many apparent legal disputes are incentive misalignment problems that only scientists at the table can correctly diagnose.
Argumentative claim presented in the paper (normative/diagnostic); no empirical study or sample provided in the excerpt.
These failures are not for scientific reasons, but because academics must publish while companies must protect models trained on proprietary data, and no standard contract framework resolves this tension.
The paper presents this as the causal explanation (analytical/argumentative claim); no empirical testing or sample reported in the provided text.
Industry-academia ML collaborations routinely fail to launch.
Asserted in the paper as an empirical observation/statement; no empirical methods, data, or sample size reported in the provided text (argument/anecdote).
Current regulatory frameworks—designed for human-intermediated payments—are ill-equipped to address the dynamic and decentralised nature of agent-led transactions.
Regulatory and legal analysis asserted in the abstract (argument that existing frameworks are mismatched to agent-led payments).
The article identifies and categorises a range of technical, legal and societal risks, including cybersecurity vulnerabilities, liability gaps, regulatory non-compliance, and potential economic disruption.
Risk identification and categorisation presented in the paper (qualitative analysis and case studies referenced in the abstract). No quantitative risk measurement reported in the abstract.
The lack of prediction stability and predictability can lead to advertiser-perceivable problems such as repeatability issues, cold start, and under-exploration.
Stated as an intuitive/motivational claim in the paper linking instability to advertiser-facing problems; no empirical quantification provided in the excerpt.
Traditional ads recommendation systems have primarily focused on optimizing for prediction accuracy of click or conversion events using canonical metrics such as recall or normalized discounted cumulative gain (NDCG).
Background/contextual claim about prior work and standard practice; stated in the paper as motivation (no empirical evidence provided in the excerpt).
AIO is negatively associated with the carbon emission intensity of upstream suppliers.
Authors report a negative association between firms' AIO and the carbon emission intensity of their upstream suppliers in the empirical results using Chinese listed firms (2010–2023).
AIO is negatively associated with the carbon emission intensity of industry peers.
Authors report a negative association between a firm's AIO and the carbon emission intensity of its industry peers based on their empirical analyses of Chinese listed companies over 2010–2023.
Stronger AIO is associated with lower carbon emission intensity within the focal firm.
Empirical association reported between firm-level AIO (measured via LLMs) and firm carbon emission intensity in the authors' analysis of Chinese listed firms (2010–2023); result described as a negative relationship.
There are findings indicating that public R&D expenditures are not used effectively and efficiently (because public R&D shows a negative relationship).
Inference based on random-effects regression results showing the negative relationship (G8 + Türkiye, 2010-2020).
There is a negative relationship between economic growth and the number of AI patents.
Panel regression (random effects) results are reported (G8 + Türkiye, 2010-2020); the economic growth variable (probably the GNP growth rate) is reported to show a negative relationship with AI patent counts.
There is a negative relationship between public R&D expenditures and the number of AI patents.
Results reported from a random-effects panel regression (G8 + Türkiye, 2010-2020); the public R&D expenditure variable is reported to show a negative relationship with the number of AI patents.
Science-to-technology knowledge flow in AI has been insufficiently examined in a systematic and structural way.
Literature-gap claim in the paper motivating the study.
Unrestricted frontier-scale checkpoint synthesis remains open (i.e., not yet solved).
Authors' assessment in the abstract noting current limits; asserts that unrestricted synthesis at frontier/model-scale has not been achieved.
In the context of search retrieval, current cold-start models suffer from the misalignment between training objectives and online business metrics, and they lack effective mechanisms to measure an item's growth potential.
Claim made in paper as motivation/background; no empirical details provided in the excerpt.
Existing systems tend to prioritize presenting users with already popular items, a phenomenon often referred to as the "Matthew effect".
Statement/observation in the paper; presented as background/motivation (no empirical evidence or sample size reported in the excerpt).
An analysis of a 21-instrument inventory identifies an incentive gradient where geopolitical and industrial pressures systematically reward surface-level behavioral proxies over deep structural verification.
Empirical/qualitative analysis of an inventory of 21 governance instruments compiled and analysed in the paper (n=21 instruments).
Behavioural assurance, even when carefully designed, is being asked to carry safety claims it cannot verify.
The paper's normative and conceptual argument synthesising governance requirements and the epistemic limits of behavioural testing.
Current assurance methodologies (primarily behavioural evaluations and red-teaming) are epistemically limited to observable model outputs and cannot verify latent representations or long-horizon agentic behaviours.
Conceptual/analytic argument and review of existing assurance methodologies presented in the paper.
Policy responses in Europe are fragmented across the EU and Member State levels and do not match the potential scale of disruption from AGI.
Paper's policy analysis of EU- and Member-State-level responses (stated in abstract); no quantitative metrics provided in the abstract.
Europe has low rates of industrial AI adoption.
Paper's empirical/policy review claiming low industrial AI adoption in Europe (as stated in abstract); the abstract does not provide numeric adoption rates or sample sizes.
Europe exhibits structural weaknesses in compute infrastructure and talent retention.
Paper's structural assessment of Europe's AI value-chain capabilities (stated in abstract); no numerical measures provided in the abstract.
Europe has limited strategic awareness of frontier AI progress.
Paper's assessment of Europe's positioning based on policy analysis and review of capabilities monitoring (as stated in abstract); no supporting metrics or sample sizes provided in the abstract.