The Commonplace

Evidence (4114 claims)

Adoption (8570 claims)
Productivity (7631 claims)
Governance (6869 claims)
Human-AI Collaboration (6491 claims)
Org Design (4175 claims)
Innovation (4114 claims)
Labor Markets (3566 claims)
Skills & Training (2966 claims)
Inequality (2066 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
Filter: Innovation
Existing approaches remain fragmented across formal verification, runtime assurance, neuro-symbolic reasoning and trustworthy Artificial Intelligence (AI) research communities.
Author claim about the state of the research landscape; asserted fragmentation without bibliometric or survey data provided in excerpt.
high negative ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... degree of integration/coordination across research communities
Current reasoning systems still suffer from hidden logical inconsistencies, hallucinated symbolic transitions, unsupported theorem applications, and limited reliability guarantees.
Author assertion identifying failure modes of current reasoning systems; presented qualitatively without quantitative error rates or experimental sample sizes in the excerpt.
high negative ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... reliability / correctness of reasoning systems
Translators have functioned as 'invisible teachers' of AI—through the construction of translation memories, post-editing, and quality assessment—without recognition as teachers of models.
Conceptual framing and synthesis of workflow practices (TM construction, post-editing, QA) and their role as supervision for ML; qualitative argument and illustrative examples in the paper. No quantitative sample reported.
high negative Translators as Invisible Teachers of AI: Copyright, Translat... lack of recognition/attribution for contributors who effectively trained AI
Translators' renditions have been bought as deliverables under contract, segmented as technical objects, and processed as 'information analysis' data under copyright law—resulting in the loss of moral, creative, and economic attribution to the translators who produced them.
Comparative reading of contract practices and copyright treatment (legal/contractual analysis across jurisdictions), descriptive examples of how translations are delivered, segmented, and processed; qualitative argumentation in the paper. No quantitative sample reported.
high negative Translators as Invisible Teachers of AI: Copyright, Translat... loss of attribution and economic recognition for translators
Existing legal perspectives on the intellectual property of AI-generated works and related enforcement challenges are inadequately addressed under current frameworks.
Analytic review of legal perspectives and enforcement issues presented in the paper; conclusion based on the author's analysis rather than quantitative data.
high negative Examining the Challenges of Intellectual Property in AI-Gene... adequacy of legal perspectives and enforcement mechanisms for AI-generated IP
The current Iranian legal framework contains significant regulatory gaps with respect to intellectual property protection for AI-generated works.
Comparative legal analysis of Iranian statutes (1969 Law for the Protection of Authors, Composers, and Artists Rights and the Patent and Trademark Registration Law) against other legal systems (European Union, United Kingdom, United States); the paper's findings are based on legal/textual analysis rather than empirical sampling.
high negative Examining the Challenges of Intellectual Property in AI-Gene... presence of regulatory gaps in Iranian IP law regarding AI-generated works
The most critical intellectual property issue raised by AI-generated outputs is ownership of moral and economic rights in the absence of a human creator.
Theoretical discussion and literature review presented in the paper identifying legal and doctrinal questions around authorship and ownership when no human creator is involved (no empirical sample size).
high negative Examining the Challenges of Intellectual Property in AI-Gene... clarity/assignment of moral and economic IP rights for works lacking a human aut...
There is an urgent question of how humans can effectively supervise and control an economy operated by AI agents when this system may expand beyond the capacity of traditional governance.
Framed as a central research/policy concern in the paper's abstract; conceptual argument rather than empirical finding.
high negative Regulatory Policy for the Agent Economy in the Digital Age: ... capacity of traditional governance to supervise/control AI-operated economy
The Agent Economy raises new regulatory challenges concerning data privacy, security, ethics, and the risk of job displacement.
Stated in paper abstract as identified risks; based on literature synthesis and comparative policy analysis approach (method described), but no empirical incidence metrics reported.
high negative Regulatory Policy for the Agent Economy in the Digital Age: ... regulatory challenges related to privacy, security, ethics, and job displacement...
Under water-constrained conditions, the framework achieves reductions of approximately 3-5% in generation-related freshwater withdrawals.
Quantitative results from simulation case studies on the IEEE test systems (reported percentage reduction ~3-5%); sample context: water-constrained simulation scenarios on IEEE 30-bus and 118-bus systems (sample_size = 2 test systems).
high negative From Accounting to Coordination: A Virtual Water-Aware Elect... generation-related freshwater withdrawals
Because they are decoupled from the optimization process, static statistical accounting approaches are incapable of guiding workload relocation or power dispatch to mitigate water stress.
Argumentative claim in paper about limitations of static accounting methods with respect to guiding operational decisions (methodological critique).
high negative From Accounting to Coordination: A Virtual Water-Aware Elect... suitability of static accounting to guide workload relocation and power dispatch...
Existing approaches typically rely on static statistical accounting to quantify these water footprints, but such static methods fail to capture how dispatch optimization and workload relocation dynamically affect water withdrawals.
Critical assessment in paper contrasting prior static statistical accounting approaches with dynamic needs; presented as methodological critique (no particular empirical sample in excerpt).
high negative From Accounting to Coordination: A Virtual Water-Aware Elect... accuracy/adequacy of static statistical accounting methods for water footprint a...
As these systems scale, the bottleneck shifts away from raw model capability toward coordination.
Analytical/argumentative claim in the paper framing a shift in primary constraint; no empirical study or quantified benchmark reported.
high negative Foundation Protocol: A Coordination Layer for Agentic Societ... primary system bottleneck (model capability versus coordination capacity)
Current systems still struggle with evidence preservation, reproducibility, weak-direction rejection, provenance tracking, cross-domain robustness, and accountable scientific closure.
Survey-identified recurring failure modes and limitations reported in literature and system descriptions; qualitative synthesis.
high negative AutoResearch AI: Towards AI-Powered Research Automation for ... capabilities related to evidence preservation, reproducibility, rejection of wea...
Current systems remain fragmented, differing in autonomy, domain scope, execution environment, validation mechanism, and human oversight.
Survey of existing systems and categorization across the listed dimensions; descriptive synthesis rather than an empirical meta-analysis.
high negative AutoResearch AI: Towards AI-Powered Research Automation for ... heterogeneity/fragmentation across AI research systems along autonomy, domain sc...
AI power demand is growing at an unprecedented rate while power grids are often ailing and struggle to keep up.
Statement in paper's motivation/background; no empirical method or sample size reported in the abstract.
high negative XWind: A Cross-site Router for Large Language Model Inferenc... strain on power grids relative to AI power demand
Monotonic baselines collapse when extrapolating beyond the training regime (e.g., predicting a 12B model up to 307B tokens) whereas the Shannon Scaling Law remains predictive.
Empirical comparison on the held-out 12B extrapolation: authors report collapse/failure of monotonic baseline scaling laws in that regime contrasted with Shannon law's successful prediction (pooled R^2 reported).
high negative LLMs as Noisy Channels: A Shannon Perspective on Model Capac... extrapolative predictive failure/success of baseline vs proposed scaling laws
This Shannon perspective reveals a fundamental Shannon capacity for LLMs: scaling model size or data without preserving a sufficient signal-to-noise ratio (SNR) inevitably amplifies noise, inducing a transition from monotonic improvement to U-shaped performance degradation.
Theoretical argument derived from the Shannon-Hartley based formulation plus supporting empirical examples claimed in the paper showing non-monotonic (U-shaped) loss/accuracy behavior when SNR is insufficient.
high negative LLMs as Noisy Channels: A Shannon Perspective on Model Capac... performance vs. scale behavior (transition from monotonic improvement to U-shape...
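For orientation, the classical Shannon-Hartley theorem that the entry above refers to is reproduced here. This is the textbook statement only; how the paper maps model size, data, and noise onto these quantities is not given in this excerpt, so no such mapping is assumed.

```latex
C = B \log_2\!\left(1 + \frac{S}{N}\right)
```

Here $C$ is the channel capacity in bits per second, $B$ is the bandwidth in hertz, and $S/N$ is the linear (not decibel) signal-to-noise ratio. With $B$ fixed, capacity grows only logarithmically in $S/N$.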
Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute.
Author assertion based on literature/contextual observation and motivating examples (catastrophic overtraining, quantization-induced degradation) referenced in the paper; no specific numeric sample provided in the excerpt.
high negative LLMs as Noisy Channels: A Shannon Perspective on Model Capac... ability of prior scaling laws to explain non-monotonic performance phenomena (e....
Commercial or dual-use AI models and semiconductors do not meet the security exception criteria under GATT Article XXI(b), so security interests should be interpreted with restraint.
Legal argument and interpretive analysis in the paper contending that the GATT Article XXI(b) security exception does not encompass routine commercial or dual-use AI models and semiconductors; doctrinal legal reasoning rather than empirical measurement.
high negative Strategic Stalemates: The Paradox of Export Controls in the ... applicability of GATT Article XXI(b) security exception to dual-use/commercial A...
Overusing export controls can complicate dispute resolution and hinder AI progress.
Normative and legal-political argument in the paper: overuse raises legal disputes (e.g., WTO litigation) and may slow cross-border AI development and diffusion (qualitative reasoning).
high negative Strategic Stalemates: The Paradox of Export Controls in the ... frequency/complexity of trade disputes and pace of AI progress/development
Overly strict or arbitrary controls may violate WTO obligations.
Legal analysis in the paper arguing that some export controls could conflict with WTO law (GATT) depending on scope and justification; interpretive legal reasoning cited.
high negative Strategic Stalemates: The Paradox of Export Controls in the ... compatibility of export controls with WTO obligations
The long-term effectiveness of export controls is questionable.
Paper's argumentative assessment drawing on historical examples and theoretical considerations (qualitative reasoning rather than quantitative causal inference).
high negative Strategic Stalemates: The Paradox of Export Controls in the ... effectiveness of export controls over the long term
China responded with export curbs on critical minerals and filed a WTO complaint against the U.S. under GATT.
Factual claim citing China's counter-measures (export curbs) and legal action (WTO complaint under GATT) as described in the paper.
high negative Strategic Stalemates: The Paradox of Export Controls in the ... China's retaliatory trade measures and litigation
Technical bottlenecks (cross-border data compliance, algorithm interpretability) and ethical challenges (algorithmic bias, privacy infringement, cultural conflicts) are intertwined impediments to intelligent international marketing.
Synthesis of challenges identified across the reviewed literature (systematic review and content analysis, 2010–2025) as reported in the paper.
high negative Research on International Marketing in the Context of Intell... presence and interrelation of technical and ethical barriers
Traditional international marketing theories, constrained by static assumptions and linear logic, struggle to explain intelligent contexts.
Conclusion from the paper's systematic review and content analysis of core literature (2010–2025); no quantitative test or sample size reported in the summary.
high negative Research on International Marketing in the Context of Intell... theoretical explanatory adequacy of traditional international marketing theories
Because contracts are negotiated by legal departments alone, many apparent legal disputes are incentive misalignment problems that only scientists at the table can correctly diagnose.
Argumentative claim presented in the paper (normative/diagnostic); no empirical study or sample provided in the excerpt.
high negative Position: The Pre/Post-Training Boundary Should Govern IP in... quality of contract negotiations / correct diagnosis of incentives in disputes
These failures are not for scientific reasons, but because academics must publish while companies must protect models trained on proprietary data, and no standard contract framework resolves this tension.
The paper presents this as the causal explanation (analytical/argumentative claim); no empirical testing or sample reported in the provided text.
high negative Position: The Pre/Post-Training Boundary Should Govern IP in... incentive alignment between academic publication requirements and company IP pro...
Industry-academia ML collaborations routinely fail to launch.
Asserted in the paper as an empirical observation/statement; no empirical methods, data, or sample size reported in the provided text (argument/anecdote).
high negative Position: The Pre/Post-Training Boundary Should Govern IP in... success rate of launching industry-academia ML collaborations
Current regulatory frameworks—designed for human-intermediated payments—are ill-equipped to address the dynamic and decentralised nature of agent-led transactions.
Regulatory and legal analysis asserted in the abstract (argument that existing frameworks are mismatched to agent-led payments).
high negative AI Agents in Payments: Applications, Risks and Regulations adequacy of existing regulatory frameworks for agent-led transactions
The article identifies and categorises a range of technical, legal and societal risks, including cybersecurity vulnerabilities, liability gaps, regulatory non-compliance, and potential economic disruption.
Risk identification and categorisation presented in the paper (qualitative analysis and case studies referenced in the abstract). No quantitative risk measurement reported in the abstract.
high negative AI Agents in Payments: Applications, Risks and Regulations technical, legal and societal risks (cybersecurity, liability, regulatory non-co...
The lack of prediction stability and predictability can lead to advertiser-perceivable problems such as repeatability issues, cold start, and under-exploration.
Stated as an intuitive/motivational claim in the paper linking instability to advertiser-facing problems; no empirical quantification provided in the excerpt.
high negative LLM Retrieval for Stable and Predictable Ad Recommendations repeatability, cold start, under-exploration (advertiser-perceived issues)
Traditional ads recommendation systems have primarily focused on optimizing for prediction accuracy of click or conversion events using canonical metrics such as recall or normalized discounted cumulative gain (NDCG).
Background/contextual claim about prior work and standard practice; stated in the paper as motivation (no empirical evidence provided in the excerpt).
high negative LLM Retrieval for Stable and Predictable Ad Recommendations optimization focus on click/conversion prediction accuracy (recall, NDCG)
AIO is negatively associated with the carbon emission intensity of upstream suppliers.
Authors report a negative association between firms' AIO and the carbon emission intensity of their upstream suppliers in the empirical results using Chinese listed firms (2010–2023).
high negative Artificial intelligence orientation and decarbonization spil... carbon emission intensity (upstream suppliers)
AIO is negatively associated with the carbon emission intensity of industry peers.
Authors report a negative association between a firm's AIO and the carbon emission intensity of its industry peers based on their empirical analyses of Chinese listed companies over 2010–2023.
high negative Artificial intelligence orientation and decarbonization spil... carbon emission intensity (industry peers)
Stronger AIO is associated with lower carbon emission intensity within the focal firm.
Empirical association reported between firm-level AIO (measured via LLMs) and firm carbon emission intensity in the authors' analysis of Chinese listed firms (2010–2023); result described as a negative relationship.
high negative Artificial intelligence orientation and decarbonization spil... carbon emission intensity (focal firm)
There are findings indicating that public R&D expenditures are not used effectively and efficiently (because public R&D shows a negative relationship).
Inference based on random-effects regression results showing the negative relationship (G8 + Türkiye, 2010-2020).
high negative AR-GE HARCAMALARININ VE VERGİ TEŞVİKLERİNİN YAPAY ZEKAYA ETK... effectiveness/efficiency (interpretive inference, not directly measured)
There is a negative relationship between economic growth and the number of AI patents.
Panel regression (random effects) results are reported (G8 + Türkiye, 2010-2020); the economic growth variable (probably the GNP growth rate) is reported to show a negative relationship with the number of AI patents.
high negative AR-GE HARCAMALARININ VE VERGİ TEŞVİKLERİNİN YAPAY ZEKAYA ETK... number of AI patents
There is a negative relationship between public R&D expenditures and the number of AI patents.
Results reported from a random-effects panel regression (G8 + Türkiye, 2010-2020); the public R&D expenditure variable is reported to show a negative relationship with the number of AI patents.
high negative AR-GE HARCAMALARININ VE VERGİ TEŞVİKLERİNİN YAPAY ZEKAYA ETK... number of AI patents
Science-to-technology knowledge flow in AI has been insufficiently examined in a systematic and structural way.
Literature-gap claim in the paper motivating the study.
high negative Knowledge flows from science to AI technology: Identifying c... extent of systematic/structural study of science-to-technology knowledge flow in...
Unrestricted frontier-scale checkpoint synthesis remains open (i.e., not yet solved).
Authors' assessment in the abstract noting current limits; asserts that unrestricted synthesis at frontier/model-scale has not been achieved.
high negative Position: Weight Space Should Be a First-Class Generative AI... feasibility/status of unrestricted frontier-scale checkpoint synthesis
In the context of search retrieval, current cold-start models suffer from the misalignment between training objectives and online business metrics, and they lack effective mechanisms to measure an item's growth potential.
Claim made in paper as motivation/background; no empirical details provided in the excerpt.
high negative Towards Sustainable Growth: A Multi-Value-Aware Retrieval Fr... alignment between model training objectives and online business metrics / abilit...
Existing systems tend to prioritize presenting users with already popular items, a phenomenon often referred to as the "Matthew effect".
Statement/observation in the paper; presented as background/motivation (no empirical evidence or sample size reported in the excerpt).
high negative Towards Sustainable Growth: A Multi-Value-Aware Retrieval Fr... presentation/exposure bias toward popular items
An analysis of a 21-instrument inventory identifies an incentive gradient where geopolitical and industrial pressures systematically reward surface-level behavioral proxies over deep structural verification.
Empirical/qualitative analysis of an inventory of 21 governance instruments compiled and analysed in the paper (n=21 instruments).
high negative Position: Behavioural Assurance Cannot Verify the Safety Cla... governance_and_regulation
Behavioural assurance, even when carefully designed, is being asked to carry safety claims it cannot verify.
The paper's normative and conceptual argument synthesising governance requirements and the epistemic limits of behavioural testing.
Current assurance methodologies (primarily behavioural evaluations and red-teaming) are epistemically limited to observable model outputs and cannot verify latent representations or long-horizon agentic behaviours.
Conceptual/analytic argument and review of existing assurance methodologies presented in the paper.
Policy responses in Europe are fragmented across the EU and Member State levels and do not match the potential scale of disruption from AGI.
Paper's policy analysis of EU- and Member-State-level responses (stated in abstract); no quantitative metrics provided in the abstract.
high negative Europe and the Geopolitics of AGI: The Need for a Preparedne... governance_and_regulation
Europe has low rates of industrial AI adoption.
Paper's empirical/policy review claiming low industrial AI adoption in Europe (as stated in abstract); the abstract does not provide numeric adoption rates or sample sizes.
Europe exhibits structural weaknesses in compute infrastructure and talent retention.
Paper's structural assessment of Europe's AI value-chain capabilities (stated in abstract); no numerical measures provided in the abstract.
Europe has limited strategic awareness of frontier AI progress.
Paper's assessment of Europe's positioning based on policy analysis and review of capabilities monitoring (as stated in abstract); no supporting metrics or sample sizes provided in the abstract.
high negative Europe and the Geopolitics of AGI: The Need for a Preparedne... governance_and_regulation