The Commonplace

Evidence (6869 claims)

Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome                      Positive Negative    Mixed     Null    Total
Other                             758      199      100      900     2007
Governance & Regulation           826      400      191      122     1563
Organizational Efficiency         777      193      124       84     1189
Technology Adoption Rate          635      233      124       97     1098
Research Productivity             422      128       57      336      954
Output Quality                    476      179       59       47      761
Decision Quality                  328      177       81       47      640
Firm Productivity                 435       57       88       20      606
AI Safety & Ethics                218      277       65       33      599
Market Structure                  180      170      123       24      502
Task Allocation                   213       64       72       33      387
Skill Acquisition                 170       61       61       17      309
Innovation Output                 203       27       43       18      292
Employment Level                  105       54      107       13      281
Fiscal & Macroeconomic            131       69       43       26      276
Consumer Welfare                  117       63       42       11      233
Firm Revenue                      153       48       26        3      230
Task Completion Time              173       31        8       12      225
Inequality Measures                44      122       49        6      221
Worker Satisfaction                89       65       22       12      188
Error Rate                         69       92       10        2      173
Regulatory Compliance              77       69       14        5      165
Automation Exposure                56       56       26       13      154
Training Effectiveness             94       21       13       19      149
Wages & Compensation               77       36       25        6      144
Team Performance                   86       17       27       10      141
Developer Productivity             95       17       14        6      133
Job Displacement                   12       80       20        1      113
Hiring & Recruitment               52        7        8        3       70
Creative Output                    31       18        8        3       61
Skill Obsolescence                  5       46        6        1       58
Social Protection                  27       16        8        2       53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
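
To compare outcomes of very different sizes, the counts in the matrix can be turned into per-direction shares. The snippet below is a minimal sketch, not part of the dashboard: the function name `direction_shares` is hypothetical, and the three rows are copied from the matrix for illustration. It divides by the sum of the four listed directions, which can differ slightly from the Total column (for AI Safety & Ethics the four counts sum to 593 while the listed total is 599).

```python
# Rows copied from the matrix above: (positive, negative, mixed, null).
rows = {
    "AI Safety & Ethics": (218, 277, 65, 33),
    "Job Displacement": (12, 80, 20, 1),
    "Firm Productivity": (435, 57, 88, 20),
}


def direction_shares(counts):
    """Return each direction's share of the four listed counts (hypothetical helper)."""
    total = sum(counts)
    return tuple(c / total for c in counts)


for outcome, counts in rows.items():
    pos, neg, mixed, null = direction_shares(counts)
    print(f"{outcome}: positive {pos:.0%}, negative {neg:.0%}, "
          f"mixed {mixed:.0%}, null {null:.0%}")
```

Read this way, Job Displacement claims are mostly negative (80 of 113), while Firm Productivity claims are mostly positive (435 of the 600 with a listed direction).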
Filter: Governance
Analysis through LLMbench demonstrates that the uncertainty users experience corresponds to measurable variation in model confidence across the generated text.
Empirical demonstration using LLMbench visualisations (token probability distributions, entropy curves) to link user-reported uncertainty to measurable changes in model confidence; specific datasets, models, or sample sizes not provided in the abstract.
high negative Prompt anxiety and the algorithmic politics of uncertainty model confidence (variation across generated text)
Users of large language models have to work with a measurably aleatory process: identical inputs produce different outputs and minor wording changes cascade through the probability field of the generated text.
Empirical analysis using the author's research instrument (LLMbench) for comparative close reading of LLM outputs; specific sample size or number of models/runs not reported in the abstract.
high negative Prompt anxiety and the algorithmic politics of uncertainty variation in model outputs / model output stability
Prompt engineering resembles the psychological and temporal structures that Walter Benjamin identified in gambling behaviour.
Conceptual/theoretical argument presented in the paper drawing an analogy between prompt engineering practices and Walter Benjamin's analysis of gambling; no empirical sample size reported in the abstract.
high negative Prompt anxiety and the algorithmic politics of uncertainty analogy between prompt engineering and gambling-related psychological/temporal s...
Major risk pathways for agentic AI include hallucinations, prompt-injection attacks, autonomous decision errors, model drift, dependency failures, and cyber-physical harms.
Enumerative risk analysis within the paper summarizing plausible threat vectors and failure modes; based on theoretical reasoning and analogies to known AI and cyber risks rather than new empirical incident data.
high negative Insurance of Agentic AI identification of principal risk pathways for agentic AI
These agentic-AI capabilities introduce novel exposures that do not fit neatly within traditional insurance categories such as cyber, professional liability, product liability, or directors and officers coverage.
Theoretical and market-structure analysis in the paper comparing agentic-AI exposures to existing insurance lines; illustrative examples and taxonomy rather than quantified empirical tests.
high negative Insurance of Agentic AI fit of agentic-AI exposures within traditional insurance categories
Agentic artificial intelligence (AI) systems are transforming the risk landscape by extending beyond information generation to autonomous planning, tool invocation, decision execution, and persistent modification of digital and physical environments.
The paper's conceptual argument and framing/abstract describing agentic AI capabilities and their implications; theoretical analysis rather than empirical measurement.
high negative Insurance of Agentic AI risk landscape / novel exposures from agentic AI
By framing AI risk exclusively in cybersecurity terms, the Order constructs an AI-risk universe in which provenance, labor, education, culture, meaning, and the commons are rendered 'not testable' within the policy regime.
Argumentative/theoretical claim backed by textual analysis and the counted absence of relevant terms in the EO.
high negative The Security Frame Is a Selection Kernel: Trump's AI Executi... scope of testable AI risks under the policy
The Executive Order frames AI risk overwhelmingly through cybersecurity language.
Textual analysis of the EO; supported by the paper's verified word-count analysis showing high frequency of security/cyber terms relative to other domains.
high negative The Security Frame Is a Selection Kernel: Trump's AI Executi... policy framing (AI risk framed as cybersecurity)
The COVID-19 pandemic reduced tourism’s GDP share by approximately 37%.
Fixed-effects panel estimation including a COVID-19 indicator on 33 countries (2017–2023); reported coefficient β = –0.455, p < 0.001 (interpreted as ~37% reduction in the dependent variable).
high negative Which dimensions of AI development shape tourism’s direct co... tourism’s direct GDP share
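
The step from β = –0.455 to "approximately 37%" is consistent with the usual conversion for an indicator variable in a log-linear model, where the implied percentage change is 100 × (exp(β) − 1). The sketch below assumes the dependent variable enters the regression in logs, which the entry does not state explicitly; the function name is hypothetical.

```python
import math


def log_coefficient_to_percent_change(beta: float) -> float:
    """Percent change implied by a dummy coefficient, assuming a log-transformed outcome."""
    return (math.exp(beta) - 1.0) * 100.0


beta_covid = -0.455  # coefficient reported in the evidence note above
pct_change = log_coefficient_to_percent_change(beta_covid)
print(f"Implied change in tourism's GDP share: {pct_change:.1f}%")
```

Under that assumption the result is a reduction of roughly 36.6%, which rounds to the 37% quoted in the claim.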
AI adoption intensifies existing sustainability challenges for the newsroom, as journalistic content and labour increasingly support AI systems without corresponding financial return.
Qualitative interview data and organisational analysis from Al-Masry Al-Youm indicating increased use of journalistic outputs for AI purposes and lack of matched revenue; sample size not reported in the excerpt.
high negative Platformisation, Power, and AI Governance in the Newsroom: I... financial sustainability / lack of corresponding financial return from AI-relate...
Reliance on global technology providers embeds forms of platform dependency within newsroom operations at Al-Masry Al-Youm.
Qualitative case study based on in-depth interviews with journalists, editors, and technical staff at Al-Masry Al-Youm (Egypt); analysis of newsroom practices and integration of third-party/global AI tools. Sample size not reported in the excerpt.
high negative Platformisation, Power, and AI Governance in the Newsroom: I... platform dependency within newsroom operations
An incentive sweep reveals Goodhart-style drift where measured performance becomes anti-correlated with true outcomes.
Simulation results in Medi-Sim showing that optimizing measured metrics leads to a decrease (anti-correlation) in true outcomes (Goodhart effect).
high negative Healthcare Mechanisms from Policy-as-Code Search under Strat... correlation between measured performance metric and true patient outcomes
Existing healthcare AI benchmarks hold this [strategic provider] response fixed and so cannot evaluate mechanisms by the equilibrium they produce.
Author statement/argument in the paper about limitations of existing benchmarks (conceptual claim; not an empirical experiment).
high negative Healthcare Mechanisms from Policy-as-Code Search under Strat... ability of benchmarks to evaluate mechanisms by equilibrium response
Research on platform governance remains fragmented and lacks an integrative perspective.
Conclusion drawn from the systematic literature review (644 publications) indicating fragmentation in the scholarly literature.
high negative Mission: Orchestration – Governance Mechanisms And Future Re... degree of fragmentation and lack of integrative perspectives in platform governa...
Participants in platform ecosystems cannot be governed through traditional command-and-control mechanisms.
Conceptual claim supported by the literature synthesized in the systematic literature review (644 publications).
high negative Mission: Orchestration – Governance Mechanisms And Future Re... suitability of traditional command-and-control governance for platform participa...
Research on AI-enabled decision-making and upper echelons theory (UET) has largely evolved in parallel (i.e., the two literatures are not well integrated).
Concept-centric literature review mapping management and IS literatures and identifying lack of integration (no quantitative meta-analysis or sample size reported).
high negative Hybrid Upper Echelons: A Theorizing Review On Ai In Executiv... degree of integration between AI-enabled decision-making and UET research stream...
Surveillance capitalism is not merely a technological transformation; it is a distinctive political-economic regime in which the relations among law, power, and knowledge are reorganized, producing new forms of inequality, asymmetric power relations, and digitally mediated forms of governance.
General concluding inference; synthesizing theoretical analysis; argument based on mapping between technology, law, and power (no empirical evidence in abstract).
high negative GÖZETİM KAPİTALİZMİNİN HUKUKSAL TEMELLERİ: FOUCAULTCU BİR AN... new forms of inequality, asymmetric power relations, and digital forms of governance...
From a Foucauldian perspective, algorithmic governmentality not only makes the individual a supervised subject but also transforms the individual into a data-object that produces behavioral surplus.
Conceptual analysis within a Foucauldian theoretical framework; literature-based argument; no empirical sample provided in abstract.
high negative GÖZETİM KAPİTALİZMİNİN HUKUKSAL TEMELLERİ: FOUCAULTCU BİR AN... the individual's subject-object transformation (becoming a data-object and the production of behavioral surplus...
When the commodification of personal data is assessed through Julie E. Cohen's conceptualization of the 'biopolitical public domain', personal information is structured by the legal dispositif as the raw material of economic production and behavioral prediction.
Theoretical assessment and conceptual framing; relies on the cited literature; no empirical testing reported.
high negative GÖZETİM KAPİTALİZMİNİN HUKUKSAL TEMELLERİ: FOUCAULTCU BİR AN... conversion of personal information into economic raw materials and, via legal regulation...
By institutionalizing the production, circulation, ownership, and commercialization of data, the legal system has become one of the constitutive elements of surveillance capitalism.
Argument based on legal-theoretical analysis; the study examines the legal dispositif through the perspectives of Julie E. Cohen and Foucault. No quantitative/legal-empirical dataset cited in abstract.
high negative GÖZETİM KAPİTALİZMİNİN HUKUKSAL TEMELLERİ: FOUCAULTCU BİR AN... the legal system's institutionalizing role regarding data (production, circulation, ownership,...
In this regime, behavioral data are continuously extracted, processed, and commodified through algorithmic infrastructures.
Conceptual/discourse analysis with reference to the literature (Zuboff); no empirical measurement or sample described in abstract.
high negative GÖZETİM KAPİTALİZMİNİN HUKUKSAL TEMELLERİ: FOUCAULTCU BİR AN... continuous extraction, processing, and commodification of behavioral data
Traditional review perspectives organized by method, data type, or application domain understate a deeper shift toward human–AI hybrid decision systems.
Critical assessment within the integrative conceptual review contrasting existing review axes with the proposed decision-system perspective (no empirical sample size).
high negative Human–AI hybrid finance: from AI tools to decision systems adequacy of existing review perspectives for capturing systemic change in financ...
High optimization pressure surfaces emergent adversarial behaviors like ground-truth exfiltration, highlighting critical deficits in both robustness and model alignment.
Experimental finding reported in the paper that adversarial behaviors (e.g., ground-truth exfiltration) emerged under strong optimization pressure in MAC runs.
high negative The Meta-Agent Challenge: Are Current Agents Capable of Auto... occurrence of adversarial behaviors / exfiltration; robustness and alignment def...
The design process exhibits high variance.
Empirical observation from MAC experiments indicating large variability in the agent-design process; no numeric variance reported in abstract.
high negative The Meta-Agent Challenge: Are Current Agents Capable of Auto... variance in the design process/outcomes
Leveraging this framework, we demonstrate that meta-agents rarely match human-engineered baseline policies.
Experimental results reported using the MAC benchmark (comparison of meta-agent performance to human-engineered baselines); exact number of trials/runs not provided in abstract.
high negative The Meta-Agent Challenge: Are Current Agents Capable of Auto... performance relative to human-engineered baseline policies
Current AI benchmarks evaluate agents on task execution within human-designed workflows and fundamentally fail to measure whether models can autonomously develop agent systems.
Conceptual argument stated in the paper motivating the new benchmark; no empirical comparison details provided in the abstract.
high negative The Meta-Agent Challenge: Are Current Agents Capable of Auto... ability to autonomously develop agent systems
A budget-neutral anti-gaming design reduces consumer harm by 0.025 relative to computable static rules.
ABM/RL simulation comparison reported in the paper (design variants evaluated across scenario/sweep runs and the firm-period panel).
high negative When Firms Learn to Game the Rules consumer harm
A budget-neutral anti-gaming design reduces conduct boundary mass by 0.032 relative to computable static rules.
ABM/RL simulation comparison reported in the paper (design variants evaluated across scenario/sweep runs and the firm-period panel).
high negative When Firms Learn to Game the Rules conduct boundary mass
Ordinary adaptive updates lower consumer harm (0.202 to 0.194).
ABM/RL simulation results reported in the paper; aggregated measures include a 2,880,000-row firm-period panel and multiple experimental runs.
high negative When Firms Learn to Game the Rules consumer harm
Across most risks, experts identified information, finance, and national security as the most vulnerable sectors.
Sector vulnerability ratings from the Delphi study (n=272); paper reports that information, finance, and national security sectors were most frequently judged vulnerable across risks.
high negative Prioritization of Risks from Artificial Intelligence: A Delp... sector vulnerability across listed risks
AI users and the general public were judged the most vulnerable to these risks.
Delphi panel rated actor vulnerability; results reported in paper indicate AI users and general public received highest vulnerability ratings (n=272).
high negative Prioritization of Risks from Artificial Intelligence: A Delp... actor vulnerability ratings
All 24 risks were judged as being more than 5% likely to cause catastrophic outcomes.
Aggregate Delphi judgments reported in paper: for each of the 24 risks, experts judged the probability of catastrophic outcomes to exceed 5% (n=272).
high negative Prioritization of Risks from Artificial Intelligence: A Delp... judged probability of catastrophic outcomes (>1M deaths or >$100B loss) for each...
In a scenario where pragmatic mitigations are implemented, experts still judged five risks as having a more than 10% probability of catastrophic outcomes: dangerous capabilities, weapons & cyberattacks, environmental harm, inequality & unemployment, and power centralization.
Delphi responses under an alternative (pragmatic mitigations) scenario from the same expert panel (n=272); paper lists five specific risks still judged >10% catastrophic probability.
high negative Prioritization of Risks from Artificial Intelligence: A Delp... judged probability of catastrophic outcomes (>1M deaths or >$100B loss) under pr...
In a business-as-usual scenario, experts judged 18 of 24 risks as having a more than 10% probability of catastrophic outcomes (e.g., more than 1 million deaths or more than USD 100B in financial loss) in the next 5 years (2025-2030).
Delphi elicitation under a business-as-usual (BAU) scenario from 272 experts; paper reports count (18 of 24) of risks exceeding a >10% judged probability of catastrophic outcomes defined as >1M deaths or >$100B loss.
high negative Prioritization of Risks from Artificial Intelligence: A Delp... judged probability of catastrophic outcomes (>1M deaths or >$100B loss) under BA...
Experts estimated the five most severe harms in the next 5 years were likely to come from dangerous capabilities, competitive dynamics, weapons & cyberattacks (including CBRNE), power centralization, and false information.
Delphi panel rankings/ratings of risk severity across 24 risks collected from 272 experts; paper reports these top five as the most severe for the 5-year horizon.
high negative Prioritization of Risks from Artificial Intelligence: A Delp... ranked severity of AI-related harms over next 5 years
We must prepare for autonomous generative adversaries: malware systems that propagate without human operators and are defined by the capacity to reason about targets, adapt to observations, and synthesize attack logic in real time.
Policy/recommendation informed by the paper's demonstration and analysis of AI-driven worm capabilities; forward-looking statement rather than an empirical measurement.
high negative AI Agents Enable Adaptive Computer Worms need for preparedness for autonomous generative adversaries (policy recommendati...
Our results demonstrate that self-sustaining AI-driven cyber-threats are no longer theoretical.
Empirical demonstration/proof-of-concept implementation and deployment on a diverse test network described in the paper.
high negative AI Agents Enable Adaptive Computer Worms existence/feasibility of self-sustaining AI-driven cyber-threats
Because the worm requires no commercial AI platform, centralized safety controls, such as service refusals or rate limiting, are structurally irrelevant.
Argument in paper supported by the worm's use of open-weight LLMs run on compromised hosts instead of commercial APIs — demonstrated in implementation.
high negative AI Agents Enable Adaptive Computer Worms effectiveness/relevance of centralized safety controls (service refusals, rate l...
This creates a destabilizing economic asymmetry between attackers and defenders.
Theoretical/economic reasoning in the paper: low (zero) marginal attacker cost vs. defender costs to patch and defend, motivated by the demonstrated worm design.
high negative AI Agents Enable Adaptive Computer Worms economic asymmetry between attackers and defenders
Since the worm is powered by stolen compute, the attacker's marginal cost per new infection is zero.
Argument based on the worm running LLMs on compromised machines (stolen compute), presented as an economic observation in the paper; supported by the implementation showing on-host LLM execution.
high negative AI Agents Enable Adaptive Computer Worms marginal cost per new infection
Deployed on a network of machines spanning Linux, Windows, and IoT devices, the worm propagated by exploiting common, real-world corporate network vulnerabilities.
Empirical deployment/demonstration on a heterogeneous network (Linux, Windows, IoT) reported in the paper; propagation achieved via exploitation of common corporate network vulnerabilities.
high negative AI Agents Enable Adaptive Computer Worms propagation across heterogeneous devices by exploiting common vulnerabilities
The worm parasitically uses compromised machines to run open-weight large language models (LLMs) to sustain its reasoning, or extend its reach for further attacks.
Implementation described where compromised hosts execute open-weight LLMs (i.e., LLMs run on stolen compute on infected machines) as part of the worm's attack pipeline.
high negative AI Agents Enable Adaptive Computer Worms use of compromised hosts to run LLMs
Artificial intelligence (AI) agents enable a fundamentally new threat: a worm that generates tailored attack strategies to each target it encounters.
Paper reports a proof-of-concept AI-driven worm that reasons about targets and synthesizes attack logic in real time (implementation and demonstration described).
high negative AI Agents Enable Adaptive Computer Worms ability of worm to generate tailored attack strategies
This phenomenon is the self-undermining property of unilateral optimization.
Terminology/label introduced by the authors to describe the preceding conceptual phenomenon; no empirical validation provided in the excerpt.
high negative Solipsistic Superintelligence is Unlikely to be Cooperative conceptual identification of unilateral optimization leading to self-undermining...
Deploying AI systems induces endogenous non-stationarity, resulting in a train-test-deploy gap where historical distributions diverge from the deployment context.
Conceptual claim offered in the paper about deployment feedback effects; presented as an argument rather than supported by reported empirical measurement.
high negative Solipsistic Superintelligence is Unlikely to be Cooperative distributional shift (train-test-deploy gap) induced by AI deployment
Superintelligence, an extremely capable task solver, born out of such a solipsistic approach to AI design, is unlikely to be cooperative.
Theoretical/argumentative claim in the paper linking design assumptions to likely cooperative behavior; no empirical evidence or formal model reported in the excerpt.
high negative Solipsistic Superintelligence is Unlikely to be Cooperative cooperativeness of superintelligent AI
The dominant paradigm in AI research focuses on developing powerful agents that treat the world as an exogenous and stationary source of feedback.
Paper's critique/characterization of current research paradigms; presented as an observed trend without empirical backing.
high negative Solipsistic Superintelligence is Unlikely to be Cooperative research paradigm focus (solipsistic/stationary world assumption)
Even creating a new brain‑privacy right would invite weak protection and insufficient incentives for brain‑data supply.
Argumentative claim in the paper based on normative analysis of legal incentives and data-supply dynamics (no empirical data or quantified modeling provided).
high negative Empowerment or behavioral regulation? governing brain–comput... strength of legal protection and incentives for supplying brain data
Privacy rights under the empowerment model cannot fully protect brain privacy.
Theoretical/legal critique in the paper contrasting empowerment-style privacy rights with the nature of brain data (argumentative, no empirical validation).
high negative Empowerment or behavioral regulation? governing brain–comput... effectiveness of empowerment-model privacy rights in protecting brain privacy
Much of the literature on AI systems has focused on aligning users' goals with the agents that act on their behalf, and this work may overlook the need to establish a new normative baseline.
Characterization of existing literature (literature-review/position claim) presented in the paper; no systematic review or quantification provided in the excerpt.
high negative Who Does Your AI Work For? Designing Conversational Agents a... focus of AI literature (alignment) versus attention to normative baseline