The Commonplace

Evidence (6869 claims)

Adoption (8570 claims)
Productivity (7631 claims)
Governance (6869 claims)
Human-AI Collaboration (6491 claims)
Org Design (4175 claims)
Innovation (4114 claims)
Labor Markets (3566 claims)
Skills & Training (2966 claims)
Inequality (2066 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
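
As a rough illustration of how the matrix can be read, the sketch below computes each direction's share of a row's reported Total, using a handful of rows copied from the table above. The function names (`direction_shares`, `unclassified`) are hypothetical and are not part of the site. For some rows the Total is larger than the sum of the four direction columns; the page does not explain the difference, so the sketch only reports it as a remainder.

```python
# Rows copied from the Evidence Matrix:
# (outcome, positive, negative, mixed, null, total)
ROWS = [
    ("Governance & Regulation", 826, 400, 191, 122, 1563),
    ("AI Safety & Ethics", 218, 277, 65, 33, 599),
    ("Firm Productivity", 435, 57, 88, 20, 606),
    ("Job Displacement", 12, 80, 20, 1, 113),
    ("Skill Obsolescence", 5, 46, 6, 1, 58),
]


def direction_shares(row):
    """Return each direction's share of the row's reported Total."""
    name, pos, neg, mixed, null, total = row
    return {
        "outcome": name,
        "positive": pos / total,
        "negative": neg / total,
        "mixed": mixed / total,
        "null": null / total,
    }


def unclassified(row):
    """Claims counted in Total but not in any of the four direction columns."""
    _, pos, neg, mixed, null, total = row
    return total - (pos + neg + mixed + null)


if __name__ == "__main__":
    for row in ROWS:
        shares = direction_shares(row)
        print(
            f"{shares['outcome']}: "
            f"{shares['negative']:.1%} negative, "
            f"{unclassified(row)} not in any direction column"
        )
```

Dividing by the reported Total rather than by the sum of the four columns keeps the shares comparable across rows, at the cost of shares that may not add up to 100% where a remainder exists.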
Filter: Governance
Decoys contribute to the network-making power that is at the heart of the Project's extraction and exploitation.
Theoretical synthesis and interpretive argument grounded in literature across relevant fields; the paper posits a mechanism (decoys → strengthened networks → increased extraction/exploitation) but provides no empirical quantification.
high negative Reckoning with the Political Economy of AI: Avoiding Decoys ... network-making power and related extraction/exploitation
Decoys often create the illusion of accountability while masking the emerging political economies that the Project of AI has set into motion.
Conceptual critique supported by literature from communication, STS, and economic sociology; argument that particular practices/instruments function rhetorically to appear accountable while obscuring material political economy. No empirical sample or quantified measures reported.
high negative Reckoning with the Political Economy of AI: Avoiding Decoys ... perceived accountability versus actual visibility of political economy
As AI funders and developers expand their access to resources and configure sociotechnical conditions, they benefit from decoys that animate scholars, critics, policymakers, journalists, and the public into co-constructing industry-empowering AI futures.
Theoretical analysis and literature review; paper identifies and interprets how discursive and institutional phenomena (termed 'decoys') function to produce consent and co-construction of industry-aligned futures. No empirical sample size provided.
high negative Reckoning with the Political Economy of AI: Avoiding Decoys ... co-construction of industry-empowering AI futures by multiple societal actors
Those who fund and develop AI systems operate through and seek to sustain networks of power and wealth.
Conceptual argument and literature synthesis drawing on communication studies, science & technology studies (STS), and economic sociology; no empirical sample reported.
high negative Reckoning with the Political Economy of AI: Avoiding Decoys ... operation and maintenance of networks of power and wealth by AI funders/develope...
Current attack policies do not saturate LinuxArena (human-crafted attacks evade monitors at substantially higher rates than model-generated attacks, indicating headroom for attackers).
Empirical observation comparing human-crafted attacks (LaStraj) and elicited model-generated attacks; authors interpret higher human evasion rates as evidence that current automated attack policies have not saturated the challenge posed by LinuxArena.
high negative LinuxArena: A Control Setting for AI Agents in Live Producti... relative performance gap between human-crafted and model-generated attacks (impl...
LaStraj is a dataset of human-crafted attack trajectories that evade monitors at substantially higher rates than any model-generated attacks we elicited.
Authors release LaStraj and report empirical comparisons showing human-crafted trajectories evade monitors at higher rates than the model-generated attacks they tested (exact evasion rates and sample sizes not provided in the excerpt).
high negative LinuxArena: A Control Setting for AI Agents in Live Producti... monitor evasion rate of human-crafted attack trajectories versus model-generated...
Against a GPT-5-nano trusted monitor at a 1% step-wise false positive rate, Claude Opus 4.6 achieves roughly a 23% undetected sabotage success rate.
Empirical sabotage evaluation reported by the authors: a trusted monitor (GPT-5-nano) operates at a specified step-wise false positive rate, and the authors report the undetected success rate of the attacking model (Claude Opus 4.6). (Sample size / number of evaluated runs not provided in the excerpt.)
high negative LinuxArena: A Control Setting for AI Agents in Live Producti... undetected sabotage success rate (attacker success despite monitoring)
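
To make "step-wise false positive rate" and "undetected sabotage success rate" concrete, here is a minimal sketch. It rests on assumptions that are not taken from the LinuxArena paper: the monitor assigns a numeric suspicion score to every step, the flagging threshold is calibrated on scores from honest runs, and an attack counts as undetected when none of its steps scores above that threshold. All function names and data are hypothetical.

```python
import math


def threshold_at_fpr(honest_scores, fpr):
    """Pick a threshold so that at most `fpr` of honest steps score strictly above it."""
    ranked = sorted(honest_scores)
    n = len(ranked)
    # Number of honest steps we allow to be flagged (small epsilon guards float rounding).
    allowed = min(math.floor(fpr * n + 1e-9), n - 1)
    return ranked[n - allowed - 1]


def undetected_success_rate(attacks, threshold):
    """Fraction of attack trajectories that succeed with no step above the threshold.

    `attacks` is a list of (step_scores, succeeded) pairs.
    """
    hits = sum(
        1
        for step_scores, succeeded in attacks
        if succeeded and max(step_scores) <= threshold
    )
    return hits / len(attacks)


# Hypothetical data: 100 honest steps with suspicion scores 0..99.
honest = list(range(100))
threshold = threshold_at_fpr(honest, fpr=0.01)

# Hypothetical attack trajectories: (per-step suspicion scores, sabotage succeeded?)
attacks = [
    ([10, 50, 97], True),   # succeeds, never flagged
    ([10, 99], True),       # succeeds, but one step is flagged
    ([5, 20], False),       # never flagged, but the sabotage fails
    ([98, 30], True),       # succeeds, peak score sits exactly at the threshold
]
rate = undetected_success_rate(attacks, threshold)
```

Because the false positive rate is defined per step, a longer trajectory gives the monitor more chances to flag something; whether the paper aggregates scores this way is not stated in the excerpt.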
Prior research has focused mainly on functional or behavioral alignment rather than moral alignment.
Asserted as a characterization of the literature in the paper (literature-review / conceptual claim; no empirical sampling or quantitative synthesis reported in the supplied text).
high negative Smart But Not Moral? Moral Alignment In Human-AI Decision-Ma... focus/themes of prior AI alignment research
AI can exacerbate occupational polarization, digital exclusion, and discriminatory outcomes when models are trained on biased data or deployed without transparency and accountability.
Thematic synthesis across included studies identifying mechanisms (biased training data, lack of transparency/accountability) linked to negative distributional outcomes (occupational polarization, digital exclusion, discrimination).
high negative Artificial Intelligence in the Labor Market: Evidence on Wor... distributional and equity outcomes (polarization, exclusion, discrimination)
Even explicitly aligned agents exhibit intrinsic biases toward certain ethical frameworks, consistent with known left-leaning tendencies in large language models.
Empirical observation in the alignment-conditioned agents' choices and reasoning frameworks in the triage experiments; authors relate these observations to prior literature on LLM political/ideological tendencies.
high negative Beyond Arrow's Impossibility: Fairness as an Emergent Proper... intrinsic alignment bias (preference for certain ethical frameworks / ideologica...
While achieving financial autonomy, firms are also exposed to new constraints as they shift their reliance to third-party software, technological infrastructures, and opaque algorithms (Gaviyau & Godi, 2025; Suhrab et al., 2026).
Stated with citations to Gaviyau & Godi (2025) and Suhrab et al. (2026); presented as an observed/paraphrased risk or unintended consequence in the paper. No empirical sample details in the excerpt.
high negative Re-Evaluation of Resource Dependence in AI Enabled SME Finan... increased reliance/dependency on third-party technology and opaque algorithms (n...
SMEs face various financial constraints and mostly rely heavily on traditional financial institutions for their survival (Kadzima et al., 2025).
Statement supported by citation to Kadzima et al. (2025); presented as a literature-supported empirical generalization in the paper's background/introduction. No sample size or empirical details given in the excerpt.
high negative Re-Evaluation of Resource Dependence in AI Enabled SME Finan... financial constraints / reliance on traditional financial institutions
Large language models remain confined to linguistic simulation rather than grounded understanding.
Conceptual assertion in the paper arguing limits of current models; no empirical tests or measurements reported.
high negative Governing Reflective Human-AI Collaboration: A Framework for... grounded_understanding (absence thereof)
Fluency is not reliability: without structures that stabilise both human and model reasoning, AI cannot be trusted or governed where it matters most.
Central thesis/claim of the paper; normative argument synthesising the paper's observations and proposals rather than an empirically tested finding provided here.
high negative The Missing Knowledge Layer in AI: A Framework for Stable Hu... trustworthiness/governability of AI in high-stakes contexts
Humans often mistake fluency for reliability: when a model responds smoothly, users tend to trust it, even when both model and user are drifting together.
Behavioral/psychological assertion in the paper referencing human interaction patterns with fluent outputs; no experimental data or sample size reported in this paper excerpt.
high negative The Missing Knowledge Layer in AI: A Framework for Stable Hu... user trust in model outputs
LLMs produce fluent outputs even when their internal reasoning has drifted; a confident answer can conceal uncertainty, speculation, or inconsistency, and small changes in phrasing can lead to different conclusions.
Conceptual/observational claim presented in the paper; no original empirical test or sample size reported here.
high negative The Missing Knowledge Layer in AI: A Framework for Stable Hu... reliability/consistency of model outputs (decision quality)
Stronger reasoning capabilities do not prevent LLMs from defecting in single-shot social dilemmas (i.e., models defect with or without reasoning enabled).
Authors' experiments that explicitly compared model behavior with reasoning enabled vs disabled in single-shot social dilemmas; details not provided in the excerpt.
high negative CoopEval: Benchmarking Cooperation-Sustaining Mechanisms and... cooperation/defection rates conditional on reasoning capability being enabled
Repetition-induced cooperation deteriorates drastically when co-players vary.
Authors' experimental observation comparing repeated-game cooperation under fixed vs varying co-players in their study; no quantitative metrics or sample sizes provided in the excerpt.
high negative CoopEval: Benchmarking Cooperation-Sustaining Mechanisms and... cooperation level under repeated interactions when co-players vary
Our experiments show that recent models, with or without reasoning enabled, consistently defect in single-shot social dilemmas.
Authors' experimental results comparing recent LLMs in single-shot social dilemma games, with reasoning enabled vs disabled; specific models, number of games, and sample sizes are not provided in the excerpt.
high negative CoopEval: Benchmarking Cooperation-Sustaining Mechanisms and... rate of defection (vs cooperation) in single-shot social dilemmas
Recent works report that LLMs with stronger reasoning capabilities behave less cooperatively in mixed-motive games such as the prisoner's dilemma and public goods settings.
Statement referencing prior literature (recent works) summarized in the paper's introduction/background; no specific dataset or sample size given in the excerpt.
high negative CoopEval: Benchmarking Cooperation-Sustaining Mechanisms and... cooperative behavior in mixed-motive games (e.g., prisoner's dilemma, public goo...
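
For readers unfamiliar with why defection is the expected outcome in a single-shot prisoner's dilemma, the sketch below checks it directly on a payoff matrix. The payoff values (5, 3, 1, 0) are the conventional textbook ones and are an assumption here; they are not taken from CoopEval, and the function names are hypothetical.

```python
# PAYOFF[(my_move, their_move)] = my payoff; "C" = cooperate, "D" = defect.
# Conventional textbook values, assumed for illustration only.
PAYOFF = {
    ("C", "C"): 3,  # mutual cooperation
    ("C", "D"): 0,  # I cooperate, they defect
    ("D", "C"): 5,  # I defect, they cooperate
    ("D", "D"): 1,  # mutual defection
}
MOVES = ("C", "D")


def best_response(their_move):
    """Return the move that maximizes my payoff against a fixed opponent move."""
    return max(MOVES, key=lambda mine: PAYOFF[(mine, their_move)])


def is_strictly_dominant(move):
    """True if `move` pays strictly more than the alternative against every opponent move."""
    other = "D" if move == "C" else "C"
    return all(PAYOFF[(move, t)] > PAYOFF[(other, t)] for t in MOVES)


if __name__ == "__main__":
    for their_move in MOVES:
        print(f"Best response to {their_move}: {best_response(their_move)}")
    print("Defect strictly dominant:", is_strictly_dominant("D"))
```

Defecting is the best response to either move, yet mutual cooperation pays each player more than mutual defection. That gap is what makes the game a dilemma, and it is why mechanisms such as repetition are studied as ways to sustain cooperation.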
Most Sub-Saharan African states still lack the institutional frameworks needed to turn these innovations into sustainable development.
Comparative policy analysis stated in the paper; no quantitative sample size or formal survey data reported in the excerpt.
high negative A Framework for Sovereign AI Governance and Economic Growth ... presence/absence of institutional frameworks enabling AI-driven sustainable deve...
Existing competition-aware CFL and incentive-design approaches reward organizations based on marginal training contributions but fail to account for the costs of strengthening competitors.
Literature critique and comparison in the paper; theoretical discussion rather than a reported empirical trial or sample.
high negative Cooperate to Compete: Strategic Data Generation and Incentiv... adequacy_of_incentive_design (accounting for competitor-strengthening costs)
Non-IID data amplifies this coopetition dilemma by producing asymmetric learning gains across organizations and undermining sustained participation.
Conceptual claim supported by the paper's theoretical modeling and later experiments (described as 'non-IID data' experiments); no numeric sample size given in abstract.
high negative Cooperate to Compete: Strategic Data Generation and Incentiv... asymmetry_in_learning_gains / sustained_participation
Cross-silo federated learning (CFL) deployments in data-sensitive domains are inherently coopetitive: organizations cooperate during model training while competing in downstream markets, so training contributions can inadvertently strengthen rivals.
Conceptual argument and literature motivation presented in the paper's introduction; no empirical sample size reported.
high negative Cooperate to Compete: Strategic Data Generation and Incentiv... strengthening_of_rivals / participation incentives
This mismatch makes it difficult to predict post-deployment success and obscures competitive effects such as early-adoption advantages and market dominance.
Argument in paper linking limitations of current evaluation methods to inability to predict deployment outcomes; conceptual claim without empirical demonstration in the abstract.
high negative Evaluation of Agents under Simulated AI Marketplace Dynamics predictability of post-deployment success and visibility of competitive effects ...
Evaluation is still largely conducted on static benchmarks with accuracy-focused measures that assume systems operate in isolation.
Statement in paper critiquing prevailing evaluation practice; presented as a general observation without cited systematic review or quantitative evidence in the abstract.
high negative Evaluation of Agents under Simulated AI Marketplace Dynamics evaluation practice (use of static accuracy-focused benchmarks)
Infrastructure constraints, particularly in developing countries, limit AI adoption in auditing.
Thematic analysis of reviewed articles noting infrastructure limitations (e.g., ICT infrastructure) in developing-country contexts.
high negative Implementing Artificial Intelligence in Auditing: A Systemat... infrastructure constraints affecting AI adoption
Limitations in auditor competencies (skills and training) hinder effective AI adoption in auditing.
Thematic findings across the sample of articles report auditor competency gaps as a challenge to AI implementation.
high negative Implementing Artificial Intelligence in Auditing: A Systemat... auditor competencies / skill gaps
Ethical and data privacy concerns are persistent challenges to AI implementation in auditing.
Recurring theme in the reviewed literature identified via thematic analysis; papers cite ethics and privacy as obstacles.
high negative Implementing Artificial Intelligence in Auditing: A Systemat... ethical and data privacy concerns as barriers
Several challenges persist for AI adoption in auditing, including high technology investment costs.
Thematic analysis of barriers reported across the 15 articles highlighting cost as a recurrent challenge.
high negative Implementing Artificial Intelligence in Auditing: A Systemat... barrier: technology investment costs to AI adoption
Early iterations suffered severe execution decay.
Reported observation from the longitudinal study describing early-phase performance problems (qualitative; no quantitative metric in the excerpt).
high negative OOM-RL: Out-of-Money Reinforcement Learning Market-Driven Al... execution decay (degradation of execution/performance in early iterations)
Execution-based environments suffer from adversarial 'Test Evasion' by unconstrained agents.
Stated assertion in the paper's motivation/abstract; presented as a limitation of execution-based evaluation (no empirical sample size or experiment details provided in the excerpt).
high negative OOM-RL: Out-of-Money Reinforcement Learning Market-Driven Al... test evasion (agents adversarially bypassing execution-based tests)
Current paradigms, such as Reinforcement Learning from Human Feedback (RLHF) and AI Feedback (RLAIF), frequently induce model sycophancy.
Stated assertion in the paper's motivation/abstract; presented as a limitation of existing alignment paradigms (no empirical sample size or experiment details provided in the excerpt).
high negative OOM-RL: Out-of-Money Reinforcement Learning Market-Driven Al... model sycophancy (agents producing sycophantic behaviour)
Agent frameworks infer authority conversationally, reconstruct accountability from logs, and produce silent errors: incorrect determinations that execute without any human review signal.
Statement/argument in paper describing failure modes of general-purpose agent frameworks; no empirical sample or experiment reported for this claim in the excerpt.
high negative Governed Reasoning for Institutional AI occurrence of silent errors (incorrect determinations executing without human-re...
Findings suggest that previous results relying on attitudinal outcomes may generalize poorly to behaviour, and therefore risk substantially mischaracterizing the real-world behavioural impact of AI persuasion.
Interpretation/conclusion based on the paper's empirical results: discrepancy between attitudinal effects and behavioural effects observed in the preregistered experiments.
high negative Artificial intelligence can persuade people to take politica... generalizability of attitudinal findings to real-world behavior
The paper presents a foreign state actor threat model for enterprise identity governance, establishing that Silk Typhoon, Salt Typhoon, Volt Typhoon, and North Korean AI-enhanced identity fraud operations have already operationalized AI identity vulnerabilities as active attack vectors.
Paper claims to provide a threat model and asserts that these named actors have operationalized AI identity vulnerabilities; the grounding is implied to be threat intelligence and incident analysis, though it is not detailed in the excerpt.
high negative Who Governs the Machine? A Machine Identity Governance Taxon... operationalization of AI identity vulnerabilities by named foreign actor groups
Nation-state actors including Silk Typhoon and Salt Typhoon have operationalized ungoverned machine credentials as primary espionage vectors against critical infrastructure.
Asserted in paper and described as grounded in threat intelligence; no specific threats, incidents, or data described in the excerpt.
high negative Who Governs the Machine? A Machine Identity Governance Taxon... use of ungoverned machine credentials by nation-state actors for espionage again...
A single ungoverned automated agent produced $5.4-10 billion in losses in the 2024 CrowdStrike outage.
Statement in paper attributing a $5.4-10B loss to an ungoverned automated agent during the 2024 CrowdStrike outage; no citation or method shown in excerpt.
high negative Who Governs the Machine? A Machine Identity Governance Taxon... financial losses caused by an ungoverned automated agent in the 2024 CrowdStrike...
No integrated framework exists to govern machine identities (AI agents, service accounts, API tokens, automated workflows).
Asserted in paper as a gap in existing governance frameworks; no empirical test or survey reported in the excerpt.
high negative Who Governs the Machine? A Machine Identity Governance Taxon... existence of an integrated governance framework for machine identities
Automated agents, service accounts, API tokens, and automated workflows now outnumber human identities in enterprise environments by ratios exceeding 80 to 1.
Statement in paper (asserted prevalence); no sample size or data source provided in the excerpt.
high negative Who Governs the Machine? A Machine Identity Governance Taxon... number of machine identities relative to human identities in enterprise environm...
Major methodological risks include overfitting, regime instability, interpretability deficits, and institutional dependence.
Critical evaluation within the review identifying key methodological risks across the surveyed streams (conceptual assessment; no empirical estimate provided).
high negative Artificial Intelligence in Financial Decision-Making presence of methodological risks in AI applications to finance
The literature remains fragmented across at least three partially connected domains: financial time-series forecasting, portfolio construction, and firm-level sustainability analysis.
Author's characterization of the existing literature in the review (synthesis of published work; no single empirical sample; survey-based statement).
high negative Artificial Intelligence in Financial Decision-Making degree of fragmentation/disciplinary separation in the literature
Persistent challenges to AI implementation include resistance to change, data quality limitations, and concerns regarding transparency and algorithmic bias.
Recurring barriers identified across the 27 included studies, summarized in the review's findings.
high negative Artificial Intelligence for Business Decision-Making in Lati... implementation barriers (resistance, data quality, transparency, bias)
AI infrastructure owners may command more wealth and capability than most governments, threatening the future viability or authority of the nation-state.
Futuristic projection based on the paper's modeling and synthesis of wealth/capability concentration under AI; no empirical measures or comparative data versus governments provided in the excerpt.
high negative A Framework for Understanding the Convergence of Geopolitica... relative wealth and capability of AI infrastructure owners vs. governments; impa...
Universal Basic Income (UBI), evaluated through an incentive-structure lens, will default to a pacification mechanism rather than a genuine solution in the absence of a revolutionary threat of the kind that historically forced redistribution.
Normative and theoretical analysis of incentive structures and historical mechanisms of redistribution; the excerpt presents this as an argument rather than reporting empirical trials or quantified outcomes.
high negative A Framework for Understanding the Convergence of Geopolitica... policy effect of UBI (pacification vs. genuine redistribution/solution)
Unlike previous feudal orders, this one may prove uniquely resistant to revolution because the mechanisms of enforcement (autonomous weapons, AI surveillance, algorithmic propaganda) do not require human cooperation and therefore cannot be undermined by human dissent.
Logical and theoretical claim based on characteristics of AI-enabled enforcement technologies; presented as an argument rather than an empirically tested finding in the excerpt.
high negative A Framework for Understanding the Convergence of Geopolitica... resistance of a future authoritarian/feudal order to revolution due to autonomou...
Under this emerging order, the vast majority of humanity will lose their political leverage.
Theoretical and historical argument linking concentration of infrastructure control to political disempowerment; no empirical metrics or sample size provided in the excerpt.
high negative A Framework for Understanding the Convergence of Geopolitica... political leverage of the majority
Under this emerging order, the vast majority of humanity will lose their labor value.
Claim made via theoretical argument about automation and AI replacing labor value; no quantitative empirical evidence or sample detailed in the excerpt.
high negative A Framework for Understanding the Convergence of Geopolitica... labor value of the majority (economic value of human labor)
This structural transformation could stabilize into a neo-feudal equilibrium in which a vanishingly small class of infrastructure owners wields power comparable to pre-Enlightenment monarchs.
Futuristic projection and normative/historical analogy based on conceptual modeling of class structure under AGI; the excerpt gives no empirical data or formal model outputs.
high negative A Framework for Understanding the Convergence of Geopolitica... emergence of a neo-feudal equilibrium with extreme concentration of political/ec...
The convergence of geopolitical fragmentation (democratic decline) and AI-driven economic concentration is producing a structural transformation unprecedented in human history.
Theoretical synthesis and historical comparison; the paper presents this as an argument based on conceptual modeling and historical analogy; no specific empirical test or sample noted in the excerpt.
high negative A Framework for Understanding the Convergence of Geopolitica... structural transformation of political-economic order