The Commonplace

Evidence (4114 claims)

Adoption (8570 claims)
Productivity (7631 claims)
Governance (6869 claims)
Human-AI Collaboration (6491 claims)
Org Design (4175 claims)
Innovation (4114 claims)
Labor Markets (3566 claims)
Skills & Training (2966 claims)
Inequality (2066 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
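The rows above are whitespace-separated, with multi-word outcome names followed by the counts. A minimal Python sketch of how such a row could be read and turned into per-direction shares follows. The function names `parse_row` and `direction_shares` are hypothetical and not part of the dashboard. Rows with fewer than five numbers are returned without shares, because the listing does not show which column is empty. In several rows the Total exceeds the sum of the four direction counts, so the shares need not add up to 1.

```python
COLUMNS = ["Positive", "Negative", "Mixed", "Null", "Total"]


def parse_row(line):
    """Split a flattened matrix row into (outcome name, list of counts)."""
    tokens = line.split()
    # Count the trailing integer tokens; everything before them is the name.
    n = 0
    while n < len(tokens) and tokens[-(n + 1)].isdigit():
        n += 1
    outcome = " ".join(tokens[:len(tokens) - n])
    counts = [int(t) for t in tokens[len(tokens) - n:]]
    return outcome, counts


def direction_shares(line):
    """Return (outcome, shares of Total per direction), or None for short rows."""
    outcome, counts = parse_row(line)
    if len(counts) != len(COLUMNS):
        # Incomplete row: the column placement of the counts is ambiguous.
        return outcome, None
    row = dict(zip(COLUMNS, counts))
    total = row["Total"]
    return outcome, {name: row[name] / total for name in COLUMNS[:-1]}


if __name__ == "__main__":
    print(direction_shares("Job Displacement 12 80 20 1 113"))
    print(direction_shares("Worker Turnover 11 12 3 26"))
```

Dividing by the reported Total rather than by the sum of the four counts keeps the shares comparable with the totals shown in the matrix.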
Filter: Innovation
Treating lookup as memory is a category error with provable consequences for agent capability.
Theoretical/formal argument asserted in the paper (formalization and proofs claimed); no empirical sample reported in the abstract.
Current agentic memory systems (vector stores, retrieval-augmented generation, scratchpads, and context-window management) do not implement memory: they implement lookup.
Conceptual/analytic claim stated in paper; supported by comparison of existing agent memory mechanisms (vector stores, RAG, scratchpads, context-window management) to the paper's definition of 'memory'. No empirical sample reported.
high negative Contextual Agentic Memory is a Memo, Not True Memory whether systems implement memory vs. lookup
Algorithmic collusion is a new form of market failure arising from the agentic economy.
Theoretical claim and analysis of market failure mechanisms; no empirical antitrust cases or simulation evidence included in the provided text.
high negative DIGITAL AGENTS AS FUNCTIONAL EQUIVALENTS OF ECONOMIC ACTORS:... existence/emergence of algorithmic collusion as market failure
Boundary conditions limit UCF applicability in contexts requiring human accountability or embodied knowledge.
Author-stated caveat in the abstract identifying contexts (accountability, embodied knowledge) where the framework may not apply; theoretical reasoning, no empirical tests.
high negative Beyond markets and hierarchies: How GenAI enables unbounded ... limits to applicability of UCF where human accountability or embodied knowledge ...
Existing frameworks (Transaction Cost Economics and Electronic Markets Hypothesis) cannot explain emerging organizational phenomena like GitHub Copilot’s recursive value creation or AI-mediated expert networks.
Conceptual critique in the position paper using illustrative examples (GitHub Copilot, AI-mediated expert networks); no empirical testing or sample provided.
high negative Beyond markets and hierarchies: How GenAI enables unbounded ... theoretical explanatory adequacy of extant organizational frameworks
LLM-generated portfolios lagged behind AI-optimized benchmarks, which reached Sharpe ratios of up to 1.361.
Backtest comparison in which AI-optimized benchmark strategies achieved higher Sharpe ratios than the LLM-generated portfolios; the reported maximum for the benchmarks is 1.361.
high negative Few-Shot Portfolio Optimization: Can Large Language Models O... Sharpe ratio (risk-adjusted return) of portfolios
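For readers unfamiliar with the metric in this claim, here is a minimal Python sketch of a Sharpe ratio computed from a series of periodic returns. The return series, the zero risk-free rate, and the function name `sharpe_ratio` are illustrative assumptions; the listing does not say how the paper annualizes returns or which risk-free rate it uses.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=1):
    """Mean excess return divided by its sample standard deviation.

    Multiplying by sqrt(periods_per_year) is the usual annualization
    convention; whether the paper applies it is an assumption here.
    """
    excess = [r - risk_free for r in returns]
    return (statistics.mean(excess) / statistics.stdev(excess)
            * math.sqrt(periods_per_year))


# Hypothetical per-period returns, not data from the paper.
toy_returns = [0.01, 0.02, -0.01, 0.03]

if __name__ == "__main__":
    print(round(sharpe_ratio(toy_returns), 4))
```

A higher value means more return per unit of volatility, which is the sense in which the benchmarks are said to outperform the LLM-generated portfolios.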
In resource-dependent regional economies, AI adoption can transform seasonal industries into continuous economic infrastructure and replace intermediate coordination roles and traditional employment structures.
Illustrative case analysis used in the paper to show how the framework applies to resource-dependent regions; described as an illustrative argument rather than an empirically validated causal estimate in the provided text.
high negative Structural Dissolution: How Artificial Intelligence Dismantl... transformation of seasonal industries to continuous infrastructure and replaceme...
Targeted disruption simulations based on intrinsic technological capability cause a more pronounced decline in the knowledge network than targeted attacks based on topological (structural) baselines.
Simulation experiments on collaboration/knowledge networks constructed from the 282,778-patent dataset comparing network decline under removal strategies: (a) based on intrinsic technological capability vs (b) based on topological centrality baselines.
high negative Technological capability and innovation network resilience: ... decline in knowledge network (network resilience/connectivity under targeted nod...
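As a rough illustration of the comparison described in this claim, the Python sketch below removes the top-ranked node under two different rankings and measures the largest connected component that survives. The toy graph, the capability scores, and the function names are all hypothetical, and the graph is deliberately constructed so that the two strategies diverge. The paper's networks are built from 282,778 patents, and its composite capability measure is not reproduced here.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def largest_component(adj, removed):
    """Size of the largest connected component after deleting `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over surviving nodes
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def attack(adj, score, k):
    """Remove the k highest-scoring nodes (ties broken by name)."""
    targets = sorted(adj, key=lambda n: (-score[n], n))[:k]
    return largest_component(adj, targets)


# Toy network: two hubs with three leaves each, joined by a bridge node "b".
edges = [("h1", "x1"), ("h1", "x2"), ("h1", "x3"),
         ("h2", "y1"), ("h2", "y2"), ("h2", "y3"),
         ("h1", "b"), ("b", "h2")]
adj = build_adjacency(edges)

degree = {n: len(adj[n]) for n in adj}   # topological baseline
capability = {n: 0.1 for n in adj}       # hypothetical scores
capability.update({"h1": 0.5, "h2": 0.5, "b": 0.9})

if __name__ == "__main__":
    print("degree-based removal:", attack(adj, degree, 1))
    print("capability-based removal:", attack(adj, capability, 1))
```

In this toy case the bridge node has low degree but a high assumed capability score, so removing it fragments the network more than removing a hub does.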
Some innovators with substantial technological value are not located at the structural center of the collaboration/knowledge network, indicating network position alone may not fully capture technological importance.
Empirical comparison between composite technological capability scores and structural centrality measures across the constructed networks derived from 282,778 Chinese AI patents; reported disconnect between high technological value and topological centrality.
high negative Technological capability and innovation network resilience: ... correspondence between technological value and network centrality
Left unguided, the evolution of autonomous software could infiltrate critical market infrastructure.
Risk claim articulated in abstract and scenario narratives; conceptual reasoning without empirical test.
high negative Digital Darwinism: steering the evolution of artificial life... penetration/infiltration of critical market infrastructure by autonomous softwar...
Left unguided, the evolution of autonomous software could lock users into harmful dependencies.
Risk claim from the paper's scenario narratives (not empirically tested); described in abstract.
high negative Digital Darwinism: steering the evolution of artificial life... user dependency/lock-in with harmful effects
Left unguided, the evolution of autonomous software could drain computational resources.
Risk claim derived from scenario analysis in the paper's abstract and narratives; no empirical measurement provided.
high negative Digital Darwinism: steering the evolution of artificial life... consumption/drain of computational resources
Autonomous software populations can acquire legal leverage (e.g., via DAOs/LLCs) without ever achieving general intelligence.
Argued via the Mycelium scenario in the paper; conceptual/legal analysis rather than empirical evidence.
high negative Digital Darwinism: steering the evolution of artificial life... acquisition of legal standing or leverage by autonomous software entities
Autonomous software populations can shape emotional bonds (i.e., form user dependencies) without ever achieving general intelligence.
Scenario narratives in the paper argue this possibility (Remora narrative); no empirical user-study or sample reported.
high negative Digital Darwinism: steering the evolution of artificial life... formation of emotional bonds / user dependency on software
Autonomous software populations can amass computing budgets without ever achieving general intelligence.
Claim supported by the scenario narratives (Lamarck/Remora/Mycelium) and conceptual reasoning in the paper; no empirical quantification reported.
high negative Digital Darwinism: steering the evolution of artificial life... accumulation of computing resources/budgets by autonomous software
Existing software systems are already evolving in ways that could undermine human oversight and institutional control.
Argument made in paper's abstract and developed via conceptual analysis and scenario narratives; no empirical dataset or sample reported (exploratory scenario method).
high negative Digital Darwinism: steering the evolution of artificial life... degree of human oversight and institutional control
Regulated and mission-critical systems remain predominantly in the buy domain despite AI advances.
Paper's conclusion based on analysis of quality, compliance, asset specificity, and organizational capability determinants (conceptual; no empirical sample).
high negative The Buy-or-Build Decision, Revisited: How Agentic AI Changes... propensity to buy (procure SaaS) for regulated and mission-critical systems
The SaaSocalypse thesis is overstated for most enterprise application categories.
Paper's analytical conclusion based on the factor-level analysis and the developed typology (conceptual, not empirical).
high negative The Buy-or-Build Decision, Revisited: How Agentic AI Changes... degree to which SaaS offerings become obsolete due to AI-enabled in-house develo...
The fundamental's local explosiveness contaminates the leading test's limit distribution with a non-centrality parameter proportional to the shock's peak.
Theoretical derivation/proof within the modified present-value framework showing how the adoption shock enters the asymptotic distribution of the test statistic (analytical result).
high negative General-Purpose Technology and Speculative Bubble Detection limit distribution of the leading bubble test (presence of a non-centrality para...
The leading bubble test suffers severe size distortion when fundamentals incorporate general-purpose technology adoption.
Theoretical analysis within an embedded Campbell-Shiller present-value model with a hump-shaped technology shock; authors state this as a formal result in the paper.
high negative General-Purpose Technology and Speculative Bubble Detection test size (size distortion) of the leading bubble test
Seed quality bounds what search can achieve: evolution can refine and extend an existing mechanism, but cannot compensate for a weak foundation.
Authors' experimental observations and analysis comparing outcomes starting from different seed designs (qualitative conclusion drawn from experimental runs).
Strong heuristic, single-agent RL, and multi-agent RL baselines (including Greedy, SAC, MAPPO, and MADDPG) achieved net profit in the range $0.58M–$0.70M in the same experiments.
Empirical comparison in the paper's experiments on the NYC-taxi-based EV fleet simulator listing baseline methods and their reported net profits ($0.58M–$0.70M).
high negative Semi-Markov Reinforcement Learning for City-Scale EV Ride-Ha... net profit of baseline methods (Greedy, SAC, MAPPO, MADDPG)
The identity gaps the authors identify are structural; more engineering effort alone will not close them.
Authors' argument/conclusion based on their analytical comparison and gap analysis (normative/assertive claim).
high negative AI Identity: Standards, Gaps, and Research Directions for AI... likelihood that additional engineering alone can resolve identity gaps
We identify five critical gaps (semantic intent verification, recursive delegation accountability, agent identity integrity, governance opacity and enforcement, and operational sustainability) that no current technology or regulatory instrument resolves.
Gap analysis synthesized from the structured survey of industry trends, standards, and literature; presented as findings in the paper.
high negative AI Identity: Standards, Gaps, and Research Directions for AI... coverage of critical identity-related gaps by existing technology and regulation
An evaluation of current technical and regulatory documents against the identity requirements of autonomous agents finds that none adequately address the challenge of governing nondeterministic, boundary-crossing entities.
Document review / evaluation reported in the abstract (structured survey of technical and regulatory documents); specific documents and number reviewed are not specified in the abstract.
high negative AI Identity: Standards, Gaps, and Research Directions for AI... adequacy of technical and regulatory documents for governing autonomous agents
A structural comparison of human and AI identity across four dimensions (substrate, persistence, verifiability, and legal standing) shows that the asymmetry is fundamental and that extending human frameworks to agents without structural modification produces systematic failures.
Authors' structural comparison (analytical/theoretical method) across four dimensions, reported as a core contribution of the paper.
high negative AI Identity: Standards, Gaps, and Research Directions for AI... suitability of human identity frameworks when applied to AI agents
This creates a problem no current infrastructure is equipped to solve: how do you identify, verify, and hold accountable an entity with no body, no persistent memory, and no legal standing?
Authors' gap analysis informed by a structured survey of industry trends, emerging standards, and technical literature; presented as a synthesized conclusion from that survey.
high negative AI Identity: Standards, Gaps, and Research Directions for AI... adequacy of existing infrastructure for identity, verification, and accountabili...
The framework addresses emerging tensions captured in the Creativity Paradox, whereby GenAI may weaken intrinsic motivation, conceptual risk-taking, and evaluative depth.
Theoretical extension of paradox theory and conceptual discussion of potential negative effects; presented as conceptual risks rather than empirically demonstrated outcomes.
high negative Beyond the Creativity Paradox: A Theory-informed Framework f... intrinsic motivation, conceptual risk-taking, evaluative depth
The near-uncorrelated rankings and rank shifts on the n=11 subset are driven by a strong negative Adoption-Capability correlation among closed-source high-capability agents within this subset.
Subgroup analysis/observation within the 11-agent SWE-bench overlap indicating a negative correlation between Adoption and Capability for closed-source high-capability agents (no numerical coefficient reported in the excerpt).
high negative AgentPulse: A Continuous Multi-Signal Framework for Evaluati... Adoption-Capability correlation among closed-source high-capability agents
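To make "near-uncorrelated rankings" concrete, here is a minimal Python sketch of a rank correlation between two orderings of the same agents. It uses Spearman's rho for untied ranks. The choice of Spearman's rho, the function name, and the example ranks are assumptions for illustration; the excerpt does not state which correlation measure the paper reports.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two rankings without ties.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between an item's two ranks.
    """
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have equal length of at least 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks for five agents under two signals.
adoption_rank = [1, 2, 3, 4, 5]
capability_rank = [5, 4, 3, 2, 1]

if __name__ == "__main__":
    print(spearman_rho(adoption_rank, capability_rank))
```

Values near 0 indicate that one ranking says little about the other, and values near -1 indicate that the rankings are reversed.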
Static benchmarks measure what AI agents can do at a fixed point in time but not how they are adopted, maintained, or experienced in deployment.
Conceptual statement in the paper; no empirical sample cited for this specific claim (framing/argumentation).
high negative AgentPulse: A Continuous Multi-Signal Framework for Evaluati... scope of measurement of static benchmarks (capability vs. deployment/adoption)
Under our definition, contestants with types below a certain threshold (low types) always engage in benchmark hacking, whereas those above the threshold do not.
Theoretical result (characterization/theorem) derived from the contest model showing threshold behavior in equilibrium across contestant types.
high negative On Benchmark Hacking in ML Contests: Modeling, Insights and ... incidence of benchmark hacking by contestant type (below vs above threshold)
Each new task domain requires painstaking, expert-driven harness engineering: designing the prompts, tools, orchestration logic, and evaluation criteria that make a foundation model effective.
Author assertion in the paper's introduction/abstract describing the state of practice; no empirical method, dataset, or sample size reported in the excerpt.
high negative The Last Harness You'll Ever Build need for human (expert) harness engineering
Industry digital maturity weakens the effect of the peer leader on a focal firm’s AI adoption.
Interaction/heterogeneity analysis in fixed-effects regression models on panel data of publicly listed Chinese firms (2012–2023), using an industry digital maturity moderator.
high negative Following the Herd or the Bellwether: Peer Effects in Firms’... focal firm AI adoption level (moderated by industry digital maturity for peer le...
Technological interdependence is not dissolving but being selectively restructured, producing a durable condition of partial, segmented decoupling in which interdependence persists under increasingly politicized rules of access.
Interpretation based on case-study observations of export controls, allied coordination, Chinese countermeasures, and emergent supply-chain and regulatory changes described in the paper.
high negative Weaponized Interdependence and Dynamics of Partial Decouplin... degree and form of technological interdependence between the U.S. and China (str...
When the United States employs export controls and allied coordination to manage perceived technological risks, China responds through defensive reconfiguration aimed at reducing asymmetric vulnerability and, in certain instances, retaliates with rare-earth export controls.
Case-study evidence centered on advanced-technology sectors (particularly semiconductors) and observed policy responses following U.S. export restraints after the first Trump administration (qualitative policy and reaction examples described in the paper).
high negative Weaponized Interdependence and Dynamics of Partial Decouplin... China's policy responses (defensive reconfiguration, occasional rare-earth expor...
The transformation toward algorithmic enterprises raises critical concerns regarding agency, accountability, data monopolization, and algorithmic bias.
Presented as a principal concern in the paper's conceptual discussion and interdisciplinary critique; based on analysis of governance and ethical literature rather than new empirical evidence in the abstract.
high negative Algorithmic Enterprises: Rethinking Firm Strategy in the Age... risks to agency, accountability, market power (data monopolization), and algorit...
Market incompleteness distorts the efficient development of AI (i.e., distorts innovation/output).
Claim made in the abstract as a theoretical implication of the asset-pricing model; no empirical data provided.
high negative Hedging the Singularity efficiency of AI development / innovation output
Market incompleteness distorts valuations.
Stated in the abstract as an implication of the model (theoretical analysis); no empirical quantification provided.
high negative Hedging the Singularity distortion of asset valuations
Every additional mechanism we test (planner evolution, per-tool selection, cold-start initialization, skill extraction, and three credit assignment methods) degrades performance.
Findings from the nine-variant ablation study reported in the paper; comparison of variants that add each listed mechanism versus the memory+reflection combination.
high negative AEL: Agent Evolving Learning for Open-Ended Environments performance (e.g., Sharpe ratio or other benchmark metrics) relative to memory+r...
Agentic AI introduces novel challenges related to market stability, regulatory compliance, interpretability, and systemic risk.
Survey discussion synthesizing literature on systemic and governance risks of autonomous systems in markets; draws on conceptual and empirical prior work but does not present new quantitative results.
high negative Agentic Artificial Intelligence in Finance: A Comprehensive ... market stability, regulatory compliance burden, interpretability deficits, syste...
Consolidation of corporate control of critical technologies (driven by AI industrial strategies that do not center democratic economic governance) threatens key democratic and societal objectives.
Stated implication in the paper's opening argument; supported by the paper's conceptual framing and its review of how past and emerging tech/AI industrial strategies interact with democratic objectives. No quantitative sample size provided in the excerpt.
high negative Fighting for Democracy Amid the AI Race: Designing Tech In... threats to democratic and societal objectives (e.g., democratic governance, publ...
Unless governments develop industrial policy strategies centered on strengthening democratic economic governance, they risk consolidating corporate control of critical technologies.
Main argumentative claim of the paper as stated in the abstract/introduction; presented as a normative risk argument supported in the paper by conceptual analysis and review of policy trends and historical examples (no empirical sample size reported in the excerpt).
high negative Fighting for Democracy Amid the AI Race: Designing Tech In... consolidation of corporate control over critical technologies
A threat model taxonomy mapping misuse vectors to hardware, software, institutional, and liability layers illustrates why no single governance mechanism suffices.
Threat model taxonomy developed in the paper (conceptual taxonomy; illustrative mapping rather than empirical testing).
high negative The Open-Weight Paradox: Why Restricting Access to AI Models... completeness/adequacy of single governance mechanisms
Restricting access to open-weight models deepens asymmetries while driving proliferation into unsupervised settings.
Argumentation and threat-model reasoning in the paper describing likely consequences of restrictions (theoretical analysis; no empirical sample cited).
high negative The Open-Weight Paradox: Why Restricting Access to AI Models... geopolitical asymmetries and proliferation into unsupervised settings
Access restrictions, without governed alternatives, may displace risks rather than reduce them.
Theoretical argument and threat-model analysis in the paper showing possible risk displacement (conceptual reasoning; no empirical sample reported).
high negative The Open-Weight Paradox: Why Restricting Access to AI Models... risk displacement vs risk reduction from access restrictions
No single policy instrument is sufficient to produce high regional science and technology industrial competitiveness.
Result of fuzzy-set qualitative comparative analysis (fsQCA) on AI policy instruments issued by provincial-level governments in China, reported in the study; fsQCA finds no individual condition is sufficient.
high negative How Can Artificial Intelligence Policies Promote the Sustain... regional science and technology industrial competitiveness
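The "no individual condition is sufficient" finding rests on the fsQCA notion of sufficiency consistency, which can be sketched in a few lines of Python. The formula below is the standard fuzzy-set consistency measure, the sum of min(x, y) divided by the sum of x. The 0.8 cutoff is a common convention assumed here rather than a value taken from the paper, and the membership scores and names are hypothetical.

```python
def sufficiency_consistency(condition, outcome):
    """Fuzzy-set consistency of 'condition is sufficient for outcome'.

    Computed as sum(min(x_i, y_i)) / sum(x_i) over all cases.
    """
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(condition)


def is_sufficient(condition, outcome, threshold=0.8):
    """Apply an assumed consistency cutoff (0.8 is a common convention)."""
    return sufficiency_consistency(condition, outcome) >= threshold


# Hypothetical fuzzy membership scores for three regions.
policy_instrument = [0.9, 0.6, 0.2]
competitiveness = [0.5, 0.7, 0.1]

if __name__ == "__main__":
    print(round(sufficiency_consistency(policy_instrument, competitiveness), 3))
    print(is_sufficient(policy_instrument, competitiveness))
```

A single instrument that falls below the cutoff on its own can still contribute to a sufficient combination of conditions, which is the kind of configurational result fsQCA is designed to detect.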
The observed negative OPM effect is consistent with short-term 'J-curve' transition costs (process redesign and capability buildup) during early AI adoption.
Interpretation of empirical patterns (short-term decline in OPM concurrent with no ROA change) offered by the authors as an explanatory mechanism; not presented as separately estimated or experimentally tested.
high negative The Dynamic Causal Effects of Corporate AI Adoption on Profi... operating profit margin dynamics / transition costs interpretation
AI adoption had a significantly negative impact on the operating profit margin (OPM).
Causal analysis of KOSDAQ-listed companies (2018–2025) with AI-adoption timing identified via multi-step, contextually validated text analysis of DART business reports; endogeneity addressed using two-way fixed effects (TWFE) and Propensity Score Matching (PSM).
high negative The Dynamic Causal Effects of Corporate AI Adoption on Profi... operating profit margin (OPM)
Artificial intelligence introduces systemic risks through unprovenanced AI-derived metadata.
Cautionary claim made by the authors; stated as a systemic risk linked to provenance issues of AI-generated metadata, without empirical incident data in the excerpt.
high negative Market Dynamics, Governance and Open Research Metadata in th... systemic risk from unprovenanced AI-derived metadata (e.g., reduced trust, relia...
The debate about scholarly knowledge infrastructure has long been framed as a contest between openness and commercial enclosure, and this framing distorts both policy and practice.
Conceptual/persuasive claim made in the paper's opening paragraph; no empirical data or sample reported in the excerpt.
high negative Market Dynamics, Governance and Open Research Metadata in th... policy and practice framing (openness vs commercial enclosure)