The Commonplace
Home Dashboard Papers Evidence Syntheses Digests 🎲

Evidence (4114 claims)

Adoption
8570 claims
Productivity
7631 claims
Governance
6869 claims
Human-AI Collaboration
6491 claims
Org Design
4175 claims
Innovation
4114 claims
Labor Markets
3566 claims
Skills & Training
2966 claims
Inequality
2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
Clear
Innovation Remove filter
AGI could strain existing governance frameworks.
Paper's policy analysis describing potential mismatches between governance capacity and AGI-induced disruptions (as stated in abstract); no empirical tests or quantification reported in the abstract.
high negative Europe and the Geopolitics of AGI: The Need for a Preparedne... governance_and_regulation
AGI could intensify interstate competition.
Paper's geopolitical analysis and scenario-based reasoning informed by trends in AI capabilities (stated in abstract); no quantitative measures reported in the abstract.
high negative Europe and the Geopolitics of AGI: The Need for a Preparedne... governance_and_regulation
AGI could fundamentally alter the global distribution of economic and military power.
Paper's geopolitical analysis drawing on capability trends and scenario reasoning (as stated in abstract); no empirical quantification provided in the abstract.
high negative Europe and the Geopolitics of AGI: The Need for a Preparedne... governance_and_regulation
Existing AI-generated image detection benchmarks mainly evaluate standalone authenticity classification, cross-generator transfer, or forensic localization, leaving claim-conditioned fraudulent evidence detection underexplored.
Literature/contextual positioning in the paper contrasting prior benchmarks' focus with the proposed task.
high negative FraudBench: A Multimodal Benchmark for Detecting AI-Generate... coverage of existing benchmarks with respect to claim-conditioned fraudulent evi...
There is a clear gap between generic AI image detection and reliable claim-conditioned refund-evidence verification.
Synthesis of experimental findings indicating that existing detectors and MLLMs are insufficiently reliable for the specific task of claim-conditioned refund-evidence verification.
high negative FraudBench: A Multimodal Benchmark for Detecting AI-Generate... reliability/robustness of AI image detectors on claim-conditioned verification
Current MLLMs often recognize real-damaged evidence but fail on many fake-damaged subsets, with fake-damage detection rates (TPR) far below the 50% baseline on most generator subsets.
Experimental results reported in the paper comparing MLLM true positive rates (TPR) on real-damaged vs. fake-damaged subsets produced by multiple generators.
high negative FraudBench: A Multimodal Benchmark for Detecting AI-Generate... true positive rate (TPR) for detecting fake-damaged evidence
Human capital and technological innovation channels show weaker or even negative effects on Lae, attributed to short-term resource misallocation and skill mismatches.
Spatial mediation analysis (channel analysis) using panel data for 30 provincial regions (2012–2022) assessing mediating roles of human capital and technological innovation.
high negative A study of the impact of artificial intelligence on the low-... mediated effect of human capital and technological innovation on Lae
In labor-intensive industries, industrial robots shorten the backward linkage length (i.e., they reduce backward linkage length in labor-intensive sub-sectors).
Heterogeneity analysis in the paper comparing effects across labor-intensive sub-sectors within the panel of 14 manufacturing sub-sectors; reported finding of a negative effect on backward linkage length in labor-intensive industries.
high negative Research on the impact of industrial robot application on th... backward linkage length (a component of global value chain length) in labor-inte...
Institutional inertia in property valuation poses risks to asset pricing, collateral risk modelling and investor confidence.
Analytical inference from interview findings and theoretical synthesis highlighting implications for property investment and financial market stability.
high negative Exploring barriers to valuation technology adoption in prope... risks to asset pricing, collateral risk modelling and investor confidence
Despite advances in automation, data analytics and AI, the sector has been slow to digitise.
Background statement supported by interview data and sector observation reported in the study.
high negative Exploring barriers to valuation technology adoption in prope... pace of digitisation in the property valuation sector
The IDOI framework provides a transferable model for understanding digital transformation in regulated, high-trust professions and highlights the market-level risks of institutional inertia in property valuation.
Development of the IDOI conceptual framework from qualitative data and theoretical integration; authors' claim about transferability and implications.
high negative Exploring barriers to valuation technology adoption in prope... transferability of the framework and market-level risks from institutional inert...
Generational divides, protectionist attitudes and fears of automation reinforce digital resistance.
Qualitative interview evidence reporting attitudes across cohorts of valuers and firm personnel; thematic analysis identifying cultural and attitudinal themes.
high negative Exploring barriers to valuation technology adoption in prope... cultural/attitudinal resistance to VTech
The Valuers Act (1948), fragmented infrastructure and sovereignty concerns limit innovation.
Interview data from practitioners, firm leaders and regulators in New Zealand citing specific regulatory and infrastructure constraints; thematic analysis.
high negative Exploring barriers to valuation technology adoption in prope... regulatory and infrastructure constraints on innovation
Barriers to adoption arise primarily from institutional conservatism, outdated regulation and weak data governance rather than technical shortcomings.
Qualitative semi-structured interviews with valuers, firm leaders and regulators in New Zealand; thematic analysis guided by Rogers' diffusion of innovations and institutional theory synthesised into the IDOI framework.
high negative Exploring barriers to valuation technology adoption in prope... barriers to VTech adoption
LLM hallucinations are infiltrating knowledge production at scale, threatening both the reliability and equity of future scientific discovery as human and AI systems draw on the existing literature.
Synthesis/conclusion drawn from the observed prevalence, growth, distribution across fields and authorship patterns, and limited correction by moderation/publication processes described above.
high negative LLM hallucinations in the wild: Large-scale evidence from no... risk to reliability and equity of scientific discovery (qualitative assessment)
Preprint moderation and journal publication processes capture only a fraction of these errors.
Comparison of hallucinated-reference prevalence in preprints versus versions that underwent moderation or journal publication, showing many errors remain uncaught.
high negative LLM hallucinations in the wild: Large-scale evidence from no... fraction of hallucinated references detected/removed by moderation and publicati...
Patent text similarity analysis confirms a 'homogenization trap' (AI-associated increases in patent-text similarity).
Text-similarity analysis of patent documents reported in the paper showing increased patent similarity associated with AI use.
high negative The Inverted-U Relationship Between AI and Corporate Innovat... patent text similarity (homogenization of patent content)
Industry concentration negatively moderates the AI–innovation relationship.
Moderation analysis/interacted fixed-effects models indicating that higher industry concentration weakens the AI→innovation effect.
high negative The Inverted-U Relationship Between AI and Corporate Innovat... moderating effect of industry concentration on AI → innovation
The relative importance of AI-related equities as shock transmitters diminishes over time.
Time-varying estimates from the TVP-VAR showing a decline in the net transmitter contribution of the AI equities group across the sample.
high negative Artificial Intelligence and Financial Market Connectedness: ... relative contribution of AI equities to spillovers
The overall level of connectedness declines modestly following the public release of ChatGPT by OpenAI in November 2022.
Time-series comparison of aggregate connectedness measures derived from the TVP-VAR, with a reported modest post-November 2022 decline (event reference: ChatGPT release).
high negative Artificial Intelligence and Financial Market Connectedness: ... aggregate connectedness level
A policy irreversibility result: there exists a critical time before the singularity after which redistribution becomes politically impossible because wealth concentration makes feasible tax rates vanishingly small.
Proof/argument in the paper showing that as time approaches the singularity the set of tax rates that satisfy political-feasibility constraints (workers' budget / feasibility) shrinks to zero, implying a latest feasible intervention time.
high negative The Economic Singularity: Core Mathematical Model political_feasibility_of_redistribution (feasible tax rates over time)
Financialization amplifies the exponent of the super-exponential divergence by a factor γ_F/η.
Mathematical derivation in the paper showing that the exponent in the asymptotic growth rate near the singularity is multiplied by γ_F/η when including the financialization term γ_F K_f^2 and its coupling parameter η.
high negative The Economic Singularity: Core Mathematical Model growth_exponent of wealth_ratio (asymptotic)
Near the singularity, the wealth ratio between capital owners and workers diverges super-exponentially.
Asymptotic analysis near the finite-time singularity showing that the ratio of capital-owner wealth to worker wealth grows faster than exponential (super-exponentially) as time approaches the blow-up time.
high negative The Economic Singularity: Core Mathematical Model wealth_ratio (capital owners vs. workers)
AI adoption deepens the negative indirect effect of CEO–TMT faultlines on green innovation via reduced eco-attention (moderated mediation).
Reported moderated mediation analysis on the panel dataset (35,347 firm-year observations) showing that AI moderates the indirect path from CEO–TMT faultlines to green innovation through eco-attention, making the indirect effect more negative when AI is greater.
high negative When AI Amplifies Negative Echoes: CEO–TMT Faultlines, Eco-A... green innovation (indirect effect via eco-attention)
AI technology strengthens the negative relationship between CEO–TMT faultlines and eco-attention (AI exacerbates the adverse effect of faultlines on eco-attention).
Moderation/interaction analysis reported in the paper using the same panel dataset (35,347 firm-year observations) indicating a significant interaction between AI adoption and CEO–TMT faultlines on eco-attention.
CEO–TMT faultlines reduce eco-attention (organizational attention to environmental issues).
Direct association reported in the paper from regression/mediation models using the panel dataset (35,347 firm-year observations) showing a negative relationship between CEO–TMT faultlines and eco-attention.
CEO–TMT faultlines negatively affect green innovation through reduced eco-attention.
Empirical mediation analysis on the panel dataset (35,347 firm-year observations, 2010–2023) testing CEO–TMT faultlines -> eco-attention -> green innovation.
high negative When AI Amplifies Negative Echoes: CEO–TMT Faultlines, Eco-A... green innovation (mediated by eco-attention)
AGI (Artificial General Intelligence) is problematic both conceptually and definitionally.
Authorial assertion in the paper stating AGI is problematic as a concept and definition; framed as a conditioning assumption that shapes the subsequent analysis.
high negative Pathways to AGI conceptual_and_definitional_soundness_of_AGI
The paper argues we should avoid assuming the inevitability of the current situation relating to AI (i.e., the current commercial AI development trajectory is not inevitable).
Authorial methodological claim in the paper's framing/introductory text; presented as a normative methodological stance rather than empirical evidence.
high negative Pathways to AGI policy_assumption_of_inevitability
Across short stories, marketing slogans, and alternative-uses tasks, three frontier LLMs fall below parity across crowding kernels.
Empirical experiments reported in the paper evaluating three frontier large language models on three task domains (short stories, marketing slogans, alternative-uses) and finding ρ < 1 (below parity) across crowding kernels. The abstract specifies three models but does not report the number of generated samples per model or other sample-size details.
high negative Ex Ante Evaluation of AI-Induced Idea Diversity Collapse human-relative diversity ratio (ρ) indicating excess crowding
This creates an evaluation blind spot, as AI can improve individual outputs while increasing population-level crowding.
Theoretical/ conceptual claim in the paper arguing that improvements at the individual-output level can still increase similarity (crowding) at the population level; no empirical numbers given in the abstract.
high negative Ex Ante Evaluation of AI-Induced Idea Diversity Collapse population-level crowding (diversity collapse)
Creative AI systems are typically evaluated at the level of individual utility, yet creative outputs are consumed in populations: an idea loses value when many others produce similar ones.
Conceptual argument presented in the paper's introduction motivating a population-level perspective on creative outputs (no empirical sample size reported).
high negative Ex Ante Evaluation of AI-Induced Idea Diversity Collapse loss of value due to similarity (population-level creative value)
The reform reduces industrial wastewater discharge, which improves agricultural production conditions (mechanism linking the reform to higher grain yield).
Mechanism analysis in the paper reporting reductions in industrial wastewater discharge following the reform (mediation channel analysis).
high negative Can water resource tax reform increase grain yield?—Evidence... industrial wastewater discharge
Tabular data does not have a foundation model that understands it natively; every approach to tabular AI today (from gradient-boosted trees to the latest tabular foundation models) requires a preprocessing pipeline before any model can consume the data.
Paper's survey/positioning statement asserting the current state of tabular AI approaches and their reliance on preprocessing pipelines (no specific empirical dataset given).
high negative Data Language Models: A New Foundation Model Class for Tabul... presence/absence of a native tabular foundation model and the need for preproces...
DePAI entails risks including security, centralization, incentive failure, legal exposure, and the crowding-out of intrinsic motivation, requiring value-sensitive design and continuously adaptive governance.
Risk analysis and conceptual argument in the paper identifying possible failure modes and recommended design/governance responses; no empirical incidence data provided.
high negative DAO-enabled decentralized physical AI: A new paradigm for hu... security, centralization, incentive failure, legal exposure, and intrinsic motiv...
Mechanism tests indicate innovation stagnation in mature firms with redundant AI is a pathway that limits productivity gains (i.e., AI can be associated with stagnant innovation in mature firms).
Mechanism analysis reported in the paper showing signs of reduced innovation-related gains or stagnation in mature, advanced firms using AI (interpreted as redundant AI leading to limited incremental innovation).
high negative The Heterogeneous Effects of Artificial Intelligence on Ente... Innovation activity / productivity implications
Of these four, integration capacity is the least developed for scientific institutions and the most binding: no improvement in AI tooling can buy it.
Normative/diagnostic claim in the paper about relative scarcity and irreducibility of integration capacity; no empirical measures or sample provided in the excerpt.
high negative AI-Augmented Science and the New Institutional Scarcities relative development of integration capacity in scientific institutions and its ...
Four complements then become scarce and load-bearing for AI-augmented science: verified signal, legitimacy, authentic provenance, and integration capacity (the community's tolerance for delegated cognition).
Theoretical framework proposed by the paper; list of four complements presented as an argument without empirical quantification in the excerpt.
high negative AI-Augmented Science and the New Institutional Scarcities scarcity of verified signal, legitimacy, authentic provenance, and integration c...
Frontier software engineering agents have saturated short-horizon benchmarks while regressing on the work that constitutes senior engineering: long-horizon, multi-engineer, ambiguous-specification deliverables.
Position asserted in the paper based on literature/benchmark trends and authors' field observations; no original empirical dataset or quantified analysis provided in the paper text excerpt.
high negative The Conversations Beneath the Code: Triadic Data for Long-Ho... performance on short-horizon benchmarks versus performance on long-horizon, mult...
The most valuable AI capabilities (reasoning, judgment, intuition) are precisely those we cannot verify with current methods.
Argumentative claim in the position paper linking capability value to unverifiability; no empirical validation or measurement of 'value' or verifiability included.
high negative Reliable AI Needs to Externalize Implicit Knowledge: A Human... verifiability of high-level AI capabilities (reasoning, judgment, intuition)
Current reliability methods can only verify explicit knowledge against sources, creating a fundamental gap in verifying AI's implicit knowledge.
Conceptual critique in the paper of existing verification/validation approaches; no systematic review or empirical comparison provided.
high negative Reliable AI Needs to Externalize Implicit Knowledge: A Human... verifiability of AI knowledge (explicit vs implicit)
Implicit knowledge remains unexternalized because documentation cost exceeds perceived value.
Presented as an economic/theoretical explanation in the paper; no empirical study, sample, or cost estimates provided.
high negative Reliable AI Needs to Externalize Implicit Knowledge: A Human... degree of externalization of implicit knowledge (documentation vs tacit retentio...
Whether it is the periodic compulsory recoinage in medieval Europe or Gesell's stamp scrip, both are essentially mechanisms for taxing money holdings.
Interpretive/historical claim presented by the authors; no empirical testing or sample reported in the excerpt.
high negative RSDM: The Consensus Honest Money in the AI Era degree_to_which_historical_monetary_policies_function_as_a_tax_on_money_holdings
The devaluation of money runs through almost the whole process of history, from the weight reduction and purity decrease of metallic coin to the unanchored over-issuance of paper currency.
Historical summary/claim by the authors referencing long-run monetary history; no specific empirical study or sample size given in the excerpt.
high negative RSDM: The Consensus Honest Money in the AI Era occurrence_of_currency_devaluation_over_history
Current AI agents implement only the first half of CLS (fast exemplar/hippocampal-style storage) and lack the slow weight-consolidation half.
Analytic claim in paper comparing current AI agent designs to CLS; no empirical evaluation reported in abstract.
high negative Contextual Agentic Memory is a Memo, Not True Memory presence/absence of slow weight-consolidation mechanisms in AI agents
Agents that rely only on lookup are structurally vulnerable to persistent memory poisoning as injected content propagates across all future sessions.
Theoretical/security argument presented in paper; claims about propagation of injected content across sessions; no empirical attack experiments detailed in abstract.
high negative Contextual Agentic Memory is a Memo, Not True Memory vulnerability to persistent memory poisoning
Conflating the two produces agents that face a provable generalization ceiling on compositionally novel tasks that no increase in context size or retrieval quality can overcome.
Formal claim asserted in paper (formalization of limitations and proofs claimed); no empirical sample detailed in abstract.
high negative Contextual Agentic Memory is a Memo, Not True Memory generalization performance on compositionally novel tasks
Conflating retrieval and weight-based memory produces agents that accumulate notes indefinitely without developing expertise.
Theoretical argument/formalization presented in paper; claim based on analysis of how lookup-only systems fail to consolidate abstract knowledge; no empirical sample reported in abstract.
high negative Contextual Agentic Memory is a Memo, Not True Memory expertise development / continued accumulation of notes
Treating lookup as memory is a category error with provable consequences for security.
Theoretical/formal argument and formalization in paper; security consequences (e.g., persistent poisoning) claimed; no empirical sample reported in abstract.
high negative Contextual Agentic Memory is a Memo, Not True Memory security (vulnerability to persistent memory poisoning)
Treating lookup as memory is a category error with provable consequences for long-term learning.
Theoretical/formal argument asserted in the paper, drawing on formalization and Complementary Learning Systems theory; no empirical sample reported in abstract.
high negative Contextual Agentic Memory is a Memo, Not True Memory long-term learning