The Commonplace

Evidence (13827 claims)

Adoption: 8454 claims
Productivity: 7544 claims
Governance: 6789 claims
Human-AI Collaboration: 6327 claims
Org Design: 4126 claims
Innovation: 4058 claims
Labor Markets: 3520 claims
Skills & Training: 2924 claims
Inequality: 2057 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 749 195 97 889 1979
Governance & Regulation 815 391 188 121 1539
Organizational Efficiency 771 189 124 83 1177
Technology Adoption Rate 624 233 123 96 1084
Research Productivity 410 121 56 331 929
Output Quality 466 177 59 47 749
Decision Quality 320 174 75 42 618
Firm Productivity 435 55 88 20 604
AI Safety & Ethics 214 276 65 33 593
Market Structure 178 166 122 24 495
Task Allocation 206 64 70 31 376
Skill Acquisition 165 57 60 17 299
Innovation Output 201 27 41 18 288
Employment Level 105 51 107 13 278
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 116 63 42 11 232
Firm Revenue 149 46 26 3 224
Inequality Measures 44 122 49 6 221
Task Completion Time 169 29 8 12 219
Worker Satisfaction 89 61 20 12 182
Error Rate 69 91 10 2 172
Regulatory Compliance 76 68 14 5 163
Training Effectiveness 92 19 13 19 145
Wages & Compensation 77 36 25 6 144
Automation Exposure 51 54 22 12 142
Team Performance 86 17 27 9 140
Developer Productivity 94 17 14 6 132
Job Displacement 12 80 20 1 113
Hiring & Recruitment 51 7 8 3 69
Skill Obsolescence 5 45 6 1 57
Creative Output 31 16 7 2 57
Social Protection 27 16 8 2 53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
Switchcraft saves over $3,600 per million queries.
Cost-savings estimate reported in the paper, obtained by applying the measured 84% reduction to a million-query baseline.
high positive Switchcraft: AI Model Router for Agentic Tool Calling monetary savings per million queries
Switchcraft reduces inference cost by 84%.
Empirical cost analysis reported in the paper comparing inference cost with and without Switchcraft.
high positive Switchcraft: AI Model Router for Agentic Tool Calling inference cost reduction
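As a rough check on how the two cost figures above relate, the sketch below backs out the baseline cost implied by an 84% reduction that yields $3,600 in savings per million queries. The function name is hypothetical, and the assumption that both figures refer to the same baseline is ours, not stated in the paper. Because the claim is "over $3,600", the results are best read as lower bounds.

```python
def implied_baseline_cost(savings: float, reduction: float) -> float:
    """Baseline cost implied by the identity savings = reduction * baseline."""
    if not 0 < reduction <= 1:
        raise ValueError("reduction must be in (0, 1]")
    return savings / reduction

# Figures from the claims above: $3,600 saved per million queries, 84% reduction.
baseline = implied_baseline_cost(3600.0, 0.84)
remaining = baseline - 3600.0  # implied cost per million queries with routing

print(f"implied baseline: ${baseline:,.2f} per million queries")
print(f"implied routed cost: ${remaining:,.2f} per million queries")
```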
Switchcraft's accuracy matches or exceeds the best individual model.
Empirical comparison reported in the paper between Switchcraft accuracy (82.9%) and accuracies of individual models (details summarized by authors).
high positive Switchcraft: AI Model Router for Agentic Tool Calling relative accuracy compared to individual models
Switchcraft achieves 82.9% accuracy.
Empirical evaluation results reported in the paper (accuracy metric measured on the evaluation framework).
Switchcraft operates inline, selecting the lowest-cost model subject to correctness.
Method description in the paper describing Switchcraft's operational design.
high positive Switchcraft: AI Model Router for Agentic Tool Calling model selection strategy (cost minimization constrained by correctness)
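The selection rule described above (lowest cost subject to correctness) can be sketched as a constrained minimum. This is a minimal illustration, not Switchcraft's implementation: the `Candidate` fields, the model names, the costs, and the idea of a predicted-correctness score compared against a threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class Candidate:
    name: str               # hypothetical model label
    cost_per_query: float   # assumed cost in dollars
    p_correct: float        # assumed predicted probability of a correct tool call

def route(candidates: Sequence[Candidate], min_p_correct: float) -> Optional[Candidate]:
    """Return the cheapest candidate whose predicted correctness meets the threshold."""
    eligible = [c for c in candidates if c.p_correct >= min_p_correct]
    if not eligible:
        return None  # the caller chooses a fallback, e.g. the most accurate model
    return min(eligible, key=lambda c: c.cost_per_query)

# Invented numbers, for illustration only.
models = [
    Candidate("small", 0.0005, 0.70),
    Candidate("medium", 0.0020, 0.88),
    Candidate("large", 0.0100, 0.93),
]
choice = route(models, min_p_correct=0.85)
print(choice.name if choice else "no eligible model")
```

Raising the threshold pushes traffic toward larger models; a threshold no candidate meets returns `None`, so the fallback policy stays explicit.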
We present Switchcraft, the first (to the best of our knowledge) model router optimized for agentic tool calling.
Authors' stated contribution / novelty claim in the paper (method description: Switchcraft).
high positive Switchcraft: AI Model Router for Agentic Tool Calling availability of a router optimized for agentic tool calling
Hallucinated references disproportionately assign credit to already prominent and male scholars, suggesting LLM-generated errors may reinforce existing inequities in scientific recognition.
Analysis linking hallucinated citations to characteristics of the (intended or assigned) cited authors, including measures of prominence and inferred gender, showing over-representation of prominent and male scholars among hallucinated attributions.
high positive LLM hallucinations in the wild: Large-scale evidence from no... distribution of (hallucinated) citation credit by cited-author prominence and ge...
Hallucinated references are especially pronounced among small and early-career author teams.
Analysis of hallucination prevalence by author-team characteristics (team size and author career stage) within the audited dataset.
high positive LLM hallucinations in the wild: Large-scale evidence from no... rate of hallucinated references by team size and author career stage
Hallucinated references are especially pronounced in manuscripts with linguistic signatures of AI-assisted writing.
Classification of manuscripts by linguistic features (signatures) indicative of AI-assistance and comparison of hallucination prevalence between groups.
high positive LLM hallucinations in the wild: Large-scale evidence from no... association between AI-writing linguistic signatures and presence of hallucinate...
These errors are diffusely embedded across many papers but especially pronounced in fields with rapid AI uptake.
Cross-field comparison within the audited dataset showing higher rates of non-existent references in fields identified as having rapid AI adoption.
high positive LLM hallucinations in the wild: Large-scale evidence from no... rate/prevalence of hallucinated references by research field
We provide a conservative estimate of 146,932 hallucinated citations in 2025 alone.
Quantitative extrapolation/estimation from the audit of references in the dataset, producing an annualized (2025) conservative count.
high positive LLM hallucinations in the wild: Large-scale evidence from no... count of hallucinated citations in 2025
We find a sharp rise in non-existent references following widespread LLM adoption.
Temporal analysis of the audited references comparing prevalence of non-existent (hallucinated) citations before and after the period of widespread LLM adoption across the 111M-reference dataset.
high positive LLM hallucinations in the wild: Large-scale evidence from no... prevalence of non-existent (hallucinated) references over time
The Analysis Contract framework generalizes across domains of vibe inference through domain-specific instantiation.
Theoretical claim and conceptual generalization proposed in the paper; no cross-domain empirical tests or case studies reported.
high positive Vibe Econometrics and the Analysis Contract applicability/generalizability of the Analysis Contract across domains
The Analysis Contract, a proposed pre-commitment framework, can adapt the logic of pre-analysis plans and the Causal Roadmap to the AI-assisted setting by imposing three conditions before a causal claim is made: a method-data contract, a data audit, and a pre-commitment statement defining what would count as a disconfirming result.
Proposed methodological/framework contribution in the paper; described and motivated conceptually, without empirical validation or implementation evidence.
high positive Vibe Econometrics and the Analysis Contract governance of AI-assisted causal claims / credibility of causal claims under AI ...
The paper extends the TOE (Technology-Organization-Environment) framework by identifying an optimal AI adoption range and empirically validating the homogenization trap.
Theoretical contribution claimed in the discussion, linking the empirical inverted-U and homogenization findings back to the TOE framework.
high positive The Inverted-U Relationship Between AI and Corporate Innovat... theoretical extension of TOE framework
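One common way to locate the peak of an inverted-U relationship is to fit a quadratic and take its turning point, -b1 / (2 * b2). The sketch below does this on synthetic, noise-free data; the variable names and the data are assumptions for illustration, and the paper's actual specification (controls, fixed effects, how the "optimal range" is defined) is not reproduced here.

```python
import numpy as np

def turning_point(ai_intensity: np.ndarray, innovation: np.ndarray) -> float:
    """Peak of a fitted quadratic innovation = b0 + b1*x + b2*x^2."""
    b2, b1, _b0 = np.polyfit(ai_intensity, innovation, deg=2)
    if b2 >= 0:
        raise ValueError("fitted curve is not an inverted U (b2 >= 0)")
    return -b1 / (2.0 * b2)

# Synthetic data with a known peak at x = 0.4 (invented for illustration).
x = np.linspace(0.0, 1.0, 21)
y = 1.0 + 4.0 * x - 5.0 * x**2
peak = turning_point(x, y)
print(round(peak, 3))
```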
AI’s enabling effect on innovation is more sustainable in high-technology firms (relative to low-tech firms).
Heterogeneity analyses by firm technology intensity (high-tech vs. others) showing more sustained positive AI effects in high-tech firms.
high positive The Inverted-U Relationship Between AI and Corporate Innovat... sustainability/strength of AI’s effect on firm innovation by tech-intensity
AI’s enabling effect on innovation is more sustainable in non-state-owned firms (compared to state-owned firms).
Heterogeneity analyses by ownership type reported in the paper showing stronger/sustained positive AI–innovation effects for non-state-owned firms.
high positive The Inverted-U Relationship Between AI and Corporate Innovat... sustainability/strength of AI’s effect on firm innovation by ownership type
Firm absorptive capacity partially mediates the AI–innovation relationship.
Bootstrap mediation analysis performed on the sample indicating a partial mediation effect of absorptive capacity between AI and innovation.
high positive The Inverted-U Relationship Between AI and Corporate Innovat... role of absorptive capacity as mediator in AI → innovation pathway
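A percentile-bootstrap test of an indirect effect, in the spirit of the mediation analysis described above, can be sketched as follows. Everything here is an assumption for illustration: the data are simulated, the model is a plain linear x → m → y system without controls, and the 1,000 resamples and 95% interval are conventional choices rather than the paper's settings.

```python
import numpy as np

def ols_coefs(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """OLS coefficients for y ~ 1 + X, intercept first."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def indirect_effect(x: np.ndarray, m: np.ndarray, y: np.ndarray) -> float:
    a = ols_coefs(x, m)[1]                         # path x -> m
    b = ols_coefs(np.column_stack([x, m]), y)[2]   # path m -> y, holding x fixed
    return float(a * b)

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                       # stand-in for AI adoption
m = 0.5 * x + rng.normal(size=n)             # stand-in for absorptive capacity
y = 0.3 * x + 0.4 * m + rng.normal(size=n)   # stand-in for innovation

point = indirect_effect(x, m, y)
boot = []
for _ in range(1000):
    idx = rng.integers(0, n, size=n)         # resample rows with replacement
    boot.append(indirect_effect(x[idx], m[idx], y[idx]))
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect {point:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

An interval that excludes zero alongside a non-zero direct effect of x on y is the usual reading of partial mediation.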
The positive effect of GGFs on digital–intelligent transformation is particularly strong for firms with robust dynamic capabilities.
Heterogeneity analysis reported in the paper comparing effects across firms with differing levels of dynamic capabilities using the DID sample of Chinese A-share listed firms (2012–2024).
high positive Government-Guided Funds and Corporate Digital–Intelligent Tr... corporate digital–intelligent transformation (heterogeneous effect by dynamic ca...
The positive effect of GGFs on digital–intelligent transformation is particularly strong for firms operating in high-tech industries.
Heterogeneity analysis reported in the paper comparing effects across industries (high-tech vs. others) using the DID sample of Chinese A-share listed firms (2012–2024).
high positive Government-Guided Funds and Corporate Digital–Intelligent Tr... corporate digital–intelligent transformation (heterogeneous effect by industry t...
The positive effect of GGFs on digital–intelligent transformation is particularly strong in firms with high-quality internal controls.
Heterogeneity analysis reported in the paper comparing effects across firms with different internal control quality using the DID sample of Chinese A-share listed firms (2012–2024).
high positive Government-Guided Funds and Corporate Digital–Intelligent Tr... corporate digital–intelligent transformation (heterogeneous effect by internal c...
GGFs promote firms’ digital–intelligent transformation by encouraging knowledge spillovers.
Mechanism analysis reported in the paper that identifies knowledge spillovers as a channel from GGFs to firm-level digital–intelligent transformation, using the DID framework on Chinese A-share listed firms (2012–2024).
high positive Government-Guided Funds and Corporate Digital–Intelligent Tr... corporate digital–intelligent transformation (mediated by knowledge spillovers)
GGFs promote firms’ digital–intelligent transformation by transmitting policy guidance.
Mechanism analysis reported in the paper indicating a pathway from GGFs to firm transformation via policy guidance channels, based on the DID sample of Chinese A-share listed firms (2012–2024).
high positive Government-Guided Funds and Corporate Digital–Intelligent Tr... corporate digital–intelligent transformation (mediated by policy guidance transm...
GGFs promote firms’ digital–intelligent transformation by easing firms' financing constraints.
Mechanism analysis reported in the paper (mediation / pathway analysis tied to the DID framework) using the same sample of Chinese A-share listed firms (2012–2024).
high positive Government-Guided Funds and Corporate Digital–Intelligent Tr... corporate digital–intelligent transformation (mediated by financing constraints)
Government-guided funds (GGFs) significantly promote firms’ digital–intelligent transformation.
Difference-in-differences (DID) analysis applied to Chinese A-share listed firms over 2012–2024, as reported in the paper's main empirical results.
high positive Government-Guided Funds and Corporate Digital–Intelligent Tr... corporate digital–intelligent transformation
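For readers unfamiliar with difference-in-differences, the core estimate is the change in the treated group minus the change in the control group. The sketch below shows the simplest two-group, two-period version on invented numbers. It is a simplification: the paper's specification (likely a regression with fixed effects and controls on a firm-year panel) is not reproduced, and the function and argument names are hypothetical.

```python
from statistics import mean
from typing import Sequence

def did_estimate(
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
    control_pre: Sequence[float],
    control_post: Sequence[float],
) -> float:
    """(Change in treated mean) minus (change in control mean)."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical transformation-index values, invented for illustration only.
effect = did_estimate(
    treated_pre=[1.0, 1.2, 0.8],
    treated_post=[1.9, 2.1, 2.0],
    control_pre=[1.1, 0.9, 1.0],
    control_post=[1.4, 1.6, 1.5],
)
print(round(effect, 3))
```

The estimate is only interpretable as a causal effect under the parallel-trends assumption, which the toy data simply take for granted.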
A lightweight pre-generation router exceeds the best cascade policy on four of five datasets, mainly because it avoids the cheap model's generation cost on queries sent directly to a larger model rather than because of a stronger routing signal.
Empirical experiments across the five benchmarks show the pre-generation router outperforming the best cascade policy on four of five datasets; the accompanying analysis attributes the advantage primarily to avoided generation cost rather than to a stronger routing signal.
high positive Is Escalation Worth It? A Decision-Theoretic Characterizatio... number of datasets where pre-generation router outperforms best cascade; driver ...
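The cost argument in this claim can be made concrete with a back-of-the-envelope comparison. In the sketch below, a cascade always pays for the cheap model and then escalates a fraction of queries, while a pre-generation router sends that same fraction straight to the large model. The cost values, the router overhead term, and the assumption that both policies escalate exactly the same queries (that is, an equally good routing signal) are ours, made only to isolate the avoided-generation effect.

```python
def cascade_cost(c_small: float, c_large: float, p_escalate: float) -> float:
    """Expected per-query cost: always run the small model, then escalate some queries."""
    return c_small + p_escalate * c_large

def router_cost(c_small: float, c_large: float, p_large: float, c_router: float = 0.0) -> float:
    """Expected per-query cost: pay the router, then run exactly one model."""
    return c_router + (1.0 - p_large) * c_small + p_large * c_large

# Hypothetical relative costs and escalation rate, invented for illustration.
c_small, c_large, p, c_router = 1.0, 10.0, 0.3, 0.05
saving = cascade_cost(c_small, c_large, p) - router_cost(c_small, c_large, p, c_router)
print(round(saving, 4))  # equals p * c_small - c_router under these assumptions
```

With an identical escalation set, the router's advantage is exactly the small-model cost on escalated queries minus the router's own overhead, so it grows with the escalation rate.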
Broader equity markets, proxied by the S&P 500, remain the dominant source of spillovers throughout the sample period.
Directional spillover results from the TVP-VAR indicating that the S&P 500 makes the largest and most persistent net outward spillover contributions over the full sample.
high positive Artificial Intelligence and Financial Market Connectedness: ... dominance in net spillover contributions
AI-related equities initially act as net transmitters of shocks.
Directional spillover measures from the TVP-VAR showing that the AI equity group had positive net directional connectedness early in the sample.
high positive Artificial Intelligence and Financial Market Connectedness: ... net directional spillovers (net transmitter status)
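Net directional spillovers of the kind cited above are typically computed from a forecast-error variance decomposition table, where entry (i, j) is the share of variable i's forecast-error variance attributed to shocks in variable j. The sketch below applies the standard to/from/net definitions to a hypothetical 3x3 table. The numbers and labels are invented, and the TVP-VAR estimation step that would produce such a table at each date is not shown.

```python
import numpy as np

def net_spillovers(decomp: np.ndarray) -> np.ndarray:
    """Net directional connectedness: spillovers sent TO others minus received FROM others."""
    off_diag = decomp - np.diag(np.diag(decomp))   # drop own-variance shares
    to_others = off_diag.sum(axis=0)               # column sums: shocks in j hitting others
    from_others = off_diag.sum(axis=1)             # row sums: what i receives from others
    return to_others - from_others

labels = ["S&P 500", "AI equities", "Other asset"]  # hypothetical ordering
decomp = np.array([
    [0.70, 0.20, 0.10],
    [0.30, 0.60, 0.10],
    [0.35, 0.25, 0.40],
])  # rows sum to 1; values invented for illustration

net = net_spillovers(decomp)
for label, value in zip(labels, net):
    role = "net transmitter" if value > 0 else "net receiver"
    print(f"{label}: {value:+.2f} ({role})")
```

Net values sum to zero by construction, so one market's net transmission is always matched by net reception elsewhere.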
The theoretical superiority of SignSGD accurately predicts its faster convergence during the pretraining of a 124M parameter GPT-2 model.
Empirical experiment reported in the paper: pretraining runs of a 124M-parameter GPT-2 model comparing SignSGD (or Muon) vs baseline SGD/variants; details (number of runs, datasets, seeds) are not provided in the abstract.
high positive When and Why SignSGD Outperforms SGD: A Theoretical Study Ba... empirical optimization/convergence speed during GPT-2 pretraining
Extending the sign operator to matrices preserves the optimal scaling with dimensionality: we provide an equivalent optimal lower bound for the Muon optimizer in the matrix domain.
Theoretical extension of the analysis to matrix-valued problems and derivation of a matching optimal lower bound for the Muon optimizer, demonstrating preserved scaling.
high positive When and Why SignSGD Outperforms SGD: A Theoretical Study Ba... optimal lower bound / scaling with dimensionality for matrix sign-based optimiza...
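The "sign operator extended to matrices" can be illustrated by the orthogonal polar factor: for a gradient matrix G with SVD G = U S V^T, the matrix sign is U V^T, which sets every non-zero singular value to one. The sketch below computes it exactly with an SVD. This is an illustration only: practical Muon implementations generally approximate this quantity iteratively rather than computing an SVD, and the function name is hypothetical.

```python
import numpy as np

def matrix_sign(G: np.ndarray) -> np.ndarray:
    """Orthogonal polar factor U @ V^T of G, computed from a thin SVD."""
    U, _singular_values, Vt = np.linalg.svd(G, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(0)
G = rng.normal(size=(4, 3))          # stand-in for a weight-matrix gradient
O = matrix_sign(G)

# For a diagonal matrix the operator reduces to the elementwise sign of the diagonal.
D = matrix_sign(np.diag([2.0, -3.0]))
print(np.round(D, 6))
```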
SignSGD effectively reduces the complexity by a factor of d under sparse noise, where d is the problem dimension (comparison of SignSGD upper bound with SGD lower bound shows a factor-d improvement).
Theoretical comparison between the derived upper bound for SignSGD and the derived lower bound for SGD within the paper, under the separable/sparse noise model and specified smoothness assumptions.
high positive When and Why SignSGD Outperforms SGD: A Theoretical Study Ba... optimization complexity (iterations/queries) to reach ℓ1-stationarity under spar...
Under this distinct problem geometry (ℓ1-stationarity, ℓ∞-smoothness, separable noise), we derive matched upper and lower bounds for SignSGD and explicitly characterize the problem class in which SignSGD provably dominates SGD.
Theoretical derivation of both upper bounds (for SignSGD) and matching lower bounds (for the problem class) presented in the paper; proofs establishing tightness.
high positive When and Why SignSGD Outperforms SGD: A Theoretical Study Ba... convergence bounds (upper and lower) for SignSGD under specified assumptions
By analyzing sign-based optimizers under ℓ1-norm stationarity, ℓ∞-smoothness, and a separable noise model, we can better capture the coordinate-wise nature of signed updates and overcome the barrier that prevents sign-based methods from outperforming SGD in standard settings.
Theoretical analysis in the paper introducing these alternative geometric/assumption settings (ℓ1-stationarity, ℓ∞-smoothness, separable noise) and deriving results under these assumptions.
high positive When and Why SignSGD Outperforms SGD: A Theoretical Study Ba... applicability of sign-based optimizer analysis and potential for improved conver...
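The coordinate-wise nature of signed updates is easiest to see in the update rule itself: each coordinate moves by a fixed step in the direction of its gradient's sign, regardless of gradient magnitude. The sketch below runs that rule on a toy quadratic with exact gradients. The objective, step size, and starting point are assumptions for illustration; the paper's stochastic setting, noise model, and step-size schedule are not reproduced.

```python
import numpy as np

def signsgd_step(x: np.ndarray, grad: np.ndarray, lr: float) -> np.ndarray:
    """One SignSGD update: every coordinate moves by exactly lr (or stays put if grad is 0)."""
    return x - lr * np.sign(grad)

def grad_quadratic(x: np.ndarray) -> np.ndarray:
    """Gradient of the toy objective f(x) = 0.5 * ||x||^2."""
    return x

x = np.array([1.0, -2.0, 0.5])
lr = 0.1
for _ in range(30):
    x = signsgd_step(x, grad_quadratic(x), lr)

# With a constant step, each coordinate ends up within one step of the minimizer.
print(np.abs(x).max() <= lr + 1e-9)
```

Because the step length ignores gradient magnitude, a constant learning rate leaves the iterate oscillating within one step of the optimum, which is why decaying schedules are normally paired with sign-based methods.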
The results imply an urgency of early intervention in AI-driven economies to avoid extreme inequality and loss of redistribution options.
Synthesis and policy discussion in the paper based on the finite-time singularity, super-exponential divergence of wealth ratios, and the policy-irreversibility result.
high positive The Economic Singularity: Core Mathematical Model policy urgency / timing of intervention
Under mild conditions, the system exhibits a finite-time singularity where AI capability, AI capital, and financial capital diverge.
Analytical dynamical-systems analysis and proofs in the paper demonstrating finite-time blow-up (singularity) of A (AI capability), K_a (AI capital), and K_f (financial capital) for parameter ranges satisfying the stated mild conditions.
high positive The Economic Singularity: Core Mathematical Model innovation output (AI capability) and financial capital levels
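A finite-time singularity can be illustrated with the textbook scalar case of super-linear growth. The equation below is a generic example, not the paper's coupled system for A, K_a, and K_f; the symbols x, a, and ε are introduced here purely for illustration.

```latex
\frac{dx}{dt} = a\,x^{1+\varepsilon}, \qquad x(0) = x_0 > 0,\quad a > 0,\quad \varepsilon > 0
\;\Longrightarrow\;
x(t) = \frac{x_0}{\left(1 - \varepsilon\, a\, x_0^{\varepsilon}\, t\right)^{1/\varepsilon}},
\qquad
t^{*} = \frac{1}{\varepsilon\, a\, x_0^{\varepsilon}}
```

The solution follows from substituting u = x^{-ε}, which gives du/dt = -εa. The denominator reaches zero at t*, so x diverges in finite time, whereas the linear case ε = 0 grows exponentially and never blows up.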
Users maintain a moderate level of trust in AI even when their decisions diverge from those of AI.
Reported descriptive/analytic finding from the experiment with 59 pre-service teachers indicating measured trust remained at a moderate level in inconsistent decision conditions.
high positive Shaping Human-AI Collaboration in Education: Effects of AI-A... trust in AI under decision divergence
The proportion of consistent decisions significantly moderates the impact of AI-assisted decision-making paradigms on users' confidence levels.
Moderation analysis reported in the study (N=59); authors indicate that proportion of consistent human-AI decisions significantly moderates the effect of AI-assisted decision-making paradigm on confidence.
high positive Shaping Human-AI Collaboration in Education: Effects of AI-A... users' confidence (moderation effect)
Consistency between human and AI decisions significantly enhances task performance.
Within-subject consistency manipulation in the experimental sample of 59 pre-service teachers; authors report a significant positive association between the proportion of consistent decisions and measured task performance.
Consistency between human and AI decisions significantly enhances users' confidence.
Within-subject manipulation of human-AI consistency in the study (N=59); in the authors' models, consistency has a significant positive effect on users' confidence.
Consistency between human and AI decisions significantly enhances users' trust in AI.
Within-subject manipulation of human-AI consistency in the experiment with 59 pre-service teachers; in the authors' models, consistency has a significant positive effect on measured trust in AI.
When human-AI decision consistency is taken into account, AI-assisted decision-making paradigms influence task performance indirectly through a sequential psychological pathway involving users’ confidence and their trust in the AI.
Structural equation modeling on the same experimental sample (N=59) found a significant indirect (mediated) pathway from AI-assisted paradigms → users' confidence → trust in AI → task performance; moderation by human-AI consistency was taken into account.
high positive Shaping Human-AI Collaboration in Education: Effects of AI-A... task performance (mediated effect)
Post-hoc SHAP attribution reveals that complaint recurrence and neighborhood-level statistics are stronger predictors of actionable violations than raw complaint volume.
Empirical claim based on post-hoc SHAP feature-attribution analysis applied to the paper's models; the excerpt reports a relative feature importance finding but provides no numeric effect sizes or sample counts.
high positive Scaling the Queue: Reinforcement Learning for Equitable Call... predictive importance for actionable violations (feature importance)
We formalize each domain as a Markov Decision Process (MDP) in which equitable classification coverage is a first-class reward objective.
Methodological specification in the paper asserting each operational domain was modeled as an MDP with equity-aware reward structure. No further empirical details in the excerpt.
high positive Scaling the Queue: Reinforcement Learning for Equitable Call... equitable classification coverage (as a modeled reward)
The proposed technique is designed to maximize throughput, minimize misclassification cost, and actively narrow historical equity gaps in service delivery.
Stated design objectives of the RL approach in the paper. No quantified outcomes or evaluation reported in the provided text.
high positive Scaling the Queue: Reinforcement Learning for Equitable Call... throughput; misclassification cost; historical equity gaps in service delivery
Rather than replacing human classifiers, our agents act as intelligent intake routers that learn to assign incoming complaints to action categories: escalate, batch, defer, inspect now.
Descriptive claim of agent behavior and intended design; asserts agents perform routing decisions into four action categories. No empirical performance numbers provided in the excerpt.
high positive Scaling the Queue: Reinforcement Learning for Equitable Call... complaint routing action assignment
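To make the idea of an equity-aware reward over the four routing actions more concrete, the sketch below shows one way such a reward could be shaped. Only the four action names come from the claim above. The reward form (throughput minus misclassification cost minus a weighted coverage gap), the argument names, and the numbers are assumptions for illustration; the paper's actual MDP state, reward, and weights are not given in the excerpt.

```python
from enum import Enum
from typing import Mapping

class Action(Enum):
    ESCALATE = "escalate"
    BATCH = "batch"
    DEFER = "defer"
    INSPECT_NOW = "inspect now"

def equity_aware_reward(
    handled: int,
    misclassification_cost: float,
    coverage_by_group: Mapping[str, float],
    equity_weight: float = 1.0,
) -> float:
    """Hypothetical reward: throughput, minus error cost, minus a coverage-gap penalty."""
    rates = list(coverage_by_group.values())
    coverage_gap = max(rates) - min(rates)   # 0 when all groups are covered equally
    return handled - misclassification_cost - equity_weight * coverage_gap

# Invented values: two neighborhoods with unequal classification coverage.
r = equity_aware_reward(
    handled=10,
    misclassification_cost=2.0,
    coverage_by_group={"north": 0.9, "south": 0.6},
    equity_weight=5.0,
)
print(round(r, 3))
```

Placing the coverage gap in the reward itself, rather than checking it after training, is what would make equity a first-class objective in this kind of formulation.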
We develop an equity-centered reinforcement learning (RL) framework that augments call classification capacity across six New York City Department of Buildings operational domains (boiler safety, crane and derrick oversight, heat and hot water, housing complaint triage, scaffold safety, and Natural Area District protection).
Methodological development described in the paper; claimed application domain spans six named DOB operational areas. No evaluation metrics or sample sizes provided in the excerpt.
high positive Scaling the Queue: Reinforcement Learning for Equitable Call... call classification capacity / intake routing capability
U.S. lawmakers and agencies have advanced standards, testing, and procurement oversight related to AI as the AGI race tightens.
Reported in the paper as a synthesis of recent policy and agency activity (standards, testing programs, procurement oversight); descriptive summary rather than a quantified empirical analysis (no sample size reported).
high positive Emerging AI Trends advancement of AI-related standards, testing initiatives, and procurement oversi...
So far in 2026, agentic coding automation has advanced, with tools that enable end-to-end planning, coding, and debugging.
Asserted in the paper as an observed trend through 2026, based on examples of tooling and product announcements; presented descriptively without a stated empirical sample or controlled evaluation.
high positive Emerging AI Trends capability of agentic coding automation tools to perform end-to-end planning, co...
Milestones in 2025 also include early regulatory actions.
Reported in the paper's synthesis of 2025 events; based on review of policy developments and announcements rather than a quantitative evaluation (no sample size reported).
high positive Emerging AI Trends early regulatory actions (new rules, guidance, or enforcement steps in 2025)
Milestones in 2025 highlight the broad adoption of multimodal and agentic AI.
Stated in the paper as part of a narrative synthesis of 2025 milestones; presented as an observational summary drawing on literature, industry reports and documented deployments rather than a systematic empirical study (no sample size or statistical analysis reported).
high positive Emerging AI Trends adoption of multimodal and agentic AI