The Commonplace

Evidence (4333 claims)

Adoption: 5539 claims
Productivity: 4793 claims
Governance: 4333 claims
Human-AI Collaboration: 3326 claims
Labor Markets: 2657 claims
Innovation: 2510 claims
Org Design: 2469 claims
Skills & Training: 2017 claims
Inequality: 1378 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 402 112 67 480 1076
Governance & Regulation 402 192 122 62 790
Research Productivity 249 98 34 311 697
Organizational Efficiency 395 95 70 40 603
Technology Adoption Rate 321 126 73 39 564
Firm Productivity 306 39 70 12 432
Output Quality 256 66 25 28 375
AI Safety & Ethics 116 177 44 24 363
Market Structure 107 128 85 14 339
Decision Quality 177 76 38 20 315
Fiscal & Macroeconomic 89 58 33 22 209
Employment Level 77 34 80 9 202
Skill Acquisition 92 33 40 9 174
Innovation Output 120 12 23 12 168
Firm Revenue 98 34 22 154
Consumer Welfare 73 31 37 7 148
Task Allocation 84 16 33 7 140
Inequality Measures 25 77 32 5 139
Regulatory Compliance 54 63 13 3 133
Error Rate 44 51 6 101
Task Completion Time 88 5 4 3 100
Training Effectiveness 58 12 12 16 99
Worker Satisfaction 47 32 11 7 97
Wages & Compensation 53 15 20 5 93
Team Performance 47 12 15 7 82
Automation Exposure 24 22 9 6 62
Job Displacement 6 38 13 57
Hiring & Recruitment 41 4 6 3 54
Developer Productivity 34 4 3 1 42
Social Protection 22 10 6 2 40
Creative Output 16 7 5 1 29
Labor Share of Income 12 5 9 26
Skill Obsolescence 3 20 2 25
Worker Turnover 10 12 3 25
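One way to use the matrix is to compare how negative-leaning different outcome literatures are. A minimal Python sketch (row values are copied from the table above; the dictionary names and structure are my own) computes the negative share for three rows. Note that some listed totals exceed the sum of the four shown directions, so the listed total is used as the denominator rather than re-summing:

```python
# Share of negative findings per outcome, using three rows copied from the
# matrix above. Listed totals can exceed the sum of the four shown directions
# (e.g., 402+192+122+62 = 778 vs a listed 790), so divide by the listed total.
rows = {
    "Governance & Regulation": {"pos": 402, "neg": 192, "mixed": 122, "null": 62, "total": 790},
    "Inequality Measures":     {"pos": 25,  "neg": 77,  "mixed": 32,  "null": 5,  "total": 139},
    "Task Completion Time":    {"pos": 88,  "neg": 5,   "mixed": 4,   "null": 3,  "total": 100},
}
neg_share = {name: r["neg"] / r["total"] for name, r in rows.items()}
for name, share in sorted(neg_share.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.1%}")
```

On these three rows the ordering is stark: over half of Inequality Measures claims are negative, versus one in twenty for Task Completion Time.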
Active filter: Governance
Claim: At the organizational scale, AI adoption is constrained and shaped by compliance requirements, formal policies, and prevailing norms.
Evidence: Participants' accounts in workshops (n=15) noting compliance and policy considerations; thematic analysis classified these as organizational-level constraints.
Confidence: medium · Direction: negative · Source: The Values of Value in AI Adoption: Rethinking Efficiency in... · Outcome: organizational-level constraints on adoption (compliance, policy, norms) and res...
Claim: Creators who systematize high-throughput AI workflows or control distribution channels may capture outsized returns, potentially increasing winner-take-most dynamics on platforms.
Evidence: Theoretical implication extrapolated from observed high-throughput practices and monetization strategies in the 377 videos; not directly measured or quantified in the dataset.
Confidence: medium · Direction: negative · Source: Monetizing Generative AI: YouTubers' Collective Knowledge on... · Outcome: earnings concentration / market concentration effects (suggested, not measured)
Claim: Widespread unverifiable income claims and promotional framing create noisy signals about viable earnings, complicating entrants’ investment decisions and labor market expectations.
Evidence: Analytical inference based on the documented prevalence of unverifiable earnings claims in the 377 videos and theory about market signaling; not quantitatively tested in the paper.
Confidence: medium · Direction: negative · Source: Monetizing Generative AI: YouTubers' Collective Knowledge on... · Outcome: information quality / market signaling affecting entrant decisions (hypothesized...
Claim: GenAI lowers the time and skill cost of producing many types of creative outputs, which can increase content supply and exert downward pressure on wages for routine creative tasks.
Evidence: Inference drawn as an implication from observed practices (e.g., mass production workflows) in the 377 videos and existing literature; not directly measured in this study.
Confidence: medium · Direction: negative · Source: Monetizing Generative AI: YouTubers' Collective Knowledge on... · Outcome: potential change in labor costs, content supply, and wage pressure (not empirica...
Claim: Creators and the community knowledge base document shifting norms around authorship and attribution: GenAI blurs who is considered the creator and complicates labor recognition and rights.
Evidence: Coding captured explicit discussion and contested norms about authorship, attribution, and creator identity across the 377 videos.
Confidence: medium · Direction: negative · Source: Monetizing Generative AI: YouTubers' Collective Knowledge on... · Outcome: frequency and content of discussions about authorship and attribution
Claim: Some creators recommend or describe synthetic engagement practices (e.g., automated posting, synthetic comments/engagement) as tactics to inflate visibility.
Evidence: Thematic coding noted advice or descriptions of engagement-inflating tactics across videos in the 377-video corpus.
Confidence: medium · Direction: negative · Source: Monetizing Generative AI: YouTubers' Collective Knowledge on... · Outcome: presence of recommendations for synthetic or automated engagement tactics
Claim: Creators surface and often employ practices that raise content misappropriation concerns (use of copyrighted or third-party material in synthetic outputs).
Evidence: Instances and discussions captured in the 377-video sample where creators show or recommend synthesizing, transforming, or repurposing third‑party content.
Confidence: medium · Direction: negative · Source: Monetizing Generative AI: YouTubers' Collective Knowledge on... · Outcome: occurrence of recommendations or demonstrations involving third-party/copyrighte...
Claim: Many videos advertise earnings or income claims that are unverifiable within the content, producing noisy market signals.
Evidence: Qualitative observations from coding the 377 videos noting frequent asserted earnings without reproducible evidence or transparent accounting.
Confidence: medium · Direction: negative · Source: Monetizing Generative AI: YouTubers' Collective Knowledge on... · Outcome: presence of unverifiable income/earnings claims in videos
Claim: Observed behavior is best explained by ambiguity aversion over data-leak likelihoods — uncertainty about leak probabilities drives avoidance of personalized AI more than baseline privacy preferences alone.
Evidence: Comparative pattern of results across the Risk and Ambiguity conditions in the randomized experiment (N = 610): no privacy-threat effect when the probability is known (Risk), but a large privacy-threat effect when the probability is ambiguous (Ambiguity), leading the authors to attribute the effects to ambiguity aversion.
Confidence: medium · Direction: negative · Source: The Data-Dollars Tradeoff: Privacy Harms vs. Economic Risk i... · Outcome: Adoption choice differences across information environments (interpreted mechani...
Claim: The ambiguity-driven reduction in adoption occurs both when the privacy-threatening label is applied to sensitive demographic data and when it is applied to anonymized preference data — ambiguity reduces adoption regardless of the data-sensitivity label.
Evidence: Experimental arms varied the data-type/privacy label (sensitive demographic data vs anonymized preference data) within the 2×3 design (N = 610). The paper reports that the negative effect of ambiguity on adoption was observed across these different data-type labels.
Confidence: medium · Direction: negative · Source: The Data-Dollars Tradeoff: Privacy Harms vs. Economic Risk i... · Outcome: Adoption choice: proportion choosing AI-personalized basket by data-type/privacy...
Claim: Platform-mediated visibility measures used in policy assessments, business analytics, and research (e.g., estimating market share, referral importance, or favoritism) are at risk of misestimation if measurement stochasticity is not incorporated.
Evidence: Empirical demonstration that citation shares and domain ranks vary across repeated samples and that many apparent differences disappear once uncertainty is quantified; argument linking visibility stochasticity to downstream inference and decision risks.
Confidence: medium · Direction: negative · Source: Quantifying Uncertainty in AI Visibility: A Statistical Fram... · Outcome: accuracy of downstream inferences (market share, referral importance, favoritism...
Claim: The heavy-tailed nature of citation distributions implies long tails and high variance, meaning achieving tight uncertainty bounds can require substantially more sampling than would be expected under thin-tailed assumptions.
Evidence: Observed power-law / heavy-tailed citation-count distributions from repeated-sample data; theoretical implication and empirical guidance from variance estimates and pilot-sample analyses described in the paper.
Confidence: medium · Direction: negative · Source: Quantifying Uncertainty in AI Visibility: A Statistical Fram... · Outcome: required sample size (number of repeated queries) to achieve target confidence-i...
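As a numerical illustration of that sampling cost (my own back-of-envelope sketch, not a calculation from the paper): under the standard CLT sample-size approximation, a heavy-tailed Pareto distribution with tail index alpha = 2.1 needs several times more repeated queries than a thin-tailed exponential with the same mean to reach the same 95% confidence-interval half-width.

```python
import math

def pareto_mean(alpha, xm=1.0):
    # mean of a Pareto(alpha, xm); finite only for alpha > 1
    return alpha * xm / (alpha - 1)

def pareto_sd(alpha, xm=1.0):
    # standard deviation of a Pareto(alpha, xm); finite only for alpha > 2
    return math.sqrt(alpha * xm**2 / ((alpha - 1) ** 2 * (alpha - 2)))

def n_required(sd, half_width, z=1.96):
    # CLT approximation: samples needed for a 95% CI of the given half-width
    return math.ceil((z * sd / half_width) ** 2)

alpha = 2.1                 # heavy tail, close to the infinite-variance regime
mu = pareto_mean(alpha)
h = 0.1 * mu                # target half-width: 10% of the mean
n_heavy = n_required(pareto_sd(alpha), h)
n_thin = n_required(mu, h)  # an exponential with the same mean has sd == mean
print(n_heavy, n_thin)
```

As alpha approaches 2, the Pareto variance diverges and the required number of repeated queries grows without bound, which is exactly the regime the claim warns about.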
Claim: Numerical simulations using calibrated parameter sets produce phase diagrams and time-paths that show when gradual adjustment transitions into explosive demand collapse and financial stress under different combinations of capability growth, diffusion speed, and reinstatement rate.
Evidence: Calibrated numerical simulation experiments described in the methods and results sections, using FRED, BLS, and occupational AI-exposure inputs and varying key model parameters.
Confidence: medium · Direction: negative · Source: Abundant Intelligence and Deficient Demand: A Macro-Financia... · Outcome: simulated time-paths of labor income, consumption, AI adoption, intermediary mar...
Claim: Because consumption is concentrated and top incomes have high AI exposure, shocks to top-income labor/income disproportionately affect aggregate consumption and thereby threaten private credit and mortgage markets — the paper maps plausible exposures to roughly $2.5 trillion of global private credit and about $13 trillion of mortgages.
Evidence: Calibration exercise linking household-level demand shocks (based on concentration and AI-exposure mapping) to aggregate credit and mortgage aggregates; reported dollar-amount mappings in the paper's scenarios.
Confidence: medium · Direction: negative · Source: Abundant Intelligence and Deficient Demand: A Macro-Financia... · Outcome: aggregate consumption loss and exposed credit/mortgage balances (USD trillions)
Claim: Top-quintile households are also the cohort with the highest measured AI exposure (i.e., incomes/occupations most exposed to AI substitution), increasing the concentration of AI-driven demand risk.
Evidence: Mapping occupation-level AI-exposure indices to household income quantiles using BLS occupation employment and wage data; used in calibration and scenario analysis.
Confidence: medium · Direction: negative · Source: Abundant Intelligence and Deficient Demand: A Macro-Financia... · Outcome: AI exposure by income quantile (top quintile exposure)
Claim: Intermediation collapse: AI agents reduce information frictions and automate advice/coordination tasks, compressing intermediary margins toward logistics/execution costs and repricing business models across SaaS, payments, consulting, insurance, and financial advisory, with knock-on effects for firm valuations and the collateral values that underpin credit markets.
Evidence: Modeling of intermediary margins and information rents within the macro-financial framework; calibrated scenarios and sectoral discussion mapping margin compression to valuation and collateral effects.
Confidence: medium · Direction: negative · Source: Abundant Intelligence and Deficient Demand: A Macro-Financia... · Outcome: intermediary markups/margins, firm valuations, collateral values, and credit-mar...
Claim: Ghost GDP: AI output that replaces labor-intensive output can create a wedge between measured GDP (which may rise) and consumption-relevant income (which can fall), because a declining labor share reduces monetary velocity absent proportionate transfers — producing hidden demand shortfalls.
Evidence: Formalization in the paper linking labor share to monetary velocity and thus to consumption-relevant income; calibration using FRED macro time series and monetary-aggregate/velocity proxies.
Confidence: medium · Direction: negative · Source: Abundant Intelligence and Deficient Demand: A Macro-Financia... · Outcome: monetary velocity and consumption-relevant income (consumption) versus headline ...
Claim: When firms rationally substitute AI for labor, aggregate labor income can fall and lower demand, which accelerates further AI substitution — a 'displacement spiral' whose net feedback is either self-limiting (convergent) or explosive (runaway adoption plus demand collapse) depending on the AI capability growth rate, diffusion speed across firms/sectors, and the reinstatement rate (the rate at which new paid human roles or demand reappear).
Evidence: Formal model derivations that identify key parameters and inequalities separating convergent vs explosive regimes; calibrated simulations that vary capability growth, diffusivity, and reinstatement elasticity to produce different phase outcomes.
Confidence: medium · Direction: negative · Source: Abundant Intelligence and Deficient Demand: A Macro-Financia... · Outcome: aggregate labor income; AI adoption rate; regime outcome (convergent vs explosiv...
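The spiral's two regimes can be caricatured with a toy discrete-time map (an illustrative sketch only; the function, parameters, and functional forms here are my own, not the paper's model): adoption a diffuses logistically at a speed set by capability growth g times diffusion speed d, while the labor-income share l is eroded by substitution g*a*l and restored at reinstatement rate r.

```python
def displacement_path(g, d, r, steps=200):
    """Toy displacement-spiral map (illustrative only, not the paper's model).

    g: capability growth, d: diffusion speed, r: reinstatement rate.
    Returns the final labor-income share l after `steps` periods."""
    a, l = 0.01, 1.0  # initial AI adoption share and labor-income share
    for _ in range(steps):
        a = min(1.0, a + d * g * a * (1 - a))      # logistic diffusion of adoption
        l = max(0.0, l - g * a * l + r * (1 - l))  # substitution vs reinstatement
    return l

# A high reinstatement rate yields a convergent regime; a low one lets the
# labor-income share collapse toward zero, the 'explosive' qualitative outcome.
convergent = displacement_path(g=0.05, d=1.0, r=0.05)
collapsed = displacement_path(g=0.05, d=1.0, r=0.001)
```

Even in this caricature, the qualitative regime flips on a single parameter ratio (reinstatement relative to substitution), mirroring the claim that the reinstatement rate separates convergent from explosive outcomes.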
Claim: Rapid AI adoption can create a macro-financial stress scenario not primarily through productivity collapse or existential risk but via a distribution-and-contract mismatch: AI-generated abundance reduces the need for human cognitive labor while institutions (wage contracts, credit, consumption patterns, financial intermediation) remain anchored to the scarcity of human cognition, producing a self-reinforcing downward spiral in labor income, demand, and intermediary margins that can tip into an explosive crisis unless offset by sufficiently fast reinstatement of human-paid demand or deliberate policy/market responses.
Evidence: Analytical macro-financial model coupling firm-level substitution decisions, aggregate demand mapping, and financial-sector balance-sheet propagation; calibrated numerical simulations using U.S. macro time series (FRED), BLS occupation-level employment and wages, and published occupation-level AI-exposure indices; phase diagrams and scenario time-paths reported in the paper.
Confidence: medium · Direction: negative · Source: Abundant Intelligence and Deficient Demand: A Macro-Financia... · Outcome: macro-financial stress (aggregate labor income, demand, intermediary margins, an...
Claim: Manual qualitative coding does not scale to massive social datasets, and frequency-based topic models suffer from 'semantic thinning' and lack domain awareness.
Evidence: Conceptual statement presented as motivation; based on conventional critiques of hand-coding and bag-of-words topic models rather than new empirical evidence in this paper's summary.
Confidence: medium · Direction: negative · Source: THETA: A Textual Hybrid Embedding-based Topic Analysis Frame... · Outcome: scalability of manual coding; semantic fidelity of frequency-based topic models
Claim: Rapid coherence decay with thread depth suggests collective problem solving or consensus formation among these agents will be shallow and brittle.
Evidence: Embedding-based coherence metrics demonstrating fast decline in similarity with increasing thread depth across the dataset; inferential claim about effects on deliberation and consensus processes.
Confidence: medium · Direction: negative · Source: What Do AI Agents Talk About? Emergent Communication Structu... · Outcome: coherence as a function of thread depth and inferred effect on multi-turn delibe...
Claim: Low emotional alignment and frequent affective redirection indicate human emotional contagion models may not apply to AI-agent interaction, which could produce unstable or counterintuitive coordination dynamics.
Evidence: Emotion-classification results showing 32.7% mean self-alignment and a 33% fear→joy response rate; theoretical interpretation comparing these patterns to human emotional contagion expectations.
Confidence: medium · Direction: negative · Source: What Do AI Agents Talk About? Emergent Communication Structu... · Outcome: emotional self-alignment and emotion transition rates; implication for coordinat...
Claim: Ritualized signaling could create apparent activity (volume, buzz) without substantive informational content, opening avenues for manipulation or mispriced assets.
Evidence: Observed high rates of patterned/formulaic replies and concentrated non-informational activity patterns in Moltbook; inferential reasoning about how signal amplification without content could affect market perception and asset pricing.
Confidence: medium · Direction: negative · Source: What Do AI Agents Talk About? Emergent Communication Structu... · Outcome: volume of formulaic/ritualized activity and potential effect on perceived market...
Claim: The high prevalence of formulaic comments (over 56%) implies large volumes of low-information signaling that can degrade the signal-to-noise ratio in information environments, harming price discovery and liquidity forecasting.
Evidence: Empirical observation of >56% formulaic comments via lexical-pattern analysis, combined with theoretical inference about information quality and market microstructure (an argument linking high low-information reply volume to degraded signal-to-noise).
Confidence: medium · Direction: negative · Source: What Do AI Agents Talk About? Emergent Communication Structu... · Outcome: percentage of formulaic replies and inferred effect on information quality metri...
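A lexical-pattern classifier of the kind the evidence line describes can be sketched in a few lines; the patterns below are hypothetical stand-ins, since the paper's actual pattern inventory is not given in this summary.

```python
import re

# Hypothetical templates for low-information, formulaic replies
# (illustrative; not the study's actual pattern set).
FORMULAIC_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"^great (post|point)\b",
        r"^thanks for sharing\b",
        r"^(totally|completely) agree\b",
        r"^this\.?$",
    )
]

def is_formulaic(reply: str) -> bool:
    """True if the reply matches any low-information template."""
    text = reply.strip()
    return any(pat.search(text) for pat in FORMULAIC_PATTERNS)

replies = ["Great post!", "Totally agree.", "The fee schedule changes in Q3.", "this"]
formulaic_share = sum(map(is_formulaic, replies)) / len(replies)
```

A real replication would need the study's pattern inventory and corpus; the point of the sketch is only that the "percentage of formulaic replies" metric reduces to a template-match rate.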
Claim: These methodological adaptations reduce but do not eliminate validity threats; they often increase complexity and cost while leaving unresolved issues of generalizability and time-dependence.
Evidence: Practitioner accounts (n=16) describing limits/tradeoffs of adaptations; authors' synthesis concluding residual threats remain despite adaptations.
Confidence: medium · Direction: negative · Source: RCTs & Human Uplift Studies: Methodological Challenges and P... · Outcome: effectiveness and tradeoffs of mitigation strategies for validity threats
Claim: External validity is limited: results from a given trial may not generalize across model versions, populations, tasks, or to temporally distant deployments.
Evidence: Interview-derived themes (16 practitioners) and the authors' analytic mapping to external validity concerns; supported by examples of model/version dependence discussed in interviews.
Confidence: medium · Direction: negative · Source: RCTs & Human Uplift Studies: Methodological Challenges and P... · Outcome: generalizability/external validity of trial results across versions, populations...
Claim: Construct validity is threatened because commonly used outcome measures can misrepresent the constructs of interest when AI changes task structure or human strategies.
Evidence: Practitioners' reports in semi-structured interviews (n=16) and the authors' synthesis illustrating cases where metrics no longer capture intended constructs after AI introduction.
Confidence: medium · Direction: negative · Source: RCTs & Human Uplift Studies: Methodological Challenges and P... · Outcome: construct validity of outcome measures (accuracy of metrics in capturing intende...
Claim: Common internal validity threats in uplift studies of frontier AI include violations of treatment fidelity and SUTVA (e.g., contamination, time-varying treatments).
Evidence: The paper's validity-consequences section, based on thematic analysis of 16 interviews and mapping practitioner-reported problems to internal validity constructs.
Confidence: medium · Direction: negative · Source: RCTs & Human Uplift Studies: Methodological Challenges and P... · Outcome: treatment fidelity and SUTVA adherence in RCTs measuring uplift
Claim: Porous real-world settings cause spillovers and contamination across experimental arms, violating SUTVA and threatening internal validity.
Evidence: Multiple practitioners (n=16) reported examples of spillovers and contamination during deployment-like studies; thematic analysis mapped these to SUTVA/treatment-fidelity concerns.
Confidence: medium · Direction: negative · Source: RCTs & Human Uplift Studies: Methodological Challenges and P... · Outcome: internal validity (SUTVA, treatment contamination) of uplift trials
Claim: Shifting baselines (changes in tools, protocols, or knowledge during and across studies) complicate defining an appropriate control or status quo.
Evidence: Interview data (16 practitioners) and thematic analysis identifying shifting baselines as a recurring challenge reported by participants.
Confidence: medium · Direction: negative · Source: RCTs & Human Uplift Studies: Methodological Challenges and P... · Outcome: construct validity of the control/status-quo definition in uplift studies
Claim: Rapidly evolving models (nonstationarity) make any single trial a moving target, undermining the temporal stability of measured uplift.
Evidence: Practitioner reports from semi-structured interviews (n=16) describing model updates and performance changes during/after trials; thematic coding indicating nonstationarity as a common concern.
Confidence: medium · Direction: negative · Source: RCTs & Human Uplift Studies: Methodological Challenges and P... · Outcome: temporal stability/generalizability of measured uplift across model versions
Claim: Properties of frontier AI — rapid model evolution, shifting baselines, heterogeneous and changing users, and porous real-world settings — regularly strain the internal, construct, and external validity of human uplift studies.
Evidence: Recurring themes identified via qualitative analysis of 16 practitioner interviews; mapped to internal/construct/external validity dimensions in the paper's results.
Confidence: medium · Direction: negative · Source: RCTs & Human Uplift Studies: Methodological Challenges and P... · Outcome: internal, construct, and external validity of human uplift RCTs
Claim: Standardized platforms and benchmarks may create network effects and lock-in around dominant hardware–software stacks; antitrust and standards policy will matter for preserving competition.
Evidence: Workshop participants' market-structure analysis and policy discussion included in the summary recommendations (NSF workshop, Sept 26–27, 2024).
Confidence: medium · Direction: negative · Source: Report for NSF Workshop on Algorithm-Hardware Co-design for ... · Outcome: market concentration metrics, prevalence of platform lock-in, and competition in...
Claim: GDP and productivity metrics that ignore interpretive labor risk understating the inputs to creative and knowledge work; RATs offer a means to measure previously invisible inputs.
Evidence: Policy argument in the measurement/productivity subsection; no empirical re-estimation of GDP/productivity presented.
Confidence: medium · Direction: negative · Source: Chasing RATs: Tracing Reading for and as Creative Activity · Outcome: completeness of productivity/GDP measurement with respect to interpretive labor
Claim: Algorithmic feeds and AI summarizers tend to compress or automate interpretive traces, potentially erasing signals of reasoning, context, and tacit knowledge.
Evidence: Conceptual claim supported by argumentation and examples in the paper; no empirical comparison between RATs and existing summarizers is presented.
Confidence: medium · Direction: negative · Source: Chasing RATs: Tracing Reading for and as Creative Activity · Outcome: loss of interpretive trace signals (reasoning/context/tacit knowledge) when usin...
Claim: Contracts and incentives based on expected performance can reward strategies that deliver high expected returns but poor or unreliable time-average outcomes; incentive design should account for path-dependent risks.
Evidence: Theoretical/incentive argument and examples in the paper linking objective mismatch to adverse incentives; illustrative reasoning rather than empirical contract studies.
Confidence: medium · Direction: negative · Source: Ergodicity in reinforcement learning · Outcome: alignment/misalignment of incentives with reliable long-run (time-average) perfo...
Claim: Economic evaluations and deployment decisions that rely on ensemble expectations can misstate economic value and risk because firms and users experience single time-averaged trajectories; regulators and decision-makers should therefore prefer objectives reflecting single-run guarantees when relevant.
Evidence: Conceptual mapping of the theoretical results to economic decision-making and deployment risk; policy and incentive discussion in the paper (argumentative, not empirical).
Confidence: medium · Direction: negative · Source: Ergodicity in reinforcement learning · Outcome: accuracy of economic valuation and risk assessment when using ensemble expectati...
Claim: The paper's illustrative example shows that a policy maximizing expected reward can produce trajectories that lock into high- or low-reward regimes, so an agent’s long-term realized reward is highly uncertain and not captured by the expectation.
Evidence: Constructed example provided in the paper; demonstration of divergent single-trajectory outcomes under a single policy; no empirical sample size (example-based).
Confidence: medium · Direction: negative · Source: Ergodicity in reinforcement learning · Outcome: distribution (uncertainty) of long-term realized reward across individual trajec...
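The gap between ensemble and time averages is easy to reproduce with the standard multiplicative-growth example (a textbook illustration of non-ergodicity, not the paper's constructed example): wealth is multiplied by 1.5 or 0.6 with equal probability each step, so the expected per-step growth factor exceeds 1 while a typical single trajectory shrinks.

```python
import math

up, down, p = 1.5, 0.6, 0.5  # per-step multipliers and probability of `up`

# Ensemble average: expected one-step growth factor across many parallel runs.
ensemble_growth = p * up + (1 - p) * down

# Time average: the one-step growth factor a single long trajectory realizes,
# i.e. the exponential of the expected log-growth.
time_avg_growth = math.exp(p * math.log(up) + (1 - p) * math.log(down))

print(ensemble_growth, time_avg_growth)  # 1.05 vs roughly 0.949
```

Maximizing the first quantity while the second sits below 1 is exactly the mismatch the claim describes: the expectation grows, yet almost every individual trajectory decays.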
Claim: In contexts analogous to AI markets, a firm at a network/geographic disadvantage would need exponentially greater scale (users/data/compute) to match the probability of early discovery achieved by a better-positioned rival.
Evidence: Interpretation/translation of the model's analytic scaling result to market-relevant quantities; a theoretical implication rather than an empirically tested claim.
Confidence: medium · Direction: negative · Source: Macroscopic Dominance from Microscopic Extremes: Symmetry Br... · Outcome: required scale (users, data, compute) to match probability of early discovery fo...
Claim: MLOps and governance provisions shift costs from one-off implementation to ongoing maintenance, implying recurring costs that should be captured in economic evaluations.
Evidence: Analytical/economic argument presented in the paper as an implication of including an MLOps layer (conceptual; no empirical cost accounting provided).
Confidence: medium · Direction: negative · Source: ALGORITHM FOR IMPLEMENTING AI IN THE MANAGEMENT LOOP OF SMES... · Outcome: cost structure (recurring maintenance costs vs one-off implementation costs)
Claim: Differential adoption across firms (due to modular, scalable designs and data advantages) may create winner‑takes‑most effects and increase market concentration, benefiting early adopters with rich data/integration capabilities.
Evidence: Market-structure claim supported by economic reasoning about scale and data advantages; no cross-firm empirical adoption study or market-concentration time‑series is provided.
Confidence: medium · Direction: negative · Source: Next-Generation Financial Analytics Frameworks for AI-Enable... · Outcome: market concentration metrics (e.g., HHI), firm market shares, adoption timing di...
Claim: Initial investment, integration, and ongoing maintenance/compliance costs can be substantial and affect short-term ROI.
Evidence: Interviewed administrators and implementation reports citing upfront and recurring costs (integration, model maintenance, compliance); quantitative budget figures were not standardized across sites in the paper.
Confidence: medium · Direction: negative · Source: The Role of Artificial Intelligence in Healthcare Complaint ... · Outcome: implementation and maintenance costs; short-term return on investment (ROI)
Claim: There is a risk of deskilling or reduced empathy if human roles are overly automated.
Evidence: Thematic analysis of staff interviews and surveys reporting concerns about loss of practice, reduced patient contact, and potential diminishment of empathetic skills; no longitudinal measures of skill loss presented.
Confidence: medium · Direction: negative · Source: The Role of Artificial Intelligence in Healthcare Complaint ... · Outcome: staff-reported empathy/skill levels and qualitative indicators of deskilling
Claim: Technical and organizational integration with legacy hospital IT systems is nontrivial.
Evidence: Implementation reports and interviews describing integration work, time, and resource needs; descriptive accounts of technical and organizational barriers (no universal timelines/costs reported).
Confidence: medium · Direction: negative · Source: The Role of Artificial Intelligence in Healthcare Complaint ... · Outcome: integration difficulty/time/cost (implementation burden)
Claim: Algorithmic bias in NLP models can misclassify complaints from underrepresented groups.
Evidence: Observations from system classification error analyses (disparities reported by demographic group) and corroborating qualitative concerns from staff and administrators; specific subgroup sample sizes and effect magnitudes not provided.
Confidence: medium · Direction: negative · Source: The Role of Artificial Intelligence in Healthcare Complaint ... · Outcome: differential misclassification rates by demographic group (bias in NLP classific...
Claim: Data privacy and security risks arise from centralizing complaint text and metadata.
Evidence: Stakeholder interviews, thematic coding of concerns, and risk-assessment commentary based on centralized logs and metadata aggregation; no measured breach incidents reported here.
Confidence: medium · Direction: negative · Source: The Role of Artificial Intelligence in Healthcare Complaint ... · Outcome: privacy/security risk (qualitative risk indicators; potential exposure of compla...
Claim: Organizations will incur additional governance and procurement costs (diversity audits, recalibration of reward models, multi-model infrastructures) to mitigate homogenization, shifting some of the economic benefits of AI toward governance spending.
Evidence: Cost implication argued from the need for auditing and multi-model procurement described in the recommendations; not supported by quantified cost analyses in the paper.
Confidence: medium · Direction: negative · Source: The Artificial Hivemind: Rethinking Work Design and Leadersh... · Outcome: governance and procurement costs associated with LLM deployment
Claim: Inter-model convergence undermines product differentiation across AI providers and could accelerate commoditization of base LLM outputs.
Evidence: Market-structure inference built on the empirical finding of high cross-model output similarity across 70+ models and theoretical discussion of vendor differentiation; no market-level price or adoption time-series analyzed in the paper.
Confidence: medium · Direction: negative · Source: The Artificial Hivemind: Rethinking Work Design and Leadersh... · Outcome: vendor product differentiation / commoditization of base outputs
Claim: Homogenized AI outputs reduce the value of AI as a source of varied cognitive complements to human labor, potentially lowering productivity gains from human–AI collaboration in tasks requiring creativity and exploration.
Evidence: Economic argument drawing on measured decreases in model output diversity and the theoretical literature on complementarities between diverse AI outputs and human creativity; no directly measured productivity changes reported in field settings within the paper.
Confidence: medium · Direction: negative · Source: The Artificial Hivemind: Rethinking Work Design and Leadersh... · Outcome: productivity gains from human–AI collaboration (theoretical implication inferred...
Claim: Reward-model and evaluation miscalibration can cause organizations to prefer models that maximize apparent evaluation scores at the expense of useful stylistic or cognitive diversity.
Evidence: Comparative analyses between automated evaluation/reward-model rankings and human preference/diversity assessments reported in the paper; examples where high-scoring models produced more consensus-style outputs.
Confidence: medium · Direction: negative · Source: The Artificial Hivemind: Rethinking Work Design and Leadersh... · Outcome: model selection bias driven by automated evaluation scores; reduction in diversi...