The Commonplace

Evidence (4049 claims)

Adoption (5126 claims)
Productivity (4409 claims)
Governance (4049 claims)
Human-AI Collaboration (2954 claims)
Labor Markets (2432 claims)
Org Design (2273 claims)
Innovation (2215 claims)
Skills & Training (1902 claims)
Inequality (1286 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 369 105 58 432 972
Governance & Regulation 365 171 113 54 713
Research Productivity 229 95 33 294 655
Organizational Efficiency 354 82 58 34 531
Technology Adoption Rate 277 115 63 27 486
Firm Productivity 273 33 68 10 389
AI Safety & Ethics 112 177 43 24 358
Output Quality 228 61 23 25 337
Market Structure 105 118 81 14 323
Decision Quality 154 68 33 17 275
Employment Level 68 32 74 8 184
Fiscal & Macroeconomic 74 52 32 21 183
Skill Acquisition 85 31 38 9 163
Firm Revenue 96 30 22 148
Innovation Output 100 11 20 11 143
Consumer Welfare 66 29 35 7 137
Regulatory Compliance 51 61 13 3 128
Inequality Measures 24 66 31 4 125
Task Allocation 64 6 28 6 104
Error Rate 42 47 6 95
Training Effectiveness 55 12 10 16 93
Worker Satisfaction 42 32 11 6 91
Task Completion Time 71 5 3 1 80
Wages & Compensation 38 13 19 4 74
Team Performance 41 8 15 7 72
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 17 15 9 5 46
Job Displacement 5 28 12 45
Social Protection 18 8 6 1 33
Developer Productivity 25 1 2 1 29
Worker Turnover 10 12 3 25
Creative Output 15 5 3 1 24
Skill Obsolescence 3 18 2 23
Labor Share of Income 7 4 9 20
Filter: Governance
There are few integrated frameworks (bridging ethics and technical controls) in the current AI governance landscape.
Result of the literature review and cluster analysis showing limited coverage of frameworks that integrate ethical principles with auditable technical controls.
high negative AI Governance Risk Tiering for Sustainable Digital Infrastru... prevalence of integrated governance frameworks
Findings reveal a fragmented landscape dominated by ethics/privacy-centric and compliance/risk-focused approaches.
Synthesis of the reviewed literature and results of PCA/k-means clustering indicate thematic dominance of ethics/privacy and compliance/risk orientations across frameworks.
high negative AI Governance Risk Tiering for Sustainable Digital Infrastru... dominant thematic focus of governance frameworks
These findings uncover critical threats to judicial integrity and public trust and underscore the urgent need for robust safeguards against non-legal influences in AI legal systems.
Interpretation/conclusion drawn from the empirical results (observed deviations, sentiment amplification, and subgroup vulnerabilities).
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... potential impact on judicial integrity and public trust (qualitative/inferential...
These safety risks are compounded for emotionally charged topics.
Subgroup analyses where emotionally charged case topics showed larger deviations and stronger effects from injected sentiment.
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... change in deviation/amplification of model outputs for emotionally charged topic...
These safety risks are compounded (stronger) for low-skilled occupational categories.
Subgroup analyses reported in the paper showing larger model deviations and/or greater sentiment amplification effects for cases involving low-skilled occupations.
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... interaction effect: deviation/amplification magnitude by occupational skill leve...
The sentiment-induced divergences lead to unstable and often inflated compensation predictions by the models.
Analysis of model-predicted compensation amounts under sentiment perturbations showing increased variability and upward bias compared to CJOL amounts.
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... predicted compensation amounts (inflation and instability) from LLMs versus CJOL...
Public opinion (social media sentiment) substantially amplifies deviations between LLM outputs and real rulings.
Stress-test experiments in which injected social media sentiment increased the divergence of model outputs from CJOL judgments across the sample.
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... change in deviation between LLM outputs and CJOL rulings when social media senti...
Models exhibit inherent deviations from real rulings.
Empirical comparison of LLM outputs to CJOL judgments showing systematic differences (based on the paper's reported comparisons across the dataset).
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... magnitude and frequency of deviations between LLM outputs and actual court judgm...
The article argues that the idea of a “Pax Silica” is fragile.
Conclusion drawn from the paper's theoretical framework and comparative analysis; presented as an assessment rather than empirical measurement.
high negative The Logistics of Hegemony: Semiconductor Chokepoints, Global... stability/fragility of a proposed techno-hegemonic order ('Pax Silica')
Contemporary struggles over semiconductor supply chains represent not a new hegemonic order but a logistical adaptation of Pax Americana.
Stated thesis supported by comparative/historical analysis and theoretical argumentation (comparative analysis of historical Pax orders and U.S. techno-security architecture); no quantitative sample size reported in abstract.
high negative The Logistics of Hegemony: Semiconductor Chokepoints, Global... characterization of geopolitical order governing semiconductor supply chains
In the short term, big data may inhibit welfare growth.
Theoretical comparative-static/dynamic analysis reported in the model showing that initial or short-run effects of increased data sharing can reduce welfare growth (no empirical/sample data).
high negative Study on the impact of big data sharing on individuals’ welf... short-term growth of individuals' welfare
Traditional paradigms, specifically the resource-based view and the dynamic capabilities framework, operate under closed-system, first-order cybernetic assumptions that fail to capture the dissipative nature of algorithmic agents.
Conceptual critique presented in the paper's theoretical argumentation (literature critique and re-framing); no empirical sample reported.
high negative Governing Human–AI Co-Evolution: Intelligentization Capabili... explanatory_power_of_management_theory (ability to account for AI-driven organiz...
This result directly contradicts classical scaling laws, which assume monotonic capability gains with model scale.
Comparative theoretical claim in the paper contrasting the Institutional Scaling Law with classical empirical/theoretical scaling laws in ML literature.
high negative Punctuated Equilibria in Artificial Intelligence: The Instit... relationship between model scale and deployment-relevant fitness/capability
The Institutional Scaling Law proves that institutional fitness is non-monotonic in model scale.
Formal mathematical derivation/proof presented in the paper (the 'Institutional Scaling Law').
high negative Punctuated Equilibria in Artificial Intelligence: The Instit... institutional fitness as a function of model scale
AI development proceeds not through smooth advancement but through extended periods of stasis interrupted by rapid phase transitions that reorganize the competitive landscape (punctuated equilibrium pattern).
Argument based on punctuated equilibrium theory from evolutionary biology and historical analysis presented in the paper identifying discrete transitions in AI history; the paper cites and classifies eras/events as evidence.
high negative Punctuated Equilibria in Artificial Intelligence: The Instit... pattern of AI development (stasis vs. phase transitions)
The interaction of artificial intelligence and environmental regulation produces a '1 + 1 < 2' crowding-out effect (their combined effect is less than the sum of individual effects).
Spatial Durbin model with interaction term between AI and environmental regulation as summarized in the abstract; reported as a crowding-out interaction.
high negative How artificial intelligence and environmental regulation inf... UCEE index (interaction effect of AI and environmental regulation)
Environmental regulation significantly inhibits local UCEE.
Spatial Durbin model results reported in the abstract indicating a significant negative local coefficient for environmental regulation.
high negative How artificial intelligence and environmental regulation inf... UCEE index (local/provincial effect of environmental regulation)
Artificial intelligence significantly inhibits local UCEE.
Spatial Durbin model results reported in the abstract indicating a significant negative local coefficient for artificial intelligence.
high negative How artificial intelligence and environmental regulation inf... UCEE index (local/provincial effect of AI)
Rather than broad job losses, evidence points to a reallocation at the entry level: AI automates tasks typically assigned to junior staff, shifting the nature of entry-level roles.
Synthesis of firm- and task-level empirical studies reported in the brief documenting automation of routine/junior tasks and changes in job-task composition; specific sample sizes vary by cited study and are not provided in the brief.
high negative AI, Productivity, and Labor Markets: A Review of the Empiric... automation of entry-level/junior tasks and changes to entry-level job content
Algorithmic credit systems are linked to higher levels of financial stress.
Study reports a positive association between algorithmic credit system use and reported financial stress from regression analysis on the 400-user cross-sectional dataset.
Confirmation bias is a weakness in LLM-based code review, with implications for how AI-assisted development tools are deployed.
Synthesis of findings from Study 1 (framing-induced detection failures) and Study 2 (practical exploitability and partial mitigation via debiasing).
high negative Measuring and Exploiting Confirmation Bias in LLM-Assisted S... reliability/security of LLM-based code review
Adversarial framing succeeds in 88% of cases against Claude Code (autonomous agent) in real project configurations where adversaries can iteratively refine their framing to increase attack success.
Study 2 experiments in real project configurations with iterative adversary refinement evaluated against Claude Code (autonomous agent); reported 88% success rate.
high negative Measuring and Exploiting Confirmation Bias in LLM-Assisted S... attack success rate (vulnerability reintroduction accepted/not detected)
Adversarial pull request framing (e.g., labeled as security improvements or urgent functionality fixes) succeeds in reintroducing known vulnerabilities in 35% of cases against GitHub Copilot under one-shot attacks.
Study 2 experiments simulating adversarial pull requests evaluated against GitHub Copilot (interactive assistant); reported success rate 35% for one-shot attacks.
high negative Measuring and Exploiting Confirmation Bias in LLM-Assisted S... attack success rate (vulnerability reintroduction accepted/not detected)
The framing effect is strongly asymmetric: false negatives increase sharply while false positive rates change little.
Comparison of false negative and false positive rates across framing conditions in Study 1 experiments (250 CVE pairs across models).
high negative Measuring and Exploiting Confirmation Bias in LLM-Assisted S... false negative rate and false positive rate
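The asymmetry described in this claim can be made concrete with the standard confusion-matrix definitions of the two rates. The following is a minimal sketch, assuming those textbook definitions apply; the summary does not state how the paper scores detections, and the function name `error_rates` and all counts below are hypothetical illustration values, not results from the study.

```python
def error_rates(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Return (false negative rate, false positive rate) from confusion counts."""
    positives = tp + fn  # changes that truly contain a vulnerability
    negatives = fp + tn  # changes that are truly safe
    fnr = fn / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return fnr, fpr


# Hypothetical counts for illustration only; not taken from the study.
neutral_fnr, neutral_fpr = error_rates(tp=80, fn=20, fp=10, tn=90)
framed_fnr, framed_fpr = error_rates(tp=40, fn=60, fp=12, tn=88)

print(f"FNR: {neutral_fnr:.2f} -> {framed_fnr:.2f}")
print(f"FPR: {neutral_fpr:.2f} -> {framed_fpr:.2f}")
```

An asymmetric framing effect would show up as a large shift in the first printed pair and only a small shift in the second.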
Framing a change as bug-free reduces vulnerability detection rates by 16-93%.
Result reported from Study 1 controlled experiments across models and framing conditions (250 CVE pairs).
high negative Measuring and Exploiting Confirmation Bias in LLM-Assisted S... vulnerability detection rate
LLM-generated peer reviews place significantly less weight on clarity and significance of the research.
Comparative analysis between LLM-generated reviews and human reviews from the conference dataset; reported as a statistically significant difference but exact statistics and sample size not provided in the excerpt.
high negative How LLMs Distort Our Written Language importance/weight given to clarity and significance in peer review content
Significantly more heavy LLM users reported that the writing was less creative and not in their voice.
Self-reported measures from participants in the human user study comparing heavy LLM users to others; no sample size or exact statistics provided in the excerpt.
high negative How LLMs Distort Our Written Language self-reported creativity and 'in-your-voice' authenticity of writing
In Chicago, the model shows moderate under-detection of Black residents, with a Disparate Impact Ratio (DIR) of 0.22.
Reported DIR value from simulation results on Chicago 2022 data.
high negative Unmasking Algorithmic Bias in Predictive Policing: A GAN-Bas... Disparate Impact Ratio (DIR) indicating under-detection of Black residents
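For readers unfamiliar with the metric, a Disparate Impact Ratio is commonly computed as one group's outcome rate divided by a reference group's rate, so values well below 1 indicate the first group is flagged or detected less often. The sketch below assumes that common definition; the summary does not give the paper's exact formula, and the function name and counts are hypothetical.

```python
def disparate_impact_ratio(group_hits: int, group_total: int,
                           ref_hits: int, ref_total: int) -> float:
    """Ratio of a group's outcome rate to a reference group's outcome rate."""
    if group_total <= 0 or ref_total <= 0:
        raise ValueError("group sizes must be positive")
    ref_rate = ref_hits / ref_total
    if ref_rate == 0:
        raise ValueError("reference rate is zero; ratio is undefined")
    return (group_hits / group_total) / ref_rate


# Hypothetical counts for illustration only; not the Chicago 2022 data.
dir_value = disparate_impact_ratio(group_hits=30, group_total=200,
                                   ref_hits=60, ref_total=100)
print(f"DIR = {dir_value:.2f}")
```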
It is impractical to uniformly apply an alignment method across diverse, independently developed AI models in strategic settings.
Paper assertion / motivating argument (stated as motivation for investigating zero-shot Nash-like behavior); not presented as an empirical finding within the paper.
high negative Reasonably reasoning AI agents can avoid game-theoretic fail... practicality/adoption feasibility of universal alignment methods
The crowding-out effect of AI washing on green innovation is heterogeneous: private enterprises, small and medium-sized enterprises (SMEs), and firms in highly competitive sectors suffer more severe negative impacts.
Subgroup/heterogeneity analysis reported in the paper on the same sample of Chinese A-share listed companies (2006–2024); abstract identifies private firms, SMEs, and firms in highly competitive industries as more affected.
high negative The Spillover Effects of Peer AI Rinsing on Corporate Green ... green innovation (heterogeneous treatment effects across firm types and industri...
The negative relationship between AI washing and green innovation is transmitted through dual channels in both product and capital markets.
Mechanism analysis reported in the paper (presumably mediation or channel analysis) using the same dataset of Chinese A-share firms' annual reports and firm-level market data; abstract states product- and capital-market channels convey the crowding-out effect.
high negative The Spillover Effects of Peer AI Rinsing on Corporate Green ... green innovation (via product-market and capital-market channels)
Corporate AI washing exerts a significant crowding-out effect on green innovation.
Empirical analysis using semantic measures of 'AI washing' derived from large language model (LLM) analysis of annual reports for Chinese A-share listed companies (2006–2024); paper reports statistically significant negative relationship between AI washing and firms' green innovation (details of regression models not provided in abstract).
Exclusion-based cohesion can produce state-contingent illusory precision together with effective input concentration and dynamic lock-in simultaneously—i.e., these phenomena co-occur under the model's parameter regimes.
Analytical model results showing co-occurrence of multiple adverse phenomena (bias that grows in tails, illusory precision, input concentration, lock-in) under the same exclusion mechanisms; derived within the paper's theoretical framework.
high negative Cohesion as Concentration: Exclusion-Driven Fragility in Fin... co-occurrence of multiple adverse outcomes: tail bias, observed disagreement, ef...
When the anchor belief is updated from internally filtered aggregates, the system can exhibit dynamic lock-in: delayed recognition of regime shifts followed by abrupt correction.
Analytical dynamics studied in the model when anchor updates depend on filtered (excluded) aggregates; derivations demonstrate delayed detection and abrupt adjustments. This is a theoretical/dynamical model result, no empirical data.
high negative Cohesion as Concentration: Exclusion-Driven Fragility in Fin... delay in regime recognition and magnitude/timing of corrective update
Exclusion leads to effective concentration of decision inputs: the effective number of independent inputs falls below the nominal participant count.
Model-derived analytic result showing that report shrinkage and discarding reduce effective information contributions, quantified relative to nominal participation in the theoretical framework. No empirical sample.
high negative Cohesion as Concentration: Exclusion-Driven Fragility in Fin... effective number of independent decision inputs (information concentration)
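One way to picture the gap between nominal and effective participation is the inverse-concentration measure N_eff = (sum of weights)^2 / (sum of squared weights). This is an illustrative stand-in, not necessarily the quantity derived in the paper, whose definition is not given in the summary. The weights below are hypothetical non-negative influence weights: 1.0 for a fully counted report, a fraction for a shrunk report, and 0 for a discarded one.

```python
def effective_inputs(weights: list[float]) -> float:
    """Effective number of inputs: (sum w)^2 / sum(w^2) for non-negative weights."""
    total = sum(weights)
    sum_sq = sum(w * w for w in weights)
    if sum_sq == 0:
        raise ValueError("at least one weight must be non-zero")
    return total * total / sum_sq


# Hypothetical weights for ten nominal participants.
full_inclusion = [1.0] * 10
with_exclusion = [1.0, 1.0, 1.0, 0.2, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0]

print(f"nominal participants: {len(with_exclusion)}")
print(f"effective inputs, full inclusion: {effective_inputs(full_inclusion):.2f}")
print(f"effective inputs, with exclusion: {effective_inputs(with_exclusion):.2f}")
```

With equal weights the measure equals the participant count; shrinking or discarding reports pulls it below that count, which is the pattern the claim describes.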
Exclusion-based cohesion induces 'illusory precision': observed disagreement can fall while actual estimation error in tail regimes rises (i.e., lower recorded variance despite higher true error).
Theoretical result derived from the signal-aggregation model showing a regime in which filtered reports reduce observed variance even as tail-regime estimation error increases. No empirical validation provided.
high negative Cohesion as Concentration: Exclusion-Driven Fragility in Fin... observed disagreement (reported variance) versus true estimation error in tail r...
Relative to a full-inclusion benchmark, exclusion-based cohesion produces state-contingent bias that is small in normal regimes but grows sharply under regime displacement (tail events).
Analytical comparisons between the exclusion model and a full-inclusion benchmark within the theoretical model; derivations showing bias as a function of regime and exclusion parameters. The result is from model analysis, not empirical data.
high negative Cohesion as Concentration: Exclusion-Driven Fragility in Fin... estimation bias (especially under regime displacement/tail events)
The establishment of the China–ASEAN Free Trade Area (CAFTA) reduced regional trade policy uncertainty.
Empirical analysis treats CAFTA as an exogenous policy shock and measures a decline in regional trade policy uncertainty using firm‑ and trade‑level data from the China Industrial Enterprise Database and China Customs Database covering 2000–2014; identification via difference‑in‑differences (DID). (Sample sizes not specified in provided summary.)
high negative How regional trade policy uncertainty affects agricultural i... regional trade policy uncertainty (measured at regional/firm level)
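The difference-in-differences logic named in this entry can be sketched in its simplest two-group, two-period form: the change in the treated group minus the change in the control group. This is only a toy version under that assumption; the paper's actual specification (controls, fixed effects, sample) is not described in the summary, and the uncertainty values below are hypothetical.

```python
from statistics import mean


def did_estimate(treated_pre: list[float], treated_post: list[float],
                 control_pre: list[float], control_post: list[float]) -> float:
    """Two-by-two difference-in-differences on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical uncertainty index values for illustration only.
effect = did_estimate(
    treated_pre=[0.50, 0.54], treated_post=[0.30, 0.34],
    control_pre=[0.48, 0.52], control_post=[0.44, 0.48],
)
print(f"DID estimate: {effect:.2f}")
```

A negative estimate in this toy setup means uncertainty fell more in the treated group than in the control group over the same period.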
Securitization of economic dependencies—especially in strategic sectors (semiconductors, telecoms, cloud)—frames partner states as security risks and exposes them to blacklists, de-risking campaigns, and sudden loss of market access.
Process tracing of export controls and blacklisting episodes; chronologies of sanction/policy actions affecting firms and partners; policy documents and public lists (e.g., export-control lists). (Data sources: export-control lists, sanction policy documents, corporate/access denials; sample sizes not specified.)
high negative China-US Trade War and the Challenges for Developing Countri... incidence of blacklisting/sanctions affecting partners, sudden changes in market...
Large-scale AI models have significant energy and resource costs, creating a notable environmental footprint that must be addressed.
Narrative integration of prior empirical studies measuring compute, energy consumption, and embodied emissions of large models (cited literature); the review does not present new quantitative measurements itself.
high negative The Evolution and Societal Impact of Artificial Intelligence... energy consumption, carbon emissions, and resource use associated with large-sca...
As AI is deployed in safety-critical domains, reliability, regulation, and human-oriented system design become essential to avoid harms.
Review of literature on safety-critical systems, human–machine interaction studies, and regulatory policy discussions; the paper reports this as a consensus implication rather than presenting new empirical tests.
high negative The Evolution and Societal Impact of Artificial Intelligence... system reliability/safety and risk of harm in safety-critical deployments
Stronger empirical evidence is needed on how hazard, exposure, and vulnerability interact across space and time to shape aggregated multi-risks.
Evaluation of project activities and case studies identifying gaps in empirical spatio-temporal analyses of interacting risk components; synthesis recommends targeted empirical work.
high negative Reducing risk together: moving towards a more holistic appro... empirical understanding of spatio-temporal interactions among hazard, exposure, ...
The current literature is skewed toward descriptive and engineering work; there is a lack of causal, field‑experimental evidence on NLP interventions' effects on customer behavior and firm profits.
Review coding of study types in the sample (engineering/descriptive vs. experimental/causal) showing few field experiments or causal designs.
high negative Natural language processing in bank marketing: a systematic ... presence vs. absence of causal/experimental studies measuring effects on custome...
Important gaps include customer acquisition, personalization at scale, use of external text sources (social media, news, reviews), operational process improvement, and cross‑channel integration.
Gap detection via low‑density regions in the UMAP thematic map of sentence‑transformer embeddings and manual review showing low article counts for these topics within the 109‑article sample.
high negative Natural language processing in bank marketing: a systematic ... topical coverage by customer journey stage and source type (acquisition, persona...
Existing literature on NLP in marketing is concentrated around customer retention tasks (e.g., churn prediction, complaint handling, relationship management).
Thematic clustering from sentence‑transformer embeddings of article text combined with UMAP visualization, and manual review of article topics and keywords identifying frequent retention‑related themes.
high negative Natural language processing in bank marketing: a systematic ... topical frequency/coverage by customer journey stage (retention)
NLP applications in bank marketing are severely under‑studied.
Descriptive result from the PRISMA review showing only 8/109 articles focused on NLP in bank marketing (≈7%), plus thematic mapping showing sparse coverage in bank‑marketing/NLP intersection.
high negative Natural language processing in bank marketing: a systematic ... proportion and absolute count of studies at the intersection of NLP and bank mar...
AI‑enabled platforms can magnify winner‑takes‑most dynamics in digital services trade, concentrating market power.
Theoretical and empirical literature on network effects and platform markets reviewed in the paper; illustrative examples (no novel empirical aggregation).
high negative Analysis of Digital Services Trade and Export Competitivenes... market concentration / competition in digital services
Current data governance regimes in China can impede cross‑border data flows.
Comparative policy analysis and literature documenting data localization and privacy/regulatory regimes that restrict flows (descriptive evidence in the review).
high negative Analysis of Digital Services Trade and Export Competitivenes... volume/feasibility of cross‑border data flows
Institutional barriers—fragmented international rules on data flows and privacy, regulatory divergence including data localization, weak participation in multilateral rule setting, and uneven domestic regulation of platforms—impede digital services trade.
Comparative policy analysis and literature review, supported by policy documents and case examples (qualitative evidence; no original econometric tests).
high negative Analysis of Digital Services Trade and Export Competitivenes... cross‑border digital services trade / export competitiveness
Problem C is the practical difficulty of attributing responsibility and agency across distributed socio-technical systems (robots, algorithms, institutions, humans).
Conceptual diagnosis developed in the paper and exemplified with vignettes from three application domains; defined as an analytic concept rather than empirically measured.
high negative Examining ethical challenges in human–robot interaction usin... ability to attribute responsibility/agency in distributed socio-technical system...