The Commonplace

Evidence (13827 claims)

Adoption: 8454 claims
Productivity: 7544 claims
Governance: 6789 claims
Human-AI Collaboration: 6327 claims
Org Design: 4126 claims
Innovation: 4058 claims
Labor Markets: 3520 claims
Skills & Training: 2924 claims
Inequality: 2057 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 749 195 97 889 1979
Governance & Regulation 815 391 188 121 1539
Organizational Efficiency 771 189 124 83 1177
Technology Adoption Rate 624 233 123 96 1084
Research Productivity 410 121 56 331 929
Output Quality 466 177 59 47 749
Decision Quality 320 174 75 42 618
Firm Productivity 435 55 88 20 604
AI Safety & Ethics 214 276 65 33 593
Market Structure 178 166 122 24 495
Task Allocation 206 64 70 31 376
Skill Acquisition 165 57 60 17 299
Innovation Output 201 27 41 18 288
Employment Level 105 51 107 13 278
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 116 63 42 11 232
Firm Revenue 149 46 26 3 224
Inequality Measures 44 122 49 6 221
Task Completion Time 169 29 8 12 219
Worker Satisfaction 89 61 20 12 182
Error Rate 69 91 10 2 172
Regulatory Compliance 76 68 14 5 163
Training Effectiveness 92 19 13 19 145
Wages & Compensation 77 36 25 6 144
Automation Exposure 51 54 22 12 142
Team Performance 86 17 27 9 140
Developer Productivity 94 17 14 6 132
Job Displacement 12 80 20 1 113
Hiring & Recruitment 51 7 8 3 69
Skill Obsolescence 5 45 6 1 57
Creative Output 31 16 7 2 57
Social Protection 27 16 8 2 53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
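
Each matrix row above is flattened to one line: an outcome name followed by its counts. The sketch below shows one way to parse such a row and compute a positive share. It assumes the last number on a row is the Total and the numbers before it follow the Positive, Negative, Mixed, Null order. The names `parse_matrix_row` and `positive_share` are hypothetical and not part of the dashboard. Rows with fewer than four direction counts get no share, because the flattened text does not say which column is empty.

```python
def parse_matrix_row(line):
    """Split a flattened matrix row into (outcome, direction_counts, total)."""
    tokens = line.split()
    numbers = []
    # Numeric cells sit at the end of the row; the outcome name may contain spaces.
    while tokens and tokens[-1].isdigit():
        numbers.append(int(tokens.pop()))
    numbers.reverse()
    if not tokens or not numbers:
        raise ValueError(f"not a matrix row: {line!r}")
    *counts, total = numbers  # assumption: the final number is the Total column
    return " ".join(tokens), counts, total


def positive_share(counts):
    """Share of positive findings among the four listed directions.

    Returns None when a cell is missing, since the column assignment
    of the remaining numbers is then ambiguous.
    """
    if len(counts) != 4 or sum(counts) == 0:
        return None
    return counts[0] / sum(counts)


outcome, counts, total = parse_matrix_row("Job Displacement 12 80 20 1 113")
print(outcome, counts, total)            # Job Displacement [12, 80, 20, 1] 113
print(round(positive_share(counts), 3))  # 0.106
```

The share uses the four listed counts as its denominator rather than the Total, because the two do not always agree: the Firm Productivity row, for example, sums to 598 against a Total of 604.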
Post-conflict reconstruction relies heavily on private enterprises to bring back employment, rebuild supply networks, and reconnect damaged economies.
Statement grounded in literature cited in the review (paper positions this as a general premise from post-conflict reconstruction literature); no primary data reported.
high positive RegTech-enabled governance of sanctions-safe enterprise ecos... role of private enterprises in employment recovery, supply-network rebuilding, a...
A causal ablation confirms that each of the four mechanical enforcement primitives is individually necessary.
Causal ablation experiments reported by authors in the synthetic banking domain: removing each primitive degrades performance/governance, implying individual necessity. Abstract does not report exact experimental counts or effect sizes.
high positive Mechanical Enforcement for LLM Governance: Evidence of Govern... impact of removing each mechanical primitive on governance/task metrics (necessi...
Mechanical enforcement raises task accuracy from MCC ~0.43 to 0.88.
Reported Matthews correlation coefficient (MCC) for task accuracy under text-only governance (≈0.43) versus mechanical enforcement (≈0.88) in the paper's synthetic experiments; sample size not provided in abstract.
high positive Mechanical Enforcement for LLM Governance: Evidence of Govern... task accuracy (Matthews correlation coefficient)
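
For readers unfamiliar with the metric, the Matthews correlation coefficient for a binary task can be computed from a confusion matrix as in the sketch below. The counts in the usage lines are hypothetical, not taken from the paper, and the paper's own task may not be binary.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC from binary confusion-matrix counts; ranges from -1 to 1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only to illustrate the calculation.
print(matthews_corrcoef(tp=90, fp=10, tn=90, fn=10))  # 0.8
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the classes are imbalanced.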
Mechanical enforcement more than doubles deferral information content.
Comparison of information-content measures for deferrals between text-only governance and mechanical enforcement in the synthetic banking domain experiments; exact numeric basis not given in abstract.
high positive Mechanical Enforcement for LLM Governance: Evidence of Govern... deferral information content (information-theoretic or content metric reported b...
Mechanical enforcement reduces the rate of deferrals that carry no decision-relevant information by 73%.
Head-to-head comparison between text-only governance and a mechanically enforced architecture (four primitives) in the paper's synthetic banking experiments; specific sample size not stated in abstract.
high positive Mechanical Enforcement for LLM Governance: Evidence of Govern... relative reduction in rate of non-informative deferrals
These results challenge the presumed universality of the fairness-accuracy tradeoff and demonstrate that well-designed modeling improvements can advance both fairness and accuracy in large-scale public sector systems.
Synthesis of the three complementary analyses (observational county-level correlations, simulation experiments with added property features, and simulations incorporating Census data) performed on the 26 million-sale dataset covering ~95% of U.S. counties.
high positive Tradeoffs are Domain Dependent: Improving Accuracy and Fairn... co-movement of fairness and accuracy under improved modeling practices
Incorporating publicly available Census data into assessment models - a feasible reform in most counties - would significantly improve both accuracy and fairness relative to status quo assessments.
Simulated reforms adding publicly available Census covariates to assessment models and comparing resulting accuracy and fairness metrics to status-quo assessments across the dataset covering 26 million sales/95% of counties.
high positive Tradeoffs are Domain Dependent: Improving Accuracy and Fairn... assessment accuracy and fairness after inclusion of Census data
When accuracy improves in the simulated assessment models, fairness almost always improves as well.
Analysis of simulated model outcomes showing joint changes in accuracy and fairness metrics across many simulated configurations and counties; reported near-universal co-improvement when accuracy rises.
high positive Tradeoffs are Domain Dependent: Improving Accuracy and Fairn... assessment fairness (distributional error/fairness metrics) conditional on chang...
In simulated assessment models, adding property features improves accuracy in most cases.
Simulation experiments using alternative assessment models that include additional property-level features; comparisons between baseline and feature-augmented simulated models across many counties/cases.
high positive Tradeoffs are Domain Dependent: Improving Accuracy and Fairn... assessment accuracy (model predictive performance)
Assessment accuracy and fairness - measured using domain-relevant metrics - are strongly correlated across counties under status quo practices.
Observational analysis of status-quo assessment outcomes using a dataset of 26 million property sales spanning ~95% of U.S. counties; county-level correlation analysis between domain-relevant accuracy metrics and fairness metrics.
high positive Tradeoffs are Domain Dependent: Improving Accuracy and Fairn... assessment accuracy and assessment fairness (domain-relevant metrics)
The research contributes to the literature on technology adoption in developing economies and offers policymakers and business leaders in sub-Saharan Africa valuable insights.
Paper's stated contribution in the abstract; a general claim about the study's scholarly and policy relevance rather than a quantifiable empirical result.
high positive Estimation of Firm Labour Productivity and Sales Growth from... contribution to literature and policy relevance
Targeted policy interventions — such as upskilling initiatives and supportive regulatory frameworks — are important to harness AI’s benefits while mitigating adverse impacts on workers.
Paper conclusion/recommendation drawn from empirical findings (positive association of AI with productivity and sales, plus observed cross-country variation). This is presented as a policy implication; no empirical evaluation of specific policies is reported in the excerpt.
high positive Estimation of Firm Labour Productivity and Sales Growth from... policy effectiveness in harnessing AI benefits and mitigating worker impacts (re...
AI adoption has a significant positive relationship with firm sales growth in the selected sub-Saharan African countries.
Firm-level dataset from the World Bank Enterprise Surveys covering 2007–2024; empirical analysis using feasible generalized least squares (FGLS), robust OLS, and high-dimensional fixed effects (HDFE) linear regressions. Paper statement: "AI has a significant positive relationship with ... sales growth." Exact sample size and numeric effect not provided in excerpt.
AI adoption has a significant positive relationship with firm labour productivity in the selected sub-Saharan African countries.
Firm-level dataset from the World Bank Enterprise Surveys covering 2007–2024; empirical analysis using feasible generalized least squares (FGLS), robust OLS, and high-dimensional fixed effects (HDFE) linear regressions. Paper statement: "AI has a significant positive relationship with firm labour productivity." Exact firm sample size not reported in the provided excerpt.
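
As a rough illustration of what a fixed-effects regression does, the sketch below estimates a one-way within (demeaned) slope. It is a simplification: the paper's HDFE models absorb several sets of fixed effects at once, and its exact specification is not given in the excerpt. The function name `within_slope`, the variable names, and the toy data are all hypothetical.

```python
from collections import defaultdict


def within_slope(groups, xs, ys):
    """One-way fixed-effects (within) slope of y on x, demeaning by group."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum_x, sum_y, n]
    for g, x, y in zip(groups, xs, ys):
        s = sums[g]
        s[0] += x
        s[1] += y
        s[2] += 1
    num = den = 0.0
    for g, x, y in zip(groups, xs, ys):
        sum_x, sum_y, n = sums[g]
        x_dev = x - sum_x / n  # deviation from the group mean removes the group intercept
        y_dev = y - sum_y / n
        num += x_dev * y_dev
        den += x_dev * x_dev
    if den == 0:
        raise ValueError("x has no within-group variation")
    return num / den


# Toy data (hypothetical): two countries with different baseline productivity.
country = ["A", "A", "A", "B", "B", "B"]
ai_adopted = [0, 1, 1, 0, 0, 1]
log_productivity = [1.0, 1.3, 1.3, 5.0, 5.0, 5.3]
print(round(within_slope(country, ai_adopted, log_productivity), 6))  # 0.3
```

Demeaning within each group keeps country-level differences in baseline productivity from being attributed to AI adoption.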
Policy should prioritize employment‑centered digital strategies that are spatially differentiated and institutionally grounded to mitigate negative labor and development effects.
Normative policy recommendation arising from the paper's theoretical framework and regional field observations (policy prescription; not an empirically estimated intervention in the paper).
high positive Automation, Migration, and Development: Geography of Job Pre... effectiveness of employment-centered, spatially differentiated digital policies
There is a positive spillover effect on AI-ineligible chats: treated workers adapted their multitasking workflow to devote greater attention to these chats.
Experiment-level observations comparing worker behavior on AI-ineligible chats between treatment and control; treated workers reallocated attention/effort (multitasking workflow changes) leading to improved attention on AI-ineligible chats.
high positive Agentic AI and Human-in-the-Loop Interventions: Field Experi... attention/effort devoted to AI-ineligible chats (spillover effect)
Early intervention is essential for sustaining high post-escalation intervention effort.
Temporal analysis of intervention timing within the randomized experiment showing an association between earlier human intervention after escalation and higher subsequent intervention effort.
high positive Agentic AI and Human-in-the-Loop Interventions: Field Experi... post-escalation intervention effort as a function of intervention timing
Human intervention preserves service quality in algorithm-triggered technical escalations (unresolved customer issues beyond the AI's capability).
Experimental subgroup analysis of escalations categorized as algorithm-triggered technical escalations; post-escalation human interventions were observed to maintain service quality in these cases.
high positive Agentic AI and Human-in-the-Loop Interventions: Field Experi... service quality after technical escalations
PRIF yielded an average ROI of 83%.
Reported financial evaluation/ROI estimate following PRIF adoption in the paper (derived from pilot/case study cost-benefit or sample analysis).
high positive Enhancing Forensic Accounting Practice: A Proactive Risk Man... return on investment (ROI %)
PRIF adoption reduced financial misstatements by 47%.
Reported change in financial misstatement incidence after PRIF implementation in the paper's evaluation (case studies/forensic report analysis).
high positive Enhancing Forensic Accounting Practice: A Proactive Risk Man... financial misstatements (reduction %)
PRIF adoption reduced compliance resolution time by 58%.
Reported performance metric after PRIF adoption in pilot/case studies described in the paper.
high positive Enhancing Forensic Accounting Practice: A Proactive Risk Man... compliance resolution time (reduction %)
Client retention was 91% for high SCI versus 54% for low SCI.
Reported retention rates stratified by SCI levels in paper (presumably derived from the sample used for SCI analysis).
The Stakeholder Communication Index (SCI) revealed a strong correlation (r = 0.83) between report quality and client retention.
Statistical analysis reported in paper linking SCI-derived report quality scores to client retention; correlation coefficient r = 0.83 provided.
high positive Enhancing Forensic Accounting Practice: A Proactive Risk Man... correlation between report quality and client retention (r)
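
The reported r is a Pearson correlation coefficient. The sketch below shows how such a value is computed from paired observations. The paper's underlying data are not available in this summary, so the function is shown without reproducing r = 0.83, and the input series in the usage lines are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical report-quality scores and retention rates, for illustration only.
quality = [1, 2, 3, 4]
retention = [2, 4, 6, 8]
print(round(pearson_r(quality, retention), 6))  # 1.0
```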
Accuracy increased from 62% to 89–94% after integration of AI and blockchain.
Reported accuracy figures in results section based on PRIF evaluation (presumably from analyzed forensic reports/case studies).
high positive Enhancing Forensic Accounting Practice: A Proactive Risk Man... detection/analysis accuracy (%)
Integration of AI and blockchain reduced the risk detection time from 47 days post-event to 9–22 days pre-event.
Reported results from PRIF implementation/pilot using case studies and forensic report analysis (paper cites these temporal comparisons).
high positive Enhancing Forensic Accounting Practice: A Proactive Risk Man... risk detection time (days)
This study pioneers a Proactive Risk Intelligence Framework (PRIF) for Chartered Accountant (CA) firms, targeting gaps in risk anticipation, stakeholder communication, and compliance.
Paper description of study objective and framework development (mixed-method design, interviews, case studies, forensic report analysis).
high positive Enhancing Forensic Accounting Practice: A Proactive Risk Man... creation and introduction of PRIF (framework development)
By reframing reskilling as a shared, supported, and bounded process, AI-driven change can foster long-term career resilience, professional identity renewal, and sustainable human–AI integration.
Conceptual conclusion/implication drawn by the authors from the proposed model and recommendations; no empirical validation included in the paper.
high positive AI-driven skill volatility and the emergence of re-skilling ... career resilience, professional identity renewal, quality of human–AI integratio...
The paper advances a set of sustainable, collective strategies—such as role-linked learning, protected learning time, skill prioritization, and phased AI adoption—to interrupt the reskilling loop and redistribute adaptive demands across organizations.
Prescriptive/theoretical recommendations proposed by the authors; no empirical evaluation or trial evidence presented.
high positive AI-driven skill volatility and the emergence of re-skilling ... effectiveness of organizational strategies in reducing reskilling burdens
The paper proposes a reconstructed labour law framework based on economic dependency rather than traditional employment classification, including recognition of dependent contractor status, platform liability for worker welfare, algorithmic transparency, social security obligations, and specialised grievance mechanisms.
Normative legal/policy proposal articulated by the author(s) based on theoretical argument and the comparative analysis of existing regulatory gaps; prescriptive recommendation rather than empirically tested intervention.
high positive Corporate Accountability in the Gig Economy: Re-examining La... recommended legal/regulatory reforms and institutional design
Because the method sits architecturally below the current safety stack, the same formula provides a real-time warning signal that current alignment does not supply, portable across current and future ChatGPT-like AI architectures and instantiable in application domains where competing response classes can be defined.
Theoretical/architectural claim in paper, supported by cross-architecture empirical tests and theoretical argument (no further quantitative sample size provided in excerpt).
high positive Fusion-fission forecasts when AI will shift to undesirable b... availability of a real-time warning signal for undesirable shifts, portability a...
The authors made an a priori time-stamped prediction eleven months before the Stanford 'Delusional Spirals' corpus appeared, and that prediction was independently confirmed by the corpus of 207,443 human–AI exchanges.
Time-stamped prediction reported in paper; independent confirmation claimed via the Stanford 'Delusional Spirals' corpus containing 207,443 human–AI exchanges.
high positive Fusion-fission forecasts when AI will shift to undesirable b... a priori predictive success confirmed by an independent corpus
The shift phenomenon and forecasting persist at production scale across ten frontier chatbots.
Empirical observation reported in paper: tests across ten production/frontier chatbots.
high positive Fusion-fission forecasts when AI will shift to undesirable b... persistence of shift dynamics and forecasting applicability in production chatbo...
The method achieved 90 percent correct forecasting across seven AI models spanning two orders of magnitude in parameter count (124M–12B).
Empirical test reported in paper: seven AI models evaluated for forecasting accuracy; model parameter counts reported as 124M–12B.
high positive Fusion-fission forecasts when AI will shift to undesirable b... forecasting accuracy of shifts
The shift-condition approach is validated across six independent tests.
Paper statement listing six independent validation tests (method: multiple independent experiments/tests).
high positive Fusion-fission forecasts when AI will shift to undesirable b... validation across multiple independent tests
The shift condition is neither model-specific nor driven by stochastic sampling.
Claim supported by cross-model empirical tests reported in the paper (tests spanning multiple model sizes and production chatbots).
high positive Fusion-fission forecasts when AI will shift to undesirable b... generality of shift condition across models and sampling modes
The shift condition is derivable mathematically and results from group-level competition between the conversation-so-far (C) and the desirable (B) and undesirable (D) basin dynamics, which can be estimated in advance for a given application.
Paper claims an explicit mathematical derivation (theoretical/mathematical methods reported).
high positive Fusion-fission forecasts when AI will shift to undesirable b... mathematical derivability and interpretable formulation of shift condition (C vs...
A vector generalization of fusion–fission group dynamics (observed in living and active-matter systems) drives — and can forecast — future shifts in an AI's behavior.
Theoretical proposal plus empirical validation reported in paper (validated across six independent tests as stated).
high positive Fusion-fission forecasts when AI will shift to undesirable b... ability to forecast future behavioral shifts in AI
The appropriate design response to Metis tasks is centaur architectures in which humans lead and AI supports, rather than pursuing further automation.
Prescriptive recommendation based on the conceptual analysis and normative reasoning in the paper; not supported by empirical evaluation or quantified comparisons of architectures.
high positive Metis AI: The Overlooked Middle Zone Between AI-Native and W... recommended human-AI system design
Policy conclusion: while palliative care is an ethical imperative, its expansion must be decoupled from the oncological paradigm and matched with state-funded long-term care to protect against clinical decline and financial shocks.
Normative recommendation based on the empirical distributional findings (average protective effects but harmful tails for vulnerable groups) and cross-national differences reported in the analysis.
high positive The Broken Shield of European Palliative Care: Evidence from... Policy effectiveness in protecting households from clinical decline and financia...
We introduce a Synthetic Data Generation framework using Tabular Denoising Diffusion Probabilistic Models within a Two-Learner architecture to synthesize high-fidelity digital twins from pan-European SHARE data (2016-2021).
Methodological contribution described in the paper; implementation details include use of diffusion-based tabular generative models and a Two-Learner architecture applied to SHARE microdata from 2016–2021.
high positive The Broken Shield of European Palliative Care: Evidence from... Quality/fidelity of synthesized digital twins (methodological outcome)
On average, palliative care (PC) acts as a 'double shield', truncating out-of-pocket expenditures (financial toxicity) and informal caregiving shadow values (time poverty).
Analysis of pan-European SHARE data (2016-2021) using a Synthetic Data Generation framework (Tabular Denoising Diffusion Probabilistic Models within a Two-Learner architecture) to create digital twins and estimate treatment effects.
high positive The Broken Shield of European Palliative Care: Evidence from... Out-of-pocket expenditures (financial toxicity) and informal caregiving shadow v...
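
The summary does not define the "Two-Learner architecture". One plausible reading, and an assumption here, is the T-learner used in causal inference: fit one outcome model on treated units and one on controls, then take the difference of their predictions as the estimated effect. The sketch below illustrates only that idea, with a single covariate and ordinary least squares. It leaves out the diffusion-based synthetic data step entirely, and all names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x is constant; slope is undefined")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def t_learner_effects(xs, treated, ys):
    """Per-unit effect estimates: treated-model prediction minus control-model prediction."""
    treated_rows = [(x, y) for x, t, y in zip(xs, treated, ys) if t]
    control_rows = [(x, y) for x, t, y in zip(xs, treated, ys) if not t]
    a1, b1 = fit_line([x for x, _ in treated_rows], [y for _, y in treated_rows])
    a0, b0 = fit_line([x for x, _ in control_rows], [y for _, y in control_rows])
    return [(a1 + b1 * x) - (a0 + b0 * x) for x in xs]


# Toy data (hypothetical): treatment lowers the outcome by exactly 2 units.
xs = list(range(10))
treated = [i % 2 == 1 for i in range(10)]
ys = [1 + 0.5 * x - 2 * t for x, t in zip(xs, treated)]
effects = t_learner_effects(xs, treated, ys)
print(round(sum(effects) / len(effects), 6))  # -2.0
```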
The study highlights the importance of reskilling and education reforms to ensure inclusive labor market outcomes in the era of AI-driven transformation.
Authors' policy recommendation based on their empirical findings from the survey (n=320) and SEM analysis; presented as a conclusion/recommendation rather than a quantified empirical result.
high positive ARTIFICIAL INTELLIGENCE, AUTOMATION, AND LABOR MARKET TRANSF... policy recommendation: reskilling and education reforms
The model explained 49% of variance in wage dynamics (R^2 = 0.49).
SEM model statistics reported for the survey-based model (n=320); R-squared for wage dynamics = 49%.
high positive ARTIFICIAL INTELLIGENCE, AUTOMATION, AND LABOR MARKET TRANSF... wage dynamics (explained variance)
The model explained 45% of variance in skill transformation (R^2 = 0.45).
SEM model statistics reported for the survey-based model (n=320); R-squared for skill transformation = 45%.
high positive ARTIFICIAL INTELLIGENCE, AUTOMATION, AND LABOR MARKET TRANSF... skill transformation (explained variance)
The model explained 52% of variance in employment patterns (R^2 = 0.52).
SEM model fit/variance-explained statistics reported for the survey-based model (n=320); R-squared for employment patterns = 52%.
high positive ARTIFICIAL INTELLIGENCE, AUTOMATION, AND LABOR MARKET TRANSF... employment patterns (explained variance)
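
The three R^2 values above are shares of explained variance. As a reminder of what that means, the sketch below computes R^2 as one minus the ratio of residual to total sum of squares. SEM software derives it from the model's estimated residual variance rather than from raw predictions, so this shows the definition, not the paper's procedure. The data are hypothetical.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_residual / SS_total."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    if ss_total == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


# Hypothetical values: predicting the mean for every case explains no variance.
observed = [1.0, 2.0, 3.0, 4.0]
print(r_squared(observed, [2.5, 2.5, 2.5, 2.5]))  # 0.0
print(r_squared(observed, observed))              # 1.0
```

By this definition, an R^2 of 0.52 means the model accounts for 52% of the variance in the outcome and leaves 48% unexplained.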
Mediation analysis confirmed that skill transformation plays a significant mediating role linking AI adoption with wage distribution/outcomes.
Mediation analysis within the SEM framework applied to the survey data (n=320); authors report a significant mediation effect (no numeric indirect effect reported in the summary).
high positive ARTIFICIAL INTELLIGENCE, AUTOMATION, AND LABOR MARKET TRANSF... wage dynamics (as mediated by skill transformation)
Mediation analysis confirmed that skill transformation plays a significant mediating role linking AI adoption with employment outcomes.
Mediation analysis within the SEM framework applied to the survey data (n=320); authors report a significant mediation effect (no numeric indirect effect reported in the summary).
high positive ARTIFICIAL INTELLIGENCE, AUTOMATION, AND LABOR MARKET TRANSF... employment patterns (as mediated by skill transformation)
Skill transformation significantly affected wage dynamics (β = 0.55, p < 0.001).
Structural equation modeling (SEM) on the same sample (n=320); reported standardized path coefficient β = 0.55 with p < 0.001.
Skill transformation significantly affected employment patterns (β = 0.58, p < 0.001).
Structural equation modeling (SEM) mediation/causal-path analysis on the survey (n=320); reported standardized path coefficient β = 0.58 with p < 0.001.
AI adoption significantly influenced wage dynamics (β = 0.61, p < 0.001).
Structural equation modeling (SEM) on the same survey sample (n=320); reported standardized path coefficient β = 0.61 with p < 0.001.
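
In a simple mediation model, the indirect effect is the product of two paths: a (predictor to mediator) and b (mediator to outcome). The sketch below shows that product and the Sobel z statistic that is often used to test it. The summaries above report b = 0.55 for skill transformation to wage dynamics but do not report the AI adoption to skill transformation path or any standard errors, so the value of a and both standard errors in the usage lines are hypothetical placeholders. The paper may also have tested mediation differently, for example by bootstrapping.

```python
import math


def indirect_effect(a, b):
    """Product-of-coefficients indirect effect for a single mediator."""
    return a * b


def sobel_z(a, b, se_a, se_b):
    """Sobel test statistic for the indirect effect a * b."""
    se_indirect = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    if se_indirect == 0:
        raise ValueError("standard error of the indirect effect is zero")
    return (a * b) / se_indirect


b_skill_to_wage = 0.55      # reported in the summary above
a_ai_to_skill = 0.50        # hypothetical: this path is not reported in the excerpt
se_a, se_b = 0.05, 0.05     # hypothetical standard errors

print(round(indirect_effect(a_ai_to_skill, b_skill_to_wage), 3))  # 0.275
print(sobel_z(a_ai_to_skill, b_skill_to_wage, se_a, se_b) > 1.96)  # True
```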