The Commonplace

Evidence (6869 claims)

Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
Filter: Governance
The basin of attraction of the partial adoption trap is enlarged by a threshold coordination failure arising from the non-appropriable nature of systemic benefits.
Model analysis showing how non-appropriable systemic benefits (externalities) change payoff structure and enlarge the basin of attraction for partial adoption. Theoretical derivation; no empirical sample.
high negative The partial adoption trap: Coordination failure, trust, and ... size of basin of attraction for partial adoption (likelihood of landing in parti...
Current monolithic architectures struggle to enforce rigid brand constraints, frequently hallucinating unapproved visual assets.
Asserted critique of existing architectures in paper; no specific empirical metrics, datasets, or sample sizes provided.
high negative Genflow Ad Studio: A Compound AI Architecture for Brand-Alig... hallucination of unapproved assets / brand compliance
Integration of generative video models into enterprise environments is restricted by temporal inconsistencies and severe brand misalignment.
Statement in paper describing deployment limitations; no empirical study, dataset, or sample size provided to quantify these restrictions.
high negative Genflow Ad Studio: A Compound AI Architecture for Brand-Alig... brand alignment / temporal consistency
Deterministic copy collapses the learner's uncertainty over actions.
Ablation/diagnostic comparisons reported in the paper showing deterministic-copy policies reduce or collapse uncertainty compared to stochastic or trace-informed policies in the benchmark tasks.
high negative When Outcome Looks Right But Discipline Fails: Trace-Based E... uncertainty over action distributions (uncertainty collapse)
Reward-only PPO variants miss trace alignment (they achieve reward/KPIs but do not align with benchmark trace/behavior).
Empirical comparison across the two-hotel benchmark and a compact hidden-budget bidding task showing reward-only PPO variants fail to match trace-based diagnostics.
high negative When Outcome Looks Right But Discipline Fails: Trace-Based E... trace alignment (agreement between agent trace and benchmark behavior)
Artificial intelligence creates both opportunities and threats for developing economies: it may erode their labor cost advantage.
Conceptual assessment; mechanism-based argument (automation reduces the labor cost advantage); no empirical data or sample is specified.
high negative Yapay Zekâ ve Küresel Değer Zincirleri: Ticaret Politikası v... decline in developing countries' labor cost advantage
This transformation directly calls into question existing global value chain structures and countries' positions within those chains.
Conceptual discussion; the author's analytical framework argues that GVC (global value chain) structures can be reassessed in light of AI; no empirical sample.
high negative Yapay Zekâ ve Küresel Değer Zincirleri: Ticaret Politikası v... growing uncertainty of countries' positions in global value chains / yeniden b...
The tech industry claims that its products, business models, and methods of resource extraction are unprecedented and fall outside any existing legal framework.
Descriptive claim about prevailing industry discourse referenced by the authors. (Citations or examples of industry statements not included in the excerpt.)
high negative Auditing African Content Moderators' Working Conditions by U... industry discourse of exceptionalism (claiming novelty and exemption from existi...
Exploitative working conditions violate workers' rights.
Legal assessment based on documents and the authors' interpretation of rights under applicable law (GDPR and labour rights frameworks). (Specific legal rulings or counts not provided in the excerpt.)
high negative Auditing African Content Moderators' Working Conditions by U... violation of workers' legal rights by working conditions
The results of this approach provide legally grounded evidence of the structural disadvantages faced by content moderators in the Global South, whose exploitative working conditions violate workers' rights.
Documents obtained via GDPR requests (employment contracts, NDAs, etc.) and legal interpretation are used as evidence to support claims of structural disadvantage and rights violations. (Specific documents and counts not provided in the excerpt.)
high negative Auditing African Content Moderators' Working Conditions by U... structural disadvantages and rights violations experienced by content moderators...
Current alignment approaches are primarily reactive rather than proactive.
Author's critique/characterization of prevailing alignment practice (conceptual observation without quantitative support).
high negative Positive Alignment: Artificial Intelligence for Human Flouri... orientation of alignment approaches (reactive vs proactive)
The prevailing paradigm of alignment parallels early psychology's focus on mental illness: necessary but incomplete.
Analogy/argument presented by the authors as a conceptual critique (no empirical test reported).
high negative Positive Alignment: Artificial Intelligence for Human Flouri... completeness/adequacy of the current alignment paradigm
Existing alignment research is dominated by concerns about safety and preventing harm: safeguards, controllability, and compliance.
Author's literature-level observation / conceptual review in the paper (no systematic review or quantitative coding reported).
high negative Positive Alignment: Artificial Intelligence for Human Flouri... dominant focus of alignment research
Step-wise verification (verifying each stage of the reasoning chain) increases computational overhead and infrastructure requirements when deployed at scale.
Paper's structural trade-off analysis and engineering argument; no measured compute-costs, benchmarks, or sample-size reporting included in the provided text.
high negative Optimizing Process Based Reward Models through Reinforcement... computational overhead / infrastructure cost
Process-based supervision introduces challenges regarding the sustainability of human-in-the-loop feedback loops.
Socio-technical argumentation in the paper—concern raised about ongoing human verification burden; no longitudinal or empirical data on human labor sustainability provided.
high negative Optimizing Process Based Reward Models through Reinforcement... sustainability of human-in-the-loop feedback (human labor burden / scalability o...
Deploying PRMs at scale introduces unique challenges regarding system latency.
Engineering and infrastructure trade-off analysis described in the paper; no measured latency benchmarks or sample-size performance tests provided in the supplied text.
high negative Optimizing Process Based Reward Models through Reinforcement... system latency / runtime performance
Traditional outcome-based reward models, which evaluate only the final correctness of a solution, often fail to identify logical fallacies or "hallucinations" occurring within intermediate steps.
Theoretical critique and conceptual argumentation presented in the paper; no empirical study or sample size reported.
high negative Optimizing Process Based Reward Models through Reinforcement... hallucination/error detection in intermediate reasoning steps
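The contrast drawn in the claims above, between scoring only the final answer and verifying each intermediate step, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the step format `(a, op, b, claimed)` and the function names `outcome_reward`, `stepwise_check`, and `process_reward` are hypothetical, and the exact arithmetic check stands in for what would be a learned process reward model in practice. It is not the paper's method.

```python
import operator

# Supported operations for the toy arithmetic steps (illustrative only).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def outcome_reward(steps, expected_answer):
    """Score only the final claimed value, ignoring how it was reached."""
    return 1.0 if steps[-1][3] == expected_answer else 0.0


def stepwise_check(steps):
    """Return the indices of steps whose claimed result does not verify."""
    return [
        i
        for i, (a, op, b, claimed) in enumerate(steps)
        if OPS[op](a, b) != claimed
    ]


def process_reward(steps):
    """Fraction of steps that verify; a stand-in for a learned PRM score."""
    return 1.0 - len(stepwise_check(steps)) / len(steps)


# Target: (2 + 3) * 4 = 20.
# In `flawed`, two mistakes cancel, so the final answer still looks right.
flawed = [(2, "+", 3, 6), (6, "*", 4, 20)]
sound = [(2, "+", 3, 5), (5, "*", 4, 20)]

print(outcome_reward(flawed, 20), stepwise_check(flawed), process_reward(flawed))
print(outcome_reward(sound, 20), stepwise_check(sound), process_reward(sound))
```

In this sketch the outcome-only score cannot tell the two chains apart, while the step-wise check flags both faulty steps. The step-wise version also performs one verification per step instead of one per solution, which is the source of the overhead and latency concerns listed above.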
Capital-intensive sectors face structural constraints on adaptability.
Observed sectoral differences in comparative analysis (e.g., inclusion of ExxonMobil among firms) indicating lower Flexibility Index scores or slower reallocation in capital-intensive firm(s).
high negative Budgeting for Agility: A Cross-Sectoral Analysis of Fiscal F... adaptability / capacity to reallocate resources
Cross-sectoral empirical evidence linking budget flexibility, forecasting accuracy, and institutional oversight remains limited.
Statement of literature gap in paper motivating the study; no new quantitative estimate provided.
high negative Budgeting for Agility: A Cross-Sectoral Analysis of Fiscal F... availability of cross-sector empirical evidence
Traditional static budgeting models are increasingly inadequate in environments marked by volatility, technological disruption, and fiscal uncertainty.
Framing claim in paper introduction; no specific empirical estimate given. Based on comparative empirical design motivation.
high negative Budgeting for Agility: A Cross-Sectoral Analysis of Fiscal F... adequacy of static budgeting models (organizational adaptability to volatile env...
The findings carry direct implications for accountability, institutional integrity, and public trust in urban governance, and contribute to ongoing discourse on responsible AI adoption in cities aligned with global sustainability priorities.
Synthesis of audit results and discussion of their broader implications for public-sector adoption of LLMs in cities; inferential claim based on study outcomes (e.g., errors, fabricated sources, regulatory misinterpretation).
high negative Governance risks of AI reasoning in urban infrastructure thr... implications for accountability, institutional integrity, public trust
These failures extend beyond technical accuracy and introduce risks for governance, fiscal responsibility, and regulatory compliance.
Interpretation of audit findings (e.g., high rate of unverifiable citations, misinterpretation of regulations, degraded alignment on strategic scenarios) to argue systemic risks in governance and fiscal/regulatory domains.
high negative Governance risks of AI reasoning in urban infrastructure thr... risks to governance, fiscal responsibility, regulatory compliance
Many responses misinterpreted regulatory requirements or relied on shallow justification.
Qualitative coding/analysis of LLM responses against expert rubric showing frequent misinterpretation of regulations and superficial reasoning.
high negative Governance risks of AI reasoning in urban infrastructure thr... accuracy of regulatory interpretation and depth of justification
Decision alignment with expert judgment degraded as scenario complexity increased, with strong agreement on operational triage but near-complete divergence on strategic capital allocation.
Comparative evaluation of LLM decisions vs. expert rubric across scenarios of varying complexity (operational triage through strategic capital allocation); qualitative and/or quantitative agreement measures reported in paper.
high negative Governance risks of AI reasoning in urban infrastructure thr... alignment between LLM decisions and expert judgment across scenario complexity
LLM self-reported confidence was negatively correlated with actual reasoning quality (r = -0.23), meaning the lowest-performing models projected the greatest certainty.
Statistical correlation reported between LLM self-reported confidence scores and measured reasoning quality across audited responses/models; correlation coefficient r = -0.23.
high negative Governance risks of AI reasoning in urban infrastructure thr... relationship between self-reported confidence and measured reasoning quality
Across all models, 51.3% of cited sources were unverifiable or fabricated.
Quantitative audit of citations provided by the six commercial LLMs; proportion of cited sources judged unverifiable or fabricated as reported in paper.
high negative Governance risks of AI reasoning in urban infrastructure thr... verifiability of cited sources
Monte Carlo simulations illustrate that standard DID estimators that ignore spillovers can miss the total effect.
Monte Carlo simulation results reported in the paper comparing standard DID estimators (which ignore spillovers) to the proposed approach; simulations show standard DID can fail to capture the total effect under spillovers.
high negative Identification and Estimation of Staggered Difference-in-Dif... accuracy of total effect estimation (bias/omission by standard DID)
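The spillover claim above can be made concrete with a small Monte Carlo sketch. This is not the paper's simulation design or its proposed estimator. It assumes a two-period panel, a direct effect `direct` on treated units, a spillover `spill` that reaches treated units and a share `share_exposed` of control units, and it uses a comparison against unexposed controls purely as an illustrative benchmark. All names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_once(rng, n=2000, direct=1.0, spill=0.5, share_exposed=0.6):
    """Return (naive_did, clean_did) for one simulated two-period panel."""
    treated, exposed_controls, pure_controls = [], [], []
    for i in range(n):
        # Pre-to-post change: common trend plus idiosyncratic noise.
        change = 0.3 + rng.gauss(0.0, 1.0)
        if i < n // 2:
            # Treated units receive the direct effect and the spillover.
            treated.append(change + direct + spill)
        elif rng.random() < share_exposed:
            # Some controls are contaminated by the spillover.
            exposed_controls.append(change + spill)
        else:
            pure_controls.append(change)
    treated_mean = statistics.fmean(treated)
    # Standard DID pools all controls, including spillover-exposed ones.
    naive = treated_mean - statistics.fmean(exposed_controls + pure_controls)
    # Illustrative benchmark: compare only against unexposed controls.
    clean = treated_mean - statistics.fmean(pure_controls)
    return naive, clean


def monte_carlo(reps=200, seed=0, **kwargs):
    """Average both estimators over repeated simulated panels."""
    rng = random.Random(seed)
    draws = [simulate_once(rng, **kwargs) for _ in range(reps)]
    naive_mean = statistics.fmean(d[0] for d in draws)
    clean_mean = statistics.fmean(d[1] for d in draws)
    return naive_mean, clean_mean


naive, clean = monte_carlo()
print(f"naive DID: {naive:.3f}, unexposed-control DID: {clean:.3f}")
```

Under these assumed parameters the total effect on treated units is `direct + spill = 1.5`, while the pooled comparison converges to `direct + spill - share_exposed * spill = 1.2`, so the naive estimate understates the total effect by the spillover absorbed into the control group.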
The analysis also identifies risks linked to exclusion, symbolic compliance, and concentration of control over compliance processes.
Theoretical risk mapping produced by the integrative review and interpretive synthesis; no primary empirical evidence presented.
high negative RegTech-enabled governance of sanctions-safe enterprise ecos... risks of RegTech governance (exclusion, symbolic compliance, concentration of co...
Uncertainty around compliance and excessive risk avoidance reduce the space for lawful business activity.
Interpretive synthesis of evidence and arguments across the reviewed literatures (sanctions compliance, institutional voids); no original empirical test.
high negative RegTech-enabled governance of sanctions-safe enterprise ecos... extent of lawful business activity (regulatory-compliance-driven market particip...
Firms working under such conditions often experience limited access to finance and markets.
Claim derived from literature on firm constraints in weak institutional/sanctioned contexts as reviewed in the paper; no primary empirical data reported.
high negative RegTech-enabled governance of sanctions-safe enterprise ecos... access to finance and markets for firms
Post-conflict and sanctions-affected environments are strongly affected by sanctions pressure, weak rule enforcement, and high levels of corruption risk.
Synthesis of literature on sanctions, weak institutions, and corruption risk presented in the integrative review; no new empirical sample reported.
high negative RegTech-enabled governance of sanctions-safe enterprise ecos... institutional environment quality (sanctions pressure, rule enforcement, corrupt...
Accuracy is not a sufficient proxy for governance in regulated AI systems.
Empirical results from synthetic banking experiments showing divergence between task accuracy and governance-quality metrics across architectures, as summarized in the abstract.
high negative Mechanical Enforcement for LLM Governance: Evidence of Govern... sufficiency of task accuracy as a proxy for governance/auditability
Under text-only governance, 27% of deferrals carry no decision-relevant information.
Experimental evaluation in a synthetic banking domain comparing text-only governance to mechanical enforcement; reported statistic in paper abstract. Specific sample size not stated in abstract.
high negative Mechanical Enforcement for LLM Governance: Evidence of Govern... fraction of deferrals that contain no decision-relevant information
Currently, systematic assessment errors cause owners of lower-valued properties to face disproportionately high tax burdens, creating regressivity in the property tax system.
Empirical analysis of property assessments and tax burdens using 26 million property sales across ~95% of U.S. counties, showing systematic errors that bias tax burdens toward lower-valued properties.
high negative Tradeoffs are Domain Dependent: Improving Accuracy and Fairn... distributional tax burden (regressivity across property value quintiles)
There are limits to technology‑led growth strategies in labor‑abundant contexts; such strategies do not reliably deliver inclusive employment gains.
Argument based on synthesis of theory and comparative field evidence demonstrating weak employment outcomes from technology‑led growth in labor‑abundant settings (no quantitative effect sizes reported).
high negative Automation, Migration, and Development: Geography of Job Pre... effectiveness of technology-led growth strategies for employment generation
Digital media play a significant role in shaping youth mobilization and political unrest in migrants' countries of origin.
Empirical observations and regional field evidence reported in the paper linking digital media use to youth mobilization and political outcomes (qualitative/comparative evidence; no numeric sample size provided).
high negative Automation, Migration, and Development: Geography of Job Pre... youth mobilization and political unrest
Developing countries face macroeconomic vulnerabilities because of dependence on remittances, which are exposed by automation-driven changes in migrant labor demand.
Analytical linkage developed in the paper supported by comparative field evidence and macroeconomic reasoning; remittance dependence highlighted as a vulnerability (no quantitative estimates or sample sizes reported).
high negative Automation, Migration, and Development: Geography of Job Pre... macroeconomic vulnerability arising from remittance dependence
Technology adoption in core industries in advanced economies is linked with labor displacement, rising youth unemployment, and urban labor saturation in South Asia and North Africa.
Geographically grounded framework combined with comparative regional field evidence focused on South Asia and North Africa (qualitative/comparative field data referenced; no numeric sample sizes provided).
high negative Automation, Migration, and Development: Geography of Job Pre... labor displacement / youth unemployment / urban labor saturation
AI adoption and accelerating automation amplify employment precarity in labor‑surplus economies.
Conceptual synthesis grounded in economic geography and labor economics, supported by comparative field evidence cited for labor‑surplus contexts (no quantitative sample size reported).
high negative Automation, Migration, and Development: Geography of Job Pre... employment precarity (job quality and stability)
Automation functions as a transnational shock that contracts demand for migrant labor in advanced economies.
Theoretical argument drawing on economic geography, labor economics, and development studies; comparative/regional field evidence referenced in the paper (no numerical sample size reported).
Unless labour law evolves to address digitally mediated control and platform-based asymmetry, the gig economy risks normalising exploitative labour conditions under the guise of innovation and flexibility.
Predictive/theoretical claim based on the paper's synthesis of platform practices, legal gaps, and normative concerns; argued through comparative analysis and conceptual reasoning rather than quantitative forecasting.
high negative Corporate Accountability in the Gig Economy: Re-examining La... future trajectory of labour conditions and normalization of exploitative practic...
The paper uses the concept of 'digital slavery' as a normative framework to describe labour conditions shaped by coercive algorithmic management, absence of bargaining power, and structural precarity.
Conceptual and normative framing within the paper, using the 'digital slavery' metaphor to interpret observed platform labour practices and their implications; theoretical argumentation rather than empirical measurement.
high negative Corporate Accountability in the Gig Economy: Re-examining La... characterisation of labour conditions under algorithmic management
While several jurisdictions (UK, US, EU, India) have attempted to regulate gig work, most regulatory responses remain incomplete and fail to fully address platform accountability.
Comparative policy/regulatory analysis of the United Kingdom, United States, European Union and India assessing statutes, litigation and policy measures; qualitative assessment rather than statistical evaluation (no quantitative sample size reported).
high negative Corporate Accountability in the Gig Economy: Re-examining La... completeness/effectiveness of regulatory responses to platform accountability
Platform companies rely on contractual misclassification, corporate structuring, and the legal fiction of neutrality to separate control from liability.
Legal and corporate-structure analysis across jurisdictions, examining contracts, corporate forms and legal doctrines; based on comparative statutory and case-law review (no quantitative sample size reported).
high negative Corporate Accountability in the Gig Economy: Re-examining La... allocation of legal liability and regulatory accountability
The platform economy produces a deeply unequal labour structure marked by algorithmic control, economic dependency, surveillance, and lack of social protection.
Synthesis and critical analysis combining literature, policy review and comparative jurisdictional study to argue systemic effects on labour structure; primarily qualitative evidence and theoretical framing (no quantitative sample size reported).
high negative Corporate Accountability in the Gig Economy: Re-examining La... distributional labour outcomes and social protection coverage
Gig workers, though formally classified as independent contractors, are functionally subjected to pricing control, performance monitoring, automated penalties, and deactivation mechanisms that closely resemble managerial authority.
Descriptive/qualitative evidence in the paper: examples and analysis of platform design and management practices (algorithmic pricing, monitoring, penalties, deactivation); based on platform policy documents, case examples and comparative review (no quantitative sample size reported).
high negative Corporate Accountability in the Gig Economy: Re-examining La... degree of algorithmic/managerial control over workers
Digital labour platforms exercise employer-like control while avoiding employer-like legal responsibilities.
Argument and comparative legal analysis across jurisdictions (United Kingdom, United States, European Union, India) demonstrating platform practices and legal/regulatory responses; based on documentary/legal review and critical analysis (no quantitative sample size reported).
high negative Corporate Accountability in the Gig Economy: Re-examining La... legal employment classification and control/responsibility
Shifts persist in even the newest AI models despite remarkable progress in AI modeling, post-training alignment and safeguards.
Asserted in paper; supported by later empirical validation across multiple models and production chatbots (see other claims), but no explicit sample size in this sentence.
high negative Fusion-fission forecasts when AI will shift to undesirable b... persistence of undesirable behavioral shifts despite alignment/safeguards
ChatGPT-like AI behavior can shift, unnoticed, from desirable to undesirable (e.g., encouraging self-harm, extremist acts, financial losses, or costly medical and military mistakes), and no one can yet predict when.
Statement in paper framing the problem; qualitative observations and motivating examples (no numeric sample size provided in the excerpt).
high negative Fusion-fission forecasts when AI will shift to undesirable b... occurrence of unnoticed shifts from desirable to undesirable outputs
Employees experience technostress, anxiety and micro-political negotiation around AI tools in everyday work.
Reported experiences from semistructured interviews with 28 managers/professionals across 12 organizations; thematic analysis highlighting technostress and anxiety as themes.
high negative Reimagining work in the age of intelligent automation: a qua... technostress and anxiety among employees