Evidence (6869 claims)
Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 758 | 199 | 100 | 900 | 2007 |
| Governance & Regulation | 826 | 400 | 191 | 122 | 1563 |
| Organizational Efficiency | 777 | 193 | 124 | 84 | 1189 |
| Technology Adoption Rate | 635 | 233 | 124 | 97 | 1098 |
| Research Productivity | 422 | 128 | 57 | 336 | 954 |
| Output Quality | 476 | 179 | 59 | 47 | 761 |
| Decision Quality | 328 | 177 | 81 | 47 | 640 |
| Firm Productivity | 435 | 57 | 88 | 20 | 606 |
| AI Safety & Ethics | 218 | 277 | 65 | 33 | 599 |
| Market Structure | 180 | 170 | 123 | 24 | 502 |
| Task Allocation | 213 | 64 | 72 | 33 | 387 |
| Skill Acquisition | 170 | 61 | 61 | 17 | 309 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 54 | 107 | 13 | 281 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 117 | 63 | 42 | 11 | 233 |
| Firm Revenue | 153 | 48 | 26 | 3 | 230 |
| Task Completion Time | 173 | 31 | 8 | 12 | 225 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Worker Satisfaction | 89 | 65 | 22 | 12 | 188 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 77 | 69 | 14 | 5 | 165 |
| Automation Exposure | 56 | 56 | 26 | 13 | 154 |
| Training Effectiveness | 94 | 21 | 13 | 19 | 149 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Team Performance | 86 | 17 | 27 | 10 | 141 |
| Developer Productivity | 95 | 17 | 14 | 6 | 133 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 52 | 7 | 8 | 3 | 70 |
| Creative Output | 31 | 18 | 8 | 3 | 61 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
Governance
The basin of attraction of the partial adoption trap is enlarged by a threshold coordination failure arising from the non-appropriable nature of systemic benefits.
Model analysis showing how non-appropriable systemic benefits (externalities) change payoff structure and enlarge the basin of attraction for partial adoption. Theoretical derivation; no empirical sample.
Current monolithic architectures struggle to enforce rigid brand constraints, frequently hallucinating unapproved visual assets.
Asserted critique of existing architectures in paper; no specific empirical metrics, datasets, or sample sizes provided.
Integration of generative video models into enterprise environments is restricted by temporal inconsistencies and severe brand misalignment.
Statement in paper describing deployment limitations; no empirical study, dataset, or sample size provided to quantify these restrictions.
Deterministic copying collapses the learner's uncertainty over actions.
Ablation/diagnostic comparisons reported in the paper showing deterministic-copy policies reduce or collapse uncertainty compared to stochastic or trace-informed policies in the benchmark tasks.
Reward-only PPO variants miss trace alignment (they achieve reward/KPIs but do not align with benchmark trace/behavior).
Empirical comparison across the two-hotel benchmark and a compact hidden-budget bidding task showing reward-only PPO variants fail to match trace-based diagnostics.
Artificial intelligence creates both opportunities and threats for developing economies: AI may erode their labor-cost advantage.
Conceptual assessment; mechanism-based argument (automation reduces the labor-cost advantage); no empirical data or sample specified.
This transformation directly calls into question existing global value chain structures and countries' positions within those chains.
Conceptual discussion; using the author's analytical framework, it is argued that GVC (global value chain) structures can be re-evaluated in light of AI; no empirical sample.
The tech industry claims that its products, business models, and methods of resource extraction are unprecedented and fall outside any existing legal framework.
Descriptive claim about prevailing industry discourse referenced by the authors. (Citations or examples of industry statements not included in the excerpt.)
Exploitative working conditions violate workers' rights.
Legal assessment based on documents and the authors' interpretation of rights under applicable law (GDPR and labour rights frameworks). (Specific legal rulings or counts not provided in the excerpt.)
The results of this approach provide legally grounded evidence of the structural disadvantages faced by content moderators in the Global South, whose exploitative working conditions violate workers' rights.
Documents obtained via GDPR requests (employment contracts, NDAs, etc.) and legal interpretation are used as evidence to support claims of structural disadvantage and rights violations. (Specific documents and counts not provided in the excerpt.)
Current alignment approaches are primarily reactive rather than proactive.
Author's critique/characterization of prevailing alignment practice (conceptual observation without quantitative support).
The prevailing paradigm of alignment parallels early psychology's focus on mental illness: necessary but incomplete.
Analogy/argument presented by the authors as a conceptual critique (no empirical test reported).
Existing alignment research is dominated by concerns about safety and preventing harm: safeguards, controllability, and compliance.
Author's literature-level observation / conceptual review in the paper (no systematic review or quantitative coding reported).
Step-wise verification (verifying each stage of the reasoning chain) increases computational overhead and infrastructure requirements when deployed at scale.
Paper's structural trade-off analysis and engineering argument; no measured compute-costs, benchmarks, or sample-size reporting included in the provided text.
Process-based supervision introduces challenges regarding the sustainability of human-in-the-loop feedback loops.
Socio-technical argumentation in the paper raising a concern about the ongoing human verification burden; no longitudinal or empirical data on human labor sustainability provided.
Deploying PRMs at scale introduces unique challenges regarding system latency.
Engineering and infrastructure trade-off analysis described in the paper; no measured latency benchmarks or sample-size performance tests provided in the supplied text.
Traditional outcome-based reward models, which evaluate only the final correctness of a solution, often fail to identify logical fallacies or "hallucinations" occurring within intermediate steps.
Theoretical critique and conceptual argumentation presented in the paper; no empirical study or sample size reported.
Capital-intensive sectors face structural constraints on adaptability.
Observed sectoral differences in comparative analysis (e.g., inclusion of ExxonMobil among firms) indicating lower Flexibility Index scores or slower reallocation in capital-intensive firm(s).
Cross-sectoral empirical evidence linking budget flexibility, forecasting accuracy, and institutional oversight remains limited.
Statement of literature gap in paper motivating the study; no new quantitative estimate provided.
Traditional static budgeting models are increasingly inadequate in environments marked by volatility, technological disruption, and fiscal uncertainty.
Framing claim in the paper's introduction, used to motivate its comparative empirical design; no specific empirical estimate given.
The findings carry direct implications for accountability, institutional integrity, and public trust in urban governance, and contribute to ongoing discourse on responsible AI adoption in cities aligned with global sustainability priorities.
Synthesis of audit results and discussion of their broader implications for public-sector adoption of LLMs in cities; inferential claim based on study outcomes (e.g., errors, fabricated sources, regulatory misinterpretation).
These failures extend beyond technical accuracy and introduce risks for governance, fiscal responsibility, and regulatory compliance.
Interpretation of audit findings (e.g., high rate of unverifiable citations, misinterpretation of regulations, degraded alignment on strategic scenarios) to argue systemic risks in governance and fiscal/regulatory domains.
Many responses misinterpreted regulatory requirements or relied on shallow justification.
Qualitative coding/analysis of LLM responses against expert rubric showing frequent misinterpretation of regulations and superficial reasoning.
Decision alignment with expert judgment degraded as scenario complexity increased, with strong agreement on operational triage but near-complete divergence on strategic capital allocation.
Comparative evaluation of LLM decisions vs. expert rubric across scenarios of varying complexity (operational triage through strategic capital allocation); qualitative and/or quantitative agreement measures reported in paper.
LLM self-reported confidence was negatively correlated with actual reasoning quality (r = -0.23), meaning the lowest-performing models projected the greatest certainty.
Statistical correlation reported between LLM self-reported confidence scores and measured reasoning quality across audited responses/models; correlation coefficient r = -0.23.
Across all models, 51.3% of cited sources were unverifiable or fabricated.
Quantitative audit of citations provided by the six commercial LLMs; proportion of cited sources judged unverifiable or fabricated as reported in paper.
Monte Carlo simulations illustrate that standard DID estimators that ignore spillovers can miss the total effect.
Monte Carlo simulation results reported in the paper comparing standard DID estimators (which ignore spillovers) to the proposed approach; simulations show standard DID can fail to capture the total effect under spillovers.
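The mechanism behind this claim can be illustrated with a minimal Monte Carlo sketch. It is not the paper's design or its proposed estimator. It assumes a two-period panel with three equal-sized groups (treated units, untreated units exposed to spillovers, and unexposed "pure" controls), and it assumes the total effect is the direct effect plus the spillover effect. The parameter names `tau`, `gamma`, and `trend`, and all numeric values, are hypothetical.

```python
import random
from statistics import mean


def simulate_once(rng, n=600, tau=2.0, gamma=1.0, trend=0.5):
    """Draw one two-period panel and return (naive, direct, spillover) estimates.

    Assumed setup (illustrative only): units cycle through three groups,
    treated, exposed controls, and pure controls. tau is the direct effect,
    gamma the spillover onto exposed controls, trend the common time trend.
    """
    groups = ["treated", "exposed", "pure"]
    change = {g: [] for g in groups}
    for i in range(n):
        g = groups[i % 3]
        effect = tau if g == "treated" else gamma if g == "exposed" else 0.0
        # First difference (post minus pre): unit fixed effects cancel,
        # leaving trend + effect + the difference of two period shocks.
        dy = trend + effect + rng.gauss(0, 1) - rng.gauss(0, 1)
        change[g].append(dy)

    # Standard DID: treated vs. ALL untreated units, ignoring spillovers.
    naive = mean(change["treated"]) - mean(change["exposed"] + change["pure"])
    # Spillover-aware contrasts: compare each group to pure controls only.
    direct = mean(change["treated"]) - mean(change["pure"])
    spillover = mean(change["exposed"]) - mean(change["pure"])
    return naive, direct, spillover


def monte_carlo(reps=500, seed=0):
    """Average the three estimates over many simulated panels."""
    rng = random.Random(seed)
    draws = [simulate_once(rng) for _ in range(reps)]
    naive, direct, spillover = (mean(col) for col in zip(*draws))
    return {
        "naive_did": naive,
        "direct": direct,
        "spillover": spillover,
        "total": direct + spillover,  # assumed definition of "total effect"
    }


if __name__ == "__main__":
    for name, value in monte_carlo().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions half of the untreated units are exposed, so the naive contrast is centered on `tau - gamma / 2` rather than on `tau` or on `tau + gamma`. With a positive spillover it understates both the direct and the total effect.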
The analysis also identifies risks linked to exclusion, symbolic compliance, and concentration of control over compliance processes.
Theoretical risk mapping produced by the integrative review and interpretive synthesis; no primary empirical evidence presented.
Uncertainty around compliance and excessive risk avoidance reduce the space for lawful business activity.
Interpretive synthesis of evidence and arguments across the reviewed literatures (sanctions compliance, institutional voids); no original empirical test.
Firms working under such conditions often experience limited access to finance and markets.
Claim derived from literature on firm constraints in weak institutional/sanctioned contexts as reviewed in the paper; no primary empirical data reported.
Post-conflict and sanctions-affected environments are strongly affected by sanctions pressure, weak rule enforcement, and high levels of corruption risk.
Synthesis of literature on sanctions, weak institutions, and corruption risk presented in the integrative review; no new empirical sample reported.
Accuracy is not a sufficient proxy for governance in regulated AI systems.
Empirical results from synthetic banking experiments showing divergence between task accuracy and governance-quality metrics across architectures, as summarized in the abstract.
Under text-only governance, 27% of deferrals carry no decision-relevant information.
Experimental evaluation in a synthetic banking domain comparing text-only governance to mechanical enforcement; reported statistic in paper abstract. Specific sample size not stated in abstract.
Currently, systematic assessment errors cause owners of lower-valued properties to face disproportionately high tax burdens, creating regressivity in the property tax system.
Empirical analysis of property assessments and tax burdens using 26 million property sales across ~95% of U.S. counties, showing systematic errors that bias tax burdens toward lower-valued properties.
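One common way to summarize this kind of regressivity is the price-related differential (PRD) used in assessment ratio studies. The sketch below is illustrative only: the source does not say which metric the paper uses, and the sale prices and assessed values are hypothetical.

```python
def price_related_differential(assessed, sale_prices):
    """Return the PRD: mean assessment ratio / value-weighted assessment ratio.

    Values above 1 indicate that lower-priced properties are assessed at
    higher ratios of their sale price than higher-priced ones (regressive).
    """
    if len(assessed) != len(sale_prices) or not assessed:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [a / s for a, s in zip(assessed, sale_prices)]
    mean_ratio = sum(ratios) / len(ratios)
    weighted_ratio = sum(assessed) / sum(sale_prices)
    return mean_ratio / weighted_ratio


if __name__ == "__main__":
    # Hypothetical data: cheaper homes over-assessed, expensive ones under-assessed.
    sales = [100_000, 200_000, 400_000, 800_000]
    assessed = [110_000, 200_000, 380_000, 720_000]
    print(f"PRD = {price_related_differential(assessed, sales):.4f}")
```

Because tax bills are proportional to assessed value within a jurisdiction, a PRD above 1 implies that owners of lower-valued properties pay a higher effective rate relative to market value.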
There are limits to technology‑led growth strategies in labor‑abundant contexts; such strategies do not reliably deliver inclusive employment gains.
Argument based on synthesis of theory and comparative field evidence demonstrating weak employment outcomes from technology‑led growth in labor‑abundant settings (no quantitative effect sizes reported).
Digital media play a significant role in shaping youth mobilization and political unrest in migrants' countries of origin.
Empirical observations and regional field evidence reported in the paper linking digital media use to youth mobilization and political outcomes (qualitative/comparative evidence; no numeric sample size provided).
Developing countries' dependence on remittances creates macroeconomic vulnerabilities, which automation-driven changes in migrant labor demand expose.
Analytical linkage developed in the paper supported by comparative field evidence and macroeconomic reasoning; remittance dependence highlighted as a vulnerability (no quantitative estimates or sample sizes reported).
Technology adoption in core industries in advanced economies is linked with labor displacement, rising youth unemployment, and urban labor saturation in South Asia and North Africa.
Geographically grounded framework combined with comparative regional field evidence focused on South Asia and North Africa (qualitative/comparative field data referenced; no numeric sample sizes provided).
AI adoption and accelerating automation amplify employment precarity in labor‑surplus economies.
Conceptual synthesis grounded in economic geography and labor economics, supported by comparative field evidence cited for labor‑surplus contexts (no quantitative sample size reported).
Automation functions as a transnational shock that contracts demand for migrant labor in advanced economies.
Theoretical argument drawing on economic geography, labor economics, and development studies; comparative/regional field evidence referenced in the paper (no numerical sample size reported).
Unless labour law evolves to address digitally mediated control and platform-based asymmetry, the gig economy risks normalising exploitative labour conditions under the guise of innovation and flexibility.
Predictive/theoretical claim based on the paper's synthesis of platform practices, legal gaps, and normative concerns; argued through comparative analysis and conceptual reasoning rather than quantitative forecasting.
The paper uses the concept of 'digital slavery' as a normative framework to describe labour conditions shaped by coercive algorithmic management, absence of bargaining power, and structural precarity.
Conceptual and normative framing within the paper, using the 'digital slavery' metaphor to interpret observed platform labour practices and their implications; theoretical argumentation rather than empirical measurement.
While several jurisdictions (UK, US, EU, India) have attempted to regulate gig work, most regulatory responses remain incomplete and fail to fully address platform accountability.
Comparative policy/regulatory analysis of the United Kingdom, United States, European Union and India assessing statutes, litigation and policy measures; qualitative assessment rather than statistical evaluation (no quantitative sample size reported).
Platform companies rely on contractual misclassification, corporate structuring, and the legal fiction of neutrality to separate control from liability.
Legal and corporate-structure analysis across jurisdictions, examining contracts, corporate forms and legal doctrines; based on comparative statutory and case-law review (no quantitative sample size reported).
The platform economy produces a deeply unequal labour structure marked by algorithmic control, economic dependency, surveillance, and lack of social protection.
Synthesis and critical analysis combining literature, policy review and comparative jurisdictional study to argue systemic effects on labour structure; primarily qualitative evidence and theoretical framing (no quantitative sample size reported).
Gig workers, though formally classified as independent contractors, are functionally subjected to pricing control, performance monitoring, automated penalties, and deactivation mechanisms that closely resemble managerial authority.
Descriptive/qualitative evidence in the paper: examples and analysis of platform design and management practices (algorithmic pricing, monitoring, penalties, deactivation); based on platform policy documents, case examples and comparative review (no quantitative sample size reported).
Digital labour platforms exercise employer-like control while avoiding employer-like legal responsibilities.
Argument and comparative legal analysis across jurisdictions (United Kingdom, United States, European Union, India) demonstrating platform practices and legal/regulatory responses; based on documentary/legal review and critical analysis (no quantitative sample size reported).
Shifts persist in even the newest AI models despite remarkable progress in AI modeling, post-training alignment and safeguards.
Asserted in paper; supported by later empirical validation across multiple models and production chatbots (see other claims), but no explicit sample size in this sentence.
ChatGPT-like AI behavior can shift, unnoticed, from desirable to undesirable (e.g., encouraging self-harm, extremist acts, financial losses, or costly medical and military mistakes), and no one can yet predict when.
Statement in paper framing the problem; qualitative observations and motivating examples (no numeric sample size provided in the excerpt).
Employees experience technostress, anxiety and micro-political negotiation around AI tools in everyday work.
Reported experiences from semistructured interviews with 28 managers/professionals across 12 organizations; thematic analysis highlighting technostress and anxiety as themes.