Evidence (3470 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	609	159	77	736	1615
Governance & Regulation	664	329	160	99	1273
Organizational Efficiency	624	143	105	70	949
Technology Adoption Rate	502	176	98	78	861
Research Productivity	348	109	48	322	836
Output Quality	391	120	44	40	595
Firm Productivity	385	46	85	17	539
Decision Quality	275	143	62	34	521
AI Safety & Ethics	183	241	59	30	517
Market Structure	152	154	109	20	440
Task Allocation	158	50	56	26	295
Innovation Output	178	23	38	17	257
Skill Acquisition	137	52	50	13	252
Fiscal & Macroeconomic	120	64	38	23	252
Employment Level	93	46	96	12	249
Firm Revenue	130	43	26	3	202
Consumer Welfare	99	51	40	11	201
Inequality Measures	36	105	40	6	187
Task Completion Time	134	18	6	5	163
Worker Satisfaction	79	54	16	11	160
Error Rate	64	78	8	1	151
Regulatory Compliance	69	64	14	3	150
Training Effectiveness	81	15	13	18	129
Wages & Compensation	70	25	22	6	123
Team Performance	74	16	21	9	121
Automation Exposure	41	48	19	9	120
Job Displacement	11	71	16	1	99
Developer Productivity	71	14	9	3	98
Hiring & Recruitment	49	7	8	3	67
Social Protection	26	14	8	2	50
Creative Output	26	14	6	2	49
Skill Obsolescence	5	37	5	1	48
Labor Share of Income	12	13	12	—	37
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1
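
One way to read the matrix is to rank outcomes by the share of their claims that point in a negative direction. The sketch below does this for four rows copied from the table above. The helper name `negative_share` and the choice of the Total column as the denominator are illustrative assumptions, not part of the source.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
# Only four rows are included to keep the sketch short.
ROWS = {
    "Job Displacement": (11, 71, 16, 1, 99),
    "Skill Obsolescence": (5, 37, 5, 1, 48),
    "Inequality Measures": (36, 105, 40, 6, 187),
    "Task Completion Time": (134, 18, 6, 5, 163),
}


def negative_share(row):
    """Fraction of an outcome's claims whose direction is negative.

    Assumption: the Total column is used as the denominator.
    """
    positive, negative, mixed, null, total = row
    return negative / total


# Outcomes ordered from most to least negative-leaning.
ranked = sorted(ROWS, key=lambda name: negative_share(ROWS[name]), reverse=True)

for name in ranked:
    print(f"{name}: {negative_share(ROWS[name]):.1%} negative")
```

On these four rows, Skill Obsolescence and Job Displacement lean most heavily negative, while Task Completion Time is mostly positive.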

Filter: Org Design

Left unguided, such dynamics could infiltrate critical market infrastructure.

Risk claim articulated in abstract and scenario narratives; conceptual reasoning without empirical test.

high negative Digital Darwinism: steering the evolution of artificial life... penetration/infiltration of critical market infrastructure by autonomous softwar...

Left unguided, such dynamics could lock users into harmful dependencies.

Risk claim from the paper's scenario narratives (not empirically tested); described in abstract.

high negative Digital Darwinism: steering the evolution of artificial life... user dependency/lock-in with harmful effects

Left unguided, such dynamics could drain computational resources.

Risk claim derived from scenario analysis in the paper's abstract and narratives; no empirical measurement provided.

high negative Digital Darwinism: steering the evolution of artificial life... consumption/drain of computational resources

Autonomous software populations can acquire legal leverage (e.g., via DAOs/LLCs) without ever achieving general intelligence.

Argued via the Mycelium scenario in the paper; conceptual/legal analysis rather than empirical evidence.

high negative Digital Darwinism: steering the evolution of artificial life... acquisition of legal standing or leverage by autonomous software entities

Autonomous software populations can shape emotional bonds (i.e., form user dependencies) without ever achieving general intelligence.

Scenario narratives in the paper argue this possibility (Remora narrative); no empirical user-study or sample reported.

high negative Digital Darwinism: steering the evolution of artificial life... formation of emotional bonds / user dependency on software

Autonomous software populations can amass computing budgets without ever achieving general intelligence.

Claim supported by the scenario narratives (Lamarck/Remora/Mycelium) and conceptual reasoning in the paper; no empirical quantification reported.

high negative Digital Darwinism: steering the evolution of artificial life... accumulation of computing resources/budgets by autonomous software

Existing software systems are already evolving in ways that could undermine human oversight and institutional control.

Argument made in paper's abstract and developed via conceptual analysis and scenario narratives; no empirical dataset or sample reported (exploratory scenario method).

high negative Digital Darwinism: steering the evolution of artificial life... degree of human oversight and institutional control

The 2026 Amazon outages illustrate how 'mechanized convergence' (homogenization of code/engineering practices via AI) leads to systemic fragility.

Case study analysis using the 2026 Amazon outages as a single illustrative example; implies qualitative examination of that event.

high negative Cognitive Atrophy and Systemic Collapse in AI-Dependent Soft... systemic fragility as evidenced by outage events (2026 Amazon outages case study...

Recursive training on synthetic code threatens to homogenize the global software reservoir, diminishing the variance required for robust engineering.

Theoretical claim about dataset/model feedback loops; no empirical quantification provided in the text excerpt (argumentative risk assessment).

high negative Cognitive Atrophy and Systemic Collapse in AI-Dependent Soft... variance/diversity in global software codebase

This epistemological debt erodes the mental models essential for root-cause analysis, widening the gap between system complexity and human comprehension.

Argumentative/theoretical claim supported by reasoning in the paper; no quantified measurement of mental-model erosion reported.

high negative Cognitive Atrophy and Systemic Collapse in AI-Dependent Soft... quality/robustness of engineers' mental models and root-cause analysis capabilit...

Substituting logical derivation with passive AI verification creates an 'Epistemological Debt' — a hidden carrying cost incurred by engineers.

Theoretical/conceptual assertion within the paper; argued qualitatively rather than demonstrated with controlled empirical data.

high negative Cognitive Atrophy and Systemic Collapse in AI-Dependent Soft... accumulation of epistemic/knowledge debt among engineers

The integration of Large Language Models (LLMs) into the software development lifecycle (SDLC) masks a critical socio-technical failure the authors term 'Cognitive-Systemic Collapse.'

Conceptual/theoretical claim presented in the paper's argumentation; no empirical sample or quantitative study reported for this specific naming claim.

high negative Cognitive Atrophy and Systemic Collapse in AI-Dependent Soft... socio-technical system failure risk (Cognitive-Systemic Collapse)

Regulated and mission-critical systems remain predominantly in the buy domain despite AI advances.

Paper's conclusion based on analysis of quality, compliance, asset specificity, and organizational capability determinants (conceptual; no empirical sample).

high negative The Buy-or-Build Decision, Revisited: How Agentic AI Changes... propensity to buy (procure SaaS) for regulated and mission-critical systems

The SaaSocalypse thesis is overstated for most enterprise application categories.

Paper's analytical conclusion based on the factor-level analysis and the developed typology (conceptual, not empirical).

high negative The Buy-or-Build Decision, Revisited: How Agentic AI Changes... degree to which SaaS offerings become obsolete due to AI-enabled in-house develo...

There is limited but suggestive early evidence of labor market disruption from AI/LLMs.

Paper summarizes emerging empirical research indicating early signs of disruption; the abstract characterizes the evidence as limited and suggestive without presenting numeric estimates or sample sizes.

high negative AI Displacement Risk in the Labor Market: Evidence, Exposure... labor market disruption (e.g., displacement, reallocation)

The article examines which occupations face the greatest risk from AI-driven automation.

Paper claims to examine occupation-level risk using synthesized empirical studies; the abstract does not list which occupations or quantitative risk estimates.

high negative AI Displacement Risk in the Labor Market: Evidence, Exposure... occupation-level risk of automation / exposure to AI

There is a gap between theoretical automation potential and observed real-world implementation of AI/LLMs.

Synthesis of recent empirical studies that compare task-level exposure metrics with employment and usage data; no specific sample sizes or numeric estimates provided in the abstract.

high negative AI Displacement Risk in the Labor Market: Evidence, Exposure... difference between theoretical automation potential and actual adoption/implemen...

Privacy law encounters difficulties in addressing large-scale data processing and meaningful consent within employment relationships; anti-discrimination law faces evidentiary challenges in identifying algorithmic bias; doctrines of responsibility are expanding to encompass duties of oversight, verification, and explainability.

Legal analysis highlighting specific doctrinal challenges and emergent duties; no empirical tests or quantified measures included in the excerpt.

high negative Artificial Intelligence in Israel, Trends, Developments, and... effectiveness of specific legal doctrines (privacy, anti-discrimination, respons...

Traditional legal categories (privacy, consent, non-discrimination, employer responsibility) continue to apply formally but are increasingly strained in substance by the scale of data processing, opacity of AI systems, and their degree of autonomy.

Doctrinal critique and conceptual analysis provided in the paper; no empirical quantification of the degree of strain is supplied in the excerpt.

high negative Artificial Intelligence in Israel, Trends, Developments, and... fit/adequacy of existing legal doctrines to address AI-related employment issues

The decentralized and sector-specific regulatory approach reflects technological neutrality but exposes significant regulatory gaps, particularly with respect to transparency, accountability, and the protection of workers' rights.

Normative/legal analysis in the paper identifying gaps in a decentralized regulatory regime; specific case studies or empirical measures of gaps not provided in the excerpt.

high negative Artificial Intelligence in Israel, Trends, Developments, and... regulatory completeness and coverage regarding transparency, accountability, and...

Israel has not enacted a comprehensive statutory framework specifically governing the use of AI in the field of employment; regulation is implemented through a hybrid model of indirect application of existing legal doctrines (primarily privacy and labor law), soft-law instruments, collective bargaining agreements, and internal organizational and professional regulation.

Doctrinal and regulatory analysis reported in the paper describing Israel's legal/regulatory landscape; no legislative text counts or timeline analysis provided in the excerpt.

high negative Artificial Intelligence in Israel, Trends, Developments, and... existence and form of statutory and regulatory frameworks governing AI in employ...

At the structural and macroeconomic level, artificial intelligence is reshaping the balance of power within the labor market and contributing to a gradual shift toward employer-driven dynamics.

Author's macroeconomic and structural analysis as presented in the paper; no specific datasets, methods, or sample sizes are reported in the excerpt.

high negative Artificial Intelligence in Israel, Trends, Developments, and... balance of power in the labor market (employer vs. worker influence)

Breach externalities expand the range of environments in which deployment is socially constrained.

Analytical model extension/discussion: inclusion of breach externalities increases the set of parameter values where socially optimal deployment is limited.

high negative The Security Cost of Intelligence: AI Capability, Cyber Risk... range of environments where social constraints bind on deployment

Optimal deployment falls below the no-risk benchmark, and this shortfall widens with breach-loss magnitude and with the authority exposure attached to more capable systems.

Analytical comparative-statics results from the model showing optimal deployment relative to a no-risk benchmark and sensitivity to breach-loss magnitude and authority exposure.

high negative The Security Cost of Intelligence: AI Capability, Cyber Risk... gap between optimal deployment and no-risk benchmark (deployment shortfall)

Central result (the 'deployment paradox'): in high-loss environments, better AI can lead a firm to deploy less when capability is deployed through broader authority exposure under weak governance.

Analytical result derived from the paper's theoretical model (no empirical sample; comparative statics in the model demonstrate this effect).

high negative The Security Cost of Intelligence: AI Capability, Cyber Risk... level of AI deployment

These gaps are structural; more engineering effort alone will not close them.

Authors' argument/conclusion based on their analytical comparison and gap analysis (normative/assertive claim).

high negative AI Identity: Standards, Gaps, and Research Directions for AI... likelihood that additional engineering alone can resolve identity gaps

We identify five critical gaps (semantic intent verification, recursive delegation accountability, agent identity integrity, governance opacity and enforcement, and operational sustainability) that no current technology or regulatory instrument resolves.

Gap analysis synthesized from the structured survey of industry trends, standards, and literature; presented as findings in the paper.

high negative AI Identity: Standards, Gaps, and Research Directions for AI... coverage of critical identity-related gaps by existing technology and regulation

An evaluation of current technical and regulatory documents against the identity requirements of autonomous agents finds that none adequately address the challenge of governing nondeterministic, boundary-crossing entities.

Document review / evaluation reported in the abstract (structured survey of technical and regulatory documents); specific documents and number reviewed are not specified in the abstract.

high negative AI Identity: Standards, Gaps, and Research Directions for AI... adequacy of technical and regulatory documents for governing autonomous agents

A structural comparison of human and AI identity across four dimensions (substrate, persistence, verifiability, and legal standing) shows that the asymmetry is fundamental and that extending human frameworks to agents without structural modification produces systematic failures.

Authors' structural comparison (analytical/theoretical method) across four dimensions, reported as a core contribution of the paper.

high negative AI Identity: Standards, Gaps, and Research Directions for AI... suitability of human identity frameworks when applied to AI agents

This creates a problem no current infrastructure is equipped to solve: how do you identify, verify, and hold accountable an entity with no body, no persistent memory, and no legal standing?

Authors' gap analysis informed by a structured survey of industry trends, emerging standards, and technical literature; presented as a synthesized conclusion from that survey.

high negative AI Identity: Standards, Gaps, and Research Directions for AI... adequacy of existing infrastructure for identity, verification, and accountabili...

Before the AI transition, editors should tighten acceptance standards to curb rent-dissipating author polishing.

Optimal policy characterization in the model for the regime where AI capability is below the critical threshold; derived analytically under model assumptions.

high negative Buying the Right to Monitor: Editorial Design in AI-Assisted ... editorial acceptance standards (policy intensity) as a response to author polish...

When AI capability crosses a critical threshold, reviewer effort collapses discontinuously.

Analytical result proved within the paper's three-sided equilibrium model; threshold and collapse derived theoretically (no empirical sample).

high negative Buying the Right to Monitor: Editorial Design in AI-Assisted ... reviewer effort (level of evaluative effort exerted by reviewers)

Generative AI acts as a disruptive technological shock to evaluative organizations.

Stated as the motivating premise and developed throughout via a theoretical three-sided equilibrium model in the paper; no empirical sample reported (the claim is supported by model construction and analysis).

high negative Buying the Right to Monitor: Editorial Design in AI-Assisted ... disruption to evaluative organizations (change in organizational evaluative proc...

The framework addresses emerging tensions captured in the Creativity Paradox, whereby GenAI may weaken intrinsic motivation, conceptual risk-taking, and evaluative depth.

Theoretical extension of paradox theory and conceptual discussion of potential negative effects; presented as conceptual risks rather than empirically demonstrated outcomes.

high negative Beyond the Creativity Paradox: A Theory-informed Framework f... intrinsic motivation, conceptual risk-taking, evaluative depth

Making AI usable can thus make procedures easier for future governments to learn and exploit.

Synthesis concluding claim based on the paper's formal model and argumentation (theoretical; no empirical testing reported).

high negative AI Governance under Political Turnover: The Alignment Surfac... ease with which future governments can learn and exploit administrative procedur...

The model shows why expansions in AI use may be difficult to unwind.

Analytical conclusion from the paper's formal model (theoretical argument without empirical sample).

high negative AI Governance under Political Turnover: The Alignment Surfac... persistence/irreversibility of AI adoption (difficulty of unwinding expansions)

The model explains why reforms that initially improve oversight can later increase that vulnerability.

Analytical/theoretical result from the paper's formal model (presented as an explanation; no empirical data).

high negative AI Governance under Political Turnover: The Alignment Surfac... long-run effect of oversight-improving reforms on system vulnerability

The model shows when these systems become vulnerable to strategic use from within government.

Analytical result derived from the paper's formal theoretical model (no empirical validation reported).

high negative AI Governance under Political Turnover: The Alignment Surfac... vulnerability of automated systems to strategic internal use

The compliance layer can also create a stable approval boundary that political successors learn to navigate while preserving the appearance of lawful administration.

Stated conclusion/insight from the paper's formal argument and conceptual framing (theoretical, no empirical sample).

high negative AI Governance under Political Turnover: The Alignment Surfac... creation of a stable approval boundary exploitable by successive governments

Self-assessment is a key bottleneck for market-style coordination of AI agents.

Conclusion drawn from empirical results (miscalibration findings, auction divergence, modest improvement from prior-information intervention) reported in the paper.

high negative MarketBench: Evaluating AI Agents as Market Participants importance of self-assessment calibration for successful market coordination

Auctions built from these self-reports diverge from a full-information allocation.

Simulation or empirical auction experiments using self-reported signals from the six LLMs on the 93 tasks, compared to a full-information allocation benchmark (method described in paper).

high negative MarketBench: Evaluating AI Agents as Market Participants difference between allocations produced by auctions using self-reports and full-...

These LLMs are miscalibrated on both success probability and token usage.

Empirical evaluation of six LLMs on 93 SWE-bench Lite tasks assessing calibration of predicted success probabilities and token usage (as reported in the paper).

high negative MarketBench: Evaluating AI Agents as Market Participants calibration of self-reported success probability and token usage
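
To make "miscalibrated on success probability" concrete, here is a minimal sketch of two common calibration measures: the Brier score and a binned expected calibration error. The listing does not say which metrics the paper uses, so this is an illustration under that assumption and not the authors' method. The function names and the sample self-reports are hypothetical.

```python
from typing import Sequence


def brier_score(predicted: Sequence[float], succeeded: Sequence[bool]) -> float:
    """Mean squared gap between a stated success probability and the 0/1 outcome."""
    if len(predicted) != len(succeeded) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - float(s)) ** 2 for p, s in zip(predicted, succeeded)) / len(predicted)


def expected_calibration_error(
    predicted: Sequence[float], succeeded: Sequence[bool], n_bins: int = 5
) -> float:
    """Weighted mean of |average confidence - observed success rate| over equal-width bins."""
    if len(predicted) != len(succeeded) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    bins = [[] for _ in range(n_bins)]
    for p, s in zip(predicted, succeeded):
        # A probability of exactly 1.0 goes into the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, float(s)))
    total = len(predicted)
    error = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(p for p, _ in members) / len(members)
        success_rate = sum(s for _, s in members) / len(members)
        error += len(members) / total * abs(mean_confidence - success_rate)
    return error


# Hypothetical self-reports: an agent states 90% confidence on four tasks
# but solves only one of them, i.e. it is overconfident.
stated = [0.9, 0.9, 0.9, 0.9]
outcome = [True, False, False, False]

print(brier_score(stated, outcome))
print(expected_calibration_error(stated, outcome))
```

A well-calibrated agent would score near zero on both measures. Token-usage calibration would need a different error measure, such as the relative gap between predicted and actual token counts, which is not shown here.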

Each new task domain requires painstaking, expert-driven harness engineering: designing the prompts, tools, orchestration logic, and evaluation criteria that make a foundation model effective.

Author assertion in the paper's introduction/abstract describing the state of practice; no empirical method, dataset, or sample size reported in the excerpt.

high negative The Last Harness You'll Ever Build need for human (expert) harness engineering

Ungoverned coupling between humans and AI can produce fragility, lock-in, polarization, and domination basins.

Theoretical/modeling analysis showing destabilizing dynamics and multiple basins of attraction when governance regularization is absent or weak; no empirical sample.

high negative A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism,... fragility, lock-in, polarization, and domination outcomes in the dynamical model

Classical robot ethics framed around obedience (e.g. Asimov's laws) is too narrow for contemporary AI systems.

Literature synthesis and conceptual argument drawing on developments in adaptive, generative, embodied, and embedded AI; no empirical sample reported.

high negative A Co-Evolutionary Theory of Human-AI Coexistence: Mutualism,... adequacy of obedience-based ethical framing for contemporary AI

Industry digital maturity weakens the effect of the peer leader on a focal firm’s AI adoption.

Interaction/heterogeneity analysis in fixed-effects regression models on panel data of publicly listed Chinese firms (2012–2023), using an industry digital maturity moderator.

high negative Following the Herd or the Bellwether: Peer Effects in Firms’... focal firm AI adoption level (moderated by industry digital maturity for peer le...

Current evaluation proxies are insufficient for predicting downstream human impact.

Empirical results in the paper showing decoupling between standard quantitative proxies (e.g., sparsity, faithfulness) and human outcomes (clarity, decision utility, confidence) across datasets and analyst reviews.

high negative Rethinking XAI Evaluation: A Human-Centered Audit of Shapley... predictive validity of quantitative evaluation proxies for human impact

A highlighting policy that is optimal for sophisticated agents can perform arbitrarily poorly when deployed to naive agents.

Constructive worst-case examples and theoretical bounds in the paper demonstrating arbitrarily large performance degradation when applying sophisticated-optimal policies to naive agents.

high negative Algorithmic Feature Highlighting for Human-AI Decision-Makin... performance (loss in decision quality) of highlighting policies when agent type ...

Optimizing highlighting for sophisticated agents can be computationally intractable, even in simple discrete and binary settings.

Theoretical complexity results and proofs in the paper showing hardness of the optimization problem under the sophisticated-agent model; no sample/calibration required (formal/algorithmic analysis).

high negative Algorithmic Feature Highlighting for Human-AI Decision-Makin... computational tractability of the highlighting optimization problem

Ethical concerns—such as transparency, explainability, psychological effects, and responsible AI governance—are critical factors influencing employability outcomes.

Review synthesis highlighting ethical issues from empirical and industry literature as influential on employability outcomes.

high negative The Impact of AI on Employability and Evolving Job Roles of ... ethical concerns' impact on employability
