Evidence (4857 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	417	113	67	480	1091
Governance & Regulation	419	202	124	64	823
Research Productivity	261	100	34	303	703
Organizational Efficiency	406	96	71	40	616
Technology Adoption Rate	323	128	74	38	568
Firm Productivity	307	38	70	12	432
Output Quality	260	71	27	29	387
AI Safety & Ethics	118	179	45	24	368
Market Structure	107	128	85	14	339
Decision Quality	177	75	37	19	312
Fiscal & Macroeconomic	89	58	33	22	209
Employment Level	74	34	78	9	197
Skill Acquisition	98	36	40	9	183
Innovation Output	121	12	24	13	171
Firm Revenue	98	35	24	—	157
Consumer Welfare	73	31	37	7	148
Task Allocation	87	16	34	7	144
Inequality Measures	25	76	32	5	138
Regulatory Compliance	54	61	13	3	131
Task Completion Time	89	7	4	3	103
Error Rate	44	51	6	—	101
Training Effectiveness	58	12	12	16	99
Worker Satisfaction	47	33	11	7	98
Wages & Compensation	54	15	20	5	94
Team Performance	47	12	15	7	82
Automation Exposure	27	26	10	6	72
Job Displacement	6	39	13	—	58
Hiring & Recruitment	40	4	6	3	53
Developer Productivity	34	4	3	1	42
Social Protection	22	11	6	2	41
Creative Output	16	7	5	1	29
Labor Share of Income	12	6	9	—	27
Skill Obsolescence	3	20	2	—	25
Worker Turnover	10	12	—	3	25

Filter: Productivity

Broader conclusion: AI has the potential to raise productivity and create value, but without proactive policy the benefits risk being concentrated among skilled workers and firms, exacerbating inequality and regional disparities.

Integrative interpretation drawing on productivity and distributional findings from the 17 studies and theoretical considerations about differential complementarities and adoption patterns.

medium mixed The role of generative artificial intelligence on labor mark... productivity gains and distributional outcomes (inequality, regional disparities...

Whether AI is net job‑creating depends on context (sector, country, policy environment, and workforce skill composition).

Observed heterogeneity across the 17 studies by sectoral setting, country context, and policy environment; studies report differing net employment outcomes depending on these factors.

medium mixed The role of generative artificial intelligence on labor mark... net employment effect (jobs created minus jobs displaced) by context

AI contributes to labor‑market polarization: growth in high‑skill opportunities alongside contraction in many middle- and low‑skill roles.

Comparative synthesis of occupational and wage-composition findings across the 17 studies shows recurring patterns of expansion at the high-skill end and reductions in middle/low-skill employment.

medium mixed The role of generative artificial intelligence on labor mark... occupational composition / wage distribution (polarization indicators)

Expected differential wage pressure: wages are likely to fall for routine/low‑skill occupations and rise or remain stable for high‑skill workers who possess complementary AI skills.

Econometric studies summarized in the review (cross‑sectional and panel regressions) and theoretical consistency with skill-biased technological change (SBTC); the review highlights heterogeneity in findings and limited long‑run causal certainty.

medium mixed The Impact of AI Machine Learning on Human Labor in the Work... wage trajectories by skill level (routine/low‑skill vs high‑skill complementary ...

AI contributes to skills polarization: demand rises for advanced cognitive, digital, and socio‑emotional skills while routine cognitive and manual task demand declines.

Theoretical integration (SBTC), task decomposition studies showing shifts in task demand by skill content, and labour‑market analyses reporting changes in occupational skill mixes; evidence comes from cross‑sectional and panel studies summarized in the review.

medium mixed The Impact of AI Machine Learning on Human Labor in the Work... demand for different skill categories (advanced cognitive/digital/socio‑emotiona...

AI/ML has a dual, sector- and skill-dependent effect on labor: widespread displacement of routine and lower-skilled tasks coexists with augmentation of professional and cognitive work and the creation of new labor forms (gig, platform-mediated, and human–AI hybrid roles).

Systematic synthesis of peer‑reviewed empirical studies, industry and policy reports, task‑based analyses, and firm/establishment case studies across cross‑country and sectoral analyses; empirical approaches include econometric (cross‑sectional and panel) studies linking automation/AI adoption to employment and wages, task decomposition analyses, and surveys of firm adoption and restructuring. The review notes heterogeneity across studies and limited long‑run causal evidence.

medium mixed The Impact of AI Machine Learning on Human Labor in the Work... employment composition and task allocation (displacement of routine/low‑skill ta...

AI technical capability in the U.S. labor market is substantially larger and far more geographically diffuse than visible adoption suggests.

Agent-based simulation that maps thousands of AI tools to a skills taxonomy and a synthetic population representing the U.S. workforce (151 million agents), covering 32,000+ skills and ~3,000 counties; comparison of the Iceberg Index (skills-based exposure) to a visible-adoption wage-share metric.

medium mixed The Iceberg Index: Measuring Workforce Exposure in the AI Ec... difference between skills-based exposure (Iceberg Index) and visible AI-adoption...

Standard policy responses focused on retraining and active labor-market programs are necessary but insufficient to fully offset structural job losses where technological capital (K_T) substitutes broadly for tasks.

Model simulations and policy experiments in the calibrated dynamic model comparing scenarios with aggressive retraining versus structural fiscal/interventionist reforms; discussion of empirical limits from case studies and historical reskilling outcomes.

medium mixed The Macroeconomic Transition of Technological Capital in the... employment recovery and distributional outcomes under alternative policy scenari...

Automation of routine drafting tasks by generative legal AI (GLAI) may reduce demand for junior drafting labor while increasing demand for skilled reviewers, auditors, and legal technologists.

Labor-market reasoning based on task automation literature and illustrative vignettes; no labor-force survey or longitudinal employment data provided.

medium mixed (negative for junior drafting roles, positive for reviewer/technologist roles) Why Avoid Generative Legal AI Systems? Hallucination, Overre... employment demand by role (junior drafters vs. skilled reviewers/auditors/techno...

Roughly half of the projected decline in the labor force participation rate (LFPR) to 55% by 2050 is attributable to AI, equivalent to around 10 million lost jobs.

Authors' decomposition/interpretation of conditional forecast results under the rapid scenario reported in the abstract (ties LFPR decline to job-count equivalents).

medium negative Forecasting the Economic Effects of AI job losses attributable to AI (by 2050, rapid scenario)

Our findings echo observations of pervasive annotation errors in text-to-SQL benchmarks, suggesting quality issues are systemic in data engineering evaluation.

Comparative claim referencing prior observations in text-to-SQL literature and the authors' audit results on ELT-Bench; no new cross-benchmark quantitative analysis reported in the excerpt.

medium negative ELT-Bench-Verified: Benchmark Quality Issues Underestimate A... presence of systemic annotation/benchmark quality issues across data engineering...

The measured machine-equivalent work appeared on no financial statement, workforce report, or government statistical return.

Claim about absence of reporting for the deployment's measured work (asserted in the paper for the deployment case).

medium negative HEWU: A Standardized Framework for Measuring Machine-Generat... reporting/disclosure of machine labor in formal records

Many automotive firms, especially those developing new energy and intelligent vehicles, have suffered financial distress and even exited the market.

Descriptive statement in the paper's introduction/motivation citing observed industry outcomes (financial distress and market exit) among automotive firms focused on NEV and intelligent vehicles.

medium negative The 'Intelligent Trap' in Corporate Finance—A Study Based on... financial distress / market exit

The dominant mechanism behind the performance drop is a collapse of Type2_Contextual issue detection at config_B, consistent with attention dilution in long contexts.

Analysis of issue-type specific detection rates shows Type2_Contextual detection collapses at config_B; interpretation ties this to attention dilution in longer contexts.

medium negative SWE-PRBench: Benchmarking AI Code Review Quality Against Pul... Type2_Contextual issue detection rate

Technological transformation in agentic finance is economically inevitable, and proactive intervention is critically urgent.

Author claim synthesizing the paper's argument and modeling results (normative conclusion based on earlier analysis and assertions, not a validated empirical finding).

medium negative STRENGTHENING FINANCIAL WORKFORCE COMPETITIVENESS: A CURRICU... likelihood of technology-driven structural change in the finance workforce

Our findings surface practical limits on the complexity people can manage in human-AI negotiation.

Synthesis claim based on the empirical study varying number of issues and observed decline in performance beyond three issues; presented as a conceptual/practical implication of the results.

medium negative From Overload to Convergence: Supporting Multi-Issue Human-A... maximum manageable negotiation complexity (number of issues before performance d...

TDD (test-driven development) prompting alone increased regressions to 9.94%.

Empirical result reported in the paper comparing a TDD prompting intervention against other workflows on the benchmark (values given in the excerpt).

medium negative TDAD: Test-Driven Agentic Development - Reducing Code Regres... regression rate (percentage of tests that regressed) under TDD prompting
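
The regression-rate outcome above can be illustrated with a minimal sketch. It assumes that a test "regressed" if it passed before a change and fails (or is missing) afterwards, and that the denominator is the set of previously passing tests. Both choices are assumptions made for illustration; the excerpt does not state the paper's exact definition. The function name and the test names are hypothetical.

```python
def regression_rate(before, after):
    """Return the percentage of previously passing tests that no longer pass.

    before, after: dicts mapping a test name to True (passed) or False (failed).
    Assumption: a test absent from `after` counts as failing.
    """
    previously_passing = [name for name, passed in before.items() if passed]
    if not previously_passing:
        return 0.0  # nothing could regress
    regressed = [name for name in previously_passing if not after.get(name, False)]
    return 100.0 * len(regressed) / len(previously_passing)


# Hypothetical test results before and after an agent's patch.
before = {"test_login": True, "test_logout": True, "test_export": False, "test_search": True}
after = {"test_login": True, "test_logout": False, "test_export": True, "test_search": True}
rate = regression_rate(before, after)
```

Under this definition a patch can raise the resolution rate (here `test_export` now passes) while still introducing a regression (`test_logout`), which is why the two metrics are reported separately.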

Current benchmarks focus almost exclusively on resolution rate, leaving regression behavior under-studied.

Paper's critique of existing benchmark literature and practices (asserted by authors in background; no specific benchmark survey details in the excerpt).

medium negative TDAD: Test-Driven Agentic Development - Reducing Code Regres... coverage of regression measurement in existing benchmarks

The paper identifies five structural challenges arising from the memory governance gap: memory silos across agent workflows; governance fragmentation across teams and tools; unstructured memories unusable by downstream systems; redundant context delivery in autonomous multi-step executions; and silent quality degradation without feedback loops.

Qualitative analysis and problem framing presented in the paper (authors' identification of five specific challenges).

medium negative Governed Memory: A Production Architecture for Multi-Agent W... presence/identification of five structural governance challenges

AI raises managerial cognitive complexity and creates recurring tensions between algorithmic optimisation and systemic, ethical reasoning.

Theoretical synthesis highlighting emergent tensions from integrating computational optimisation with systems thinking and ethical considerations; conceptual, no empirical tests.

medium negative Comparative analysis of strategic vs. computational thinking... managerial cognitive complexity and frequency/severity of optimisation vs ethica...

Underprovision of verification is likely if left to market forces because information quality has positive externalities and misinformation imposes negative externalities, justifying public funding, subsidies, or regulation.

Economic reasoning and policy implications drawn from the study's findings and the literature on public goods/externalities.

medium negative Fact-Checking Platforms in the Middle East: A Comparative St... level of provision of verification services relative to social optimum

Censorship, restricted data flows, and government interference fragment markets, limit economies of scale, and favor well-resourced, internationally connected actors—widening capacity gaps.

Interpretive economic analysis grounded in observed access constraints and comparative case material across the three platforms.

medium negative Fact-Checking Platforms in the Middle East: A Comparative St... market fragmentation and distribution of capacity among actors

Limited data access and censorship reduce the efficacy of AI tools by creating training and validation gaps; legal risks complicate use of proprietary platforms and cloud services.

Interviews describing constraints on data availability and legal/operational barriers to using some platforms and cloud services; interpretive analysis of implications for AI training/validation.

medium negative Fact-Checking Platforms in the Middle East: A Comparative St... AI tool effectiveness (training/validation quality) and deployability

Generative AI increases the volume and sophistication of misinformation (deepfakes, fabricated documents), raises false-positive risks, and can be weaponized by state or nonstate actors.

Interview accounts and qualitative analysis noting observed or anticipated misuse of generative models and associated verification challenges.

medium negative Fact-Checking Platforms in the Middle East: A Comparative St... misinformation volume/sophistication and verification error risk

Resource constraints—limited staff time, funding, and technical capacity—are recurring operational challenges for these platforms.

Staff and stakeholder interviews plus analysis of organizational reports indicating staffing, funding, and technical limitations.

medium negative Fact-Checking Platforms in the Middle East: A Comparative St... staffing levels, funding availability, technical capacity

Platforms experience difficulty building and retaining audience trust and engagement, especially in contexts of high public skepticism or polarization.

Interview data from platform staff describing audience engagement challenges, supported by analysis of audience-focused platform formats and community-reporting strategies.

medium negative Fact-Checking Platforms in the Middle East: A Comparative St... audience trust and engagement levels

Platforms face limited or asymmetric access to primary data sources such as platform APIs, state data, and archives.

Interview accounts and document analysis noting restricted API access and barriers to state-held data and archives across the three cases.

medium negative Fact-Checking Platforms in the Middle East: A Comparative St... access to primary data sources

Censorship and legal risks constrain reporting and distribution for these fact-checking platforms.

Consistent reports from interview subjects and corroborating document analysis indicating legal/censorship-related limitations on publishing and distribution.

medium negative Fact-Checking Platforms in the Middle East: A Comparative St... reporting frequency, distribution channels, and content choices

Political instability, legal pressure, and censorship strongly shape what platforms can investigate, publish, and access in the region.

Thematic findings from semi-structured interviews with platform staff and document analysis of public reports and policy statements across the three country cases.

medium negative Fact-Checking Platforms in the Middle East: A Comparative St... ability to investigate, publish, and access information

Investments in alignment interventions (pluralistic evaluation, transparency) produce public‑good benefits that private firms may underinvest in absent regulation, standards, or procurement incentives.

Economic reasoning about public goods and incentives, supported by conceptual synthesis of firm behavior literature, not by original empirical investment data.

medium negative LLM Alignment should go beyond Harmlessness–Helpfulness and ... level of private investment in alignment interventions relative to socially opti...

Misalignment generates negative externalities (misinformation, biased decisions, harms to vulnerable groups) that markets may underprovide solutions for, motivating public‑interest interventions.

Economic argumentation and literature synthesis on externalities and public goods; supported by referenced examples in prior work though not quantified here.

medium negative LLM Alignment should go beyond Harmlessness–Helpfulness and ... social harms/externalities associated with misaligned LLM deployments (e.g., mis...

AI can augment measurement (e.g., collaboration patterns, output tracking) but if poorly designed may reinforce visibility biases that disadvantage remote workers.

Theoretical reasoning and literature citations about algorithmic bias and monitoring; illustrated with secondary examples rather than primary empirical tests.

medium negative The Sociology of Remote Work and Organisational Culture: How... measurement bias; differential visibility; career impacts for remote workers

Hybrid arrangements can exacerbate inequities in access to informal networks and career advancement, often privileging co-located or better-networked employees.

Theoretical integration of sociological and management studies with comparative case illustrations; secondary data examples referenced but no new causal empirical tests reported.

medium negative The Sociology of Remote Work and Organisational Culture: How... access to informal networks; promotion/career advancement rates

Hybrid and remote work create risks of professional invisibility, fragmented social networks, and unequal access to workplace social capital.

Literature synthesis and illustrative case studies drawn from secondary sources; qualitative/comparative case evidence rather than primary quantitative data.

medium negative The Sociology of Remote Work and Organisational Culture: How... professional visibility; social network cohesion; access to workplace social cap...

Proliferation of highly autonomous cyber-capable agents (HACCAs) increases negative externalities and public-good failure risks, meaning private markets will underinvest in mitigation absent public intervention.

Public-goods and externality economic theory applied to cybersecurity; policy analysis (qualitative).

medium negative Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... level of private investment in collective security measures and need for public ...

Widespread HACCA availability compresses the capability gap between resource-rich and resource-poor actors, empowering criminal groups and smaller states and concentrating harms in less-protected sectors and geographies.

Diffusion and strategic externalities analysis; scenario reasoning about capability democratization (qualitative).

medium negative Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... measures of capability inequality across actors and incidence of harms in less-p...

Firms will shift investment toward cybersecurity and away from other productive uses; small and medium enterprises (SMEs) will be disproportionately affected due to limited defenses.

Investment-allocation reasoning and distributional analysis of firm capabilities (qualitative; no firm-level panel data).

medium negative Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... share of firm investment in cybersecurity vs. other capital expenditure; relativ...

Cyber insurance markets will face increased premium pressure and uncertainty; insurers may raise prices, restrict coverage, or withdraw from some lines.

Economic analysis of risk pricing under higher uncertainty and tail risks; analogy to prior insurance market reactions to emerging risks (qualitative).

medium negative Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... insurance premiums, coverage restrictions, and market participation in cyber ins...

Automation lowers fixed and marginal costs of conducting high-skill cyber operations, changing the supply-side economics and enabling a rapid expansion in the number of attackers.

Cost-structure reasoning about automation effects on labor and tool costs; conceptual economic analysis (no empirical cost data provided).

medium negative Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... cost per attack and resulting number of attackers or attack frequency

Widespread diffusion of HACCAs will raise the baseline cyber threat and reduce the monopoly of advanced states and groups on high-end offensive capabilities.

Capability diffusion assessment and historical analogies to proliferation of technologies (qualitative; no large-scale empirical diffusion model).

medium negative Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... distribution of offensive cyber capability across actor types

HACCAs would intensify interstate cyber competition by increasing operational tempo and reducing attribution certainty, complicating deterrence and crisis management.

Strategic scenario analysis and expert judgment linking automation features (speed, scale, opacity) to deterrence and attribution challenges (qualitative).

medium negative Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... operational tempo of interstate cyber actions and accuracy/certainty of attribut...

Automation via HACCAs lowers the barrier to entry for conducting sophisticated cyber operations, enabling criminal groups, non-state actors, and less-resourced states to perform high-tier attacks.

Economic reasoning about fixed and marginal cost reductions, capability-diffusion analysis, and analogy to automation in other domains (qualitative; no empirical cost-study sample).

medium negative Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... number/proportion of actor-types capable of conducting high-skill cyber operatio...

HACCAs would sustain operations using five core operational tactics: autonomous infrastructure setup; credential and access harvesting; advanced detection evasion; adaptive shutdown-avoidance; and operational persistence and scaling.

Attack-lifecycle mapping, review of APT case studies, and red-team threat-modeling to extrapolate automated equivalents of human-led tactics (qualitative categorization).

medium negative Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... presence and effectiveness of the five operational tactics in HACCA-driven campa...

HACCAs would materially change the threat environment by enabling top-tier offensive cyber operations to be automated and widely proliferable, creating large strategic, economic, and systemic security risks.

Scenario-based forecasting, capability-trajectory assessment, review of APT case studies, and threat-modeling/red-team reasoning (qualitative synthesis; no large-n empirical quantification).

medium negative Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... magnitude of change in cyber threat environment (proliferation and automation of...

Counterfactual simulations show that modest salary increases have a smaller effect on predicted attrition than eliminating overtime (in this dataset and model).

Comparative counterfactual experiments run on the calibrated logistic model: simulations altering salary vs. altering overtime feature; reported that overtime elimination outperforms modest pay increases in retained headcount and probability reductions (exact salary-change amounts and comparative numbers not given in the summary).

medium negative Explainable AI for Employee Retention in Green Human Resourc... change in predicted attrition probability and aggregated retained headcount unde...

In the dataset used, eliminating overtime could retain about 31 employees, a larger effect than modest salary increases.

Aggregated counterfactual simulation on the IBM HR Analytics dataset: after setting overtime to zero for applicable records, the model-predicted net retained headcount ≈ 31; compared to simulations of modest salary increases which yielded smaller retained headcount (exact salary-change magnitude and headcount numbers not provided).

medium negative Explainable AI for Employee Retention in Green Human Resourc... predicted retained headcount (number of employees whose attrition probability fa...

Eliminating overtime could lower predicted attrition probability by 17.35% for affected employees (per the model's counterfactual simulation).

Counterfactual policy simulation using the calibrated logistic model on the IBM HR Analytics dataset: set overtime feature to zero for affected employees and compute change in each employee's calibrated attrition probability; reported average reduction = 17.35%.

medium negative Explainable AI for Employee Retention in Green Human Resourc... change in calibrated predicted attrition probability (percentage point reduction...
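
The counterfactual procedure described in the three overtime claims above (set the overtime feature to zero for affected employees, then compare predicted attrition probabilities) can be sketched as follows. This is an illustration under stated assumptions, not the study's model: the coefficients, feature names, and employee records are hypothetical, and counting an employee as "retained" when the predicted probability crosses a 0.5 threshold is an assumed rule, since the summary does not give the exact definition.

```python
import math

# Hypothetical coefficients for an illustrative logistic attrition model.
INTERCEPT = -0.5
COEF = {"overtime": 1.4, "monthly_income_k": -0.15, "years_at_company": -0.05}


def attrition_probability(employee):
    """Predicted attrition probability from the logistic model."""
    z = INTERCEPT + sum(COEF[name] * employee[name] for name in COEF)
    return 1.0 / (1.0 + math.exp(-z))


def overtime_counterfactual(employees, threshold=0.5):
    """Set overtime to 0 for employees who work overtime and compare predictions.

    Returns (mean probability reduction among affected employees,
             number of affected employees whose prediction drops below threshold).
    """
    affected = [e for e in employees if e["overtime"] == 1]
    reductions = []
    retained = 0
    for e in affected:
        before = attrition_probability(e)
        after = attrition_probability({**e, "overtime": 0})  # counterfactual record
        reductions.append(before - after)
        if before >= threshold > after:  # assumed "retained" rule
            retained += 1
    mean_reduction = sum(reductions) / len(reductions) if reductions else 0.0
    return mean_reduction, retained


# Hypothetical employee records.
employees = [
    {"overtime": 1, "monthly_income_k": 2, "years_at_company": 1},
    {"overtime": 1, "monthly_income_k": 8, "years_at_company": 10},
    {"overtime": 0, "monthly_income_k": 5, "years_at_company": 4},
]
mean_reduction, retained = overtime_counterfactual(employees)
```

A salary counterfactual would follow the same pattern, changing `monthly_income_k` instead of `overtime`, which allows the two interventions to be compared on the same model. Results of this kind describe the model's predictions, not causal effects.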

Traditional STP showed a 67% performance decline after six months in unstable market conditions.

Empirical observation reported in the study—likely derived from simulation scenarios and/or longitudinal analysis of behavioral data; precise data source (simulation vs. observed field data), statistical tests, and sample framing are not specified in the summary.

medium negative The Algorithmic Canvas: On the Autopoietic Redefinition of S... effectiveness/performance of traditional STP over time (decline over six months ...

The persistence of interpretive, human-in-the-loop evaluation implies ongoing labor requirements (annotation, sense-making, governance roles), affecting forecasts of automation and labor substitution in sectors adopting LLMs.

Interview reports describing continued manual work for evaluation tasks across participants; authors draw implications for labor demand.

medium negative Results-Actionability Gap: Understanding How Practitioners E... continued human labor requirements for evaluation

Environmental and informational externalities from AI (energy use, privacy harms, bias) justify regulatory and Pigouvian-style interventions to correct market failures.

Conceptual and policy literature reviewed, combined with empirical observations about environmental impacts and privacy/bias incidents reported in prior studies; the paper does not provide new causal estimates of externality magnitudes.

medium negative The Evolution and Societal Impact of Artificial Intelligence... externality magnitudes (environmental costs, privacy/bias harms) and welfare eff...
