Evidence (3470 claims)

Claims by category:

- Adoption: 7395 claims
- Productivity: 6507 claims
- Governance: 5877 claims
- Human-AI Collaboration: 5157 claims
- Innovation: 3492 claims
- Org Design: 3470 claims
- Labor Markets: 3224 claims
- Skills & Training: 2608 claims
- Inequality: 1835 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 609 | 159 | 77 | 736 | 1615 |
| Governance & Regulation | 664 | 329 | 160 | 99 | 1273 |
| Organizational Efficiency | 624 | 143 | 105 | 70 | 949 |
| Technology Adoption Rate | 502 | 176 | 98 | 78 | 861 |
| Research Productivity | 348 | 109 | 48 | 322 | 836 |
| Output Quality | 391 | 120 | 44 | 40 | 595 |
| Firm Productivity | 385 | 46 | 85 | 17 | 539 |
| Decision Quality | 275 | 143 | 62 | 34 | 521 |
| AI Safety & Ethics | 183 | 241 | 59 | 30 | 517 |
| Market Structure | 152 | 154 | 109 | 20 | 440 |
| Task Allocation | 158 | 50 | 56 | 26 | 295 |
| Innovation Output | 178 | 23 | 38 | 17 | 257 |
| Skill Acquisition | 137 | 52 | 50 | 13 | 252 |
| Fiscal & Macroeconomic | 120 | 64 | 38 | 23 | 252 |
| Employment Level | 93 | 46 | 96 | 12 | 249 |
| Firm Revenue | 130 | 43 | 26 | 3 | 202 |
| Consumer Welfare | 99 | 51 | 40 | 11 | 201 |
| Inequality Measures | 36 | 105 | 40 | 6 | 187 |
| Task Completion Time | 134 | 18 | 6 | 5 | 163 |
| Worker Satisfaction | 79 | 54 | 16 | 11 | 160 |
| Error Rate | 64 | 78 | 8 | 1 | 151 |
| Regulatory Compliance | 69 | 64 | 14 | 3 | 150 |
| Training Effectiveness | 81 | 15 | 13 | 18 | 129 |
| Wages & Compensation | 70 | 25 | 22 | 6 | 123 |
| Team Performance | 74 | 16 | 21 | 9 | 121 |
| Automation Exposure | 41 | 48 | 19 | 9 | 120 |
| Job Displacement | 11 | 71 | 16 | 1 | 99 |
| Developer Productivity | 71 | 14 | 9 | 3 | 98 |
| Hiring & Recruitment | 49 | 7 | 8 | 3 | 67 |
| Social Protection | 26 | 14 | 8 | 2 | 50 |
| Creative Output | 26 | 14 | 6 | 2 | 49 |
| Skill Obsolescence | 5 | 37 | 5 | 1 | 48 |
| Labor Share of Income | 12 | 13 | 12 | — | 37 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
Filter: Org Design
A foreign-state-actor threat model for enterprise identity governance, establishing that Silk Typhoon, Salt Typhoon, Volt Typhoon, and North Korean AI-enhanced identity-fraud operations have already operationalized AI identity vulnerabilities as active attack vectors.
Paper claims to provide a threat model and asserts these named actors have operationalized AI identity vulnerabilities; stated grounding implied to be threat intelligence and incident analysis, though not detailed in the excerpt.
Nation-state actors including Silk Typhoon and Salt Typhoon have operationalized ungoverned machine credentials as primary espionage vectors against critical infrastructure.
Asserted in paper and described as grounded in threat intelligence; no specific threats, incidents, or data described in the excerpt.
A single ungoverned automated agent produced $5.4–10 billion in losses in the 2024 CrowdStrike outage.
Statement in paper attributing a $5.4–10B loss to an ungoverned automated agent during the 2024 CrowdStrike outage; no citation or method shown in excerpt.
No integrated framework exists to govern machine identities (AI agents, service accounts, API tokens, automated workflows).
Asserted in paper as a gap in existing governance frameworks; no empirical test or survey reported in the excerpt.
Automated agents, service accounts, API tokens, and automated workflows now outnumber human identities in enterprise environments by ratios exceeding 80 to 1.
Statement in paper (asserted prevalence); no sample size or data source provided in the excerpt.
Foundation-model usage can increase compute-related emissions.
Conceptual/environmental concern highlighted in the paper about the carbon footprint of heavy model use and persistent storage; no quantified emissions analysis or lifecycle assessment presented.
These systems can cause skill atrophy.
Theoretical risk articulated in the paper that reliance on AI assistance may degrade human skills over time; no longitudinal skill-measurement or experimental evidence provided.
The same foundation-model systems can also intensify surveillance.
Cautionary claim in the paper noting the surveillance risk of durable, queryable traces and integrated tooling; presented as a conceptual risk rather than empirically measured increase in surveillance.
Baseline (non-structured) interactions had 16 of 50 (32%) accepted on first pass.
Reported counts in the paper for the baseline group (16 accepted of 50 baseline interactions).
In an observational study of documented interactions across four AI tools (Claude, ChatGPT, Cowork, Codex), incomplete context was associated with 72% of iteration cycles.
Observational study reported in the paper covering interactions across four AI tools; the paper reports the 72% figure.
This combination (rapid but uneven capability advance and lagging knowledge about harms/safeguards) creates a difficult policy condition: governments must decide under uncertainty across multiple plausible technological trajectories through 2030.
Reasoned argument in the article synthesizing foresight scenarios and the literature on uncertainty in AI progress (references to documents like OECD foresight and the International AI Safety Report 2026).
Knowledge about harms, safeguards, and effective interventions remains partial and lagged relative to capability advances.
Analytic claim in the article, supported by cited reports and literature that document gaps in understanding of harms and safeguards.
The expansion of AI in digital health has simultaneously introduced complex governance, privacy, and financial sustainability challenges.
Argument and synthesis across regulatory policy, ethics, and healthcare economics literatures presented in the review (literature review / conceptual synthesis).
Result 2: When managers are short-termist or worker skill has external value, the decision-maker's optimal policy can produce the augmentation trap, leaving the worker worse off than if AI had never been adopted.
Analytical result from the dynamic model comparing planner/objective variations (short-termist manager or externalities) and showing an outcome labeled the 'augmentation trap'.
Result 1: Even a decision-maker who fully anticipates skill erosion rationally adopts AI when front-loaded productivity gains outweigh long-run skill costs, producing steady-state loss: the worker ends up less productive than before adoption.
Analytical result from the dynamic model showing optimal adoption choice can lead to a steady-state where worker productivity is lower than pre-adoption (model-based comparative statics).
Experimental evidence shows that sustained use of AI tools can erode the expertise on which productivity gains depend (deskilling).
Statement in paper referencing experimental studies (no specific study, method, or sample size reported in the excerpt).
Aggressive compression increased total session cost by 67% despite reducing input tokens by 17%, because it shifted interpretive burden to the model's reasoning phase.
Result reported from the controlled experiment comparing log-format conditions; four conditions described but specific number of sessions/replications not provided in the abstract.
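The reported cost reversal can be illustrated with back-of-the-envelope token accounting. A minimal sketch, assuming hypothetical per-token prices and output-token counts (none of these figures are from the paper): when compression strips structure the model must then reconstruct during its reasoning phase, the extra output-side tokens can outweigh the input savings.

```python
# Hypothetical illustration (prices and token counts are assumptions,
# not values from the paper): compressing logs cuts input tokens, but if
# the model spends more reasoning/output tokens reconstructing discarded
# structure, total session cost can rise even though the input shrank.
IN_PRICE, OUT_PRICE = 3e-6, 15e-6   # $/token, assumed pricing

def session_cost(input_tokens, output_tokens):
    return input_tokens * IN_PRICE + output_tokens * OUT_PRICE

baseline = session_cost(100_000, 20_000)     # verbose logs
compressed = session_cost(83_000, 50_200)    # 17% fewer input tokens,
                                             # but more reasoning output
print(f"baseline   ${baseline:.2f}")
print(f"compressed ${compressed:.2f} ({compressed / baseline - 1:+.0%})")
```

With these assumed numbers the session ends up about two-thirds more expensive despite the smaller input, matching the direction of the reported effect.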
The impossibility is structural: transparency, audits, and oversight cannot resolve it without reducing autonomy.
Logical consequence derived from the Accountability Incompleteness Theorem and the formal model; stated directly in the paper.
Accountability Incompleteness Theorem: for any collective whose compound autonomy exceeds the Accountability Horizon and whose interaction graph contains a human-AI feedback cycle, no framework can satisfy all four accountability properties simultaneously.
Central theoretical result stated in the paper; supported by a formal impossibility proof based on the model and axioms.
Agentic AI systems violate the above shared accountability assumption not as an engineering limitation but as a mathematical necessity once autonomy exceeds a computable threshold.
Formal theoretical development in the paper culminating in the Accountability Incompleteness Theorem (mathematical proof based on the introduced formal model and axioms).
OpenAI o3 achieves only 17% of optimal collective performance.
Experimental measurement of collective performance for OpenAI o3 in the paper's multi-agent setup (value reported in abstract; no sample size provided there).
We term this the Logic Monopoly: the agent society's unchecked monopoly over the entire logic chain from planning through execution to evaluation.
Terminology/definition introduced by the authors to describe the conceptual governance problem; definitional claim rather than empirical finding.
When agents from different human principals collaborate at scale, the collective becomes opaque: no single human can observe, audit, or govern the emergent behavior.
Conceptual/analytical claim presented as a security/governance risk in the paper; no empirical study or quantified measurement given in the excerpt.
Participants incentivized for originality incorporate fewer AI suggestions verbatim.
Usage and output-analysis from the pre-registered RCT comparing verbatim incorporation rates of AI suggestions across incentive conditions (no numeric rates provided in excerpt).
Early evidence suggests generative AI increases productivity but does so at the cost of collective diversity, potentially narrowing the set of ideas and perspectives produced.
Statement refers to prior literature/early studies (no specific study, sample size, or method reported in the excerpt).
The study observed errors and limitations in both phases (test generation and refactoring), and manual intervention was necessary at times.
Case study observations reported in the paper describing observed model errors/limitations and instances requiring manual developer intervention.
High-risk agentic systems with untraceable behavioral drift cannot currently satisfy the AI Act's essential requirements.
Authors' legal and normative conclusion based on their regulatory mapping and analysis (argumentative/legal reasoning rather than reported empirical testing).
The paper identifies agent-specific compliance challenges in cybersecurity, human oversight, transparency across multi-party action chains, and runtime behavioral drift.
Author-stated findings from the regulatory mapping and analysis; specific challenge areas listed without reported quantitative measurement.
The EU AI Act (Regulation 2024/1689) regulates these systems through a risk-based framework, but it does not operate in isolation: providers face simultaneous obligations under the GDPR, the Cyber Resilience Act, the Digital Services Act, the Data Act, the Data Governance Act, sector-specific legislation, the NIS2 Directive, and the revised Product Liability Directive.
Legal/regulatory mapping asserted by the authors listing specific EU regulations and directives that impose obligations on providers.
Multiple distinct contexts tend to collapse into one another or 'rot', degrading over time and reducing the utility of efforts to account for context.
Theoretical and empirical claim supported by interviewee reports and the authors' analytic synthesis; presented as observed pattern across cases (qualitative; sample size not specified).
Generative AI tools fail to account for users' context in workplace settings.
Findings from expert interviews reporting concrete examples where tools did not incorporate or respect relevant contextual information; qualitative analysis (sample size not provided in the summary).
Current approaches to account for the contexts in which generative AI technologies are used fall short of users' expectations and needs.
Qualitative empirical study based on expert interviews and analysis of user/developer perspectives (method described as expert interviews; exact sample size not stated in provided summary).
The literature remains fragmented, with limited integrative frameworks to explain how AI-human dynamics and decision-making typologies shape outcomes.
Conclusion drawn from the systematic review and bibliometric analysis of the 627-article corpus as reported in the abstract.
The remaining 26 barriers are carried over from prior digital transformation waves: 22 in amplified form and 4 unchanged.
Comparative coding/classification within the review corpus indicating whether each barrier is novel or carried over, and whether it is amplified versus unchanged.
Three barriers were identified as agentic-specific: error propagation in multi-agent systems, role ambiguity, and accountability diffusion.
Classification of the 29 coded barriers by 'agentic specificity' within the literature review; these three barriers were labeled agentic-specific by the authors.
Occupations whose AI-exposed steps are more dispersed across the production workflow (higher fragmentation) exhibit a substantially lower share of their steps actually executed by AI, conditional on AI exposure share.
Empirical regression analysis controlling for share of AI-exposed steps; uses dataset linking O*NET tasks, human AI exposure assessments, Anthropic Economic Index execution outcomes, and GPT-generated workflow orderings (details in Sections 5.1 and 7).
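The excerpt does not define the fragmentation measure, but a simple proxy conveys the idea. A minimal sketch, assuming a hypothetical `fragmentation` metric (the dispersion of AI-exposed step positions across the ordered workflow, not the paper's actual construction):

```python
# Illustrative proxy only: given an ordered workflow and a boolean mask
# of which steps are AI-exposed, measure how spread out the exposed
# steps are (population std. dev. of their positions, normalized by
# workflow length).  Clustered exposure scores low; dispersed exposure
# scores high.
def fragmentation(exposed_mask):
    """exposed_mask: list of bools, one per ordered workflow step."""
    positions = [i for i, e in enumerate(exposed_mask) if e]
    if len(positions) < 2:
        return 0.0
    mean = sum(positions) / len(positions)
    var = sum((p - mean) ** 2 for p in positions) / len(positions)
    return (var ** 0.5) / len(exposed_mask)

clustered = [True, True, True, False, False, False, False, False]
dispersed = [True, False, False, True, False, False, True, False]
print(fragmentation(clustered), fragmentation(dispersed))
```

Both workflows have three exposed steps out of eight (the "exposure share" held fixed in the regression), but the dispersed pattern scores roughly three times higher on this proxy.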
Treated firms' demand for external capital investment falls by just over $220,000 relative to the control group.
RCT with 515 firms; reported dollar-change in external investment demand between treated and control firms.
Despite faster growth, treated firms do not scale inputs proportionally: their demand for external capital investment falls by 39.5% relative to the control group.
RCT with 515 firms; firms reported external capital demand/investment requests; comparison of investment demand between treatment and control groups.
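The absolute and relative figures can be cross-checked: if the roughly $220,000 fall and the 39.5% fall describe the same treatment-control contrast, they jointly imply the control group's mean external capital demand. A quick sketch (the implied baseline is a derived figure, not one reported in the excerpt):

```python
# Cross-check of the two reported effect sizes: an absolute fall of
# ~$220,000 and a relative fall of 39.5% versus control together imply
# the control-group mean (absolute / relative).
absolute_fall = 220_000   # dollars, as reported
relative_fall = 0.395     # 39.5%, as reported

implied_control_mean = absolute_fall / relative_fall
print(f"implied control-group mean: ${implied_control_mean:,.0f}")
```

The two figures are mutually consistent with a control-group baseline of roughly $557,000 in external capital demand.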
There are macroeconomic risks associated with AI-led unemployment.
Paper's macroeconomic analysis drawing on labor economics and technology adoption research; no quantitative estimates or sample sizes provided in the summary.
Managerial incentives drive premature workforce contraction during AI adoption.
Analytical claim grounded in labor economics and organizational behavior review; the summary indicates examination of managerial incentives but does not report primary empirical tests or sample sizes.
Premature workforce contraction in response to AI adoption foreshadows deeper structural challenges as AI systems mature.
Forward-looking claim based on synthesis of literature and theoretical projection; no empirical quantification or sample provided in the summary.
This pattern of premature workforce reductions reflects longstanding corporate short-termism rather than genuine technological displacement.
The paper's interpretation drawing on labor economics and organizational behavior literature; no empirical study or sample size reported in the summary.
Organizations face mounting pressure to demonstrate immediate returns on AI investments, often through workforce reductions that outpace actual automation capabilities.
Argument in paper citing accelerating AI adoption across sectors and observed managerial responses; no primary dataset or sample size reported in the text.
The interaction between strict algorithmic control and worker counter-strategies leads to persistent limit cycles in strategy frequencies rather than convergence to a stable compliant workforce.
Dynamical systems analysis and simulation trajectories from the EGT model showing limit cycles / oscillatory equilibria in strategy proportions; model-based (no empirical sample).
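The limit-cycle behavior is characteristic of matching-pennies-style monitoring games under replicator dynamics. A minimal sketch, assuming an illustrative 2x2 payoff structure rather than the paper's actual model: strict control pays off only when workers are gaming, while compliance pays off only under strict control, so the two strategy frequencies chase each other indefinitely instead of settling.

```python
# Two-population replicator dynamics for a stylized monitoring game
# (illustrative payoffs, not the paper's model).  Managers play Strict
# vs Trust; workers play Comply vs Counter-strategy.
# A[i][j]: manager payoff, B[i][j]: worker payoff, for manager strategy
# i (0=Strict, 1=Trust) against worker strategy j (0=Comply, 1=Counter).
A = [[-1.0,  1.0],   # Strict: wasted monitoring vs catching gaming
     [ 1.0, -1.0]]   # Trust: rewarded by compliance, exploited by gaming
B = [[ 1.0, -1.0],   # under Strict: complying pays, gaming is punished
     [-1.0,  1.0]]   # under Trust: gaming pays

def step(x, y, dt=0.005):
    """One Euler step; x = share of Strict managers, y = share of compliant workers."""
    f_strict  = A[0][0] * y + A[0][1] * (1 - y)
    f_trust   = A[1][0] * y + A[1][1] * (1 - y)
    f_comply  = B[0][0] * x + B[1][0] * (1 - x)
    f_counter = B[0][1] * x + B[1][1] * (1 - x)
    x += dt * x * (1 - x) * (f_strict - f_trust)
    y += dt * y * (1 - y) * (f_comply - f_counter)
    return x, y

x, y = 0.6, 0.6
trajectory = []
for _ in range(20_000):
    x, y = step(x, y)
    trajectory.append(y)

# Compliance keeps cycling around 0.5 instead of converging.
recent = trajectory[-3000:]
print(f"late-run compliance range: {min(recent):.2f} to {max(recent):.2f}")
```

The interior fixed point at (0.5, 0.5) is a center, not an attractor: the compliance share keeps oscillating at the end of the run, which is the qualitative signature the paper reports.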
The way we're thinking about generative AI right now is fundamentally individual (this appears in how users interact with models, how models are built, how they're benchmarked, and how commercial and research strategies using AI are defined).
Author's observational/descriptive claim supported by argumentative examples (mentions user interaction patterns, model design and benchmarking practices, and commercial/research strategies); no empirical sample or quantitative analysis reported in the excerpt.
The emission-reduction effect of AI innovation is more pronounced for firms located in regions with underdeveloped factor markets.
Heterogeneity (regional subsample/interaction) analysis reported in the paper on the 21,428 firm-year sample, indicating larger AI-related emission reductions in regions with less developed factor markets.
The emission-reduction effect of AI innovation is more pronounced for firms in high-environmental-sensitivity industries.
Heterogeneity (subsample/interaction) analysis in the paper using the 21,428 firm-year observations, showing stronger AI-related emission reductions in industries characterized as high environmental sensitivity.
The emission-reduction effect of AI innovation is more pronounced for enterprises with low supply chain concentration.
Heterogeneity (subsample) analysis reported in the paper using the 21,428 firm-year dataset, comparing effects across firms with different supply chain concentration levels.
Executives' green cognition and government environmental attention together act as internal and external drivers of corporate carbon emission reduction.
Further analysis reported in the paper (moderation/interaction analysis or additional regressions) on the same 21,428 firm-year sample showing these factors strengthen carbon reduction associated with AI innovation.
AI innovation can significantly reduce corporate carbon emission intensity.
Empirical analysis using panel data of 21,428 firm-year observations from Chinese A-share listed manufacturing companies over 2010–2022; result reported in the paper's main regressions (method described as micro-level empirical analysis).