Evidence (13827 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	749	195	97	889	1979
Governance & Regulation	815	391	188	121	1539
Organizational Efficiency	771	189	124	83	1177
Technology Adoption Rate	624	233	123	96	1084
Research Productivity	410	121	56	331	929
Output Quality	466	177	59	47	749
Decision Quality	320	174	75	42	618
Firm Productivity	435	55	88	20	604
AI Safety & Ethics	214	276	65	33	593
Market Structure	178	166	122	24	495
Task Allocation	206	64	70	31	376
Skill Acquisition	165	57	60	17	299
Innovation Output	201	27	41	18	288
Employment Level	105	51	107	13	278
Fiscal & Macroeconomic	131	69	43	26	276
Consumer Welfare	116	63	42	11	232
Firm Revenue	149	46	26	3	224
Inequality Measures	44	122	49	6	221
Task Completion Time	169	29	8	12	219
Worker Satisfaction	89	61	20	12	182
Error Rate	69	91	10	2	172
Regulatory Compliance	76	68	14	5	163
Training Effectiveness	92	19	13	19	145
Wages & Compensation	77	36	25	6	144
Automation Exposure	51	54	22	12	142
Team Performance	86	17	27	9	140
Developer Productivity	94	17	14	6	132
Job Displacement	12	80	20	1	113
Hiring & Recruitment	51	7	8	3	69
Skill Obsolescence	5	45	6	1	57
Creative Output	31	16	7	2	57
Social Protection	27	16	8	2	53
Labor Share of Income	17	17	17	—	51
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1
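
As a reading aid, each direction's share of a row can be computed from its counts. The sketch below is illustrative only: it copies three rows from the matrix above, and the helper name `direction_shares` is an assumption rather than part of the source. Shares are taken over the four listed direction columns instead of the Total column, because some row totals exceed the sum of the four columns shown (for example, the "Other" row).

```python
# Rows copied from the matrix above: (positive, negative, mixed, null).
ROWS = {
    "Firm Productivity": (435, 55, 88, 20),
    "Job Displacement": (12, 80, 20, 1),
    "Error Rate": (69, 91, 10, 2),
}

DIRECTIONS = ("positive", "negative", "mixed", "null")


def direction_shares(counts):
    """Return each direction's share of the four listed counts (hypothetical helper)."""
    total = sum(counts)
    return {name: count / total for name, count in zip(DIRECTIONS, counts)}


for outcome, counts in ROWS.items():
    shares = direction_shares(counts)
    summary = ", ".join(f"{name} {share:.1%}" for name, share in shares.items())
    print(f"{outcome}: {summary}")
```

Comparing shares rather than raw counts makes small categories such as Job Displacement comparable with large ones such as Firm Productivity.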

Knowledge workers become adversarial auditors rather than keystroke-producers.

Projected role-shift based on the verification-bottleneck thesis and interdisciplinary supporting arguments; no empirical longitudinal workforce study reported.

high positive The Instrumental Dissolution of Typing: Why AI Challenges th... dominant work tasks/roles of knowledge workers (generation vs. auditing)

The central contribution identifies the verification bottleneck: as AI collapses production friction, the primary constraint shifts from generation to evaluation.

Theoretical argument supported by literature synthesis across multiple fields; no direct experimental quantification provided.

high positive The Instrumental Dissolution of Typing: Why AI Challenges th... relative constraint: generation vs. evaluation (verification) in knowledge work

We contribute design guidelines for specialized AI and articulate a vision for 'ecosystem-aware' Humble AI.

Paper's stated contributions (design guidelines and conceptual vision) described in the abstract.

high positive Learning from AVA: Early Lessons from a Curated and Trustwor... design guidance / conceptual framework

Qualitatively, participants used AVA as a specialized 'evidence engine'; reasoned abstention clarified scope boundaries, and trust was calibrated through institutional provenance and page-anchored citations.

Qualitative findings from surveys and 20 interviews reported in the paper (participant quotations and thematic analysis implied in abstract).

high positive Learning from AVA: Early Lessons from a Curated and Trustwor... user behavior and trust calibration (use as evidence engine; role of abstention ...

Difference-in-Differences estimates associate sustained engagement with 2.4-3.9 hours saved weekly.

Quantitative claim reported in the paper based on Difference-in-Differences analysis of usage/engagement data from the evaluation (implicit sample drawn from the >2,200 participants).

high positive Learning from AVA: Early Lessons from a Curated and Trustwor... time saved per week
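
For readers unfamiliar with the method, a two-group, two-period Difference-in-Differences estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's specification: the function name `did_estimate` and all numbers are hypothetical and are not data from the AVA evaluation.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences on group means (two groups, two periods)."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The control group's change stands in for what would have happened
    # to the treated group without the intervention.
    return treated_change - control_change


# Hypothetical weekly hours spent on a task (NOT data from the paper).
treated_pre = [10.0, 12.0, 11.0]
treated_post = [7.0, 8.0, 9.0]
control_pre = [10.0, 11.0, 12.0]
control_post = [10.0, 10.5, 11.0]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(f"Estimated change in weekly hours: {effect:+.1f}")
```

A negative estimate here would correspond to hours saved. The estimate is only credible if the two groups would have followed parallel trends without the intervention.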

AVA operationalizes epistemic humility through two mechanisms: citation verifiability (tracing claims to sources) and reasoned abstention (declining unsupported queries with justification and redirection).

Design claim describing implemented mechanisms in the platform; described in the paper as operational features.

high positive Learning from AVA: Early Lessons from a Curated and Trustwor... epistemic humility operationalization (citation verifiability and reasoned abste...

AVA's multi-agent pipeline enables users to query and receive evidence-based syntheses.

System design and capability claim in the paper (description of multi-agent pipeline producing evidence-based syntheses).

high positive Learning from AVA: Early Lessons from a Curated and Trustwor... output: evidence-based syntheses

AVA is a GenAI platform built on a curated library of over 4,000 World Bank Reports with multilingual capabilities.

System description provided in the paper; statement of dataset size and functionality (library count and multilingual support).

high positive Learning from AVA: Early Lessons from a Curated and Trustwor... system corpus size / multilingual capability

The governance architecture (privacy implemented as physics rather than policy, founder-controlled class shares on non-negotiable architectural commitments) is inseparable from the product itself.

Normative and architectural argument in the paper tying governance design choices to product architecture (no empirical validation in this text).

high positive The Continuity Layer: Why Intelligence Needs an Architecture... relationship between governance architecture and AI product architecture

Physics limits now constraining the model layer make the continuity layer newly consequential.

Analytical argument in the paper linking physical constraints on model scaling to increased importance of continuity (no empirical measurement included here).

high positive The Continuity Layer: Why Intelligence Needs an Architecture... relative consequentiality of continuity given physics limits on model scaling

The paper proposes a four-layer development arc for continuity: from external SDK to hardware node to long-horizon human infrastructure.

Design/roadmap proposal described in the manuscript (no empirical testing provided here).

high positive The Continuity Layer: Why Intelligence Needs an Architecture... proposed development pathway for continuity infrastructure

The engineering architecture for continuity is mapped to the theological pattern of kenosis and the symbolic pattern of Alpha and Omega, and the paper argues this mapping is structural rather than merely metaphorical.

Interpretive/mapping argument presented in the paper (theoretical/analogical reasoning).

high positive The Continuity Layer: Why Intelligence Needs an Architecture... conceptual mapping between engineering architecture and symbolic/theological pat...

The paper describes a storage primitive called Decomposed Trace Convergence Memory whose write-time decomposition and read-time reconstruction produce the continuity property.

Design proposal in the manuscript outlining a storage primitive and its read/write behavior (no empirical validation reported here).

high positive The Continuity Layer: Why Intelligence Needs an Architecture... ability of a storage primitive to produce continuity

Continuity is defined in the paper as a system property with seven required characteristics, distinct from memory and from retrieval.

Explicit definitional claim made in the manuscript (enumeration of seven characteristics described).

high positive The Continuity Layer: Why Intelligence Needs an Architecture... conceptual definition/characterization of continuity

A companion paper (arXiv:2604.10981) positions the ATANT framework against existing memory, long-context, and agentic-memory benchmarks.

Citation to a companion paper that reportedly compares frameworks/benchmarks.

high positive The Continuity Layer: Why Intelligence Needs an Architecture... comparative positioning of evaluation frameworks

The formal evaluation framework for the property described here is the ATANT benchmark (arXiv:2604.06710), published separately with evaluation results on a 250-story corpus.

Citation to separate benchmark paper and reported evaluation on a 250-story corpus.

high positive The Continuity Layer: Why Intelligence Needs an Architecture... benchmarking/evaluation of continuity property

Engineering work to build the continuity layer has begun in public.

Statement in the paper asserting publicly visible engineering activity (no specific projects or quantitative audit included in this text).

high positive The Continuity Layer: Why Intelligence Needs an Architecture... public engineering activity toward continuity layer

The continuity layer is the most consequential piece of infrastructure the field has not yet built.

Normative claim/argument in the position paper (no empirical test presented in this text).

high positive The Continuity Layer: Why Intelligence Needs an Architecture... relative infrastructural importance in AI systems

The most important architectural problem in AI is not the size of the model but the absence of a layer that carries forward what the model has come to understand (a "continuity layer").

Position paper argument and conceptual reasoning in the manuscript (no empirical study reported).

high positive The Continuity Layer: Why Intelligence Needs an Architecture... existence/importance of a continuity layer in AI architecture

Code-generating Artificial Intelligence has gained popularity within both professional and educational programming settings over the past several years.

Background statement in the paper's introduction (observational claim about recent trends in AI adoption).

high positive Fast and Forgettable: A Controlled Study of Novices' Perform... adoption/popularity of code-generating AI

The emotional effect of the human teammate was significantly more positive and arousing compared to working with Copilot.

Subjective emotion measures (valence/arousal) collected in the study; reported significant differences favoring human teammate on positivity and arousal (n=22).

high positive Fast and Forgettable: A Controlled Study of Novices' Perform... emotional valence and arousal during task

Several dimensions of participants' workload were significantly reduced when using GitHub Copilot.

Subjective workload measures collected during the experiment; multiple workload dimensions reported as significantly lower in the Copilot condition (n=22).

high positive Fast and Forgettable: A Controlled Study of Novices' Perform... subjective workload (multiple dimensions)

Participants performed significantly better with GitHub Copilot than with their human teammate.

Experimental comparison of task performance between Copilot-assisted individual condition and human pair condition; statistical significance reported in results (sample size n=22).

high positive Fast and Forgettable: A Controlled Study of Novices' Perform... programming performance on timed Python tasks

China leads initiatives in the global governance of AI.

Stated strategic observation in the paper's introduction (no empirical measures provided in the excerpt).

high positive Polarization and Integration in Global AI Research leadership in global AI governance initiatives

The United Kingdom and Germany have integrated exclusively with the US.

Analysis of cross-country collaboration and citation ties in the publication-based network, compared against random models, showing that the UK and Germany integrate exclusively with the US.

high positive Polarization and Integration in Global AI Research international research integration (collaboration/citation) of UK and Germany wi...

Illustrative welfare calculations suggest net gains in the tens of billions annually from the proposed policies/interventions.

Paper reports illustrative welfare calculations (not structural estimates) that yield an aggregate welfare figure described as 'net gains in the tens of billions annually'.

high positive The Inference Bottleneck: A Formal Model of Vertical Foreclo... aggregate welfare gains (annual)

The policy section proposes 'Neutral Inference', a four-pillar conduct framework consisting of QoS parity, routing transparency, FRAND-style non-discrimination, and tier transparency with release-pathway discipline.

Normative policy proposal laid out in the paper's policy section.

high positive The Inference Bottleneck: A Formal Model of Vertical Foreclo... regulatory/conduct framework (Neutral Inference) components

Under logit demand and symmetric rivals, the QoS gap is strictly increasing in inference-quality importance (alpha) and downstream margins.

Comparative statics derived from the analytical model (logit demand, symmetric rivals).

high positive The Inference Bottleneck: A Formal Model of Vertical Foreclo... QoS gap
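
To make the "logit demand" assumption concrete, the sketch below computes standard multinomial logit market shares and shows how a quality-of-service gap, weighted by alpha, lowers a rival's share. It covers only the demand side and does not reproduce the paper's equilibrium characterization. The utility form `base_utility - alpha * qos_gap`, the outside option with utility 0, and the function names are assumptions made for illustration.

```python
import math


def logit_shares(utilities):
    """Multinomial logit shares with an outside option of utility 0."""
    weights = [math.exp(u) for u in utilities]
    denom = 1.0 + sum(weights)
    return [w / denom for w in weights]


def rival_share(alpha, qos_gap, base_utility=1.0):
    """Share of a rival whose inference quality is degraded by qos_gap.

    Assumed utility: base_utility - alpha * qos_gap (hypothetical form).
    """
    # Index 0: the integrated firm's downstream product; index 1: the rival.
    utilities = [base_utility, base_utility - alpha * qos_gap]
    return logit_shares(utilities)[1]


for alpha in (0.5, 1.0, 2.0):
    print(f"alpha={alpha}: rival share = {rival_share(alpha, qos_gap=0.5):.3f}")
```

Under these assumptions, the same QoS gap diverts more demand from the rival when alpha is larger.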

The main theoretical result provides an explicit local equilibrium characterization of the QoS gap under logit demand and symmetric rivals.

Analytical derivation in the formal game-theoretic model assuming logit demand and symmetric rivals; presented as the paper's main theoretical result.

high positive The Inference Bottleneck: A Formal Model of Vertical Foreclo... QoS gap (equilibrium characterization)

An extension motivated by Anthropic's April 2026 release introduces a third mechanism, tier-based access discrimination, parameterized by a tier gap (tau) and partner-exclusivity (kappa).

Model extension in the paper explicitly adds parameters (tau, kappa) to represent tier-based access discrimination; motivated by a contemporaneous product release.

high positive The Inference Bottleneck: A Formal Model of Vertical Foreclo... tier-based access discrimination (parameterized by tau and kappa)

The model isolates two foreclosure mechanisms operating without predatory pricing: quality-of-service (QoS) discrimination against downstream rivals (via latency, throughput, context limits, or feature access) and routing bias in assistant-layer interfaces.

Formal game-theoretic model developed in the paper; mechanisms are derived and described in model set-up and analysis.

high positive The Inference Bottleneck: A Formal Model of Vertical Foreclo... presence of foreclosure mechanisms (QoS discrimination, routing bias)

As generative AI commercializes, competitive advantage is shifting from model training toward inference, distribution, and routing.

Framing/introductory assertion in the paper (conceptual argument, literature synthesis), not an empirical test.

high positive The Inference Bottleneck: A Formal Model of Vertical Foreclo... shift in source of competitive advantage (training -> inference/distribution/rou...

Evaluation demonstrates speed improvements of 6-7 minutes over traditional methods.

Reported empirical timing result in paper abstract: 6-7 minutes (presumably time to validate a change) compared to traditional methods (no further detail or sample size in abstract).

high positive Aether: Network Validation Using Agentic AI and Digital Twin validation time (speed)

Evaluation demonstrates diagnostic coverage of 92-96%.

Reported empirical range in paper abstract (92-96% diagnostic coverage over evaluated cases; specific n not provided in abstract).

high positive Aether: Network Validation Using Agentic AI and Digital Twin diagnostic coverage

Evaluation demonstrates promising results in error detection (100%).

Reported empirical result in paper abstract: 100% error detection over evaluated scenarios (no sample size given in abstract).

high positive Aether: Network Validation Using Agentic AI and Digital Twin error detection rate

By orchestrating agent collaboration atop this digital twin, Aether enables automated, rapid network change validation while reducing manual effort, minimizing errors, and improving operational agility and cost-effectiveness.

High-level claim supported by system design and subsequent empirical evaluation reported in paper (evaluation details referenced in abstract).

high positive Aether: Network Validation Using Agentic AI and Digital Twin automation, manual effort, error rates, operational agility, cost-effectiveness

Aether agents use a unified Network Digital Twin integrating modeling, simulation, and emulation to maintain a consistent, up-to-date network view for verification and testing.

Design claim describing the digital twin's capabilities (modeling, simulation, emulation) as part of the system; presented in paper text.

high positive Aether: Network Validation Using Agentic AI and Digital Twin consistency and freshness of network view for verification/testing

Aether features an agentic architecture with five specialized Network Operations AI agents that collaboratively handle the change validation lifecycle from intent analysis to network verification and testing.

System architecture claim in paper describing five specialized agents (design specification; no empirical sample size).

high positive Aether: Network Validation Using Agentic AI and Digital Twin architectural decomposition into five agents

Aether integrates Generative Agentic AI with a multi-functional Network Digital Twin to automate and streamline network change validation workflows.

Paper describes Aether system design and architecture combining agentic AI and a digital twin (design-level claim; architectural description).

high positive Aether: Network Validation Using Agentic AI and Digital Twin automation/streamlining of change validation workflows

To mitigate the curse of dimensionality in HRL, the paper introduces a capacity-aware state–action encoding mechanism that compresses the control interface into structured summary signals.

Methodological contribution described in the paper: proposed encoding mechanism intended to reduce state-action dimensionality and simplify the control interface.

high positive Omnichannel Supply Chains Amid Demand Shocks: A Centralized ... state-action dimensionality reduction and improved scalability/learning efficien...

The model shows cooperative behaviour supported by reward-punishment schemes that discourage deviations.

Analysis of the learned strategies/behaviour of the simulated deep reinforcement learning agents showing emergence of cooperation enforced via reward-punishment mechanisms (as reported in the paper).

high positive Convergence to collusion in algorithmic pricing presence of cooperative behaviour and mechanisms (reward-punishment) that deter ...

A modern deep reinforcement learning model deployed to price goods in a repeated oligopolistic competition game with continuous prices converges to a collusive outcome in an amount of time that matches empirical observations (under reasonable assumptions on the length of a time step).

Simulation/experiment using a modern deep reinforcement learning model in a repeated oligopoly pricing game with continuous prices; claim that convergence time matches empirical observations. (No sample size, number of runs, or numerical convergence time provided in the excerpt.)

high positive Convergence to collusion in algorithmic pricing time to converge to a collusive pricing outcome

Previous research shows that [pricing] algorithms can exhibit collusive behaviour.

Citation/summary of prior literature (as stated in paper); no specific studies or sample sizes given in the excerpt.

high positive Convergence to collusion in algorithmic pricing occurrence of collusive behaviour by pricing algorithms

A common response to these worries stresses that the goods derived from work can be found elsewhere, often in better activities, suggesting that the proliferation of AI-powered automation does not threaten the meaningfulness of people’s lives.

Description of a commonly offered counterargument in the literature and popular debate (conceptual/literature-summary; no empirical data or sample reported).

high positive Is artificial intelligence a threat to meaningful work and l... argument that non-work activities can replace meaning from work (impact on meani...

Intelligent textile technologies can effectively enhance operational efficiency in the textile industry's supply chain.

Overall result statement summarizing pilot study outcomes (inventory turnover, order fulfillment, cost control) as evidence; no numeric aggregate efficiency measure or sample size provided in the excerpt.

high positive Intelligent Textile Technology-Driven Supply Chain Optimiza... operational efficiency

Intelligent textile technologies can effectively enhance supply chain transparency.

Conclusion based on the pilot study and the inclusion of blockchain-based data sharing in the model; no empirical transparency metrics or sample size reported in the provided text.

high positive Intelligent Textile Technology-Driven Supply Chain Optimiza... supply chain transparency

Intelligent textile technologies can effectively enhance supply chain collaboration.

Conclusion drawn from the pilot study reported in the paper; no quantitative measures of collaboration or supporting statistics provided in the supplied text.

high positive Intelligent Textile Technology-Driven Supply Chain Optimiza... supply chain collaboration

A pilot study demonstrates significant improvements in customer satisfaction.

Reported pilot study in the paper; no details on how customer satisfaction was measured, sample size, or effect size are provided in the supplied text.

high positive Intelligent Textile Technology-Driven Supply Chain Optimiza... customer satisfaction

A pilot study demonstrates significant improvements in cost control.

Reported pilot study in the paper; the summary does not provide numerical cost reductions or sample size.

high positive Intelligent Textile Technology-Driven Supply Chain Optimiza... cost control (operational/cost reductions)

A pilot study demonstrates significant improvements in order fulfillment efficiency.

Reported pilot study in the paper; no sample size, quantitative metrics, or statistical tests reported in the provided text.

high positive Intelligent Textile Technology-Driven Supply Chain Optimiza... order fulfillment efficiency
