Evidence (3470 claims)
Claims by category:
- Adoption: 7395
- Productivity: 6507
- Governance: 5877
- Human-AI Collaboration: 5157
- Innovation: 3492
- Org Design: 3470
- Labor Markets: 3224
- Skills & Training: 2608
- Inequality: 1835
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 609 | 159 | 77 | 736 | 1615 |
| Governance & Regulation | 664 | 329 | 160 | 99 | 1273 |
| Organizational Efficiency | 624 | 143 | 105 | 70 | 949 |
| Technology Adoption Rate | 502 | 176 | 98 | 78 | 861 |
| Research Productivity | 348 | 109 | 48 | 322 | 836 |
| Output Quality | 391 | 120 | 44 | 40 | 595 |
| Firm Productivity | 385 | 46 | 85 | 17 | 539 |
| Decision Quality | 275 | 143 | 62 | 34 | 521 |
| AI Safety & Ethics | 183 | 241 | 59 | 30 | 517 |
| Market Structure | 152 | 154 | 109 | 20 | 440 |
| Task Allocation | 158 | 50 | 56 | 26 | 295 |
| Innovation Output | 178 | 23 | 38 | 17 | 257 |
| Skill Acquisition | 137 | 52 | 50 | 13 | 252 |
| Fiscal & Macroeconomic | 120 | 64 | 38 | 23 | 252 |
| Employment Level | 93 | 46 | 96 | 12 | 249 |
| Firm Revenue | 130 | 43 | 26 | 3 | 202 |
| Consumer Welfare | 99 | 51 | 40 | 11 | 201 |
| Inequality Measures | 36 | 105 | 40 | 6 | 187 |
| Task Completion Time | 134 | 18 | 6 | 5 | 163 |
| Worker Satisfaction | 79 | 54 | 16 | 11 | 160 |
| Error Rate | 64 | 78 | 8 | 1 | 151 |
| Regulatory Compliance | 69 | 64 | 14 | 3 | 150 |
| Training Effectiveness | 81 | 15 | 13 | 18 | 129 |
| Wages & Compensation | 70 | 25 | 22 | 6 | 123 |
| Team Performance | 74 | 16 | 21 | 9 | 121 |
| Automation Exposure | 41 | 48 | 19 | 9 | 120 |
| Job Displacement | 11 | 71 | 16 | 1 | 99 |
| Developer Productivity | 71 | 14 | 9 | 3 | 98 |
| Hiring & Recruitment | 49 | 7 | 8 | 3 | 67 |
| Social Protection | 26 | 14 | 8 | 2 | 50 |
| Creative Output | 26 | 14 | 6 | 2 | 49 |
| Skill Obsolescence | 5 | 37 | 5 | 1 | 48 |
| Labor Share of Income | 12 | 13 | 12 | — | 37 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
Filtered to: Org Design
Without parallel investment in digital literacy, organizational culture, and inter-firm networks, AI will reproduce rather than reduce employment inequalities.
Authors' conclusion drawn from thematic analysis of interviews and conceptual framing; predictive statement based on qualitative findings.
AI adoption in peripheral economies is not a purely technological or financial challenge but a social and human capital challenge, embedded in a biocultural environment shaped by brain drain, institutional thinness, and weak civic intermediation.
Synthesis of interview findings using Bitsani's Biocultural City framework; qualitative evidence from 12 interviews supports this argument.
Knowledge deficits and financial constraints emerge as primary barriers [to AI adoption].
Thematic analysis of the twelve semi-structured interviews, which identified these themes as the primary barriers.
The welfare equivalence property is unique to the Brier score: for every non-Brier strictly proper scoring rule, the welfare gap under smooth C^1 oversight is bounded below by Ω(Var(1/G'') (γ/β)^2).
Mathematical lower-bound result proved in the paper comparing welfare under smooth C^1 oversight for non-Brier scoring rules; the bound is expressed as Ω(Var(1/G'') (γ/β)^2) in the paper.
The impossibility (that non-affine approval undermines truthful reporting) holds for all strictly proper scoring rules, and the paper provides a closed-form perturbation formula.
General theoretical result proved across the class of strictly proper scoring rules, accompanied by a closed-form formula for the perturbation in the paper.
Any non-affine approval makes truthful reporting suboptimal under the combined objective whenever deviation is undetectable — the principal cannot avoid the perturbation that undermines calibration.
Analytical impossibility theorem in the paper's formal model showing that non-affine approvals create incentives for non-truthful reports when deviations are undetectable (mathematical proof).
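For reference, a minimal statement of the standard definitions these results build on, with the paper's bound restated at the end; the symbols $G$, $\gamma$, and $\beta$ are taken from the excerpt without further definition and are otherwise assumptions:

```latex
% Brier score for a reported probability p of a binary outcome y \in \{0,1\}:
S_{\mathrm{Brier}}(p, y) = -(p - y)^2.

% A scoring rule S is strictly proper iff truthful reporting uniquely
% maximizes expected score:
q = \arg\max_{p \in [0,1]} \; \mathbb{E}_{y \sim q}\!\left[ S(p, y) \right]
    \quad \text{for every } q \in [0,1].

% The paper's lower bound on the welfare gap for any non-Brier strictly
% proper rule under smooth C^1 oversight:
\Delta W \;\ge\; \Omega\!\left( \operatorname{Var}\!\left( 1/G'' \right) (\gamma/\beta)^{2} \right).
```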
Current AI tools are not yet mature enough to replace developers.
Conclusion drawn from the controlled experiment and participant feedback comparing AI-assisted vs traditional task-splitting.
Breaking down user stories into actionable tasks is a critical yet time-consuming process in agile software development.
Background/introductory statement in the paper describing the problem motivation; no experimental sample size reported for this claim.
AI adoption deepens the negative indirect effect of CEO–TMT faultlines on green innovation via reduced eco-attention (moderated mediation).
Reported moderated mediation analysis on the panel dataset (35,347 firm-year observations) showing that AI moderates the indirect path from CEO–TMT faultlines to green innovation through eco-attention, making the indirect effect more negative when AI is greater.
AI technology strengthens the negative relationship between CEO–TMT faultlines and eco-attention (AI exacerbates the adverse effect of faultlines on eco-attention).
Moderation/interaction analysis reported in the paper using the same panel dataset (35,347 firm-year observations) indicating a significant interaction between AI adoption and CEO–TMT faultlines on eco-attention.
CEO–TMT faultlines reduce eco-attention (organizational attention to environmental issues).
Direct association reported in the paper from regression/mediation models using the panel dataset (35,347 firm-year observations) showing a negative relationship between CEO–TMT faultlines and eco-attention.
CEO–TMT faultlines negatively affect green innovation through reduced eco-attention.
Empirical mediation analysis on the panel dataset (35,347 firm-year observations, 2010–2023) testing CEO–TMT faultlines -> eco-attention -> green innovation.
AGI (Artificial General Intelligence) is problematic both conceptually and definitionally.
Authorial assertion in the paper stating AGI is problematic as a concept and definition; framed as a conditioning assumption that shapes the subsequent analysis.
The paper argues we should avoid assuming the inevitability of the current situation relating to AI (i.e., the current commercial AI development trajectory is not inevitable).
Authorial methodological claim in the paper's framing/introductory text; presented as a normative methodological stance rather than empirical evidence.
Existing coordination approaches often occupy two extremes: highly structured methods that rely on fixed roles/pipelines assigned a priori, and fully unstructured teams that enable adaptability but suffer inefficiencies like error propagation, inter-agent conflicts, and wasted resources.
Framing/background claim made in the paper (conceptual argument motivating LATTE).
We contribute a non-additive harm decomposition (welfare loss W, coverage loss C) that exposes how attrition shifts harm from the regulator-accountable surface to a regulator-invisible one.
Methodological contribution in the paper: definition of welfare loss W and coverage loss C and analysis showing attrition reallocates observable vs. unobservable harm; supported by theoretical exposition and simulation examples.
An audit-aware OffAuditDrift strategy that exploits Stackelberg commitment defeats both auditor extensions (Periodic-with-floor and history-conditioned suspicion-escalation).
Construction of the OffAuditDrift auditee strategy in the paper and simulation/theoretical demonstration that it can evade both proposed auditor policies by exploiting auditor commitment.
We identify a structural feature of any noise-aware static-auditor design: a cover regime in which coverage gaps and granularity gaps cannot be closed simultaneously (formalized as Observation 1).
Theoretical observation/proposition in the paper (Observation 1) derived from the formal model of continuous auditing under noise-aware static auditing rules.
Regulated systems can delay outcome reporting, drift their reports within plausible noise envelopes, exploit longitudinal sample attrition, and cherry-pick among ambiguous metric definitions.
Specification and enumeration of auditee strategies in the paper (Delay, Drift, Cherry-pick, Attrition, OffAuditDrift); conceptual examples and inclusion in simulator.
Continuous post-deployment compliance audits, mandated by emerging regulations such as the EU AI Act and Digital Services Act, create a class of strategic gaming distinct from the one-shot input/output gaming studied in prior work.
Conceptual and theoretical argument in the paper, motivated by regulatory context; formalization of continuous auditing as a multi-round interaction (T-round Stackelberg game).
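A minimal simulation sketch of the off-audit-drift dynamic described above, assuming a static auditor that commits to a periodic schedule; the horizon, noise envelope, and drift size are illustrative choices, not the paper's parameters:

```python
import random

T = 100            # audit horizon (illustrative)
AUDIT_PERIOD = 10  # auditor commits to checking every 10th round
NOISE = 0.05       # plausible-noise envelope around the true metric
DRIFT = 0.04       # per-round drift applied when no audit is due

def off_audit_drift(t, true_value, drift_so_far):
    """Auditee strategy: report compliantly on audited rounds, drift otherwise."""
    if t % AUDIT_PERIOD == 0:   # audit round: snap back inside the envelope
        return true_value + random.uniform(-NOISE, NOISE), drift_so_far
    drift_so_far += DRIFT       # off-audit round: push the report further out
    return true_value + drift_so_far, drift_so_far

random.seed(0)
true_value, drift = 1.0, 0.0
flags = 0
for t in range(1, T + 1):
    report, drift = off_audit_drift(t, true_value, drift)
    if t % AUDIT_PERIOD == 0 and abs(report - true_value) > NOISE:
        flags += 1              # the static auditor only sees audited rounds

print(f"audited-round flags: {flags}")   # 0: drift is invisible on the audit surface
print(f"final off-audit gap: {drift:.2f}")
```

Because the auditee knows the committed schedule, every audited report lands inside the noise envelope and the auditor raises zero flags, while off-audit reports drift without bound; this is the commitment-exploiting structure the claim describes.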
DePAI entails risks including security, centralization, incentive failure, legal exposure, and the crowding-out of intrinsic motivation, requiring value-sensitive design and continuously adaptive governance.
Risk analysis and conceptual argument in the paper identifying possible failure modes and recommended design/governance responses; no empirical incidence data provided.
The cultural and technical misalignment of the data center and electric power sectors makes coordination difficult.
Analytic claim in the paper describing differing design principles, operational philosophies, and economic incentives as sources of misalignment; presented as conceptual analysis without empirical measurement in the excerpt.
A single hyperscale training campus can draw power comparable to a mid-sized city, driven by one tightly synchronized job whose demand swings by hundreds of megawatts in seconds.
Concrete illustrative assertion in the paper about facility-level power draw and rapid demand swings; no numeric source, dataset, or case-study details provided in the excerpt.
AI training data centers break the load-diversity assumption (that individual loads do not peak in synchrony).
Argumentative claim in the paper asserting that characteristics of AI training workloads violate the load-diversity assumption; no quantitative study included in the excerpt.
Responsible AI research typically focuses on examining the use and impacts of deployed AI systems, and there is currently limited visibility into the pre-deployment decisions to pursue building such systems.
Argument and literature framing presented in the paper based on a scoping review of academic literature, civil society resources, and grey literature.
This concentration can diffuse responsibility and raise the probability of irreversible system-level loss even when local per-action error rates remain low.
Theoretical result/argument from the model linking concentrated decision-energy to increased systemic risk despite low local error rates.
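An illustrative back-of-envelope version of this argument (not the paper's formal model): hold the per-action error rate $\varepsilon$ fixed and let $n$ independent consequential actions concentrate in a single node; the probability of at least one failure approaches certainty.

```latex
P(\text{at least one failure}) = 1 - (1 - \varepsilon)^{n},
\qquad \text{e.g. } \varepsilon = 10^{-3},\; n = 10^{4}
\;\Rightarrow\; 1 - (0.999)^{10^{4}} \approx 1 - e^{-10} \approx 0.99995.
```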
Efficiency pressure, path dependence, scale feedback, and weak boundary constraints concentrate decision-energy in the most efficient node.
Derived from the paper's formal model and argumentation about system dynamics (efficiency and feedback mechanisms); theoretical rather than empirical evidence.
Declining deployment friction changes the safety problem at its root: safety is not only local output correctness or preference alignment, but the control of irreversibility under rising decision density.
Main theoretical argument of the paper; supported by conceptual framing and a formal model that introduces decision-density considerations.
Recent AI systems compress the distance between capability growth and capability deployment.
Conceptual and descriptive claim in the paper's introduction; supported by theoretical argumentation and illustrative examples rather than empirical measurement.
Of these four, integration capacity is the least developed for scientific institutions and the most binding: no improvement in AI tooling can buy it.
Normative/diagnostic claim in the paper about relative scarcity and irreducibility of integration capacity; no empirical measures or sample provided in the excerpt.
Four complements then become scarce and load-bearing for AI-augmented science: verified signal, legitimacy, authentic provenance, and integration capacity (the community's tolerance for delegated cognition).
Theoretical framework proposed by the paper; list of four complements presented as an argument without empirical quantification in the excerpt.
Frontier software engineering agents have saturated short-horizon benchmarks while regressing on the work that constitutes senior engineering: long-horizon, multi-engineer, ambiguous-specification deliverables.
Position asserted in the paper based on literature/benchmark trends and authors' field observations; no original empirical dataset or quantified analysis provided in the paper text excerpt.
Specification discipline, not model capability, is the binding constraint on AI-assisted software dependability.
Synthesis conclusion by the authors based on the multivocal literature review, telemetry findings, conceptual modeling (PRP/SGM), and the four-month pilot evaluation.
These conflicting findings constitute the Productivity-Reliability Paradox (PRP): a systematic phenomenon emerging from non-deterministic code generators and insufficient specification discipline.
Conceptual synthesis and interpretation by the paper's authors, based on the multivocal literature review, telemetry, and experimental evidence summarized above.
Telemetry across 10,000+ developers shows 91% longer code review times.
Observational telemetry data aggregated across >10,000 developers reported in the paper; metric reported is percent increase in review time.
The most rigorous randomized controlled trial (RCT) documents a 19% slowdown for experienced developers.
A single RCT cited in the paper described as the most rigorous trial; result reported as a 19% slowdown for experienced developers. Sample size for the RCT is not provided in the summary statement.
Making LLMs themselves explicitly Bayesian belief-updating engines remains computationally intensive and conceptually nontrivial as a general modeling target.
Stated as a limitation in the paper (conceptual and computational argument); no benchmarks or computational cost measurements reported.
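For contrast, exact Bayesian updating is trivial over a small discrete hypothesis space; the limitation above concerns scaling this to the implicit belief states of an LLM. A minimal sketch, with hypotheses and likelihoods invented for illustration:

```python
# Exact Bayes update over a tiny discrete hypothesis space.
# The point of the limitation above: this is O(|H|) per observation,
# which does not extend tractably to LLM-scale implicit belief states.
priors = {"h1": 0.5, "h2": 0.3, "h3": 0.2}        # P(h), illustrative
likelihoods = {"h1": 0.9, "h2": 0.4, "h3": 0.1}   # P(obs | h), illustrative

unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
evidence = sum(unnormalized.values())              # P(obs)
posterior = {h: v / evidence for h, v in unnormalized.items()}

print(posterior)  # {'h1': ~0.763, 'h2': ~0.203, 'h3': ~0.034}
```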
Keeping humans in the loop can sometimes make the decision worse.
Argumentative/diagnostic statement in the paper (theoretical assertion; no experimental or observational effect sizes reported in the excerpt).
Leaders may believe oversight remains meaningful when it has become ceremonial.
Conceptual warning in the paper about erosion of meaningful oversight (no empirical validation provided in the excerpt).
The central risk is misrecognition: leaders may keep a human-centered story in place after decision-shaping authority has shifted elsewhere (e.g., to AI).
Analytic/diagnostic claim in the paper (conceptual warning; no empirical sample or measured incidence provided).
Reactive leadership approaches paired with automation- or creation-oriented AI use produced breakdowns (reduced effectiveness).
Thematic evidence from interviewees describing instances where reactive leadership combined with high automation-or-creation use led to coordination or accountability breakdowns across the 34 cases.
Suppression bias is the systematic suppression of correct-but-difficult recommendations when clinician capability falls below the execution threshold.
Definition and characterization of a proposed failure mode provided in the paper (conceptual/theoretical).
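One way to formalize this definition (a sketch under assumed notation; the excerpt supplies none): let $r^{*}$ be the correct recommendation, $c$ the clinician's capability, and $\tau(r^{*})$ the execution threshold that $r^{*}$ demands.

```latex
\text{suppress}(r^{*}) \iff c < \tau(r^{*}),
\qquad \text{even when } r^{*} = \arg\max_{r} \; \mathbb{E}\!\left[\text{benefit}(r)\right].
```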
Existing approaches (runtime guardrails, training-time alignment, and post-hoc auditing) treat governance as an external constraint rather than an internalized behavioral principle, leaving agents vulnerable to unsafe and irreversible actions.
Author's conceptual/literature critique presented in the paper (argumentative claim, no empirical sample or experiment reported for this statement).
Boundary conditions limit UCF applicability in contexts requiring human accountability or embodied knowledge.
Author-stated caveat in the abstract identifying contexts (accountability, embodied knowledge) where the framework may not apply; theoretical reasoning, no empirical tests.
Existing frameworks (Transaction Cost Economics and Electronic Markets Hypothesis) cannot explain emerging organizational phenomena like GitHub Copilot’s recursive value creation or AI-mediated expert networks.
Conceptual critique in the position paper using illustrative examples (GitHub Copilot, AI-mediated expert networks); no empirical testing or sample provided.
AI governance, ethical concerns, openness, workforce adjustment, and integration complexity are crucial concerns that managers must consider when implementing AI.
Synthesis of risks and challenges reported across the reviewed literature (paper's discussion/conclusion); no specific counts of studies or empirical measures provided in the abstract.
Conventional managerial practices often struggle with information flow, inefficient workflows, slow decision making, and redundant administrative processes.
Background statement in the paper's introduction / literature review (narrative claim based on surveyed literature); no specific empirical study or sample size reported in the abstract.
In resource-dependent regional economies, AI adoption can transform seasonal industries into continuous economic infrastructure and replace intermediate coordination roles and traditional employment structures.
Illustrative case analysis used in the paper to show how the framework applies to resource-dependent regions; described as an illustrative argument rather than an empirically validated causal estimate in the provided text.
Targeted disruption simulations based on intrinsic technological capability cause a more pronounced decline in the knowledge network than targeted attacks based on topological (structural) baselines.
Simulation experiments on collaboration/knowledge networks constructed from the 282,778-patent dataset comparing network decline under removal strategies: (a) based on intrinsic technological capability vs (b) based on topological centrality baselines.
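A minimal sketch of this style of targeted-removal experiment, assuming networkx; the graph and capability scores below are synthetic stand-ins (on random scores the two curves will not reproduce the paper's ordering, which depends on the real patent-derived capability measure):

```python
import random
import networkx as nx

random.seed(0)
G = nx.barabasi_albert_graph(500, 3)                # stand-in for the knowledge network
capability = {n: random.random() for n in G.nodes}  # stand-in composite capability score

def decline_curve(graph, ranking, steps=50):
    """Fraction of nodes remaining in the giant component as top-ranked nodes are removed."""
    g = graph.copy()
    curve = []
    for node in ranking[:steps]:
        g.remove_node(node)
        giant = max(nx.connected_components(g), key=len)
        curve.append(len(giant) / graph.number_of_nodes())
    return curve

by_capability = sorted(G.nodes, key=capability.get, reverse=True)   # capability-based attack
by_degree = sorted(G.nodes, key=dict(G.degree).get, reverse=True)   # topological baseline

cap_curve = decline_curve(G, by_capability)
deg_curve = decline_curve(G, by_degree)
print(f"giant component after 50 removals: capability={cap_curve[-1]:.2f}, degree={deg_curve[-1]:.2f}")
```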
Some innovators with substantial technological value are not located at the structural center of the collaboration/knowledge network, indicating network position alone may not fully capture technological importance.
Empirical comparison between composite technological capability scores and structural centrality measures across the constructed networks derived from 282,778 Chinese AI patents; reported disconnect between high technological value and topological centrality.