Evidence (3062 claims)
- Adoption: 5227 claims
- Productivity: 4503 claims
- Governance: 4100 claims
- Human-AI Collaboration: 3062 claims
- Labor Markets: 2480 claims
- Innovation: 2320 claims
- Org Design: 2305 claims
- Skills & Training: 1920 claims
- Inequality: 1311 claims
Evidence Matrix
Claim counts by outcome category and direction of finding. A dash ("—") marks cells with no recorded claims; for some categories the row total exceeds the sum of the four direction columns, which suggests some claims were logged without a coded direction.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 373 | 105 | 59 | 439 | 984 |
| Governance & Regulation | 366 | 172 | 115 | 55 | 718 |
| Research Productivity | 237 | 95 | 34 | 294 | 664 |
| Organizational Efficiency | 364 | 82 | 62 | 34 | 545 |
| Technology Adoption Rate | 293 | 118 | 66 | 30 | 511 |
| Firm Productivity | 274 | 33 | 68 | 10 | 390 |
| AI Safety & Ethics | 117 | 178 | 44 | 24 | 365 |
| Output Quality | 231 | 61 | 23 | 25 | 340 |
| Market Structure | 107 | 123 | 85 | 14 | 334 |
| Decision Quality | 158 | 68 | 33 | 17 | 279 |
| Fiscal & Macroeconomic | 75 | 52 | 32 | 21 | 187 |
| Employment Level | 70 | 32 | 74 | 8 | 186 |
| Skill Acquisition | 88 | 31 | 38 | 9 | 166 |
| Firm Revenue | 96 | 34 | 22 | — | 152 |
| Innovation Output | 105 | 12 | 21 | 11 | 150 |
| Consumer Welfare | 68 | 29 | 35 | 7 | 139 |
| Regulatory Compliance | 52 | 61 | 13 | 3 | 129 |
| Inequality Measures | 24 | 68 | 31 | 4 | 127 |
| Task Allocation | 71 | 10 | 29 | 6 | 116 |
| Worker Satisfaction | 46 | 38 | 12 | 9 | 105 |
| Error Rate | 42 | 47 | 6 | — | 95 |
| Training Effectiveness | 55 | 12 | 11 | 16 | 94 |
| Task Completion Time | 76 | 5 | 4 | 2 | 87 |
| Wages & Compensation | 46 | 13 | 19 | 5 | 83 |
| Team Performance | 44 | 9 | 15 | 7 | 76 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 18 | 16 | 9 | 5 | 48 |
| Job Displacement | 5 | 29 | 12 | — | 46 |
| Social Protection | 19 | 8 | 6 | 1 | 34 |
| Developer Productivity | 27 | 2 | 3 | 1 | 33 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Skill Obsolescence | 3 | 18 | 2 | — | 23 |
| Labor Share of Income | 8 | 4 | 9 | — | 21 |
Human-AI Collaboration (filter applied)
Claims that AI will imminently replace human auditors are overstated; real-world economic benefits are more likely to come from complementary automation (breadth + triage) than from full substitution.
Interpretation based on empirical failures in end-to-end exploitation, instability across configurations, and scaffold sensitivity observed in this study.
Detection and exploitation rankings are unstable: rankings shift across model configurations, tasks, and datasets, so results are not robust to evaluation choices.
Observed variability in detection/exploitation rankings across the expanded matrix of models, scaffolds, and datasets in the study's experiments.
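To make the robustness concern concrete, here is a minimal sketch of how rank stability across evaluation configurations can be quantified with Kendall's tau; the configuration names and scores are hypothetical, not from the study.

```python
# Minimal sketch: quantifying ranking instability across evaluation
# configurations with Kendall's tau. Scores are hypothetical.
from itertools import combinations
from scipy.stats import kendalltau

# Hypothetical detection scores for four models under three
# scaffold/dataset configurations (higher is better).
scores = {
    "config_A": [0.71, 0.64, 0.58, 0.52],
    "config_B": [0.55, 0.69, 0.61, 0.47],
    "config_C": [0.60, 0.50, 0.72, 0.66],
}

for (name_a, a), (name_b, b) in combinations(scores.items(), 2):
    tau, _ = kendalltau(a, b)
    print(f"{name_a} vs {name_b}: tau = {tau:.2f}")
# Tau near 1 means stable rankings; low or negative values mean model
# rankings are not robust to the scaffold/dataset choice.
```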
High within-person variability and statement-dependent ambiguity imply noisy sentiment labels that can attenuate estimated effects in econometric analyses (measurement error / attenuation bias).
Empirical findings of moderate within-person stability and strong statement dependence in a sample of 81 students labeling decontextualized statements; combined with standard measurement-error theory (paper’s implication for applied analyses).
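A minimal simulation of the attenuation mechanism, with illustrative numbers rather than the study's data:

```python
# Minimal simulation of attenuation bias: noisy sentiment labels
# shrink an estimated regression slope toward zero. All numbers
# are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_sentiment = rng.normal(size=n)                    # latent regressor
outcome = 0.5 * true_sentiment + rng.normal(size=n)    # true slope = 0.5

for label_noise in (0.0, 0.5, 1.0):                    # within-person label noise (sd)
    measured = true_sentiment + rng.normal(scale=label_noise, size=n)
    slope = np.polyfit(measured, outcome, 1)[0]        # OLS slope on noisy labels
    print(f"noise sd={label_noise:.1f} -> estimated slope {slope:.3f}")
# Classical measurement error shrinks the slope by the reliability ratio
# var(true) / (var(true) + var(noise)): ~0.40 and ~0.25 here vs the true 0.5.
```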
Standardized platforms and benchmarks may create network effects and lock-in around dominant hardware–software stacks; antitrust and standards policy will matter to preserve competition.
Workshop participants' market-structure analysis and policy discussion included in the summary recommendations (NSF workshop, Sept 26–27, 2024).
The sphere + dislodgement-threshold material approximation may not capture all real-world mechanical and adhesive properties, limiting generalization.
Authors note/modeling limitation: summary explicitly states the material physics are approximated and may not capture all real-world properties; this is presented as a limitation rather than an empirical result.
Key technical and organizational risks include model brittleness, privacy and IP concerns in code generation (training-data provenance), and increased governance and QA burdens.
Literature review highlighting known risks and survey responses reporting practitioner concerns; no quantified incident rates provided.
Practitioners report barriers to adoption including integration costs, lack of trust/explainability, poor data quality, and skills gaps.
Thematic analysis / coding of open-ended survey responses and literature review identifying common adoption barriers; survey sample size not specified.
Signals may be gamed by providers or agents; incentive-compatible design and auditability are crucial.
Risk/limitations noted by the authors as a foreseeable strategic behavior problem; presented as a caution rather than empirically observed gaming in the current dataset.
GDP and productivity metrics that ignore interpretive labor risk understating the inputs to creative and knowledge work; RATs offer a means to measure previously invisible inputs.
Policy argument in the measurement/productivity subsection; no empirical re-estimation of GDP/productivity presented.
Algorithmic feeds and AI summarizers tend to compress or automate interpretive traces, potentially erasing signals of reasoning, context, and tacit knowledge.
Conceptual claim supported by argumentation and examples in the paper; no empirical comparison between RATs and existing summarizers is presented.
Expect diminishing returns from AI investments if parallel investments in organizational change and data governance are not made.
Synthesis of case evidence and theoretical argument: instances where additional AI investment produced limited marginal benefit absent organizational complements.
Legacy systems and siloed organizational structures produce persistent forecasting inaccuracies, operational disconnects, and constrained responsiveness.
Cross-case interview narratives documenting continued forecasting issues and operational misalignment in firms with legacy IT and functional silos.
MLOps and governance provisions shift costs from one-off implementation to ongoing maintenance, implying recurring costs that should be captured in economic evaluations.
Analytical/economic argument presented in the paper as an implication of including an MLOps layer (conceptual; no empirical cost accounting provided).
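To illustrate the accounting implication, a hedged sketch (all figures hypothetical) comparing a one-off implementation cost with the discounted value of recurring MLOps/governance spend:

```python
# Minimal sketch of the cost-accounting point: a present-value
# comparison of one-off implementation cost vs recurring MLOps/
# governance spend. All figures are illustrative.
def present_value(cashflows, rate):
    """Discounted sum of yearly cashflows occurring in years 1..n."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows, start=1))

DISCOUNT = 0.08
one_off = 500_000                    # upfront build cost, year 0
recurring = [120_000] * 5            # annual MLOps + governance, years 1-5

pv_recurring = present_value(recurring, DISCOUNT)
print(f"PV of recurring costs: {pv_recurring:,.0f}")
print(f"Total economic cost:   {one_off + pv_recurring:,.0f}")
# Evaluations that score only the one-off cost understate the total by
# the discounted recurring stream, which here is nearly as large.
```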
Adoption complementarities (AI tools + developer skill + organizational processes) favor larger incumbents and well‑funded firms, possibly increasing concentration in tech sectors.
Theoretical argument about complementarities and returns to scale; illustrative examples; lacks firm‑level empirical testing.
In the near term, displacement risks concentrate on junior or highly routine roles; mobility and retraining will determine realized unemployment impacts.
Task automatability mapping indicating routine tasks more automatable and qualitative reasoning on labor mobility; no empirical unemployment projections.
Adoption will be heterogeneous: larger firms and well‑resourced teams will capture more gains earlier, producing competitive advantages.
Theoretical argument about adoption complementarities (AI tools + developer skill + organizational processes) and illustrative examples; no cross‑firm empirical analysis.
Initial investment, integration, and ongoing maintenance/compliance costs can be substantial and affect short-term ROI.
Interviewed administrators and implementation reports citing upfront and recurring costs (integration, model maintenance, compliance); quantitative budget figures not standardized across sites in the paper.
Risk of deskilling or reduced empathy if human roles are overly automated.
Thematic analysis of staff interviews and surveys reporting concerns about loss of practice, reduced patient contact, and potential diminishment of empathetic skills; no longitudinal measures of skill loss presented.
Technical and organizational integration with legacy hospital IT systems is nontrivial.
Implementation reports and interviews describing integration work, time, and resource needs; descriptive accounts of technical and organizational barriers (no universal timelines/costs reported).
Algorithmic bias in NLP models can misclassify complaints from underrepresented groups.
Observations from system classification error analyses (disparities reported by demographic group) and corroborating qualitative concerns from staff and administrators; specific subgroup sample sizes and effect magnitudes not provided.
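A minimal sketch of the subgroup error audit this evidence implies; the records and group names below are hypothetical placeholders, not the paper's data.

```python
# Minimal sketch of a subgroup error-rate audit for a complaint
# classifier. Data below are hypothetical placeholders.
from collections import defaultdict

# (demographic_group, true_label, predicted_label) triples
records = [
    ("group_a", "urgent", "urgent"),
    ("group_a", "urgent", "routine"),
    ("group_b", "urgent", "routine"),
    ("group_b", "urgent", "routine"),
    ("group_b", "urgent", "urgent"),
]

errors = defaultdict(lambda: [0, 0])   # group -> [misclassified, total]
for group, truth, pred in records:
    errors[group][1] += 1
    errors[group][0] += truth != pred

for group, (wrong, total) in errors.items():
    print(f"{group}: misclassification rate {wrong / total:.0%} (n={total})")
# Large gaps between groups flag the disparity the claim describes;
# real audits need adequate subgroup sample sizes to be meaningful.
```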
Data privacy and security risks arise from centralizing complaint text and metadata.
Stakeholder interviews, thematic coding of concerns, and risk assessment commentary based on centralized logs and metadata aggregation; no measured breach incidents reported here.
Organizations will incur additional governance and procurement costs (diversity audits, recalibration of reward models, multi-model infrastructures) to mitigate homogenization, shifting some economic benefits of AI toward governance spending.
Cost implication argued from the need for auditing and multi-model procurement described in recommendations; not supported by quantified cost analyses in the paper.
Inter-model convergence undermines product differentiation across AI providers and could accelerate commoditization of base LLM outputs.
Market-structure inference built on empirical finding of high cross-model output similarity across 70+ models and theoretical discussion of vendor differentiation; no market-level price or adoption time-series analyzed in the paper.
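One way such convergence can be measured, sketched under the assumption that outputs are compared via embedding cosine similarity; the vectors below are synthetic stand-ins for real sentence embeddings.

```python
# Minimal sketch: inter-model output convergence as mean pairwise
# cosine similarity of output embeddings. Vectors are synthetic.
import numpy as np

rng = np.random.default_rng(1)
base = rng.normal(size=16)
# Hypothetical embeddings of five models' answers to one query; each
# is a small perturbation of a shared direction (i.e., convergence).
embeddings = np.stack([base + 0.1 * rng.normal(size=16) for _ in range(5)])

unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
sim = unit @ unit.T                              # pairwise cosine similarities
upper = sim[np.triu_indices(len(sim), k=1)]      # off-diagonal pairs only
print(f"mean pairwise cosine similarity: {upper.mean():.2f}")
# Values near 1 across many models and queries indicate the kind of
# convergence that erodes vendor differentiation.
```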
Homogenized AI outputs reduce the value of AI as a source of varied cognitive complements to human labor, potentially lowering productivity gains from human–AI collaboration in tasks requiring creativity and exploration.
Economic argument drawing on measured decreases in model output diversity and theoretical literature on complementarities between diverse AI outputs and human creativity; no direct measured productivity changes reported in field settings within the paper.
Reward-model and evaluation miscalibration can cause organizations to prefer models that maximize apparent evaluation scores at the expense of useful stylistic or cognitive diversity.
Comparative analyses between automated evaluation/reward-model rankings and human preference/diversity assessments reported in the paper; examples where high-scoring models produced more consensus-style outputs.
Homogenized outputs increase organizational susceptibility to groupthink and correlated errors across teams using different models.
Argument based on observed inter-model convergence (high output similarity across models), which implies correlated outputs and thus correlated mistakes across teams; no randomized organizational field experiment is reported, so this is a risk inferred from the empirical convergence data.
Homogenization of LLM outputs erodes creative diversity in AI-assisted work and reduces the variety of solutions produced.
Inference drawn from measured decreases in response diversity (entropy/distinct-n) and the observed inter-model convergence across real-world queries; argument linking lower measured diversity to fewer distinct solution proposals in AI-augmented workflows.
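A minimal sketch of the two diversity metrics named here (distinct-n and token entropy), computed over toy responses:

```python
# Minimal sketch of the diversity metrics named in the claim:
# distinct-n and token entropy over a set of model responses.
# The toy responses are illustrative only.
import math
from collections import Counter

responses = [
    "use a priority queue to schedule tasks",
    "use a priority queue to schedule tasks",
    "sort tasks by deadline then greedily assign",
]

def distinct_n(texts, n):
    """Share of n-grams that are unique across all responses."""
    ngrams = [tuple(t.split()[i:i + n])
              for t in texts for i in range(len(t.split()) - n + 1)]
    return len(set(ngrams)) / len(ngrams)

def token_entropy(texts):
    """Shannon entropy (bits) of the pooled token distribution."""
    counts = Counter(w for t in texts for w in t.split())
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

print(f"distinct-2: {distinct_n(responses, 2):.2f}")
print(f"token entropy: {token_entropy(responses):.2f} bits")
# Falling distinct-n and entropy over time is the measured signature
# of homogenization discussed above.
```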
Current reward models and automated evaluation metrics are biased toward consensus/high-probability responses, preferring consensus-style outputs even when stylistically diverse alternatives are judged equally high-quality by humans.
Comparisons between human preference judgments and automated/reward-model scores on stylistically diverse outputs, including reported cases where reward models favored higher-probability, consensus-style responses despite no human-judged quality advantage.
Unresolved liability and regulatory uncertainty increase malpractice risk and insurance costs, leading insurers and providers to favor conservative adoption and continued human-in-the-loop safeguards.
Regulatory/legal analysis and stakeholder behavior models discussed in the review; observed cautious deployment patterns in practice noted in the literature.
Regulatory pathways and approval standards are evolving but are not yet aligned with deployment of high-autonomy clinical systems.
Review of recent policy analyses and regulatory documents showing ongoing updates and gaps between current standards and requirements for high-autonomy AI deployment.
Sanctions and supply-chain restrictions affect access to hardware and software, altering adoption paths and increasing costs; domestic substitution or international cooperation will influence future trajectories.
Institutional analysis documenting sanctions/import restrictions and their implications for hardware/software access; qualitative assessment of substitution and cooperation options.
The barriers to AI adoption in Russia’s extractive industries interact systemically (e.g., lack of data reduces demand for talent; weak infrastructure deters investment), so piecemeal measures will have limited effect.
Analytical synthesis identifying co-moving constraints across cross-country trends and qualitative firm-level evidence showing interacting bottlenecks.
Institutional failures—weak standards/interoperability, limited public–private coordination, regulatory uncertainty, and sanctions/import restrictions—exacerbate diffusion problems for AI in extractive sectors.
Institutional review of standards, procurement and public–private coordination mechanisms; documentation of regulatory uncertainty and sanctions/import restrictions affecting hardware/software access.
Infrastructure is underdeveloped in Russia relative to peers and hinders deployment: insufficient sensorization, limited connectivity (edge/cloud), inadequate computing hardware, and immature localized software stacks.
ICT infrastructure indicators, comparative metrics on sensorization/connectivity/computing availability, and project case evidence from extractive firms.
There are human capital constraints: shortages of AI talent in industry-specific roles, limited retraining of engineering staff, and brain drain reduce the sector's capacity to absorb and deploy AI.
Workforce and education statistics, patent/activity counts, and expert commentary; qualitative case evidence showing limited retraining and talent shortages in industry-specific AI roles.
Absolute and relative AI investment volumes in the Russian extractive sector are lower than in the US, China and EU; private risk capital is limited and public support insufficiently targeted to scale-up projects.
Investment datasets and national/industry statistics comparing public and private AI investment volumes (absolute and relative to output) for extractive sectors across jurisdictions (2020–2025).
Data access is a primary bottleneck: datasets are fragmented, often proprietary or closed, ownership rules are unclear, and mechanisms for safe data sharing are weak, hindering model training and cross-firm applications.
Review of data governance frameworks across jurisdictions and firm-level case evidence documenting closed/proprietary datasets and weak sharing mechanisms.
The gap is driven not only by smaller investment flows but also by institutional constraints—limited data access, weak data governance, human capital shortages, and inadequate digital infrastructure—that together suppress diffusion and scaling of AI applications.
Institutional analysis (review of data governance frameworks, regulatory regimes, standards, market structure) plus qualitative firm-level case studies and expert commentary illustrating how these factors impede adoption and scaling.
Russia’s adoption of AI in extractive industries is both slower (lower growth rate) and shallower (lower depth of digitalization) than peer jurisdictions in 2020–2025.
Time-series comparison of digitalization/digital-maturity proxies and AI investment volumes across countries for 2020–2025; synthesis of trend differences from public datasets and sectoral indices.
Between 2020–2025 Russia trails the United States, China and the EU on both digitalization indicators and AI investment volumes in the mining and oil & gas sectors.
Comparative multi-country trend analysis using publicly available investment and digitalization indicators: national/industry statistics, investment datasets, and sectoral digitalization indices comparing Russia, the US, China, and the EU over 2020–2025.
Widespread adoption of LLMs without adequate verification increases systemic cybersecurity risks with potential economic spillovers.
Synthesis of security incident case studies and risk analyses revealing vulnerabilities in generated code and potential downstream impacts.
Models lack deep contextual reasoning and may fail on tasks requiring long-term design thinking or deep domain knowledge.
Benchmark failures and user studies in the reviewed literature demonstrating degraded performance on complex architectural/design tasks and domain-specific reasoning problems.
Use of these tools can mask gaps in foundational computational skills among novices.
Pedagogical case studies and assessments indicating reliance on AI can produce superficial solutions and lower demonstrated understanding of core concepts.
Short-term AI adoption costs and adjustment reduce firm profits during early adoption phases.
Theoretical model predictions from the differentiated Bertrand framework; empirical component claims alignment with these short-run effects (no sample size or estimation details given in summary).
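The summary does not reproduce the model, so the following is a generic differentiated-Bertrand sketch, an assumption rather than the paper's specification, of why a fixed adoption cost depresses short-run profit:

```latex
% Illustrative only: not the paper's exact model.
% Firm i faces linear demand q_i = a - p_i + b p_j and pays a one-off
% adoption cost F for a marginal-cost reduction delta realized later.
\[
  \pi_i^{\text{short run}} = (p_i - c)\,q_i(p_i, p_j) - F,
  \qquad
  \pi_i^{\text{long run}} = \bigl(p_i - (c - \delta)\bigr)\,q_i(p_i, p_j).
\]
% With F paid up front and the cost saving \delta not yet realized,
% short-run profit sits below the no-adoption benchmark, consistent
% with the claim above.
```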
Key constraints on realized gains include governance complexity, model reliability limits (errors, brittleness, distribution shifts), orchestration challenges integrating agents across systems, and ongoing need for human oversight for safety, fairness, and quality control.
Qualitative observations and limitations reported from the Alfred AI deployments and authors' analysis of operational experience; evidence comes from live deployments but is descriptive rather than quantitative.
This generation–verification mismatch produces a chronic bottleneck in development processes.
Analytic diagnosis and behavioral reasoning in the paper (design principles and system analysis); no empirical testing or simulation results provided.
AI-assisted software development creates a persistent structural imbalance: generation throughput (machine-produced code, tests, docs) outpaces human verification capacity.
Conceptual/theoretical argument and systems/architectural modeling in the paper; no empirical measurement, no sample size, no field data reported.
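A back-of-the-envelope illustration of the imbalance, using hypothetical throughput numbers:

```python
# Minimal model of the generation-verification mismatch: if machine
# generation throughput exceeds human review capacity, the unreviewed
# backlog grows without bound. Rates are hypothetical.
GEN_RATE = 40      # artifacts (code/tests/docs) generated per day
REVIEW_RATE = 25   # artifacts a team can verify per day

backlog = 0
for day in range(1, 11):
    backlog += GEN_RATE - REVIEW_RATE   # net daily accumulation
    print(f"day {day:2d}: unverified backlog = {backlog}")
# Whenever GEN_RATE > REVIEW_RATE the backlog grows linearly; queueing
# theory gives the same qualitative result, so verification capacity,
# not generation speed, becomes the binding constraint.
```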
Data‑driven agritech platforms exhibit network effects and potential for market power, implying a policy need for data portability and interoperability to preserve competition.
Economic reasoning, policy reports, and case study examples summarized in the review; the claim is grounded in market analysis rather than large‑scale causal studies.
If left unregulated and untargeted, AI and digital agritech platforms risk concentrating surplus with technology providers and capital owners, potentially increasing rural inequality and weakening smallholder bargaining power.
Theoretical market‑structure analysis, case studies of platform markets, and policy analyses cited in the paper; empirical causal evidence on long‑run distributional effects is limited.
Data ownership, lack of interoperability, privacy concerns, and concentration of digital agritech platforms create risks for competition and equitable value capture in agricultural value chains.
Policy reports, market analyses, and case studies discussed in the paper; the claim is supported by descriptive evidence and theoretical assessments rather than large causal estimates.