Evidence (4049 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	369	105	58	432	972
Governance & Regulation	365	171	113	54	713
Research Productivity	229	95	33	294	655
Organizational Efficiency	354	82	58	34	531
Technology Adoption Rate	277	115	63	27	486
Firm Productivity	273	33	68	10	389
AI Safety & Ethics	112	177	43	24	358
Output Quality	228	61	23	25	337
Market Structure	105	118	81	14	323
Decision Quality	154	68	33	17	275
Employment Level	68	32	74	8	184
Fiscal & Macroeconomic	74	52	32	21	183
Skill Acquisition	85	31	38	9	163
Firm Revenue	96	30	22	—	148
Innovation Output	100	11	20	11	143
Consumer Welfare	66	29	35	7	137
Regulatory Compliance	51	61	13	3	128
Inequality Measures	24	66	31	4	125
Task Allocation	64	6	28	6	104
Error Rate	42	47	6	—	95
Training Effectiveness	55	12	10	16	93
Worker Satisfaction	42	32	11	6	91
Task Completion Time	71	5	3	1	80
Wages & Compensation	38	13	19	4	74
Team Performance	41	8	15	7	72
Hiring & Recruitment	39	4	6	3	52
Automation Exposure	17	15	9	5	46
Job Displacement	5	28	12	—	45
Social Protection	18	8	6	1	33
Developer Productivity	25	1	2	1	29
Worker Turnover	10	12	—	3	25
Creative Output	15	5	3	1	24
Skill Obsolescence	3	18	2	—	23
Labor Share of Income	7	4	9	—	20
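The matrix can be read programmatically. The sketch below is illustrative only: it parses three rows copied from the table and computes the negative share among direction-coded claims. Treating a dash as zero is an assumption, and the helper names `parse_row` and `negative_share` are hypothetical. In several rows the Total exceeds the sum of the four direction columns (for Governance & Regulation, 703 versus 713), so the sketch reports that difference rather than guessing what it represents.

```python
# Illustrative only: three rows copied from the matrix above.
ROWS = """\
Governance & Regulation\t365\t171\t113\t54\t713
AI Safety & Ethics\t112\t177\t43\t24\t358
Job Displacement\t5\t28\t12\t—\t45
"""


def parse_row(line):
    """Split one tab-separated row into (outcome, counts dict, total)."""
    outcome, *cells = line.split("\t")
    # Assumption: a dash means no claims were recorded, so count it as 0.
    values = [0 if c.strip() == "—" else int(c) for c in cells]
    positive, negative, mixed, null, total = values
    counts = {
        "positive": positive,
        "negative": negative,
        "mixed": mixed,
        "null": null,
    }
    return outcome, counts, total


def negative_share(counts):
    """Fraction of direction-coded claims that are negative."""
    coded = sum(counts.values())
    return counts["negative"] / coded if coded else 0.0


table = [parse_row(line) for line in ROWS.splitlines() if line]
for outcome, counts, total in table:
    uncoded = total - sum(counts.values())
    print(f"{outcome}: {negative_share(counts):.1%} negative, "
          f"{uncoded} claims outside the four columns")
```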

Filter: Governance

If cognitive interlocks are widely adopted, many negative externalities can be internalized and AI-driven productivity gains can be realized more sustainably; absent such controls, equilibrium may drift toward higher error rates and systemic incidents.

Long-run equilibrium argument based on theoretical reasoning and conditional claims; no longitudinal or cross-firm empirical evidence presented.

speculative mixed Overton Framework v1.0: Cognitive Interlocks for Integrity i... long-run system outcomes (error rates, incident frequency, net productivity) con...

Labor demand effects are ambiguous: junior/entry-level demand may be reduced for some tasks while demand for verification and higher-skill roles may rise.

Economic reasoning, early observational signals, and theoretical task-reallocation frameworks; empirical longitudinal evidence is limited or absent.

speculative mixed ChatGPT as a Tool for Programming Assistance and Code Develo... labor demand by skill level and occupation (employment levels, hiring rates)

The effectiveness of generative AI depends critically on human-AI workflows: prompt design, iterative refinement, and human vetting materially affect outcomes.

Qualitative analyses of interaction patterns and experiments manipulating prompting/iteration showing variation in outcomes; many studies report improved outputs after iterative prompting and human-in-the-loop refinement.

medium-high mixed ChatGPT as an Innovative Tool for Idea Generation and Proble... variation in output quality based on prompt design; changes in output after iter...

Market demand is likely to bifurcate: high-value clinical markets will require rigorous explainability and neuroscientific grounding (higher willingness-to-pay), while research and consumer segments may tolerate black-box models (lower margins).

Market segmentation argument built from differing end-user requirements and tolerance for opaque models; presented as a projected implication rather than an empirically tested market study.

speculative mixed Explainable Artificial Intelligence (XAI) for EEG Analysis: ... market segmentation / willingness-to-pay across segments

Teams often produce evaluation outputs (tests, metrics, user feedback) but lack mechanisms, processes, or technical levers to convert those outputs into actionable engineering or product changes—a novel “results-actionability gap.”

Recurring theme from the 19 practitioner interviews and coding; authors explicitly articulate and label this gap based on participants' reports.

medium-high negative Results-Actionability Gap: Understanding How Practitioners E... ability to translate evaluation outputs into concrete product/engineering change...

The study confirms several previously documented evaluation challenges with LLMs: model unpredictability, metric mismatch, high human-evaluation costs, and difficulty reproducing failures.

Interview data from 19 practitioners; thematic analysis flagged these recurring problems as reported by participants and aligned with prior literature.

medium-high negative Results-Actionability Gap: Understanding How Practitioners E... presence and prevalence of known evaluation challenges

Emergent quality hierarchies among agents imply winner-take-most dynamics in informational value and potential market concentration in agent quality.

Observed formation of quality hierarchies in agent interactions and documented economic interpretation; this is a hypothesis/implication drawn from qualitative patterns rather than measured market outcomes.

speculative negative When Openclaw Agents Learn from Each Other: Insights from Em... distribution of informational value / concentration of agent quality

Security of LLM-based multi-agent systems (MASs) functions as an economic externality: failures can impose social costs (misinformation, poor collective decisions), and absent liability or market incentives providers may underinvest in robustness.

Economic reasoning and implication section in the paper—conceptual argument linking the technical vulnerability to economic externality and incentive misalignment. No empirical economic data provided in the summary.

speculative negative Don't Trust Stubborn Neighbors: A Security Framework for Age... investment in defenses (underprovision) and social costs from MAS security failu...

Analytical conditions on stubbornness and influence weights identify when a single adversary can dominate network dynamics (i.e., influence propagation criteria derived from fixed-point analysis of the Friedkin-Johnsen (FJ) opinion-dynamics model).

Mathematical/theoretical analysis of FJ model fixed points and influence propagation in the paper; derivation of conditions relating agent stubbornness and interpersonal trust weights to steady-state influence.

medium-high negative Don't Trust Stubborn Neighbors: A Security Framework for Age... theoretical criteria predicting when an agent's influence weight leads to domina...
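To make the stubbornness/trust-weight intuition concrete, here is a minimal sketch of the textbook Friedkin-Johnsen update, in which each agent mixes its own initial opinion (weighted by its stubbornness) with a trust-weighted average of its neighbors' current opinions. This is not the paper's derivation or its dominance conditions: the three-agent network, the weights, and the function name `fj_steady_state` are hypothetical, chosen only to show how lower stubbornness in honest agents lets a fully stubborn adversary pull the steady state toward its own opinion.

```python
def fj_steady_state(weights, stubbornness, initial, steps=2000):
    """Iterate the Friedkin-Johnsen update until it approximately settles.

    weights[i][j]   : trust agent i places in agent j (each row sums to 1)
    stubbornness[i] : weight agent i keeps on its own initial opinion, in [0, 1]
    initial[i]      : agent i's initial opinion
    """
    opinions = list(initial)
    n = len(initial)
    for _ in range(steps):
        opinions = [
            stubbornness[i] * initial[i]
            + (1 - stubbornness[i])
            * sum(weights[i][j] * opinions[j] for j in range(n))
            for i in range(n)
        ]
    return opinions


# Hypothetical 3-agent network: agent 0 is a fully stubborn adversary.
weights = [
    [1.0, 0.0, 0.0],  # the adversary listens only to itself
    [0.6, 0.2, 0.2],  # agent 1 places most of its trust in the adversary
    [0.6, 0.2, 0.2],  # agent 2 does likewise
]
initial = [1.0, 0.0, 0.0]  # adversary pushes opinion 1; the others start at 0

# Honest agents that barely anchor on their own initial view...
weak = fj_steady_state(weights, [1.0, 0.1, 0.1], initial)
# ...versus honest agents that anchor strongly on it.
firm = fj_steady_state(weights, [1.0, 0.8, 0.8], initial)

print([round(x, 3) for x in weak])
print([round(x, 3) for x in firm])
```

In this toy setup the honest agents end up much closer to the adversary's opinion when their stubbornness is low, which is the qualitative effect the claim describes.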

If models frequently leak or misuse preferences in third‑party contexts, users and organizations will discount the value of personalization or demand stronger controls, increasing costs for deploying memory features and reducing consumer surplus.

Economic reasoning and implication drawn from the observed misapplication behavior; no empirical user adoption or market data provided in the study to directly support this claim.

speculative negative BenchPreS: A Benchmark for Context-Aware Personalized Prefer... Projected changes in trust, adoption costs, and consumer surplus (not empiricall...

The failure mode (misapplication of preferences to third parties) creates negative externalities (privacy violations, normative harms, misinformation, contractual breaches) that markets and platforms may not internalize without regulation or design changes.

Economic interpretation and argumentation building on the empirical failure mode; these harms are hypothesized implications rather than measured outcomes in the paper.

speculative negative BenchPreS: A Benchmark for Context-Aware Personalized Prefer... Projected negative externalities on third parties (not directly measured in stud...

Unclear liability frameworks increase perceived and real costs and can slow adoption by hospitals and insurers.

Policy analyses and procurement narratives noting liability uncertainty cited as a barrier to procurement and deployment.

medium-high negative Human-AI interaction and collaboration in radiology: from co... time-to-adoption, procurement decisions citing liability concerns, insurance/cov...

Up-front implementation costs commonly include procurement, integration with PACS/EMR, UI/UX development, regulatory compliance, and staff training; recurring costs include monitoring, data labeling, software updates, and cybersecurity.

Implementation reports, vendor and hospital accounts, and qualitative studies documenting cost categories (specific dollar amounts vary across settings and are rarely published in detail).

medium-high negative Human-AI interaction and collaboration in radiology: from co... implementation capital expenditures, annual operating expenditures

Without continuous support for upskilling/reskilling and inclusive policies, AI risks becoming a source of exclusion rather than an enabler of human advancement.

Normative conclusion derived from reviewed literature and thematic interpretation in the qualitative study (literature-based; evidence is secondary and not quantified).

speculative negative THE IMPACT OF ARTIFICIAL INTELLIGENCE IN THE WORKPLACE: OPPO... social inclusion versus exclusion related to AI adoption

Research literature synthesis has an estimated 70-75% automation potential.

Quantitative estimate offered by the authors (70-75%) as part of function-by-function analysis; no described empirical evaluation or sample supporting the figure.

speculative negative Are Universities Becoming Obsolete in the Age of Artificial ... percent automation potential for research literature synthesis

Knowledge transmission (teaching/lecturing) shows 75-80% AI substitutability.

Authors' quantitative estimate presented in the analysis (75-80%); the paper does not detail empirical methods or validation samples for this percentage.

speculative negative Are Universities Becoming Obsolete in the Age of Artificial ... percent substitutability/automation potential of knowledge transmission

Administrative tasks face 75-80% disruption risk from AI.

Paper provides a quantitative estimate (75-80%) as part of its functional disruption assessment; no empirical methodology, dataset, or sample size is described to support the numeric range.

speculative negative Are Universities Becoming Obsolete in the Age of Artificial ... percent disruption/substitutability of administrative tasks

Aggregation and linkage across data sources can reveal intimate, predictive traits that were not foreseeable to the data subject at the time of sale.

Conceptual argument with references to documented cases and literature on data linkage and inference; relies on illustrative examples rather than original empirical experiments.

medium-high negative Data and privacy: Putting markets in (their) place Extent to which data aggregation yields unforeseen sensitive inferences about in...

The United States shows a more market-driven (firm-dominated) patenting profile and comparatively weaker integration between AI and robotics patent trajectories.

Country-level and actor-type decomposition for U.S. patent filings (1980–2019), showing higher firm share of patents and weaker long-run association/cointegration between core AI and AI-enhanced robotics series compared with China (as reported in the paper).

medium-high negative The "Gold Rush" in AI and Robotics Patenting Activity. Do in... share of patents by firms in U.S.; strength of long-run integration between U.S....

There is a risk of a two‑tier market where high‑quality temporal‑preserving enhancements are costly, increasing inequality in experiential welfare and cognitive capital.

Speculative socioeconomic implication based on cost/access arguments and distributional concerns; no inequality modeling or empirical pricing data provided.

speculative negative XChronos and Conscious Transhumanism: A Philosophical Framew... distributional inequality in access to temporal‑quality enhancements and resulti...

Technical expansion without an accompanying theory of lived temporality risks increasing capabilities while degrading the qualitative depth of human experience (presence, attentional flow, felt meaning).

Argumentative claim supported by philosophical analysis and literature synthesis (neurophenomenology, attention economics); no empirical test reported (N/A).

speculative negative XChronos and Conscious Transhumanism: A Philosophical Framew... qualitative depth of human experience (presence, attentional flow, felt meaning)

High-quality, equitable climate information displays public-good characteristics (nonrival, nonexcludable at scale), so private incentives alone will underprovide geographically representative data and shared infrastructure.

Economic reasoning supported by observed concentration of compute and model development (mapping) and standard public-goods theory; no formal empirical market model estimated in the paper.

medium-high negative The Rise of AI in Weather and Climate Information and its Im... Level of provision of geographically representative data/shared infrastructure u...

Full replacement of physicians would require breakthroughs in robust generalization, embodied capabilities, and legal/regulatory change—currently lacking.

Conceptual inference based on documented limitations (OOD generalization, lack of embodied/sensorimotor capability, unsettled legal/regulatory environment) summarized in the review.

speculative negative Will AI Replace Physicians in the Near Future? AI Adoption B... feasibility/timeline for physician replacement

Shrinking acquisition workforce capacity functions as a critical scarce input in defense AI economics; reduced human capital lowers the Department's ability to extract value from AI investments and to internalize externalities, decreasing effective returns to AI procurement.

Institutional trend evidence of workforce reductions combined with economic analysis treating institutional capacity as an input factor. No empirical quantification of returns or elasticity provided—this is analytical inference.

speculative negative FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... effective returns to AI procurement given acquisition workforce capacity (theore...

Ambiguous standards increase uncertainty for contracting officers, raising the risk that they will either over-rely on vendor claims or inconsistently enforce requirements, both of which harm procurement integrity.

Policy-text analysis identifying vague criteria combined with qualitative analysis of procurement decision workflows; argument based on measurement and enforcement friction literature. No empirical study of contracting officer behavior provided.

speculative negative FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... consistency and reliability of contracting officer enforcement and reliance on v...

Lower governance barriers and ambiguous procurement criteria (e.g., undefined 'model objectivity') can skew market competition toward suppliers that prioritize rapid iteration and opaque practices over rigorous assurance, harming traceability and quality.

Market-effects reasoning grounded in policy changes (document analysis) and qualitative institutional analysis of measurement/enforcement frictions. No market-share or supplier-behavior data provided.

speculative negative FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... market composition and supplier incentives (favoring speed/opacity vs. assurance...

Mandating permissive contract terms and enabling waivers reduces private incentives for contractors to invest in safety and compliance, creating classical moral-hazard problems in defense AI procurement.

Economic reasoning and principal–agent analysis applied to the documented contractual changes (primary-source policy text). No empirical measurement of contractor investment behavior provided; claim is theoretical/inferential.

speculative negative FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... contractor incentives to invest in safety and compliance (theoretical inference)

A mismatch between expanded waiver authority (Barrier Removal Board) and declining acquisition oversight capacity creates procurement-integrity and systemic risks: faster acquisition concurrent with weakened institutional checks increases likelihood of improper procurement decisions and unchecked deployment of unsafe or unvetted AI models.

Synthesis of primary-source policy analysis, institutional staffing trend evidence, and qualitative risk/scenario assessment using principal–agent and moral-hazard frameworks. This is a conceptual risk projection rather than an empirically derived probability estimate.

speculative negative FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... probability and nature of procurement-integrity failures and deployments of unsa...

Emerging agentic/AGI capabilities introduce new failure modes and governance challenges that standard ML oversight may not cover.

Emerging literature, theoretical analyses, and expert opinion summarized in the synthesis; authors note limited empirical long-term data and characterize this as an emergent risk.

speculative negative Framework for Government Policy on Agentic and Generative AI... governance risk / novel failure modes

Centralized provision of high-quality coding models by a few vendors could produce vendor lock-in and increase platform power in software development inputs.

Market-structure analysis and industry observations synthesized in the paper; the claim is forward-looking and not established by longitudinal market data within the review.

speculative negative ChatGPT as a Tool for Programming Assistance and Code Develo... market concentration measures (e.g., HHI), indicators of vendor lock-in (switchi...
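The outcome measure for this claim names the Herfindahl-Hirschman Index (HHI), which is the sum of squared market shares expressed in percent. A minimal sketch of that calculation follows; the vendor shares are hypothetical and are not data from the reviewed paper.

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman Index: sum of squared market shares (in percent)."""
    total = sum(shares_percent)
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"shares must sum to 100, got {total}")
    return sum(s * s for s in shares_percent)


# Hypothetical vendor shares for a coding-model market (illustration only).
concentrated = [60, 25, 10, 5]   # one dominant vendor
fragmented = [10] * 10           # ten equally sized vendors

print(hhi(concentrated))
print(hhi(fragmented))
```

The index ranges from near 0 (many small vendors) to 10,000 (a single vendor with the whole market), so a rising value over time would be one indicator of the concentration this claim anticipates.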

Reversing the burden of proof under time pressure creates moral-hazard-like behavior: incentives for speed reduce verification effort.

Theoretical argument built on the micro-coercion mechanism and economic reasoning; no empirical validation provided.

speculative negative Overton Framework v1.0: Cognitive Interlocks for Integrity i... verification effort per artifact (e.g., reviewer time), proportion of unchecked ...

Under time pressure, developers adopt an implicit default of accepting plausible machine outputs unless they can disprove them (the 'micro-coercion of speed'), effectively reversing the burden of proof.

Behavioral mechanism posited from descriptive reasoning and thought experiments; no behavioral experiments, surveys, or observational data reported.

speculative negative Overton Framework v1.0: Cognitive Interlocks for Integrity i... developer acceptance rate of machine-generated outputs under time pressure; rate...

Dynamic Authority Reversal (DAR) dynamics (authority states, hysteresis, safe-exit times) introduce path-dependence and switching costs that should be treated as state variables in production and decision models of human–AI joint work.

Theoretical implications section arguing these elements add path-dependence and switching costs to economic/production models; analytic reasoning, not empirical measurement.

medium-high negative Human–AI Handovers: A Dynamic Authority Reversal Framework f... switching_costs; path_dependence_indicators; effect_on_throughput

Concentration risks exist because high fixed costs for safe integration and model adaptation may favor larger incumbents or platform providers.

Conceptual economic reasoning and practitioner commentary synthesized in the review; no empirical market-structure analysis or sample-based evidence included here.

speculative negative The Effectiveness of ChatGPT in Customer Service and Communi... market concentration indicators and barriers to entry related to AI integration ...

Imported AI systems may impose foreign values and norms, risking erosion of indigenous knowledge and social cohesion.

Normative and conceptual argument supported by cited case studies and policy analyses; no original anthropological or sociological fieldwork in the paper.

low-medium negative Towards Responsible Artificial Intelligence Adoption: Emergi... indicators of indigenous knowledge retention, measures of cultural alignment of ...

Deployed AI systems can produce algorithmic bias that harms marginalized groups when models are trained on skewed or non‑representative data.

Synthesis of prior empirical findings and case studies on algorithmic bias and fairness in ML systems; paper does not present new empirical tests.

medium-high negative Towards Responsible Artificial Intelligence Adoption: Emergi... fairness metrics, disparate error rates, incidence of discriminatory outcomes fo...

Human reviewers may over-trust machine-generated language and explanations (automation bias), reducing the likelihood of detecting fraudulent outputs.

Reference to automation-bias literature and conceptual examples; threat modeling and illustrative vignettes in the article.

medium-high negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... detection rate of fraudulent outputs by human reviewers when outputs are machine...

Existing internal audit and compliance frameworks focus on access, transaction, and system controls, not on content-generation integrity.

Literature and standards review combined with threat-control mapping demonstrating gaps in content/provenance coverage.

medium-high negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... coverage of content-generation integrity within existing audit/compliance framew...

AI systems and economic models are biased toward European languages because of lack of vernacular corpora; investing in high-quality corpora for African vernaculars (e.g., Cameroon Pidgin) is necessary to avoid misallocation of resources.

Policy implication extrapolated from the study's finding that vernacular mediation materially affects outcomes, combined with general knowledge about data-driven AI bias; no empirical AI-modeling tests in the paper.

speculative negative (current state) / positive (recommended investment) From Linguistic Hybridity to Development Sovereignty: Pidgin... AI model performance and allocation bias (inferred, not measured)

There are research opportunities to measure returns to 'teaching' (causal impact of configuring agents on human skill accumulation and earnings) and to model agent-platform ecosystems with network effects, spillovers, and endogenous quality hierarchies.

Author-stated research agenda and proposed empirical questions derived from the observed phenomena; not empirical results but recommended directions.

speculative null result When Openclaw Agents Learn from Each Other: Insights from Em... need for future causal estimates of returns to teaching and formal models of eco...

Recommended research priorities include hierarchical/temporal-decomposition methods, continual learning, robust adaptation to non-stationarity, and causal/structured reasoning to handle multi-factor interactions.

Paper discussion linking observed failure modes to methodological gaps and proposing research directions to address limitations; these are recommendations rather than experimentally validated claims.

speculative null result RetailBench: Evaluating Long-Horizon Autonomous Decision-Mak... suggested research directions to improve robustness (proposed, not empirically v...

Recommended future research includes scalable interoperability solutions, longitudinal lifecycle value validation, human‑centred adoption strategies, and sustainability assessment methods.

Authors' explicit recommendations at the end of the review based on identified gaps in the literature.

speculative null result Digital Twins Across the Asset Lifecycle: Technical, Organis... priority research areas to address current evidence gaps

Researchers should combine qualitative studies with administrative/matched employer–employee data and experimental/quasi-experimental designs (pilot rollouts, staggered adoption) to identify causal effects of AI on tasks, productivity, and wages.

Methodological recommendation by authors based on limitations of their qualitative study (15 UX designers) and the need to quantify observed phenomena; not an empirical claim tested in the paper.

speculative null result The Values of Value in AI Adoption: Rethinking Efficiency in... recommended measurement approaches for causal identification (task allocation, p...

AI economics should prioritize causal identification of who benefits and who loses when AI is introduced into credit and other financial services, and model endogenous platform behavior including competition and regulatory responses.

Research agenda proposed by the authors based on identified gaps in the literature; prescriptive guidance rather than empirically tested claims.

speculative null result Financial Inclusion in the Age of FinTech Platforms: Opportu... research priorities (causal identification, endogenous platform behavior) rather...

Regulatory tools to consider include algorithmic impact assessments, data portability/interoperability mandates, fairness enforcement, sandboxing with post-deployment audits, and macroprudential tools for platform risk.

Policy recommendation derived from literature review and gap analysis; framed as suggested instruments rather than tested interventions.

speculative null result Financial Inclusion in the Age of FinTech Platforms: Opportu... effectiveness of regulatory tools on consumer protection, competition, and syste...

To measure and monitor these effects, researchers should track firm-level adoption of AI features, fulfillment automation intensity, platform-mediated market entry, and task-level labor shifts.

Author recommendations based on gaps identified in the case-based and multi-modal empirical work and the sensitivity of results to adoption measures; not an empirical finding but a methodological claim.

speculative null result Artificial Intelligence–Enabled E-Commerce Systems and Autom... measurement coverage metrics (availability/quality of adoption and task-shift da...

The threshold for taxing AI may be crossed once AI becomes sufficiently capable in substituting humans across cognitive tasks.

Model-based comparative-static/threshold analysis showing that higher AI substitutability for cognitive tasks increases the likelihood that cognitive workers will consider switching to manual jobs, thereby meeting the model's tax-initiation condition.

speculative positive Workers' Incentives and the Optimal Taxation of AI whether/when the model's tax-initiation threshold is crossed as a function of AI...

Developing domain-specific vernacular NLP and speech models (health, agriculture, education) would help replicate pragmatic features (proverbs, registers) that enable epistemic appropriation.

Policy/research recommendation based on qualitative findings that proverbs and registers confer legitimacy and facilitate knowledge transfer; no experimental NLP work reported in study.

speculative positive From Linguistic Hybridity to Development Sovereignty: Pidgin... potential improvement in vernacular AI-assisted advisory effectiveness (proposed...

Local-language (vernacular) inclusion improves economic returns to development interventions by increasing comprehension and adoption, thereby improving program cost-effectiveness.

Logical extrapolation from observed higher comprehension and adoption rates in the field sample (N = 45); no direct economic cost–benefit analysis reported in the study—claim framed as implication for AI economics.

speculative positive From Linguistic Hybridity to Development Sovereignty: Pidgin... implied economic return / cost-effectiveness (inferred from uptake/comprehension...

Building and maintaining an open-access disclosure repository would enable comparability, aggregation, and public appraisal of environmental pressures.

Policy recommendation derived from conceptual analysis; no implemented repository or empirical evaluation reported.

speculative positive A golden opportunity: Corporate sustainability reporting as ... data accessibility, comparability, and ability to aggregate environmental disclo...
