Evidence (4333 claims)
- Adoption: 5539 claims
- Productivity: 4793 claims
- Governance: 4333 claims
- Human-AI Collaboration: 3326 claims
- Labor Markets: 2657 claims
- Innovation: 2510 claims
- Org Design: 2469 claims
- Skills & Training: 2017 claims
- Inequality: 1378 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 402 | 112 | 67 | 480 | 1076 |
| Governance & Regulation | 402 | 192 | 122 | 62 | 790 |
| Research Productivity | 249 | 98 | 34 | 311 | 697 |
| Organizational Efficiency | 395 | 95 | 70 | 40 | 603 |
| Technology Adoption Rate | 321 | 126 | 73 | 39 | 564 |
| Firm Productivity | 306 | 39 | 70 | 12 | 432 |
| Output Quality | 256 | 66 | 25 | 28 | 375 |
| AI Safety & Ethics | 116 | 177 | 44 | 24 | 363 |
| Market Structure | 107 | 128 | 85 | 14 | 339 |
| Decision Quality | 177 | 76 | 38 | 20 | 315 |
| Fiscal & Macroeconomic | 89 | 58 | 33 | 22 | 209 |
| Employment Level | 77 | 34 | 80 | 9 | 202 |
| Skill Acquisition | 92 | 33 | 40 | 9 | 174 |
| Innovation Output | 120 | 12 | 23 | 12 | 168 |
| Firm Revenue | 98 | 34 | 22 | — | 154 |
| Consumer Welfare | 73 | 31 | 37 | 7 | 148 |
| Task Allocation | 84 | 16 | 33 | 7 | 140 |
| Inequality Measures | 25 | 77 | 32 | 5 | 139 |
| Regulatory Compliance | 54 | 63 | 13 | 3 | 133 |
| Error Rate | 44 | 51 | 6 | — | 101 |
| Task Completion Time | 88 | 5 | 4 | 3 | 100 |
| Training Effectiveness | 58 | 12 | 12 | 16 | 99 |
| Worker Satisfaction | 47 | 32 | 11 | 7 | 97 |
| Wages & Compensation | 53 | 15 | 20 | 5 | 93 |
| Team Performance | 47 | 12 | 15 | 7 | 82 |
| Automation Exposure | 24 | 22 | 9 | 6 | 62 |
| Job Displacement | 6 | 38 | 13 | — | 57 |
| Hiring & Recruitment | 41 | 4 | 6 | 3 | 54 |
| Developer Productivity | 34 | 4 | 3 | 1 | 42 |
| Social Protection | 22 | 10 | 6 | 2 | 40 |
| Creative Output | 16 | 7 | 5 | 1 | 29 |
| Labor Share of Income | 12 | 5 | 9 | — | 26 |
| Skill Obsolescence | 3 | 20 | 2 | — | 25 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
Governance
Faster iterative experimental cycles enabled by LLM orchestration may increase returns to experimental R&D and change the optimal allocation between computation, instrumentation, and labor.
Economic argumentation about iterative cycles and returns to capital/labor; proposed rather than empirically demonstrated.
The paper provides an initial mapping from diagnosis to intervention strategies (therapeutics) — i.e., treatment planning for model dysfunctions.
Conceptual mapping and proposed intervention strategies documented in the therapeutics section (initial mappings; not claimed as exhaustive).
AI should serve precision and purpose in public policy — improving foresight, enabling better trade-offs, and preserving democratic accountability.
Normative policy prescription and conceptual argumentation in the book; no empirical testing or quantified outcomes reported.
AI-driven systems should empower people with knowledge and pathways to participate in global markets rather than concentrate gains.
Normative recommendation derived from policy analysis and value judgments in the book; not supported by empirical evidence in the blurb.
Algorithmic transparency and auditability can reduce systemic risk from opaque automated lending decisions and improve regulator oversight and macroprudential policy.
Conceptual/systemic-risk argument in the "Systemic risk & governance externalities" section; no empirical systemic-risk analysis provided.
Improved algorithmic transparency could reduce information asymmetries, lowering adverse selection and moral hazard over time and potentially expanding credit to underserved populations.
Conceptual economic argument in the "Credit allocation & pricing" section; based on theory rather than empirical testing.
If properly designed and enforced, the protocol measures can improve credit access for underserved populations and reduce biased exclusion, supporting inclusive growth.
Normative claim supported by doctrinal arguments, comparative regulatory literature and technical fairness literature synthesized in the audit (no controlled empirical evaluation reported).
Firms that effectively implement governed hyperautomation may realize sustainable efficiency and reliability advantages, potentially increasing market concentration in some sectors unless governance costs level the playing field.
Strategic and competitive-dynamics argument derived from case examples and best-practice synthesis; no sector-level empirical concentration measures presented.
Standardized governance patterns reduce information asymmetries, enabling insurers and regulators to better price and manage enterprise AI risks.
Policy implication argued from the existence of standardized governance artifacts (audit trails, certifications) and industry practice; conceptual, no empirical insurer/regulator data presented.
Embedding governance reduces downside risks (compliance fines, data breaches), improving expected net returns of automation investments and lowering the adoption threshold for risk-averse firms.
Conceptual cost-benefit argument and industry best-practice examples; lacking quantitative measurement of returns or threshold shifts.
Incentives for human‑augmenting AI (e.g., subsidies or tax incentives tied to task redesign and training) can promote inclusive adoption patterns.
Policy analysis and comparative case studies; theoretical models that predict firm adoption responses to incentives, but limited causal empirical evidence specific to AI-targeted incentives.
By synthesizing computer science, engineering, and financial policy insights, DRL should be viewed not merely as a mathematical tool but as a transformative agent within the global socio-technical infrastructure of capital markets.
High-level synthesis and interdisciplinary argumentation in the paper; no empirical evidence or longitudinal studies are cited in the excerpt to demonstrate systemic transformation.
Research agenda items include quantifying social returns to different alignment interventions, studying market equilibria under participatory vs. opaque strategies, and modeling optimal regulatory mixes under uncertainty about harms and capability growth.
Prescriptive research agenda derived from the paper's economic analysis and identified knowledge gaps; presented as proposed studies rather than completed research.
If conformal filtering produces vacuous outputs at factuality levels customers demand, adoption in knowledge-intensive domains may be limited until methods simultaneously provide robustness and informativeness; vendors using efficient verifiers and robust calibration may gain competitive advantage.
Paper's market/economic discussion drawing on empirical trade-offs (informativeness vs. factuality) and cost comparisons; this is an applied implication rather than a direct experimental result.
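The informativeness-versus-factuality trade-off behind this claim can be sketched as a toy calibration exercise. This is a schematic stand-in, not the paper's method: the verifier scores, the correctness model, and the Beta-distributed data below are all hypothetical, and the calibration rule is a simplified selective-prediction analogue of conformal filtering.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical calibration set: a verifier assigns each claim a confidence
# score, and higher-scored claims are more likely to be factually correct.
scores = rng.beta(2, 2, 2000)
correct = rng.random(2000) < scores

def calibrate_threshold(scores, correct, target):
    """Smallest score cutoff whose retained calibration claims reach the
    target factuality rate; None if no cutoff achieves it."""
    for t in np.sort(scores):
        kept = scores >= t
        if kept.any() and correct[kept].mean() >= target:
            return t
    return None

retained = {}
for target in (0.80, 0.95, 0.999):
    t = calibrate_threshold(scores, correct, target)
    retained[target] = float((scores >= t).mean()) if t is not None else 0.0
    print(f"target factuality {target}: fraction retained {retained[target]:.3f}")
```

As the demanded factuality level rises, the retained fraction shrinks toward zero, which is the "vacuous output" regime the claim describes: robustness is achieved by saying less and less.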
Authors propose the 'AI orchestra' concept: future development will involve coordinated ensembles of specialized AI agents (code generation, test generation, dependency analysis, security scanning) orchestrated by humans and higher-level controllers.
Theoretical/conceptual argument by the authors grounded in qualitative findings from Netlight (practitioner reports of multiple tools and coordination frictions); this is a forward-looking synthesis rather than an empirically established fact.
Modular and cell‑free platforms could enable decentralized, localized manufacturing of specialty compounds, potentially altering trade flows away from centralized petrochemical hubs.
Conceptual synthesis plus small-scale demonstrations of modular/cell-free units in the reviewed literature; limited pilot projects and discussion of potential scalability and portability.
Product teams evaluating LLM-powered features rely on a spectrum of practices—from informal “vibe checks” to organizational meta-work—to cope with LLMs’ unpredictability.
Qualitative interview study with 19 practitioners; thematic coding of transcripts produced descriptions of a range of evaluation practices used by teams.
Platform design choices (property rights, portability, reputation, tokenization, escrowed memories) will shape incentives for contributions to shared knowledge and agent improvement.
Policy and mechanism-design implications drawn from observed phenomena (shared memories, contributions, and trust) in the qualitative dataset; recommendation rather than empirically tested claim.
Shared memory architectures create public-good–like externalities (knowledge diffusion and spillovers) that may be underprovided absent coordination or platform governance.
Qualitative observations of shared memories and diffusion patterns plus theoretical economic interpretation; no empirical quantification of spillover magnitudes provided.
Easier specification of constraints can reduce some harms (clear safety violations) but centralizes normative power (who defines constraints) and creates international/cultural externalities and risks of regulatory capture.
Normative and economic argument in the paper combining technical tractability of constraints with governance concerns; this is an inference about likely distributional effects rather than empirically established fact.
Because failure modes such as definition misalignment and hypothesis creep were observed, the authors argue for regulation/standards around disclosure of AI-assisted scientific claims and archival of verification artifacts.
Policy recommendation in the paper derived from the documented process-level failure modes in the single project; recommendation is prescriptive, not empirically validated beyond the project.
Improved throughput and lower travel costs can induce additional travel demand (rebound), partially offsetting congestion/emissions gains unless paired with demand-management measures.
Theoretical economic reasoning presented in the paper as a caveat; not directly measured in the simulation experiments (no induced-demand dynamic experiments reported).
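The rebound reasoning above can be made concrete with a toy constant-elasticity demand calculation; the elasticity value and baseline trip volume below are hypothetical illustrations, not figures from the paper.

```python
# Toy induced-demand (rebound) illustration: under constant-elasticity
# travel demand, a generalized-cost reduction of dc% raises trip volume
# by a factor of (1 + dc/100) ** elasticity.
def rebound_demand(baseline_trips, cost_change_pct, elasticity=-0.5):
    """Trip volume after a generalized-cost change, constant-elasticity demand."""
    return baseline_trips * (1 + cost_change_pct / 100) ** elasticity

new_trips = rebound_demand(1_000_000, -20)  # assume a 20% cut in generalized cost
rebound_pct = 100 * (new_trips / 1_000_000 - 1)
print(f"induced extra travel: {rebound_pct:.1f}%")  # ~11.8% more trips
```

Even a modest elasticity converts a sizable share of the throughput gain into new travel, which is why the paper pairs the efficiency claim with a demand-management caveat.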
There is a social welfare trade‑off between personalization value (higher AAR) and normative/social risk (higher MR); optimal policy and product design should balance these using BenchPreS metrics.
Analytical argument combining empirical findings (trade‑off between AAR and MR) with economic welfare considerations; the paper does not present formal welfare estimates or market experiments.
AI in higher education is not simply a technological shift but a structural transformation requiring deliberate, critically informed governance grounded in equity and human agency.
Normative/conceptual conclusion drawn by the author from the thematic analysis and the critical AI media literacy framing; presented as the paper's principal argument or recommendation. (Supported qualitatively by themes from the analyzed discussions rather than quantitative causal evidence.)
The adoption of AI governance programmes by military institutions will have strategic implications.
Hypothesis stated by the author; presented as forward-looking analysis without accompanying empirical modeling, historical analogues, or measured strategic outcomes in the provided text.
The expansion of the gig economy reflects both genuine labor-market innovation enabling worker flexibility and cost shifting from firms to workers that policy intervention may appropriately address.
Synthesis and interpretation of the study's empirical findings (prevalence, heterogeneity, earnings gaps, distributional effects, and social protection measures) from administrative data, labor force surveys, and platform transaction records across 24 OECD countries (2015–2025).
Standard productivity metrics (e.g., output per hour) may misprice value if temporal quality matters; firms will face trade‑offs between maximizing throughput and preserving richer subjective temporality that affects long‑run creativity, morale, and retention.
Conceptual economic reasoning and literature synthesis on attention and productivity; no empirical studies or longitudinal workplace data presented.
Investors and firms may need to include metrics of experiential quality (subjective well‑being, sustained attention quality) alongside productivity metrics when valuing neurotech and human–AI platforms.
Normative/economic implication argued from the framework; no empirical valuation studies or survey of investor behavior included.
AI raises returns to platformization and can change the distribution of financial intermediation rents (potentially concentrating returns among platform incumbents).
Theoretical and economic reasoning in the 'Implications for AI Economics' section; conceptual discussion of platform effects and rents rather than empirical measurement in the paper summary.
Reported pilot gains, if scaled, could shift firm‑level returns and industry productivity measures, but gains are contingent on coordinated adoption; uneven uptake may produce winner‑takes‑more dynamics among technologically advanced firms.
Inference from pilot results and economic reasoning in the reviewed literature; no large‑scale empirical validation provided in the review.
Topology is the dominant factor for price stability and scalability compared to other swept variables (load, presence of hybrid integrator, governance constraints).
Factor-ablation analysis within the 1,620-run simulation study showing the largest explanatory effect (largest changes in volatility and scalability metrics) attributable to graph topology rather than load, hybrid flag, or governance settings.
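A one-way factor ablation of this kind can be sketched as a variance-explained comparison across the swept variables. The data below is a synthetic stand-in for the 1,620-run sweep (the study's actual outputs and factor names are not reproduced here), deliberately constructed so that topology drives the outcome, mirroring the claimed finding.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1620

# Hypothetical run-level records: each run's swept factor levels.
factors = {
    "topology":   rng.choice(["ring", "star", "scale_free"], n),
    "load":       rng.choice([0.5, 1.0, 2.0], n),
    "hybrid":     rng.choice([0, 1], n),
    "governance": rng.choice(["none", "strict"], n),
}
# Synthetic volatility outcome dominated by topology, plus a small load effect.
topo_effect = np.array([{"ring": 0.1, "star": 0.5, "scale_free": 0.9}[t]
                        for t in factors["topology"]])
volatility = topo_effect + 0.05 * factors["load"] + rng.normal(0, 0.05, n)

def eta_squared(levels, outcome):
    """Share of outcome variance explained by a factor's level means
    (between-group sum of squares over total sum of squares)."""
    grand = outcome.mean()
    ss_total = ((outcome - grand) ** 2).sum()
    ss_between = sum(
        (levels == lv).sum() * (outcome[levels == lv].mean() - grand) ** 2
        for lv in np.unique(levels)
    )
    return ss_between / ss_total

shares = {name: eta_squared(vals, volatility) for name, vals in factors.items()}
print(max(shares, key=shares.get))  # "topology" dominates in this synthetic sweep
```

Ranking factors by this variance share is one simple way to operationalize "dominant factor" in an ablation over a full-factorial sweep.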
Demand for mid-level, routine-focused developer roles could compress while demand rises for verification, security, and AI–human orchestration skills.
Theoretical task-replacement argument based on observed capabilities of LLMs and synthesized user study evidence; limited direct labor-market empirical evidence in the reviewed literature.
Routine coding tasks may be partially automated, shifting human labor toward verification, integration, architecture, and domain-specific tasks.
Task-composition studies, user studies showing LLMs handle boilerplate/routine work, and economic inference synthesized across studies.
Societal acceptance of AI-generated audiovisual media is uncertain and could range from widespread uptake to broad rejection.
Discussion drawing on mixed empirical studies and scenario construction in the review; the paper notes contradictory findings in existing studies but does not provide primary survey data or sample sizes.
If cognitive interlocks are widely adopted, many negative externalities can be internalized and AI-driven productivity gains can be realized more sustainably; absent such controls, equilibrium may drift toward higher error rates and systemic incidents.
Long-run equilibrium argument based on theoretical reasoning and conditional claims; no longitudinal or cross-firm empirical evidence presented.
Labor demand effects are ambiguous: junior/entry-level demand may be reduced for some tasks while demand for verification and higher-skill roles may rise.
Economic reasoning, early observational signals, and theoretical task-reallocation frameworks; empirical longitudinal evidence is limited or absent.
The effectiveness of generative AI depends critically on human-AI workflows: prompt design, iterative refinement, and human vetting materially affect outcomes.
Qualitative analyses of interaction patterns and experiments manipulating prompting/iteration showing variation in outcomes; many studies report improved outputs after iterative prompting and human-in-the-loop refinement.
Market demand is likely to bifurcate: high-value clinical markets will require rigorous explainability and neuroscientific grounding (higher willingness-to-pay), while research and consumer segments may tolerate black-box models (lower margins).
Market segmentation argument built from differing end-user requirements and tolerance for opaque models; presented as a projected implication rather than an empirically tested market study.
Teams often produce evaluation outputs (tests, metrics, user feedback) but lack mechanisms, processes, or technical levers to convert those outputs into actionable engineering or product changes—a novel “results-actionability gap.”
Recurring theme from the 19 practitioner interviews and coding; authors explicitly articulate and label this gap based on participants' reports.
The study confirms several previously documented evaluation challenges with LLMs: model unpredictability, metric mismatch, high human-evaluation costs, and difficulty reproducing failures.
Interview data from 19 practitioners; thematic analysis flagged these recurring problems as reported by participants and aligned with prior literature.
Emergent quality hierarchies among agents imply winner-take-most dynamics in informational value and potential market concentration in agent quality.
Observed formation of quality hierarchies in agent interactions and documented economic interpretation; this is a hypothesis/implication drawn from qualitative patterns rather than measured market outcomes.
Security of LLM-based MASs functions as an economic externality: failures can impose social costs (misinformation, poor collective decisions), and absent liability or market incentives providers may underinvest in robustness.
Economic reasoning and implication section in the paper—conceptual argument linking the technical vulnerability to economic externality and incentive misalignment. No empirical economic data provided in the summary.
Analytical conditions on stubbornness and influence weights identify when a single adversary can dominate network dynamics (i.e., influence propagation criteria derived from FJ fixed-point analysis).
Mathematical/theoretical analysis of FJ model fixed points and influence propagation in the paper; derivation of conditions relating agent stubbornness and interpersonal trust weights to steady-state influence.
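The fixed-point analysis referenced here can be illustrated with the standard Friedkin-Johnsen (FJ) update x(t+1) = ΛWx(t) + (I − Λ)x(0), whose steady state is x* = (I − ΛW)^(-1)(I − Λ)x(0). The sketch below computes that steady-state influence matrix for a hypothetical three-agent network (the network, trust weights, and stubbornness values are illustrative, not from the paper).

```python
import numpy as np

def fj_influence(W, stubbornness):
    """Steady-state FJ influence matrix V with x* = V @ x0, where W is a
    row-stochastic trust matrix and Lambda = I - diag(stubbornness).
    Column j of V is agent j's influence on every agent's final opinion."""
    n = W.shape[0]
    Lam = np.eye(n) - np.diag(stubbornness)
    return np.linalg.solve(np.eye(n) - Lam @ W, np.eye(n) - Lam)

# Toy case: agent 0 is a fully stubborn adversary that the others trust heavily.
W = np.array([
    [1.0, 0.0, 0.0],
    [0.8, 0.1, 0.1],
    [0.8, 0.1, 0.1],
])
stubbornness = np.array([1.0, 0.05, 0.05])  # adversary never updates its opinion

V = fj_influence(W, stubbornness)
print(V[:, 0])  # each agent's steady-state weight on the adversary's initial opinion
```

With high trust in the adversary and low stubbornness elsewhere, the adversary's column of V approaches 1 for every agent, which is the kind of dominance condition the analytical criteria characterize.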
If models frequently leak or misuse preferences in third‑party contexts, users and organizations will discount the value of personalization or demand stronger controls, increasing costs for deploying memory features and reducing consumer surplus.
Economic reasoning and implication drawn from the observed misapplication behavior; no empirical user adoption or market data provided in the study to directly support this claim.
The failure mode (misapplication of preferences to third parties) creates negative externalities (privacy violations, normative harms, misinformation, contractual breaches) that markets and platforms may not internalize without regulation or design changes.
Economic interpretation and argumentation building on the empirical failure mode; these harms are hypothesized implications rather than measured outcomes in the paper.
Unclear liability frameworks increase perceived and real costs and can slow adoption by hospitals and insurers.
Policy analyses and procurement narratives noting liability uncertainty cited as a barrier to procurement and deployment.
Up-front implementation costs commonly include procurement, integration with PACS/EMR, UI/UX development, regulatory compliance, and staff training; recurring costs include monitoring, data labeling, software updates, and cybersecurity.
Implementation reports, vendor and hospital accounts, and qualitative studies documenting cost categories (specific dollar amounts vary across settings and are rarely published in detail).
Without continuous support for upskilling/reskilling and inclusive policies, AI risks becoming a source of exclusion rather than an enabler of human advancement.
Normative conclusion derived from reviewed literature and thematic interpretation in the qualitative study (literature-based; evidence is secondary and not quantified).
The authors' synthesis of the research literature estimates 70-75% automation potential.
Quantitative estimate offered by the authors (70-75%) as part of function-by-function analysis; no described empirical evaluation or sample supporting the figure.
Knowledge transmission (teaching/lecturing) is estimated at 75-80% AI substitutability.
Authors' quantitative estimate presented in the analysis (75-80%); the paper does not detail empirical methods or validation samples for this percentage.