Evidence (4857 claims)
Claim counts by topic:

- Adoption: 5586 claims
- Productivity: 4857 claims
- Governance: 4381 claims
- Human-AI Collaboration: 3417 claims
- Labor Markets: 2685 claims
- Innovation: 2581 claims
- Org Design: 2499 claims
- Skills & Training: 2031 claims
- Inequality: 1382 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 417 | 113 | 67 | 480 | 1091 |
| Governance & Regulation | 419 | 202 | 124 | 64 | 823 |
| Research Productivity | 261 | 100 | 34 | 303 | 703 |
| Organizational Efficiency | 406 | 96 | 71 | 40 | 616 |
| Technology Adoption Rate | 323 | 128 | 74 | 38 | 568 |
| Firm Productivity | 307 | 38 | 70 | 12 | 432 |
| Output Quality | 260 | 71 | 27 | 29 | 387 |
| AI Safety & Ethics | 118 | 179 | 45 | 24 | 368 |
| Market Structure | 107 | 128 | 85 | 14 | 339 |
| Decision Quality | 177 | 75 | 37 | 19 | 312 |
| Fiscal & Macroeconomic | 89 | 58 | 33 | 22 | 209 |
| Employment Level | 74 | 34 | 78 | 9 | 197 |
| Skill Acquisition | 98 | 36 | 40 | 9 | 183 |
| Innovation Output | 121 | 12 | 24 | 13 | 171 |
| Firm Revenue | 98 | 35 | 24 | — | 157 |
| Consumer Welfare | 73 | 31 | 37 | 7 | 148 |
| Task Allocation | 87 | 16 | 34 | 7 | 144 |
| Inequality Measures | 25 | 76 | 32 | 5 | 138 |
| Regulatory Compliance | 54 | 61 | 13 | 3 | 131 |
| Task Completion Time | 89 | 7 | 4 | 3 | 103 |
| Error Rate | 44 | 51 | 6 | — | 101 |
| Training Effectiveness | 58 | 12 | 12 | 16 | 99 |
| Worker Satisfaction | 47 | 33 | 11 | 7 | 98 |
| Wages & Compensation | 54 | 15 | 20 | 5 | 94 |
| Team Performance | 47 | 12 | 15 | 7 | 82 |
| Automation Exposure | 27 | 26 | 10 | 6 | 72 |
| Job Displacement | 6 | 39 | 13 | — | 58 |
| Hiring & Recruitment | 40 | 4 | 6 | 3 | 53 |
| Developer Productivity | 34 | 4 | 3 | 1 | 42 |
| Social Protection | 22 | 11 | 6 | 2 | 41 |
| Creative Output | 16 | 7 | 5 | 1 | 29 |
| Labor Share of Income | 12 | 6 | 9 | — | 27 |
| Skill Obsolescence | 3 | 20 | 2 | — | 25 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
Productivity
AI may alter firms' competitive dynamics by amplifying scale advantages and platform effects, making antitrust, data portability, and competition policy relevant tools for preserving contestability and innovation.
Synthesis of industrial organization theory and empirical observations of platform markets and data-driven firms cited in the literature review; no primary empirical study included in this paper.
If quantum advantages accrue initially to well-capitalized incumbents (cloud providers, financial firms, pharmaceuticals), we should expect increased market power and higher rents.
Scenario analysis and historical analogs where early compute advantages concentrated market power; qualitative market-structure modeling.
Benefits of quantum diffusion are likely to be uneven across countries, firms, and workers—boosting regions with strong innovation ecosystems and possibly increasing market concentration among compute-capable incumbents.
Multi-region/sectoral modeling with heterogeneous adoption and capability parameters; historical analogs showing concentration following early compute advantages; scenario comparisons.
Without coordinated investments and governance, large theoretical gains may remain unrealized or be very unevenly distributed.
Policy counterfactual scenarios in which underinvestment, fragmented governance, or restrictive export regimes reduce adoption elasticities and infrastructure readiness, producing lower and more concentrated macro gains compared with coordinated-investment scenarios.
On its own, high executive digital cognition tends to weaken the policy's positive effect on energy utilization efficiency (interpreted as reflecting short-run adjustment costs from digital transformation).
Interaction tests between policy treatment and an executive-level digital-cognition measure show a negative interaction coefficient in DID regressions; authors interpret this as evidence of short-run adjustment costs.
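As a minimal sketch of this kind of moderation test, the snippet below fits a DID specification with a triple interaction on synthetic data; the column names (`treat`, `post`, `dig_cog`) and simulated effect sizes are illustrative assumptions, not the paper's variables or estimates.

```python
# Hedged sketch: DID with an executive-digital-cognition moderator.
# All data are simulated; names and effect sizes are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "treat": rng.integers(0, 2, n),     # policy-treated firm indicator
    "post": rng.integers(0, 2, n),      # post-policy period indicator
    "dig_cog": rng.normal(0, 1, n),     # executive digital-cognition measure
})
# Simulate the reported pattern: a positive policy effect that is dampened
# when executive digital cognition is high (negative triple interaction).
df["energy_eff"] = (
    0.5 * df.treat * df.post
    - 0.3 * df.treat * df.post * df.dig_cog
    + rng.normal(0, 1, n)
)

# `treat:post` is the DID effect; `treat:post:dig_cog` is the interaction
# term the authors interpret as short-run adjustment costs.
model = smf.ols("energy_eff ~ treat * post * dig_cog", data=df).fit()
print(model.params[["treat:post", "treat:post:dig_cog"]])
```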
The under‑use of external text sources in the reviewed literature may be due to privacy, legal/regulatory uncertainty, or integration costs.
Authors' interpretation linking observed low coverage of external text sources (social media, news, reviews) in the 109 articles to plausible barriers (privacy/regulation/integration); no direct empirical test in the review.
Widespread deployment of similar models could create correlated failures or fraud vectors, implying systemic risk that may warrant macroprudential attention.
Analytic caution based on model homogeneity and case/literature discussion; speculative systemic risk concern rather than empirically demonstrated.
There is regulatory uncertainty around AI-generated filings and responsibility/liability for automated outputs.
Analysis and literature review discuss unclear regulatory positions and legal risks noted in case organizations' deployment considerations.
Integration complexity with legacy ERP/financial systems and sharing-center processes is a significant implementation challenge.
Case study narratives describe integration work and friction points; analytic framing highlights ERP compatibility issues.
Model hallucinations, lack of explainability, and limited audit trails constrain safe adoption.
Paper cites literature and case observations about model reliability and explainability issues; examples and discussion are qualitative.
Data privacy, confidentiality, and cross-border data transfer concerns are important barriers to deployment.
Challenges enumerated from case studies and literature; specific organizational concerns cited in cases (Xiaomi, Deloitte) and in regulatory discussion.
Absent interoperability, divergence in data and AI rules will raise transaction costs, reduce trade gains, and create opportunities for regulatory arbitrage.
Economic reasoning and scenario-based projections; asserted as an outcome of mechanism analysis rather than demonstrated with quantitative estimates.
Explainability, auditability, or data-localization requirements could favor larger vendors with compliance capacity, increasing market concentration and affecting competition among AI suppliers.
Market-structure argument grounded in regulatory-compliance burden analysis and comparative examples; not supported by empirical market data in the study.
Legal uncertainty and strict procedural requirements increase compliance costs and regulatory risk, which can slow AI adoption by firms and public agencies.
Theoretical economic implications drawn from legal analysis and comparative observations; no empirical measurement of costs or adoption rates in the study.
AI can restrict or reshape human administrative discretion in legally sensitive ways.
Doctrinal analysis of statutory specificity and formal procedural requirements in civil-law contexts, illustrated with Vietnam as the exemplar case; comparative observations.
Physical constraints (power grid reliability, water consumption for cooling, and data-center capacity) together with diminishing marginal returns on scaling make continued monolithic scaling economically and environmentally risky.
Conceptual argumentation using known infrastructure constraints and economic reasoning about diminishing returns; no new empirical assessment or quantified risk analysis included.
Reasoning-augmented models (e.g., models using chain-of-thought, multi-step reasoning, or external retrieval/looping) can inflate per-query compute by orders of magnitude, exacerbating sustainability problems.
Argument based on architectural patterns (multi-step reasoning, retrieval augmentation, multiple model passes) and reported per-query compute multipliers in auxiliary literature (referenced anecdotally); the paper provides no new benchmarked per-query compute measurements.
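As a back-of-envelope illustration of the multiplier argument, the sketch below compares per-query decode FLOPs under assumed token counts; the model size, token budgets, and pass counts are illustrative assumptions, not figures from the paper.

```python
# Back-of-envelope sketch of the per-query compute multiplier argument.
# Model size, token budgets, and pass counts are illustrative assumptions.
params = 70e9                      # assumed dense model size (weights)
flops_per_token = 2 * params       # rough decode cost: ~2 FLOPs per weight

scenarios = {
    "direct answer": 150,                     # answer tokens only
    "chain-of-thought": 150 + 2_000,          # answer plus a reasoning trace
    "multi-pass agentic": (150 + 2_000) * 8,  # retrieval/looping over 8 passes
}

base = scenarios["direct answer"] * flops_per_token
for label, tokens in scenarios.items():
    cost = tokens * flops_per_token
    print(f"{label:>18}: {cost:.1e} FLOPs ({cost / base:.0f}x the direct answer)")
```

Even with conservative token budgets, a long reasoning trace and multiple passes compound multiplicatively, which is the mechanism behind the orders-of-magnitude claim.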
The energetic burden of generative AI is shifting from one-time training to recurring, potentially unbounded inference costs as models become productized and high-traffic.
Synthesis of industry observations and early/anecdotal quantitative reports on operational workloads; no original empirical time-series or workload measurements provided in this paper.
Scaling monolithic LLMs toward artificial general intelligence (AGI) is colliding with hard physical and economic limits (energy, grid stress, water use, diminishing returns).
Conceptual synthesis and argumentation drawing on observed industry trends (training/inference cost growth), infrastructure constraints (grid reliability, data-center cooling/water use) and theoretical diminishing marginal returns on model/data scaling. No new empirical dataset or controlled experiments reported in the paper.
Capabilities and data advantages for certain vendors could lead to market concentration and platform dominance in AI-driven educational feedback.
Expert concern synthesized from the workshop of 50 scholars about market dynamics; theoretical warning without empirical market-structure analysis in the report.
Differential access to high-quality AI feedback systems and bias in training data can exacerbate educational inequalities and harm marginalized groups.
Expert consensus and thematic analysis from the 50-scholar workshop, raising equity and bias risks; no empirical subgroup effectiveness estimates included.
Learners may over-rely on AI feedback or game the system to obtain desirable responses, reducing effortful learning.
Workshop participant concerns synthesized qualitatively; cited as risk and an open empirical question—no experimental data provided.
Agents that attempt to infer others' reasoning depth may be vulnerable to strategic misrepresentation (partners could behave to induce incorrect ToM estimates).
Conceptual analysis in the paper and discussion of strategic incentives; paper also identifies the risk and suggests potential mitigations (e.g., conservatism, verification, meta-reasoning).
Both too little and too much recursive reasoning (i.e., too shallow or too deep ToM) can produce poor joint behavior — miscalibrated anticipation harms coordination.
Observed non-monotonic effects in the reported experiments where fixed-order agents at either low or high ToM orders performed worse in mismatched pairings; evidence comes from the same multi-environment evaluation using joint-payoff / success-rate metrics.
Misalignment in Theory-of-Mind (ToM) order between agents (i.e., agents using different recursive reasoning depths) degrades coordination performance.
Empirical experiments using LLM-driven agents with configurable ToM depth across four coordination environments (a repeated matrix game, two grid navigation tasks, and an Overcooked task); comparisons of matched (same-order) vs mismatched (different-order) pairings using task-specific joint payoffs and success rates as metrics.
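Below is a toy harness for the matched- vs mismatched-order pairing protocol described above. The two-action "intersection" game and its payoffs are illustrative assumptions, far simpler than the paper's four environments, so the sketch demonstrates the evaluation protocol and the miscalibration mechanism, not the paper's reported results.

```python
# Toy pairing harness: agents with configurable ToM (level-k) depth are
# paired at every combination of orders and scored by joint payoff.
# Game and payoffs are illustrative assumptions, not the paper's setup.

PAYOFF = {  # (action1, action2) -> joint payoff
    ("go", "go"): -10.0,       # collision
    ("yield", "yield"): -1.0,  # deadlock delay
    ("go", "yield"): 1.0,
    ("yield", "go"): 1.0,
}

def level_k_action(k: int) -> str:
    """Order-0 agents act greedily; an order-k agent best-responds to a
    simulated order-(k-1) partner (the recursive ToM step)."""
    if k == 0:
        return "go"
    partner = level_k_action(k - 1)
    return "yield" if partner == "go" else "go"

# Tabulate joint payoff over all order pairings: an agent's plan is only
# correct if its model of the partner's depth matches the partner's
# actual depth.
for k1 in range(4):
    row = " ".join(f"{PAYOFF[(level_k_action(k1), level_k_action(k2))]:6.1f}"
                   for k2 in range(4))
    print(f"order {k1} vs orders 0..3: {row}")
```

In this toy, two order-2 agents each wrongly model the other as order 1 and both "go", so even nominally matched depths can embed miscalibrated beliefs about the partner; the paper's richer environments are where same-order pairings support mutually consistent plans and mismatched orders degrade coordination.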
There is a risk of manipulation and misinformation if argument mining/synthesis is unregulated or misaligned with social incentives, creating externalities that may justify public intervention.
Conceptual risk assessment combining known misinformation dynamics and AI capabilities; no empirical incident data provided.
Increased error risk and weaker explainability from GLAI will raise malpractice and liability exposure for firms and lawyers, driving up insurance and compliance costs.
Legal-risk analysis and economic reasoning connecting explainability/liability to insurance costs; no empirical cost studies presented.
The combination of hallucination and professional overreliance strains existing regulatory goals (e.g., explainability, human oversight) within European AI governance frameworks.
Legal and regulatory analysis mapping technical and behavioral risks onto European AI governance goals; references to statutory/regulatory texts and policy debates. Qualitative argumentation rather than empirical test.
Fabricated or opaque intermediate data and reasoning in GLAI weaken explainability, making it difficult to provide meaningful explanations about how outputs were produced.
Conceptual analysis of token-prediction architectures, literature on explainability limits of LLMs, and legal/regulatory analysis referencing explainability requirements. No empirical measurement.
Hallucinated content produced by GLAI is often linguistically fluent and persuasive, increasing the risk that legal professionals will accept it without verification.
Literature synthesis on model fluency and behavioral literature on trust in coherent authoritative outputs, plus illustrative vignettes. No original experimental data or sample size.
This architectural mismatch (token-prediction vs. formal legal reasoning) contributes to confident but factually incorrect outputs (hallucinations) in GLAI.
Technical/conceptual analysis plus synthesis of existing literature on hallucinations in generative models; illustrative examples and vignettes provided. No primary empirical measurement in the paper.
Observed failure modes during the workflow included hypothesis creep, definition-alignment bugs (mismatch between informal and formal definitions), and agent avoidance behaviors (agents delegating or failing to complete tasks).
Qualitative analysis and post-mortem reported in the paper based on the single project workflow and logs; specific failure modes enumerated by authors from their process observations.
Absence of governance and observability could increase social costs of accidents and induce conservative regulation that stifles beneficial adoption.
Policy reasoning and historical regulatory responses to systemic risks; conceptual projection without quantitative modeling of regulatory impact.
Strong proprietary stacks and incompatible protocols could create winner‑take‑all or oligopolistic market outcomes due to network effects and switching costs.
Market‑structure theory and historical platform examples (e.g., dominant tech platforms); argument is conceptual and not backed by new empirical market analysis in the paper.
Without these architectural commitments, the economic costs — stranded assets, safety incidents, reduced innovation, and high coordination costs — will be substantial.
Predictive economic argument built from historical IoT/Internet lessons and systems reasoning; no quantitative cost estimates or econometric analysis in the paper.
Poor governance and observability in agent networks would make accountability, certification, and regulation difficult.
Policy and governance reasoning with illustrative domain examples; conceptual argument without empirical governance case studies or metrics.
Weak or brittle security and trust mechanisms across distributed agent ecosystems will pose serious risks.
Lessons drawn from IoT security failures and conceptual threat analysis; no new penetration testing or security metrics presented.
Lifecycle mismatch — rapidly evolving AI software embedded in long‑lived physical assets — risks premature ossification or expensive retrofits.
Systems engineering reasoning and historical analogies to embedded systems/IoT lifecycles; no quantitative lifecycle modeling or case study data in the paper.
Misalignment or poor meta-control could produce persistent unsafe behaviors in autonomous learners; governance and oversight mechanisms will be crucial.
Risk analysis based on conceptual failure modes for meta-control; no empirical incidents reported in the paper.
Current models transfer poorly across domains, are brittle in nonstationary environments, and are inefficient in physical/embodied tasks.
Synthesis of known challenges from prior literature and practical experience; paper cites these as motivating observations rather than reporting new data.
Current models have limited meta-control and do not autonomously decide when to explore, imitate, consult prior knowledge, or consolidate.
Conceptual critique based on typical ML training pipelines and limited on-line decision-making modules; no empirical tests in paper.
There is weak integration between passive observation (supervised/representation learning) and active experimentation (reinforcement/exploratory learning) in current systems.
Observation of methodological separation in current literature and systems; conceptual discussion in the paper.
Current AI models lack the architectures and control mechanisms required for sustained, autonomous learning in dynamic real-world settings.
Conceptual/theoretical analysis presented in the paper; synthesis of limitations observed in existing literature and practices (no new empirical data provided).
Attribution (labeling responses as AI) can alter perceived empathy and therefore matters for product design, branding, and disclosure policy decisions.
Findings from the attribution effect experiment showing reduced feelings of being heard/validated when replies are labeled AI despite identical content; authors discuss implications for product design and disclosure.
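A hedged sketch of the two-condition contrast such an attribution experiment implies: identical reply text, differing only in the disclosed label. The rating scale, group sizes, and effect size below are simulated assumptions, not the paper's data.

```python
# Hedged sketch of the attribution-effect comparison: identical reply text,
# only the disclosed label differs. All data are simulated assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 120                                # assumed participants per condition
unlabeled = rng.normal(5.4, 1.2, n)    # "feeling heard" rating, 1-7 scale
labeled_ai = rng.normal(4.9, 1.2, n)   # same text, disclosed as AI-written

t, p = stats.ttest_ind(labeled_ai, unlabeled)
d = (labeled_ai.mean() - unlabeled.mean()) / np.sqrt(
    (labeled_ai.var(ddof=1) + unlabeled.var(ddof=1)) / 2)  # Cohen's d
print(f"t = {t:.2f}, p = {p:.4f}, Cohen's d = {d:.2f}")
```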
Existing idea-evaluation approaches (LLM judges or human panels) are subjective and disconnected from real research outcomes.
Framing and motivation in the paper arguing current approaches rely on subjective judgments and do not directly tie to later publication/citation outcomes; supported implicitly by the empirical mismatch (LLM-judge vs HindSight).
LEAFE's benefits depend on informative, actionable feedback; environments with noisy or adversarial feedback may limit improvements.
Limitations stated in the paper noting sensitivity to feedback quality; conceptual reasoning that the method relies on extracting actionable signals from environment feedback.
Outcome-driven post-training (optimizing final rewards) underutilizes rich environment feedback and causes 'distribution sharpening' — policies overfit a narrow set of successful behaviors and fail to broaden problem-solving/recovery capacity in long-horizon settings.
Problem diagnosis in the paper supported by comparison of outcome-driven RL (GRPO) performance versus LEAFE and by conceptual argument about how optimizing final success signals can narrow behavioral support; supported by empirical observations of poorer recovery/generalization in baselines.
Rotation-based PTQ methods (designed for integer formats) fail on MXFP4 because global orthogonal rotations move outlier energy across quantization blocks, creating new outliers and often producing bimodal activations that underutilize the limited MXFP range.
Analytical argument backed by empirical observations reported in the paper: activation-distribution analysis demonstrating cross-block outlier propagation and bimodality when applying global orthogonal rotations to MXFP4-blocked layouts; comparisons to performance collapse under those methods.
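The cross-block propagation mechanism can be illustrated with a short numeric sketch. The vector shape, outlier magnitude, and random orthogonal rotation below are illustrative assumptions (real PTQ rotations are structured, e.g., Hadamard-based, but any global orthogonal transform mixes coordinates across blocks the same way); this shows only the outlier-spreading mechanism, not the paper's full analysis.

```python
# Sketch: a global orthogonal rotation spreads one block's outlier energy
# into every block, inflating each block's shared MX scale.
import numpy as np

rng = np.random.default_rng(0)
d, block = 128, 32                # MX formats share one scale per 32 values

# Synthetic activation vector: small values plus one outlier channel.
x = rng.normal(0.0, 0.05, size=d)
x[7] = 8.0                        # outlier confined to block 0

Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # random global rotation
x_rot = Q @ x

def per_block_max(v):
    """Per-block maximum magnitude, which sets the shared MX block scale."""
    return np.abs(v.reshape(-1, block)).max(axis=1)

print("block maxima before rotation:", np.round(per_block_max(x), 2))
print("block maxima after rotation: ", np.round(per_block_max(x_rot), 2))
```

Because every value in a block is quantized against that block's shared (power-of-two) scale, raising a previously quiet block's maximum by roughly an order of magnitude coarsens the FP4 grid for all 32 of its values, which is the cross-block outlier propagation described above.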
High governance costs in regulated/high-risk domains can slow adoption of agentic systems, concentrating deployment in less regulated uses or among large firms that can afford governance infrastructure.
Economic reasoning about fixed and marginal governance costs and firm-level adoption decisions; no empirical adoption data presented.
Path-dependent behavior increases the complexity of principal–agent contracting and moral hazard between platforms, enterprise customers, and downstream users, requiring richer contract terms (acceptable paths, logging, audit rights).
Economic theory reasoning and applied contract/design implications discussed; no empirical contract-study data.