Evidence (5539 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	402	112	67	480	1076
Governance & Regulation	402	192	122	62	790
Research Productivity	249	98	34	311	697
Organizational Efficiency	395	95	70	40	603
Technology Adoption Rate	321	126	73	39	564
Firm Productivity	306	39	70	12	432
Output Quality	256	66	25	28	375
AI Safety & Ethics	116	177	44	24	363
Market Structure	107	128	85	14	339
Decision Quality	177	76	38	20	315
Fiscal & Macroeconomic	89	58	33	22	209
Employment Level	77	34	80	9	202
Skill Acquisition	92	33	40	9	174
Innovation Output	120	12	23	12	168
Firm Revenue	98	34	22	—	154
Consumer Welfare	73	31	37	7	148
Task Allocation	84	16	33	7	140
Inequality Measures	25	77	32	5	139
Regulatory Compliance	54	63	13	3	133
Error Rate	44	51	6	—	101
Task Completion Time	88	5	4	3	100
Training Effectiveness	58	12	12	16	99
Worker Satisfaction	47	32	11	7	97
Wages & Compensation	53	15	20	5	93
Team Performance	47	12	15	7	82
Automation Exposure	24	22	9	6	62
Job Displacement	6	38	13	—	57
Hiring & Recruitment	41	4	6	3	54
Developer Productivity	34	4	3	1	42
Social Protection	22	10	6	2	40
Creative Output	16	7	5	1	29
Labor Share of Income	12	5	9	—	26
Skill Obsolescence	3	20	2	—	25
Worker Turnover	10	12	—	3	25

Adoption

We present, to the best of our knowledge, the first large-scale study of real-world conversational programming in IDE-native settings.

Authors' assertion about novelty; study scope described (analysis of messages from Cursor and GitHub Copilot across public repositories).

medium mixed Programming by Chat: A Large-Scale Behavioral Analysis of 11... existence/novelty of a large-scale empirical study of IDE-native conversational ...

Mundlak (correlated random effects) specifications indicate that the between-country components are statistically insignificant, while within-country effects remain significant.

Results from Mundlak (correlated RE) specifications reported in abstract indicating insignificance of between-country components and significance of within-country components (no numeric coefficients for the between/within split given in abstract).

medium mixed E-government development: Artificial intelligence vibrancy a... E-Government Development Index (EGDI)
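
For readers unfamiliar with the Mundlak device named in the entry above, the following is a minimal sketch of how a correlated random effects specification separates within-country from between-country components. It is not the paper's model: the toy panel, the slopes (2.0 within, 0.5 between), and the function names `mundlak_fit`, `ols`, and `solve_linear` are all illustrative assumptions, and the noise-free data means no significance testing is shown.

```python
from statistics import mean


def solve_linear(a, b):
    """Solve a small dense system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Ordinary least squares via the normal equations X'X beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def mundlak_fit(panel):
    """Regress y on [1, x, country mean of x].

    The coefficient on x is the within-country effect; the coefficient on
    the country mean is the between effect minus the within effect.
    """
    rows, y = [], []
    for obs in panel.values():
        x_bar = mean(x for x, _ in obs)
        for x, yi in obs:
            rows.append([1.0, x, x_bar])
            y.append(yi)
    return ols(rows, y)


def make_toy_panel():
    """Hypothetical noise-free panel: within slope 2.0, between slope 0.5."""
    xs = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0], "C": [7.0, 9.0, 11.0]}
    panel = {}
    for country, vals in xs.items():
        x_bar = mean(vals)
        panel[country] = [(x, 1.0 + 2.0 * (x - x_bar) + 0.5 * x_bar) for x in vals]
    return panel


intercept, within_coef, mean_coef = mundlak_fit(make_toy_panel())
between_coef = within_coef + mean_coef
print(round(within_coef, 6), round(between_coef, 6))
```

In an applied setting the same regression would be estimated with random effects and standard errors, and the significance of the country-mean terms is what the entry above refers to as the between-country component.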

Our results suggest that arbitrage can be a powerful force in AI model markets with implications for model development, distillation, and deployment.

Synthesis/conclusion based on the paper's empirical findings (case study, robustness experiments, distillation analysis) and economic interpretation.

medium mixed Computational Arbitrage in AI Model Markets overall economic influence of arbitrage on model development, distillation, and ...

The paper provides supporting empirical evidence spanning frontier laboratory dynamics, post-training alignment evolution, and the rise of sovereign AI as a geopolitical selection pressure.

Empirical/observational sections in the paper that the authors state cover those three areas (specific datasets, experiments, or case studies are referenced in the text but not quantified in the abstract).

medium mixed Punctuated Equilibria in Artificial Intelligence: The Instit... empirical patterns consistent with the institutional fitness and punctuated-equi...

The paper develops an illustrative empirical application based on event studies of AI-agent capability disclosures and heterogeneous market repricing.

Methodological description in the paper: an illustrative empirical application using event-study methodology on AI capability disclosures and observing heterogeneous market repricing; the excerpt does not report sample size or quantified results.

medium mixed AI Agents in Financial Markets: Architecture, Applications, ... market repricing heterogeneity following AI-agent capability disclosure events

Macroeconomic effects remain hard to observe because of a 'productivity J-curve': firms often must invest in organizational changes first and only later realize measurable financial/productivity gains from AI.

Conceptual synthesis supported by firm-level case studies and empirical papers in the reviewed literature indicating implementation lags; the brief frames this as an interpretation of mixed short-run macro evidence rather than a single causal estimate.

medium mixed AI, Productivity, and Labor Markets: A Review of the Empiric... timing/lags in firm productivity and realization of financial gains from AI inve...

There are architectural tensions between actor-critic frameworks and value-based methods in DRL for finance, and state-space representation and reward function engineering are important to performance in complex financial environments.

Analytical comparison and emphasis in the paper; the excerpt does not include quantitative comparisons, ablation studies, or dataset descriptions to substantiate which architectures perform better under which conditions.

medium mixed Deep Reinforcement Learning for Dynamic Portfolio Optimizati... algorithmic performance differences as a function of DRL architecture, state rep...

The paper provides an extensive system-level investigation into the deployment of DRL architectures for dynamic portfolio optimization.

Stated scope of the paper (system-level investigation); details about methods, datasets, experimental design, or sample sizes are not given in the provided text.

medium mixed Deep Reinforcement Learning for Dynamic Portfolio Optimizati... operational and performance characteristics of DRL deployments for dynamic portf...

The success of regulatory sandboxes ultimately depends on sound institutional safeguards, proportionality, and alignment with broader policy objectives.

Normative conclusion derived from the paper's analytical framework and comparative lessons (no empirical validation reported in the abstract).

medium mixed Experimentalism beyond ex ante regulation: A law and economi... RS success measured by effectiveness, accountability, proportionality, and polic...

Triangulation using Social Interactionism, Critical Discourse Analysis, and Semiotics links statistical gains to mechanisms of epistemic appropriation and symbolic legitimation.

Analytical approach described in the paper; theoretical mapping of observed quantitative gains to social-mechanistic explanations based on discourse samples and observations.

medium mixed From Linguistic Hybridity to Development Sovereignty: Pidgin... mechanisms explaining comprehension/adoption/legitimacy outcomes (theoretical li...

The study's interpretation reframes observed outcomes as effects of linguistic sovereignty rather than merely technical communication failures.

Theoretical synthesis using triangulation of Social Interactionism, Critical Discourse Analysis, and Semiotics applied to empirical findings and discourse data from the field sample.

medium mixed From Linguistic Hybridity to Development Sovereignty: Pidgin... conceptual framing of causes behind comprehension/adoption/legitimacy outcomes (...

Commercial platforms' incentives may not align with public-interest verification, so economic policies (transparency mandates, data portability, competition policy) can reshape incentives and improve information ecosystems.

Policy implication drawn from the study's analysis of platform governance and incentive misalignment, supported by interviews and documents discussing platform interactions.

medium mixed Fact-Checking Platforms in the Middle East: A Comparative St... alignment of platform incentives with public-interest verification

Platforms selectively adopt automated tools for triage, detection, and monitoring while keeping human judgment central to verification.

Interviews and workflow analyses indicating selective automation (for triage/monitoring) combined with human-led verification steps.

medium mixed Fact-Checking Platforms in the Middle East: A Comparative St... degree of automation in verification workflows and reliance on human judgment

Each platform (Akeed, Teyit, Factnameh) adapts its scope and tactics according to national constraints.

Platform-level descriptions derived from interviews with staff/editors and analysis of platform outputs and workflows for each of the three organizations.

medium mixed Fact-Checking Platforms in the Middle East: A Comparative St... scope of investigation and tactical choices

Fact-checking platforms in Jordan (Akeed), Turkey (Teyit), and Iran (Factnameh) face similar operational constraints—censorship, limited access to data, and difficulties engaging audiences—but respond with different strategies shaped by local politics.

Comparative interpretive analysis based on document analysis of platform outputs/guidelines and semi-structured interviews with staff, editors, and stakeholders from the three platforms (Akeed, Teyit, Factnameh).

medium mixed Fact-Checking Platforms in the Middle East: A Comparative St... operational constraints (censorship, data access, audience engagement) and adapt...

Better-aligned systems can enhance productivity and decision quality, but misaligned systems can displace or harm workers unevenly; justice‑oriented deployment and active redistribution/retraining policies are needed to manage distributional impacts.

Argument synthesizing literature on technology's labor effects and distributive justice; the paper does not present original empirical labor-market analysis.

medium mixed LLM Alignment should go beyond Harmlessness–Helpfulness and ... productivity/decision quality improvements and differential labor displacement o...

Firms face tradeoffs between customization (to capture users) and pluralism (serving diverse values); market competition may either improve or degrade alignment depending on incentives.

Conceptual economic analysis and literature synthesis on market incentives and product differentiation; presented as theorized tradeoffs rather than empirically resolved.

medium mixed LLM Alignment should go beyond Harmlessness–Helpfulness and ... market-level alignment quality under differing competitive incentive structures

Operational choices (data selection, reward modeling, deployment constraints) are strategic decisions by firms balancing cost, speed to market, and risk, and these choices materially affect alignment outcomes.

Analytical argument supported by examples and literature on product development tradeoffs; no new firm‑level empirical analysis is provided.

medium mixed LLM Alignment should go beyond Harmlessness–Helpfulness and ... alignment outcomes as a function of firm operational choices (e.g., data curatio...

Many perceived alignment failures of large language models (LLMs) are not inevitable consequences of model scale or capability; they largely result from operational choices made in training and deployment.

Conceptual analysis and literature synthesis presented in the paper; references to prior case studies and examples of deployment failures are used to support the argument. No new empirical dataset or controlled experiment is reported.

medium mixed LLM Alignment should go beyond Harmlessness–Helpfulness and ... alignment failures / model behavior divergence from human values, safety require...

The paper proposes an 'algorithmic workplace' framework emphasising hybrid agency (agents composed of humans plus GenAI), decentralised decision processes, and erosion of rigid managerial boundaries.

Conceptual synthesis derived from thematic mapping, co‑word analysis and interpretive discussion of the mapped literature; framework presented as the article's conceptual contribution.

medium mixed Generative AI and the algorithmic workplace: a bibliometric ... conceptual formulation of organisational architecture (algorithmic workplace: hy...

AI diffusion and China’s delayed retirement policy jointly shape pre-retirement workers’ willingness to stay employed.

Cross-sectional survey (n=889) of pre-retirement respondents in Beijing, Guangzhou, and Lanzhou; multivariate regression analysis examining associations between employment willingness and regional AI exposure plus policy context (delayed retirement).

medium mixed Analysis of the Impact of Artificial Intelligence on Middle-... self-reported willingness to continue working before retirement (employment inte...

Passive AI use produced an initial increase in enjoyment/satisfaction that reversed once participants returned to manual work.

Pre-registered experiment (N = 269) measured enjoyment/satisfaction before and after return to manual work; passive-copy condition showed short-term increases in enjoyment/satisfaction that declined after returning to manual tasks.

medium mixed Relying on AI at work reduces self-efficacy, ownership, and ... enjoyment/satisfaction

Proprietary versus open DPP data regimes will shape competition: closed data can lead to vendor lock-in and market power, while open standards can spur broader innovation but may reduce short-term rent extraction.

Conceptual policy/economics argument informed by observed stakeholder perspectives and literature; not empirically tested in this study.

medium mixed Integrating knowledge management and digital product passpor... competitive dynamics and innovation outcomes under different DPP data governance...

DPP ecosystems resemble multi‑sided platforms (producers, recyclers, consumers, certifiers) with network effects such that more participants increase DPP data value, potentially creating winner-take-most dynamics unless standards and interoperability are enforced.

Theoretical/platform-economics reasoning grounded in empirical description of stakeholders and DPP roles from the study; not directly tested with market-level data in the paper.

medium mixed Integrating knowledge management and digital product passpor... platform dynamics, network effects, competition/market concentration risk

Vulnerability is path-dependent and contingent on states’ adaptive capacity—governance quality, industrial policy, and bargaining leverage determine whether a country captures upgrading opportunities or becomes a strategic casualty.

Comparative case analysis using indicators of governance, industrial policy presence, and bargaining outcomes; process tracing of critical junctures showing divergent trajectories. (Data sources: governance indicators, case comparisons; sample sizes not specified.)

medium mixed China-US Trade War and the Challenges for Developing Countri... upgrading outcomes (e.g., movement into higher-value segments), differential FDI...

Trade diversion caused by tariff escalation and restrictions re-routes production and trade flows, but benefits are asymmetric: countries with stronger institutions, infrastructure, and policy capacity capture more investment and value-added.

Analysis of bilateral trade and FDI flow changes after tariffs; supply-chain mapping of relocation events; firm announcements of relocation; comparative cases emphasizing institutional/infrastructure differences. (Data sources: trade and investment flow data, supply-chain maps, firm-level announcements; sample sizes not specified.)

medium mixed China-US Trade War and the Challenges for Developing Countri... FDI inflows into manufacturing/tech, share of value-added retained domestically,...

A multi-hazard, multi-risk approach increases societal resilience but is complex and cross-disciplinary.

Project-wide synthesis, in-depth place-based case studies, and stakeholder engagement reported in MYRIAD-EU activities indicating benefits to resilience alongside noted disciplinary and practical complexity.

medium mixed Reducing risk together: moving towards a more holistic appro... societal resilience

Shifting disaster risk management toward a genuinely multi-hazard, multi-risk paradigm is feasible and valuable but requires coordinated advances across conceptual mainstreaming, evidence on spatio-temporal hazard–exposure–vulnerability dynamics, scenario methods, usable decision-support tools, explicit equity integration, deep case-study coproduction, support for multi-hazard early warning systems (MHEWS), and strengthened early-career researcher (ECR) leadership.

Synthesis and reflection across MYRIAD-EU (2021–2025) project outputs, comparative synthesis of activities, lessons learned, and stakeholder feedback reported by the project.

medium mixed Reducing risk together: moving towards a more holistic appro... feasibility and value of adopting a multi-hazard, multi-risk disaster risk manag...

Technical milestones (scalable, error-corrected qubits; hybrid algorithms) create fat-tailed outcome distributions where a small probability of breakthrough could yield outsized long-run effects.

Monte Carlo experiments and scenario ensembles that include low-probability, high-impact technical breakthrough parameters; expert elicitation of milestone probabilities.

medium mixed Modeling Macroeconomic Output Gains from Quantum-Driven Prod... tail outcomes for GDP/TFP (extreme long-run gains)
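
The fat-tail intuition in the entry above can be illustrated with a small Monte Carlo sketch. This is not the paper's model: the breakthrough probability, the gain multipliers, the lognormal shapes, and the function name `simulate_gains` are hypothetical values chosen only to show how a rare, large outcome pulls the mean well above the median.

```python
import random
from statistics import mean, median


def simulate_gains(n_draws, p_breakthrough, base_gain, breakthrough_gain, seed=0):
    """Draw long-run gains from a two-regime mixture.

    With probability p_breakthrough a draw comes from a high-gain regime;
    otherwise it comes from a modest baseline regime. All parameters are
    illustrative assumptions, not calibrated estimates.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        if rng.random() < p_breakthrough:
            draws.append(breakthrough_gain * rng.lognormvariate(0.0, 1.0))
        else:
            draws.append(base_gain * rng.lognormvariate(0.0, 0.25))
    return draws


draws = simulate_gains(
    n_draws=10_000,
    p_breakthrough=0.02,     # hypothetical 2% chance of a breakthrough
    base_gain=1.0,           # hypothetical baseline gain (arbitrary units)
    breakthrough_gain=50.0,  # hypothetical breakthrough gain (arbitrary units)
)
print("mean:", mean(draws), "median:", median(draws))
```

Because the breakthrough regime is rare, the median stays near the baseline while the mean is driven by the tail, which is why scenario ensembles of this kind usually report tail quantiles alongside central estimates.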

R&D funding, standards, regulatory clarity, export controls, and public–private partnerships shape quantum adoption trajectories; policy missteps can slow adoption and concentrate benefits.

Policy counterfactual scenarios and qualitative analysis of ecosystem roles; calibration informed by historical effects of policy on diffusion of strategic technologies.

medium mixed Modeling Macroeconomic Output Gains from Quantum-Driven Prod... adoption rates, distribution of benefits, market concentration

Aggregate gains hinge on how quickly and broadly quantum technologies diffuse; early gains concentrated in frontier firms/sectors can take decades to propagate economy-wide.

Diffusion modeling using logistic/S-curve and Bass models calibrated to historical analog technologies; scenarios show long lag between frontier adoption and economy-wide diffusion.

medium mixed Modeling Macroeconomic Output Gains from Quantum-Driven Prod... time to economy-wide propagation, aggregate GDP/TFP growth over decades
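
As a rough illustration of the diffusion modeling mentioned in the entry above, the sketch below evaluates the standard closed-form Bass curve. The innovation and imitation coefficients (`p = 0.01`, `q = 0.3`) and the helper `years_to_share` are hypothetical and are not taken from the paper's calibration.

```python
import math


def bass_cumulative(t, p, q):
    """Cumulative adoption share F(t) under the Bass diffusion model.

    p is the coefficient of innovation and q the coefficient of imitation.
    """
    e = math.exp(-(p + q) * t)
    return (1.0 - e) / (1.0 + (q / p) * e)


def years_to_share(target, p, q, step=0.1, horizon=200.0):
    """Return the first time on a coarse grid at which F(t) reaches target."""
    t = 0.0
    while t <= horizon:
        if bass_cumulative(t, p, q) >= target:
            return t
        t += step
    return None  # target not reached within the horizon


p, q = 0.01, 0.3  # hypothetical parameters for illustration only
for share in (0.1, 0.5, 0.9):
    print(share, years_to_share(share, p, q))
```

With parameters of this kind the curve is S-shaped: adoption is slow while only frontier users have adopted, then accelerates as imitation takes over, which is the lag between frontier adoption and economy-wide diffusion that the entry describes.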

As successive pilot batches of urban green data center policies are rolled out, the aggregate policy impact follows a nonlinear diffusion trajectory, first rising and then declining.

Analysis across pilot-batch rollout timing showing a nonlinear (rise-then-fall) pattern in aggregate estimated effects as the number of pilot batches expands; modeled/visualized within the staggered-adoption DID framework.

medium mixed How Does Urban Green Data Center Policy Empower Corporate En... aggregate policy impact on corporate energy utilization efficiency over pilot-ba...

Realizing NLP value in banks requires organizational investments (data pipelines, model deployment, CRM integration) and complementarity between AI tools and managerial/IT capabilities; returns will depend on these complementarities.

Conceptual implication derived from review of applied/engineering papers and literature on technology complementarities; not directly estimated empirically in the review.

medium mixed Natural language processing in bank marketing: a systematic ... realized ROI from NLP adoption conditional on organizational investments and com...

Automated tax-preparation and filing could increase compliance rates but also make tax bases more sensitive to automated tax-optimization strategies, requiring updated regulatory oversight and audit tools.

Paper's policy and economic implications section combining case-based observations and literature; presented as plausible outcomes rather than measured effects.

medium mixed Explore the Impact of Generative AI on Finance and Taxation tax compliance rates, prevalence of automated tax-optimization, regulatory/audit...

Regulatory design acts as an economic instrument that can balance social value from AI with protection of rights, affecting social welfare, public trust, and long-term adoption rates.

Normative synthesis combining legal and economic reasoning; suggested as a theoretical mechanism rather than empirically validated within the paper.

medium mixed ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... social welfare, public trust, long-term AI adoption rates

Automation of routine administrative tasks may reduce demand for certain clerical roles while increasing demand for oversight, auditing, and legal-technical expertise, altering public-sector labor composition and retraining needs.

Qualitative labor-market reasoning based on task-based automation literature and the administrative context; no field labor data or sample is provided.

medium mixed ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... demand for different job categories (clerical roles vs oversight/legal-technical...

AI feedback may either augment teacher productivity (complementarity) or substitute for routine teacher feedback tasks (substitution), with unclear net labor impacts.

Workshop deliberations among 50 scholars highlighting competing theoretical scenarios; no causal labor-market evidence provided.

medium mixed The Future of Feedback: How Can AI Help Transform Feedback t... teacher time allocation; demand for teacher skills; employment levels in educati...

Easier conversational access to models can substitute for routine cognitive labor while complementing high-skill work; miscalibrated trust affects labor outcomes and supervision costs.

Labor and task-allocation implications argued conceptually; no labor-market empirical evidence or quantified substitution/complementarity rates presented.

medium mixed Why We Need to Destroy the Illusion of Speaking to A Human: ... labor substitution for routine tasks, complementarity with high-skill tasks, sup...

Firms can compete on front-end design (transparency, trustworthiness) as a socially beneficial quality signal, but absent regulation competition may favor more persuasive (less honest) interfaces.

Economic argument about product differentiation and competitive incentives, drawn from market theory and literature; no empirical market study provided.

medium mixed Why We Need to Destroy the Illusion of Speaking to A Human: ... firm competition strategies, prevalence of transparent vs. persuasive interfaces...

Misleading cues can create short-term surplus (user satisfaction) but long-term welfare losses if overtrust causes harms or misinformation.

Theoretical economic argument based on information asymmetry and externalities; no empirical quantification in the paper.

medium mixed Why We Need to Destroy the Illusion of Speaking to A Human: ... short-term user satisfaction vs. long-term welfare (harms from misinformation/ov...

LLM-based chatbots’ conversational naturalness increases usability and adoption but also triggers misleading mental models (e.g., anthropomorphism, overtrust).

Paper-level main finding based on conceptual analysis and literature synthesis from HCI, ethics, and conversational analysis; no new large-scale empirical study or sample reported.

medium mixed Why We Need to Destroy the Illusion of Speaking to A Human: ... usability, adoption (engagement/use rates), and prevalence of misleading mental ...

The approach shifts some resource demand from GPU clusters to CPU, memory, and storage I/O, meaning local SSD and CPU provisioning can become the new bottleneck.

Authors note the system relies on multi-tier I/O and CPU-side updates to enable single-GPU fine-tuning; the summary highlights this resource-shift as a risk/consideration. No quantitative cost or workload-specific tradeoff analysis is provided in the summary.

medium mixed An Efficient Heterogeneous Co-Design for Fine-Tuning on a Si... relative resource utilization (GPU vs CPU/host memory/SSD I/O) and potential bot...

Human experts will likely shift roles from sole decision-makers to adjudicators, challengers, and validators of AI-generated arguments, changing required skills toward critical evaluation and dialectical oversight.

Conceptual labor-market projection; no empirical labor studies or surveys presented.

medium mixed Argumentative Human-AI Decision-Making: Toward AI Agents Tha... changes in job tasks, skill demand, and employment shares for expert validators/...

Productivity gains from partial automation may be offset by negative externalities (incorrect legal outcomes, appeals, reputational damage) that impose social and private costs not captured by narrow productivity measures.

Theoretical economic analysis and illustrative case vignettes describing error propagation; no empirical quantification of externalities.

medium mixed Why Avoid Generative Legal AI Systems? Hallucination, Overre... net social welfare/productivity after accounting for error-related externalities

Market demand will likely split between providers offering generative convenience with liability exposure and providers offering certified/verified, explainable tools at a premium, creating a two-tier market.

Market-structure analysis and illustrative projections; no empirical market data or sample size.

medium mixed Why Avoid Generative Legal AI Systems? Hallucination, Overre... market segmentation between riskier low-cost generative providers and premium ve...

Reported monetary supervision cost was low (~$200) for this project, but the paper cautions that general equilibrium effects and scaling may change costs as demand for supervisors rises.

Paper provides reported supervision cost (≈$200) for the single project and includes a caveat about external validity and scaling; cost is self-reported and contextualized by authors.

medium mixed Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau E... monetary supervision cost for this project (≈$200) and authors' caution about sc...

Because these agents will be embedded in safety‑critical infrastructure, economic and technical outcomes will depend heavily on system architecture choices.

Systems‑engineering and policy reasoning drawing on analogies to Internet/IoT evolution and domain examples (disaster response, healthcare, industrial automation, mobility); conceptual argumentation rather than empirical measurement.

medium mixed The Internet of Physical AI Agents: Interoperability, Longev... economic costs and technical system performance/resilience

BenchPress evaluation shows that Pokémon battling tests capabilities largely orthogonal to those measured by common LLM benchmarks (i.e., it stresses different skill sets).

Paper applies a BenchPress matrix/method to quantify coverage relative to standard benchmarks and reports near-orthogonality for battling tasks in the matrix results.

medium mixed The PokeAgent Challenge: Competitive and Long-Context Learni... coverage/overlap metric from BenchPress matrix comparing PokeAgent Battling to s...

The study documents a 'silent empathy' effect: people often feel empathic concern but fail to express it in ways that align with normative empathic communication; targeted feedback helps close that expression gap.

Analysis showing mismatch between internal empathic concern (implied by context/self-report/ratings) and the presence of idiomatic empathic moves in participants' messages; targeted personalized feedback increased use of normative empathic expressions.

medium mixed Practicing with Language Models Cultivates Human Empathic Co... gap between experienced empathy and expressed empathic moves (alignment with nor...

Investments in interpretability that aim to fully 'rule‑ify' LLM competence may have diminishing returns; economic value may be better captured by research into robust behavioral evaluation, stress testing, and hybrid human‑AI workflows, while partial interpretability remains valuable.

R&D allocation and interpretability economics argument built on the central thesis; suggestion rather than empirical finding.

medium mixed Why the Valuable Capabilities of LLMs Are Precisely the Unex... returns to different types of interpretability/AI safety R&D
