Evidence (7278 claims)

Search and filter individual claims pulled from the papers. If you are looking for a specific finding ("what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other instead, use the Evidence Explorer.

The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).

Browse by theme

Nine broad, paper-level topics. Click one to filter the claims below.

Human-AI Collaboration

Claims by outcome category

Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.

Outcome	Positive	Negative	Mixed	Null	Total
Other	795	210	105	955	2131
Governance & Regulation	886	414	197	126	1654
Organizational Efficiency	826	204	129	87	1257
Technology Adoption Rate	681	259	128	110	1189
Research Productivity	464	138	65	349	1028
Output Quality	503	196	61	53	813
Decision Quality	351	180	84	51	673
AI Safety & Ethics	238	288	71	34	637
Firm Productivity	455	58	92	20	631
Market Structure	186	172	123	25	511
Task Allocation	222	70	76	34	407
Innovation Output	238	28	48	18	334
Skill Acquisition	177	62	62	17	318
Employment Level	107	57	108	13	287
Fiscal & Macroeconomic	135	72	44	26	284
Firm Revenue	172	50	28	5	256
Consumer Welfare	121	68	45	12	246
Task Completion Time	183	33	10	13	240
Inequality Measures	45	126	50	6	227
Worker Satisfaction	95	74	23	12	204
Error Rate	77	98	11	4	190
Regulatory Compliance	84	73	17	7	181
Automation Exposure	61	61	27	14	166
Training Effectiveness	98	21	14	19	154
Wages & Compensation	78	37	25	6	146
Developer Productivity	105	18	14	6	144
Team Performance	87	17	28	10	143
Job Displacement	12	83	23	1	119
Hiring & Recruitment	53	8	8	3	72
Social Protection	39	17	8	2	66
Creative Output	32	20	8	3	64
Skill Obsolescence	5	50	6	1	62
Labor Share of Income	17	20	17	—	54
Worker Turnover	15	15	—	3	33
Industry	—	—	—	1	1
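As a rough illustration of how the counts above can be read, the sketch below parses three rows copied from the table and computes the share of positive findings per outcome. The helper names (`parse_row`, `positive_share`) are hypothetical, and reading a dash cell as zero is an assumption. The share is taken over the Total column because in some larger rows the four direction counts do not add up to it (for Other, 2065 against 2131).

```python
# Three rows copied from the table above (tab-separated).
# \u2014 is the dash the table uses for an empty cell.
ROWS = (
    "Wages & Compensation\t78\t37\t25\t6\t146\n"
    "Job Displacement\t12\t83\t23\t1\t119\n"
    "Labor Share of Income\t17\t20\t17\t\u2014\t54\n"
)

COLUMNS = ("positive", "negative", "mixed", "null", "total")


def parse_row(line):
    """Split one tab-separated row into an outcome name and integer counts."""
    name, *cells = line.split("\t")
    # Assumption: a dash cell means "no claims", so it is read as 0.
    counts = [0 if cell.strip() == "\u2014" else int(cell) for cell in cells]
    return {"outcome": name, **dict(zip(COLUMNS, counts))}


def positive_share(row):
    """Share of claims with a positive direction, relative to the Total column."""
    return row["positive"] / row["total"]


rows = [parse_row(line) for line in ROWS.splitlines()]
for row in rows:
    print(f"{row['outcome']}: {positive_share(row):.1%} positive")
```

The same parsing would apply to the full table; only three rows are reproduced here to keep the sketch short.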

Active filter: Governance

Most existing approaches to AI safety, risk management, and governance focus on post-hoc validation, probabilistic risk estimation, or certification of model behavior.

Author statement summarizing the literature / prior work in AI safety and governance (conceptual claim in the paper's introduction). No empirical survey or sample size reported.

high null result Right-to-Act: A Pre-Execution Non-Compensatory Decision Prot... characterization of prevailing AI safety and governance approaches (post-hoc val...

We develop a formal model in which institutions choose the scale of automation, the degree of codification, and safeguards on iterative use.

Methodological statement: the paper presents a formal/theoretical model specifying institutional choice variables (model description rather than empirical result).

high null result AI Governance under Political Turnover: The Alignment Surfac... institutional choices regarding automation scale, codification, and safeguards (...

We compare LLM-guided bidding against truthful and heuristic strategies using the Vickrey-Clarke-Groves (VCG) mechanism as a benchmark for incentive-compatible, dominant-strategy truthfulness.

Methodological claim describing the comparative experimental design: simulations use VCG as benchmark and include comparisons to truthful and heuristic bidding strategies. No sample size or detailed experimental parameters are provided in the excerpt.

high null result Strategic Bidding in 6G Spectrum Auctions with Large Languag... comparative performance of bidding strategies
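For readers unfamiliar with why VCG serves as a truthfulness benchmark, the sketch below uses the single-item second-price (Vickrey) auction, the simplest special case of VCG. It is not the paper's simulation: the repeated auctions, budget constraints, and LLM bidders are not modeled, and all names and numbers are hypothetical.

```python
def second_price_auction(bids):
    """Single-item Vickrey auction: highest bid wins and pays the second-highest bid."""
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price


def utility(bidder, value, bids):
    """Payoff to `bidder`: value minus price if they win, otherwise zero."""
    winner, price = second_price_auction(bids)
    return value - price if winner == bidder else 0.0


# Hypothetical setup: bidder "a" values the item at 10; rivals bid 7 and 4.
value_a = 10.0
rival_bids = {"b": 7.0, "c": 4.0}

# Payoff from bidding the true value.
truthful = utility("a", value_a, {"a": value_a, **rival_bids})

# Payoff from the best of several alternative bids (under- and over-bidding).
candidates = [0.0, 5.0, 7.5, 10.0, 12.0, 20.0]
best_deviation = max(
    utility("a", value_a, {"a": bid, **rival_bids}) for bid in candidates
)

print(truthful, best_deviation)
```

In this toy case no alternative bid does better than the truthful one, which is the dominant-strategy property the benchmark relies on.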

When the theoretical assumptions guaranteeing truthfulness hold, LLM bidders recover near-equilibrium outcomes consistent with VCG predictions.

Simulation experiments comparing LLM-guided bidding to the VCG benchmark and to truthful/heuristic strategies under conditions where VCG assumptions are satisfied. The paper reports that LLM outcomes were close to the VCG-predicted equilibrium. No numeric sample size or quantitative effect sizes reported in the provided text.

high null result Strategic Bidding in 6G Spectrum Auctions with Large Languag... equilibrium outcomes / allocation and utility relative to VCG benchmark

We investigate the use of Large Language Models (LLMs) as bidding agents in repeated 6G spectrum auctions with budget constraints in vehicular networks.

Descriptive statement of the study design: the paper reports simulation/experimental evaluation where each user equipment (UE) is modeled as a rational player in repeated spectrum auctions; comparison against truthful and heuristic strategies under Vickrey-Clarke-Groves (VCG) benchmark. No numeric sample size reported in the provided text.

high null result Strategic Bidding in 6G Spectrum Auctions with Large Languag... use of LLMs as bidding agents (methodological evaluation)

The study is based on a repeated cross-sectional survey of licensed employees at a non-university research institution.

Author statement in the abstract: repeated cross-sectional survey of licensed employees of the research institution studied; methodological description in the abstract.

high null result Generative KI in der Wissensarbeit: Wahrnehmung, Nutzen und ... study design / data basis (repeated cross-sectional survey)

The paper provides a natural definition of benchmark hacking in this strategic context by comparing a player's equilibrium effort allocation to that of a single-agent baseline scenario.

Conceptual/theoretical definition introduced in the model comparing equilibrium effort allocations to a single-agent (non-competitive) baseline.

high null result On Benchmark Hacking in ML Contests: Modeling, Insights and ... benchmark hacking (difference in effort allocation versus single-agent baseline)

We study this question using 10,659 matched human-agent pairs from Moltbook, a social media platform where each autonomous agent is publicly linked to its owner's Twitter/X account.

Descriptive statement of the study dataset reported in the paper: dataset of 10,659 matched human-agent pairs from Moltbook with public linkage to owner's Twitter/X account.

high null result Behavioral Transfer in AI Agents: Evidence and Privacy Impli... matched_human-agent_pairs_count

The paper proposes a conceptual framework linking AI adoption to employability and role transformation, mediated by skill adaptation, continuous learning, and organizational readiness.

Author-proposed conceptual framework presented in the review paper (theoretical linkage based on literature synthesis).

high null result The Impact of AI on Employability and Evolving Job Roles of ... linkage between AI adoption and employability

This study takes food delivery riders as the research object and analyzes the dilemma of labor relations determination under AIGC.

Methodological statement in the paper specifying the chosen subject of analysis (food delivery riders); this is an explicit description of the paper's scope rather than an empirical finding.

high null result AIGC+ Determination of Labor Relations in the Context of the... research scope / sample (food delivery riders)

The paper develops an interdisciplinary conceptual framework that integrates insights from economics, management theory, and digital governance to characterize algorithmic enterprises.

Methodological claim about the paper's approach; stated in abstract as the paper's contribution (conceptual framework built from interdisciplinary literature).

high null result Algorithmic Enterprises: Rethinking Firm Strategy in the Age... existence and structure of a conceptual interdisciplinary framework

Future research should strengthen cross-national comparisons, longitudinal tracking, and interdisciplinary collaboration to support development of a technology governance framework that balances efficiency with equity.

Author recommendation based on identified research gaps in the literature review (prescriptive/recommendation).

high null result From Technological Substitution to Institutional Response: A... recommended research approaches and governance framework design

Existing research has clear gaps: limited evidence from developing-country contexts, insufficient attention to within-occupation heterogeneity, incomplete accounts of psychological mechanisms underlying AI anxiety, and a shortage of rigorous evaluations of reskilling policy effectiveness.

Author's assessment based on the reviewed literature identifying thematic gaps and methodological limitations (critical literature review).

high null result From Technological Substitution to Institutional Response: A... completeness and scope of existing research (research gaps)

This study leverages the establishment of National New-Generation Artificial Intelligence Innovation and Development Pilot Zones as a quasi-natural experiment and employs a multi-period DID model on A-share listed manufacturing firms from 2010 to 2023.

Methodological description provided in the paper: policy rollout as quasi-natural experiment; multi-period difference-in-differences estimation; sample frame specified as A-share listed manufacturing firms on the Shanghai and Shenzhen Stock Exchanges, 2010–2023.

high null result The Impact of National New-Generation Artificial Intelligenc... method/design (DID on firm panel 2010–2023)
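The core comparison behind a difference-in-differences design can be sketched with the two-group, two-period case. This is an illustration only: the paper's multi-period DID on a firm panel would normally be estimated by regression with firm and year fixed effects, and the numbers below are hypothetical, not taken from the paper.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences: change in treated minus change in control."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical outcome values for firms inside and outside a pilot zone.
effect = did_estimate(
    treated_pre=[1.0, 1.2],
    treated_post=[1.6, 1.8],
    control_pre=[0.9, 1.1],
    control_post=[1.1, 1.3],
)
print(round(effect, 3))
```

The estimate is only interpretable as a policy effect under the parallel-trends assumption, that is, treated and control firms would have moved together absent the policy.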

The First Fundamental Theorem of Welfare Economics assumes that welfare-bearing agents are autonomous and implicitly relies on a binary distinction between autonomy and instrumentality.

Explicit statement in the paper's introduction/abstract describing the theorem's assumptions; conceptual/theoretical textual analysis (no empirical sample).

high null result Post-AGI Economies: Autonomy and the First Fundamental Theor... assumption about welfare-bearing agents (autonomy vs instrumentality)

This paper was generated by AI, using https://github.com/chenandrewy/ralph-wiggum-asset-pricing/.

Author statement in the abstract declaring the paper was generated by AI and providing a GitHub link.

high null result Hedging the Singularity authorship/generation method of the paper

This review was conducted following the guidelines of the Preferred Reporting of Items in a Systematic Review and Meta-Analysis (PRISMA).

Methodological statement in the paper's abstract indicating PRISMA adherence; no further protocol details or study counts provided in the abstract.

high null result Artificial Intelligence, Public Policy and Governance - impl... methodological adherence to PRISMA reporting standards

The paper foregrounds industrial firms' own digital agency as a less understood aspect in the literature on digitalization and governance.

Authors' positioning of their contribution and literature review claim in the paper (qualitative/theoretical claim).

high null result Industry 4.0 Inc.—Mergers and acquisitions and the digital t... research gap concerning firms' digital agency

The analysis is limited to OECD economies and monthly aggregate data, which constrains generalizability.

Study design: monthly panel of 38 OECD economies from 2000–2024 as stated in paper; author-reported limitation.

high null result From Digital Trade to Climate Gains: How Global Value Chains... scope/generalizability

Digital trade alone is not statistically significant in affecting CO2 emissions (β = −0.030).

Same fixed-effects econometric specification on the monthly panel of 38 OECD economies (2000–2024); coefficient reported but not statistically significant.

high null result From Digital Trade to Climate Gains: How Global Value Chains... CO2 emissions

We evaluate 20 state-of-the-art LLMs on their ability to predict empirically supported causal directions.

Experimental evaluation: 20 LLMs tested on the benchmark (10,490 triplets, including 1,056 contested instances) to predict empirically verified causal signs.

high null result Ideological Bias in LLMs' Economic Causal Reasoning model accuracy at predicting empirically verified causal directions

From 10,490 causal triplets (treatment-outcome pairs with empirically verified effect directions) derived from top-tier economics and finance journals, we identify 1,056 ideology-contested instances.

Construction/extension of the EconCausal benchmark by selecting 10,490 causal triplets from top-tier economics and finance journals and labeling 1,056 as ideology-contested (intervention- vs market-oriented divergence).

high null result Ideological Bias in LLMs' Economic Causal Reasoning dataset_counts (number of causal triplets and contested instances)

The governance of open-weight artificial intelligence (AI) models has been framed as a binary choice: openness as risk, restriction as safety.

Literature and policy framing review presented in the paper (conceptual/argumentative analysis).

high null result The Open-Weight Paradox: Why Restricting Access to AI Models... policy framing of AI governance (openness vs restriction)

This is an exploratory and qualitative state-of-practice study grounded in over 30 interviews across four stakeholder groups (large enterprises, small/medium firms, AI developers, and CAD/CAM/CAE vendors).

Methodological statement in the paper describing study design and sample composition.

high null result Agentic AI in Engineering and Manufacturing: Industry Perspe... study design/sample composition

Key breakthroughs needed include integration with traditional engineering tools and data types, robust verification frameworks, and improved spatial and physical reasoning.

Interviewee-identified requirements compiled from over 30 interviews; stakeholders repeatedly pinpoint integration, verification, and spatial/physical reasoning as priority technical advances.

high null result Agentic AI in Engineering and Manufacturing: Industry Perspe... technical capabilities and integrations needed for broader deployment

We conduct a controlled experiment where AI agents trade in a prediction market after receiving private signals, measuring information aggregation by the log error of the last price.

Statement of experimental design and measurement approach in the paper: laboratory-style controlled experiment, private signals given to agents, log error of last price used to quantify aggregation.

high null result Information Aggregation with AI Agents methodological description (log error of last price used as aggregation metric)
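The excerpt does not define the metric precisely. One plausible reading, shown here as an assumption, is the absolute difference between the log of the final market price and the log of a reference value, such as the probability implied by pooling all private signals. The function name and the reference value are hypothetical.

```python
import math


def log_error(last_price, reference_value):
    """Absolute log difference between the final price and a reference value.

    Assumption: both arguments are strictly positive, for example
    probabilities in (0, 1) in a binary prediction market.
    """
    if last_price <= 0 or reference_value <= 0:
        raise ValueError("both arguments must be strictly positive")
    return abs(math.log(last_price) - math.log(reference_value))


# Hypothetical example: the market closes at 0.60 while the pooled-signal
# probability is 0.75.
error = log_error(0.60, 0.75)
print(round(error, 4))
```

Under this reading a value of zero means the last price fully reflects the reference value, and larger values mean worse aggregation.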

Allowing strategic prompting does not affect information aggregation.

Experimental manipulation that included strategic prompting of AI agents prior to trading; aggregation measured by log error of last price; observed no effect.

high null result Information Aggregation with AI Agents information aggregation (log error of the last price)

Changing the initial price does not affect information aggregation.

Experimental condition varying the initial market price and measuring resulting aggregation performance (log error of last price); reported no effect.

high null result Information Aggregation with AI Agents information aggregation (log error of the last price)

Changing the duration of the market does not affect information aggregation.

Experimental manipulation of market duration in the trading experiment; measured aggregation (log error of last price) across durations and found no effect.

high null result Information Aggregation with AI Agents information aggregation (log error of the last price)

Allowing cheap talk communication does not affect information aggregation.

Experimental condition comparing markets with and without cheap talk communication; aggregation measured by log error of the last price; reported no effect.

high null result Information Aggregation with AI Agents information aggregation (log error of the last price)

The study analyzes AI policies issued by provincial-level governments in China using a policy instrument framework and fuzzy-set qualitative comparative analysis (fsQCA).

Methods statement in the paper describing dataset (provincial-level AI policy documents), theoretical framing (policy instrument framework), and analytic method (fsQCA).

high null result How Can Artificial Intelligence Policies Promote the Sustain... methodological approach / dataset (provincial AI policies)

Five major themes emerged from the review: (1) Machine Learning for Credit Risk Assessment and Financial Inclusion; (2) Deep Learning and Neural Networks for Market Prediction and Volatility Forecasting; (3) Natural Language Processing and Sentiment Analysis for Decision Support; (4) AI-Based Fraud Detection and Operational Risk Management; and (5) Explainable AI, Regulatory Technology, and Governance Frameworks.

Thematic synthesis of the 64 retained studies reported in results; explicit listing of five themes in the paper's Results section.

high null result AI-Driven Financial Risk Management and Decision Intelligenc... topics/themes identified in the literature

We conducted a scoping review across four major databases (SciSpace, Google Scholar, ArXiv) covering publications from 2019 to 2025 and retained 64 unique studies after deduplication and screening.

Methods section: Arksey and O'Malley framework (enhanced by Levac et al.), explicit database search (SciSpace, Google Scholar, ArXiv), timeframe stated (2019–2025), and reported final sample of 64 studies after deduplication and screening.

high null result AI-Driven Financial Risk Management and Decision Intelligenc... number of studies included in the review

This study proposes a framework for evaluating platform ecosystems by their long-term effects on human capital formation and institutional resilience.

Methodological contribution claimed by the paper (development of an evaluative framework); presented as part of the paper's contributions rather than an empirical finding.

high null result When Platforms Replace the Pipeline: AI, Labor Erosion, and ... existence of a proposed evaluative framework (methodological output)

The empirical analysis covers MENA economies over the period 2010–2023.

Paper explicitly states the temporal and geographic scope: MENA economies, 2010–2023.

high null result Digital Transformation, AI Efficiency, and Sustainable Devel... sample scope (countries and years)

The study employs a dynamic panel data approach using the System Generalized Method of Moments (System GMM) estimator to address endogeneity, unobserved heterogeneity, and persistence effects.

Methods statement in the paper describing the use of System GMM for panel data covering MENA economies over 2010–2023.

high null result Digital Transformation, AI Efficiency, and Sustainable Devel... methodological approach (System GMM)

Four propositions formalize the gradient, cascade compounding, delegation-depth effects, and extension sufficiency, establishing boundary conditions for the framework's valid operating envelope.

Theoretical/formal propositions presented in the paper that articulate limits and conditions for the framework's applicability.

high null result Governed Auditable Decisioning Under Uncertainty: Synthesis ... formalized theoretical boundary conditions for framework validity

The framework is analytically assessed for transferability across four decision system architectures.

Paper reports an analytic (cross-architecture) assessment comparing framework applicability across four named decision system architectures.

high null result Governed Auditable Decisioning Under Uncertainty: Synthesis ... transferability / applicability of framework across decision system architecture...

A formal welfare framework, analogous to the Nordhaus optimal patent life, characterises the trade-offs and yields testable predictions.

Proposal of a formal theoretical framework by the authors (analogy to Nordhaus); presented as a modeling approach rather than as an implemented empirical model in the excerpt.

high null result Market Dynamics, Governance and Open Research Metadata in th... welfare trade-offs in boundary governance (analogous to optimal patent life anal...

CRediT contributions, funding acknowledgements and AI disclosure statements illustrate the annulus lifecycle.

Empirical examples/case illustrations cited by the authors to demonstrate how different metadata types move through the annulus; no systematic empirical analysis or sample size provided in the excerpt.

high null result Market Dynamics, Governance and Open Research Metadata in th... example-based illustration of metadata lifecycle (CRediT, funding acknowledgemen...

By analogy with the efficient market hypothesis, the width of the innovation annulus measures production inefficiency, set by the interplay of friction and demand.

Theoretical analogy and conceptual mapping presented in the paper; no empirical calibration or measurement of 'width' reported in the excerpt.

high null result Market Dynamics, Governance and Open Research Metadata in th... width of the innovation annulus as an indicator of production inefficiency

The innovation annulus is a permanent, functional feature of the ecosystem -- not a pathology to eliminate.

Normative/descriptive assertion by the authors based on their theoretical framing; no empirical longitudinal evidence provided in the excerpt.

high null result Market Dynamics, Governance and Open Research Metadata in th... persistence and functional role of the innovation annulus in the knowledge ecosy...

We introduce the innovation annulus: the zone between freely available structured data and the advancing frontier of commercially refined knowledge products.

Definition/construct introduced by the authors as part of their conceptual framework; no empirical validation shown in the excerpt.

high null result Market Dynamics, Governance and Open Research Metadata in th... existence and conceptual boundaries of the 'innovation annulus' between free str...

The real tension in scholarly knowledge infrastructure lies between the persistent cost of producing and refining structured metadata under deep technological friction, and the differentiated demands distinct communities place on data quality, focus and granularity.

Theoretical/analytical argument in the paper; presented as the central descriptive diagnosis rather than supported by empirical measurement in the excerpt.

high null result Market Dynamics, Governance and Open Research Metadata in th... trade-off between metadata production/refinement cost and community data-quality...

The outreach casenotes used in the study are fairly short and heavily redacted.

Descriptive statement about the dataset of street outreach casenotes provided by the nonprofit partner used in the audit (direct observation by authors).

high null result Auditing LLMs for Algorithmic Fairness in Casenote-Augmented... casenote length and degree of redaction

LLM zero-shot classification does not introduce additional textual biases beyond the algorithmic biases already present in tabular classification.

Authors' assessment/audit comparing zero-shot LLM classification using casenote text against tabular-only classification, concluding no additional textual bias introduced. (Details and sample size not provided in abstract.)

high null result Auditing LLMs for Algorithmic Fairness in Casenote-Augmented... additional textual bias introduced by LLM zero-shot classification relative to t...

We conducted an in-the-wild evaluation with over 2,200 individuals from heterogeneous organisations and roles in 116 countries, via log analysis, surveys, and 20 interviews.

Reported evaluation methods and sample in the paper's abstract: log analysis, surveys, and 20 interviews with over 2,200 participants across 116 countries.

high null result Learning from AVA: Early Lessons from a Curated and Trustwor... evaluation sample and methods

We measure processes of polarization and integration in global AI research over three decades using large-scale scientific publication data.

Methodological claim describing the study: the analysis spans three decades and uses large-scale publication data and network comparisons to randomized baselines.

high null result Polarization and Integration in Global AI Research measurement of polarization and integration processes

A stylized calibration to four providers using April 2026 data treats parameter values as inputs to a comparative risk mapping, not structural estimates.

Paper reports a calibration exercise using data from four providers (April 2026) and emphasizes it is a comparative mapping rather than structural estimation.

high null result The Inference Bottleneck: A Formal Model of Vertical Foreclo... comparative risk mapping across providers

Discrimination (QoS gap) vanishes at a joint boundary rather than at a simple threshold in alpha alone.

Analytical result from the model characterizing the boundary conditions for non-discrimination.

high null result The Inference Bottleneck: A Formal Model of Vertical Foreclo... presence/absence of QoS discrimination
