Evidence (4137 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	378	106	59	455	1007
Governance & Regulation	379	176	116	58	739
Research Productivity	240	96	34	294	668
Organizational Efficiency	370	82	63	35	553
Technology Adoption Rate	296	118	66	29	513
Firm Productivity	277	34	68	10	394
AI Safety & Ethics	117	177	44	24	364
Output Quality	244	61	23	26	354
Market Structure	107	123	85	14	334
Decision Quality	168	74	37	19	301
Fiscal & Macroeconomic	75	52	32	21	187
Employment Level	70	32	74	8	186
Skill Acquisition	89	32	39	9	169
Firm Revenue	96	34	22	—	152
Innovation Output	106	12	21	11	151
Consumer Welfare	70	30	37	7	144
Regulatory Compliance	52	61	13	3	129
Inequality Measures	24	68	31	4	127
Task Allocation	75	11	29	6	121
Training Effectiveness	55	12	12	16	96
Error Rate	42	48	6	—	96
Worker Satisfaction	45	32	11	6	94
Task Completion Time	78	5	4	2	89
Wages & Compensation	46	13	19	5	83
Team Performance	44	9	15	7	76
Hiring & Recruitment	39	4	6	3	52
Automation Exposure	18	17	9	5	50
Job Displacement	5	31	12	—	48
Social Protection	21	10	6	2	39
Developer Productivity	29	3	3	1	36
Worker Turnover	10	12	—	3	25
Skill Obsolescence	3	19	2	—	24
Creative Output	15	5	3	1	24
Labor Share of Income	10	4	9	—	23
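
The matrix above can be read programmatically. The following is a minimal sketch, not part of the source: it copies three rows from the table, treats a "—" cell as zero (an assumption, since the listing does not say whether it means zero or missing), and computes the positive share among the four listed directions. The function names `parse_matrix` and `positive_share` are hypothetical. The four direction columns do not always sum to the Total column (for Firm Productivity they sum to 389 against a Total of 394), so the share is taken over the listed directions only.

```python
# Three rows copied from the matrix above (tab-separated).
# "—" marks an empty cell; treating it as 0 is an assumption.
MATRIX = (
    "Outcome\tPositive\tNegative\tMixed\tNull\tTotal\n"
    "Firm Productivity\t277\t34\t68\t10\t394\n"
    "Firm Revenue\t96\t34\t22\t—\t152\n"
    "Job Displacement\t5\t31\t12\t—\t48\n"
)


def parse_matrix(text):
    """Return {outcome: {column: count}} from tab-separated text."""
    lines = text.strip().splitlines()
    header = lines[0].split("\t")
    rows = {}
    for line in lines[1:]:
        cells = line.split("\t")
        rows[cells[0]] = {
            name: 0 if cell == "—" else int(cell)
            for name, cell in zip(header[1:], cells[1:])
        }
    return rows


def positive_share(counts):
    """Share of positive claims among the four listed directions."""
    listed = sum(counts[k] for k in ("Positive", "Negative", "Mixed", "Null"))
    return counts["Positive"] / listed


rows = parse_matrix(MATRIX)
for outcome, counts in rows.items():
    print(f"{outcome}: {positive_share(counts):.2f} positive")
```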

Governance

The framework produces ten testable propositions mapping hypothesized direct and mediated links among constructs and specifying contingencies for future empirical testing.

Explicit statement in the paper that the framework yields ten testable propositions; no empirical validation reported.

high null result Sustainable Marketing Framework for Strengthening Consumer T... propositions (hypothesized relationships)

Experimental structure determination (X‑ray, NMR, cryo‑EM) remains the gold standard but is slow, costly, and low‑throughput.

Paper explicitly states experimental methods are 'gold standard' and characterizes them as slow, costly, low‑throughput; the PDB is cited as the source of structural ground truth.

high null result Protein structure prediction powered by artificial intellige... throughput, cost, and speed of experimental structure determination

The authors did not perform primary empirical validation or simulation of TVR‑Sec across real VR deployments.

Methods and limitations sections explicitly state that no original empirical experiments or simulations were conducted; the analysis is conceptual and qualitative.

high null result Securing Virtual Reality: Threat Models, Vulnerabilities, an... whether empirical validation/simulation was performed (none)

The paper's scope comprised a comparative literature review and conceptual integration of 31 peer‑reviewed studies published between 2023 and 2025.

Authors' methods description specifying sample size and publication window: 31 peer‑reviewed studies (2023–2025).

high null result Securing Virtual Reality: Threat Models, Vulnerabilities, an... number and date range of studies included in the review (31 studies, 2023–2025)

This study is descriptive and comparative rather than quantitative; it relies on available policy documents and secondary literature rather than original field interviews or measured outcomes.

Explicit methodological statement in the paper listing qualitative document analysis, comparative literature review, and policy commentary; limitation acknowledged by authors.

high null result Regulating AI in National Security: A Comparative S... methodological approach and evidentiary scope (document/literature based, non‑qu...

A research agenda for AI economics should include: formalizing consent as a transaction/contracting problem; empirical RCTs and natural experiments measuring effects of consent designs; mechanism design for privacy-preserving data sharing; and policy evaluation of consent regulations.

Explicitly listed research directions in the workshop outputs and position papers; these are proposed next steps rather than empirical findings.

high null result Moving Beyond Clicks: Rethinking Consent and User Control in... proposed research topics and methodological approaches

Follow-up empirical methods should include qualitative interviews, focus groups, usability studies, field experiments (A/B tests), and policy/legal-technical assessments.

Recommended research methods enumerated in the workshop outputs and position papers; these are proposed future methods rather than findings from conducted studies.

high null result Moving Beyond Clicks: Rethinking Consent and User Control in... recommended empirical methods for future research

The Futures Design Toolkit (scenario planning, persona generation, speculative design) was used as a primary method in the workshop.

Methodological description in the workshop summary listing the Futures Design Toolkit and associated activities; procedural claim rather than empirical.

high null result Moving Beyond Clicks: Rethinking Consent and User Control in... use of specified design methods

Empirical generalization across all climate-AI systems is constrained by heterogeneous data availability and proprietary models, limiting the ability to produce universal quantitative claims.

Stated methodological limitation in the paper, noting heterogeneous data and the proprietary nature of some models restrict broad generalization.

high null result The Rise of AI in Weather and Climate Information and its Im... Extent of empirical generalizability across climate-AI systems

The paper does not provide granular quantitative estimates of the economic cost of infrastructural asymmetries in climate-AI.

Explicit limitation stated by the authors in the Methods/Limitations section.

high null result The Rise of AI in Weather and Climate Information and its Im... Absence of quantified economic cost estimates in the paper

There is a need for empirical research quantifying earnings dispersion, labor substitution effects, and the welfare impacts of GenAI-driven content economies over time.

Explicit research recommendation made in the paper based on gaps identified during analysis of the 377 videos (study is qualitative and does not measure these outcomes).

high null result Monetizing Generative AI: YouTubers' Collective Knowledge on... absence of quantitative measures in current study / identified need for future m...

The analysis identifies ten shared use cases that creators present as pathways to income using GenAI.

Coding of the 377-video corpus resulted in a catalog of ten use cases (as reported in the paper).

high null result Monetizing Generative AI: YouTubers' Collective Knowledge on... count and identification of distinct use-case categories (ten)

Risk and ambiguity manipulations: risk condition communicated a single explicit leak probability of 30%; ambiguity condition communicated the leak probability as a range (10–50%).

Paper's methods section describing the manipulations used in the randomized experiment (N = 610); these specific probability framings were the core independent-variable manipulations.

high null result The Data-Dollars Tradeoff: Privacy Harms vs. Economic Risk i... Manipulation parameters (leak-probability information presented to participants)

Experimental design: study used a 2 × 3 between-subjects design with N = 610, crossing information environment (Risk vs Ambiguity) with privacy-treatment conditions (including privacy-threatening vs neutral and different data-type labels).

Methodological description reported in the paper: participants (N = 610) randomized across 6 experimental arms derived from the 2 (Risk vs Ambiguity) × 3 (privacy treatments) factorial design; tasks included choosing between a standard product basket and an AI-personalized basket.

high null result The Data-Dollars Tradeoff: Privacy Harms vs. Economic Risk i... Experimental design / assignment (not an outcome variable)
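
A 2 × 3 between-subjects assignment like the one described can be sketched as follows. This is illustrative only: the listing does not describe the paper's actual randomization procedure, so balanced assignment is an assumption, and the three treatment labels are hypothetical placeholders because the listing does not name all three privacy treatments.

```python
import itertools
import random
from collections import Counter


def assign_arms(n_participants, seed=0):
    """Assign participants to the 6 arms of a 2 x 3 design, as evenly as possible."""
    environments = ["risk", "ambiguity"]
    # Hypothetical placeholder labels for the three privacy treatments.
    treatments = ["privacy_A", "privacy_B", "privacy_C"]
    arms = list(itertools.product(environments, treatments))
    rng = random.Random(seed)
    # Repeat the arm list until it covers everyone, trim, then shuffle,
    # so arm sizes differ by at most one participant.
    slots = (arms * (n_participants // len(arms) + 1))[:n_participants]
    rng.shuffle(slots)
    return slots


assignment = assign_arms(610)
print(Counter(assignment))
```

With N = 610 and six arms, balanced assignment puts 101 or 102 participants in each arm.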

When leak probabilities are known (risk condition: explicit 30% leak probability), adoption of personalization is about 50% and is not significantly affected by privacy-threatening versus neutral information.

Same randomized experiment (N = 610) with a risk manipulation that explicitly stated a single 30% leak probability. Measured adoption rates showed roughly 50% uptake and no statistically significant difference between privacy-threatening and neutral conditions under risk.

high null result The Data-Dollars Tradeoff: Privacy Harms vs. Economic Risk i... Adoption choice: percent choosing AI-personalized basket (≈50%)

Many apparent inter-domain differences vanish once measurement uncertainty is accounted for.

Bootstrap confidence intervals and repeated-sample comparisons showing that differences in citation share or prevalence observed in single-run snapshots are often not statistically significant when uncertainty from repeated sampling is included.

high null result Quantifying Uncertainty in AI Visibility: A Statistical Fram... statistical significance of inter-domain differences in citation share / prevale...
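
The idea of checking a difference against bootstrap uncertainty can be sketched as below. This is a generic percentile bootstrap under stated assumptions, not the paper's procedure: the inputs are taken to be per-run citation shares for two domains, the function name `bootstrap_diff_ci` is hypothetical, and the example numbers are made up for illustration.

```python
import random


def bootstrap_diff_ci(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each domain's runs with replacement.
        ra = [rng.choice(a) for _ in a]
        rb = [rng.choice(b) for _ in b]
        diffs.append(sum(ra) / len(ra) - sum(rb) / len(rb))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Made-up per-run citation shares for two domains.
domain_a = [0.21, 0.35, 0.18, 0.30, 0.26]
domain_b = [0.24, 0.19, 0.33, 0.22, 0.28]
lo, hi = bootstrap_diff_ci(domain_a, domain_b)
print(f"95% CI for difference: [{lo:.3f}, {hi:.3f}]")
```

If the interval contains zero, a difference seen in a single-run snapshot would not be treated as significant under this check.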

Falsifiability condition for intermediation-collapse: If intermediary margins remain stable despite measurable declines in information frictions, the intermediation-collapse mechanism is falsified.

Stated empirical test in the paper that compares measured intermediary markups/margins to proxies for information frictions and AI-driven automation across affected sectors.

high null result Abundant Intelligence and Deficient Demand: A Macro-Financia... intermediary margins versus measures of information frictions/automation

Falsifiability condition for Ghost GDP: If monetary velocity does not decline (or instead rises) as the labor share falls, the Ghost GDP channel is unsupported by the data.

Explicit falsification condition provided in the paper based on the model link labor share -> velocity -> consumption; suggested empirical test using monetary-velocity proxies and labor-share series from FRED.

high null result Abundant Intelligence and Deficient Demand: A Macro-Financia... empirical relationship between labor share and monetary velocity

Empirically, top-quintile households account for roughly 47–65% of U.S. consumption.

Calibration and reported quantitative scenarios in the paper using U.S. consumption concentration data (constructed from U.S. consumption/income micro- and macro-data sources referenced in the methods section).

high null result Abundant Intelligence and Deficient Demand: A Macro-Financia... share of U.S. consumption attributable to the top income quintile

Economy & Finance threads contained no self-referential content, suggesting agents can engage in market discussion without representing themselves as agents.

Topic-model-derived topical category labeling and tagging for self-referential themes showing zero instances of self-reference in posts categorized as Economy & Finance in the dataset; counts derived from the 361,605 posts.

high null result What Do AI Agents Talk About? Emergent Communication Structu... presence/absence of self-referential tags in Economy & Finance posts

Because the sample is small and purposive and the design is qualitative, insights are rich but not statistically representative or quantified across the broader research landscape.

Authors' stated study limitations in the paper acknowledging small purposive sample (n=16) and qualitative design.

high null result RCTs & Human Uplift Studies: Methodological Challenges and P... representativeness and generalizability of study findings

The study's data come from semi-structured interviews with 16 expert practitioners across biosecurity, cybersecurity, education, and labor.

Study methods reported in the paper: qualitative data source explicitly stated as 16 semi-structured interviews across listed domains.

high null result RCTs & Human Uplift Studies: Methodological Challenges and P... sample size and domain coverage of interviews

The workshop identifies specific research directions for AI economics: cost–benefit and ROI analyses of shared infrastructure; market design for procurement of co-designed systems; models of innovation incentives under different IP/data-governance regimes; labor market impact assessments; and empirical studies of how validation ecosystems affect adoption rates and pricing.

Explicitly listed research directions in the workshop summary and roadmap produced by consensus at the NSF workshop (Sept 26–27, 2024).

high null result Report for NSF Workshop on Algorithm-Hardware Co-design for ... articulated research agenda items and priority areas for future empirical study

The workshop's findings are based on qualitative synthesis of expert judgment and stakeholder inputs rather than primary empirical data or controlled experiments.

Explicitly stated in the Data & Methods section of the workshop summary; methods: expert panels, thematic breakout sessions, cross-disciplinary discussions, consensus-building.

high null result Report for NSF Workshop on Algorithm-Hardware Co-design for ... nature and strength of empirical support for the recommendations (qualitative vs...

The workshop convened researchers, clinicians, and industry leaders to address co-design across four thematic areas: teleoperations/telehealth/surgical operations; wearable and implantable medicine; home ICU/hospital systems/elderly care; and medical sensing/imaging/reconstruction.

Workshop agenda and participant list from the two-day NSF workshop (Sept 26–27, 2024); methods included thematic breakout sessions focused on these four areas. Documentation at https://sites.google.com/view/nsfworkshop.

high null result Report for NSF Workshop on Algorithm-Hardware Co-design for ... topics and thematic coverage of the workshop

Empirical work (experiments and measurements) is needed to quantify how much value interpretive traces add to downstream outputs, how RATs affect platform incentives, and what governance frameworks fairly allocate resulting rents.

Concluding recommendation in the paper stating the research gaps; not an empirical claim but a stated need.

high null result Chasing RATs: Tracing Reading for and as Creative Activity research agenda items (value quantification, platform incentive effects, governa...

The current presentation of RATs is speculative and illustrative; empirical validation, scalability, and ethical safeguards remain to be developed.

Limitations section of the paper explicitly states the speculative nature and lack of empirical evaluation.

high null result Chasing RATs: Tracing Reading for and as Creative Activity status of empirical validation/scalability/ethical development

Implementation of RATs requires instrumentation at the browser/platform level or via plugins and must address privacy/consent, storage/ownership, sharing controls, and interoperable trace formats.

Design and implementation considerations enumerated in the paper; this is a requirements statement rather than an empirical claim.

high null result Chasing RATs: Tracing Reading for and as Creative Activity implementation requirements and privacy/governance needs

Analytical approaches compatible with RATs include sequence/trajectory mining, network analysis of associations/co-read graphs, embedding/clustering of trajectories, qualitative inspection of reflections, and experimental (A/B or RCT) evaluation of downstream effects.

Methods section of the paper listing suggested analytical techniques; these are proposed methods rather than applied analyses.

high null result Chasing RATs: Tracing Reading for and as Creative Activity analytical approaches applicable to RAT data

The paper does not present large-scale empirical validation; its evidence is primarily theoretical exposition, a constructed illustrative example, and a literature survey.

Explicit description of methods and data in the paper (analysis type: theoretical exposition + illustrative example; no experimental sample reported).

high null result Ergodicity in reinforcement learning presence/absence of empirical experiments or sample-based validation

Local stochastic fluctuations can undo early discovery leads, preventing transient superiority from becoming permanent unless additional asymmetries intervene.

Dynamical analysis of monopolization stage in the model and simulation trajectories showing reversal or loss of early leads in symmetric interaction regimes; theoretical demonstration that fluctuations can destabilize early footholds.

high null result Macroscopic Dominance from Microscopic Extremes: Symmetry Br... persistence of local leads over time (probability of lead reversal due to stocha...

Transient superiority (finding resources faster) by itself does not stabilize a system-wide monopoly; early leads are fragile and can be undone by local stochastic fluctuations.

Analysis of monopolization dynamics and absorbing-state stability within the stochastic spatial model, plus numerical simulations showing symmetric interaction scenarios do not produce robust absorbing monopolies. This is model-based (no empirical validation).

high null result Macroscopic Dominance from Microscopic Extremes: Symmetry Br... long-term persistence/probability of absorbing (system-wide monopoly) state give...

There is limited empirical causal evidence linking specific explanation types to long-term outcomes (safety, fairness, economic performance) in real-world deployments.

Meta-level finding of the review: authors report gaps in the literature—few causal or longitudinal studies of explanation interventions in deployed, high-stakes settings.

high null result Explainable AI in High-Stakes Domains: Improving Trust, Tran... evidence availability for causal effects on safety, fairness, economic performan...

The literature groups explainability impacts along three linked dimensions — user trust, ethical governance, and organizational accountability.

Analytical result of the review's thematic coding and synthesis across interdisciplinary literature (categorization derived from the reviewed corpus).

high null result Explainable AI in High-Stakes Domains: Improving Trust, Tran... categorization structure of explainability impacts (three-dimension taxonomy)

The paper is primarily theoretical and prescriptive: it synthesizes literature and proposes a framework and design guidelines rather than reporting large-scale empirical datasets or causal identification of economic outcomes.

Meta-claim about the paper's methods explicitly stated in the Data & Methods summary; based on the paper's methodological description.

high null result Toward a science of human–AI teaming for decision-making: A ... presence/absence of empirical datasets or causal identification studies in the p...

Key measurable outcomes to assess Human–AI teams include accuracy/efficiency, robustness to novel cases, decision consistency, trust/misuse rates, training costs, and inequity indicators.

Prescriptive list of metrics offered by the authors as part of the research agenda and evaluation guidance; not empirically derived from a dataset in the paper.

high null result Toward a science of human–AI teaming for decision-making: A ... accuracy, efficiency, robustness, consistency, trust/misuse rates, training cost...

Empirical evaluation strategies for Human–AI teams should include randomized interventions, field trials, lab experiments, phased rollouts (difference-in-differences), and structural models that allow interaction terms between human skill and AI quality.

Methodological recommendation in the paper; suggested study designs rather than implemented analyses.

high null result Toward a science of human–AI teaming for decision-making: A ... appropriate empirical identification of team-level complementarities and causal ...
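
The difference-in-differences logic behind a phased rollout can be illustrated with a minimal 2 × 2 sketch. It assumes one treated group and one control group observed before and after the rollout, and it is not taken from the paper; the function name `did_estimate` and the numbers are hypothetical.

```python
def did_estimate(pre_treated, post_treated, pre_control, post_control):
    """2 x 2 difference-in-differences on group means."""
    def mean(xs):
        return sum(xs) / len(xs)

    change_treated = mean(post_treated) - mean(pre_treated)
    change_control = mean(post_control) - mean(pre_control)
    # The control group's change stands in for what the treated group
    # would have experienced without the rollout (parallel-trends assumption).
    return change_treated - change_control


# Made-up decision-accuracy scores for illustration.
effect = did_estimate(
    pre_treated=[0.70, 0.72],
    post_treated=[0.80, 0.82],
    pre_control=[0.69, 0.71],
    post_control=[0.72, 0.74],
)
print(f"Estimated effect: {effect:.2f}")
```

The estimate is only as credible as the parallel-trends assumption noted in the comment.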

Measuring AI's economic impact requires new metrics that account for decision-value uplift, reduced tail-risk exposures, and dynamic gains from continuous learning; causal identification will require experiments or staggered rollouts.

Methodological recommendation backed by conceptual discussion of measurement challenges; no implementation of such measurement approaches is reported in the paper.

high null result Next-Generation Financial Analytics Frameworks for AI-Enable... proposed measurement constructs (decision-value uplift, tail-risk reduction, lea...

Performance and evaluation should be measured using forecast accuracy, decision lift/value added, latency, and false positive/negative rates.

Paper-prescribed evaluation metrics; presented as recommended practice rather than derived from empirical testing within the paper.

high null result Next-Generation Financial Analytics Frameworks for AI-Enable... forecast accuracy, decision lift (value added), system latency, false positive/n...
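
Of the metrics listed, the false positive and false negative rates are the most mechanical to compute. A minimal sketch, assuming binary labels where 1 marks a flagged case (the function name `error_rates` and the labels are hypothetical, not from the paper):

```python
def error_rates(y_true, y_pred):
    """Return (false_positive_rate, false_negative_rate) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against empty classes rather than dividing by zero.
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    return fpr, fnr


# Made-up labels for illustration.
fpr, fnr = error_rates([1, 1, 0, 0], [1, 0, 1, 0])
print(fpr, fnr)
```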

Core AI techniques for these frameworks include supervised/unsupervised ML, NLP for unstructured text, anomaly detection for control/transaction monitoring, and reinforcement/prescriptive models for recommendations.

Methodological claim listing standard ML/NLP/anomaly-detection techniques and prescriptive approaches; statement of methods rather than an empirical comparison of alternatives.

high null result Next-Generation Financial Analytics Frameworks for AI-Enable... method adoption/type metrics (e.g., frequency of supervised vs. unsupervised met...

Next‑gen frameworks use large-scale structured (transactions, ledgers, KPIs) and unstructured sources (reports, news, contracts, call transcripts) to power models.

Descriptive claim listing data types the paper recommends; presented as design input requirements rather than empirically validated data-integration projects.

high null result Next-Generation Financial Analytics Frameworks for AI-Enable... data coverage and diversity (e.g., proportion of structured vs. unstructured inp...

There is a need for quantitative studies and microdata on firm-level RM practices, AI adoption, and performance outcomes to measure effect sizes and causal pathways.

Stated research gaps and limitations in the review (lack of primary empirical quantification; heterogeneity across contexts).

high null result The Role of Risk Management as an Organizational Management ... availability of quantitative evidence on RM effects (effect sizes, causal estima...

The review's conclusions are limited by reliance on published literature (potential bias toward successful implementations), lack of primary empirical quantification (no effect sizes), and heterogeneity across organizational contexts limiting direct generalizability.

Explicit limitations stated in the paper summarizing scope and method (qualitative literature review, secondary evidence only).

high null result The Role of Risk Management as an Organizational Management ... generalizability and empirical precision of review findings

Heterogeneity in system designs and deployment contexts complicates cross-site comparisons.

Limitations section and observed variation in platform architectures, degrees of automation, and governance across sites reported via descriptive data and interviews.

high null result The Role of Artificial Intelligence in Healthcare Complaint ... comparability across deployment sites (heterogeneity in systems and contexts)

Non-random selection of institutions limits causal inference and external generalizability of the study's findings.

Study limitations explicitly state non-random site selection and heterogeneous deployments; methodological note that causal claims are constrained.

high null result The Role of Artificial Intelligence in Healthcare Complaint ... generalizability and causal inference validity

There is a need for standardized metrics and measurement protocols for public-sector productivity and non-market outcomes (service quality, processing time, cost per transaction, transparency, trust).

Methodological critique within the review pointing to heterogeneity of outcome measures across studies and calling for standardized metrics; based on synthesis of reviewed literature.

high null result Digital Transformation and AI Adoption in Government: Evalua... existence/adoption of standardized measurement protocols and consistency of repo...

Much of the literature on public-sector digital/AI interventions is descriptive or case-based; causal, quantitative evidence on net productivity effects is limited and context-dependent.

Methodological assessment within the review noting heterogeneous study designs, reliance on secondary sources, and a lack of randomized or quasi-experimental studies; the review explicitly states this limitation.

high null result Digital Transformation and AI Adoption in Government: Evalua... availability of causal quantitative estimates of productivity impacts

Research and monitoring priorities for economists include task-level analyses of substitutability/complementarity, modeling adoption as a function of regulatory costs and reimbursement incentives, and evaluating long-run welfare and distributional effects.

Explicit research recommendations stated in the narrative review, based on gaps identified in the literature and evolving empirical questions.

high null result Will AI Replace Physicians in the Near Future? AI Adoption B... research activity in recommended areas; quality of evidence informing policy

Policymakers and payers should consider liability reform, reimbursement models that reward safe human–AI collaboration, funding for independent clinical validation, and measures to prevent market concentration.

Policy recommendations and implications derived from the narrative review's synthesis of regulatory, economic, and implementation challenges.

high null result Will AI Replace Physicians in the Near Future? AI Adoption B... policy actions implemented (liability reform, reimbursement changes, funding all...

Research priorities include causal studies on AI’s impacts on SME productivity, employment and inequality in LMICs; cost–benefit analyses of financing and policy interventions; evaluation of data governance models; and development of metrics/monitoring systems for inclusive adoption.

Authors' identification of evidence gaps from the structured literature review highlighting areas with insufficient causal or evaluative research.

high null result Artificial Intelligence Adoption for Sustainable Development... existence and quality of targeted causal and evaluative research on AI in LMIC S...
