Evidence (11633 claims)
- Adoption: 7395 claims
- Productivity: 6507 claims
- Governance: 5877 claims
- Human-AI Collaboration: 5157 claims
- Innovation: 3492 claims
- Org Design: 3470 claims
- Labor Markets: 3224 claims
- Skills & Training: 2608 claims
- Inequality: 1835 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 609 | 159 | 77 | 736 | 1615 |
| Governance & Regulation | 664 | 329 | 160 | 99 | 1273 |
| Organizational Efficiency | 624 | 143 | 105 | 70 | 949 |
| Technology Adoption Rate | 502 | 176 | 98 | 78 | 861 |
| Research Productivity | 348 | 109 | 48 | 322 | 836 |
| Output Quality | 391 | 120 | 44 | 40 | 595 |
| Firm Productivity | 385 | 46 | 85 | 17 | 539 |
| Decision Quality | 275 | 143 | 62 | 34 | 521 |
| AI Safety & Ethics | 183 | 241 | 59 | 30 | 517 |
| Market Structure | 152 | 154 | 109 | 20 | 440 |
| Task Allocation | 158 | 50 | 56 | 26 | 295 |
| Innovation Output | 178 | 23 | 38 | 17 | 257 |
| Skill Acquisition | 137 | 52 | 50 | 13 | 252 |
| Fiscal & Macroeconomic | 120 | 64 | 38 | 23 | 252 |
| Employment Level | 93 | 46 | 96 | 12 | 249 |
| Firm Revenue | 130 | 43 | 26 | 3 | 202 |
| Consumer Welfare | 99 | 51 | 40 | 11 | 201 |
| Inequality Measures | 36 | 105 | 40 | 6 | 187 |
| Task Completion Time | 134 | 18 | 6 | 5 | 163 |
| Worker Satisfaction | 79 | 54 | 16 | 11 | 160 |
| Error Rate | 64 | 78 | 8 | 1 | 151 |
| Regulatory Compliance | 69 | 64 | 14 | 3 | 150 |
| Training Effectiveness | 81 | 15 | 13 | 18 | 129 |
| Wages & Compensation | 70 | 25 | 22 | 6 | 123 |
| Team Performance | 74 | 16 | 21 | 9 | 121 |
| Automation Exposure | 41 | 48 | 19 | 9 | 120 |
| Job Displacement | 11 | 71 | 16 | 1 | 99 |
| Developer Productivity | 71 | 14 | 9 | 3 | 98 |
| Hiring & Recruitment | 49 | 7 | 8 | 3 | 67 |
| Social Protection | 26 | 14 | 8 | 2 | 50 |
| Creative Output | 26 | 14 | 6 | 2 | 49 |
| Skill Obsolescence | 5 | 37 | 5 | 1 | 48 |
| Labor Share of Income | 12 | 13 | 12 | — | 37 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
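The matrix above can be queried programmatically. A minimal sketch, using rows copied from the table (restricted to outcomes whose direction counts sum exactly to the listed total, since some rows include claims not classified by direction), computes the share of positive findings per outcome; the helper name is illustrative:

```python
# Rows mirror the matrix columns: (positive, negative, mixed, null).
# Only outcomes whose direction counts sum to the listed total are included.
matrix = {
    "Task Completion Time": (134, 18, 6, 5),   # listed total 163
    "Job Displacement": (11, 71, 16, 1),       # listed total 99
    "Error Rate": (64, 78, 8, 1),              # listed total 151
}

def positive_share(row):
    """Fraction of claims with a positive direction among all classified claims."""
    pos, neg, mixed, null = row
    return pos / (pos + neg + mixed + null)

for outcome, row in matrix.items():
    print(f"{outcome}: {positive_share(row):.2f}")
```

On these rows, Task Completion Time skews strongly positive (0.82) while Job Displacement skews negative (0.11), matching the table's qualitative pattern.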
There are important regional differences—especially in developing contexts—that necessitate context-specific approaches to improving women’s participation in AI-enabled work.
Observation reported in the review drawing on geographically diverse studies and policy analyses; the abstract does not quantify differences or report sample sizes for cross-region comparisons.
Social, cultural, and ethical considerations influence women’s engagement in AI-centric workplaces.
Claim made in the review, based on interdisciplinary literature that includes sociocultural analyses and ethical discussions; the abstract does not provide empirical effect estimates or sample sizes.
AI applications—ranging from recruitment algorithms to workplace automation—can either reinforce gender disparities or promote equitable employment outcomes.
Stated in the review based on collated findings from multiple studies and analyses that document both harms (e.g., biased recruitment algorithms) and potential benefits (e.g., tools designed to reduce bias); no single empirical study or pooled effect size provided in the abstract.
Artificial Intelligence (AI) is rapidly transforming workplaces across the globe, offering both novel opportunities and unique challenges for women in technology-driven industries.
Stated in the paper's introduction/abstract as a summary conclusion based on a narrative literature review of peer-reviewed studies, policy analyses, and preprint research; no specific sample size or primary empirical method reported in the abstract.
The study proposes a sectoral risk classification to better understand vulnerability patterns and workforce implications.
Paper reports development/proposal of a sectoral risk classification as a contribution (the classification itself and validation details are not described in the abstract).
The rapid integration of Artificial Intelligence (AI) across industries is fundamentally reshaping occupational structures and redefining employment dynamics.
Stated as an overall conclusion of the paper based on a systematic review of recent literature from major academic databases (details of included studies not provided in the abstract).
These efficiency gains are offset by a growing 'Efficiency-Legitimacy Paradox' (i.e., improvements in efficiency come with worsening legitimacy concerns).
Conceptual synthesis from the systematic review (2018-2026) identifying a recurring trade-off across reviewed studies; specific empirical quantification not provided in abstract.
There is a structural shift from 'street-level' bureaucracies to 'system-level' architectures, which can be defined as the institutional delegation of 'Artificial Discretion' to algorithmic infrastructures.
Synthesis from the PRISMA-guided systematic review of literature (2018-2026) reporting observed changes in administrative architectures; specific studies not enumerated in abstract.
As a General-Purpose Technology (GPT), Artificial Intelligence (AI) is fundamentally reconfiguring state capacity, as well as the mechanics of global economic management.
Systematic review of current research studies (2018-2026) conducted following PRISMA guidelines; synthesis of literature claiming broad institutional and macroeconomic effects. Number of studies not specified in abstract.
Agentic AI differs from traditional algorithmic trading and generative AI through its capacity for goal-oriented autonomy, continuous learning, and multi-agent coordination.
Analytic comparison and synthesis across prior research and technical architectures in the survey; descriptive/definitional rather than empirical testing.
Uncertainty-aware exploration (in algorithms) alters fairness metrics compared to policies that ignore uncertainty.
Results from simulation experiments compare uncertainty-aware exploration policies to baseline policies and report changes in fairness metrics (as described in the abstract and results).
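The mechanism behind this claim can be illustrated with a toy bandit simulation: a UCB-style policy adds an uncertainty bonus that shrinks as a group is sampled more, while a greedy policy ranks groups by point estimates alone and can starve under-observed groups. This is a sketch under assumed dynamics, not the paper's actual experiment; all names, rates, and parameters here are hypothetical:

```python
import math
import random

def ucb_score(mean, n, t, c=1.0):
    # Point estimate plus an uncertainty bonus that shrinks as n grows.
    return mean + c * math.sqrt(math.log(t + 1) / (n + 1))

def run_policy(true_rates, rounds=2000, uncertainty_aware=True, seed=0):
    """Repeatedly pick one group per round; return per-group selection rates."""
    rng = random.Random(seed)
    counts = {g: 0 for g in true_rates}
    successes = {g: 0 for g in true_rates}
    for t in range(rounds):
        def value(g):
            mean = successes[g] / counts[g] if counts[g] else 0.0
            return ucb_score(mean, counts[g], t) if uncertainty_aware else mean
        g = max(true_rates, key=value)
        counts[g] += 1
        if rng.random() < true_rates[g]:
            successes[g] += 1
    total = sum(counts.values())
    return {g: counts[g] / total for g in counts}

rates = {"A": 0.6, "B": 0.5}
print(run_policy(rates, uncertainty_aware=False))  # greedy locks onto one group
print(run_policy(rates, uncertainty_aware=True))   # UCB keeps sampling both
```

The greedy policy never revisits group B once group A's estimate is positive, so a selection-rate fairness metric (e.g., the gap between groups) differs sharply between the two policies, which is the qualitative pattern the claim describes.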
Analysis of more than two decades of M&A deals reveals shifts in acquisition activity and allows mapping of corporate linkages and overlapping investments.
Empirical longitudinal analysis of M&A deals over a period exceeding 20 years; method: mapping corporate linkages from M&A data (sample size/dataset not specified in the excerpt).
The emissions effects of digital trade are conditional rather than uniform, depending on complementary policy (carbon pricing, regulatory stringency), technological (AI-enhanced logistics), and energy (renewables) factors.
Synthesis of findings from fixed-effects regressions with interactions, carbon-pricing threshold analysis, machine-learning threshold detection, and SEM mediation on the monthly panel of 38 OECD economies (2000–2024).
Operationalizing hardware-based governance must address transition realities including legacy hardware, attestation at scale, and protection of civil liberties.
Policy implementation analysis in the paper identifying practical challenges to deploying hardware-layer controls (conceptual/operational analysis; no empirical trial data provided).
For LLM agents, memory management critically impacts efficiency, quality, and security.
Statement in paper framing and motivation; supported conceptually by literature linking memory design to system properties (no specific experimental details provided in abstract).
The experimental findings are consistent with the paper's theoretical predictions.
Comparison reported in the paper between theoretical model predictions and observed outcomes from the controlled AI-agent trading experiments.
Coding patterns are bimodal: in 41% of sessions, agents author virtually all committed code ("vibe coding"), while in 23%, humans write all code themselves.
Empirical analysis of authorship attribution across the 6,000 sessions in the SWE-chat dataset; percentages derived from session-level classification.
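A session-level classification like the one described can be sketched as a simple thresholding rule over authorship shares; the thresholds and function name here are assumptions for illustration, not the paper's actual criteria:

```python
def classify_session(agent_lines, human_lines, hi=0.95, lo=0.05):
    """Label a session by the share of committed lines authored by the agent."""
    total = agent_lines + human_lines
    if total == 0:
        return "no code"
    share = agent_lines / total
    if share >= hi:
        return "vibe coding"      # agent authors virtually all committed code
    if share <= lo:
        return "human-authored"   # humans write all code themselves
    return "mixed"
```

Applying such a rule across all sessions and tallying the labels would yield the bimodal distribution (41% vibe coding, 23% human-authored) the claim reports.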
A determinism study of 10 replays per case at temperature zero shows both architectures inherit residual API-level nondeterminism, but DPM exposes one nondeterministic call while summarization exposes N compounding calls.
Determinism experiment with 10 replays per case at temperature zero; qualitative/quantitative observation about number of nondeterministic LLM calls exposed by each architecture.
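The compounding effect described above follows from basic probability: if each API call independently deviates on replay with some small probability, an architecture exposing N calls diverges far more often than one exposing a single call. The per-call deviation rate below is an assumed illustrative value, not a figure from the paper:

```python
def replay_divergence_prob(p_call, n_calls):
    """Chance that at least one of n_calls independent API calls deviates on
    a replay, assuming each call deviates with probability p_call."""
    return 1.0 - (1.0 - p_call) ** n_calls

# One exposed nondeterministic call (DPM-style) vs. N compounding calls
# (summarization-style), at an assumed 2% per-call deviation rate:
print(replay_divergence_prob(0.02, 1))
print(replay_divergence_prob(0.02, 10))
```

With ten compounding calls the replay-divergence probability rises roughly ninefold, which is why the single-call architecture is easier to audit with a 10-replay determinism study.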
Advanced prompting methods improve accuracy on inconclusive cases but over-correct, withholding decisions even on clear cases.
Empirical comparison of prompting methods reported in paper: advanced prompts increased accuracy on inconclusive (insufficient-information) cases but led to excessive deferral/withholding on clear cases.
Multi-agent workflows and benchmark evaluation reveal current capabilities, limitations, and research frontiers in agentic AI for physical design.
The paper states it analyzes recent experience with multi-agent workflows and benchmark evaluation; the abstract does not provide specific benchmark names, metrics, or sample sizes.
Effective AI policy mixes are contingent on regional resource endowments and development conditions (i.e., variation across configurations indicates contingency on regional context).
Observed variation across the fsQCA-derived configurations; authors interpret differences as reflecting dependence on regional resources and development conditions.
The study was a preregistered experiment across seven leading LLMs and twelve investment scenarios covering legitimate, high-risk, and objectively fraudulent opportunities.
Methodological description in the paper stating preregistration, 7 LLMs, 12 scenarios; combined dataset included 3,360 AI advisory conversations and a 1,201-participant human benchmark.
There is significant heterogeneity in methodological rigor across studies.
Authors' thematic observation from quality appraisal/extraction noting wide variation in methods, validation approaches, and reporting standards among the 64 studies.
AI is increasingly being integrated into both existing and newly emerging digital infrastructures, altering their architecture, functional role, and strategic significance; these systems are beginning to operate as embedded cognitive infrastructures that shape knowledge production, decision-making, and institutional processes.
Conceptual and descriptive claim presented by the paper (theoretical analysis/literature-informed observation). No empirical sample size or quantitative methods reported in the provided text.
Hybrid ML+rules systems achieve partial DES-property fillability.
Result of the paper's analytic comparison across the four architectures identifying relative fillability levels for hybrid ML+rules systems.
Artificial intelligence raises the threshold at which refinement adds value.
Theoretical/analytical statement in the paper describing AI's effect on the marginal value of refinement; no empirical quantification provided in the excerpt.
Open-source versus closed-source trade-offs (including deployment architectures and competitive differentiation) are a central strategic consideration when selecting an enterprise LLM approach.
Paper's comparative analysis of open-source and closed-source alternatives and discussion of strategic implications; supported by the Bills Converter design rationale.
AI is associated with a shift toward younger, relatively less educated workers.
Reported association in the paper's baseline empirical results linking AI presence/pervasiveness to changes in workforce composition (age and education).
AI is becoming a geopolitical tool that defines trade, finance, supply chains, surveillance abilities, and diplomatic bargaining power.
Conceptual/qualitative synthesis in the paper's argument; no empirical methods or sample size reported in the abstract.
Variable importance improvements to zero-shot tabular classification produce mixed results with respect to algorithmic fairness.
Authors report experiments applying variable-importance-based adjustments to zero-shot LLM tabular classification and evaluating resulting algorithmic fairness outcomes; described as producing mixed results. (Sample size not provided in abstract.)
Targeted prompt interventions significantly alter the magnitude of market bubbles (they can amplify or suppress bubble size).
Randomized (or otherwise experimentally manipulated) prompt interventions applied to LLM agents in the simulated open-call auction, with resulting differences in measured bubble magnitude reported.
Analysis of agents' reasoning text through a twenty-mechanism scoring framework shows that targeted prompt interventions causally amplify or suppress specific behavioral mechanisms.
Qualitative and quantitative analysis of agents' chain-of-thought / reasoning text using a 20-mechanism scoring framework; experimental manipulations of prompts reported to change mechanism scores (interpreted causally as interventions on prompts).
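Bubble magnitude in experiments like this is commonly quantified with the relative absolute deviation (RAD) from experimental finance: the mean absolute gap between traded prices and fundamental values, normalized by the mean fundamental value. Whether this paper uses RAD specifically is an assumption; the sample numbers are illustrative:

```python
def relative_absolute_deviation(prices, fundamentals):
    """RAD: mean absolute deviation of price from fundamental value,
    normalized by the mean fundamental value."""
    assert len(prices) == len(fundamentals) and prices
    mean_f = sum(fundamentals) / len(fundamentals)
    return sum(abs(p - f) for p, f in zip(prices, fundamentals)) / (len(prices) * mean_f)

# A price path that tracks fundamentals has RAD 0; an inflated path does not.
baseline = relative_absolute_deviation([100, 100, 100], [100, 100, 100])
bubble = relative_absolute_deviation([110, 130, 150], [100, 100, 100])
```

Comparing such a metric across prompt-intervention arms is one way the reported amplification or suppression of bubble size could be measured.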
Given the results, educators should revisit pair programming as an educational tool in addition to embracing modern AI.
Authors' recommendation in the paper's conclusion based on experimental findings (performance, workload, emotion, retention outcomes).
Both US and Chinese strategies depend on cross-country relationships in AI innovation.
Conceptual assertion motivating the network analysis of international collaborations and citations.
Formal network verification has made substantial progress in proving correctness properties but is typically applied in offline, pre-deployment settings and faces challenges in accommodating continuous changes and validating live production behavior.
Authors' summary of the state of the art in network verification (assertion in paper; no empirical data in abstract).
Overall, the proposed HRL framework improves learning efficiency and scalability, outperforming heuristic baselines while remaining below the perfect-information oracle bound.
Results reported in the paper from simulation experiments comparing the HRL framework to heuristic baselines and the oracle; pairwise differences analyzed (Wilcoxon tests referenced). The paper asserts better performance than heuristics but still worse than the oracle.
The proposed safety-filter outperforms a standalone deep reinforcement learning-based controller in energy and cost metrics, with only a slight increase in comfort temperature violations.
Reported experimental comparison between the safety-filter-enhanced controller and a standalone DRL controller in the paper; specific metrics and sample size not provided in the excerpt.
Results also reveal divergences between the two interaction scenario types.
Abstract statement that divergences vary across different interaction contexts/scenario types.
Results reveal divergences between purely simulated and human study datasets.
Abstract reports that findings diverge between simulation experiments and the human-subjects dataset; comparisons drawn across the two datasets (simulation N=2000, human N=290).
Confirmatory Factor Analysis (CFA) and Structural Equation Modeling (SEM) verified correlations among educational background, gender inclusiveness, digital literacy, and perceived algorithmic fairness.
Paper reports use of CFA and SEM to test relationships among those variables; reliability/fit supported by Composite Reliability (CR), Average Variance Extracted (AVE), and model-fit indicators.
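The Composite Reliability (CR) and Average Variance Extracted (AVE) indicators mentioned above have standard closed-form definitions over standardized factor loadings. A sketch using those standard formulas (the loadings are made up, not the paper's data):

```python
def average_variance_extracted(loadings):
    """AVE: mean squared standardized loading of a construct's indicators."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    where each standardized indicator's error variance is 1 - loading^2."""
    s = sum(loadings)
    err = sum(1 - l * l for l in loadings)
    return s * s / (s * s + err)

# Common rules of thumb: AVE > 0.5 and CR > 0.7 suggest adequate convergent
# validity and reliability (illustrative loadings only).
loadings = [0.8, 0.8, 0.8]
print(average_variance_extracted(loadings), composite_reliability(loadings))
```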
Experienced developers maintain control through detailed delegation while novices struggle between over-reliance and cautious avoidance.
Observed behaviors and accounts from the AI-assisted debugging task (10 juniors) and senior participants in ACTA/Delphi and blind review phases (5 + 5 seniors).
AI is not just changing how engineers code—it is reshaping who holds agency across work and professional growth.
Qualitative synthesis of findings across the three-phase study (Delphi with 5 seniors; debugging task with 10 juniors; blind reviews by 5 seniors).
The rapid advancement of artificial intelligence (AI) technologies, particularly generative AI and large language models, has reignited debates about the future of work and the potential for widespread labor market disruption.
Statement in the paper's introduction/abstract citing recent empirical studies, industry reports, and ongoing debates; no original sample or numerical evidence reported in the abstract.
How software developers interact with AI-powered tools, including Large Language Models (LLMs), plays a vital role in how those tools impact them.
Based on qualitative analysis of twenty-two interviews with software developers about using LLMs for software development; asserted as a central finding in the paper's analysis.
Outcomes of AI deployment in labor-market settings depend on complementary organizational practices, workers’ access to skills, and the regulatory environment.
Synthesis-derived moderator/mechanism claim from qualitative analysis of the 19 included studies identifying organizational practices, skill access, and regulation as contextual moderators.
Benefits of technology and data analytics are context-dependent, with emerging markets facing unique regulatory and infrastructural barriers.
Narrative synthesis of included studies noting heterogeneity by context and reports of regulatory/infrastructural constraints in emerging markets.
Cybersecurity has a moderating effect on audit data analytics.
Synthesis statement in the review summarizing included studies that report cybersecurity influences the effectiveness/usability of audit data analytics.
No aggregation mechanism can simultaneously satisfy all desiderata of collective rationality (connection to Arrow's Impossibility Theorem); multi-agent deliberation navigates rather than resolves this constraint.
Theoretical argument connecting empirical multi-agent deliberation results to Arrow's Impossibility Theorem and observations that deliberation trades off competing desiderata rather than achieving all simultaneously.
Alignment systematically shapes negotiation strategies and allocation patterns between agents.
Experimentally comparing negotiation behavior and allocation outcomes across agent pairs where one agent is aligned (via RAG) and the partner is either unaligned or adversarially prompted; patterns of strategy and allocation differences reported.
The design space articulates four configurations—No AI, Hidden AI, Translucent AI, and Visible AI—each trading off among accountability, autonomy, and coordination cost.
Conceptual taxonomy introduced in the paper (design artifact). No empirical evaluation or sample reported in the abstract; tradeoffs are argued theoretically.