Evidence (3470 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	609	159	77	736	1615
Governance & Regulation	664	329	160	99	1273
Organizational Efficiency	624	143	105	70	949
Technology Adoption Rate	502	176	98	78	861
Research Productivity	348	109	48	322	836
Output Quality	391	120	44	40	595
Firm Productivity	385	46	85	17	539
Decision Quality	275	143	62	34	521
AI Safety & Ethics	183	241	59	30	517
Market Structure	152	154	109	20	440
Task Allocation	158	50	56	26	295
Innovation Output	178	23	38	17	257
Skill Acquisition	137	52	50	13	252
Fiscal & Macroeconomic	120	64	38	23	252
Employment Level	93	46	96	12	249
Firm Revenue	130	43	26	3	202
Consumer Welfare	99	51	40	11	201
Inequality Measures	36	105	40	6	187
Task Completion Time	134	18	6	5	163
Worker Satisfaction	79	54	16	11	160
Error Rate	64	78	8	1	151
Regulatory Compliance	69	64	14	3	150
Training Effectiveness	81	15	13	18	129
Wages & Compensation	70	25	22	6	123
Team Performance	74	16	21	9	121
Automation Exposure	41	48	19	9	120
Job Displacement	11	71	16	1	99
Developer Productivity	71	14	9	3	98
Hiring & Recruitment	49	7	8	3	67
Social Protection	26	14	8	2	50
Creative Output	26	14	6	2	49
Skill Obsolescence	5	37	5	1	48
Labor Share of Income	12	13	12	—	37
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1

Org Design Remove filter

Traditional questionnaires yielded slightly higher accuracy in risk assessment.

Result reported from the two experiments comparing traditional questionnaires to adaptive ARQuest versions; no numeric accuracy or sample size provided in the excerpt.

high negative AI in Insurance: Adaptive Questionnaires for Improved Risk P... risk assessment accuracy

Insurers must blindly trust users' responses, increasing the chances of fraud.

Stated as a motivating problem in the paper; presented as logical/empirical concern rather than supported by a reported study within the paper.

high negative AI in Insurance: Adaptive Questionnaires for Improved Risk P... fraud risk from self-reported responses

Insurance application processes often rely on lengthy and standardized questionnaires that struggle to capture individual differences.

Descriptive claim in paper introduction arguing limitations of standard questionnaires; no experiment or sample size reported for this assertion.

high negative AI in Insurance: Adaptive Questionnaires for Improved Risk P... ability of standardized questionnaires to capture individual differences

Using a stylised inpatient capacity signalling example and minimal game-theoretic reasoning, task optimisation alone is unlikely to change system outcomes when incentives are unchanged.

Theoretical analysis using a stylised inpatient capacity signalling example and game-theoretic reasoning presented in the paper (no empirical data/sample reported in the abstract).

high negative Incentives, Equilibria, and the Limits of Healthcare AI: A G... system-level outcomes in healthcare (response to task optimisation interventions...

Deployment of AI systems carries significant costs including ongoing costs of monitoring and it is unclear whether optimism of a deus ex machina solution is well-placed.

Conceptual/argumentative claim made by the authors in the paper (no empirical study or sample size reported in the abstract).

high negative Incentives, Equilibria, and the Limits of Healthcare AI: A G... costs and uncertainty associated with AI deployment (including monitoring costs)

Improvements in operational resilience (OR) effectively reduce corporate operational risk.

Further analysis reported in the paper linking higher OR to lower operational risk measures for firms in the sample.

high negative Does Artificial Intelligence Improve the Operational Resilie... corporate operational risk (reduction)

AI promotes operational resilience by reducing management agency conflicts.

Mechanism (mediation) tests reported in the paper showing AI associated with reductions in measures of agency/management conflict, which in turn relate to OR improvements.

high negative Does Artificial Intelligence Improve the Operational Resilie... management agency conflicts (reduction)

Practitioners identified specific functional deficiencies in AI: inability to maintain sustained partnerships.

Theme from semi-structured interviews with 10 practitioners; cited as an example of the functional gap.

high negative Bridging the Socio-Emotional Gap: The Functional Dimension o... AI capability to maintain sustained collaborative partnerships

Practitioners identified specific functional deficiencies in AI: inability to adapt contextually.

Theme from semi-structured interviews with 10 practitioners; cited as an example of the functional gap.

high negative Bridging the Socio-Emotional Gap: The Functional Dimension o... AI capability for contextual adaptation in collaborative work

Practitioners identified specific functional deficiencies in AI: inability to negotiate responsibilities.

Theme from semi-structured interviews with 10 practitioners; cited as an example of the functional gap.

high negative Bridging the Socio-Emotional Gap: The Functional Dimension o... AI capability to negotiate responsibilities in teamwork

Practitioners currently view AI models as intellectual teammates rather than social partners and expect fewer SEI attributes from them than from human teammates.

Qualitative findings from semi-structured interviews with 10 software practitioners reported in the study.

high negative Bridging the Socio-Emotional Gap: The Functional Dimension o... practitioners' expectations of SEI attributes in AI versus human teammates

Current AI systems lack SEI capabilities that humans bring to teamwork, creating a potential gap in collaborative dynamics.

Framed as background/context in the paper; asserted rather than empirically tested in this study.

high negative Bridging the Socio-Emotional Gap: The Functional Dimension o... presence of SEI capabilities in AI systems (vs. humans)

Unbalanced or poorly governed adoption of Big Data and AI contributes to increased systemic risk, cybersecurity vulnerability, regulatory fragmentation and third-party dependence on BigTech platforms.

Argument based on qualitative literature review and synthesis of international empirical studies and comparative sector analysis; no single-sample empirical study in this paper.

high negative Implications of Big Data Technologies for the Resilience of ... systemic risk; cybersecurity vulnerability; regulatory fragmentation; third-part...

Task orchestration is the most under-researched dimension among the five workplace-design components.

Finding from the PRISMA-guided systematic review of 120 papers, which mapped coverage across the five dimensions and identified task orchestration as having the least research attention.

high negative From Automation to Augmentation: A Framework for Designing H... volume/coverage of research on task orchestration

Decision authority allocation emerges as the binding constraint for Society 5.0 transitions.

Result synthesized from the systematic review and theoretical analysis mapping the five workplace-design dimensions; stated as the binding constraint in the paper's findings.

high negative From Automation to Augmentation: A Framework for Designing H... constraint on transitions to human-centric (Society 5.0) technology integration

A weak manager directing a weak worker achieves a 42% success rate, performing worse than the weak agent alone which achieves 44%.

Empirical comparison across the same 200 SWE-bench Lite instances and pipeline configurations, comparing weak-manager+weak-worker pipeline to weak single-agent baseline.

high negative Can AI Models Direct Each Other? Organizational Structure as... task success rate (percentage of tasks solved)

Under low emotional intelligence, the model predicts higher risks of over-reliance on AI, emotionally detached communication, and weaker delegation quality.

Theoretical predictions derived from the EI-moderated human–AI model presented in the paper.

high negative LEADER EMOTIONAL INTELLIGENCE IN THE GENERATIVE AI ERA: “HUM... delegation quality (and over-reliance / communication quality)

The common claim that generative AI simply amplifies the Dunning–Kruger effect is too coarse to capture the available evidence.

Paper's synthesis of heterogenous empirical findings from human–AI interaction, learning research, and model evaluation used to critique the uniform-amplification interpretation; no single empirical countertest reported.

high negative Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupli... validity of the 'amplified Dunning–Kruger' interpretation

LLM use degrades metacognitive accuracy and flattens the classic competence–confidence gradient across skill groups (i.e., reduces calibration and narrows differences in self-assessed confidence by skill level).

Synthesis of studies from human–AI interaction and learning research reported in the paper that document worsened calibration and a reduction in the competence–confidence gradient when users rely on LLM outputs; the paper does not report a single combined sample size.

high negative Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupli... metacognitive accuracy / calibration and competence–confidence gradient

The agent team topology exhibits higher operational fragility due to multi-author code generation.

Reported empirical observation from experiments comparing architectures, attributing increased fragility/errors to multi-author code generation in the agent team setup (stated qualitatively; no quantitative failure rates provided in the abstract).

high negative An Empirical Study of Multi-Agent Collaboration for Automate... operational fragility / error-proneness associated with multi-author code genera...

Prominent studies predict substantial job displacement due to automation.

Paper asserts this as background, referencing the existence of prominent studies in the literature (no specific citations or sample sizes provided in the abstract).

high negative AI Civilization and the Transformation of Work job losses / displacement

For organizations of n humans with AI agents, the optimal team size decreases with agent capability.

Derived implication from the stylized model's analysis of multi-human organizations interacting with AI agents.

high negative The Novelty Bottleneck: A Framework for Understanding Human ... optimal team size as a function of agent capability

There is no smooth sublinear regime for human effort; it transitions sharply from O(E) to O(1) with no intermediate scaling class.

Mathematical derivation from a stylized model of human-AI collaboration that assumes tasks decompose into atomic decisions, a fraction ν are novel, and specification/verification/error correction scale with task size.

high negative The Novelty Bottleneck: A Framework for Understanding Human ... human effort scaling (human time/effort required as task size E grows)

There is a growing gap between rapid experimentation with AI tools and limited organizational capability to institutionalize them in everyday workflows.

Argument supported by targeted literature synthesis and review of recent scholarly and institutional sources; no primary empirical sample reported in this paper.

high negative Behavioral Factors as Determinants of Successful Scaling of ... organizational capability to institutionalize AI initiatives (pilot-to-productio...

Technological proximity has a noteworthy negative effect on collaboration, underscoring the importance of complementary knowledge in AI innovation.

SAOM estimates from longitudinal patent collaboration data (2013–2024) showing a statistically negative coefficient for technological proximity (implying organizations closer in technology space are less likely to form ties).

high negative The evolutionary mechanism of artificial intelligence indust... tie formation / collaboration probability (as a function of technological proxim...

Within the set of agentic-mention filings, autonomy evidence remains rare.

Empirical statement derived from analysis of the identified agentic-mention filings (small number of such filings reported across 2024–2025).

high negative Measuring agentic AI adoption and control frameworks in fina... presence/rarity of autonomy-related evidence within agentic-mention filings

Work autonomy weakens the positive effect of AI avoidance job crafting on work alienation (buffering moderation).

Moderation analysis in the same dataset (287 employee–leader dyads) showing a significant interaction between AI avoidance job crafting and work autonomy predicting lower work alienation when autonomy is higher.

high negative Approach or avoidance? A dual-pathway model of job crafting ... work alienation

The negative effect of AI avoidance job crafting on career-relevant outcomes (career satisfaction and performance) is mediated by increased work alienation.

Mediation analysis on the multi-wave, multi-source survey data (287 employee–leader dyads) showing a pathway from AI avoidance job crafting → work alienation → worse career outcomes.

high negative Approach or avoidance? A dual-pathway model of job crafting ... career satisfaction and performance (mediated by work alienation)

AI avoidance job crafting negatively predicts career satisfaction and performance.

Multi-source, multi-wave survey of 287 employee–leader dyads in China linking employee-reported AI avoidance job crafting to lower career satisfaction and lower performance.

high negative Approach or avoidance? A dual-pathway model of job crafting ... career satisfaction and performance

Analysis of global datasets on energy dependency, economic concentration, debt levels, demographic trends, digital infrastructure, and AI adoption highlights that interconnected systemic risks can amplify economic instability.

Paper reports drawing upon multiple global datasets (energy dependency, economic concentration, debt, demographics, digital infrastructure, AI adoption) to analyze systemic risk interactions; specific datasets, sample sizes, and statistical methods are not detailed in the excerpt.

high negative Beyond Forecasting: Adaptive Economic Preparedness in a Geop... amplification of economic instability by interconnected systemic risks

Events such as supply chain disruptions, oil price surges linked to geopolitical conflicts, and sudden labour market shifts due to reverse migration have exposed the limitations of prediction-based planning frameworks.

Illustrative examples cited in the paper; the claim is supported by referenced global events and the paper's use of global datasets, but no specific empirical case-study sample sizes or quantification are provided in the excerpt.

high negative Beyond Forecasting: Adaptive Economic Preparedness in a Geop... exposure of limitations in prediction-based planning frameworks

Traditional economic models that rely heavily on historical data and linear forecasting are increasingly inadequate in capturing the complexity and unpredictability of contemporary economic shocks.

Conceptual claim supported by discussion and examples of recent shocks (supply chain disruptions, oil price surges, labor market shifts); no specific empirical evaluation or quantified model comparison reported in the excerpt.

high negative Beyond Forecasting: Adaptive Economic Preparedness in a Geop... predictive adequacy of traditional economic models

The global economic system is undergoing a structural transformation characterized by geopolitical tensions, energy price volatility, trade fragmentation, demographic imbalances, and rapid technological disruption driven by artificial intelligence.

Narrative synthesis in the paper drawing on global trends; the paper references global datasets on energy dependency, trade patterns, demographics, and AI adoption (no specific sample size or empirical study detailed in the excerpt).

high negative Beyond Forecasting: Adaptive Economic Preparedness in a Geop... structural transformation of the global economic system (presence of geopolitica...

The competence shadow compounds multiplicatively to produce degradation far exceeding naive additive estimates.

Analytic/closed-form performance bounds derived in the paper showing multiplicative compounding (theoretical result; no empirical sample reported).

high negative The Competence Shadow: Theory and Bounds of AI Assistance in... output_quality

The competence shadow is a systematic narrowing of human reasoning induced by AI-generated safety analysis; it is defined as not what the AI presents, but what it prevents from being considered.

Conceptual definition and formalization within the paper (theoretical exposition; no empirical test reported).

high negative The Competence Shadow: Theory and Bounds of AI Assistance in... decision_quality

Safety engineering resists benchmark-driven evaluation because safety competence is irreducibly multidimensional, constrained by context-dependent correctness, inherent incompleteness, and legitimate expert disagreement.

Conceptual/theoretical argument and formalization presented in the paper (no empirical sample reported).

high negative The Competence Shadow: Theory and Bounds of AI Assistance in... output_quality

Refining the state (as above) raises state-action blind mass from 0.0165 at \tau=50 to 0.1253 at \tau=1000.

Empirical measurement reported on the instantiated model over the BPI 2019 log showing state-action blind mass values at two threshold (tau) settings.

high negative The Stochastic Gap: A Markovian Framework for Pre-Deployment... state-action blind mass (measure of unsupported next-step decisions)

Currently, the region remains reactive as a 'recipient' rather than a 'creator' or an effective partner in the AI ecosystem.

Characterization reported by the authors based on their regional research and field study (qualitative findings from leaders across public/private sectors).

high negative Charting AI Governance Future in the Arab Region: A Policy R... degree of domestic AI creation/innovation versus reception/adoption

This gap hinders the ability of many governments in the region to push their countries toward joining the ranks of those benefiting from the AI revolution—both in developing the public sector and supporting economic growth and social development.

Authors' analysis and interpretation based on the regional research/field study described in the report.

high negative Charting AI Governance Future in the Arab Region: A Policy R... governments' ability to benefit from AI (public sector development; economic and...

The Arab region’s capacity for Artificial Intelligence (AI) governance remains limited relative to the accelerating pace of global AI developments and associated challenges.

Stated conclusion in the executive report based on a regional field study (authors' analysis of interviews/surveys and research across the region).

high negative Charting AI Governance Future in the Arab Region: A Policy R... AI governance capacity

These harms increasingly translate into financial loss through litigation, enforcement penalties, brand erosion, and failed deployments.

Paper argues this linkage using conceptual reasoning and illustrative examples/case vignettes; cites regulatory and market incidents but does not provide systematic empirical estimates or a sample size.

high negative Artificial Intelligence Governance In Corporate Strategy: Et... firm_revenue

AI systems can create material harms: discriminatory outcomes, privacy and security failures, opacity in decision logic, and regulatory noncompliance.

Paper lists these harms as core risks based on prior literature, regulatory developments, and conceptual risk analysis. Presented as well-documented categories rather than as new empirical findings; no sample size reported.

high negative Artificial Intelligence Governance In Corporate Strategy: Et... ai_safety_and_ethics

Insufficient organizational resources significantly inhibit AI adoption in procurement (β = -0.19, p < 0.05).

Same questionnaire survey (n=326) and multiple linear regression analysis; reported coefficient β=-0.19 with p<0.05.

high negative Research on the Adoption of Artificial Intelligence and Proc... AI adoption in procurement

Measuring only technical model performance (such as predictive accuracy) is insufficient for assessing the strategic impact of AI in drug discovery.

Argued in the paper as a critique of current evaluation practices; presented as a conceptual point rather than supported by new empirical data in the excerpt.

high negative Strategic Key Performance Indicators for AI in Lead Optimiza... adequacy of technical model performance metrics for capturing strategic impact

Pressure remains high to increase the probability of success to improve the effectiveness of pharmaceutical R&D.

Asserted in the paper as motivational context for the work; framed as an industry pressure point rather than backed by a specific empirical sample or quantified survey in the excerpt.

high negative Strategic Key Performance Indicators for AI in Lead Optimiza... probability of success in pharmaceutical R&D

Increasing cost and failure rates in the pharmaceutical R&D process have not fundamentally improved over the last decade.

Stated as a contextual observation in the paper's opening paragraph; presented as a summary of industry trends (no specific dataset, sample size, or citation included in the excerpt).

high negative Strategic Key Performance Indicators for AI in Lead Optimiza... cost and failure rates in pharmaceutical R&D

Without support, performance stays stable up to three issues but declines as additional issues increase cognitive load.

Empirical study / human-AI negotiation case study in a property rental scenario that varied the number of negotiated issues; the paper reports observed performance across different numbers of issues (no sample size for this specific comparison stated in the abstract).

high negative From Overload to Convergence: Supporting Multi-Issue Human-A... negotiation performance (ability to find good agreements) under increasing numbe...

Reliance on automated content generation introduces risks of cognitive overreliance, algorithmic bias, and strategic misalignment.

The paper articulates these risks as conceptual/qualitative concerns in its discussion; no quantitative estimates or empirical tests of these specific risks are reported in the provided excerpt.

high negative The Strategic Impact of Generative Artificial Intelligence o... risks to decision-making including cognitive overreliance, algorithmic bias, str...

Wide disagreement among AIs created confusion and undermined appropriate reliance on advice.

Reported experimental finding from the paper: manipulating within-panel disagreement across tasks produced wide disagreement conditions that, according to the abstract, led to confusion and reduced appropriate reliance. No quantitative metrics reported in abstract.

high negative More Isn't Always Better: Balancing Decision Accuracy and Co... appropriate reliance on advice / decision-making

High within-panel consensus fostered overreliance on AI advice.

Experimental manipulation of within-panel consensus across the three tasks; the abstract reports that high consensus increased participants' reliance on AI (interpreted as overreliance). Specific measures and sample size not provided in abstract.

high negative More Isn't Always Better: Balancing Decision Accuracy and Co... reliance on AI advice (overreliance)

« Prev 1 2 3 … 8 9 10 … 69 70 Next »