Evidence (14156 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	761	200	101	904	2020
Governance & Regulation	829	400	191	122	1566
Organizational Efficiency	784	193	125	84	1197
Technology Adoption Rate	637	236	124	97	1103
Research Productivity	431	131	58	340	972
Output Quality	481	183	59	47	770
Decision Quality	332	177	82	49	647
Firm Productivity	439	57	88	20	610
AI Safety & Ethics	218	279	66	33	602
Market Structure	181	170	123	24	503
Task Allocation	214	64	72	33	388
Skill Acquisition	174	62	62	17	315
Innovation Output	204	27	45	18	295
Employment Level	105	54	108	13	282
Fiscal & Macroeconomic	132	69	43	26	277
Consumer Welfare	117	63	42	11	233
Firm Revenue	154	48	26	3	231
Task Completion Time	173	31	8	12	225
Inequality Measures	44	123	50	6	223
Worker Satisfaction	89	65	22	12	188
Error Rate	71	92	10	2	175
Regulatory Compliance	77	69	14	5	165
Automation Exposure	58	56	26	13	156
Training Effectiveness	96	21	14	19	152
Wages & Compensation	77	37	25	6	145
Team Performance	86	17	27	10	141
Developer Productivity	95	17	14	6	133
Job Displacement	12	81	21	1	115
Hiring & Recruitment	52	7	8	3	70
Creative Output	32	20	8	3	64
Skill Obsolescence	5	47	6	1	59
Social Protection	28	16	8	2	54
Labor Share of Income	17	19	17	—	53
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1
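
The counts above can be turned into direction shares for easier comparison across outcomes. The following is a minimal sketch, not part of the source page: the three rows are copied from the table, and the function name `direction_shares` is hypothetical. In several rows the listed Total is larger than the sum of the four direction columns (for example, 610 versus 604 for Firm Productivity), so the sketch normalizes by the four-column sum; that choice is an assumption.

```python
# Illustrative sketch: direction shares for a few rows copied from the table.
# Tuple order is (Positive, Negative, Mixed, Null).
rows = {
    "Firm Productivity": (439, 57, 88, 20),
    "AI Safety & Ethics": (218, 279, 66, 33),
    "Job Displacement": (12, 81, 21, 1),
}


def direction_shares(counts):
    """Return each direction's share of the four-column sum.

    Assumption: we divide by the sum of the four direction columns,
    not by the table's Total column, because the two can differ.
    """
    total = sum(counts)
    return tuple(c / total for c in counts)


for outcome, counts in rows.items():
    pos, neg, mixed, null = direction_shares(counts)
    print(f"{outcome}: positive {pos:.1%}, negative {neg:.1%}, "
          f"mixed {mixed:.1%}, null {null:.1%}")
```

Normalizing by the Total column instead would give slightly smaller shares for the rows where the two figures differ.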

For organizations of n humans with AI agents, the optimal team size decreases with agent capability.

Derived implication from the stylized model's analysis of multi-human organizations interacting with AI agents.

high negative The Novelty Bottleneck: A Framework for Understanding Human ... optimal team size as a function of agent capability

There is no smooth sublinear regime for human effort; it transitions sharply from O(E) to O(1) with no intermediate scaling class.

Mathematical derivation from a stylized model of human-AI collaboration that assumes tasks decompose into atomic decisions, a fraction ν are novel, and specification/verification/error correction scale with task size.

high negative The Novelty Bottleneck: A Framework for Understanding Human ... human effort scaling (human time/effort required as task size E grows)

So far, maintenance and migration work has been done largely manually by human experts.

Background assertion in the paper's introduction/abstract; no empirical backing provided in abstract.

high negative A Multi-agent AI System for Deep Learning Model Migration fr... degree of manual effort for model maintenance and migration historically

The regime divide deepens under AI capital concentration, admits a permanent displacement attractor in shallow markets, and generates equity market participation hysteresis in which the ERP remains elevated after employment has normalised.

Model-based assertions: analysis shows capital concentration magnifies regime separation, yields a permanent displacement attractor in shallow-market parameterizations, and produces hysteresis in participation leading to persistently elevated ERP after employment recovery.

high negative When Does AI Raise the Equity Risk Premium? Displacement, Pa... equity risk premium (ERP) persistence / participation hysteresis

The alignment risk channel is specific to agentic AI: correlated misalignment in AI objectives generates aggregate output shocks with fat left tails; formalised via Hansen-Sargent multiplier preferences, the resulting alignment risk premium (ARP) enters the equilibrium ERP decomposition as a priced factor additively separable from the participation wedge.

Theoretical formalisation in the paper: uses Hansen-Sargent multiplier preferences to capture model uncertainty/robustness and defines an ARP that is additively separable in the ERP decomposition.

high negative When Does AI Raise the Equity Risk Premium? Displacement, Pa... alignment risk premium (ARP) contribution to ERP

The participation compression channel operates through household wealth: displacement pushes marginal households below the equity market entry cost κ, concentrating aggregate consumption risk on a shrinking investor pool and—by the Basak-Cuoco mechanism—raising the required risk premium even as fundamentals improve.

Model mechanism described in the paper: heterogeneous-agent model with an explicit market entry cost κ and reference to the Basak-Cuoco mechanism leading to a higher required risk premium when investor base shrinks.

high negative When Does AI Raise the Equity Risk Premium? Displacement, Pa... equity risk premium (ERP)

The literature singles out endemic data quality issues, algorithmic bias, governance frameworks, and regulatory compliance as concerns that require trusted AI and sustainable digital finance ecosystems.

Synthesis from the reviewed literature noting recurring concerns and limitations reported across studies; the paper lists these as major challenges identified in the field.

high negative Artificial intelligence in sustainable finance and Environme... prevalence of data quality issues, algorithmic bias, governance and regulatory c...

AI can worsen financial and market performance if it crowds out normal R&D.

Paper's empirical analysis and interpretation linking AI dependence to poorer financial/market performance through displacement of standard R&D activities; presented as a study finding.

high negative The 'Intelligent Trap' in Corporate Finance—A Study Based on... financial and market performance

High AI dependency disclosed in financial reports does not improve firms' financial health and may even endanger it.

Empirical results drawn from the study's analysis of listed new energy vehicle and automobile manufacturers (2013–2023); statement appears in the paper's findings/conclusions.

high negative The 'Intelligent Trap' in Corporate Finance—A Study Based on... financial health / corporate financial condition

AI dependency reduces financial safety for listed new energy vehicle and automobile manufacturers.

Empirical analysis of a sample of listed new energy vehicle and automobile manufacturers covering 2013–2023; the paper reports data analysis showing AI dependency reduces financial safety.

high negative The 'Intelligent Trap' in Corporate Finance—A Study Based on... financial safety / corporate financial risk

More informative search can degrade both learning and consumer surplus unless the market learns as much as consumers (for example, by "reading the transcripts" of agentic conversations).

Analytical comparative statics in the paper's theoretical model showing how increasing the informativeness of consumer-side signals affects learning dynamics and welfare; relies on model assumptions about what information the market collects versus consumers.

high negative Agentic Markets: Equilibrium Effects of Improving Consumer S... consumer surplus (and market learning about product fit)

Performance degradation persists even when context is provided via structured semantic layers including AST-extracted function context and import graph resolution.

Experiments comparing unstructured versus structured context provision; structured semantic layers (AST context, import graph resolution) were evaluated and models still degraded with more context.

high negative SWE-PRBench: Benchmarking AI Code Review Quality Against Pul... model detection/performance when given structured semantic context

Models' performance degrades monotonically from diff-only (config_A) to diff+file content (config_B) to full context (config_C) across all 8 models.

Systematic ablation across three frozen context configurations (config_A, config_B, config_C) reported; all 8 evaluated models show monotonic performance decline as more context is provided.

high negative SWE-PRBench: Benchmarking AI Code Review Quality Against Pul... model performance score across context-provision configurations
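
A claim of monotonic decline across ordered configurations can be checked mechanically. The sketch below is only an illustration: the scores are made-up placeholders, not results from the paper, and the names `scores` and `declines_monotonically` are hypothetical. Only the configuration labels (config_A, config_B, config_C) come from the entry above.

```python
# Hypothetical scores for illustration only; NOT values reported by the paper.
scores = {
    "model_1": {"config_A": 0.30, "config_B": 0.24, "config_C": 0.19},
    "model_2": {"config_A": 0.22, "config_B": 0.22, "config_C": 0.15},
}

# Order of increasing context, as described in the entry above.
ORDER = ("config_A", "config_B", "config_C")


def declines_monotonically(by_config, strict=True):
    """Return True if the score falls at every step along ORDER.

    With strict=False, ties between adjacent configurations are allowed.
    """
    values = [by_config[name] for name in ORDER]
    pairs = list(zip(values, values[1:]))
    if strict:
        return all(a > b for a, b in pairs)
    return all(a >= b for a, b in pairs)


for model, by_config in scores.items():
    print(model, declines_monotonically(by_config))
```

Whether a tie between two configurations counts as a decline is a design choice; the listing does not say which definition the paper uses.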

Eight frontier models detect only 15–31% of human-flagged issues on the diff-only configuration (config_A).

Empirical evaluation across 8 models on SWE-PRBench (350 PRs) under the diff-only configuration; reported detection rates of 15–31% relative to human-flagged issues.

high negative SWE-PRBench: Benchmarking AI Code Review Quality Against Pul... detection rate of human-flagged issues
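
The detection rate quoted above is a share of human-flagged issues that a model also flags. A minimal sketch of that calculation follows, assuming issues can be matched by exact identifier; how the benchmark actually matches model comments to human-flagged issues is not described in this listing, and the identifiers and the function name `detection_rate` are hypothetical.

```python
def detection_rate(human_flagged, model_flagged):
    """Fraction of human-flagged issues that the model also flagged.

    Assumption: an issue counts as detected only on an exact identifier match.
    """
    human = set(human_flagged)
    if not human:
        raise ValueError("need at least one human-flagged issue")
    return len(human & set(model_flagged)) / len(human)


# Hypothetical issue identifiers, for illustration only.
human = {"pr1:null-check", "pr1:race", "pr2:off-by-one", "pr2:leak"}
model = {"pr1:null-check", "pr2:style-nit"}

rate = detection_rate(human, model)
print(f"detection rate: {rate:.0%}")
```

Model comments that match no human-flagged issue (here `pr2:style-nit`) do not affect this rate; they would only matter for a precision-style metric.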

There is a growing gap between rapid experimentation with AI tools and limited organizational capability to institutionalize them in everyday workflows.

Argument supported by targeted literature synthesis and review of recent scholarly and institutional sources; no primary empirical sample reported in this paper.

high negative Behavioral Factors as Determinants of Successful Scaling of ... organizational capability to institutionalize AI initiatives (pilot-to-productio...

Data reveal that less than 0.7% of the Indian population uses AI-induced ride services.

Empirical statistic reported in the paper (declared as data) quantifying the share of the population using AI-induced ride services.

high negative Artificial Intelligence, Demand Switching and Sectoral Wage ... share of population using AI-induced ride services

The lack of a significant worsening in transportation-sector inequality can be attributed to sluggish demand switching from non-AI to AI-based services in India.

Argument in the paper linking empirical finding (no significant increase in inequality) to low observed adoption rates of AI-based ride services; supported by reported adoption statistic.

high negative Artificial Intelligence, Demand Switching and Sectoral Wage ... rate of demand switching / adoption

Evaluations across eight state-of-the-art multimodal models reveal that models achieved only 55.0% accuracy on help prediction.

Experimental evaluation reported in the paper comparing eight multimodal models on the Help Prediction task with reported accuracy metric.

high negative GUIDE: A Benchmark for Understanding and Assisting Users in ... help prediction accuracy

Evaluations across eight state-of-the-art multimodal models reveal that models achieved only 44.6% accuracy on behavior state detection.

Experimental evaluation reported in the paper comparing eight multimodal models on the Behavior State Detection task with reported accuracy metric.

high negative GUIDE: A Benchmark for Understanding and Assisting Users in ... behavior state detection accuracy

Technological proximity has a noteworthy negative effect on collaboration, underscoring the importance of complementary knowledge in AI innovation.

SAOM estimates from longitudinal patent collaboration data (2013–2024) showing a statistically significant negative coefficient for technological proximity (implying organizations closer in technology space are less likely to form ties).

high negative The evolutionary mechanism of artificial intelligence indust... tie formation / collaboration probability (as a function of technological proxim...

Sentiment signals derived from sparse news are commonly used in financial analysis and technology monitoring, yet transforming raw article-level observations into reliable temporal series remains a largely unsolved engineering problem.

Framing statement in the paper's introduction/abstract describing the problem motivation; conceptual argument rather than empirical test.

high negative Causal Reconstruction of Sentiment Signals from Sparse News ... reliability of temporal sentiment series reconstructed from article-level news

Ikema is a severely endangered Ryukyuan language spoken in Okinawa, Japan, with approximately 1,300 remaining speakers, most of whom are over 60 years old.

Demographic/descriptive claim reported in the paper's background (likely citing prior surveys or census estimates); the abstract states the ~1,300 speakers figure and age distribution.

high negative Automatic Speech Recognition for Documenting Endangered Lang... number and age distribution of speakers

The financial planning and investment management profession is undergoing a radical transformation driven by Generative AI (GenAI) and Agentic AI, creating urgent workforce displacement challenges that require coordinated government policy intervention alongside educational reform.

Author assertion in the paper's introduction/abstract; framing argument based on the paper's synthesized analysis (no empirical sample, no reported statistical test).

high negative STRENGTHENING FINANCIAL WORKFORCE COMPETITIVENESS: A CURRICU... rate of workforce displacement in the financial planning and investment manageme...

Within the set of agentic-mention filings, autonomy evidence remains rare.

Empirical statement derived from analysis of the identified agentic-mention filings (small number of such filings reported across 2024–2025).

high negative Measuring agentic AI adoption and control frameworks in fina... presence/rarity of autonomy-related evidence within agentic-mention filings

LLM design agents can fixate on existing paradigms and fail to explore alternatives when solving design challenges, potentially leading to suboptimal solutions (a pathology analogous to human designers).

Literature/background claim and authors' characterization of observed agent behavior; motivated the proposed metacognitive interventions. No numerical sample size reported.

high negative Supervising Ralph Wiggum: Exploring a Metacognitive Co-Regul... tendency to fixate on existing paradigms / lack of exploration leading to subopt...

Current closed models are generally ill-suited for scientific purposes (with some notable exceptions).

Argumentative and evaluative reasoning in the paper comparing features of closed models to scientific needs; no empirical sample size reported in abstract.

high negative How Open Must Language Models be to Enable Reliable Scientif... suitability of models for scientific research / quality of scientific inference

Restrictions on information about model construction and deployment threaten reliable inference in research that involves those models.

Conceptual argument and analysis presented in the paper (no empirical sample or randomized evaluation reported in abstract). The paper analyzes how specific types of information restrictions (about model construction and deployment) create threats to inference.

high negative How Open Must Language Models be to Enable Reliable Scientif... reliable inference / scientific inference

This inefficiency directly undermines UN Sustainable Development Goals 13 (Climate Action) and 10 (Reduced Inequalities) by hindering equitable AI access in resource-constrained regions.

Normative/analytic claim in the paper linking energy inefficiency to negative impacts on specific UN SDGs (argumentative, not empirically quantified in the abstract).

high negative EcoThink: A Green Adaptive Inference Framework for Sustainab... equitable AI access / progress toward SDGs 13 and 10

Current paradigms indiscriminately apply computation-intensive strategies like Chain-of-Thought (CoT) to billions of daily queries, causing LLM overthinking that amplifies carbon emissions and operational barriers.

Claim/assertion in the paper framing the problem (conceptual/observational argument; no specific empirical backing provided in the abstract).

high negative EcoThink: A Green Adaptive Inference Framework for Sustainab... carbon emissions and operational barriers from LLM overthinking

There is a potential for exclusion due to limited digital footprints, which can limit who benefits from AI-driven finance.

Abstract explicitly identifies potential exclusion of people with limited digital footprints as a challenge, based on qualitative interviews and case-study evidence.

high negative Artificial Intelligence, Climate Resilience, and Financial I... exclusion due to digital footprints

Data privacy concerns are a notable challenge in deploying AI-driven financial solutions.

Abstract lists data privacy concerns among identified challenges drawn from interviews and analysis across the three case studies.

high negative Artificial Intelligence, Climate Resilience, and Financial I... data privacy concerns

Infrastructure limitations pose a barrier to adoption and effective use of AI-enabled financial services.

Abstract identifies infrastructure limitations as a challenge, based on qualitative interviews and case-study evidence.

high negative Artificial Intelligence, Climate Resilience, and Financial I... infrastructure constraints on adoption

Digital literacy gaps are a challenge limiting the effectiveness and inclusion of AI-driven financial solutions.

Abstract lists digital literacy gaps among identified challenges, based on qualitative insights from the 1,500 interviews and case-study observations.

high negative Artificial Intelligence, Climate Resilience, and Financial I... digital literacy barriers to adoption

Triangulation with market data and sentiment analysis confirms that public enthusiasm often outpaces actual technological readiness.

Paper states market data and sentiment analysis were used to triangulate findings and reports this systematic gap; no numeric effect sizes or sample counts provided.

high negative Emerging Technologies Based on Large AI Models and the Desig... gap between public enthusiasm (sentiment) and technological readiness

Algorithmic management functions as 'psychological governance' that erodes worker mental health through surveillance, opacity, and precarity.

Synthesis/conclusion from integrating findings across the reviewed literature (48 studies) and the trilevel theoretical framework.

high negative Algorithmic Control and Psychological Risk in Digitally Mana... worker mental health (general deterioration)

Fear of deactivation (automated sanctions) creates chronic precarity; 78% report chronic fear.

Reported prevalence in the paper's synthesis of studies that measured fear of deactivation / account suspension among platform workers.

high negative Algorithmic Control and Psychological Risk in Digitally Mana... self-reported chronic fear of deactivation

Task fragmentation (platform algorithms splitting work into small tasks) leads to a reduced sense of accomplishment among drivers.

Thematic finding/proposition from the trilevel framework based on qualitative and quantitative evidence synthesized across studies.

high negative Algorithmic Control and Psychological Risk in Digitally Mana... reduced sense of accomplishment

Rating pressure is associated with emotional exhaustion, with 41–67% reporting high burnout.

Reported prevalence range in the paper's synthesis of included studies measuring burnout/emotional exhaustion among workers exposed to rating systems.

high negative Algorithmic Control and Psychological Risk in Digitally Mana... emotional exhaustion / high burnout prevalence

Income volatility from dynamic pricing is associated with depressive symptoms (reported prevalence range 23–41%).

Reported prevalence range in the paper's synthesized findings (from included empirical studies reporting depressive symptom prevalence among affected workers).

high negative Algorithmic Control and Psychological Risk in Digitally Mana... prevalence of depressive symptoms

Algorithmic opacity is linked to procedural anxiety.

Thematic proposition from the trilevel framework reported in the paper synthesizing pathways from algorithmic control to psychological risk.

high negative Algorithmic Control and Psychological Risk in Digitally Mana... procedural anxiety

Real estate pro forma development remains one of the most time-intensive functions in property investment, typically requiring twenty to forty hours per multifamily project through manual research, Excel-based modeling, and iterative scenario analysis.

Statement in paper asserting typical industry practice; not tied to the paper's controlled test. No empirical sample size or survey data reported alongside this assertion.

high negative AI-Augmented Real Estate Underwriting: A Practical Framework... task_completion_time

Policymakers in the EU and beyond will need to change course, and soon, if they are to effectively govern the next generation of AI technology.

Authors' prescriptive conclusion based on their analysis of shortcomings in the EU AI Act and institutional frameworks (policy recommendation; no empirical sample size in excerpt).

high negative Regulating AI Agents need for regulatory/policy change to effectively govern AI agents

The Act's allocation of monitoring and enforcement responsibilities, reliance on industry self-regulation, and level of government resourcing illustrate how a regulatory framework designed for conventional AI systems can be ill-suited to AI agents.

Authors' institutional analysis of the EU AI Act's monitoring/enforcement allocation, reliance on self-regulation, and resourcing (qualitative legal/institutional analysis; no quantitative sample size in excerpt).

high negative Regulating AI Agents fit between regulatory institutional design and requirements for governing AI ag...

The EU AI Act faces significant obstacles in confronting governance challenges arising from AI agents, such as unequal access to the economic opportunities afforded by AI agents.

Authors' argument that the Act may not prevent or address unequal access to benefits of AI agents (policy/legal analysis; no empirical sample size in excerpt).

high negative Regulating AI Agents distribution of economic opportunities from AI agents

The EU AI Act faces significant obstacles in confronting governance challenges arising from AI agents, such as the risk of misuse of agents by malicious actors.

Authors' analysis highlighting misuse risks and the Act's limitations in addressing them (policy/legal analysis; no empirical sample size in excerpt).

high negative Regulating AI Agents risk of malicious misuse and regulatory capacity to mitigate it

The EU AI Act faces significant obstacles in confronting governance challenges arising from AI agents, such as performance failures in autonomous task execution.

Authors' analytical argument that the Act's design and provisions do not adequately address autonomous performance failures (policy/legal analysis; no empirical sample size provided in excerpt).

high negative Regulating AI Agents ability of regulation to address performance failures (error rates / autonomous ...

The EU AI Act was promulgated prior to the development and widespread use of AI agents.

Factual/timing claim by the authors referencing the Act's adoption date relative to development and proliferation of AI agents (historical/policy analysis; dates verifiable externally).

high negative Regulating AI Agents temporal alignment between regulation and technology development

AI agents present particularly pressing questions for the European Union's AI Act.

Authors' normative/analytical claim based on the perceived fit between AI agents' characteristics and the EU AI Act's design (policy/legal analysis; no empirical sample size in excerpt).

high negative Regulating AI Agents regulatory adequacy of the EU AI Act for AI agents

AI can lead enterprises to adopt different income distribution modes by raising the marginal output of capital and substituting for low-skilled labor (technology bias).

Theoretical mechanism articulated in the paper based on capital-labor substitution principle and factor reward theory; implied empirical testing using firm-level data.

high negative THE IMPACT OF ARTIFICIAL INTELLIGENCE ON ENTERPRISE INCOME D... labor compensation relative to capital returns / labor share

Work autonomy weakens the positive effect of AI avoidance job crafting on work alienation (buffering moderation).

Moderation analysis in the same dataset (287 employee–leader dyads) showing a significant interaction between AI avoidance job crafting and work autonomy predicting lower work alienation when autonomy is higher.

high negative Approach or avoidance? A dual-pathway model of job crafting ... work alienation
