Evidence (4560 claims)

Claim counts by topic:

- Adoption: 5267 claims
- Productivity: 4560 claims
- Governance: 4137 claims
- Human-AI Collaboration: 3103 claims
- Labor Markets: 2506 claims
- Innovation: 2354 claims
- Org Design: 2340 claims
- Skills & Training: 1945 claims
- Inequality: 1322 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 378 | 106 | 59 | 455 | 1007 |
| Governance & Regulation | 379 | 176 | 116 | 58 | 739 |
| Research Productivity | 240 | 96 | 34 | 294 | 668 |
| Organizational Efficiency | 370 | 82 | 63 | 35 | 553 |
| Technology Adoption Rate | 296 | 118 | 66 | 29 | 513 |
| Firm Productivity | 277 | 34 | 68 | 10 | 394 |
| AI Safety & Ethics | 117 | 177 | 44 | 24 | 364 |
| Output Quality | 244 | 61 | 23 | 26 | 354 |
| Market Structure | 107 | 123 | 85 | 14 | 334 |
| Decision Quality | 168 | 74 | 37 | 19 | 301 |
| Fiscal & Macroeconomic | 75 | 52 | 32 | 21 | 187 |
| Employment Level | 70 | 32 | 74 | 8 | 186 |
| Skill Acquisition | 89 | 32 | 39 | 9 | 169 |
| Firm Revenue | 96 | 34 | 22 | — | 152 |
| Innovation Output | 106 | 12 | 21 | 11 | 151 |
| Consumer Welfare | 70 | 30 | 37 | 7 | 144 |
| Regulatory Compliance | 52 | 61 | 13 | 3 | 129 |
| Inequality Measures | 24 | 68 | 31 | 4 | 127 |
| Task Allocation | 75 | 11 | 29 | 6 | 121 |
| Training Effectiveness | 55 | 12 | 12 | 16 | 96 |
| Error Rate | 42 | 48 | 6 | — | 96 |
| Worker Satisfaction | 45 | 32 | 11 | 6 | 94 |
| Task Completion Time | 78 | 5 | 4 | 2 | 89 |
| Wages & Compensation | 46 | 13 | 19 | 5 | 83 |
| Team Performance | 44 | 9 | 15 | 7 | 76 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 18 | 17 | 9 | 5 | 50 |
| Job Displacement | 5 | 31 | 12 | — | 48 |
| Social Protection | 21 | 10 | 6 | 2 | 39 |
| Developer Productivity | 29 | 3 | 3 | 1 | 36 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Skill Obsolescence | 3 | 19 | 2 | — | 24 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Labor Share of Income | 10 | 4 | 9 | — | 23 |
Productivity
When employers have monopsony power, they choose technologies that expand this power beyond what a social planner would consider optimal.
Model results on monopsonistic employer incentives and their technological choices; discussion supported by citations.
Profit-maximizing firms pursue innovations that erode workers' market power by making them more easily replaceable, even at the expense of production efficiency; a social planner who values worker welfare would employ technologies that preserve workers' market power.
Theoretical analysis of interactions between technological choice and market power; supported by cited empirical evidence (e.g., Azar et al. 2023) in the paper.
A welfare-maximizing planner would choose to automate fewer tasks than production efficiency would dictate when workers' welfare is heavily weighted.
Model analysis of welfare-maximizing automation level compared to production-efficient automation; analytical result in the automation application.
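The gap between privately and socially optimal automation in the three claims above can be illustrated with a toy comparative-statics sketch. All functional forms below are assumptions chosen for illustration, not the paper's model: automation share `a` yields a concave production gain `g(a)` plus a pure wage transfer from workers to the firm; a planner who puts weight `omega` on worker welfare discounts that transfer and therefore picks a lower automation level, below the production-efficient one when `omega > 1`.

```python
import numpy as np

a = np.linspace(0.0, 1.0, 10001)
g = a - a**2          # production-efficiency gain, maximized at a = 0.5
s = 0.3               # per-task wage transfer from workers to the firm (assumed)

def argmax_a(values):
    """Return the automation level a that maximizes the given objective."""
    return float(a[int(np.argmax(values))])

a_eff  = argmax_a(g)            # production-efficient level (0.5)
a_firm = argmax_a(g + s * a)    # firm also values the wage transfer -> over-automates

for omega in (0.0, 1.0, 2.0):   # planner's weight on worker welfare
    a_plan = argmax_a(g + (1 - omega) * s * a)
    print(f"omega={omega}: planner automates a={a_plan:.2f}")

print(f"production-efficient: {a_eff:.2f}, firm chooses: {a_firm:.2f}")
```

With `omega = 0` the planner mimics the firm (0.65 > 0.5, excess automation); with `omega = 2` the planner automates less (0.35) than production efficiency alone would dictate, matching the qualitative result claimed above.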
Observed declines in browsing time due to ChatGPT adoption are concentrated in website categories such as search and news, which are highly exposed to substitution by generative AI.
Category-level browsing time changes across website classification; concentration of declines in categories identified as highly overlap-exposed to chatbot capabilities using web-scraping and LLM site-level overlap classification.
High-income and younger households adopt generative AI substantially faster than low-income and older counterparts, and this gap is widening over time ('generative AI divide').
Descriptive heterogeneity analysis using Comscore household demographics (income and age bins) and observed adoption trajectories across 2021–2024; authors report widening gap rather than convergence.
Diminishing returns manifest not only as a geometric flattening of the loss curve but also as rising pressure for cost reduction, system-level innovation, and the breakthroughs needed to sustain Moore-like efficiency doublings.
Analytical claim in the paper about the implications of diminishing returns for cost pressure and innovation requirements (qualitative; no sample size in excerpt).
Prominent studies predict substantial job displacement due to automation.
Paper asserts this as background, referencing the existence of prominent studies in the literature (no specific citations or sample sizes provided in the abstract).
For organizations of n humans with AI agents, the optimal team size decreases with agent capability.
Derived implication from the stylized model's analysis of multi-human organizations interacting with AI agents.
There is no smooth sublinear regime for human effort; it transitions sharply from O(E) to O(1) with no intermediate scaling class.
Mathematical derivation from a stylized model of human-AI collaboration that assumes tasks decompose into atomic decisions, a fraction ν are novel, and specification/verification/error correction scale with task size.
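The sharp-transition claim above can be made concrete with a minimal sketch, assuming (as the stylized model's evidence note describes) that a task of E atomic decisions contains a fixed fraction nu of novel decisions requiring human attention, plus a constant specification overhead. The linear form below is an illustrative assumption, not the paper's exact derivation.

```python
def human_effort(E, nu, c=10.0):
    """Total human effort for a task of E atomic decisions.

    nu: fraction of decisions that are novel (require human attention)
    c:  fixed specification/verification overhead (assumed constant)
    """
    return nu * E + c

# Any nu > 0 eventually scales linearly in E; only nu = 0 collapses to O(1).
for nu in (0.1, 0.001, 0.0):
    small, large = human_effort(10_000, nu), human_effort(10_000_000, nu)
    print(f"nu={nu}: effort grows by factor {large / small:.1f} as E grows 1000x")
```

The point of the sketch: varying `nu` moves the *constant* in front of E, but never produces an intermediate scaling class like O(sqrt(E)); effort is Θ(E) for every positive `nu` and Θ(1) only at `nu = 0`.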
To date, maintenance and migration work has been done largely manually by human experts.
Background assertion in the paper's introduction/abstract; no empirical backing provided in abstract.
The regime divide deepens under AI capital concentration, admits a permanent displacement attractor in shallow markets, and generates equity market participation hysteresis in which the ERP remains elevated after employment has normalised.
Model-based assertions: analysis shows capital concentration magnifies regime separation, yields a permanent displacement attractor in shallow-market parameterizations, and produces hysteresis in participation leading to persistently elevated ERP after employment recovery.
The alignment risk channel is specific to agentic AI: correlated misalignment in AI objectives generates aggregate output shocks with fat left tails; formalised via Hansen-Sargent multiplier preferences, the resulting alignment risk premium (ARP) enters the equilibrium ERP decomposition as a priced factor additively separable from the participation wedge.
Theoretical formalisation in the paper: uses Hansen-Sargent multiplier preferences to capture model uncertainty/robustness and defines an ARP that is additively separable in the ERP decomposition.
The participation compression channel operates through household wealth: displacement pushes marginal households below the equity market entry cost κ, concentrating aggregate consumption risk on a shrinking investor pool and—by the Basak-Cuoco mechanism—raising the required risk premium even as fundamentals improve.
Model mechanism described in the paper: heterogeneous-agent model with an explicit market entry cost κ and reference to the Basak-Cuoco mechanism leading to a higher required risk premium when investor base shrinks.
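The participation-compression mechanism above can be sketched numerically. In the spirit of limited-participation models such as Basak-Cuoco, if only a fraction `lam` of household wealth bears all aggregate consumption risk, the required premium scales like `gamma * sigma2 / lam`; the specific functional form and parameter values below are illustrative assumptions, not the paper's calibration.

```python
gamma = 3.0     # assumed risk aversion of participating households
sigma2 = 0.02   # assumed variance of aggregate consumption growth

def required_premium(lam):
    """Equity premium when a fraction lam of wealth bears all aggregate risk."""
    return gamma * sigma2 / lam

# Displacement pushes marginal households below the entry cost kappa,
# shrinking the participating pool from, say, 60% to 30% of wealth:
print(f"broad participation (60%):      {required_premium(0.6):.3f}")
print(f"compressed participation (30%): {required_premium(0.3):.3f}")
```

Halving the participating pool doubles the required premium even with fundamentals (`gamma`, `sigma2`) unchanged, which is the qualitative channel the claim describes.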
AI can worsen financial and market performance if it crowds out normal R&D.
Paper's empirical analysis and interpretation linking AI dependence to poorer financial/market performance through displacement of standard R&D activities; presented as a study finding.
High AI dependency disclosed in financial reports does not improve firms' financial health and may even endanger it.
Empirical results drawn from the study's analysis of listed new energy vehicle and automobile manufacturers (2013–2023); statement appears in the paper's findings/conclusions.
AI dependency reduces financial safety for listed new energy vehicle and automobile manufacturers.
Empirical analysis of a sample of listed new energy vehicle and automobile manufacturers covering 2013–2023; the paper reports data analysis showing AI dependency reduces financial safety.
Performance degradation persists even when context is provided via structured semantic layers including AST-extracted function context and import graph resolution.
Experiments comparing unstructured versus structured context provision; structured semantic layers (AST context, import graph resolution) were evaluated and models still degraded with more context.
Models' performance degrades monotonically from diff-only (config_A) to diff+file content (config_B) to full context (config_C) across all 8 models.
Systematic ablation across three frozen context configurations (config_A, config_B, config_C) reported; all 8 evaluated models show monotonic performance decline as more context is provided.
Eight frontier models detect only 15–31% of human-flagged issues on the diff-only configuration (config_A).
Empirical evaluation across 8 models on SWE-PRBench (350 PRs) under the diff-only configuration; reported detection rates of 15–31% relative to human-flagged issues.
There is a growing gap between rapid experimentation with AI tools and limited organizational capability to institutionalize them in everyday workflows.
Argument supported by targeted literature synthesis and review of recent scholarly and institutional sources; no primary empirical sample reported in this paper.
Evaluations across eight state-of-the-art multimodal models reveal that models achieved only 55.0% accuracy on help prediction.
Experimental evaluation reported in the paper comparing eight multimodal models on the Help Prediction task with reported accuracy metric.
Evaluations across eight state-of-the-art multimodal models reveal that models achieved only 44.6% accuracy on behavior state detection.
Experimental evaluation reported in the paper comparing eight multimodal models on the Behavior State Detection task with reported accuracy metric.
Ikema is a severely endangered Ryukyuan language spoken in Okinawa, Japan, with approximately 1,300 remaining speakers, most of whom are over 60 years old.
Demographic/descriptive claim reported in the paper's background (likely citing prior surveys or census estimates); the abstract states the ~1,300 speakers figure and age distribution.
The financial planning and investment management profession is undergoing a radical transformation driven by Generative AI (GenAI) and Agentic AI, creating urgent workforce displacement challenges that require coordinated government policy intervention alongside educational reform.
Author assertion in the paper's introduction/abstract; framing argument based on the paper's synthesized analysis (no empirical sample, no reported statistical test).
LLM design agents can fixate on existing paradigms and fail to explore alternatives when solving design challenges, potentially leading to suboptimal solutions (a pathology analogous to human designers).
Literature/background claim and authors' characterization of observed agent behavior; motivated the proposed metacognitive interventions. No numerical sample size reported.
Real estate pro forma development remains one of the most time-intensive functions in property investment, typically requiring twenty to forty hours per multifamily project through manual research, Excel-based modeling, and iterative scenario analysis.
Statement in paper asserting typical industry practice; not tied to the paper's controlled test. No empirical sample size or survey data reported alongside this assertion.
Traditional car-following models, such as the Intelligent Driver Model (IDM), often struggle to generalize across diverse traffic scenarios and typically do not account for fuel efficiency.
Literature-based statement within the paper motivating the study (review of limitations of traditional car-following models). No sample size reported.
Standard evaluation of LLM confidence relies on calibration metrics (ECE, Brier score) that conflate two distinct capacities: how much a model knows (Type-1 sensitivity) and how well it knows what it knows (Type-2 metacognitive sensitivity).
Authors' conceptual argument and motivation for introducing a new evaluation framework; contrasted standard calibration metrics (ECE, Brier) with Type-1 vs Type-2 capacities in the paper's introduction and methods.
Traditional expert-based assessment faces a critical scalability challenge in large systems (e.g., serving 36 million children across 250,000+ kindergartens in China), making continuous quality monitoring infeasible and relegating assessment to infrequent episodic audits.
Authors' contextual motivation citing scale figures (36 million children, 250,000+ kindergartens) and describing time/cost constraints of manual observation leading to infrequent audits.
Preliminary evaluation reveals that current foundation action models struggle substantially with professional desktop applications (~60% task failure rate).
Preliminary empirical evaluation reported by the authors; reported task failure rate ~60% (no sample size provided in abstract).
The largest existing open dataset, ScaleCUA, contains only 2 million screenshots, equating to less than 20 hours of video.
Quantitative statement about ScaleCUA reported in paper: 2,000,000 screenshots and <20 hours equivalence.
Progress toward general-purpose CUAs is bottlenecked by the scarcity of continuous, high-quality human demonstration videos.
Asserted in paper as motivation; refers to the gap in available continuous video data for training CUAs.
Reliance on massive, schema-heavy prompts results in prohibitive per-token API costs and high latency, hindering scalable production deployment.
Introductory problem statement in the paper arguing that large context prompts increase per-token API costs and latency for API-based LLMs; no quantitative study or sample size provided for this claim within the excerpt.
AI-enabled, democratised production is more likely to intensify competition and produce winner-take-most outcomes than to generate broadly distributed entrepreneurial success.
Synthesised theoretical prediction based on the unified framework (attention scarcity + free-entry dilution + superstar/preferential attachment dynamics) developed in the paper; no empirical validation provided.
When the framework is extended to include quality heterogeneity and reinforcement dynamics, equilibrium outcomes exhibit declining average payoffs.
Analytical extension of the baseline formal model to incorporate heterogeneous quality and reinforcement (preferential attachment) dynamics; theoretical derivation in the paper; no empirical sample.
In markets with near-zero marginal costs and free entry, increases in the number of producers dilute average attention and returns per producer.
Formal theoretical model introduced in the paper (Builder Saturation Effect) that assumes near-zero marginal costs, free entry, and finite human attention; no empirical sample or experimental data reported.
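The dilution logic of the Builder Saturation Effect claims above can be sketched in a few lines, under assumed forms: total audience attention A is fixed, n symmetric producers split it evenly and earn revenue proportional to their attention share against a fixed cost f, and free entry continues until profit hits zero. Parameter values are invented for illustration.

```python
A = 1_000_000.0   # total attention units (fixed, finite)
p = 0.01          # revenue per attention unit (assumed)
f = 50.0          # fixed cost per producer (assumed)

def profit(n):
    """Per-producer profit when n symmetric producers split attention A."""
    return p * A / n - f

n_star = p * A / f   # free entry drives profit to zero at this producer count
print(f"equilibrium producers: {n_star:.0f}")
print(f"attention per producer: {A / n_star:.0f} units, profit: {profit(n_star):.2f}")
```

Anything that lowers entry costs (such as AI-enabled production) raises `n_star` and dilutes per-producer attention further, while equilibrium profit stays pinned at zero, which is the "declining average payoffs" result in the preceding claims.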
Agent memories currently remain private and non-transferable because there is no way to validate their value.
Descriptive assertion in the paper about current state of agent memories; no empirical survey or measurement cited.
Insufficient organizational resources significantly inhibit AI adoption in procurement (β = -0.19, p < 0.05).
Questionnaire survey (n=326) analyzed with multiple linear regression; reported coefficient β = -0.19 with p < 0.05.
Measuring only technical model performance (such as predictive accuracy) is insufficient for assessing the strategic impact of AI in drug discovery.
Argued in the paper as a critique of current evaluation practices; presented as a conceptual point rather than supported by new empirical data in the excerpt.
Pressure remains high to increase the probability of success to improve the effectiveness of pharmaceutical R&D.
Asserted in the paper as motivational context for the work; framed as an industry pressure point rather than backed by a specific empirical sample or quantified survey in the excerpt.
Increasing cost and failure rates in the pharmaceutical R&D process have not fundamentally improved over the last decade.
Stated as a contextual observation in the paper's opening paragraph; presented as a summary of industry trends (no specific dataset, sample size, or citation included in the excerpt).
Without support, performance stays stable up to three issues but declines as additional issues increase cognitive load.
Empirical study / human-AI negotiation case study in a property rental scenario that varied the number of negotiated issues; the paper reports observed performance across different numbers of issues (no sample size for this specific comparison stated in the abstract).
Reliance on automated content generation introduces risks of cognitive overreliance, algorithmic bias, and strategic misalignment.
The paper articulates these risks as conceptual/qualitative concerns in its discussion; no quantitative estimates or empirical tests of these specific risks are reported in the provided excerpt.
Wide disagreement among AIs created confusion and undermined appropriate reliance on advice.
Reported experimental finding from the paper: manipulating within-panel disagreement across tasks produced wide disagreement conditions that, according to the abstract, led to confusion and reduced appropriate reliance. No quantitative metrics reported in abstract.
High within-panel consensus fostered overreliance on AI advice.
Experimental manipulation of within-panel consensus across the three tasks; the abstract reports that high consensus increased participants' reliance on AI (interpreted as overreliance). Specific measures and sample size not provided in abstract.
Improvements in AI ('better' AI) amplify the excess automation as well.
Model comparative statics: increased AI capabilities raise private incentives to automate, leading to more displacement than is socially optimal; theoretical analysis only.
More competition amplifies the excess automation (the automation arms race).
Comparative-statics result in the competitive task-based theoretical model showing increased competition raises firms' incentives to automate; no empirical sample.
The resulting loss from excess automation harms both workers and firm owners.
Welfare comparisons from the model showing negative payoff changes for workers (lower wages/less employment) and reduced owner returns when automation is excessive; theoretical analysis, no empirical data.
In a competitive task-based model, demand externalities trap rational firms in an automation arms race, displacing workers well beyond what is collectively optimal.
Formal equilibrium analysis in the paper's theoretical competitive task-based model; comparative statics and welfare analysis (no empirical sample).
Knowing that AI-driven displacement can erode demand is not enough for firms to stop automating.
Analytical result from the paper's competitive task-based model showing firms' incentives do not internalize demand externalities; no empirical sample.
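The arms-race claims above have the structure of a prisoner's dilemma, which a 2x2 sketch can make explicit. The payoffs below are invented for illustration (the paper's model is a continuum task-based equilibrium): automating saves a firm its own labor costs, but each displaced worker also shrinks product demand for both firms.

```python
import itertools

cost_saving = 3.0   # private gain to a firm from its own automation (assumed)
demand_loss = 2.0   # demand erosion imposed on EACH firm per automating firm (assumed)

def payoff(me_automates, rival_automates):
    """Payoff to 'me' given both firms' automation choices."""
    base = 10.0
    externality = demand_loss * (me_automates + rival_automates)
    return base + (cost_saving if me_automates else 0.0) - externality

for mine, rival in itertools.product([False, True], repeat=2):
    print(f"me={mine}, rival={rival}: payoff={payoff(mine, rival):.1f}")
```

Because the private saving (3.0) exceeds the demand loss a firm's own choice inflicts on itself (2.0), automating is a dominant strategy regardless of the rival's choice; yet mutual automation pays 9.0 versus 10.0 for mutual restraint. Knowing the externality exists does not change any single firm's best response, which is the point of the final claim above.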