Evidence (14922 claims)
Search and filter individual claims pulled from the papers. If you're looking for a specific finding ("what's the effect on wages?"), you're in the right place. Want to compare whole outcome categories against each other instead? Use the Evidence Explorer.
The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).
Browse by theme
Nine broad, paper-level topics. Click one to filter the claims below.
| Theme | Claims |
|---|---|
| Adoption | 9047 |
| Productivity | 8066 |
| Governance | 7278 |
| Human-AI Collaboration | 6912 |
| Org Design | 4439 |
| Innovation | 4359 |
| Labor Markets | 3652 |
| Skills & Training | 3018 |
| Inequality | 2160 |
Claims by outcome category
Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 795 | 210 | 105 | 955 | 2131 |
| Governance & Regulation | 886 | 414 | 197 | 126 | 1654 |
| Organizational Efficiency | 826 | 204 | 129 | 87 | 1257 |
| Technology Adoption Rate | 681 | 259 | 128 | 110 | 1189 |
| Research Productivity | 464 | 138 | 65 | 349 | 1028 |
| Output Quality | 503 | 196 | 61 | 53 | 813 |
| Decision Quality | 351 | 180 | 84 | 51 | 673 |
| AI Safety & Ethics | 238 | 288 | 71 | 34 | 637 |
| Firm Productivity | 455 | 58 | 92 | 20 | 631 |
| Market Structure | 186 | 172 | 123 | 25 | 511 |
| Task Allocation | 222 | 70 | 76 | 34 | 407 |
| Innovation Output | 238 | 28 | 48 | 18 | 334 |
| Skill Acquisition | 177 | 62 | 62 | 17 | 318 |
| Employment Level | 107 | 57 | 108 | 13 | 287 |
| Fiscal & Macroeconomic | 135 | 72 | 44 | 26 | 284 |
| Firm Revenue | 172 | 50 | 28 | 5 | 256 |
| Consumer Welfare | 121 | 68 | 45 | 12 | 246 |
| Task Completion Time | 183 | 33 | 10 | 13 | 240 |
| Inequality Measures | 45 | 126 | 50 | 6 | 227 |
| Worker Satisfaction | 95 | 74 | 23 | 12 | 204 |
| Error Rate | 77 | 98 | 11 | 4 | 190 |
| Regulatory Compliance | 84 | 73 | 17 | 7 | 181 |
| Automation Exposure | 61 | 61 | 27 | 14 | 166 |
| Training Effectiveness | 98 | 21 | 14 | 19 | 154 |
| Wages & Compensation | 78 | 37 | 25 | 6 | 146 |
| Developer Productivity | 105 | 18 | 14 | 6 | 144 |
| Team Performance | 87 | 17 | 28 | 10 | 143 |
| Job Displacement | 12 | 83 | 23 | 1 | 119 |
| Hiring & Recruitment | 53 | 8 | 8 | 3 | 72 |
| Social Protection | 39 | 17 | 8 | 2 | 66 |
| Creative Output | 32 | 20 | 8 | 3 | 64 |
| Skill Obsolescence | 5 | 50 | 6 | 1 | 62 |
| Labor Share of Income | 17 | 20 | 17 | — | 54 |
| Worker Turnover | 15 | 15 | — | 3 | 33 |
| Industry | — | — | — | 1 | 1 |
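The direction counts are easier to compare across categories as shares of each row's Total. The following is a minimal sketch, not part of the site's tooling: the function name `direction_shares` and the `ROWS` dictionary are illustrative, and the numbers are copied from three rows of the table above. In some rows the four direction columns add up to less than Total; the page does not say why, so the sketch only reports the remainder instead of assuming a cause.

```python
# Direction counts copied from three rows of the table above:
# (positive, negative, mixed, null, total). A "—" cell would be entered as 0.
ROWS = {
    "Wages & Compensation": (78, 37, 25, 6, 146),
    "Job Displacement": (12, 83, 23, 1, 119),
    "Other": (795, 210, 105, 955, 2131),
}


def direction_shares(positive, negative, mixed, null, total):
    """Return each direction's share of Total and the count not covered by any direction column."""
    counts = {"positive": positive, "negative": negative, "mixed": mixed, "null": null}
    shares = {name: n / total for name, n in counts.items()}
    unlisted = total - sum(counts.values())
    return shares, unlisted


for outcome, row in ROWS.items():
    shares, unlisted = direction_shares(*row)
    print(
        f"{outcome}: positive {shares['positive']:.1%}, "
        f"negative {shares['negative']:.1%}, unlisted {unlisted}"
    )
```

Shares are taken against Total, not against the sum of the four columns, so a row with a non-zero remainder has shares that sum to less than 100%.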
Previous studies have identified language barriers as impediments to labor market engagement, but empirical information assessing both policy reductions and the relative efficacy of professional, AI-assisted, and hybrid translation methods is scarce.
Paper's literature review claim that existing literature documents language barriers but lacks comparative empirical evaluations of policy reductions and multiple translation models; asserted as motivation for current study.
Translation verified against existing performance implementations achieves throughput parity with MJX (1.04x) for HalfCheetah JAX.
Benchmarking HalfCheetah implemented in the translated backend versus MJX, reporting a 1.04x throughput ratio (approximate parity).
Levers such as raising taxes, reforming pensions, and boosting productivity interact with each other through feedback loops and time delays that are not yet well understood.
Literature and model motivation stated in the paper; the integrated model is built to capture such interactions and delays.
These efficiency and cost gains are achieved while maintaining accuracy parity with the matched hierarchical baseline.
Paper states accuracy parity was maintained in the empirical evaluation comparing the proposed framework to the matched hierarchical baseline on the 2,847-query testbed.
Logistics efficiency fails to fulfill its anticipated mediating role in transmitting AI's effects to supply chain stability.
Mechanism/mediation tests in the DML analysis on the 45 Chinese listed SEs (2012–2023) indicate no significant mediation via logistics efficiency.
The Photo Big 5 is only weakly correlated with cognitive measures such as test scores.
Correlation/associational analysis between Photo Big 5 trait scores and cognitive measures (e.g., test scores) reported for the MBA graduate sample.
The short‑term effect of AI on labor‑intensive industries is weak.
Short‑run/dynamic subgroup analysis in the China 2003–2017 panel indicating minimal or weak immediate growth effects for labor‑intensive sectors.
The article clarifies theoretical relationships and gaps between Material Passports, Digital Product Passports, and Digital Building Logbooks.
Theoretical analysis and synthesis section of the SLR where the authors compare concepts and identify overlaps and gaps among MPs, DPPs, and DBLs.
Personal experience with an AI 'boss' did not affect workers' attitudes on using AI in public decision making.
Same randomized design (N > 1,500) with attitudinal measures collected across a three-wave panel; comparison between AI-assigned and human-assigned participants showed no measurable effect on attitudes about AI in public decision making.
Correlation and illustrative regression results confirm the absence of an immediate statistical relationship between AI adoption and productivity at the aggregate level.
Both correlation analysis and an illustrative regression model applied to Eurostat aggregate data for 2021–2024; regression presented as illustrative (not necessarily causal); model specification details and robustness checks not given in the summary.
Labour productivity did not show a stable association with AI diffusion in Slovakia over the analysed period.
Correlation analysis between AI adoption indicators and labour productivity measures for Slovakia using harmonised Eurostat data (2021–2024); detailed coefficient estimates and significance levels not provided in the summary.
Diverse decision-making AI from different developers will commonly compete for finite shared resources in everyday devices (examples: charging slots, relay bandwidth, traffic priority).
Motivating background statement in the paper (observational/argumentative; examples drawn from real-world deployment contexts rather than reported experiment data).
There is an arithmetic crossover point between these regimes: it occurs where opposing tribes that form spontaneously first fit inside the available capacity.
Mathematical analysis in the paper deriving a capacity-based threshold (crossover) marked by whether spontaneously formed opposing tribes can be accommodated by available capacity.
When resources are abundant, the same ingredients (model diversity, individual RL, tribe formation) drive system overload to near zero.
Empirical and mathematical results in the paper showing that abundance of resources reduces overload to near zero under the same agent-population conditions.
The study presents an advanced systematic ranking of I4.0 adoption barriers in the Thai automotive industry.
Paper outputs a ranked list of barriers produced by the integrated Fuzzy BWM-PROMETHEE II-DEMATEL framework; full ranked list and quantitative ranks not included in the supplied summary.
Median hourly compensation for gig workers, after accounting for expenses and unpaid time, averages $14.20.
Earnings analysis using platform transaction records adjusted for reported expenses and estimated unpaid labor time; comparative baseline drawn from labor force and administrative wage data (24 countries, 2015–2025).
The study explores the influence of AI on HRM practice specifically within top IT companies.
Scope statement in the paper: empirical study involved HR professionals from various (described as top) IT firms. The summary does not supply the list of companies or sampling criteria.
Top management support does not have a direct influence on AI Adoption in the sampled firms.
PLS-SEM results from the 207-firm survey showing a non-significant direct path from top management support to AI Adoption (as reported in the paper).
Effort expectancy does not have a direct influence on AI Adoption in the sampled firms.
PLS-SEM results from the 207-firm survey showing a non-significant direct path from effort expectancy to AI Adoption (as reported in the paper).
This study developed a unified framework that integrates technology acceptance and trust-based perspectives.
Conceptual/methodological claim in the paper: authors report constructing an integrated framework based on literature and their empirical testing.
The paper contributes to both theory and policy by reconceptualizing procurement value and offering an actionable roadmap for embedding ESG principles in public healthcare procurement.
Scholarly contribution claimed via literature synthesis and framework/roadmap creation; contribution is normative and conceptual rather than empirically validated.
In the sentiment-analysis task, those individual differences do not produce human–AI complementarity: the joint performance of humans and AI did not exceed that of either alone.
Empirical finding reported from the preregistered sentiment-analysis experiment showing no complementarity effect (joint human-AI performance ≤ best individual performance). (Statistical tests and sample size not included in the excerpt.)
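The complementarity criterion used in this entry (joint performance must exceed the better of human-alone and AI-alone) can be written as a one-line check. This is a sketch of the criterion only: the function name and the accuracy values are hypothetical and are not taken from the paper, whose statistical tests are not included in the excerpt.

```python
def shows_complementarity(human_score: float, ai_score: float, joint_score: float) -> bool:
    """True only if the human-AI team beats the better of the two working alone."""
    return joint_score > max(human_score, ai_score)


# Hypothetical accuracies, for illustration only (not reported in the paper).
# The team improves on the human alone but not on the AI alone: no complementarity.
print(shows_complementarity(human_score=0.78, ai_score=0.85, joint_score=0.83))

# The team beats both: complementarity.
print(shows_complementarity(human_score=0.78, ai_score=0.85, joint_score=0.88))
```

A raw comparison like this ignores sampling noise; an actual study would test whether the difference is statistically distinguishable from zero.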
We conducted a systematic review and meta-analysis of the literature on AI/HR analytics and organizational decision making, using 85 publications and grounding the work in theories of algorithm-automated decision-making (AST) and matching/hybrid models (STS).
Paper's methods: systematic review and meta-analysis; sample = 85 publications; theoretical framing explicitly stated as AST and STS.
Macroeconomic fiscal moderation remains empirically unvalidated.
Synthesis conclusion from the review noting an absence of empirical evidence that Agentic AI produces macroeconomic fiscal moderation; i.e., no validated studies showing broad fiscal relief effects were identified in the reviewed literature.
By 2024 the RL-FRB/US model produced a federal budget deficit similar to the baseline: -$1,767 billion (RL-FRB/US) vs. -$1,758 billion (FRB/US).
Reported fiscal balance (federal budget deficit) simulation outputs for 2024 from comparative model runs in the paper.
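To make "similar" concrete, the gap between the two simulated 2024 deficits can be expressed relative to the baseline. This is a small worked calculation, taking the two reported magnitudes (1,767 and 1,758) in whatever common unit the paper uses; the relative gap does not depend on that unit. The variable names are illustrative.

```python
# Reported 2024 deficits, as magnitudes in the paper's common unit.
rl_frbus_deficit = 1767  # RL-FRB/US model
frbus_deficit = 1758     # FRB/US baseline

absolute_gap = rl_frbus_deficit - frbus_deficit
relative_gap = absolute_gap / frbus_deficit

print(f"absolute gap: {absolute_gap}")
print(f"relative gap: {relative_gap:.2%} of the baseline deficit")
```

The RL-FRB/US deficit is larger than the baseline by roughly half a percent.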
No significant differences emerged in job titles and industry suggested by GPT-5 across genders.
Empirical finding from analysis of GPT-5 outputs comparing suggested job titles and industries for the 24 profiles; exact statistical tests not specified in the summary.
Self-generated (model-authored) Skills provide no average benefit.
Comparison of three evaluation conditions (no Skills, curated Skills, self-authored Skills) across SkillsBench. Averaged pass-rate deltas show that model-authored Skills do not increase average pass rate relative to baseline; analysis used 7,308 trajectories over 86 tasks and 7 agent–model configurations.
AI will not cause permanent mass unemployment at the aggregate level.
Analytical argument and literature synthesis using labor-economics theory (Skill-Biased Technological Change and structural transformation). No primary microdata, no stated empirical identification strategy or sample size in the paper (methodology appears to be theoretical and sectoral synthesis).
Empirical evaluation is needed on how AI-induced productivity gains translate into aggregate demand and labor absorption.
Identified research priority in the paper, based on theoretical uncertainty about demand-side labor absorption and lack of conclusive empirical evidence.
AI will not mechanically cause permanent mass unemployment at the aggregate level.
Theoretical framing and synthesis of existing empirical findings across task-based and macro studies; no single new dataset provided (paper draws on literature and conceptual models).
Occupation-level analyses (e.g., BLS OEWS cross-occupation wage regressions) risk misleading conclusions about AI’s distributional effects because they aggregate over the task- and firm-level heterogeneity that drives the mechanism.
Theoretical argument and empirical illustration in the paper showing how aggregation masks within-task compression and firm-level rent capture; example regressions on OEWS used to demonstrate the limitation.
Testing the model requires within-occupation, within-task panel data on task-level performance and wages linked to firm-level AI adoption, ownership of complementary assets, and measures of rent-sharing; such data are not available at scale.
Author statement about data requirements and current data limitations; empirical illustration and discussion note absence of large-scale linked microdata meeting these criteria.
Occupation-level regressions using BLS OEWS (2019–2023) are insufficient for testing the model’s task-level predictions because aggregation across tasks and firms hides the mechanism.
Empirical illustration in the paper using occupation-level regressions on BLS OEWS 2019–2023 showing that such aggregates do not reveal within-occupation, within-task dispersion or firm-level rent concentration effects; paper argues this is a data-adequacy limitation.
A sensitivity decomposition shows five of the moments (the non‑ΔGini moments) identify internal mechanism rates (how AI changes task production, education responses, screening intensity) but do not determine the aggregate sign of inequality change.
Local identification / sensitivity decomposition performed on the calibrated model; decomposition results reported in the paper attribute mechanism-rate identification to five moments and show they leave the sign of ΔGini indeterminate.
The paper introduces a novel taxonomy that separates patenting into three domains: core AI, traditional robotics, and AI-enhanced robotics.
Methodological contribution of the paper: construction and application of a classification scheme that assigns patent filings (1980–2019) into three domains (core AI, traditional robotics, AI-enhanced robotics). Data source: patent filings 1980–2019 (aggregate counts by domain and country). Exact number of patents not provided in the summary.
The proposed uncertainty measure connects to classical value-of-information concepts, bridging security mechanism analysis and economic theories of information, signaling, and screening.
Analytical comparison and discussion in the paper linking the entropy-style residual uncertainty metric to value-of-information literature (theoretical linkage).
AI did not significantly moderate the relationship between workplace stress and job performance.
Moderation test in PLS-SEM (SmartPLS 4.0) on N = 350; reported non-significant AI × Stress → Performance moderator (paper reports no significant moderating effect).
Use of AI raises needs for traceability, explainability, and continuous validation to maintain compliance and avoid error propagation in curricular decisions.
Paper's AI governance recommendations (prescriptive), referencing general AI risk principles rather than empirical study.
There is no accepted integrative digital model that maps measured or perceived value to algorithmic pricing.
Absence of such a model in the SLR sample of 30 articles and thematic coding that identified this gap explicitly.
There is no evidence of nonlinearities in the relationship between digital trade and urban house prices (the effect is linear across the sample).
Explicit tests for nonlinearity reported in the econometric analysis (details of test specification not provided in the summary).
When green-technology innovation is low (below the threshold), the main measurable effect of DE is on improving carbon emission efficiency (CEE), but DE does not yet reduce per capita emissions (PCE).
Results from the threshold-regression models on the 278-city panel (2011–2022) show that in the low-green-innovation regime DE coefficients are significant for CEE but not for PCE; mediating-effect models corroborate the efficiency channel in low-innovation contexts.
Realising DT value requires upfront investment in sensors, integration, standards, and skills; economic viability depends on contract structures and how gains are allocated between investors, owners, contractors, and operators.
Synthesis of cost/benefit discussions and case descriptions in the reviewed literature; policy and procurement examples referenced.
HCI has explored usable consent, but there is no systematic framework for consent in the AI era.
Literature synthesis and gap identification from workshop participants and solicited position papers; no systematic review or meta-analysis with counted studies reported in the summary.
Privacy-leak framing (risk vs ambiguity or privacy-threatening vs neutral) did not change participants' subsequent bargaining behavior with pricing algorithms.
The experiment measured downstream bargaining behavior with algorithms after the adoption/label tasks (N = 610) and reports no detectable effect of the privacy/leak framing on those bargaining outcomes.
Under truthful bidding, the decentralised price-based market matches a centralised value-optimal benchmark (i.e., decentralised allocation equals centralised value-optimal allocation).
Paper presents both a theoretical argument (mechanism properties under quasilinear utilities and discrete slices) and empirical validation in simulation by comparing decentralised outcomes to a centralised value-optimal baseline across configurations in the ablation study.
No clear evidence that project phase systematically shifts sentiment perception.
Project-phase indicators were collected each round and included in correlation and repeated-measures analyses; no consistent, systematic association between project phase and sentiment labeling was found.
Predictors of negative labeling are weak and at best trend-level (e.g., task conflict shows only weak/trend-level association with negative labels).
Correlation analyses and GEE models testing multiple predictors (mood states, life circumstances, team dynamics including task conflict) on negative vs other labels; effects for negative labeling were small and lacked robustness.
Experiments used realistic channel and beamforming datasets reflecting varying elevation angles and dynamic LEO link conditions.
Dataset description in the paper states use of realistic channel and beamforming data including varying elevation angles and dynamic links; no dataset size or public dataset identifiers provided in the summary.
There is a need for causal studies (randomized pilots, phased rollouts) to quantify net welfare effects including patient trust, equity, legal risk, and long-run labor impacts.
Authors' recommendation based on gaps identified in the mixed-methods evidence and acknowledged limitations around causal identification and long-term measurement.
Under the current estimated parameters, dynamics converge toward equilibria—implying convergent, policy-mediated adjustment rather than endogenous cyclical instability.
Inference from stability classification (stable-node equilibria) and model dynamics simulated or linearized around equilibria, using parameters estimated for 2016–2023.