Evidence (7278 claims)
Search and filter individual claims pulled from the papers. If you are looking for a specific finding (for example, "what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other instead, use the Evidence Explorer.
The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).
Browse by theme
Nine broad, paper-level topics. Click one to filter the claims below.
Adoption: 9047 claims
Productivity: 8066 claims
Governance: 7278 claims (filter active)
Human-AI Collaboration: 6912 claims
Org Design: 4439 claims
Innovation: 4359 claims
Labor Markets: 3652 claims
Skills & Training: 3018 claims
Inequality: 2160 claims
Claims by outcome category
Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 795 | 210 | 105 | 955 | 2131 |
| Governance & Regulation | 886 | 414 | 197 | 126 | 1654 |
| Organizational Efficiency | 826 | 204 | 129 | 87 | 1257 |
| Technology Adoption Rate | 681 | 259 | 128 | 110 | 1189 |
| Research Productivity | 464 | 138 | 65 | 349 | 1028 |
| Output Quality | 503 | 196 | 61 | 53 | 813 |
| Decision Quality | 351 | 180 | 84 | 51 | 673 |
| AI Safety & Ethics | 238 | 288 | 71 | 34 | 637 |
| Firm Productivity | 455 | 58 | 92 | 20 | 631 |
| Market Structure | 186 | 172 | 123 | 25 | 511 |
| Task Allocation | 222 | 70 | 76 | 34 | 407 |
| Innovation Output | 238 | 28 | 48 | 18 | 334 |
| Skill Acquisition | 177 | 62 | 62 | 17 | 318 |
| Employment Level | 107 | 57 | 108 | 13 | 287 |
| Fiscal & Macroeconomic | 135 | 72 | 44 | 26 | 284 |
| Firm Revenue | 172 | 50 | 28 | 5 | 256 |
| Consumer Welfare | 121 | 68 | 45 | 12 | 246 |
| Task Completion Time | 183 | 33 | 10 | 13 | 240 |
| Inequality Measures | 45 | 126 | 50 | 6 | 227 |
| Worker Satisfaction | 95 | 74 | 23 | 12 | 204 |
| Error Rate | 77 | 98 | 11 | 4 | 190 |
| Regulatory Compliance | 84 | 73 | 17 | 7 | 181 |
| Automation Exposure | 61 | 61 | 27 | 14 | 166 |
| Training Effectiveness | 98 | 21 | 14 | 19 | 154 |
| Wages & Compensation | 78 | 37 | 25 | 6 | 146 |
| Developer Productivity | 105 | 18 | 14 | 6 | 144 |
| Team Performance | 87 | 17 | 28 | 10 | 143 |
| Job Displacement | 12 | 83 | 23 | 1 | 119 |
| Hiring & Recruitment | 53 | 8 | 8 | 3 | 72 |
| Social Protection | 39 | 17 | 8 | 2 | 66 |
| Creative Output | 32 | 20 | 8 | 3 | 64 |
| Skill Obsolescence | 5 | 50 | 6 | 1 | 62 |
| Labor Share of Income | 17 | 20 | 17 | — | 54 |
| Worker Turnover | 15 | 15 | — | 3 | 33 |
| Industry | — | — | — | 1 | 1 |
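As a rough illustration of how these counts can be read, the sketch below computes the share of positive and negative findings for three rows copied from the table above. The helper name `direction_shares` is hypothetical. Shares are taken over the Total column, because the four direction columns do not always sum to it (the Other row, for example, sums to 2065 against a Total of 2131).

```python
# Rows copied from the table above: (positive, negative, mixed, null, total).
ROWS = {
    "Firm Productivity": (455, 58, 92, 20, 631),
    "Job Displacement": (12, 83, 23, 1, 119),
    "Inequality Measures": (45, 126, 50, 6, 227),
}

def direction_shares(row):
    """Return (positive_share, negative_share) relative to the row total."""
    positive, negative, _mixed, _null, total = row
    return positive / total, negative / total

for outcome, row in ROWS.items():
    pos, neg = direction_shares(row)
    print(f"{outcome}: {pos:.1%} positive, {neg:.1%} negative")
```

Whether a "positive" direction is good news depends on the outcome category, so shares like these are best compared within a row rather than across rows.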
Governance
Algorithmic accuracy alone does not determine value; legitimacy and uptake hinge on the readiness of people and processes.
Thematic conclusion drawn from interviews, Likert surveys, and document analysis across cases, indicating that non-technical factors strongly influence uptake regardless of algorithmic performance metrics. (Sample size not reported.)
The long-term dynamic effects of AI on resilience remain unverified and require longer-term data.
Authors explicitly state the need for longer time-series data to validate long-term dynamics.
Enterprise-level indicators used in the study do not directly capture supply chain network structure and node dependencies.
Explicit limitation noted by the authors about measurement and scope.
The study's sample is limited to listed manufacturing companies, so conclusions should be applied cautiously to small and medium-sized enterprises (SMEs).
Explicit limitation stated by the authors in the paper.
Mediation and moderation models are leveraged to explore how AI enhances resilience via resource allocation optimization, productivity, and technological innovation, and how conditional factors (e.g., agility) affect these links.
Authors state they used mediation and moderation models on firm-level data to test mechanisms and conditional effects.
The study uses data on A-share listed manufacturing companies from 2011 to 2023 and applies a multi-period difference-in-differences (DID) model to assess AI's impact on SCR.
Methods description provided in the paper summary: sample timeframe and econometric approach explicitly stated.
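For readers unfamiliar with the design, a multi-period DID model is commonly written as a two-way fixed-effects regression. The form below is a generic textbook sketch, not the paper's reported specification: the treatment indicator, the control vector, and the symbol names are all assumptions made for illustration.

```latex
\mathrm{SCR}_{it} = \alpha + \beta\,\mathrm{AI}_{it} + \gamma^{\top} X_{it} + \mu_i + \lambda_t + \varepsilon_{it}
```

Here $\mathrm{SCR}_{it}$ is the outcome for firm $i$ in year $t$, $\mathrm{AI}_{it}$ is assumed to equal 1 once firm $i$ has adopted AI by year $t$ and 0 otherwise, $X_{it}$ stands for firm-level controls, $\mu_i$ and $\lambda_t$ are firm and year fixed effects, and $\beta$ is the DID estimate of interest.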
The article examines the socioeconomic implications of AI-driven automation through the lens of political economy and labor sociology.
Methodological statement in the paper indicating theoretical framing and disciplinary approaches; no empirical sample reported in the abstract.
The review is a focused qualitative evidence synthesis and the proposed governance model is an evidence-informed conceptual framework that warrants future empirical validation.
Authors' explicit framing of the review approach and caveat calling for empirical validation of the proposed model.
Given the focused Title/Abstract/Keywords query and the small, heterogeneous corpus, the findings are interpreted as a scoped evidence map rather than an exhaustive census of all AI-and-work research.
Authors' explicit limitation statement referencing the search strategy (title/abstract/keywords focus), small number of included studies (n=19), and heterogeneity of studies.
Nineteen studies met the eligibility criteria and were analyzed using qualitative thematic synthesis.
Reported result of the screening/eligibility process in the review: final included sample = 19 peer-reviewed articles; analysis method stated as qualitative thematic synthesis.
We conducted a systematic review guided by PRISMA 2020, searching Scopus and Web of Science (Title/Abstract/Keywords) for English-language journal articles published between 2015 and 2025.
Methods reported in the paper: PRISMA 2020-guided systematic review; databases searched explicitly named (Scopus, Web of Science); query fields (Title/Abstract/Keywords); language and date restrictions stated (English, 2015–2025).
The review focuses on the 2020–2025 period for studies of AI application in financial auditing.
Stated scope/timeframe of literature included in the review.
Article selection was conducted using the Scopus (Q1–Q4) and Sinta (1–2) databases based on predefined inclusion and exclusion criteria, resulting in a final sample of 15 articles.
Stated data sources and selection procedure in the Methods section; final sample size explicitly reported as 15.
This study employs a Systematic Literature Review (SLR) method following the PRISMA 2020 protocol.
Stated methodology in the paper: explicit use of SLR and PRISMA 2020 protocol.
We ran a longitudinal 20-month empirical study (July 2024 to February 2026) that chronicles the system's evolution.
Explicit statement of study duration and dates in the paper's abstract.
The baselines are implemented as prompts, representing the realistic deployment alternative to a governed framework.
Methodological statement in paper describing how baselines were implemented (as prompts); presented as representing realistic alternative deployment.
We benchmark three systems on an 11-case balanced prior authorization appeal evaluation set.
Methodological statement in paper describing evaluation; sample size explicitly stated as 11 cases.
Collaboration among content creators can be modeled as a multi-agent stochastic linear bandit problem with a transferable utility (TU) cooperative game formulation, where a coalition's value equals the negative sum of its members' cumulative regrets.
Methodological/modeling claim: the paper defines a multi-agent stochastic linear bandit and maps coalition value to negative sum of cumulative regrets as the TU game payoff function.
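A minimal sketch of the coalition value described above, assuming the cumulative regret of each member under a given coalition is already known. The function name `coalition_value` and the regret numbers are hypothetical and are not taken from the paper; in the model itself the regrets would come from running a linear bandit algorithm.

```python
def coalition_value(cumulative_regrets):
    """TU-game value of a coalition: the negative sum of its members' cumulative regrets.

    `cumulative_regrets` maps each member to the regret it accumulates
    while participating in this coalition.
    """
    return -sum(cumulative_regrets.values())

# Hypothetical regrets: two creators acting alone versus pooling observations.
alone_a = coalition_value({"creator_a": 12.0})
alone_b = coalition_value({"creator_b": 9.0})
together = coalition_value({"creator_a": 7.0, "creator_b": 6.0})

# In this toy case cooperation pays off if the joint value exceeds the sum of solo values.
print(together > alone_a + alone_b)  # True
```

Because the value is a negative regret, a larger (less negative) value means the coalition learns more efficiently.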
Both country and domain rankings are stable from 2021-2024.
Temporal analysis reported in paper comparing GCI and ETGCI rankings across 2021-2024, concluding stability of rankings over that period.
We found no evidence that information provision drove effects on our behavioural outcomes.
Analysis from the preregistered experiments showing that manipulations of information provision did not produce corresponding changes in measured behaviours (e.g., petition signing, donations).
We observed no evidence of a correlation between AI persuasion effects on attitudes and behaviour.
Analysis reported in the two preregistered experiments comparing AI-induced changes in attitudes with corresponding behavioural outcomes across participants (sample reported in paper).
The study uses the 2015 Green Data Center Pilot Policy as a quasi-natural experiment and employs the difference-in-differences (DID) method to identify the policy's impact on urban inclusive green growth.
Author-stated research design: quasi-natural experiment leveraging the 2015 policy and DID estimation (methodological claim in the paper).
This study uses Partial Least Squares Structural Equation Modeling (PLS-SEM) on 350 survey responses to examine the effects of AI adoption, regulatory clarity, digital infrastructure readiness, and cross-border data governance quality on international trade performance, with compliance effectiveness as a mediating mechanism.
Methodological description in the paper: PLS-SEM analysis on a survey sample of 350 responses (sample size explicitly reported).
Empirical evidence remains limited on how AI deployment and institutional conditions jointly influence compliance effectiveness and international trade performance.
Statement of research gap based on the paper's literature review and motivation for the study.
The selected studies originated mainly from Peru, Colombia, Chile, and Ecuador.
Geographic provenance reported for the 27 included studies (country distribution summarized in results).
After screening, 27 studies were selected for inclusion in the review.
PRISMA-style screening and eligibility process reported in the methods/results, yielding 27 included studies.
The initial search returned 276,302 records.
Reported search yield from the Scopus query described in the methods.
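To put the screening funnel in perspective, the sketch below computes the inclusion rate from the two figures reported for this review (276,302 records retrieved, 27 studies included). The function name `inclusion_rate` is hypothetical, and intermediate screening stages are not reported here, so only the end-to-end rate is shown.

```python
def inclusion_rate(included, retrieved):
    """Fraction of retrieved records that survive screening."""
    if retrieved <= 0:
        raise ValueError("retrieved must be positive")
    return included / retrieved

# Figures reported for this review: 27 included out of 276,302 retrieved.
rate = inclusion_rate(27, 276_302)
print(f"{rate:.4%} of retrieved records were included")
```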
A systematic search was conducted in the Scopus database following PRISMA 2020 guidelines for articles published between 2021 and 2025 using Boolean operators related to AI and decision-making.
Methodological description in the paper stating adherence to PRISMA 2020 and the search strategy (Scopus, 2021–2025).
Exploratory innovation does not show a significant direct association with long-term competitive performance.
PLS-SEM results from the survey of 104 Portuguese B2B managers reporting a non-significant direct path from exploratory innovation to performance.
ARS's implementation can be found at https://github.com/t54-labs/AgenticRiskStandard.
Link to code repository provided in the abstract (factual statement pointing to implementation).
As AI systems evolve into autonomous agents deployed in open environments and increasingly connected to payments or assets, the operational meaning of trust shifts to end-to-end outcomes: whether an agent completes tasks, follows user intent, and avoids failures that cause material or psychological harm.
Conceptual/argumentative claim presented in the paper (no empirical sample reported in the abstract).
Prior work on trustworthy AI emphasizes model-internal properties such as bias mitigation, adversarial robustness, and interpretability.
Summary statement about existing literature (no empirical data or sample reported in the abstract; asserted by authors as background).
On document intelligence (DocILE), our Code Factory variant matches Direct LLM on key field extraction (KILE: 80.0%).
Empirical evaluation reported on DocILE dataset of 5,680 invoices; KILE metric reported at 80.0%.
We evaluate on two task types: function-calling (BFCL, n=400) and document intelligence (DocILE, n=5,680 invoices).
Statement in paper specifying dataset/task types and sample sizes used in evaluation.
Explicit 'Sponsored' labels do not significantly reduce persuasion.
Experimental comparison including conditions with explicit 'Sponsored' labels; authors report no significant reduction in persuasion when labels were present (from the preregistered experiments).
A fifth of all products were randomly designated as sponsored and promoted in different ways.
Paper description of experimental manipulation: 20% of products (a fifth) were randomly designated as sponsored in the catalog.
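The random designation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name `designate_sponsored`, the catalog contents, and the seed are all hypothetical, and how sponsored items were then promoted is not modeled.

```python
import random

def designate_sponsored(product_ids, fraction=0.2, seed=None):
    """Randomly pick `fraction` of the products to be marked as sponsored."""
    rng = random.Random(seed)  # local generator so the draw is reproducible
    n_sponsored = round(len(product_ids) * fraction)
    return set(rng.sample(list(product_ids), n_sponsored))

catalog = [f"product_{i}" for i in range(50)]  # hypothetical catalog
sponsored = designate_sponsored(catalog, seed=0)
print(len(sponsored))  # 10, a fifth of 50
```

Sampling without replacement fixes the sponsored share at exactly one fifth, whereas flipping an independent 20% coin per product would only match it on average.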
We conducted two preregistered experiments with N = 2,012 participants.
Statement of experimental design in the paper (two preregistered experiments) with total sample size reported as N = 2,012.
Today's LLMs are trained to align with user preferences through methods such as reinforcement learning.
Statement of standard practice referenced in the paper, drawing on existing literature about alignment methods (e.g., reinforcement learning from human feedback). This is a descriptive claim about common training techniques rather than an experimental result in this paper.
A pre-registered experiment evaluates this thesis in a commons production economy (where agents share a finite resource pool and collaboratively produce value) at 50-1,000 agent scale.
Paper states that a pre-registered experiment is planned/described; the experiment context (commons production economy) and planned scale (50-1,000 agents) are specified in the excerpt. No experimental outcomes or effect estimates are reported here.
We instantiate SoP in AgentCity on an EVM-compatible layer-2 blockchain (L2) with a three-tier contract hierarchy (foundational, meta, and operational).
Reported implementation/instantiation described in the paper (system implementation claim). The paper states the platform (AgentCity) and technical details (EVM-compatible L2, three-tier contracts).
In this architecture, smart contracts are the law itself: the actual legislative output that agents produce and that governs their behavior.
Architectural/design claim in the paper describing conceptual role of smart contracts within SoP; presented as an intended property of the system.
Agents discover, transact with, and delegate to agents owned by other parties without centralized oversight.
Asserted behavior pattern of autonomous agents in the paper's motivation; presented as descriptive claim rather than supported by a reported experiment or dataset in the excerpt.
Autonomous AI agents are beginning to operate across organizational boundaries on the open internet.
Stated as an empirical observation in the paper's introduction as motivation; no specific dataset or sample described in the text excerpt.
The review covers publications between 2019 and 2025.
Explicit scope of the literature search reported by the authors (time window of included/considered publications).
The survey synthesizes methodological trends across data-, feature-, and decision-level fusion strategies.
Synthesis and categorization reported in the paper based on analysis of the included studies (n=18).
The review examines 18 multimodal GeoAI studies identified through a PRISMA-ScR screening process from 57 candidate publications between 2019 and 2025.
Explicit methodological reporting in the paper: PRISMA-ScR screening yielded 18 included studies out of 57 candidates over the 2019–2025 period.
This paper presents a systematic survey of recent GeoAI studies that fuse multiple geospatial data modalities for key urban mobility tasks.
Authors report conducting a systematic literature survey using a PRISMA-ScR screening process described in the paper.
Inclusive urban mobility examines whether transport systems equitably support the everyday movements and accessibility needs of historically marginalized and underserved populations.
Definition/interpretive claim presented in the paper as conceptual framing (no empirical measurement reported).
Without further assumptions, fitness need not increase over time.
Theoretical result from the model: analysis shows no guaranteed monotonic increase in fitness absent additional assumptions (proof/derivation in paper).
Humans retain partial control through a 'fitness function' that allocates limited computational resources across lineages.
Model assumption and formalization in the mathematical model: fitness function used to allocate computational resources across AI lineages (theoretical model specification).