Evidence (7198 claims)
Search and filter individual claims pulled from the papers. If you are looking for a specific finding (for example, "what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other, use the Evidence Explorer instead.
The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).
Browse by theme
Nine broad, paper-level topics. Click one to filter the claims below.
| Theme | Claims |
|---|---|
| Adoption | 8921 |
| Productivity | 8002 |
| Governance (active filter) | 7198 |
| Human-AI Collaboration | 6864 |
| Org Design | 4398 |
| Innovation | 4286 |
| Labor Markets | 3629 |
| Skills & Training | 3001 |
| Inequality | 2141 |
Claims by outcome category
Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 790 | 208 | 103 | 950 | 2117 |
| Governance & Regulation | 869 | 411 | 195 | 126 | 1630 |
| Organizational Efficiency | 817 | 202 | 126 | 87 | 1243 |
| Technology Adoption Rate | 675 | 258 | 128 | 106 | 1178 |
| Research Productivity | 462 | 138 | 64 | 347 | 1023 |
| Output Quality | 501 | 193 | 61 | 52 | 807 |
| Decision Quality | 346 | 180 | 84 | 51 | 668 |
| AI Safety & Ethics | 235 | 285 | 70 | 34 | 630 |
| Firm Productivity | 452 | 58 | 91 | 20 | 627 |
| Market Structure | 184 | 171 | 123 | 24 | 507 |
| Task Allocation | 221 | 65 | 76 | 34 | 401 |
| Skill Acquisition | 176 | 62 | 62 | 17 | 317 |
| Innovation Output | 207 | 28 | 48 | 18 | 303 |
| Fiscal & Macroeconomic | 135 | 72 | 44 | 26 | 284 |
| Employment Level | 105 | 56 | 108 | 13 | 284 |
| Consumer Welfare | 121 | 67 | 45 | 11 | 244 |
| Firm Revenue | 160 | 50 | 28 | 4 | 242 |
| Task Completion Time | 182 | 33 | 10 | 13 | 239 |
| Inequality Measures | 45 | 126 | 50 | 6 | 227 |
| Worker Satisfaction | 94 | 73 | 23 | 12 | 202 |
| Error Rate | 76 | 98 | 11 | 4 | 189 |
| Regulatory Compliance | 81 | 73 | 17 | 7 | 178 |
| Automation Exposure | 61 | 59 | 26 | 14 | 163 |
| Training Effectiveness | 97 | 21 | 14 | 19 | 153 |
| Wages & Compensation | 78 | 37 | 25 | 6 | 146 |
| Developer Productivity | 105 | 18 | 14 | 6 | 144 |
| Team Performance | 87 | 17 | 28 | 10 | 143 |
| Job Displacement | 12 | 83 | 21 | 1 | 117 |
| Hiring & Recruitment | 52 | 8 | 8 | 3 | 71 |
| Social Protection | 39 | 17 | 8 | 2 | 66 |
| Creative Output | 32 | 20 | 8 | 3 | 64 |
| Skill Obsolescence | 5 | 49 | 6 | 1 | 61 |
| Labor Share of Income | 17 | 19 | 17 | — | 53 |
| Worker Turnover | 15 | 14 | — | 3 | 32 |
| Industry | — | — | — | 1 | 1 |
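Because the outcomes differ widely in size, raw counts are easier to compare once they are converted to shares. The following is a minimal Python sketch, not part of the site itself. It assumes a dash in the table means zero claims, and it normalizes by the sum of the four direction columns rather than by Total, since the two can differ (for "Other" the four columns sum to 2051 against a Total of 2117). The function name `direction_shares` is hypothetical, and only three rows of the table are copied in.

```python
from typing import Dict, Tuple

# (positive, negative, mixed, null) counts copied from three rows of the table.
# A dash in the table is treated as 0 here, which is an assumption.
ROWS: Dict[str, Tuple[int, int, int, int]] = {
    "Job Displacement": (12, 83, 21, 1),
    "Firm Productivity": (452, 58, 91, 20),
    "Labor Share of Income": (17, 19, 17, 0),
}

DIRECTIONS = ("positive", "negative", "mixed", "null")


def direction_shares(counts: Tuple[int, int, int, int]) -> Dict[str, float]:
    """Return each direction's share of the four direction counts."""
    total = sum(counts)
    if total == 0:
        # No directional claims at all: report zero shares instead of dividing by 0.
        return {d: 0.0 for d in DIRECTIONS}
    return {d: c / total for d, c in zip(DIRECTIONS, counts)}


for outcome, counts in ROWS.items():
    shares = direction_shares(counts)
    print(outcome, {d: round(s, 3) for d, s in shares.items()})
```

Read this way, Job Displacement is dominated by negative findings, while Firm Productivity is dominated by positive ones, even though the latter has several times as many claims.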
Governance
AI reduces information asymmetries.
Theoretical/conceptual argument in the paper framing AI as a general-purpose technology that improves information flows; supported by the paper's conceptual framework (no experimental or causal identification reported).
These systems are now being widely used to produce software, conduct business activities, and automate everyday personal tasks.
Authors' statement describing observed applications and uses (policy/legal analysis; specific empirical data or sample size not provided in excerpt).
AI agents have entered the mainstream.
Authors' declarative statement based on their review of recent developments and observed uptake (policy/legal analysis in the paper). No empirical sample size reported in excerpt.
Economies and organizations that prioritize adaptability, workforce transformation, and real-time decision-making capabilities are better positioned to sustain growth under volatile conditions.
Claim based on the paper's cross-cutting analysis of global indicators and the conceptual AEPM framework; the excerpt does not provide a quantified causal estimate, experimental evidence, or sample size supporting this assertion.
AEPM is structured around five core pillars—energy resilience, supply chain flexibility, human capital adaptability, financial sustainability, and AI-enabled decision systems—which together provide a comprehensive approach to managing uncertainty and enabling dynamic responses to structural disruptions.
Conceptual design of the AEPM presented in the paper; described as a multidimensional framework combining these five pillars. No empirical validation or quantified impact measures reported in the excerpt.
The paper proposes shifting from forecasting-centric economic management to an adaptive preparedness paradigm and introduces the Adaptive Economic Preparedness Model (AEPM), a multi-dimensional framework designed to enhance resilience at both organizational and national levels.
Presentation of a conceptual model (AEPM) in the paper structured around five pillars; this is a proposed framework rather than an empirically validated intervention (no evaluation sample or randomized test reported in the excerpt).
The contribution is a falsifiable architectural thesis, a clear threat model, and a set of experimentally testable hypotheses for future work on distillation resistance, alignment, and model governance.
Theoretical contribution claim: the paper proposes hypotheses and a threat model intended to be testable in future empirical work; no experiments in the paper itself are reported.
The authors call for shifting evaluation and assurance from tool qualification toward workflow qualification to achieve trustworthy Physical AI.
Normative recommendation based on the paper's theoretical analysis (policy/recommendation; no empirical sample reported).
The paper derives non-degradation conditions that characterize shadow-resistant workflows for AI-assisted safety analysis.
Analytic derivations and formal criteria presented in the paper (theoretical result; no empirical validation/sample size reported).
The paper formalizes four canonical human–AI collaboration structures and derives closed-form performance bounds for them.
Theoretical/mathematical derivations and models in the paper (no empirical verification/sample size reported).
A five-dimensional competence framework captures safety competence via domain knowledge, standards expertise, operational experience, contextual understanding, and judgment.
Theoretical contribution: paper defines and formalizes a five-dimension framework (no empirical validation/sample size reported).
To facilitate adoption of our evaluation framework, we detail our testing protocols and make relevant materials publicly available.
Statement in paper that testing protocols and materials are documented and released publicly (paper claims to provide materials).
We assess an AI model with 10,101 participants spanning interactions in three AI use domains (public policy, finance, and health) and three locales (US, UK, and India).
Reported sample size and study design details stated in abstract: N = 10,101; three domains and three locales specified.
This paper introduces a framework for evaluating harmful AI manipulation via context-specific human-AI interaction studies.
Paper describes a proposed evaluation framework (methodological contribution); claimed in abstract/introduction as new contribution. No numeric sample required for the claim itself.
The result is evidence-based triggers that replace calendar schedules and make governance auditable.
Claimed outcome of applying the decision-theoretic framework in the paper (argumentative; no empirical deployment or case-study evidence reported in the summary).
The paper provides a decision-theoretic framework for retraining policies.
Explicit claim about the paper's contribution; the article presents a framework (conceptual/methodological exposition).
The retraining decision is a cost minimization problem with a threshold that falls out of your loss function.
Decision-theoretic derivation presented in the paper (analytical/theoretical reasoning; no empirical validation reported).
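One way to read a claim of this kind is as a comparison of expected losses between two actions. The sketch below is not the paper's derivation. The cost names and the drift probability are hypothetical. It assumes that keeping a stale model costs `loss_if_stale` only when drift has occurred, while retraining always costs `retrain_cost`. Under those assumptions the threshold on the drift probability is simply the ratio of the two costs.

```python
def retrain_threshold(retrain_cost: float, loss_if_stale: float) -> float:
    """Drift probability above which retraining has the lower expected cost.

    Assumed (hypothetical) cost model:
      expected cost of keeping the model = p_drift * loss_if_stale
      expected cost of retraining        = retrain_cost
    Retraining wins when p_drift * loss_if_stale > retrain_cost.
    """
    if loss_if_stale <= 0:
        raise ValueError("loss_if_stale must be positive")
    # Cap at 1.0: if retraining costs more than the worst-case loss,
    # no drift probability can justify it.
    return min(1.0, retrain_cost / loss_if_stale)


def should_retrain(p_drift: float, retrain_cost: float, loss_if_stale: float) -> bool:
    """Apply the threshold rule to an estimated drift probability."""
    return p_drift > retrain_threshold(retrain_cost, loss_if_stale)


threshold = retrain_threshold(retrain_cost=2.0, loss_if_stale=10.0)
print(threshold)
print(should_retrain(0.35, 2.0, 10.0))
print(should_retrain(0.10, 2.0, 10.0))
```

The point of the illustration is that the trigger depends on the two costs and on evidence of drift, not on a calendar date.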
Retraining can be better understood as approximate Bayesian inference under computational constraints.
Theoretical argument and decision-theoretic framing presented in the paper (conceptual/mathematical derivation rather than empirical testing).
The framework is designed for direct application to engineering processes for which operational event logs are available.
Statement of intended applicability in the paper and demonstration on a large enterprise procurement workflow (BPI 2019 log).
The same quantities that delimit statistically credible autonomy (blind masses, escalation gate, m(s), etc.) also determine expected oversight burden (the framework includes an expected oversight-cost identity over the workflow visitation measure).
Theoretical identity and discussion in the paper plus demonstration on the empirical workflow showing how the introduced quantities relate to expected oversight costs.
On the held-out split, m(s) = max_a \hat{\pi}(a|s) tracks realized autonomous step accuracy within 3.4 percentage points on average.
Empirical evaluation on the paper's held-out test split (chronological 20%); reported average discrepancy between the maximum predicted action probability and realized autonomous-step accuracy.
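The quantity in this claim can be illustrated on toy data. The following is a minimal sketch, assuming \hat{\pi}(a|s) is estimated from empirical action frequencies per state in a training log and that an autonomous step takes the most frequent action; the paper's estimator and agent may differ. The state and action names and both toy logs are invented for illustration and are not drawn from the paper's data.

```python
from collections import Counter, defaultdict


def fit_policy(train_steps):
    """Estimate pi_hat(a|s) as the empirical frequency of each action per state."""
    counts = defaultdict(Counter)
    for state, action in train_steps:
        counts[state][action] += 1
    return {
        s: {a: n / sum(c.values()) for a, n in c.items()}
        for s, c in counts.items()
    }


def confidence(policy, state):
    """m(s) = max_a pi_hat(a|s): the largest predicted action probability."""
    return max(policy[state].values())


def realized_accuracy(policy, test_steps, state):
    """Share of held-out steps in `state` where the argmax action was taken."""
    best = max(policy[state], key=policy[state].get)
    actions = [a for s, a in test_steps if s == state]
    return sum(a == best for a in actions) / len(actions)


# Toy logs (invented): (state, action) pairs.
train = [("invoice", "approve")] * 8 + [("invoice", "escalate")] * 2
test = [("invoice", "approve")] * 7 + [("invoice", "escalate")] * 3

policy = fit_policy(train)
m = confidence(policy, "invoice")
acc = realized_accuracy(policy, test, "invoice")
gap_pp = abs(m - acc) * 100  # discrepancy in percentage points
print(m, acc, round(gap_pp, 1))
```

Averaging `gap_pp` over states would give a figure comparable in kind to the one reported, though how the paper weights states is not stated in this excerpt.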
Refining the operational state to include case context, economic magnitude, and actor class expands the state space from 42 to 668.
Empirical report in the paper showing state-space expansion when additional contextual variables are included in state definition (numbers 42 and 668 stated).
We instantiate the framework on the Business Process Intelligence Challenge 2019 purchase-to-pay log (251,734 cases, 1,595,923 events, 42 distinct workflow actions) and construct a log-driven simulated agent from a chronological 80/20 split of the same process.
Empirical instantiation described in the paper using the BPI 2019 purchase-to-pay event log; dataset statistics (cases, events, distinct actions) and an 80/20 chronological train/test split are reported.
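In a chronological split, no test case starts before any training case, which avoids training on the future. The following is a minimal sketch, assuming each case is ordered by a single start timestamp and the cut is made at the case level. The excerpt does not say whether the paper splits by case or by event, and the `Case` type and `chronological_split` name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Case:
    case_id: str
    start: int  # start time as a sortable number, e.g. a Unix timestamp


def chronological_split(
    cases: List[Case], train_frac: float = 0.8
) -> Tuple[List[Case], List[Case]]:
    """Sort cases by start time and put the earliest `train_frac` in the training set."""
    if not 0.0 < train_frac < 1.0:
        raise ValueError("train_frac must be strictly between 0 and 1")
    ordered = sorted(cases, key=lambda c: c.start)
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]


# Ten toy cases with shuffled start times (invented values).
starts = [50, 10, 40, 20, 30, 90, 70, 60, 80, 100]
cases = [Case(f"c{i}", start) for i, start in enumerate(starts)]
train, test = chronological_split(cases)
print(len(train), len(test))
```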
We develop a measure-theoretic Markov framework for agentic AI in organizations, whose core quantities are state blind-spot mass B_n(\tau), state-action blind mass B^{SA}_{\pi,n}(\tau), an entropy-based human-in-the-loop escalation gate, and an expected oversight-cost identity over the workflow visitation measure.
Theoretical development presented in the paper (definition and derivation of the measure-theoretic Markov framework and associated quantities).
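An entropy-based escalation gate can be pictured as follows. This is a minimal sketch, assuming the gate compares the Shannon entropy of the predicted action distribution at a state against a fixed cutoff; the paper's exact definition is not given in this excerpt. The name `max_entropy` and its value are hypothetical.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats of a discrete distribution (zero terms are skipped)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def escalate(probs: Sequence[float], max_entropy: float) -> bool:
    """Hand the step to a human when the action distribution is too uncertain."""
    return entropy(probs) > max_entropy


confident = [0.97, 0.02, 0.01]  # one action clearly dominates
uncertain = [0.40, 0.35, 0.25]  # no clear best action
max_entropy = 0.5  # hypothetical cutoff in nats

print(escalate(confident, max_entropy), escalate(uncertain, max_entropy))
```

A lower cutoff sends more steps to human review, which is one way to see why the same quantities that bound autonomy also drive oversight cost.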
The framework aims to support more comparable benchmarks and cumulative research on human-AI readiness, advancing safer and more accountable human-AI collaboration.
Stated aims and intended impact in paper; aspirational/conceptual rather than empirically demonstrated in excerpt.
Operationalizing evaluation through interaction traces rather than model properties or self-reported trust enables deployment-relevant assessment of calibration, error recovery, and governance.
Methodological claim/proposed approach in paper; presented as enabling assessment but no empirical evaluation reported in excerpt.
The taxonomy and metrics are connected to the Understand-Control-Improve (U-C-I) lifecycle of human-AI onboarding and collaboration.
Conceptual mapping described in paper; no empirical tests or sample reported in excerpt.
We introduce a four-part taxonomy of evaluation metrics spanning outcomes, reliance behavior, safety signals, and learning over time.
Explicit methodological claim in paper announcing a taxonomy; described as a contribution rather than empirically tested in excerpt.
This paper proposes a measurement framework for evaluating human-AI decision-making centered on team readiness.
Methodological contribution presented in paper; conceptual framework proposed (no empirical validation reported in excerpt).
Artificial intelligence (AI) systems are deployed as collaborators in human decision-making.
Statement in paper (conceptual/observational claim); no empirical sample or method provided in excerpt.
Late disclosure of AI involvement improved affective engagement for AI-enhanced content.
Reported experimental result in the abstract from the two online studies (study 1: n = 325; study 2: n = 371) manipulating disclosure timing (early vs. late).
The results of this regional research outline a multi-dimensional policy roadmap that dives deep into the region’s current capabilities and the hurdles it faces in catching up with the AI revolution from a governance and policy perspective, presenting them in a practical framework for public sector leaders.
Report summary claiming that the study's results produce a comprehensive roadmap and practical framework (content description).
This executive report provides a roadmap for establishing an AI governance infrastructure through a set of strategic policy recommendations across seven key pillars.
Document assertion describing the content and structure of the report (authors' deliverable).
The reality of limited AI governance capacity calls for a series of policy interventions at both local and regional levels to empower the AI ecosystem in the Arab region.
Authors' policy recommendation derived from the regional study and synthesis of findings.
A governance model linking 'trustworthy AI' practices to competitive advantage yields reduced uncertainty, faster deployment cycles, and higher stakeholder trust.
Central claim of the paper tying the proposed AIGSF to business benefits; supported by conceptual linkage and illustrative examples rather than quantified empirical evidence or controlled evaluation.
Case illustrations across hiring, credit, consumer services, and generative AI draw lessons on controls such as model documentation, algorithmic audits, impact assessments, and human-in-the-loop oversight.
Paper includes qualitative case illustrations in the listed domains to demonstrate governance controls; these are presented as examples and lessons rather than as systematic empirical studies (no sample sizes reported).
The paper develops an AI Governance Strategic Framework (AIGSF) and an implementation roadmap that connect ethical accountability, regulatory readiness, cybersecurity resilience, and performance outcomes.
Paper contribution described as an integrative conceptual framework and roadmap; supported by theoretical grounding and illustrative cases rather than empirical validation; no sample size provided.
AI governance should be treated as a strategic governance function—anchored in board oversight and enterprise risk management—rather than a narrow technical or compliance task.
Central normative recommendation and thesis of the paper; derived from an integrative conceptual framework grounded in corporate governance theory, ERM, and emerging regulation. No empirical testing or sample reported.
AI has moved from a peripheral digital capability to a central driver of corporate strategy, reshaping decision-making, customer engagement, operations, and risk exposure.
Statement presented in the paper's introduction and motivation; supported by integrative conceptual design and literature grounding (theory and descriptive citations). No empirical sample or quantitative analysis reported.
A policy of 20% mandatory practice preserves 92% more capability than the simulation baseline (baseline includes a 5% background AI-failure rate).
Simulation comparing baseline (5% background AI-failure rate) to a counterfactual with 20% mandatory practice; reported 92% relative preservation of capability.
The model predicts that periodic AI failures improve human capability 2.7-fold (relative improvement reported in simulations).
Simulation experiments comparing scenarios with/without periodic AI failures; reported fold-change in capability of 2.7×.
Validated against 15 countries' PISA data (102 points), the model achieves R^2 = 0.946 with 3 parameters and attains the lowest BIC among compared specifications.
Empirical validation using PISA dataset covering 15 countries and 102 data points; reported fit statistics (R^2, number of parameters, BIC).
The model was calibrated to four domains: education, medicine, navigation, and aviation.
Model calibration procedures applied separately to four named domains reported in the paper.
We present a two-variable dynamical systems model coupling capability (H) and delegation (D), grounded in three axioms: learning requires capability, learning requires practice, and disuse causes forgetting.
Model specification and theoretical construction described in the paper (two-variable dynamical system; three axioms).
Legal professionals, courts, and regulators should replace the outdated 'black box' mental model with verification protocols based on how these systems actually fail.
Policy recommendation stated in the abstract based on the paper's analysis; no trial or deployment evidence of such protocols provided in the excerpt.
The adoption of generative AI across commercial and legal professions offers dramatic efficiency gains.
Asserted in the paper's introduction/abstract; no empirical data, sample, or quantitative study reported in the excerpt.
AI adoption and the associated improved governance lead to higher total factor productivity (TFP).
Empirical analysis showing a positive association between firm-level AI application index and measures of total factor productivity in the 2010–2023 Chinese A-share panel.
AI adoption and the associated improved governance lead to a lower cost of debt financing for firms.
Empirical tests linking firm-level AI application and governance improvements to measures of debt financing costs (e.g., interest rates on debt, financing spreads) in the Chinese A-share firm sample.
The governance risk-mitigation effects of AI operate through enhancing external monitoring.
Mechanism analyses showing that AI adoption is associated with measures of stronger external monitoring (e.g., analyst coverage, media scrutiny, regulator activity) in the firm-year panel, linking that channel to reduced misconduct.
The governance risk-mitigation effects of AI operate through strengthening internal control capacity.
Mechanism analyses showing that higher AI application is associated with improved internal control measures (as reported by firms or regulatory/financial-control indicators) in the dataset of Chinese A-share firms.