Evidence (7198 claims)

Search and filter individual claims pulled from the papers. If you are looking for a specific finding ("what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other instead, use the Evidence Explorer.

The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).

Browse by theme

Nine broad, paper-level topics. Click one to filter the claims below.

Human-AI Collaboration

Claims by outcome category

Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.

Outcome	Positive	Negative	Mixed	Null	Total
Other	790	208	103	950	2117
Governance & Regulation	869	411	195	126	1630
Organizational Efficiency	817	202	126	87	1243
Technology Adoption Rate	675	258	128	106	1178
Research Productivity	462	138	64	347	1023
Output Quality	501	193	61	52	807
Decision Quality	346	180	84	51	668
AI Safety & Ethics	235	285	70	34	630
Firm Productivity	452	58	91	20	627
Market Structure	184	171	123	24	507
Task Allocation	221	65	76	34	401
Skill Acquisition	176	62	62	17	317
Innovation Output	207	28	48	18	303
Fiscal & Macroeconomic	135	72	44	26	284
Employment Level	105	56	108	13	284
Consumer Welfare	121	67	45	11	244
Firm Revenue	160	50	28	4	242
Task Completion Time	182	33	10	13	239
Inequality Measures	45	126	50	6	227
Worker Satisfaction	94	73	23	12	202
Error Rate	76	98	11	4	189
Regulatory Compliance	81	73	17	7	178
Automation Exposure	61	59	26	14	163
Training Effectiveness	97	21	14	19	153
Wages & Compensation	78	37	25	6	146
Developer Productivity	105	18	14	6	144
Team Performance	87	17	28	10	143
Job Displacement	12	83	21	1	117
Hiring & Recruitment	52	8	8	3	71
Social Protection	39	17	8	2	66
Creative Output	32	20	8	3	64
Skill Obsolescence	5	49	6	1	61
Labor Share of Income	17	19	17	—	53
Worker Turnover	15	14	—	3	32
Industry	—	—	—	1	1
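
The counts above can be turned into per-outcome shares with a few lines of Python. This is a minimal sketch: the rows are copied from the table, while the helper names (`parse_row`, `positive_share`, `unlisted`) are hypothetical. Note that the four direction columns do not always add up to Total (for "Other", 790 + 208 + 103 + 950 = 2051 against a Total of 2117), so the sketch divides by the listed Total and reports the remainder separately.

```python
# Rows copied from the table; "\u2014" is the dash the page shows for an empty cell.
MISSING = "\u2014"
ROWS = [
    "Other\t790\t208\t103\t950\t2117",
    "Wages & Compensation\t78\t37\t25\t6\t146",
    "Job Displacement\t12\t83\t21\t1\t117",
    "Labor Share of Income\t17\t19\t17\t" + MISSING + "\t53",
]

def parse_row(line):
    """Split one tab-separated row; an empty cell is treated as zero (assumption)."""
    name, *cells = line.split("\t")
    pos, neg, mixed, null, total = (0 if c == MISSING else int(c) for c in cells)
    return {"outcome": name, "positive": pos, "negative": neg,
            "mixed": mixed, "null": null, "total": total}

def positive_share(row):
    """Fraction of all claims in the row that report a positive finding."""
    return row["positive"] / row["total"]

def unlisted(row):
    """Claims counted in Total but in none of the four direction columns."""
    listed = row["positive"] + row["negative"] + row["mixed"] + row["null"]
    return row["total"] - listed

table = [parse_row(r) for r in ROWS]
for row in table:
    print(f"{row['outcome']}: {positive_share(row):.1%} positive, "
          f"{unlisted(row)} claims outside the four columns")
```

Dividing by Total rather than by the sum of the four columns keeps the shares comparable across rows even when some claims carry no listed direction.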

Active filter: Governance

AI reduces information asymmetries.

Theoretical/conceptual argument in the paper framing AI as a general-purpose technology that improves information flows; supported by the paper's conceptual framework (no experimental or causal identification reported).

high positive Artificial intelligence, institutional innovation and econom... information asymmetries (reduction)

These systems are now being widely used to produce software, conduct business activities, and automate everyday personal tasks.

Authors' statement describing observed applications and uses (policy/legal analysis; specific empirical data or sample size not provided in excerpt).

high positive Regulating AI Agents use of AI agents across software production, business processes, and personal ta...

AI agents have entered the mainstream.

Authors' declarative statement based on their review of recent developments and observed uptake (policy/legal analysis in the paper). No empirical sample size reported in excerpt.

high positive Regulating AI Agents AI agent adoption / prevalence

Economies and organizations that prioritize adaptability, workforce transformation, and real-time decision-making capabilities are better positioned to sustain growth under volatile conditions.

Claim based on the paper's cross-cutting analysis of global indicators and the conceptual AEPM framework; the excerpt does not provide a quantified causal estimate, experimental evidence, or sample size supporting this assertion.

high positive Beyond Forecasting: Adaptive Economic Preparedness in a Geop... ability to sustain growth under volatile conditions

AEPM is structured around five core pillars—energy resilience, supply chain flexibility, human capital adaptability, financial sustainability, and AI-enabled decision systems—which together provide a comprehensive approach to managing uncertainty and enabling dynamic responses to structural disruptions.

Conceptual design of the AEPM presented in the paper; described as a multidimensional framework combining these five pillars. No empirical validation or quantified impact measures reported in the excerpt.

high positive Beyond Forecasting: Adaptive Economic Preparedness in a Geop... capacity to manage uncertainty and mount dynamic responses to structural disrupt...

The paper proposes shifting from forecasting-centric economic management to an adaptive preparedness paradigm and introduces the Adaptive Economic Preparedness Model (AEPM), a multi-dimensional framework designed to enhance resilience at both organizational and national levels.

Presentation of a conceptual model (AEPM) in the paper structured around five pillars; this is a proposed framework rather than an empirically validated intervention (no evaluation sample or randomized test reported in the excerpt).

high positive Beyond Forecasting: Adaptive Economic Preparedness in a Geop... resilience of organizations and nations to structural disruptions

The contribution is a falsifiable architectural thesis, a clear threat model, and a set of experimentally testable hypotheses for future work on distillation resistance, alignment, and model governance.

Theoretical contribution claim: the paper proposes hypotheses and a threat model intended to be testable in future empirical work; no experiments in the paper itself are reported.

high positive A Public Theory of Distillation Resistance via Constraint-Co... provision_of_falsifiable_thesis_and_testable_hypotheses

The authors call for shifting evaluation and assurance from tool qualification toward workflow qualification to achieve trustworthy Physical AI.

Normative recommendation based on the paper's theoretical analysis (policy/recommendation; no empirical sample reported).

high positive The Competence Shadow: Theory and Bounds of AI Assistance in... governance_and_regulation

The paper derives non-degradation conditions that characterize shadow-resistant workflows for AI-assisted safety analysis.

Analytic derivations and formal criteria presented in the paper (theoretical result; no empirical validation/sample size reported).

high positive The Competence Shadow: Theory and Bounds of AI Assistance in... output_quality

The paper formalizes four canonical human–AI collaboration structures and derives closed-form performance bounds for them.

Theoretical/mathematical derivations and models in the paper (no empirical verification/sample size reported).

high positive The Competence Shadow: Theory and Bounds of AI Assistance in... task_allocation

A five-dimensional competence framework captures safety competence via domain knowledge, standards expertise, operational experience, contextual understanding, and judgment.

Theoretical contribution: paper defines and formalizes a five-dimension framework (no empirical validation/sample size reported).

high positive The Competence Shadow: Theory and Bounds of AI Assistance in... skill_acquisition

To facilitate adoption of our evaluation framework, we detail our testing protocols and make relevant materials publicly available.

Statement in paper that testing protocols and materials are documented and released publicly (paper claims to provide materials).

high positive Evaluating Language Models for Harmful Manipulation availability of testing protocols and materials

We assess an AI model with 10,101 participants spanning interactions in three AI use domains (public policy, finance, and health) and three locales (US, UK, and India).

Reported sample size and study design details stated in abstract: N = 10,101; three domains and three locales specified.

high positive Evaluating Language Models for Harmful Manipulation sample composition and scale of the empirical study

This paper introduces a framework for evaluating harmful AI manipulation via context-specific human-AI interaction studies.

Paper describes a proposed evaluation framework (methodological contribution); claimed in abstract/introduction as new contribution. No numeric sample required for the claim itself.

high positive Evaluating Language Models for Harmful Manipulation existence of an evaluation framework for harmful AI manipulation

The result is evidence-based triggers that replace calendar schedules and make governance auditable.

Claimed outcome of applying the decision-theoretic framework in the paper (argumentative; no empirical deployment or case-study evidence reported in the summary).

high positive Retraining as Approximate Bayesian Inference retraining trigger design and governance auditability

The paper provides a decision-theoretic framework for retraining policies.

Explicit claim about the paper's contribution; the article presents a framework (conceptual/methodological exposition).

high positive Retraining as Approximate Bayesian Inference existence of a prescriptive framework for retraining policies

The retraining decision is a cost minimization problem with a threshold that falls out of your loss function.

Decision-theoretic derivation presented in the paper (analytical/theoretical reasoning; no empirical validation reported).

high positive Retraining as Approximate Bayesian Inference formalization of retraining decision rule (cost-minimization/threshold)
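
The threshold idea in the claim above can be illustrated with a generic two-action decision rule. This is a minimal sketch under stated assumptions, not the paper's derivation: the names `p_drift`, `retrain_cost`, and `stale_loss` are hypothetical, and the loss function is assumed to be a plain expected-cost comparison between keeping and retraining the model.

```python
def retrain_threshold(retrain_cost: float, stale_loss: float) -> float:
    """Drift probability above which retraining has the lower expected cost.

    Assumes keeping the model costs `stale_loss` if drift occurred and 0
    otherwise, while retraining always costs `retrain_cost`.
    """
    if stale_loss <= 0:
        raise ValueError("stale_loss must be positive")
    return min(1.0, retrain_cost / stale_loss)

def should_retrain(p_drift: float, retrain_cost: float, stale_loss: float) -> bool:
    """Retrain when the expected cost of keeping the model exceeds the retraining cost."""
    expected_cost_keep = p_drift * stale_loss
    return retrain_cost < expected_cost_keep

# Hypothetical numbers: retraining costs 10 units, a stale model costs 100.
print(retrain_threshold(10.0, 100.0))
print(should_retrain(0.2, 10.0, 100.0))
```

In this toy setting the threshold is simply the ratio of the two costs, which is one way a decision threshold can follow from a loss function rather than from a calendar schedule.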

Retraining can be better understood as approximate Bayesian inference under computational constraints.

Theoretical argument and decision-theoretic framing presented in the paper (conceptual/mathematical derivation rather than empirical testing).

high positive Retraining as Approximate Bayesian Inference conceptual framing of retraining

The framework is designed for direct application to engineering processes for which operational event logs are available.

Statement of intended applicability in the paper and demonstration on a large enterprise procurement workflow (BPI 2019 log).

high positive The Stochastic Gap: A Markovian Framework for Pre-Deployment... adoptability / applicability to engineering processes

The same quantities that delimit statistically credible autonomy (blind masses, escalation gate, m(s), etc.) also determine expected oversight burden (the framework includes an expected oversight-cost identity over the workflow visitation measure).

Theoretical identity and discussion in the paper plus demonstration on the empirical workflow showing how the introduced quantities relate to expected oversight costs.

high positive The Stochastic Gap: A Markovian Framework for Pre-Deployment... expected oversight burden / oversight cost

On the held-out split, m(s) = max_a \hat{\pi}(a|s) tracks realized autonomous step accuracy within 3.4 percentage points on average.

Empirical evaluation on the paper's held-out test split (chronological 20%); reported average discrepancy between the maximum predicted action probability and realized autonomous-step accuracy.

high positive The Stochastic Gap: A Markovian Framework for Pre-Deployment... accuracy of autonomous step selection (realized autonomous step accuracy)
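
To make the quantity m(s) = max_a \hat{\pi}(a|s) concrete, the sketch below estimates \hat{\pi} from action frequencies in a training log and compares mean confidence with held-out accuracy. It is an illustration only: the toy log, the frequency estimator, and the function names are assumptions, and the paper's exact averaging procedure is not given in the excerpt.

```python
from collections import Counter, defaultdict

def fit_policy(train_pairs):
    """Estimate pi_hat(a|s) as the empirical action frequency in each state."""
    counts = defaultdict(Counter)
    for state, action in train_pairs:
        counts[state][action] += 1
    return {s: {a: n / sum(c.values()) for a, n in c.items()}
            for s, c in counts.items()}

def confidence(policy, state):
    """m(s): the largest predicted action probability in this state."""
    return max(policy[state].values())

def greedy_action(policy, state):
    """The action an autonomous agent would take: the most probable one."""
    return max(policy[state], key=policy[state].get)

def calibration_gap(policy, test_pairs):
    """Absolute gap between mean m(s) and realized accuracy on held-out steps."""
    seen = [(s, a) for s, a in test_pairs if s in policy]
    if not seen:
        raise ValueError("no held-out step falls in a state seen during training")
    mean_conf = sum(confidence(policy, s) for s, _ in seen) / len(seen)
    accuracy = sum(greedy_action(policy, s) == a for s, a in seen) / len(seen)
    return abs(mean_conf - accuracy)

# Hypothetical toy log, split chronologically into train and test.
train = [("A", "x")] * 3 + [("A", "y")] + [("B", "z")] * 2
test = [("A", "x"), ("A", "y"), ("B", "z"), ("B", "z")]
policy = fit_policy(train)
print(calibration_gap(policy, test))
```

A small gap means the predicted confidence is a usable proxy for how often the autonomous step is actually right, which is the sense in which the claim says m(s) "tracks" accuracy.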

Refining the operational state to include case context, economic magnitude, and actor class expands the state space from 42 to 668.

Empirical report in the paper showing state-space expansion when additional contextual variables are included in state definition (numbers 42 and 668 stated).

high positive The Stochastic Gap: A Markovian Framework for Pre-Deployment... other

We instantiate the framework on the Business Process Intelligence Challenge 2019 purchase-to-pay log (251,734 cases, 1,595,923 events, 42 distinct workflow actions) and construct a log-driven simulated agent from a chronological 80/20 split of the same process.

Empirical instantiation described in the paper using the BPI 2019 purchase-to-pay event log; dataset statistics (cases, events, distinct actions) and an 80/20 chronological train/test split are reported.

high positive The Stochastic Gap: A Markovian Framework for Pre-Deployment... other

We develop a measure-theoretic Markov framework for agentic AI in organizations, whose core quantities are state blind-spot mass B_n(\tau), state-action blind mass B^{SA}_{\pi,n}(\tau), an entropy-based human-in-the-loop escalation gate, and an expected oversight-cost identity over the workflow visitation measure.

Theoretical development presented in the paper (definition and derivation of the measure-theoretic Markov framework and associated quantities).

high positive The Stochastic Gap: A Markovian Framework for Pre-Deployment... other
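
One plausible reading of an entropy-based escalation gate is sketched below: hand the step to a human whenever the Shannon entropy of the predicted action distribution exceeds a threshold. The excerpt does not give the paper's definition, so the gate form, the threshold value, and the function names here are all assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution; zero terms are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def escalate(action_probs, threshold):
    """Return True when the agent's action distribution is too uncertain to act alone."""
    return entropy(action_probs.values()) > threshold

# Hypothetical distributions over next workflow actions.
print(escalate({"approve": 1.0}, threshold=0.5))                 # certain: no escalation
print(escalate({"approve": 0.5, "reject": 0.5}, threshold=0.5))  # ln 2 exceeds 0.5: escalate
```

Under this reading, lowering the threshold routes more steps to a human and raises oversight cost, which matches the trade-off the framework is described as quantifying.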

The framework aims to support more comparable benchmarks and cumulative research on human-AI readiness, advancing safer and more accountable human-AI collaboration.

Stated aims and intended impact in paper; aspirational/conceptual rather than empirically demonstrated in excerpt.

high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... benchmarks, cumulative research, safety and accountability in human-AI collabora...

Operationalizing evaluation through interaction traces rather than model properties or self-reported trust enables deployment-relevant assessment of calibration, error recovery, and governance.

Methodological claim/proposed approach in paper; presented as enabling assessment but no empirical evaluation reported in excerpt.

high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... assessment of calibration, error recovery, governance via interaction traces

The taxonomy and metrics are connected to the Understand-Control-Improve (U-C-I) lifecycle of human-AI onboarding and collaboration.

Conceptual mapping described in paper; no empirical tests or sample reported in excerpt.

high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... linking metrics to U-C-I onboarding lifecycle

We introduce a four-part taxonomy of evaluation metrics spanning outcomes, reliance behavior, safety signals, and learning over time.

Explicit methodological claim in paper announcing a taxonomy; described as a contribution rather than empirically tested in excerpt.

high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... evaluation metrics taxonomy (outcomes, reliance behavior, safety signals, learni...

This paper proposes a measurement framework for evaluating human-AI decision-making centered on team readiness.

Methodological contribution presented in paper; conceptual framework proposed (no empirical validation reported in excerpt).

high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... team readiness evaluation

Artificial intelligence (AI) systems are deployed as collaborators in human decision-making.

Statement in paper (conceptual/observational claim); no empirical sample or method provided in excerpt.

high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... deployment of AI as collaborators

Late disclosure of AI involvement improved affective engagement for AI-enhanced content.

Reported experimental result in the abstract from the two online studies (study 1: n = 325; study 2: n = 371) manipulating disclosure timing (early vs. late).

high positive AI content labeling and user engagement on social media: The... affective engagement for AI-enhanced content under late disclosure

The results of this regional research outline a multi-dimensional policy roadmap that dives deep into the region’s current capabilities and the hurdles it faces in catching up with the AI revolution from a governance and policy perspective, presenting them in a practical framework for public sector leaders.

Report summary claiming that the study's results produce a comprehensive roadmap and practical framework (content description).

high positive Charting AI Governance Future in the Arab Region: A Policy R... comprehensiveness and practicality of the policy roadmap produced by the study

This executive report provides a roadmap for establishing an AI governance infrastructure through a set of strategic policy recommendations across seven key pillars.

Document assertion describing the content and structure of the report (authors' deliverable).

high positive Charting AI Governance Future in the Arab Region: A Policy R... existence of a multi-pillar policy roadmap in the report

The reality of limited AI governance capacity calls for a series of policy interventions at both local and regional levels to empower the AI ecosystem in the Arab region.

Authors' policy recommendation derived from the regional study and synthesis of findings.

high positive Charting AI Governance Future in the Arab Region: A Policy R... adoption of policy interventions to strengthen AI governance and ecosystem

A governance model linking 'trustworthy AI' practices to competitive advantage yields reduced uncertainty, faster deployment cycles, and higher stakeholder trust.

Central claim of the paper tying the proposed AIGSF to business benefits; supported by conceptual linkage and illustrative examples rather than quantified empirical evidence or controlled evaluation.

high positive Artificial Intelligence Governance In Corporate Strategy: Et... firm_revenue

Case illustrations across hiring, credit, consumer services, and generative AI draw lessons on controls such as model documentation, algorithmic audits, impact assessments, and human-in-the-loop oversight.

Paper includes qualitative case illustrations in the listed domains to demonstrate governance controls; these are presented as examples and lessons rather than as systematic empirical studies (no sample sizes reported).

high positive Artificial Intelligence Governance In Corporate Strategy: Et... regulatory_compliance

The paper develops an AI Governance Strategic Framework (AIGSF) and an implementation roadmap that connect ethical accountability, regulatory readiness, cybersecurity resilience, and performance outcomes.

Paper contribution described as an integrative conceptual framework and roadmap; supported by theoretical grounding and illustrative cases rather than empirical validation; no sample size provided.

high positive Artificial Intelligence Governance In Corporate Strategy: Et... organizational_efficiency

AI governance should be treated as a strategic governance function—anchored in board oversight and enterprise risk management—rather than a narrow technical or compliance task.

Central normative recommendation and thesis of the paper; derived from an integrative conceptual framework grounded in corporate governance theory, ERM, and emerging regulation. No empirical testing or sample reported.

high positive Artificial Intelligence Governance In Corporate Strategy: Et... governance_and_regulation

AI has moved from a peripheral digital capability to a central driver of corporate strategy, reshaping decision-making, customer engagement, operations, and risk exposure.

Statement presented in the paper's introduction and motivation; supported by integrative conceptual design and literature grounding (theory and descriptive citations). No empirical sample or quantitative analysis reported.

high positive Artificial Intelligence Governance In Corporate Strategy: Et... organizational_efficiency

A policy of 20% mandatory practice preserves 92% more capability than the simulation baseline (baseline includes a 5% background AI-failure rate).

Simulation comparing baseline (5% background AI-failure rate) to a counterfactual with 20% mandatory practice; reported 92% relative preservation of capability.

high positive The enrichment paradox: critical capability thresholds and i... preserved human capability under mandatory practice policy vs baseline

The model predicts that periodic AI failures improve human capability 2.7-fold (relative improvement reported in simulations).

Simulation experiments comparing scenarios with/without periodic AI failures; reported fold-change in capability of 2.7×.

high positive The enrichment paradox: critical capability thresholds and i... human capability (H) under periodic AI-failure regime

Validated against 15 countries' PISA data (102 points), the model achieves R^2 = 0.946 with 3 parameters and attains the lowest BIC among compared specifications.

Empirical validation using PISA dataset covering 15 countries and 102 data points; reported fit statistics (R^2, number of parameters, BIC).

high positive The enrichment paradox: critical capability thresholds and i... fit of model to PISA data (explained variance, model selection via BIC)
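
For readers unfamiliar with BIC-based model selection, the sketch below shows how two least-squares fits to the same data can be compared from R^2 and parameter counts alone. It assumes the common Gaussian-error form BIC = n ln(RSS/n) + k ln(n), which the excerpt does not confirm the paper uses. The n = 102, R^2 = 0.946, and k = 3 values come from the claim above; the competing specification (5 parameters, R^2 = 0.950) is hypothetical.

```python
import math

def delta_bic(n, r2_a, k_a, r2_b, k_b):
    """BIC(model A) minus BIC(model B) for least-squares fits to the same n points.

    Uses RSS = (1 - R^2) * TSS with a shared TSS, so TSS cancels in the
    difference. Negative values favor model A.
    """
    fit_term = n * math.log((1.0 - r2_a) / (1.0 - r2_b))
    penalty_term = (k_a - k_b) * math.log(n)
    return fit_term + penalty_term

# Reported model versus a hypothetical richer specification.
print(delta_bic(n=102, r2_a=0.946, k_a=3, r2_b=0.950, k_b=5))
```

The example illustrates why a model with slightly lower R^2 can still attain the lowest BIC: the ln(n) penalty on each extra parameter can outweigh a small improvement in fit.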

The model was calibrated to four domains: education, medicine, navigation, and aviation.

Model calibration procedures applied separately to four named domains reported in the paper.

high positive The enrichment paradox: critical capability thresholds and i... model parameter fits across domains

We present a two-variable dynamical systems model coupling capability (H) and delegation (D), grounded in three axioms: learning requires capability, learning requires practice, and disuse causes forgetting.

Model specification and theoretical construction described in the paper (two-variable dynamical system; three axioms).

high positive The enrichment paradox: critical capability thresholds and i... human capability as a dynamical variable (H) and delegation level (D)

Legal professionals, courts, and regulators should replace the outdated 'black box' mental model with verification protocols based on how these systems actually fail.

Policy recommendation stated in the abstract based on the paper's analysis; no trial or deployment evidence of such protocols provided in the excerpt.

high positive When AI output tips to bad but nobody notices: Legal implica... adoption of verification protocols / change in mental model

The adoption of generative AI across commercial and legal professions offers dramatic efficiency gains.

Asserted in the paper's introduction/abstract; no empirical data, sample, or quantitative study reported in the excerpt.

high positive When AI output tips to bad but nobody notices: Legal implica... efficiency gains

AI adoption and the associated improved governance lead to higher total factor productivity (TFP).

Empirical analysis showing a positive association between firm-level AI application index and measures of total factor productivity in the 2010–2023 Chinese A-share panel.

high positive The risk-mitigation effects of artificial intelligence adopt... total factor productivity (TFP)

AI adoption and the associated improved governance lead to a lower cost of debt financing for firms.

Empirical tests linking firm-level AI application and governance improvements to measures of debt financing costs (e.g., interest rates on debt, financing spreads) in the Chinese A-share firm sample.

high positive The risk-mitigation effects of artificial intelligence adopt... cost of debt financing (interest rate/spread measures)

The governance risk-mitigation effects of AI operate through enhancing external monitoring.

Mechanism analyses showing that AI adoption is associated with measures of stronger external monitoring (e.g., analyst coverage, media scrutiny, regulator activity) in the firm-year panel, linking that channel to reduced misconduct.

high positive The risk-mitigation effects of artificial intelligence adopt... external monitoring intensity (analyst coverage, media/regulatory scrutiny proxi...

The governance risk-mitigation effects of AI operate through strengthening internal control capacity.

Mechanism analyses showing that higher AI application is associated with improved internal control measures (as reported by firms or regulatory/financial-control indicators) in the dataset of Chinese A-share firms.

high positive The risk-mitigation effects of artificial intelligence adopt... internal control capacity (corporate internal control metrics)
