Evidence (7278 claims)

Search and filter individual claims extracted from the papers. If you are looking for a specific finding ("what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other, use the Evidence Explorer.

The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).

Browse by theme

Nine broad, paper-level topics. Click one to filter the claims below.

Human-AI Collaboration

Claims by outcome category

Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.

Outcome	Positive	Negative	Mixed	Null	Total
Other	795	210	105	955	2131
Governance & Regulation	886	414	197	126	1654
Organizational Efficiency	826	204	129	87	1257
Technology Adoption Rate	681	259	128	110	1189
Research Productivity	464	138	65	349	1028
Output Quality	503	196	61	53	813
Decision Quality	351	180	84	51	673
AI Safety & Ethics	238	288	71	34	637
Firm Productivity	455	58	92	20	631
Market Structure	186	172	123	25	511
Task Allocation	222	70	76	34	407
Innovation Output	238	28	48	18	334
Skill Acquisition	177	62	62	17	318
Employment Level	107	57	108	13	287
Fiscal & Macroeconomic	135	72	44	26	284
Firm Revenue	172	50	28	5	256
Consumer Welfare	121	68	45	12	246
Task Completion Time	183	33	10	13	240
Inequality Measures	45	126	50	6	227
Worker Satisfaction	95	74	23	12	204
Error Rate	77	98	11	4	190
Regulatory Compliance	84	73	17	7	181
Automation Exposure	61	61	27	14	166
Training Effectiveness	98	21	14	19	154
Wages & Compensation	78	37	25	6	146
Developer Productivity	105	18	14	6	144
Team Performance	87	17	28	10	143
Job Displacement	12	83	23	1	119
Hiring & Recruitment	53	8	8	3	72
Social Protection	39	17	8	2	66
Creative Output	32	20	8	3	64
Skill Obsolescence	5	50	6	1	62
Labor Share of Income	17	20	17	—	54
Worker Turnover	15	15	—	3	33
Industry	—	—	—	1	1
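
To compare categories of very different sizes, the counts above can be read as shares per direction. The following is a minimal sketch, not part of the site: it assumes the rows are tab-separated as shown, treats a "—" cell as zero (an assumption), and divides by the sum of the four listed directions rather than the Total column, since some totals exceed that sum. The function name `direction_shares` is hypothetical.

```python
# Three rows copied from the table above (tab-separated).
ROWS = """\
Wages & Compensation\t78\t37\t25\t6\t146
Job Displacement\t12\t83\t23\t1\t119
Labor Share of Income\t17\t20\t17\t—\t54
"""

DIRECTIONS = ("positive", "negative", "mixed", "null")


def direction_shares(line):
    """Return (outcome name, share of claims per direction) for one table row."""
    name, *cells = line.split("\t")
    # Assumption: a dash cell means no claims were recorded for that direction.
    counts = [0 if cell == "—" else int(cell) for cell in cells[:4]]
    listed = sum(counts)
    shares = {d: c / listed for d, c in zip(DIRECTIONS, counts)}
    return name, shares


for row in ROWS.splitlines():
    outcome, shares = direction_shares(row)
    summary = ", ".join(f"{d} {s:.1%}" for d, s in shares.items())
    print(f"{outcome}: {summary}")
```

Read this way, Job Displacement is about 70% negative (83 of 119 claims), while Wages & Compensation is about 53% positive (78 of 146).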

Active filter: Governance

Conversational agents are increasingly integrated into the most private and intimate aspects of users' lives, from discussions of mental health to financial decisions.

Asserted as descriptive background in the paper (position/argumentative claim); examples provided (mental health, financial decisions); no empirical study or sample size reported in the excerpt.

high positive Who Does Your AI Work For? Designing Conversational Agents a... degree of integration of conversational agents into private/intimate user contex...

Audit outputs can be leveraged as red-teaming inputs to stress-test fairness robustness and strengthen AI governance through improved data quality and oversight (proposed intervention).

Proposed methodological/policy recommendation in the paper (proposal, not evaluated empirically in the excerpt).

high positive Towards Using Ai Bias Audits As Inputs For Red Teaming And P... robustness of fairness assessments and strength of AI governance

AI-enabled hiring systems are widely adopted.

Statement in paper (background claim); no empirical sample or citation provided in the excerpt.

high positive Towards Using Ai Bias Audits As Inputs For Red Teaming And P... adoption of AI-enabled hiring systems

The study advances an integrative framework of sustainable AI governance emphasizing regulatory adaptability, institutional coordination, and ethical oversight as mechanisms for aligning AI innovation with long-term financial stability and sustainability objectives, and offers policy-relevant guidance for regulators and financial institutions.

Study conclusion reported in the abstract describing the proposed integrative framework and its policy relevance; based on the study's comparative mixed-methods analysis.

high positive Artificial Intelligence in Financial Security Markets: Catal... presence of a policy-relevant sustainable AI governance framework emphasizing re...

AI-enabled financial innovation is associated with improvements in risk assessment capabilities.

Comparative institutional analysis and integration of secondary quantitative indicators with qualitative documentary evidence across China, the United States, and the United Kingdom (2022–2025) as described in the abstract.

high positive Artificial Intelligence in Financial Security Markets: Catal... risk assessment capabilities

AI-enabled financial innovation is associated with improvements in ESG integration.

Same comparative mixed-methods approach across China, the United States, and the United Kingdom (2022–2025) using secondary quantitative indicators and qualitative documentary evidence, reported in the abstract.

high positive Artificial Intelligence in Financial Security Markets: Catal... ESG integration

AI-enabled financial innovation is associated with improvements in market efficiency.

Comparative mixed-methods analysis (comparative institutional analysis) across leading financial systems in China, the United States, and the United Kingdom (2022–2025), integrating secondary quantitative indicators with qualitative documentary evidence as reported in the study abstract.

high positive Artificial Intelligence in Financial Security Markets: Catal... market efficiency

Managers can make both (Agentic Technical Debt and Stochastic Tax) visible through lightweight dashboards and governance controls.

Prescriptive recommendation in the paper; authors state they 'outline' approaches for managers to surface these concepts using dashboards and governance controls. No empirical evaluation or case study evidence reported in the provided excerpt.

high positive Governing Technical Debt in Agentic AI Systems visibility/monitoring of agentic technical debt and operating burden via dashboa...

Aggregators and niche specialists employ more open governance and sourcing logics that foster innovation, specialization, and ecosystem diversity.

Presented as a comparative finding from the taxonomy and qualitative examination of non-hyperscaler ML platform providers; support is drawn from conceptual analysis and examples in the paper rather than quantitative measures (no sample size reported in abstract).

high positive An Ai Economy Beyond Big Tech Hyperscalers? A Taxonomy Of Ma... openness of governance/sourcing and resulting innovation, specialization, and ec...

Ongoing efforts of the initiative aim to incorporate benchmarks that address concerns about bias by considering alternative perspectives and human centered use cases.

Statement of planned/ongoing work in the paper regarding future benchmark inclusion to address bias and human-centered use cases; no empirical results provided.

high positive BEAMS: Benchmarking and Evaluating AI for Modeling and Simul... planned incorporation of bias-aware benchmarks and human-centered use case consi...

Implemented tests include causal translation, model iteration, causal reasoning, conformance, model behavior explanation, suggested model building steps, and suggested model fixes.

Specific list of implemented test categories provided in the paper; descriptive/reporting evidence from the initiative's work.

high positive BEAMS: Benchmarking and Evaluating AI for Modeling and Simul... types/categories of tests implemented

Tests for several distinct categories of evaluation have been implemented and applied to AI tools that support qualitative model building, quantitative model building, and model discussion.

Paper reports that a set of tests has been implemented and applied to AI tools across qualitative and quantitative modeling and discussion; no sample sizes or numeric evaluation results provided in the excerpt.

high positive BEAMS: Benchmarking and Evaluating AI for Modeling and Simul... existence and application of implemented evaluation tests across types of modeli...

A steering group focuses on prioritizing potential benchmarks, while a technical group focuses on implementing the benchmarks in the form of automated tests.

Organizational description in the paper specifying roles (steering group and technical group); no quantitative evaluation reported.

high positive BEAMS: Benchmarking and Evaluating AI for Modeling and Simul... organizational roles for benchmark prioritization and implementation

The open source sd ai project hosted by the initiative establishes transparency and enables contributions to be shared broadly.

Descriptive statement about the open-source project hosted by the initiative; no empirical measures of transparency or contribution sharing provided.

high positive BEAMS: Benchmarking and Evaluating AI for Modeling and Simul... transparency and breadth of contributions enabled by the open source sd ai proje...

The initiative uses open digital and organizational infrastructure to collaboratively evaluate AI tools for modeling and simulation.

Descriptive claim in the paper about organizational approach (open infrastructure and collaborative evaluation); no empirical testing or sample size reported.

high positive BEAMS: Benchmarking and Evaluating AI for Modeling and Simul... use of open infrastructure for collaborative evaluation

The BEAMS Initiative aims to guide the development of AI tools for modeling and simulation toward forms that are responsible and ethical by establishing benchmarks for human centered modeling and simulation practices.

Descriptive statement about the Initiative's stated aims and purpose in the paper; organizational description rather than empirical evidence.

high positive BEAMS: Benchmarking and Evaluating AI for Modeling and Simul... existence and purpose of the BEAMS Initiative (benchmarking for responsible/ethi...

Tools that can automate aspects of modeling practice must complement human expertise, not replace it.

Normative claim made in the paper (argument about human-centered design); no empirical evidence or sample size reported.

high positive BEAMS: Benchmarking and Evaluating AI for Modeling and Simul... relationship between automated modeling tools and human expertise (complementari...

AI tools to support real world decision making must be able to build simulation models that inform their recommendations and render them interpretable.

Normative assertion in the paper (position statement / requirement); no empirical study or sample size reported.

high positive BEAMS: Benchmarking and Evaluating AI for Modeling and Simul... ability of AI tools to build interpretable simulation models that inform recomme...

The agentic future is not predetermined; leaders must both skate to where the puck is going and actively steer it toward a good place, ensuring innovation delivers welfare gains felt by businesses and consumers around the world.

Normative recommendation offered by the authors; based on conceptual argument and interpretation of the framework rather than empirical testing in the excerpt.

high positive From Augmentation to Reconstruction: Guiding the AI Disrupti... policy/leadership influence on welfare distribution of AI-driven innovation

These complementary investments produce the familiar 'productivity J-curve' of general-purpose technologies.

Stated as an economic analogy/claim drawing on general-purpose technology literature; presented as an asserted mechanism rather than shown with new empirical estimates in the excerpt.

high positive From Augmentation to Reconstruction: Guiding the AI Disrupti... productivity trajectory (J-curve) following complementary investments

The most consequential disruption resides in the third stage (Reconstruction) where workflows and markets are rebuilt around delegation, machine-to-machine interaction, continuous monitoring, and auditable constraints.

Theoretical claim in the paper backed by conceptual reasoning and illustrative sector examples; no quantitative evidence provided in the excerpt.

high positive From Augmentation to Reconstruction: Guiding the AI Disrupti... magnitude/importance of disruption arising from Reconstruction-stage changes

Because reputation-based, ex post sanctions cannot be relied upon for dissociative agents, governance should shift to observability-based, ex ante, constitutive, protocol-based behavioral harnesses.

Prescriptive recommendation derived from the theoretical critique of identity-based governance; paper proposes observability- and protocol-focused alternatives but does not present empirical tests or trials.

high positive Dissociative Identity: Language Model Agents Lack Grounding ... governance effectiveness of observability-based, ex ante protocol mechanisms

Reputation mechanisms function both as social signals and as corrective feedback that sustain an equilibrium of trustworthy behavior, presuming a persistent identity associated with behavioral continuity, sanction sensitivity, and costly non-fungibility.

Conceptual/theoretical argument presented in the paper drawing on reputation theory and social signaling; no empirical sample or quantitative data reported.

high positive Dissociative Identity: Language Model Agents Lack Grounding ... trustworthy behavior (sustaining equilibrium of trust)

Restoring honest billing will require verification that ties reported token counts to evidence the provider does not control, such as trusted execution attestation, cryptographic proofs of inference, or third-party re-execution.

Policy/recommendation proposed by the authors based on their findings (argument that independent verification is necessary).

high positive Token Inflation: How Dishonest Providers Can Overcharge for ... requirements for restoring honest billing (types of verification needed)

Even when the user can see the full reasoning string, tokenization ambiguity alone still allows 50.85% over-reporting below the detection threshold.

Experimental result reported in the paper showing over-reporting due solely to tokenizer ambiguity when reasoning string is visible (no sample size in excerpt).

high positive Token Inflation: How Dishonest Providers Can Overcharge for ... percent over-reporting of billed tokens due to tokenization ambiguity

At current frontier reasoning prices, that turns a $100 honest bill into roughly a $1,569 bill on the same query.

Numerical example/price calculation based on the reported inflation (uses current frontier reasoning prices; calculation given by the authors).

high positive Token Inflation: How Dishonest Providers Can Overcharge for ... billed dollar amount for same query

In the most permissive setting, hidden reasoning usage can be inflated by 1,469% on average without detection.

Experimental/adversarial evaluation reported in the paper showing average inflation in a permissive audit setting (no sample size for queries provided in excerpt).

high positive Token Inflation: How Dishonest Providers Can Overcharge for ... percent over-reporting of hidden reasoning token usage
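
The $1,569 figure in the claim above is consistent with the 1,469% average inflation reported here. A minimal sketch of that arithmetic follows, assuming the bill scales linearly with the reported token count and that the inflated usage accounts for the whole bill (both are simplifying assumptions, not statements from the paper). The function name `inflated_bill` is hypothetical, and applying the 50.85% figure to a $100 bill is an illustration only.

```python
def inflated_bill(honest_bill, inflation_pct):
    """Bill after reported usage is inflated by inflation_pct percent.

    Assumes cost is proportional to reported tokens (an assumption).
    """
    return honest_bill * (1 + inflation_pct / 100)


# Most permissive setting: 1,469% average inflation on a $100 honest bill.
print(round(inflated_bill(100, 1469), 2))

# Tokenization ambiguity alone: 50.85% over-reporting on the same bill.
print(round(inflated_bill(100, 50.85), 2))
```

Under these assumptions the first call gives about $1,569 and the second about $150.85.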

We study three recent token auditing frameworks and show that a provider with ordinary commercial capabilities can systematically inflate billed token counts.

Empirical/analytical evaluation of three token-auditing frameworks studied by the authors; adversarial provider simulation/experiment (paper states three frameworks were studied).

high positive Token Inflation: How Dishonest Providers Can Overcharge for ... ability to inflate billed token counts (systematic over-reporting)

Per-token billing is now the standard pricing model for commercial large language models (LLMs).

Author assertion about prevailing commercial pricing practices (no empirical sample or citation provided in excerpt).

high positive Token Inflation: How Dishonest Providers Can Overcharge for ... pricing model (per-token adoption)

We discuss implications for Information Systems (IS) design and propose future field evaluations.

Paper includes a discussion section outlining IS design implications and suggestions for future empirical/field work.

high positive Multi Agent Systems In The Lean Startup Cycle: Operationalis... proposed implications and future research directions

The approach preserves statistical rigour, traceability, and nuanced Persevere/Iterate decisions when accelerating experimentation.

Reported outcomes of controlled simulations and description of system design that enforces statistical procedures and logging; stated in manuscript as findings.

high positive Multi Agent Systems In The Lean Startup Cycle: Operationalis... statistical rigour, traceability, and decision quality in experimentation (Perse...

Logs render capabilities observable at the feature level, turning 'agentic AI' into a disciplined experimentation infrastructure rather than a generic assistant.

Implementation logs and descriptions from the Node.js instantiation reported in the paper; qualitative claim about observability and traceability at the feature level.

high positive Multi Agent Systems In The Lean Startup Cycle: Operationalis... feature-level observability/traceability of experimentation activities

The Multi Agent System reduces time-to-validated-learning by roughly an order of magnitude while preserving statistical rigour, traceability, and nuanced Persevere/Iterate decisions.

Results from the controlled simulations reported in the paper (comparison between agentic multi-agent system and manual B-M-L cycles).

high positive Multi Agent Systems In The Lean Startup Cycle: Operationalis... time-to-validated-learning (and preservation of statistical rigour, traceability...

Controlled simulations compare agentic and manual B-M-L cycles on feature ideas.

Reported controlled simulation experiments in the paper comparing agentic (multi-agent) and manual B-M-L cycles; methodological description present in manuscript.

high positive Multi Agent Systems In The Lean Startup Cycle: Operationalis... comparison of agentic vs manual B-M-L cycles (experimentation performance metric...

We instantiate them in a Node.js package instrumenting a production-grade SaaS codebase.

Implementation artifact reported in the paper (Node.js package) and description of instrumentation on a production-grade SaaS codebase.

high positive Multi Agent Systems In The Lean Startup Cycle: Operationalis... existence and instantiation of a Node.js package that instruments a SaaS codebas...

Drawing on the Dynamic Capabilities View, we derive fifteen meta-requirements and thirty-three design principles (consolidated into seven goal-directed groups) for sensing, seizing, reconfiguring, orchestration, and governance.

Design-theory derivation reported in the paper (counts of meta-requirements and design principles are stated in the manuscript).

high positive Multi Agent Systems In The Lean Startup Cycle: Operationalis... number and organization of derived meta-requirements and design principles

We propose a multi-agent artefact that operationalises the Build–Measure–Learn (B-M-L) cycle as a closed-loop control system.

Design science study described in the paper; conceptual derivation and artifact instantiation (Node.js package) reported in the manuscript.

high positive Multi Agent Systems In The Lean Startup Cycle: Operationalis... operationalisation of the Build–Measure–Learn cycle as a closed-loop control sys...

The review synthesizes fragmented evidence and links AI use to SME performance improvements, while outlining directions for future research on sustainable AI adoption.

Self-description of the article's contribution based on the authors' focused literature review (2016-2024).

high positive The Role of Artificial Intelligence in Strengthening Financi... synthesis quality and linkage of AI to performance improvements

Cloud-based AI solutions, targeted employee training, and explainable AI are identified strategies to overcome AI adoption challenges in SMEs.

Recommendations synthesized from the reviewed literature (2016-2024); presented as enabling strategies rather than results from a single empirical intervention.

high positive The Role of Artificial Intelligence in Strengthening Financi... effectiveness of strategies for enabling AI adoption

AI supports more data-driven financial planning for SMEs.

Identified across the reviewed empirical and conceptual studies in the 2016-2024 literature (synthesis rather than new empirical estimate).

high positive The Role of Artificial Intelligence in Strengthening Financi... use of data-driven methods in financial planning

AI enables real-time fraud detection for SMEs.

Synthesis of empirical and conceptual literature reporting AI applications in fraud detection (review-level claim; no aggregated quantitative effect provided).

high positive The Role of Artificial Intelligence in Strengthening Financi... timeliness and effectiveness of fraud detection

AI enables more accurate credit risk assessment for SMEs.

Review synthesizing studies on credit scoring and risk assessment within the 2016-2024 corpus (no single pooled sample size or unified effect estimate provided).

high positive The Role of Artificial Intelligence in Strengthening Financi... credit risk assessment accuracy

AI improves cash flow and financial forecasting for SMEs.

Synthesis of empirical studies and conceptual papers in the 2016-2024 literature reviewed (review article does not report primary sample sizes/effect estimates).

high positive The Role of Artificial Intelligence in Strengthening Financi... cash flow and financial forecasting accuracy

AI offers strong potential to enhance the financial stability and growth of SMEs when supported by suitable organizational capacities and governance.

Focused review of high-quality research (2016-2024) synthesizing empirical and conceptual studies on AI applications in SME finance (no single-sample primary data reported).

high positive The Role of Artificial Intelligence in Strengthening Financi... financial stability and growth of SMEs

Regulatory divergence across the European Union, United States, and China has moved AI governance from a compliance function to a strategy-shaping constraint.

Framing statement in the paper's introduction arguing that cross-jurisdiction regulatory divergence elevates governance to a strategic constraint; presented as contextual motivation rather than tested causal finding.

high positive Research on the adaptation path of corporate strategy based ... role_of_AI_governance_in_corporate_strategy

Firms with higher governance exposure and AI maturity exhibit more advanced, multi-dimensional adaptation across regulatory environments.

Paper conclusion synthesizing regression and index results linking governance exposure and AI maturity to adaptation intensity and configuration.

high positive Research on the adaptation path of corporate strategy based ... adaptation_intensity_and_configuration

Governance exposure significantly predicted all adaptation indices (β = 0.35–0.47, R² = 0.29–0.41, all p ≤ 0.004).

Reported regression results: 'Regression showed governance exposure significantly predicted all adaptation indices (β = 0.35–0.47, R² = 0.29–0.41, all p ≤ 0.004)'.

high positive Research on the adaptation path of corporate strategy based ... adaptation_indices (composite measures)

Compartmentalization scores were highest for tri-jurisdictional organizations (0.82 ± 0.05).

Reported composite-index results: 'compartmentalization scores were highest for tri-jurisdictional organizations (0.82 ± 0.05).'

high positive Research on the adaptation path of corporate strategy based ... compartmentalization_score

Ethical signaling intensity was greatest in tech firms (0.82 ± 0.04).

Reported composite-index results by sector: 'Ethical signaling intensity was greatest in tech firms (0.82 ± 0.04).'

high positive Research on the adaptation path of corporate strategy based ... ethical_signaling_score

Modularity peaked in multinational corporations (0.86 ± 0.04).

Reported composite-index results: 'modularity peaked in multinational corporations (0.86 ± 0.04).'

high positive Research on the adaptation path of corporate strategy based ... modularity_score
