Evidence (14922 claims)

Search and filter individual claims pulled from the papers. If you are looking for a specific finding ("what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other instead, use the Evidence Explorer.

The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).

Browse by theme

Nine broad, paper-level topics. Click one to filter the claims below.

Human-AI Collaboration

Claims by outcome category

Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.

Outcome	Positive	Negative	Mixed	Null	Total
Other	795	210	105	955	2131
Governance & Regulation	886	414	197	126	1654
Organizational Efficiency	826	204	129	87	1257
Technology Adoption Rate	681	259	128	110	1189
Research Productivity	464	138	65	349	1028
Output Quality	503	196	61	53	813
Decision Quality	351	180	84	51	673
AI Safety & Ethics	238	288	71	34	637
Firm Productivity	455	58	92	20	631
Market Structure	186	172	123	25	511
Task Allocation	222	70	76	34	407
Innovation Output	238	28	48	18	334
Skill Acquisition	177	62	62	17	318
Employment Level	107	57	108	13	287
Fiscal & Macroeconomic	135	72	44	26	284
Firm Revenue	172	50	28	5	256
Consumer Welfare	121	68	45	12	246
Task Completion Time	183	33	10	13	240
Inequality Measures	45	126	50	6	227
Worker Satisfaction	95	74	23	12	204
Error Rate	77	98	11	4	190
Regulatory Compliance	84	73	17	7	181
Automation Exposure	61	61	27	14	166
Training Effectiveness	98	21	14	19	154
Wages & Compensation	78	37	25	6	146
Developer Productivity	105	18	14	6	144
Team Performance	87	17	28	10	143
Job Displacement	12	83	23	1	119
Hiring & Recruitment	53	8	8	3	72
Social Protection	39	17	8	2	66
Creative Output	32	20	8	3	64
Skill Obsolescence	5	50	6	1	62
Labor Share of Income	17	20	17	—	54
Worker Turnover	15	15	—	3	33
Industry	—	—	—	1	1
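
The direction counts can be turned into shares for quick comparison. The sketch below is a minimal, illustrative Python example, not the site's own code: it parses a few rows copied from the table (tab-separated, with a dash treated as zero, which is an assumption) and computes the positive share of each row's Total. It also reports the gap between Total and the sum of the four direction columns, since the two do not always agree: for "Other" the columns sum to 2065 against a Total of 2131. The table does not say what accounts for that gap. The helper names are hypothetical.

```python
# Illustrative only: rows copied from the table above; helper names are hypothetical.
ROWS = """\
Other\t795\t210\t105\t955\t2131
Wages & Compensation\t78\t37\t25\t6\t146
Job Displacement\t12\t83\t23\t1\t119
Labor Share of Income\t17\t20\t17\t—\t54
"""

def parse_count(cell: str) -> int:
    # Assumption: a dash (or empty cell) means "no claims", i.e. zero.
    return 0 if cell.strip() in {"—", "-", ""} else int(cell)

def summarize(text: str) -> dict:
    out = {}
    for line in text.strip().splitlines():
        name, *cells = line.split("\t")
        pos, neg, mixed, null, total = (parse_count(c) for c in cells)
        out[name] = {
            "positive_share": pos / total,
            # Claims counted in Total but in none of the four direction columns.
            "gap": total - (pos + neg + mixed + null),
        }
    return out

summary = summarize(ROWS)
for name, stats in summary.items():
    print(f"{name}: {stats['positive_share']:.1%} positive, "
          f"{stats['gap']} outside the four direction columns")
```

Dividing by Total rather than by the sum of the four columns is a design choice; either denominator is defensible as long as it is stated.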

Over 2025 year-to-date (through 2025-08-01), the system achieved a Sharpe ratio of 1.40 +/- 0.22 across 20 random seeds.

Backtest/performance claim: reported Sharpe ratio with reported uncertainty and a sample size of 20 seeds; time window specified as 2025 YTD through 2025-08-01. No further details on portfolio construction, leverage, transaction costs, or benchmark adjustment provided in the excerpt.

medium positive Can Blindfolded LLMs Still Trade? An Anonymization-First Fra... Sharpe ratio (mean and +/- presumably standard error or standard deviation) over...
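
To make "1.40 +/- 0.22 across 20 random seeds" concrete, here is a minimal sketch of how a per-seed Sharpe ratio and its spread might be summarized. Everything in it is an assumption for illustration: the daily returns are synthetic, the risk-free rate is taken as zero, the 252-day annualization is a common convention rather than something the excerpt states, and the function names are hypothetical. It prints both the standard deviation and the standard error, since the excerpt does not say which one the +/- refers to.

```python
import math
import random
import statistics

TRADING_DAYS = 252  # assumed annualization factor

def annualized_sharpe(daily_returns):
    """Mean over sample std of daily returns, scaled to a year (risk-free rate assumed 0)."""
    mean = statistics.fmean(daily_returns)
    std = statistics.stdev(daily_returns)
    return mean / std * math.sqrt(TRADING_DAYS)

def sharpe_across_seeds(n_seeds=20, n_days=150):
    sharpes = []
    for seed in range(n_seeds):
        rng = random.Random(seed)
        # Synthetic placeholder returns, not data from the paper.
        returns = [rng.gauss(0.0008, 0.01) for _ in range(n_days)]
        sharpes.append(annualized_sharpe(returns))
    return sharpes

sharpes = sharpe_across_seeds()
mean_sharpe = statistics.fmean(sharpes)
sd = statistics.stdev(sharpes)          # spread across seeds
se = sd / math.sqrt(len(sharpes))       # uncertainty of the mean across seeds
print(f"Sharpe {mean_sharpe:.2f}, SD {sd:.2f}, SE {se:.2f} over {len(sharpes)} seeds")
```

With 20 seeds the standard error is about 4.5 times smaller than the standard deviation, so which of the two is reported changes how the +/- should be read.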

Regulatory sandboxes offer a flexible and innovation-friendly governance model compared to traditional command-and-control mechanisms.

Normative and comparative analysis within a law & economics framework; no empirical performance data reported in the abstract.

medium positive Experimentalism beyond ex ante regulation: A law and economi... flexibility of governance and degree of innovation-friendliness

Comparative insights from FinTech identify the institutional design features necessary to ensure the effectiveness and resilience of regulatory sandboxes.

Comparative case-based reasoning drawing on FinTech regulatory sandbox experience (abstract does not report number or selection of cases).

medium positive Experimentalism beyond ex ante regulation: A law and economi... presence and performance of institutional design features (effectiveness/resilie...

AI regulatory sandboxes may correct specific government failures, including regulatory capture, rent-seeking, and knowledge gaps.

Analytical claims supported by comparative reasoning (FinTech examples) and economic analysis of government failure; no empirical testing or sample size reported in the abstract.

medium positive Experimentalism beyond ex ante regulation: A law and economi... incidence/severity of government failures such as regulatory capture, rent-seeki...

AI regulatory sandboxes facilitate iterative regulatory learning while promoting responsible AI innovation.

Theoretical argument using experimentalist governance concepts and law & economics reasoning; comparative insights referenced but no empirical sample detailed in the abstract.

medium positive Experimentalism beyond ex ante regulation: A law and economi... degree of regulatory learning and indicators of responsible AI innovation

AI regulatory sandboxes can reduce negative externalities associated with AI deployment.

Conceptual and economic analysis in the paper (no empirical quantification or sample size reported in the abstract).

medium positive Experimentalism beyond ex ante regulation: A law and economi... magnitude/frequency of negative externalities (e.g., harms from AI systems)

AI regulatory sandboxes can mitigate information asymmetries between regulators and firms.

Analytical application of an economic analysis of law framework; theoretical argumentation rather than reported empirical measurement in the abstract.

medium positive Experimentalism beyond ex ante regulation: A law and economi... level of information asymmetry between regulators and AI firms

A well-established legal framework for data privacy (e.g., PIPL) enhances the benefits of big data for corporate performance.

Inference drawn from the observed stronger positive big-data effect on firm value after PIPL implementation, as reported by the paper's moderation analysis.

medium positive How Big Data Enhances Firm Value Under Data Privacy Regulati... firm performance / firm value

Robust sensitivity tests confirm the main findings, indicating that the results are not driven by model specification or sample selection.

Paper reports multiple robustness/sensitivity checks (unspecified in summary) that the authors state produce consistent results supporting the primary conclusions.

medium positive How Big Data Enhances Firm Value Under Data Privacy Regulati... firm value

The positive impact of big data on firm performance is strengthened following the implementation of China's Personal Information Protection Law (PIPL).

Moderation/interacted-specification analysis in the paper comparing pre- and post-PIPL periods (or interacting big-data measure with a PIPL indicator), showing a larger positive effect on firm value after PIPL implementation.

medium positive How Big Data Enhances Firm Value Under Data Privacy Regulati... firm value / firm performance

The positive effect of big data on firm value operates through improving operational efficiency and reducing costs.

Mechanism analysis reported in the paper indicating mediation/channel tests where big data adoption is associated with measures of operational efficiency and cost reductions, which in turn relate to higher firm value.

medium positive How Big Data Enhances Firm Value Under Data Privacy Regulati... operational efficiency; operating costs; firm value

Big data application significantly improves firm value.

Results from fixed-effects regressions on the 2007–2021 panel showing a statistically significant positive coefficient for the big-data keyword-frequency measure on firm value (paper reports significance and effect direction).

medium positive How Big Data Enhances Firm Value Under Data Privacy Regulati... firm value

It is optimal to start taxing AI when cognitive workers start to consider switching to manual jobs.

Analytical result derived from the extended dynamic taxation model and its comparative-static/optimal-policy analysis; the timing rule for introducing an AI tax follows from the model's equilibrium conditions and welfare optimization.

medium positive Workers' Incentives and the Optimal Taxation of AI optimal timing of initiating taxation on AI (triggered by cognitive workers' inc...

The model implies testable governance diagnostics linking latent fragility to observable patterns: recorded dissent (anonymous vs. formal voting gaps), scenario-set diversity, pipeline and method concentration, and anchor lag.

Theoretical mapping from model primitives and observable quantities to proposed diagnostics; the paper enumerates observable patterns that should correlate with model-implied fragility. This is a theoretical implication rather than an empirically validated claim.

medium positive Cohesion as Concentration: Exclusion-Driven Fragility in Fin... observable diagnostics (recorded dissent patterns, voting gaps, scenario diversi...

The clearest added value of AI over structured self-reflection lies in increasing felt accountability.

Based on RCT comparisons showing no significant AI advantage over the written-reflection questionnaire on overall goal progress, but showing higher perceived social accountability in the AI condition and a significant mediation of the AI effect on progress via perceived accountability (indirect effect = 0.15, 95% CI [0.04, 0.31]).

medium positive AI-Assisted Goal Setting Improves Goal Progress Through Soci... perceived social accountability and resulting goal progress

AI-assisted goal setting can improve short-term (two-week) goal progress.

Aggregate interpretation based on the RCT finding that the AI condition outperformed the no-support control on two-week goal progress (d = 0.33, p = .016); two-week follow-up window specified in study.

medium positive AI-Assisted Goal Setting Improves Goal Progress Through Soci... short-term goal progress (self-reported at two weeks)
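
The d = 0.33 above is a standardized mean difference. As a minimal sketch of how such a value is computed, the example below implements Cohen's d with a pooled standard deviation. That the study used this exact variant is an assumption (the excerpt does not say), the scores are made up, and the function name is hypothetical.

```python
import math
import statistics

def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    v1, v2 = statistics.variance(treatment), statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.fmean(treatment) - statistics.fmean(control)) / pooled_sd

# Made-up goal-progress scores, not data from the study.
ai_condition = [5, 6, 7, 8, 9]
control = [4, 5, 6, 7, 8]
d = cohens_d(ai_condition, control)
print(f"d = {d:.2f}")
```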

The AI increased perceived social accountability relative to the written-reflection questionnaire.

Reported comparison from the RCT showing higher perceived social accountability in the AI condition versus the written-reflection condition; measured via self-report scales at follow-up (exact scale and statistics reported in paper).

medium positive AI-Assisted Goal Setting Improves Goal Progress Through Soci... perceived social accountability (self-report)

JobMatchAI provides factor-wise explanations through resume-driven search workflows.

Paper states that the system gives factor-wise explanations and ties them to resume-driven workflows; the excerpt references interpretable reranking and demo artifacts but does not include user study or explanation-faithfulness metrics.

medium positive JobMatchAI An Intelligent Job Matching Platform Using Knowle... explainability: factor-wise explanations presented to users within resume-driven...

JobMatchAI optimizes utility across skill fit, experience, location, salary, and company preferences.

Paper claims the system's objective/utility function includes these factors and that the reranking/optimization accounts for them. No optimization algorithm details, weighting, or empirical utility gains are given in the excerpt.

medium positive JobMatchAI An Intelligent Job Matching Platform Using Knowle... aggregate utility across factors: skill fit, experience, location, salary, compa...

JobMatchAI is production-ready.

Paper explicitly describes JobMatchAI as "production-ready" and also claims a hosted website and installable package (artifacts consistent with deployment readiness). No formal certification, deployment metrics, or uptime/performance SLAs are provided in the excerpt.

medium positive JobMatchAI An Intelligent Job Matching Platform Using Knowle... production readiness (availability of deployable artifacts such as hosted site a...

For AI agent tool design, surfacing contextual information outperforms prescribing procedural workflows.

Authors' conclusion drawn from the suite of experiments (GraphRAG vs TDD prompting vs auto-improvement) showing better regression reduction and/or resolution when contextual information is surfaced.

medium positive TDAD: Test-Driven Agentic Development - Reducing Code Regres... effectiveness in reducing regressions and improving resolution when using contex...

An autonomous auto-improvement loop raised resolution from 12% to 60% on a 10-instance subset with 0% regression.

Reported experiment on a 10-instance subset where an auto-improvement loop was applied (numbers provided in the excerpt).

medium positive TDAD: Test-Driven Agentic Development - Reducing Code Regres... resolution rate (increase from 12% to 60%) and regression rate (reported as 0%) ...

Smaller models benefit more from contextual information (which tests to verify) than from procedural instructions (how to do TDD).

Inferred from comparative results across models (Qwen3-Coder 30B vs Qwen3.5-35B-A3B) and interventions (contextual test-surfacing vs TDD prompting) reported in the paper.

medium positive TDAD: Test-Driven Agentic Development - Reducing Code Regres... relative improvement in regression rate and resolution when providing contextual...

When deployed as an agent skill, GraphRAG improved resolution from 24% to 32%.

Empirical comparison reported in the evaluation on SWE-bench Verified (same experimental context as above).

medium positive TDAD: Test-Driven Agentic Development - Reducing Code Regres... resolution rate (percentage of issues/problems resolved)

TDAD's GraphRAG workflow reduced test-level regressions by 70% (from 6.08% to 1.82%).

Empirical result reported from the SWE-bench Verified evaluation using the GraphRAG workflow (sample details: Qwen3-Coder 30B on 100 instances and Qwen3.5-35B-A3B on 25 instances as reported).

medium positive TDAD: Test-Driven Agentic Development - Reducing Code Regres... test-level regression rate (percentage of tests that regressed)
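
The "70%" in this claim is a relative reduction, not a percentage-point change. A quick check of the arithmetic, using only the two rates quoted above (the function name is hypothetical):

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which `after` is lower than `before`."""
    return (before - after) / before

before_pct, after_pct = 6.08, 1.82  # test-level regression rates quoted in the claim
rel = relative_reduction(before_pct, after_pct)
point_drop = before_pct - after_pct
print(f"relative reduction: {rel:.1%}; absolute drop: {point_drop:.2f} percentage points")
```

The relative figure comes out at roughly 70%, while the absolute drop is 4.26 percentage points.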

Partial validation against observed AIS vessel behavior shows PIER is consistent with the fastest real transits while exhibiting 23.1× lower variance.

Comparison between PIER trajectories and observed fastest transits in AIS data (details in paper); reported relative variance reduction of 23.1×.

medium positive Physics-informed offline reinforcement learning eliminates c... variance of transit times or fuel use compared to fastest observed AIS transits

PIER eliminates catastrophic fuel waste: great-circle routing produces extreme fuel consumption (>1.5× median) in 4.8% of voyages, while PIER reduces this to 0.5% (a 9-fold reduction).

Analysis on the same 2023 AIS validation dataset across seven Gulf of Mexico routes (840 episodes per method) comparing distribution tails of voyage fuel consumption; reported incidence rates (4.8% vs 0.5%).

medium positive Physics-informed offline reinforcement learning eliminates c... fraction of voyages with fuel consumption >1.5× median

PIER reduces mean CO2 emissions by 10% relative to great-circle routing.

Offline evaluation using physics‑calibrated environments grounded in historical AIS data and ocean reanalysis products; validation on one full year (2023) of AIS across seven Gulf of Mexico routes with 840 episodes per method; reported mean reduction of 10% and bootstrap 95% CI for mean savings [2.9%, 15.7%].

medium positive Physics-informed offline reinforcement learning eliminates c... mean CO2 emissions per voyage (percent reduction vs great-circle routing)
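
The interval quoted above is a bootstrap confidence interval for mean savings. As a rough illustration of the idea, the sketch below computes a percentile bootstrap interval for a mean. The excerpt does not say which bootstrap variant the authors used, so the percentile method is an assumption; the per-episode savings are synthetic placeholders and the function name is hypothetical.

```python
import random
import statistics

def bootstrap_mean_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement, record each resample's mean, then sort the means.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo_idx = round((alpha / 2) * n_resamples)
    hi_idx = round((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Synthetic per-episode savings in percent (placeholder, not the paper's data).
rng = random.Random(42)
savings = [rng.gauss(10.0, 30.0) for _ in range(840)]
low, high = bootstrap_mean_ci(savings)
print(f"mean {statistics.fmean(savings):.1f}%, 95% CI [{low:.1f}%, {high:.1f}%]")
```

An interval that excludes zero, as the reported [2.9%, 15.7%] does, indicates that the mean saving is unlikely to be a resampling artifact, though it says nothing about individual voyages.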

The system is in production at Personize.ai.

Deployment statement in the paper asserting production use at Personize.ai.

medium positive Governed Memory: A Production Architecture for Multi-Agent W... deployment status (production at Personize.ai)

The LoCoMo result confirms that governance and schema enforcement impose no retrieval quality penalty.

Interpretation in the paper linking LoCoMo benchmark accuracy (74.8%) to the conclusion that governance/schema enforcement did not degrade retrieval quality.

medium positive Governed Memory: A Production Architecture for Multi-Agent W... inferred retrieval quality impact of governance/schema enforcement (no penalty)

Governed Memory implements a closed-loop schema lifecycle with AI-assisted authoring and automated per-property refinement.

Design description in the paper describing the closed-loop schema lifecycle and AI-assisted authoring/refinement.

medium positive Governed Memory: A Production Architecture for Multi-Agent W... schema lifecycle process including AI-assisted authoring and per-property refine...

Governed Memory uses reflection-bounded retrieval with entity-scoped isolation.

Design description in the paper specifying reflection-bounded retrieval and entity-scoped isolation.

medium positive Governed Memory: A Production Architecture for Multi-Agent W... retrieval strategy (reflection-bounded) and isolation scope (entity-scoped)

Governed Memory uses tiered governance routing with progressive context delivery.

Design description in the paper listing tiered governance routing and progressive delivery as mechanisms.

medium positive Governed Memory: A Production Architecture for Multi-Agent W... governance routing strategy (tiered) and context delivery method (progressive)

Governed Memory implements a dual memory model combining open-set atomic facts with schema-enforced typed properties.

Design specification within the paper describing the dual memory model (architectural mechanism).

medium positive Governed Memory: A Production Architecture for Multi-Agent W... memory model design: open-set atomic facts + schema-enforced typed properties

The paper presents Governed Memory, a shared memory and governance layer addressing the memory governance gap.

System architecture and design description in the paper (proposal of a shared memory and governance layer).

medium positive Governed Memory: A Production Architecture for Multi-Agent W... existence of an architecture called Governed Memory

The results confirm the positive impact of cognitive technologies on the development of entrepreneurial opportunities and innovative activity.

Conclusion drawn from the positive estimated association (0.33 coefficient) and the observed increases in the indices between 2020 and 2024 reported in the paper.

medium positive Innovative Cognitive Tools for Studying Market Opportunities... entrepreneurial opportunities and innovation activity (proxied by the Market Opp...

The Cognitive Tools Index and the Market Opportunity Index were -0.42 and -0.35 in 2020 and 0.94 and 0.92 in 2024, respectively.

Reported observed/computed index values for the years 2020 and 2024 in the study (data source and aggregation method not detailed in the excerpt).

medium positive Innovative Cognitive Tools for Studying Market Opportunities... Cognitive Tools Index and Market Opportunity Index (yearly values for 2020 and 2...

The empirical study for 2020–2024 showed that a one standard unit increase in the Cognitive Tools Index is associated with an average 0.33 increase in the Market Opportunity Index.

Estimated coefficient reported from the panel econometric model over 2020–2024 (model included lags and used instrumental approach; sample size and standard errors not provided in the excerpt).

medium positive Innovative Cognitive Tools for Studying Market Opportunities... Market Opportunity Index (effect of one standard unit change in Cognitive Tools ...

Pidgin significantly outperformed standard English on measures of knowledge transfer across agriculture, education, and health domains.

Aggregate analysis of questionnaire comprehension items (44-item instrument) across domain-specific modules administered to 45 participants; comparative language-performance results reported in study.

medium positive From Linguistic Hybridity to Development Sovereignty: Pidgin... domain-specific comprehension / knowledge transfer

Volunteers who used proverbs and vernacular registers were incorporated into local kinship structures, granted traditional titles, and perceived as legitimate development actors rather than outsiders.

Qualitative evidence from participant observation and discourse samples collected during fieldwork; interview and questionnaire items on perceptions of volunteer legitimacy and social integration.

medium positive From Linguistic Hybridity to Development Sovereignty: Pidgin... social integration indicators (kinship incorporation, traditional titles, percei...

Agricultural techniques taught in Pidgin were nearly universally adopted by recipients.

Self-reported adoption/behavior-change items in the 44-item questionnaire and corroborating qualitative observation of agricultural practice among participants in the sample (N = 45).

medium positive From Linguistic Hybridity to Development Sovereignty: Pidgin... adoption of agricultural innovations / reported behavior change

Pidgin-mediated interventions achieved large comprehension gains on health messaging, exceeding 30 percentage points compared with standard English.

Quantitative comparison derived from the 44-item field questionnaire (comprehension items) administered to the 45-participant sample; reported percentage-point difference (>30 pp) in health-message comprehension by language of instruction.

medium positive From Linguistic Hybridity to Development Sovereignty: Pidgin... health-message comprehension (percentage-point gain)

Using Cameroon Pidgin English as the primary medium for Peace Corps development work produced substantially better knowledge transfer, uptake, and social legitimacy than standard English.

Mixed-methods field study of Peace Corps interventions in Cameroon's Northwest: 44-item questionnaire administered to 45 participants across agriculture, education, and health; quantitative measures of comprehension and self-reported adoption; supplemented by qualitative observation and discourse samples.

medium positive From Linguistic Hybridity to Development Sovereignty: Pidgin... knowledge transfer (comprehension), behavioral uptake/adoption, social legitimac...

A hybrid strategic–computational framework, supported by governance mechanisms (human-in-the-loop checkpoints, escalation paths, accountability structures), is proposed to manage tensions and ensure responsible decision-making in AI-rich managerial contexts.

Synthesis-driven prescriptive framework produced by cross-framework analysis; conceptual recommendation rather than implementation evidence.

medium positive Comparative analysis of strategic vs. computational thinking... presence and effectiveness of hybrid governance mechanisms in managing human–alg...

Roles oriented to information processing, optimisation, and operational precision (monitor, disseminator, resource allocator) are substantially enhanced by computational thinking (automation, optimisation, algorithmic decision-support).

Theoretical mapping of computational capabilities onto Mintzberg’s information-processing roles; conceptual reasoning without empirical validation.

medium positive Comparative analysis of strategic vs. computational thinking... enhancement in information-processing tasks (accuracy, speed, automation potenti...

AI adoption will shift fact-checking tasks (more monitoring, less rote verification), creating a need for reskilling and new roles (AI tool operators, analysts); donor and public investments should fund capacity building for local organizations.

Workforce implications inferred from interview reports about changing task mixes and the study's interpretive recommendations.

medium positive Fact-Checking Platforms in the Middle East: A Comparative St... changes in task allocation, workforce skills, and need for capacity-building inv...

Investments should prioritize hybrid models where automation provides scale and humans handle contextual, adversarial, and legally sensitive judgments.

Recommendation based on interview findings about AI benefits and limitations and the study's interpretive synthesis.

medium positive Fact-Checking Platforms in the Middle East: A Comparative St... verification effectiveness and error mitigation in workflows

The study distills context-sensitive best practices for fact-checking in restrictive environments, including safety protocols, local partnerships, and hybrid verification workflows.

Synthesis of findings from document analysis and interviews producing a set of recommended practices documented in the study's outputs.

medium positive Fact-Checking Platforms in the Middle East: A Comparative St... recommended operational practices for safety and verification effectiveness

AI can lower verification costs and scale reach by automating tasks such as classification, clustering, alerting, and translation.

Interview reports from platform staff and interpretive analysis identifying AI-assisted use cases for prioritization, monitoring, and translation.

medium positive Fact-Checking Platforms in the Middle East: A Comparative St... verification cost/time and monitoring/translation capacity

Community reporting and audience-focused formats are used to improve engagement.

Platform outputs and staff interviews describing deployment of community-reporting mechanisms and tailored audience formats.

medium positive Fact-Checking Platforms in the Middle East: A Comparative St... audience engagement
