Evidence (14922 claims)

Search and filter individual claims pulled from the papers. If you are looking for a specific finding ("what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other instead, use the Evidence Explorer.

The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).

Browse by theme

Nine broad, paper-level topics. Click one to filter the claims below.

Human-AI Collaboration

Claims by outcome category

Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.

Outcome	Positive	Negative	Mixed	Null	Total
Other	795	210	105	955	2131
Governance & Regulation	886	414	197	126	1654
Organizational Efficiency	826	204	129	87	1257
Technology Adoption Rate	681	259	128	110	1189
Research Productivity	464	138	65	349	1028
Output Quality	503	196	61	53	813
Decision Quality	351	180	84	51	673
AI Safety & Ethics	238	288	71	34	637
Firm Productivity	455	58	92	20	631
Market Structure	186	172	123	25	511
Task Allocation	222	70	76	34	407
Innovation Output	238	28	48	18	334
Skill Acquisition	177	62	62	17	318
Employment Level	107	57	108	13	287
Fiscal & Macroeconomic	135	72	44	26	284
Firm Revenue	172	50	28	5	256
Consumer Welfare	121	68	45	12	246
Task Completion Time	183	33	10	13	240
Inequality Measures	45	126	50	6	227
Worker Satisfaction	95	74	23	12	204
Error Rate	77	98	11	4	190
Regulatory Compliance	84	73	17	7	181
Automation Exposure	61	61	27	14	166
Training Effectiveness	98	21	14	19	154
Wages & Compensation	78	37	25	6	146
Developer Productivity	105	18	14	6	144
Team Performance	87	17	28	10	143
Job Displacement	12	83	23	1	119
Hiring & Recruitment	53	8	8	3	72
Social Protection	39	17	8	2	66
Creative Output	32	20	8	3	64
Skill Obsolescence	5	50	6	1	62
Labor Share of Income	17	20	17	—	54
Worker Turnover	15	15	—	3	33
Industry	—	—	—	1	1
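
As a rough illustration of how the direction counts above can be read, the sketch below computes the share of positive and negative findings for three rows copied from the table. The helper names (parse_count, direction_shares) are hypothetical, and treating a dash as zero is an assumption. For some rows (for example, Other) the four direction columns do not add up to the Total column, so the shares here are taken over the four listed directions only.

```python
# Three rows copied verbatim from the table above (tab-separated).
ROWS = """\
Wages & Compensation\t78\t37\t25\t6\t146
Job Displacement\t12\t83\t23\t1\t119
Labor Share of Income\t17\t20\t17\t—\t54
"""


def parse_count(cell):
    """Convert a table cell to an int; a dash is assumed to mean zero."""
    return 0 if cell.strip() == "—" else int(cell)


def direction_shares(table_text):
    """Return positive/negative shares per outcome over the four listed directions."""
    shares = {}
    for line in table_text.strip().splitlines():
        outcome, *cells = line.split("\t")
        positive, negative, mixed, null = (parse_count(c) for c in cells[:4])
        listed = positive + negative + mixed + null
        shares[outcome] = {
            "positive": positive / listed,
            "negative": negative / listed,
        }
    return shares


shares = direction_shares(ROWS)
for outcome, s in shares.items():
    print(f"{outcome}: {s['positive']:.0%} positive, {s['negative']:.0%} negative")
```

Shares like these only describe how many extracted claims point in each direction. They say nothing about effect sizes or study quality.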

The study theoretically extends workforce integration and social inclusion frameworks by explicitly incorporating language access mechanisms.

Authors assert theoretical contribution based on empirical findings linking translation access to labor-market integration, discussed in the paper's theoretical framing and implications sections.

medium positive Translation Models Empowering Immigrant Workforce Integratio... theoretical frameworks (inclusion of language access mechanisms)

The research is novel in performing a comparative, multi-model evaluation of translation methods within a single labor market context, providing empirical evidence previously unavailable in the literature.

Study design explicitly compares professional, AI-assisted, and hybrid models using combined quantitative and qualitative methods within specified U.S. cities; the paper frames this comparative, single-market approach as filling a literature gap.

medium positive Translation Models Empowering Immigrant Workforce Integratio... methodological contribution / novelty (comparative evaluation across translation...

Hybrid translation models produced approximately 20% higher retention rates relative to conventional methods.

Reported comparative retention-rate analysis from the study's quantitative dataset (survey of 150 LEP immigrants and placement/retention tracking) analyzed in SPSS v28.

medium positive Translation Models Empowering Immigrant Workforce Integratio... retention rate (worker retention over measured period)

Hybrid human–AI translation models achieved up to 40% greater accuracy in job placement compared to conventional translation methods.

Comparative quantitative evaluation reported in the study comparing placement accuracy across translation models (professional, AI-assisted, hybrid) using survey outcomes and placement metrics derived from the sample and analyzed in SPSS v28.

medium positive Translation Models Empowering Immigrant Workforce Integratio... job placement accuracy (percentage correct/appropriate placements)

Professional and hybrid human–AI translation services significantly enhance employment alignment, retention, and workplace satisfaction for immigrants with limited English proficiency.

Quantitative analysis of survey data (n=150 LEP immigrants) and corroborating qualitative interview data (50 employers, 20 providers) analyzed via SPSS v28 and thematic coding in NVivo 14; the paper reports statistically significant improvements attributed to professional and hybrid translation models.

medium positive Translation Models Empowering Immigrant Workforce Integratio... employment alignment (job matching), retention (job tenure/retention rates), wor...

Multi-agent systems demonstrated improved collaborative behavior when guided by standardized prompt frameworks, reducing ambiguity and enhancing synergistic task execution.

Experimental simulations of multi-agent systems employing standardized prompt frameworks, with assessments of collaborative behavior expressed as coordination coherence and synergistic task execution efficiency. (Number of agents, experimental runs, and quantitative results not specified in the provided text.)

medium positive Prompt Engineering for Autonomous AI Agents: Enhancing Decis... collaborative behavior/coordination coherence; ambiguity reduction (fewer coordi...

Well-constructed prompts significantly strengthened agents' ability to interpret complex inputs, generate context-appropriate actions, and maintain consistent performance under variable conditions.

Findings drawn from the experimental simulations comparing prompt quality (described as 'well-constructed' versus alternatives) and reporting improvements across interpretation, action-generation, and performance consistency metrics. (Details on experimental replication, sample size, and statistical significance not provided in the excerpt.)

medium positive Prompt Engineering for Autonomous AI Agents: Enhancing Decis... ability to interpret complex inputs (interpretation accuracy); generation of con...

Structured, context-rich, and strategically layered prompts improved agents’ situational awareness, reasoning accuracy, and operational adaptability.

Quantitative research design using experimental simulations where prompt structure was manipulated and agent outputs were evaluated. Performance indicators cited include response accuracy, task completion efficiency, coordination coherence, and error rates. (Paper does not report sample size or statistical values in the provided text.)

medium positive Prompt Engineering for Autonomous AI Agents: Enhancing Decis... situational awareness; reasoning accuracy; operational adaptability (measured vi...

Hierarchical verification (property, interaction, and rollout tests) confirms semantic equivalence for all five environments; cross-backend policy transfer confirms zero sim-to-sim gap for all five.

Verification methodology described in the paper: hierarchical tests (property checks, interaction tests, rollout comparisons) applied to each of the five environments, plus cross-backend policy transfer experiments showing identical behavior/performance between backends.

medium positive Automatic Generation of High-Performance RL Environments semantic equivalence measures (verification pass/fail) and sim-to-sim gap (measu...

TCGJax is the first deployable JAX Pokemon TCG engine, achieving 717K SPS for random actions and 153K SPS for PPO; 6.6x faster than the Python reference.

New environment synthesized from a web-extracted specification with throughput benchmarks for random-action and PPO modes, and a direct comparison to a Python reference implementation yielding 6.6x speedup.

medium positive Automatic Generation of High-Performance RL Environments random-action throughput (SPS), PPO throughput (SPS), speedup factor vs Python r...

The translated HalfCheetah JAX implementation outperforms Brax by 5x at matched GPU batch sizes.

Benchmarks comparing throughput of the HalfCheetah JAX translation against Brax under matched GPU batch sizes, reporting a 5x improvement.

medium positive Automatic Generation of High-Performance RL Environments throughput (speedup factor) vs Brax at matched batch sizes

PokeJAX is the first GPU-parallel Pokemon battle simulator, achieving 500M steps-per-second (SPS) for random actions and 15.2M SPS for PPO; 22,320x faster than the TypeScript reference.

Throughput benchmarks reported for PokeJAX (random-action SPS and PPO SPS) and direct comparison of SPS to a TypeScript reference implementation yielding the 22,320x factor. (Single environment: Pokemon battle simulator.)

medium positive Automatic Generation of High-Performance RL Environments random-action throughput (SPS), PPO throughput (SPS), speedup factor vs TypeScri...

EmuRust yields a 1.5x PPO speedup via Rust parallelism for a Game Boy emulator.

Benchmark comparison of PPO training/inference throughput between reference implementation and EmuRust; reported speedup factor 1.5x for PPO. (Single environment: Game Boy emulator.)

medium positive Automatic Generation of High-Performance RL Environments PPO throughput / training speed (speedup factor)

A reusable recipe (generic prompt template, hierarchical verification, iterative agent-assisted repair) produces semantically equivalent high-performance RL environments for <$10 in compute cost.

Methodological description in the paper: recipe combining prompt template, hierarchical verification, and agent-assisted repair; demonstrated by producing multiple environments with reported compute cost under $10. Empirical support comes from the set of reproduced environments (five total) and their reported build costs.

medium positive Automatic Generation of High-Performance RL Environments cost to produce high-performance environments (USD) and semantic equivalence

As AI adoption rises within companies, industries, and regions, demand for complementary skills increases even in non-AI roles.

Longitudinal/cross-sectional analysis of job postings (n ≈ 30 million, 2018–2024) with measures of AI diffusion at company, industry, and regional levels and comparisons of skill demand in non-AI roles over time and across contexts.

medium positive Complement or Substitute? How AI Increases the Demand for Hu... demand for complementary skills in non-AI roles (frequency of skill requirements...

Complementary (non-technical) skills are associated with meaningful wage premiums, particularly in managerial, sales, or finance roles working with AI.

Wage/salary analysis linked to skill requirements within the same nearly 30 million job postings dataset (2018–2024), with subgroup analysis for managerial, sales, and finance roles identified as working with AI.

medium positive Complement or Substitute? How AI Increases the Demand for Hu... wage premium associated with complementary skills (salary level differences)

The success of sustainable development is deeply tied to the responsiveness and credibility of governance systems.

Central thesis of the paper supported by synthesis of governance frameworks, SDGs, and illustrative international examples; the summary does not provide quantitative metrics or sample-based validation.

medium positive Good Governance and Sustainable Development: Pathways, Princ... overall success/achievement of sustainable development (SDG outcomes)

Governance innovations, information systems, and inclusive institutions increase the prospects of just and adaptable progress.

Illustrated with selected international examples and a conceptual synthesis against SDG and governance frameworks; no specific sample size or controlled empirical study is described in the summary.

medium positive Good Governance and Sustainable Development: Pathways, Princ... prospects of just (equitable) and adaptable (resilient) development progress

Transparency, inclusive participation, robust regulation, and the rule of law shape development outcomes across economic, social, environmental, and institutional spheres.

Conceptual analysis leveraging global governance frameworks and the Sustainable Development Goals (SDGs), supported by international examples and literature cited in the paper; no quantitative sample size or statistical analysis is reported in the summary.

medium positive Good Governance and Sustainable Development: Pathways, Princ... development outcomes across economic, social, environmental, and institutional s...

Alongside concerns, AI proliferation may introduce new, positive affordances for military decision-making organizations.

Normative/analytical claim by the author based on argumentation; no empirical demonstration, experimental results, or case-study evidence is provided in the excerpt.

medium positive AI governance for military decision-making: A proposal for m... positive affordances (benefits) from AI in military decision-making

Military AI adoption is incentivized by competitive pressures and expanding national security needs.

Author assertion based on qualitative argumentation and literature-informed reasoning; no empirical study, dataset, or sample size reported in the text.

medium positive AI governance for military decision-making: A proposal for m... level of AI adoption by military institutions (drivers of adoption)

Process-oriented skills appear in 15.6% of feasible transition pathways and emerge as the highest-leverage intervention.

Feature analysis of the 4,534 identified transitions showing process-oriented skills present in 15.6% of pathways; statement that these skills constitute the highest-leverage intervention (comparative ranking implied by analysis).

medium positive Graph-Based Analysis of AI-Driven Labor Market Transitions: ... share of feasible transition pathways that include process-oriented skills (15.6...

Eliciting probabilities (instead of forcing binary labels) enables post-hoc recalibration that improves both individual-worker and crowd-level label quality.

Methodological approach in the field experiment: comparison between binary-label interface and elicited-probability interface, followed by linear-in-log-odds recalibration applied to probabilistic responses at worker and crowd aggregation levels. Improvements in label quality reported (specific metrics and sizes not included in the excerpt).

medium positive Managing Cognitive Bias in Human Labeling Operations for Rar... label quality at worker and crowd levels (measured via calibration and classific...

The improvements from balanced feedback, probabilistic elicitation, and pipeline-level recalibration carry through to downstream convolutional neural network (CNN) reliability out of sample.

The study trained convolutional neural networks on labels produced under the different labeling and recalibration pipelines and evaluated out-of-sample reliability; reported that the gains observed at the labeling stage improved downstream CNN reliability (exact architectures, training/validation splits, and quantitative out-of-sample results not provided in the excerpt).

medium positive Managing Cognitive Bias in Human Labeling Operations for Rar... downstream CNN out-of-sample reliability (e.g., generalization performance, accu...

Pipeline-level recalibration substantially improves probabilistic calibration of labels.

Empirical evaluation in the DiagnosUs experiment where probabilistic labels were recalibrated (linear-in-log-odds) and calibration metrics were compared pre- and post-recalibration (specific calibration metrics and numeric results not provided in the excerpt).

medium positive Managing Cognitive Bias in Human Labeling Operations for Rar... probabilistic calibration (e.g., calibration error, Brier score, reliability dia...
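
The card above names the Brier score as one possible calibration metric. As a minimal sketch, the Brier score is the mean squared difference between predicted probabilities and the 0/1 outcomes, with lower values being better. The function name and the toy values below are hypothetical and are not taken from the DiagnosUs experiment.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    if not probabilities:
        raise ValueError("at least one prediction is required")
    total = sum((p - y) ** 2 for p, y in zip(probabilities, outcomes))
    return total / len(probabilities)


# Hypothetical toy labels for illustration only (not data from the study).
toy_probs = [0.9, 0.2, 0.6, 0.1]
toy_truth = [1, 0, 1, 0]
score = brier_score(toy_probs, toy_truth)
print(f"Brier score: {score:.3f}")
```

Comparing this score before and after recalibration is one way to check whether calibration improved. The excerpt does not say which metric the paper actually reports.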

Post-processing probabilistic labels using a linear-in-log-odds recalibration approach at the worker and crowd levels substantially improves classification performance.

The paper applied linear-in-log-odds recalibration to elicited probabilistic labels at both individual-worker and aggregated crowd levels, then evaluated classification performance on labels before and after recalibration (methods and quantitative effect sizes not provided in the excerpt).

medium positive Managing Cognitive Bias in Human Labeling Operations for Rar... classification performance of models trained on labels (e.g., accuracy, AUC or o...
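
The linear-in-log-odds recalibration mentioned above can be sketched as follows, assuming the standard two-parameter form logit(p') = gamma * logit(p) + log(delta). The excerpt does not give the paper's exact parameterization or fitted values, so gamma, delta, the clipping constant eps, and the function names are illustrative assumptions. Averaging worker log-odds before recalibrating is likewise only one plausible way to form a crowd-level label.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def llo_recalibrate(p, gamma, delta, eps=1e-6):
    """Linear-in-log-odds transform: logit(p') = gamma * logit(p) + log(delta)."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid infinite log-odds
    return inv_logit(gamma * logit(p) + math.log(delta))


def recalibrate_crowd(worker_probs, gamma, delta, eps=1e-6):
    """Assumed crowd rule: average worker log-odds, then apply the same transform."""
    clipped = [min(max(p, eps), 1.0 - eps) for p in worker_probs]
    mean_log_odds = sum(logit(p) for p in clipped) / len(clipped)
    return inv_logit(gamma * mean_log_odds + math.log(delta))


# Illustrative parameters only: gamma > 1 pushes probabilities away from 0.5.
print(llo_recalibrate(0.8, gamma=2.0, delta=1.0))
print(recalibrate_crowd([0.6, 0.7, 0.8], gamma=2.0, delta=1.0))
```

In practice gamma and delta would be fitted on labels with known ground truth, for example by logistic regression of the outcome on the elicited log-odds. That fitting step is omitted here.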

Balanced feedback (higher positive prevalence in the feedback stream) and probabilistic elicitation reduce rare-event misses.

Results from the DiagnosUs field experiment comparing conditions that vary feedback prevalence (20% vs. 50%) and response interface (binary labels vs. elicited probabilities); miss rates were compared across conditions (sample sizes not given in the excerpt).

medium positive Managing Cognitive Bias in Human Labeling Operations for Rar... rare-event miss rate (false negative rate for positive examples)

A combined scenario pairing moderate productivity gains with moderate cost control nearly eliminates the deficit by 2050.

Specific combined policy scenario simulated in the model projecting fiscal indicators to 2050; reported outcome is near-elimination of the government deficit under those assumptions.

medium positive Fiscal Dynamics in Japan under Demographic Pressure government fiscal deficit (aggregate) projected for year 2050

Policy experiments show that productivity improvements and controlling per-person costs offer the most effective near-term relief, because they act quickly through revenue and spending channels.

Counterfactual/policy scenario simulations run with the calibrated system dynamics model comparing effects of productivity gains and per-person cost controls versus other levers; near-term (short- to medium-run) impacts reported.

medium positive Fiscal Dynamics in Japan under Demographic Pressure near-term changes in fiscal indicators (tax revenue, public spending, fiscal def...

The model, grounded in official statistics, tracks historical trends reasonably well.

Model historical validation presented in the paper comparing model outputs to observed historical time series (fit to past demographic and fiscal indicators).

medium positive Fiscal Dynamics in Japan under Demographic Pressure goodness-of-fit between model outputs and historical series for demographics and...

This study offers the first systematic analysis of labor markets and the qualitative traits of participants in the criminal ecosystem of the shadow digital economy (SDE).

Authors' stated contribution claiming novelty; systematic analysis of labor-market roles and participant traits within the paper (methods described as systematic analysis/qualitative review; no external verification or comparative bibliometric analysis provided).

medium positive THE LABOR MARKET IN TERMS OF THE SHADOW DIGITAL ECONOMY existence and characterization of labor-market analysis in SDE research

AI innovation produces significant positive spatial spillover effects on employment in neighboring cities, promoting expansion of their employment scale.

Spatial analysis (spatial econometric tests) on the 268 Chinese cities (2010–2023) indicating positive spillovers to neighboring cities' employment.

medium positive How Does AI Innovation Affect Urban Employment in China? A M... employment in neighboring cities (spatial spillover effect)

Temporally, AI innovation affects urban employment through both immediate and lagged effects, with the magnitude of these effects diminishing over time.

Temporal (lag) analysis in extended tests on the 268-city panel covering 2010–2023.

medium positive How Does AI Innovation Affect Urban Employment in China? A M... urban employment over time (immediate and lagged effects)

Governmental digital attention positively moderates the relationship between AI innovation and urban employment.

Moderation analysis using measures of governmental digital attention and AI innovation in the 268-city panel (2010–2023).

medium positive How Does AI Innovation Affect Urban Employment in China? A M... urban employment

AI innovation indirectly promotes employment growth by enhancing urban economic density (mediation effect).

Mechanism (mediation) analysis conducted on the 268-city panel (2010–2023) showing economic density as an intermediary channel.

medium positive How Does AI Innovation Affect Urban Employment in China? A M... employment growth (mediated by urban economic density)

The positive employment effect of AI innovation is stronger in southern cities than in others.

Geographic heterogeneity analysis across 268 Chinese cities (2010–2023).

medium positive How Does AI Innovation Affect Urban Employment in China? A M... urban employment in southern cities

The positive employment effect of AI innovation is more pronounced in the tertiary sector.

Heterogeneity/sectoral analysis using the panel of 268 Chinese cities (2010–2023).

medium positive How Does AI Innovation Affect Urban Employment in China? A M... employment in the tertiary sector

The positive employment effect of AI innovation is more pronounced in the secondary sector.

Heterogeneity/sectoral analysis using the same panel of 268 Chinese cities (2010–2023).

medium positive How Does AI Innovation Affect Urban Employment in China? A M... employment in the secondary sector

Overall, AI innovation has a positive effect on urban employment.

Empirical testing on a panel of 268 Chinese cities over the period 2010–2023 (integrated theoretical and empirical analysis).

medium positive How Does AI Innovation Affect Urban Employment in China? A M... urban employment (employment scale)

Our framework achieves a 67% cost reduction compared to the matched hierarchical baseline.

Empirical comparison against a matched hierarchical baseline on the reported evaluation set; paper reports a 67% reduction in cost (operational/cost-per-query as reported by authors).

medium positive One Supervisor, Many Modalities: Adaptive Tool Orchestration... operational cost (cost-per-query or aggregated cost as reported)

Our framework achieves an 85% reduction in conversational rework compared to the matched hierarchical baseline.

Empirical comparison against a matched hierarchical baseline on the reported evaluation set; paper reports an 85% reduction in conversational rework.

medium positive One Supervisor, Many Modalities: Adaptive Tool Orchestration... conversational rework (amount/frequency of follow-up/redo interactions)

Our framework achieves a 72% reduction in time-to-accurate-answer compared to the matched hierarchical baseline.

Empirical comparison against a matched hierarchical baseline on the reported evaluation set (2,847 queries); paper reports a 72% reduction in the time-to-accurate-answer metric.

medium positive One Supervisor, Many Modalities: Adaptive Tool Orchestration... time-to-accurate-answer

Successful adaptation requires neither wholesale abandonment of traditional models nor uncritical technological embrace, but rather deliberate institutional redesign that balances technological innovation with the preservation of core academic values.

Authors' synthesis and prescriptive conclusion drawn from the analysis; presented as a recommended strategy rather than empirically validated practice.

medium positive Are Universities Becoming Obsolete in the Age of Artificial ... recommended adaptation strategy for institutions (balance between innovation and...

Strategic recommendations emphasize hybrid models that integrate AI capabilities while preserving irreplaceable human elements in higher education.

Paper's concluding recommendations based on its comparative function analysis and normative assessment; not accompanied by empirical trials of proposed hybrid models.

medium positive Are Universities Becoming Obsolete in the Age of Artificial ... advocated institutional model (hybrid AI-human integration)

Workforce development systems need lifelong learning infrastructure and dynamic credentialing to support continuous reskilling in an AI-rich environment.

Prescriptive conclusion from the authors based on projected labor-market and skills impacts; no empirical pilot or sample study cited to validate the recommendation.

medium positive Are Universities Becoming Obsolete in the Age of Artificial ... requirement for lifelong learning infrastructure and dynamic credentialing

The transformation driven by AI requires governments to redesign accreditation frameworks and quality assurance mechanisms.

Policy recommendation arising from the paper's analysis of accreditation and validation issues; presented as normative guidance rather than empirically tested intervention.

medium positive Are Universities Becoming Obsolete in the Age of Artificial ... need for redesign of accreditation frameworks and quality assurance mechanisms

AI systems democratize knowledge access, personalize learning, and offer scalable skills training.

The paper presents this as a conceptual claim based on literature synthesis and theoretical analysis; no empirical sample size or primary data reported.

medium positive Are Universities Becoming Obsolete in the Age of Artificial ... knowledge access, personalization of learning, scalability of skills training

Systematic economic impact assessment is vital for guiding public investments, workforce development, and policy decisions related to agricultural technology adoption.

Author conclusion based on study findings from IMPLAN 2022 I–O modeling and the observed differences between robotics and traditional greenhouse scenarios; normative recommendation.

medium positive ECONOMIC IMPACTS OF ROBOTICS TECHNOLOGY IN REMOTE GREENHOUSE... policy relevance / decision-support for public investment and workforce planning...

Technological innovation in agriculture (robotics) not only boosts productivity but also contributes to broader regional resilience and economic diversification.

Synthesis of I–O model outcomes (expanded sectoral impacts and higher multipliers) and conceptual arguments in the paper relating diversified economic linkages and productivity gains to regional resilience.

medium positive ECONOMIC IMPACTS OF ROBOTICS TECHNOLOGY IN REMOTE GREENHOUSE... regional resilience; economic diversification (sectoral output and value added c...

Robotics adoption supports sustainable employment opportunities (i.e., durable regional jobs) rather than simply eliminating jobs.

I–O modeling results showing induced and indirect employment effects from robotics investments in NWI; study discussion framing these as sustainable employment opportunities.

medium positive ECONOMIC IMPACTS OF ROBOTICS TECHNOLOGY IN REMOTE GREENHOUSE... employment (jobs created/sustained; job composition)
