Evidence (4004 claims)

Search and filter individual claims pulled from the papers. If you're looking for a specific finding ("what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other instead, use the Evidence Explorer.

The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).

Browse by theme

Nine broad, paper-level topics. Click one to filter the claims below.

Human-AI Collaboration

Claims by outcome category

Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.

Outcome	Positive	Negative	Mixed	Null	Total
Other	870	233	116	1066	2363
Governance & Regulation	976	451	218	133	1809
Organizational Efficiency	949	224	144	88	1416
Technology Adoption Rate	764	287	141	122	1325
Research Productivity	501	152	74	362	1101
Output Quality	542	216	69	69	896
Decision Quality	387	198	94	54	740
Firm Productivity	513	67	101	27	714
AI Safety & Ethics	249	303	73	36	667
Market Structure	190	192	134	27	548
Task Allocation	243	77	91	36	452
Innovation Output	291	33	55	20	401
Skill Acquisition	206	72	65	21	364
Employment Level	133	63	115	22	335
Fiscal & Macroeconomic	153	79	52	32	323
Task Completion Time	206	37	12	15	272
Firm Revenue	179	52	29	5	266
Consumer Welfare	130	76	47	13	266
Inequality Measures	48	137	51	6	242
Worker Satisfaction	101	81	25	13	220
Error Rate	84	110	11	5	210
Wages & Compensation	98	47	30	10	185
Regulatory Compliance	88	73	17	7	185
Automation Exposure	66	64	33	16	182
Team Performance	105	29	30	11	176
Training Effectiveness	109	22	14	21	168
Developer Productivity	114	21	14	8	158
Job Displacement	12	90	24	1	127
Hiring & Recruitment	57	9	9	5	80
Skill Obsolescence	6	56	9	1	72
Social Protection	43	17	8	2	70
Creative Output	35	21	9	4	70
Labor Share of Income	18	21	17	1	57
Worker Turnover	15	16	—	4	35
Industry	—	—	—	1	1
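
As a rough illustration of how to read the table, the sketch below turns three rows into direction shares. The rows are copied from the table; the function name `direction_shares` and the "unlisted" label are hypothetical, introduced only for this example. In some rows the Total is larger than the sum of the four direction columns, so the sketch reports that remainder separately rather than assuming what it contains.

```python
# Rows copied from the table above: (outcome, positive, negative, mixed, null, total).
ROWS = [
    ("Wages & Compensation", 98, 47, 30, 10, 185),
    ("Job Displacement", 12, 90, 24, 1, 127),
    ("Organizational Efficiency", 949, 224, 144, 88, 1416),
]


def direction_shares(row):
    """Return each direction's share of the row's Total, plus any unlisted remainder."""
    outcome, positive, negative, mixed, null, total = row
    counts = {"positive": positive, "negative": negative, "mixed": mixed, "null": null}
    shares = {name: count / total for name, count in counts.items()}
    # Claims counted in Total but in none of the four columns shown (assumed label).
    shares["unlisted"] = (total - sum(counts.values())) / total
    return outcome, shares


for row in ROWS:
    outcome, shares = direction_shares(row)
    summary = ", ".join(f"{name} {share:.1%}" for name, share in shares.items())
    print(f"{outcome}: {summary}")
```

Shares are taken against the Total column so that rows with different claim counts can be compared on the same scale.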

Labor Markets

Governing-logic stability uncertainty (whether decision logic or objectives remain stationary) is a distinct risk posed by agentic AI.

Conceptual argument and proposed taxonomy; no empirical tests reported.

high negative Visioning Human-Agentic AI Teaming: Continuity, Tension, and... stability of AI decision logic/objectives over time

Epistemic grounding uncertainty (uncertainty about how/why an AI produced a particular output) increases with agentic AI.

Literature synthesis on model-level opacity and causal explanation limits; conceptual reasoning in the paper.

high negative Visioning Human-Agentic AI Teaming: Continuity, Tension, and... ability to explain/ground AI outputs

Behavioral trajectory uncertainty (difficulty predicting long-run actions) is a primary form of uncertainty introduced by agentic AI.

Conceptual classification and argument; proposed as one of three principal uncertainties; no empirical estimation.

high negative Visioning Human-Agentic AI Teaming: Continuity, Tension, and... predictability of long-run agentic AI actions

Integration and engineering complexity (legacy systems, privacy/compliance pipelines, multi-channel platforms) is a persistent barrier to deployment.

Industry case studies and practitioner reports synthesized in the review documenting integration challenges; no systematic cost accounting or sample sizes presented.

high negative The Effectiveness of ChatGPT in Customer Service and Communi... integration complexity metrics, implementation time/cost, number of integration ...

Hallucinations and factual errors from generative AI can damage service quality and customer trust.

Documented failure cases and empirical reports from the literature aggregated by the review; no novel incident count or experimental data in this paper.

high negative The Effectiveness of ChatGPT in Customer Service and Communi... incidence of factual errors/hallucinations, measures of service quality and cust...

Generative AI is susceptible to social and representational biases and to factual errors or hallucinations; it lacks tacit, contextual domain expertise.

Documented examples in the literature of biased outputs and hallucinations; controlled evaluations and audits of model outputs; qualitative reports highlighting lack of tacit knowledge in domain-specific tasks.

high negative ChatGPT as an Innovative Tool for Idea Generation and Proble... incidence of biased content; factual error/hallucination rate; performance on do...

The quality of AI-generated outputs is highly variable; models frequently produce mediocre but plausible-sounding content that requires human filtering.

Multiple user studies and qualitative reports documenting variability in output quality and the need for human curation; outcome measures include error rates, user-rated quality, and time spent vetting.

high negative ChatGPT as an Innovative Tool for Idea Generation and Proble... output quality distributions; user-perceived quality; time/effort for human filt...

Factual errors and 'hallucinations' create misinformation risks and can produce costly service failures.

Model evaluation studies, incident case reports from deployments, and academic/industry analyses documenting hallucination rates and concrete failure examples.

high negative The Effectiveness of ChatGPT in Customer Service and Communi... factual accuracy / hallucination rate; incidents of service failure (operational...

High linguistic diversity in Africa makes building and evaluating multilingual language technologies more difficult and is a barrier to inclusive AI.

Synthesis of technical literature on NLP and multilingual model development and policy/NGO reports highlighting missing language resources; no original model evaluation reported.

high negative Towards Responsible Artificial Intelligence Adoption: Emergi... language technology availability, model performance across African languages, nu...

Structural constraints—limited digital infrastructure, scarce and skewed data, and high linguistic diversity—complicate AI development, deployment and evaluation in African contexts.

Desk review of infrastructure and data availability reports and scholarly literature demonstrating gaps and their effects; no new measurement in this paper.

high negative Towards Responsible Artificial Intelligence Adoption: Emergi... internet/digital infrastructure coverage, availability and representativeness of...

Rapid skill obsolescence in AI necessitates frequent curriculum updates and responsive governance.

Identified as a risk: the paper notes AI skill change rates and recommends frequent updates and governance mechanisms. This aligns with general domain knowledge; the paper does not provide empirical measurement of obsolescence rates.

high negative Curriculum engineering: organisation, orientation, and manag... update frequency, lag between skill demand change and curriculum update

Aligning multiple standards is complex, posing a disadvantage and implementation risk.

Stated explicitly in Disadvantages/Risks: complexity of aligning multiple standards is listed. This is a reasoned observation in the paper rather than empirically demonstrated.

high negative Curriculum engineering: organisation, orientation, and manag... complexity measures (number of standards to reconcile, conflicts identified), ti...

Implementing this framework requires significant resources and continuous updating.

Stated explicitly under Main Finding and Disadvantages/Risks; paper lists cost/time metrics to track (cost-per-curriculum, time-to-update) and highlights resource intensity. Support is descriptive/analytic rather than empirical.

high negative Curriculum engineering: organisation, orientation, and manag... resource intensity (cost-per-curriculum), time-to-update, maintenance burden

The digital divide (lack of reliable electricity and connectivity) constrains adoption of MIS and AI, creating geographic and regional inequities in who benefits from the framework.

Infrastructure constraint argument presented in the paper; no quantified coverage maps or population-level access statistics included.

high negative Establishes a technical and academic bridge between the educ... coverage of system access, differential adoption rates by region, inequality in ...

AI-driven equivalency systems carry risks including algorithmic bias, opaque decisions without explainability, and potential reinforcement of inequities when training data under-represents some regions/institutions.

Risk assessment drawing on established AI ethics literature; no empirical bias audit from the proposed system is provided.

high negative Establishes a technical and academic bridge between the educ... measures of algorithmic bias (disparate impact), explainability scores, unequal ...

The major disadvantage of an MIS is dependency on reliable electricity and internet, creating systemic vulnerability due to the digital divide.

Paper notes infrastructure dependency as a constraint; assertion grounded in common infrastructural realities but no measured connectivity or outage statistics from DRC/SA are provided.

high negative Establishes a technical and academic bridge between the educ... geographic/regional access to equivalency services and system uptime availabilit...

Potential limitations include limited methodological detail on case selection and measurement, possible selection and reporting bias from practitioner-sourced examples, and variable generalizability to small firms or highly regulated industries.

Authors' self-reported limitations in the Methods/Limitations section (qualitative assessment).

high negative Governed Hyperautomation for CRM and ERP: A Reference Patter... methodological completeness and generalizability (qualitative limitation)

Prompt fraud exploits the natural-language interface of large language models (LLMs) to produce outputs that appear authoritative (reports, audit trails, explanations) without system intrusion, credential theft, or software exploitation.

Definition and threat-model description using conceptual examples and case vignettes; literature/regulatory review to position the threat relative to traditional fraud vectors.

high negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... production of authoritative-appearing artifacts by LLMs without technical system...

Data privacy and cross-border compliance issues arise from using cloud and SECaaS, complicating legal compliance for firms.

Regulatory analyses and compliance reports; documented examples in case studies and industry guidance on cross-border data flows.

high negative Security-as-a-service: enhancing cloud security through m... compliance incident rates / regulatory risk exposure

The cloud shared responsibility model creates potential ambiguities in liability between providers and customers.

Regulatory guidance, legal analyses, and documented post-incident case studies showing confusion over responsibilities.

high negative Security-as-a-service: enhancing cloud security through m... clarity/ambiguity of security and liability responsibilities

Automation and LLM-driven orchestration add opacity; errors in instrument control or analysis could propagate quickly, raising liability, insurance, and reproducibility concerns.

Analytical discussion of risks and analogies to automated systems in other domains; no incident-level empirical data from microscopy given.

high negative ChatMicroscopy: A Perspective Review of Large Language Model... frequency and impact of errors, liability exposure, reproducibility failures

Ethical and governance issues related to LLM-driven microscopy include accountability, reproducibility, access inequities, data privacy, and concentration of capabilities in large providers.

Policy-oriented synthesis and analogies to governance challenges observed in other AI deployments; no new empirical measurement in microscopy contexts.

high negative ChatMicroscopy: A Perspective Review of Large Language Model... presence of governance risks: accountability gaps, reproducibility problems, une...

Integration of LLMs with microscopes faces challenges including safety and reliability of instrument control, verification of scientific outputs, data provenance, and alignment with experimental constraints.

Analytical discussion based on known reliability and safety issues in automated systems and AI tool use; no empirical incident data from microscopy provided.

high negative ChatMicroscopy: A Perspective Review of Large Language Model... risks to safety, reliability, and scientific validity when deploying LLM-driven ...

There is substantial uncertainty in economic forecasts due to possible scale-up failures, regulatory constraints, feedstock price volatility, and path‑dependent lock‑in effects.

Synthesis of technical failure modes, regulatory uncertainty, and sensitivity analyses reported in TEA/LCA literature and economic modeling sections of the review.

high negative Harnessing Microbial Factories: Biotechnology at the Edge of... forecast variance in cost trajectories, probability of commercial success, and s...

Regulatory and biosafety concerns (including environmental release risks and dual‑use issues) increase fixed costs and create entry barriers that shape industry structure and diffusion.

Policy and governance literature reviewed alongside technical case studies; citations of regulatory requirements, biosafety frameworks, and examples of compliance costs affecting project viability.

high negative Harnessing Microbial Factories: Biotechnology at the Edge of... regulatory compliance costs, time-to-market, number of approved facilities/proce...

Engineering and economic challenges—scale‑up hurdles, process robustness, feedstock cost, and downstream purification—limit industrial deployment of many bio-based processes.

Case study TEA/LCA summaries and process reports in the review highlighting scale-up failures or increased costs at larger scales, purification complexity for low‑concentration products, and sensitivity to feedstock prices.

high negative Harnessing Microbial Factories: Biotechnology at the Edge of... capital and operating costs, purification yield and cost, process robustness met...

Technical biological limitations—metabolic burden, pathway crosstalk, byproduct formation, and genetic instability—remain major constraints on strain performance and scalability.

Multiple experimental reports and method papers cited in the review documenting decreased growth/productivity due to engineered pathway burden, unintended interactions between pathways, accumulation of byproducts, and genetic mutations during production runs.

high negative Harnessing Microbial Factories: Biotechnology at the Edge of... strain growth rate, productivity (g/L/h), byproduct concentrations, genetic muta...

Measurement issues (task-based output measurement, attributing output changes to AI) and selection into early adoption both bias estimated productivity gains upward.

Methodological robustness checks reported in the paper: task-based measures, bounding exercises, placebo tests, and analysis of pre-trends; discussions of selection on unobservables and potential upward bias.

high negative S-TCO: A Sustainable Teacher Context Ontology for Educationa... validity/bias of estimated productivity effects

Implementing the governed hyperautomation pattern raises upfront costs (governance tooling, monitoring, validation, compliance processes).

Economic and cost-structure discussion in the paper, based on qualitative reasoning and industry experience; no quantified cost estimates or sample-based cost analysis provided.

high negative Governed Hyperautomation for CRM and ERP: A Reference Patter... upfront implementation costs (governance tooling, validation, compliance overhea...

The cost of formalizing informal labor (CFIL) implies formalizing a worker costs on average 88% more than the informal wage in 2023.

New CFIL metric calculated for 19 countries (2023 baseline) by estimating the additional employer cost of hiring and formalizing an informal worker and reporting it relative to the informal wage, using compiled statutory obligations and informal wage benchmarks.

high negative Salaried Labor Costs in Latin America and the Caribbean: A T... CFIL (additional cost of formalizing) as % above informal wage

There is sizable attrition in the pipeline from applicant admission through to direct employment of AI graduates, indicating leakages at multiple stages (application → admission → graduation → employment).

Quantification of human-resource losses across pipeline stages using the monitoring dataset for the 191 institutions; descriptive counts/percentages of entrants, admitted students, graduates, and those directly employed in AI roles (pipeline loss metrics reported in paper).

high negative Employment of Graduates of Educational Programs in the Field... Attrition rates / absolute losses at sequential pipeline stages (applicants → ad...

Graduates from Russian universities running AI-related educational programs together with alternative training routes (self-education and professional retraining) satisfy 43.9% of estimated national AI personnel demand.

Monitoring dataset of 191 Russian universities implementing AI-related programs; aggregated counts of university graduates plus estimated contributions from self-education and professional retraining compared to an estimated national AI personnel demand (coverage reported as 43.9%).

high negative Employment of Graduates of Educational Programs in the Field... Share (%) of estimated national AI personnel demand satisfied by combined univer...

AI automates routine and some mid-skill tasks, reducing employment in those occupations.

Empirical task-based exposure measures mapping AI capabilities to occupational task content, microdata analyses of employment by occupation using household/employer/administrative datasets, and panel regressions/decompositions that document within-occupation declines and between-occupation shifts.

high negative Intelligence and Labor Market Transformation: A Critical Ana... employment levels in routine and mid-skill occupations

Relying on secondary literature limits the paper's ability to make causal inferences and constrains empirical generalizability to all sectors or countries.

Stated limitations in the paper's Data & Methods section acknowledging scope and inferential constraints.

high negative Who Loses to Automation? AI-Driven Labour Displacement and t... causal inference strength and generalizability of conclusions

Increases in K_T reduce employment levels in affected firms and industries even when aggregate productivity rises.

Panel econometric estimates at firm and industry levels relating K_T intensity to employment outcomes, controlling for demand, input prices, and firm characteristics; difference-in-differences specifications and instrumental-variable robustness checks; corroborated by sectoral case studies.

high negative The Macroeconomic Transition of Technological Capital in the... employment (firm- and industry-level employment counts or employment growth)

Rising technological capital (K_T) — proxied by robot/automation density, software and intangible capital accumulation, AI adoption surveys, and AI-related patenting — leads to a decline in labor’s share of output.

Firm- and industry-level panel regressions linking constructed K_T intensity measures to labor shares, supported by macro growth-accounting decompositions; robustness checks include difference-in-differences and instrumenting adoption with plausibly exogenous shocks (e.g., cross-border technology diffusion, trade shocks); validated with cross-country comparisons and case studies.

high negative The Macroeconomic Transition of Technological Capital in the... labor share of income (share of output paid to labor)

The study used standard scientific methods, employing a comparative approach and inductive and deductive methods to identify patterns of interaction between legal regulation and technological development.

Methodology section of the paper explicitly states the use of comparative, inductive and deductive methods and theoretical synthesis.

high neutral ECONOMIC SYSTEMS IN THE CONTEXT OF DIGITALISATION AND AI: TH... methodological approach used in the study

The paper develops a theoretical and legal model that treats law as an integral part of the economic system influencing income distribution, labour relations, market structure and productivity dynamics.

Model construction through synthesis of theoretical perspectives using inductive and deductive methods and comparative legal analysis (methodology described in the paper).

high neutral ECONOMIC SYSTEMS IN THE CONTEXT OF DIGITALISATION AND AI: TH... role of legal frameworks in shaping economic institutional conditions (income di...

The study uses LinkedIn and GitHub data to examine firms' adoption of GitHub Copilot and related SWE skills and labor outcomes.

Statement of data sources and study design reported in the paper (LinkedIn profiles/skill listings linked to GitHub repository/adoption signals).

high neutral Firms' GitHub Copilot adoption and labor market outcomes for... data sources / methodological description

Expert assessment involved three senior academics producing reports and appointment-level syntheses.

Paper states that three senior academics produced assessment reports and synthesised appointment-level recommendations; n=3 assessors.

high neutral The Relic Condition: When Published Scholarship Becomes Mate... expert assessment procedure (number and type of assessors)

The distillation pipeline used an eight-layer extraction method and a nine-module skill architecture grounded in local, closed-corpus analysis.

Methods description in paper specifying an eight-layer extraction approach and nine-module skill architecture; presented as the technical design of the distillation pipeline.

high neutral The Relic Condition: When Published Scholarship Becomes Mate... pipeline architecture (layers/modules)

SAFI measures LLM performance on text-based representations of skills, not full occupational execution.

Methodological caveat stated by the authors clarifying the scope and limits of SAFI.

high neutral The AI Skills Shift: Mapping Skill Obsolescence, Emergence, ... scope of SAFI measure (text-based representations vs full job execution)

We propose an AI Impact Matrix that positions skills into four quadrants: High Displacement Risk, Upskilling Required, AI-Augmented, and Lower Displacement Risk.

Conceptual/interpretive framework introduced by the authors; described in text as proposed by the paper.

high neutral The AI Skills Shift: Mapping Skill Obsolescence, Emergence, ... interpretive classification of skills into four impact quadrants

Using a strictly algorithmic baseline (mathematical bottleneck aggregation), we calculate Relative Occupational Automation Indices (OAI) for the U.S. labor market based on the DWA-level scores.

Method and calculation claim: algorithmic baseline aggregation applied across the 923 occupations / 2,087 DWAs to produce OAIs mapped to the U.S. labor market. Specific aggregation formula referenced but not numerically detailed in the excerpt.

high neutral Bounded by Risk, Not Capability: Quantifying AI Occupational... Relative Occupational Automation Index (OAI)

We deconstructed 923 occupations into 2,087 Detailed Work Activities (DWAs).

Explicit data processing claim in the paper: mapping of 923 occupations to 2,087 DWAs for analysis.

high neutral Bounded by Risk, Not Capability: Quantifying AI Occupational... coverage of occupations and DWAs used for analysis

A variance decomposition indicates that most expert disagreement about long-run macroeconomic outcomes is driven by differing beliefs about the economic effects of highly capable AI, rather than disagreement about the pace of AI capability progress.

Authors' variance-decomposition analysis of survey responses separating components due to beliefs about AI capabilities vs. beliefs about economic effects given capabilities (methodological details referenced but not provided in excerpt).

high neutral Forecasting the Economic Effects of AI sources of expert disagreement (capabilities vs. economic effects)

The paper addresses three institutional audiences: enterprise finance and operations teams; government and regulatory bodies developing AI labor displacement frameworks; and financial markets requiring a machine labor index as a long-duration economic signal.

Stated intended audiences in the paper (descriptive statement).

high neutral HEWU: A Standardized Framework for Measuring Machine-Generat... intended institutional audiences

The framework is calibrated with O*NET task data, a survey of 3,778 domain experts, and GPT-4o-derived task decompositions, and implemented in computer vision.

Calibration and empirical implementation using O*NET, a domain expert survey (n=3,778), and GPT-4o task decompositions; applied to computer vision tasks.

high neutral Economics of Human and AI Collaboration: When is Partial Aut... validity of calibration / empirical grounding of the framework

We introduce an entropy-based measure of task complexity that maps model accuracy into a labor substitution ratio, quantifying human labor displacement at each accuracy level.

New metric proposed in the paper (entropy-based task complexity) and mapping procedure from accuracy to substitution ratio; implemented in the framework.

high neutral Economics of Human and AI Collaboration: When is Partial Aut... labor substitution ratio (human labor displaced per unit accuracy)

Costinot and Werning (2023) develop a sufficient-statistic approach and find optimal technology taxes of 1–3.7% on robots.

Citation reported in the paper summarizing Costinot and Werning (2023)'s quantitative sufficient-statistic estimate.

high neutral NBER WORKING PAPER SERIES optimal robot tax rate
