Evidence (4004 claims)

Search and filter individual claims pulled from the papers. If you are looking for a specific finding ("what's the effect on wages?"), you're in the right place. To compare whole outcome categories against each other instead, use the Evidence Explorer.

The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).

Browse by theme

Nine broad, paper-level topics. Click one to filter the claims below.

Human-AI Collaboration

Claims by outcome category

Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.

Outcome	Positive	Negative	Mixed	Null	Total
Other	870	233	116	1066	2363
Governance & Regulation	976	451	218	133	1809
Organizational Efficiency	949	224	144	88	1416
Technology Adoption Rate	764	287	141	122	1325
Research Productivity	501	152	74	362	1101
Output Quality	542	216	69	69	896
Decision Quality	387	198	94	54	740
Firm Productivity	513	67	101	27	714
AI Safety & Ethics	249	303	73	36	667
Market Structure	190	192	134	27	548
Task Allocation	243	77	91	36	452
Innovation Output	291	33	55	20	401
Skill Acquisition	206	72	65	21	364
Employment Level	133	63	115	22	335
Fiscal & Macroeconomic	153	79	52	32	323
Task Completion Time	206	37	12	15	272
Firm Revenue	179	52	29	5	266
Consumer Welfare	130	76	47	13	266
Inequality Measures	48	137	51	6	242
Worker Satisfaction	101	81	25	13	220
Error Rate	84	110	11	5	210
Wages & Compensation	98	47	30	10	185
Regulatory Compliance	88	73	17	7	185
Automation Exposure	66	64	33	16	182
Team Performance	105	29	30	11	176
Training Effectiveness	109	22	14	21	168
Developer Productivity	114	21	14	8	158
Job Displacement	12	90	24	1	127
Hiring & Recruitment	57	9	9	5	80
Skill Obsolescence	6	56	9	1	72
Social Protection	43	17	8	2	70
Creative Output	35	21	9	4	70
Labor Share of Income	18	21	17	1	57
Worker Turnover	15	16	—	4	35
Industry	—	—	—	1	1

Labor Markets

Automated compliance and credentialing systems raise governance issues (auditability, appeals mechanisms) and risk incorrect automated deregistration if not properly governed.

Governance and algorithmic-risk discussion in the paper; logical argumentation rather than case-based evidence.

high negative Electrotechnical education, institutional complianc... rate of incorrect automated decisions, existence and effectiveness of appeal pro...

The paper models career progression as a continuous function and treats certification gaps as discontinuities that impede labour-market mobility.

Mathematical/conceptual modeling described in the methods (career-progression-as-continuous-function approach); this is a modeling choice reported in the paper rather than an empirical finding.

high negative Electrotechnical education, institutional complianc... labour-market mobility / continuity of career progression (in the conceptual mod...

There is limited long-term impact evidence and few system-level assessments of AI in developing-country agriculture.

Authors' methodological caveat based on the temporal scope and types of studies available in the >60-study review.

high negative A systematic review of the economic impact of artificial int... presence/absence of long-term impact evaluations and system-level assessments

Substantial compute and resource requirements for training and inference concentrate capabilities among well‑resourced labs and firms.

Paper discusses large compute budgets for training/inference and states that performance scales with data, model size, and compute; it infers concentration of capabilities but provides no empirical market concentration measures.

high negative Protein structure prediction powered by artificial intellige... distribution of computational capability/resources across organizations and resu...

Structure predictors depend on training data and exhibit biases; experimental validation remains necessary.

Paper notes dependence on training data biases and the need for experimental validation; references data sources (PDB, UniRef, metagenomic catalogs) but does not quantify bias magnitudes.

high negative Protein structure prediction powered by artificial intellige... bias in model predictions attributable to training data coverage/quality; requir...

Current limitations include inaccurate prediction of multi‑chain complexes, flexible or rare conformational states, and limited prediction of dynamic ensembles.

Paper explicitly enumerates these limitations in the 'Ongoing limitations' section; no quantitative failure rates are given.

high negative Protein structure prediction powered by artificial intellige... accuracy for multi‑chain complexes, flexible/rare conformations, and ensemble/dy...

Traditional computational methods struggle without homologous templates or with complex folding/dynamics.

Paper discusses limitations of traditional computational methods, emphasizing dependence on homologous templates and difficulty with complex folding/dynamics; specific method comparisons or sample sizes are not provided.

high negative Protein structure prediction powered by artificial intellige... accuracy/success of traditional computational structure prediction in low‑homolo...

Opacity, bias, and errors in AI systems demand auditing, standards, and governance (algorithmic accountability) to ensure trustworthy assessment.

Synthesis of literature on algorithmic bias and accountability plus policy analysis recommending audits and standards; supported by country cases that discuss governance concerns.

high negative The Future of Assessment: Rethinking Evaluation in an AI-Ass... algorithmic fairness, transparency, and reliability

Student data used by AI vendors raises risks around consent, reuse, commercial exploitation, and other data-privacy concerns.

Policy analysis and literature on data governance, privacy law debates; examples from national policy documents in the comparative cases. No original data on breaches or misuse presented.

high negative The Future of Assessment: Rethinking Evaluation in an AI-Ass... privacy risks and governance of student data

Empirical evaluation of integrated defenses, quantitative cost/benefit analyses, and standardized threat models for VR are research gaps that remain unaddressed in the literature window surveyed (2023–2025).

Authors' stated limitations from their comparative literature review of 31 studies noting an absence of primary empirical validation and quantitative economic analyses in the reviewed corpus.

high negative Securing Virtual Reality: Threat Models, Vulnerabilities, an... presence/absence of empirical validation, cost‑benefit studies, and standard thr...

Immersive VR systems collect continuous multimodal signals (motion tracking, gaze, voice, biometrics) that enable novel inference, spoofing, and manipulation attacks beyond traditional IT threats.

Synthesis of threat descriptions across the 31 reviewed peer‑reviewed studies (2023–2025) documenting sensor modalities and attack vectors; qualitative comparative evaluation of attack surfaces.

high negative Securing Virtual Reality: Threat Models, Vulnerabilities, an... existence and extent of expanded attack surface due to multimodal signal collect...

Limitations of the study include reliance on self-reported perceptions (subject to response and survivorship bias), lack of experimental/causal identification, potential non-representative sample, and cross-sectional design limiting inference about long-term productivity effects.

Authors' stated limitations in the paper summary.

high negative Artificial Intelligence as a Catalyst for Innovation in Soft... validity threats (self-report bias, lack of causal design) as reported by author...

Tasks that are routine, repetitive, or pattern‑based (e.g., boilerplate coding, refactoring, unit test generation, some accessibility fixes) will be increasingly automated by AI.

Task‑level decomposition and examples of current automation capabilities (code generation, test suggestion tools); conceptual projection rather than empirical measurement.

high negative How AI Will Transform the Daily Life of a Techie within 5 Ye... rate of automation for routine software development tasks (proportion of such ta...

Common barriers to effective RM implementation include siloed functions/weak coordination, limited resources or expertise, poor data quality/lack of metrics, and cultural resistance driven by short-term incentives.

These barriers are identified frequently across the reviewed literature and practitioner sources from the last ten years, synthesized via thematic analysis.

high negative The Role of Risk Management as an Organizational Management ... barriers to RM adoption/implementation; likelihood of successful RM

Hierarchy compresses: fewer organizational layers are needed for a given firm output as coordination costs fall.

Analytical proposition in the theoretical model and simulation results showing reduced number of layers under coordination compression.

high negative AI as Coordination-Compressing Capital: Task Reallocation, O... number of hierarchical layers per firm

A one standard-deviation increase in AI adoption (2019–2025, 38 OECD countries) causally reduces employment in routine cognitive occupations by 2.3%.

Panel of 38 OECD countries, 2019–2025; AI Adoption Index (composite of enterprise AI investment, AI patent filings, workforce/firm AI-use surveys); instrumental-variable (IV) estimation to identify causal effect on occupational employment; country and year fixed effects and macro controls reported.

high negative Artificial Intelligence and Labor Market Transformation: Emp... Employment in routine cognitive occupations (percent change per 1 SD increase in...

Higher measured GDP need not imply higher aggregate welfare: the private costs of the arms race can outweigh the market gains from increased output.

Welfare comparisons performed in the model showing parameter regions where private equilibrium raises GDP but reduces aggregate welfare once investment costs are included.

high negative Janus-Faced Technological Progress and the Arms Race in the ... aggregate welfare (utility/net social surplus)

Because private incentives push agents toward tail outcomes, aggregate overinvestment occurs relative to the social optimum (the arms race is inefficient).

Welfare calculations and comparison of private vs social optima within the model; the paper shows private equilibrium investment exceeds the socially optimal investment given the externalities of the arms race.

high negative Janus-Faced Technological Progress and the Arms Race in the ... aggregate welfare (social welfare loss due to overinvestment)

Upfront costs for AI adoption are substantial: development, clinical validation, regulatory compliance, EHR integration, and ongoing monitoring.

Implementation and regulatory literature synthesized in the review documenting typical cost categories and reported expenditures for clinical AI projects.

high negative Will AI Replace Physicians in the Near Future? AI Adoption B... fixed and recurring implementation costs

Large language models (LLMs) suffer from hallucinations (fabricated facts), overconfidence, and unpredictable failure modes in open-ended tasks.

Technical papers and benchmarks on LLM factuality, calibration, and failure modes summarized in the review; empirical evaluations showing instances of fabricated outputs and calibration issues.

high negative Will AI Replace Physicians in the Near Future? AI Adoption B... factual accuracy of outputs; calibration (confidence vs accuracy); failure rate ...

Contemporary AI systems have no capacity for physical examination, sensorimotor procedures, or direct patient-contact diagnostics.

Technical limitations of CNNs and LLMs described in literature (lack of embodiment, no sensorimotor capabilities) and absence of credible empirical demonstrations of safe autonomous physical clinical procedures in reviewed studies.

high negative Will AI Replace Physicians in the Near Future? AI Adoption B... ability to perform physical exam / procedural tasks / direct patient-contact dia...

Current models exhibit poor out-of-distribution (OOD) generalization: performance degrades when inputs differ from training distributions.

Technical literature and robustness/domain-shift research reviewed in the paper documenting declines in model accuracy under domain shift and dataset changes.

high negative Will AI Replace Physicians in the Near Future? AI Adoption B... model accuracy/performance under domain shift / OOD inputs

Risks include bias and discrimination, opacity in decision-making, privacy and cybersecurity threats, liability gaps, and uneven distribution of benefits that can exacerbate inequality.

Compilation from academic and policy literature, regulatory gap analyses, and examples of problematic AI use cases identified in the report's sectoral review.

high negative AI Governance and Data Privacy: Comparative Analysis of U.S.... bias/discrimination incidents, decision-making opacity, privacy/cybersecurity in...

AI creates significant ethical, legal and distributional risks.

Review of policy documents, academic and policy literature, and documented examples of AI deployment across multiple sectors highlighting harms (bias, privacy breaches, liability gaps, unequal benefits).

high negative AI Governance and Data Privacy: Comparative Analysis of U.S.... ethical risks, legal gaps, and distributional outcomes (inequality)

Reliance on imperfect data and model assumptions can produce biased or misleading forecasts; careful validation, transparency about assumptions, and governance are necessary.

Risks & governance discussion in the paper raising this limitation and recommending practices (qualitative argumentation).

high negative AI-Based Predictive Skill Gap Analysis for Workforce Plannin... risk of biased or misleading forecasts arising from data/model limitations (qual...

AI-generated code can introduce security vulnerabilities and raise licensing/intellectual-property concerns.

Case studies of security incidents, analyses of generated code provenance, and vulnerability-detection studies synthesized in the review.

high negative ChatGPT as a Tool for Programming Assistance and Code Develo... incidence of security vulnerabilities in generated code; instances of license or...

LLMs sometimes generate incorrect, nonsensical, or insecure code (hallucinations).

Multiple benchmarks, code-generation accuracy tests, and incident case studies documented in the empirical literature showing incorrect or fabricated outputs.

high negative ChatGPT as a Tool for Programming Assistance and Code Develo... code correctness/error rate; incidence of hallucinated outputs (false or fabrica...

Data security, privacy risks, unequal gains, and regulatory shortfalls can undermine the benefits of AI/robotics adoption.

Policy and risk analyses from secondary literature, case studies, and institutional reports synthesized in the paper; examples cited but no original incident-level dataset or incidence rates provided.

high negative AI and Robotics Redefine Output and Growth: The New Producti... data/privacy risk incidence, inequality measures, regulatory adequacy (qualitati...

Transition frictions and skills mismatches are important barriers to workers moving into newly created AI‑related roles.

Qualitative review of workforce and skills literature, case studies, and sector reports; evidence comes from secondary sources with varied methodologies; the paper does not report pooled quantitative estimates.

high negative AI and Robotics Redefine Output and Growth: The New Producti... transition costs, skills mismatch incidence, retraining needs (labor market fric...

International and national legal approaches to these stages are fragmented, creating uncertainty for IP, privacy, liability and evidence law.

Comparative review of international and national legal approaches and judicial responses cited in the paper (secondary legal sources).

high negative Ethical and societal challenges to the adoption of generativ... degree of fragmentation and legal uncertainty across jurisdictions

Output-stage risks include authenticity/deception concerns, attribution and reuse-rights disputes, reputational harms, and broader societal impacts from abundant generated media.

Review of empirical studies on media authenticity, legal cases, and policy analyses included in the narrative review.

high negative Ethical and societal challenges to the adoption of generativ... authenticity, deception potential, attribution disputes, reputational and societ...

Process-stage risks include governance of model development, control over deployment, transparency, auditing, and operational safety.

Conceptual synthesis of technical governance literature and policy reports cited in the narrative review.

high negative Ethical and societal challenges to the adoption of generativ... governance and operational safety concerns in model development/deployment

Input-stage risks include concerns about consent, copyright, representativeness, bias, provenance and data ownership for training material.

Synthesis of legal and policy literature and documented legal cases/statutes related to training data and IP/privacy issues (secondary sources only).

high negative Ethical and societal challenges to the adoption of generativ... legal/ethical compliance and risk factors in training datasets

Generative audiovisual AI poses material ethical, control, transparency and legal challenges across three stages — input (training data), process (development & deployment), and output (use of artifacts).

Conceptual three-stage framework built from comparative review of literature, legal cases/statutes and policy reports described in the paper.

high negative Ethical and societal challenges to the adoption of generativ... presence and types of ethical, governance, transparency and legal risks across i...

Limitations of the study include potential selection bias in reviewed sources and contingency of conclusions on evolving legal decisions and technology developments.

Author-stated limitations section within the paper; qualitative acknowledgement rather than empirical bias assessment.

high negative Ethical and societal challenges to the adoption of generativ... reliability and generalizability of the review's conclusions

Output-stage risks include challenges to authenticity and provenance, erosion of trust (deepfakes and misinformation), and potential legal liability for harms caused by generated content.

Synthesis of technical papers on deepfakes, legal analyses of liability, and policy reports referenced in the review; no original incident dataset or quantitative prevalence estimate included.

high negative Ethical and societal challenges to the adoption of generativ... authenticity/provenance verification success, consumer trust, incidence of misin...

Input-stage risks include copyright infringement, lack of consent, poor data provenance, and biases/representational harms encoded in training datasets.

Review and synthesis of academic and legal literature on training data issues; examples and case law discussed, but no original dataset audit or sample counts provided.

high negative Ethical and societal challenges to the adoption of generativ... legal/compliance risk and bias in generated outputs arising from training data

Use of these models faces significant ethical, control, transparency, and legal challenges across three stages—input (training data), process (development/control), and output (generated artifacts).

Framework constructed from interdisciplinary literature (technical, ethical, legal sources) and review of statutes/judicial approaches; qualitative synthesis rather than primary data.

high negative Ethical and societal challenges to the adoption of generativ... presence and severity of ethical/legal/control challenges across input/process/o...

High environmental constraints in many African regions (poor infrastructure, challenging geography, frequent climate shocks) materially affect logistics, resilience, and supply-chain performance.

Review of literature on infrastructure, geography, and climate impacts in the conceptual paper.

high negative Continental shift: operations and supply chain management re... infrastructure and environmental constraints' impact on logistics/resilience

Africa is abundant in natural resources but realizes relatively weak development outcomes from them, creating resource-allocation and value-capture problems relevant to OSCM.

Development economics and regional studies literature cited in the paper's synthesis; conceptual claim without new empirical testing.

high negative Continental shift: operations and supply chain management re... resource endowment versus development outcomes (value capture in supply chains)

Africa has a large informal economy and many informal organizations that shape supply-chain behavior and market functioning.

Literature synthesis citing development and institutional studies (no primary data collection in the paper).

high negative Continental shift: operations and supply chain management re... prevalence of informality and its influence on supply-chain behavior

Results reflect small-scale e-commerce use cases; external validity to larger firms, other sectors, or more complex tasks is not established.

Scope of deployments limited to small-scale e-commerce settings as stated in methods; no cross-sector or large-firm samples reported in summary.

high negative Artificial Intelligence Agents in Knowledge Work: Transformi... generalisability/external validity of observed productivity effects

The study's evidence is observational rather than drawn from randomized controlled trials, so its causal estimates of productivity impacts are suggestive rather than definitive.

Declared study design: applied experimentation and observational analysis of deployments (no randomized assignment); methods section explicitly notes observational limitation.

high negative Artificial Intelligence Agents in Knowledge Work: Transformi... strength of causal inference (ability to attribute observed productivity changes...

Deficits in governance, auditing, and interpretability constrain the safe deployment of generative AI in firms.

Synthesis of industry reports and conceptual literature noting gaps in governance and interpretability; no quantitative governance dataset reported.

high negative The Use of ChatGPT in Business Productivity and Workflow Opt... presence/absence of governance processes, frequency of audit findings, deploymen...

Algorithmic biases in generative AI can amplify and codify discriminatory patterns in organizational decisions.

Extensive literature on algorithmic bias synthesized in the review and applied to generative models; case examples referenced.

high negative The Use of ChatGPT in Business Productivity and Workflow Opt... disparities in decision outcomes (error rates, disparate impact metrics by group...

Generative AI use introduces significant organizational risks including data privacy breaches and leakage when models or third‑party services are used.

Conceptual analysis and references to documented incidents and industry reports within the review; no single aggregated incident dataset provided.

high negative The Use of ChatGPT in Business Productivity and Workflow Opt... incidence of data breaches/leakage, number of privacy violations

Generated code can introduce security vulnerabilities.

Security analyses and code audits documenting examples where LLM-generated code contains known vulnerability patterns; incident-oriented case studies and controlled experiments assessing vulnerability incidence.

high negative ChatGPT as a Tool for Programming Assistance and Code Develo... incidence of security vulnerabilities in AI-generated code

LLMs can produce plausible-looking but incorrect or insecure code (so-called 'hallucinations').

Benchmarks and controlled tests demonstrating incorrect outputs; security analyses and replicated examples showing erroneous or insecure snippets produced by LLMs across multiple models and prompts.

high negative ChatGPT as a Tool for Programming Assistance and Code Develo... code correctness/error rate and frequency of insecure code returned

The technical feasibility of robust token verification and resistance to spoofing needs demonstration; it is not yet proven.

Authors explicitly acknowledge this limitation in the paper; no prototypes or red-team results are presented.

high negative Token Taxes: mitigating AGI's economic risks robustness of token verification to spoofing/evasion

AI-driven impacts will be heterogeneous across education, race, gender, age, firm size, and geography, implying crucial equity concerns and the need for disaggregated reporting and targeted validation.

Policy analysis and literature synthesis in the paper; this claim reflects widely documented labor economics findings about heterogeneous technological impacts, though no new empirical breakdowns are provided here.

high negative Enhancing BLS Methodologies for Projecting AI's Impact on Em... distribution of employment/wage/transition impacts across demographic and firm/r...
