Evidence (16496 claims)

Search and filter individual claims extracted from the papers. If you are looking for a specific finding (for example, "what's the effect on wages?"), this is the right place. To compare whole outcome categories against each other, use the Evidence Explorer instead.

The board below groups claims two ways: by broad theme (nine paper-level topics) and by outcome category (the 34 claim-level outcomes that the Explorer and Syntheses also use).

Browse by theme

Nine broad, paper-level topics. Click one to filter the claims below.

Human-AI Collaboration

Claims by outcome category

Counts by direction of finding. These are the same 34 outcome categories the Explorer compares and the Syntheses are written for. A linked row has a published synthesis.

Outcome	Positive	Negative	Mixed	Null	Total
Other	870	233	116	1066	2363
Governance & Regulation	976	451	218	133	1809
Organizational Efficiency	949	224	144	88	1416
Technology Adoption Rate	764	287	141	122	1325
Research Productivity	501	152	74	362	1101
Output Quality	542	216	69	69	896
Decision Quality	387	198	94	54	740
Firm Productivity	513	67	101	27	714
AI Safety & Ethics	249	303	73	36	667
Market Structure	190	192	134	27	548
Task Allocation	243	77	91	36	452
Innovation Output	291	33	55	20	401
Skill Acquisition	206	72	65	21	364
Employment Level	133	63	115	22	335
Fiscal & Macroeconomic	153	79	52	32	323
Task Completion Time	206	37	12	15	272
Firm Revenue	179	52	29	5	266
Consumer Welfare	130	76	47	13	266
Inequality Measures	48	137	51	6	242
Worker Satisfaction	101	81	25	13	220
Error Rate	84	110	11	5	210
Wages & Compensation	98	47	30	10	185
Regulatory Compliance	88	73	17	7	185
Automation Exposure	66	64	33	16	182
Team Performance	105	29	30	11	176
Training Effectiveness	109	22	14	21	168
Developer Productivity	114	21	14	8	158
Job Displacement	12	90	24	1	127
Hiring & Recruitment	57	9	9	5	80
Skill Obsolescence	6	56	9	1	72
Social Protection	43	17	8	2	70
Creative Output	35	21	9	4	70
Labor Share of Income	18	21	17	1	57
Worker Turnover	15	16	—	4	35
Industry	—	—	—	1	1

AI learns indiscriminately from implicit knowledge, acquiring both beneficial patterns and harmful biases.

Asserted in the paper as a conceptual point about training data and learned patterns; no empirical evaluation or quantified bias measures provided.

high mixed Reliable AI Needs to Externalize Implicit Knowledge: A Human... patterns and biases acquired by AI from implicit knowledge

Workload-aware blended pricing reorders the leaderboard substantially: 7 of the 10 top-ranked endpoints under the chat preset (3:1 input:output) fall out of the top 10 under the retrieval-augmented preset (20:1).

Comparison of endpoint rankings under two workload presets (chat at 3:1 and retrieval-augmented at 20:1); the claim reports a count (7 of the top 10 endpoints change).

high mixed Token Arena: A Continuous Benchmark Unifying Energy and Cogn... change in top-10 endpoint rankings between workload presets
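
To illustrate why the input:output ratio can reorder a price leaderboard, here is a minimal sketch. It assumes a blended price defined as the token-weighted average of input and output prices; the paper's exact formula is not given in this listing, and the endpoint names and prices below are hypothetical.

```python
# Assumption: blended price = token-weighted average of input and output
# prices for a workload with `ratio` input tokens per output token.
# This is an illustrative definition, not necessarily the paper's formula.

def blended_price(input_price, output_price, ratio):
    """Average price per token when `ratio` input tokens accompany each output token."""
    return (ratio * input_price + output_price) / (ratio + 1)

# Hypothetical endpoints: (input price, output price) in dollars per million tokens.
endpoints = {
    "endpoint_a": (1.0, 2.0),
    "endpoint_b": (0.5, 4.0),
    "endpoint_c": (2.0, 2.0),
}

def rank(endpoints, ratio):
    """Return endpoint names sorted from cheapest to most expensive blended price."""
    return sorted(endpoints, key=lambda name: blended_price(*endpoints[name], ratio))

chat_rank = rank(endpoints, 3)   # chat preset, 3:1 input:output
rag_rank = rank(endpoints, 20)   # retrieval-augmented preset, 20:1

print(chat_rank)
print(rag_rank)
```

Under these made-up prices, the endpoint with cheap input but expensive output moves ahead once the workload becomes input-heavy, which is the kind of reordering the claim describes.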

Modeled joules per correct answer varies by a factor of 6.2 across endpoints.

Modeled energy estimate combined with task accuracy to compute joules per correct answer across 78 endpoints.

high mixed Token Arena: A Continuous Benchmark Unifying Energy and Cogn... joules per correct answer (modeled energy efficiency)

Across 78 endpoints, the same model on different endpoints differs in tail latency by an order of magnitude.

Empirical tail-latency measurements across 78 endpoints serving 12 model families.

high mixed Token Arena: A Continuous Benchmark Unifying Energy and Cogn... tail latency

The same model on different endpoints differs in fingerprint similarity to first party by up to 12 points.

Empirical measurement of fingerprint (output-distribution) similarity to a first-party reference across the same set of endpoints (78 endpoints, 12 model families).

high mixed Token Arena: A Continuous Benchmark Unifying Energy and Cogn... fingerprint similarity to first-party reference (endpoint fidelity)

Across 78 endpoints serving 12 model families, the same model on different endpoints differs in mean accuracy by up to 12.5 points on math and code.

Empirical measurement across 78 endpoints and 12 model families comparing mean accuracy on math and code tasks.

high mixed Token Arena: A Continuous Benchmark Unifying Energy and Cogn... mean accuracy on math and code benchmarks

Whether the futures these configurations help create remain governable and worth inhabiting will depend on leaders who can see, early enough, where and how consequential decisions are actually being shaped.

Normative/prognostic claim linking future governability to leaders' detection capabilities (conceptual; no empirical test provided in the excerpt).

high mixed Leading Across the Spectrum of Human-AI Relationships: A Con... future governability of organizations/systems with human–AI decision configurati...

These configurations will shape how power, responsibility, and trust are distributed in organizational life.

Theoretical/prognostic claim in the paper linking configurations to distribution of power, responsibility, and trust (no empirical quantification in the excerpt).

high mixed Leading Across the Spectrum of Human-AI Relationships: A Con... distribution of power, responsibility, and trust within organizations

Fluent users' failures occur alongside greater success on complex tasks.

Combined analysis of task complexity, success outcomes, and failure incidence in the 27,000 transcripts, showing that fluent users both attempt complex tasks more often and succeed at them more often, even while experiencing more failures.

high mixed A paradox of AI fluency success on complex tasks

Fluent users adopt a fundamentally different interactional mode: they iterate collaboratively with the AI, refining goals and critically assessing outputs, whereas novices take a passive stance.

Qualitative and quantitative analysis of the same 27,000 annotated WildChat transcripts, with annotations describing interactional mode and user behavior (iteration, goal refinement, critical assessment vs. passivity).

high mixed A paradox of AI fluency interactional mode / engagement style

Augmentation is bounded rather than linear (i.e., human-AI augmentation shows diminishing or negative returns past a balanced zone).

Synthesis of interview themes across 34 cases producing the bounded-augmentation / curvilinear conceptualization.

high mixed E-leadership and human-AI collaboration: socio-technical ali... perceived team effectiveness as a function of AI-use intensity

Mediators such as trust, cohesion and accountability are reshaped when AI-generated contributions enter collaboration.

Thematic evidence from interviews indicating changes in trust, cohesion and accountability dynamics associated with the introduction of AI outputs into team collaboration.

high mixed E-leadership and human-AI collaboration: socio-technical ali... trust, cohesion, accountability

Social (leadership engagement, trust, ownership, mediation and alignment) and technical (automation, creation, reliability, distraction and integration) subsystems combine to enable or erode team effectiveness, summarized in an e-leadership–AI orientation matrix.

Analytic synthesis from thematic coding (Gioia-informed) of interview data producing a conceptual matrix mapping social and technical factors to outcomes.

high mixed E-leadership and human-AI collaboration: socio-technical ali... perceived team effectiveness (as a function of social and technical subsystems)

Analysis identifies a curvilinear pattern of bounded augmentation, where effectiveness peaks in a zone of balanced use but declines with both under-use and over-reliance.

Thematic (Gioia-informed) analysis of 34 semi-structured interviews with project managers across five UK industries; pattern emerges from cross-case coding and synthesis.

high mixed E-leadership and human-AI collaboration: socio-technical ali... perceived team effectiveness

Generative AI-powered tools like ChatGPT are reshaping market skill demands while also offering new forms of on-demand learning support to meet those demands.

Framed in paper as background/motivation; asserted from prior literature and the paper's motivating claims rather than reported as a quantified result in this study.

high mixed Upskilling with Generative AI: Practices and Challenges for ... impact of generative AI on market skill demands and availability of on-demand le...

In operational meteorology, adjoint-based methods derive value from the forecast model itself but require full data assimilation infrastructure.

Technical background in paper describing adjoint-based methods and their infrastructural requirements (methodological literature references; no new empirical data).

high mixed Calibrating Attribution Proxies for Reward Allocation in Par... suitability and infrastructure requirements of adjoint-based value methods

The rise of digital agents will transform the foundations of production, labour markets, institutional arrangements and the international distribution of economic power.

Synthesis and theoretical projection across sections of the paper; presented as a broad conclusion without reported empirical quantification in the provided text.

high mixed DIGITAL AGENTS AS FUNCTIONAL EQUIVALENTS OF ECONOMIC ACTORS:... transformation of production systems, labour markets, institutions, and internat...

There is a fundamental asymmetry between economic and social reproduction: digital agents can compensate for productive functions of the population but are unable to substitute the population's functions of social reproduction.

Theoretical argument and conceptual distinction in the paper; no empirical study measuring substitution in social reproduction provided.

high mixed DIGITAL AGENTS AS FUNCTIONAL EQUIVALENTS OF ECONOMIC ACTORS:... capacity of digital agents to substitute productive vs social reproduction funct...

The retrieved sources are substantially different for each search engine (average pairwise Jaccard similarity < 0.2).

Computed average Jaccard similarity of source-domain sets returned by each engine (Google organic results, Google AIO, Gemini Flash 2.5) across the 11,500 queries; reported average similarity < 0.2.

high mixed How Generative AI Disrupts Search: An Empirical Study of Goo... overlap (Jaccard similarity) of retrieved source domains across engines
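
For readers unfamiliar with the metric in the claim above, the following is a minimal sketch of average pairwise Jaccard similarity over sets of source domains. The engine keys and domain names are hypothetical placeholders, not data from the study, and treating two empty sets as similarity 0.0 is a convention chosen here.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections, compared as sets."""
    a, b = set(a), set(b)
    union = a | b
    # Convention assumed here: two empty sets have similarity 0.0.
    return len(a & b) / len(union) if union else 0.0

def mean_pairwise_jaccard(sets_by_engine):
    """Average Jaccard similarity over all unordered pairs of engines."""
    pairs = list(combinations(sets_by_engine.values(), 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical source domains returned by three engines for one query.
sources = {
    "organic": {"a.example", "b.example", "c.example"},
    "ai_overview": {"c.example", "d.example", "e.example"},
    "chat_model": {"f.example", "g.example", "a.example"},
}

mean_similarity = mean_pairwise_jaccard(sources)
print(round(mean_similarity, 3))
```

In this toy case, two of the three pairs share one domain out of five and the third pair shares none, so the mean falls below 0.2. The study reports its figure as an average across 11,500 queries; how it aggregates across queries is not stated in this listing.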

These patterns suggest that AI adoption is associated with expected efficiency gains that shape both firms' pricing behaviour and their macroeconomic expectations.

Interpretation based on observed increases in productivity/profitability and different pricing/inflation expectations among adopters vs non-adopters in survey and DID analyses.

high mixed The economic impact of artificial intelligence: evidence fro... interpretive link between productivity/profitability gains and firms' pricing an...

The rapid growth of AI and automation offers Sub-Saharan Africa economic opportunities as well as labor market challenges.

Systematic review of the literature reported in the paper; scope and number of studies not specified in the abstract/summary provided.

high mixed The Impact of AI-Driven Automation on Semi and Unskilled Wor... economic opportunities and labor market challenges in Sub-Saharan Africa

LLMs are able to extract signals from unstructured text (financial news headlines) but have limitations without explicit quantitative optimization.

Interpretation in discussion/conclusion: empirical finding that LLM-based portfolios beat naive diversification but underperform AI-optimized strategies, implying LLMs extract signals from text yet lack full optimization capability.

high mixed Few-Shot Portfolio Optimization: Can Large Language Models O... ability to extract actionable signals from unstructured text as reflected in por...

Statistical tests confirmed significant performance differences (p ≤ 0.01).

Reported inferential statistics in results: statistical tests comparing strategy performances produced p-values at or below 0.01.

high mixed Few-Shot Portfolio Optimization: Can Large Language Models O... statistical significance of performance differences between strategies

Susceptibility to visual priming varies across state-of-the-art VLMs.

Comparative experiments run across multiple state-of-the-art vision-language models showing differential changes in IPD behavior when exposed to the same visual primes and color cues. (Paper notes variation in susceptibility and mitigation effectiveness across models; specific model list and per-model sample sizes not given in the abstract.)

high mixed The Effects of Visual Priming on Cooperative Behavior in Vis... magnitude of change in cooperation/defection behavior due to visual priming, per...

Color-coded reward matrices alter VLM decision patterns.

Experimental condition varying the visual presentation of the IPD payoff matrix (color-coding of rewards) and measuring resulting decision patterns of multiple VLMs in IPD trials. (Reported as part of the experimental setup across models; exact counts not provided in abstract.)

high mixed The Effects of Visual Priming on Cooperative Behavior in Vis... changes in cooperation/defection choices in IPD when reward matrices are color-c...

VLM behavior can be influenced by image content depicting behavioral concepts (kindness/helpfulness vs. aggressiveness/selfishness).

Experimental manipulation in the Iterated Prisoner's Dilemma (IPD): VLMs were exposed to images labeled/connoting 'kindness/helpfulness' versus 'aggressiveness/selfishness' and subsequent choices in IPD rounds were recorded across multiple state-of-the-art VLMs. (Paper reports experiments across multiple VLMs; exact sample sizes per model/condition not stated in the abstract.)

high mixed The Effects of Visual Priming on Cooperative Behavior in Vis... cooperation rate (choice to cooperate vs. defect) in Iterated Prisoner's Dilemma...

AI adoption leads both to job displacement and job creation, including the emergence of new occupational categories.

Abstract states the review examines empirical evidence on both job displacement and creation and the emergence of new occupations; no numeric counts or sample sizes provided in abstract.

high mixed AI and the Transformation of Human Employment: Challenges, O... job destruction and creation; emergence of new occupations

The study identifies short-term transitional risks and long-term productivity gains associated with AI integration in the workforce.

Abstract states the paper evaluates both short-term risks and long-term productivity gains from AI integration based on the reviewed literature; no empirical quantification given in abstract.

high mixed AI and the Transformation of Human Employment: Challenges, O... transitional risks and productivity gains

AI-driven automation and augmentation are reshaping employment landscapes, with emphasis on sector-level disruption, skill transformation, and socioeconomic consequences.

Abstract states this as a conclusion of the review drawing on interdisciplinary empirical literature; no specific studies or sample sizes cited in abstract.

high mixed AI and the Transformation of Human Employment: Challenges, O... employment landscape changes (sector disruption, skill transformation, socioecon...

The accelerating deployment of artificial intelligence across industries has fundamentally altered the structure of global labour markets.

Statement in abstract summarizing a systematic review of interdisciplinary literature (economics, computer science, organizational behaviour, public policy); no specific sample size reported in abstract.

high mixed AI and the Transformation of Human Employment: Challenges, O... structure of global labour markets

The magnitude of AI’s effect on potential GDP varied across industries and depended on the level of digital maturity, human resources, and institutional conditions.

Decompositional analysis across aggregated industry data and scenario-based modeling drawing on sectoral sources and reviews.

high mixed THE IMPACT OF AI ON POTENTIAL GDP AND LONG-TERM ECONOMIC GRO... industry-specific magnitude of AI contribution to GDP

Firms may continue to exist as legal and physical entities, but their coordinating function will be displaced as they become data nodes within regionally governed AI infrastructure.

Predictive/conceptual claim within the framework; no empirical sample reported in the excerpt and presented as a theoretical outcome of Interface Internalization.

high mixed Structural Dissolution: How Artificial Intelligence Dismantl... change in the coordinating role of firms (from coordinators to data nodes)

The Structural Dissolution Framework challenges the Coasian view that organizational boundaries are determined by transaction cost minimization, arguing that AI makes such boundaries economically obsolete.

Theoretical critique of transaction-cost-based explanations for firm boundaries presented in the paper; argumentative and conceptual rather than supported by empirical tests in the provided summary.

high mixed Structural Dissolution: How Artificial Intelligence Dismantl... economic relevance of transaction-cost-based firm boundaries

Regional data sovereignty entities will emerge as organizational forms that replace the coordinating role of firms and markets.

Normative/predictive claim within the paper's framework arguing for new organizational forms (regional data sovereignty entities); illustrated conceptually (e.g., through resource-dependent regional economies) rather than empirically tested in the provided text.

high mixed Structural Dissolution: How Artificial Intelligence Dismantl... emergence of regional data sovereignty entities as coordinators

Domain-specific data refinement infrastructure will become the new basis of positional control in industries.

Theoretical claim in the framework asserting a shift in positional control to data refinement infrastructure; presented as a predicted structural outcome rather than supported by empirical data in the provided text.

high mixed Structural Dissolution: How Artificial Intelligence Dismantl... basis of positional control (movement to data refinement infrastructure)

AI adoption moves value creation away from physical resources and human collaboration toward continuous token flows produced through data refinement loops.

Theoretical/analytical claim within the Structural Dissolution Framework and illustrative discussion; no empirical quantification provided in the text excerpt.

high mixed Structural Dissolution: How Artificial Intelligence Dismantl... source of value creation (physical/human → data/token flows)

The mechanism driving this restructuring is 'Interface Internalization', through which inter-agent coordination is absorbed into intra-system computation.

Conceptual mechanism defined and argued in the paper; presented as the central theoretical mechanism rather than as an empirically validated finding.

high mixed Structural Dissolution: How Artificial Intelligence Dismantl... shift of coordination from inter-agent (firms/markets) to intra-system computati...

AI dissolves the boundaries that once separated firms, markets, experts, and consumers by internalizing human multimodal interfaces (language, vision, and behavioral data) into computational systems.

Theoretical argument and conceptual framework introduced in the paper (Structural Dissolution Framework); no empirical sample or quantitative analysis reported for this claim in the text provided.

high mixed Structural Dissolution: How Artificial Intelligence Dismantl... dissolution of boundaries between firms, markets, experts, and consumers

Failures are structured by task family and execution surface, with HR, management, and multi-system business workflows as persistent bottlenecks and local workspace repair comparatively easier but unsaturated.

Error-mode analysis across the 105 tasks and evaluated models reported in experiments; authors identify task-family-level patterns (HR, management, multi-system workflows) and relative ease of local workspace repair.

high mixed Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-Wor... failure distribution by task family / execution surface

Whether LLM-based assistants improve or degrade code quality remains unresolved: existing studies report contradictory outcomes contingent on context and evaluation criteria.

Review finds mixed/contradictory findings across included studies regarding code quality effects.

high mixed The Impact of LLM-Assistants on Software Developer Productiv... code quality (e.g., correctness, maintainability, defects)

Architectural interventions can instead be used to trade off personalization against preference privacy.

Proposed solution described in the paper (architectural interventions) as an alternative to prompt-level fixes; presented as a design tradeoff rather than empirically validated mitigation in the excerpt.

high mixed When Agents Shop for You: Role Coherence in AI-Mediated Mark... trade-off between personalization and preference privacy under architectural int...

AI-driven automation marks the beginning of a new political era—one in which the role of work in society becomes a central axis of welfare conflict.

Theoretical and interpretive claim in the paper, motivated by the survey findings and broader argumentation about political consequences.

high mixed AI, the Future of Work, and the Politics of the Welfare Stat... political salience of 'the role of work' in welfare politics / emergence of new ...

The system tends to be factually correct when it answers but often omits information (i.e., 'the system is right when it answers — it just leaves things out').

Interpretation combining reported factual accuracy (85.5%) with low completeness (0.40) from benchmark results.

high mixed Benchmarking Complex Multimodal Document Processing Pipeline... factual accuracy vs. answer completeness

Differences between models are large enough to shape outcomes in practice, so reliability should be incorporated alongside average performance when assessing and deploying LLMs in high-stakes decision contexts.

Authors' interpretation of empirical differences in funding decisions, scores, confidence, and reliability across models in the controlled experiment; presented as an implication/recommendation.

high mixed Algorithmic personalities and the myth of neutrality: financ... policy recommendation regarding assessment criteria (reliability + average perfo...

This hybrid Make governance form has qualitatively different economics, capability requirements, and governance structures than pre-AI in-house development.

Paper's conceptual comparison between pre-AI hierarchy and post-AI hybrid Make governance (theoretical reasoning and examples; no empirical quantification).

high mixed The Buy-or-Build Decision, Revisited: How Agentic AI Changes... economics and capability requirements of in-house development governance

AI reshapes seven canonical decision determinants for make-or-buy choices: cost, strategic differentiation, asset specificity, vendor lock-in, time-to-market, quality and compliance, and organizational capability.

Paper's factor-level conceptual analysis enumerating and discussing seven determinants (theoretical synthesis rather than empirical measurement).

high mixed The Buy-or-Build Decision, Revisited: How Agentic AI Changes... sensitivity of canonical make-or-buy determinants to AI

Demographic characteristics intersect with AI exposure; that is, exposure varies across demographic groups.

Paper reports that it examines how demographic characteristics intersect with exposure based on recent empirical studies; no demographic breakdowns or sample sizes provided in the abstract.

high mixed AI Displacement Risk in the Labor Market: Evidence, Exposure... variation in AI exposure across demographic groups

Recent studies combine task-level exposure metrics with employment and usage data to assess AI exposure and impacts.

Paper notes that it draws on studies that use task-level exposure metrics alongside employment and usage data; methodological claim rather than a quantitative result.

high mixed AI Displacement Risk in the Labor Market: Evidence, Exposure... measurement approach for AI exposure (task-level exposure linked to employment/u...

Generative large language models (LLMs) present organizations with a transformative technology whose labor market implications remain nascent yet consequential.

Statement in paper synthesizing emerging empirical research; no specific study, method, or sample size reported in the abstract.

high mixed AI Displacement Risk in the Labor Market: Evidence, Exposure... labor market implications (disruption and augmentation)

The adoption of AI in Israel constitutes a systemic transformation of employment relations, necessitating doctrinal adaptation and institutional reform to keep the labor market aligned with foundational legal principles.

Synthesis and conclusion from the paper's combined legal and empirical analysis; presented as the author's overarching interpretive claim rather than as a specific quantified finding.

high mixed Artificial Intelligence in Israel, Trends, Developments, and... degree of systemic transformation of employment relations and need for doctrinal...
