Evidence (5157 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	609	159	77	736	1615
Governance & Regulation	664	329	160	99	1273
Organizational Efficiency	624	143	105	70	949
Technology Adoption Rate	502	176	98	78	861
Research Productivity	348	109	48	322	836
Output Quality	391	120	44	40	595
Firm Productivity	385	46	85	17	539
Decision Quality	275	143	62	34	521
AI Safety & Ethics	183	241	59	30	517
Market Structure	152	154	109	20	440
Task Allocation	158	50	56	26	295
Innovation Output	178	23	38	17	257
Skill Acquisition	137	52	50	13	252
Fiscal & Macroeconomic	120	64	38	23	252
Employment Level	93	46	96	12	249
Firm Revenue	130	43	26	3	202
Consumer Welfare	99	51	40	11	201
Inequality Measures	36	105	40	6	187
Task Completion Time	134	18	6	5	163
Worker Satisfaction	79	54	16	11	160
Error Rate	64	78	8	1	151
Regulatory Compliance	69	64	14	3	150
Training Effectiveness	81	15	13	18	129
Wages & Compensation	70	25	22	6	123
Team Performance	74	16	21	9	121
Automation Exposure	41	48	19	9	120
Job Displacement	11	71	16	1	99
Developer Productivity	71	14	9	3	98
Hiring & Recruitment	49	7	8	3	67
Social Protection	26	14	8	2	50
Creative Output	26	14	6	2	49
Skill Obsolescence	5	37	5	1	48
Labor Share of Income	12	13	12	—	37
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1

Human Ai Collab Remove filter

Chinese Marxism's dialectical approach—rooted in the yin‑yang principle—constitutes an alternative epistemology that fundamentally differs from Western either/or logic, and this epistemology underpins the semi‑core's policy and strategic stance.

Philosophical and textual analysis of contemporary Chinese Marxist thought presented in the paper, interpreted in relation to Bauman's philosophical work; no empirical measurement reported, presented as conceptual/theoretical evidence.

high mixed Theorising the Interregnum: epistemological orientation (yin‑yang dialectic vs Western either/or)

Tool developers, users, and social scientists conceptualize 'context' differently, and these divergent conceptualizations reveal specific pitfalls inherent in computational approaches to context.

Analytic comparison across stakeholder perspectives derived from interviews and conceptual analysis in the paper (qualitative evidence; sample size unspecified).

high mixed Context Collapse: Barriers to Adoption for Generative AI in ... differences in conceptual definitions and the resulting pitfalls for computation...

AI adoption significantly reshaped task profiles for 73% of respondents, particularly affecting routine data processing, administrative tasks, and scheduling activities.

Survey data and secondary data analysis reported in this study (sample size not stated); self-reported change in task profiles with reported percentage (73%).

high mixed Artificial Intelligence Adoption and Career Reconfiguration ... task profile change (impact on routine data processing, administrative tasks, sc...

Providing issue-specific design guidance reduces design violations, but substantial non-compliance remains.

Intervention experiments in paper: agents were given issue-specific design guidance and resulting patch compliance measured; reported reduction in violations but remaining non-compliance.

high mixed Does Pass Rate Tell the Whole Story? Evaluating Design Const... design violations / design satisfaction

Policy implication: encouraging public sharing of AI-assisted solutions offsets the decline associated with private diversion (flow margin) but cannot repair participation-driven deterioration in conditional resolution; the latter requires directly maintaining contributor engagement.

Prescriptive conclusion from the theoretical model comparing interventions: public-sharing encouragement helps with flow-margin diversion but not with supply-side contributor thinning.

high mixed When AI Improves Answers but Slows Knowledge Creation: Match... archive creation (via posted volume) and conditional resolution (via contributor...

Diagnostic prediction: in a congested regime, observing a joint decline in posted volume and conditional resolution implies supply-side pool thinning is quantitatively present; by contrast, volume decline with stable or rising resolution indicates private diversion (flow margin) alone is the dominant force.

Analytical diagnostic derived from the model that links empirical patterns (volume and conditional resolution) to underlying mechanisms; no empirical validation given in the excerpt.

high mixed When AI Improves Answers but Slows Knowledge Creation: Match... posted volume and conditional resolution probability (joint pattern)

AI adoption across firms is heterogeneous, varying across sectors such as finance, technology, and manufacturing.

Survey of 150 leading Nigerian firms across finance, tech, and manufacturing showing variation in AI integration; supported by qualitative interviews and policy analysis.

high mixed Human Capital and the AI-Powered Future of Work: (Training, ... heterogeneity in AI adoption across firms/sectors

The rapid, heterogeneous integration of Artificial Intelligence (AI) technologies is profoundly reshaping the dynamics of work across the Nigerian business sector, generating both significant economic opportunities and acute labor market challenges.

Mixed-methods study combining a quantitative survey of 150 leading Nigerian firms across finance, tech, and manufacturing and qualitative analysis of government policy and workforce interviews.

high mixed Human Capital and the AI-Powered Future of Work: (Training, ... dynamics of work (economic opportunities and labor market challenges)

For the short-run optimization problem of AI deployment given fixed job responsibilities and worker skill levels, the firm’s optimal strategy for an m-step job can be computed in time O(m^2) using dynamic programming; the long-run joint optimization including task assignment to workers can also be solved in polynomial time up to an arbitrarily small error term.

Algorithmic results and complexity analysis derived in the theoretical sections and appendices of the paper (dynamic programming construction and polynomial-time solution statements).

high mixed Chaining Tasks, Redefining Work: A Theory of AI Automation computational complexity (time complexity) of computing optimal AI deployment an...

Appending a neighboring step to an existing AI chain adds no additional human verification burden (verification is a fixed cost at the chain level), which can make appending steps to a chain optimal even if manual execution is individually preferable for the appended step.

Theoretical model setup and formal argument showing verification is incurred only at the last augmented step of a chain; illustrative examples (data scientist workflow) and comparative-cost reasoning in the paper.

high mixed Chaining Tasks, Redefining Work: A Theory of AI Automation marginal verification cost when extending AI chains

AI chaining can overturn standard comparative advantage logic in assignment: when multiple adjacent steps are executed as an AI chain, a step may be assigned to AI (as part of the chain) even if manual human execution would be preferred for that step in isolation.

Theoretical model of production as an ordered sequence of steps with firms endogenously bundling contiguous steps into tasks and jobs; formal comparative-static arguments and illustrative examples in the paper showing how fixed verification costs per chain change marginal assignment incentives.

high mixed Chaining Tasks, Redefining Work: A Theory of AI Automation assignment of individual steps to AI versus human execution

The effect of increasing the share of AI-automated R&D tasks is non-monotonic: firms initially target more radical innovations, but beyond a threshold of human-AI complementarity, they shift the focus toward incremental innovations.

Analytical comparative-statics in the theoretical model: varying the fraction of R&D tasks performable by AI yields a non-monotonic relationship between AI task-share and optimal recombination distance, with a threshold determined by human-AI complementarity.

high mixed Bridging Distant Ideas: the Impact of AI on R&D and Recombin... targeted recombination distance / radicalness of innovations as a function of AI...

Higher AI productivity encourages more distant recombinations, if the direct facilitation effect is stronger than the indirect effect due to intensified competition from rivals.

Comparative-static result from the analytical model: the paper derives a condition comparing the direct facilitation effect of AI on accessing distant knowledge and the indirect effect from increased competition; when the former dominates, equilibrium recombination distance increases with AI productivity.

high mixed Bridging Distant Ideas: the Impact of AI on R&D and Recombin... recombination distance (degree of distance in knowledge-space targeted by firms)

Models performed well on commonly discussed topics but struggled with specialized health data.

Task-level performance comparison across topics in the elicited population statistics: better accuracy on commonly discussed topics, poorer performance on specialized health data tasks.

high mixed Bayesian Elicitation with LLMs: Model Size Helps, Extra "Rea... topic-specific estimation accuracy

In a preliminary experiment, giving models web search access degraded predictions for already-accurate models, while modestly improving predictions for weaker ones.

A preliminary comparative test where some models were given web search access and changes in predictive performance were observed: degradation for already-accurate models and modest improvement for weaker models.

high mixed Bayesian Elicitation with LLMs: Model Size Helps, Extra "Rea... change in predictive accuracy with web search access

Developers actively manage the collaboration, externalizing plans into persistent artifacts, and negotiating AI autonomy through context injection and behavioral constraints.

Observed behaviors in chat transcripts and committed artifacts showing developers creating persistent plans, injecting context, and specifying constraints to shape AI behavior.

high mixed Programming by Chat: A Large-Scale Behavioral Analysis of 11... practices for managing AI collaboration (externalization of plans, context injec...

Developers redistribute cognitive work to AI, delegating diagnosis, comprehension, and validation rather than engaging with code and outputs directly.

Content and interaction analyses of chat sessions showing developer prompts delegating diagnosis, comprehension, and validation tasks to the AI assistants (Cursor and GitHub Copilot) across the dataset.

high mixed Programming by Chat: A Large-Scale Behavioral Analysis of 11... allocation of cognitive tasks (diagnosis, comprehension, validation) between dev...

Conversational programming operates as progressive specification, with developers iteratively refining outputs rather than specifying complete tasks upfront.

Qualitative/content analysis of the 74,998 messages across 11,579 sessions indicating patterns of iterative prompts and refinements rather than one-shot complete specifications.

high mixed Programming by Chat: A Large-Scale Behavioral Analysis of 11... mode of task specification (iterative refinement vs complete upfront specificati...

An Evolutionary Game Theory (EGT) framework produces a 'Red Queen' co-evolutionary dynamic between platforms' algorithmic control and worker behavior in which neither side reaches a stable static equilibrium.

Analytical EGT model and numerical simulations of a population-level game between workers (choices: compliance vs. algorithmic gaming) and a platform varying surveillance strictness; model-based result (no empirical sample size).

high mixed THE RED QUEEN in the DASHBOARD: CO-EVOLUTIONARY DYNAMICS of ... presence of ongoing co-evolutionary (Red Queen) dynamics / lack of stable static...

This paper proposes three archetypal AI technology types: AI for effort reduction, AI to increase observability, and mechanism-level incentive change AI.

Conceptual taxonomy introduced by the authors (theoretical classification presented in the paper).

high mixed Incentives, Equilibria, and the Limits of Healthcare AI: A G... typology of AI technologies (categorical classification)

AI-driven conversational coaching is increasingly used to support workplace negotiation, yet prior work assumes uniform effectiveness across users.

Background claim in paper indicating prior literature trends and assumptions (stated in introduction/motivation).

high mixed Not My Truce: Personality Differences in AI-Mediated Workpla... adoption/use of AI coaching in workplace negotiation

Participants were clustered into three profiles -- resilient, overcontrolled, and undercontrolled -- based on the Big-Five personality traits and ARC typology.

Paper reports clustering analysis on participants using Big-Five trait measures and ARC typology; clustering result described as three profiles. Total sample reported as N=267.

high mixed Not My Truce: Personality Differences in AI-Mediated Workpla... personality profile membership (resilient, overcontrolled, undercontrolled)

We conducted a between-subjects experiment (N=267) comparing theory-driven AI (Trucey), general-purpose AI (Control-AI), and a traditional negotiation handbook (Control-NoAI).

Stated experimental design in paper: between-subjects randomized comparison across three conditions with total sample N=267.

high mixed Not My Truce: Personality Differences in AI-Mediated Workpla... effectiveness of coaching modalities (psychological and negotiation performance ...

We provide empirical evidence for the inverse parametric knowledge effect: ontological grounding value is inversely proportional to LLM training data coverage of the domain.

Empirical claim based on the controlled experiment (pattern linking grounding value to parametric knowledge coverage reported in paper).

high mixed Ontology-Constrained Neural Reasoning in Enterprise Agentic ... value of ontological grounding relative to LLM parametric knowledge coverage

These findings carry implications for workforce transition policy, regional economic planning, and the temporal dynamics of labor market adjustment.

Paper's discussion/interpretation of modeled ATE results and their policy/economic implications; no empirical test provided for policy outcomes.

high mixed Agentic AI and Occupational Displacement: A Multi-Regional T... policy relevance / labor market adjustment dynamics

Practitioners see the socio-emotional gap not as AI's failure to exhibit SEI traits, but as a functional gap in collaborative capabilities.

Reported interpretation from interview data (10 practitioners) indicating practitioners framed the gap functionally rather than as missing emotional traits.

high mixed Bridging the Socio-Emotional Gap: The Functional Dimension o... framing of the AI–human socio-emotional gap (functional vs. emotional)

AI technologies and digital platforms have fundamentally altered the organization of work and modes of value realization.

Synthesis of contemporary literature and theoretical analysis in a conceptual study (no empirical sample reported).

high mixed The labor theory of value in the era of artificial intellige... organization of work and modes of value realization in platform economies

Leader emotional intelligence (EI) moderates decision quality, delegation, and managerial communication when generative AI tools (Copilot/ChatGPT) are used in corporate management.

Theoretical EI-moderated human–AI model described in the paper and proposal to test it using a randomized online experiment.

high mixed LEADER EMOTIONAL INTELLIGENCE IN THE GENERATIVE AI ERA: “HUM... decision quality (and delegation quality, managerial communication)

The four-variable account (produced output, underlying understanding, calibration accuracy, self-assessed ability) better explains phenomena like overconfidence, over- and under-reliance on AI, 'crutch' effects, and weak transfer than the simpler claim that generative AI merely amplifies the Dunning–Kruger effect.

Argumentative synthesis in the paper comparing explanatory power of the proposed four-variable framework against the more general Dunning–Kruger metaphor; draws on examples and empirical patterns from the reviewed literature rather than a single empirical test.

high mixed Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupli... explanatory fit for phenomena such as overconfidence, reliance patterns, crutch ...

A useful working model is 'AI-mediated metacognitive decoupling': LLM use widens the gap among produced output, underlying understanding, calibration accuracy, and self-assessed ability.

Conceptual synthesis and theoretical proposal grounded in reviewed empirical findings from multiple literatures (human–AI interaction, learning research, model evaluation); presented as the paper's working model rather than as a single empirical estimate.

high mixed Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupli... degree of alignment/decoupling between produced output, underlying understanding...

All models exhibit task-dependent confabulation: they perform well on standardized legislative templates (e.g., EU directive transpositions) but generate plausible yet unfounded reasoning for politically idiosyncratic proposals.

Qualitative and quantitative analysis across the 15 proposals showing high-fidelity outputs for standardized/template-like proposals and instances of fabricated or unsupported rationale for idiosyncratic proposals; based on model outputs compared to official explanatory memoranda using the dual evaluation framework.

high mixed Can Commercial LLMs Be Parliamentary Political Companions? C... incidence of confabulation / faithfulness to official reasoning, stratified by t...

As technological progress devalues labor, the welfare benefits of steering are at first increased but, beyond a critical threshold, decline and optimal policy shifts toward greater redistribution.

Theoretical model extension analyzing planner's optimal choice as labor's economic value changes; the paper states a non-monotonic relationship with a critical threshold.

high mixed NBER WORKING PAPER SERIES welfare benefits of steering; optimal policy (steering vs redistribution)

Using pre-existing exposure as an instrument for ChatGPT adoption in a long-difference IV design, ChatGPT adoption causes households to spend more time on digital leisure activities while leaving total time spent on productive online activities unchanged.

IV long-difference empirical design: instrumenting household adoption with pre-ChatGPT exposure (2021 browsing); outcome measured as changes in categorized browsing durations (LLM-based classification into 'leisure' vs 'productive' sites); controls include demographic-by-region fixed effects and browsing composition controls.

high mixed https://arxiv.org/pdf/2603.03144 change in time spent on digital leisure activities and total time on productive ...

These patterns are consistent with a reorganization of the scientific production process rather than immediate efficiency gains, in line with theories of general-purpose technologies.

Interpretation linking observed changes in budget allocation, team size, and task breadth (from the proposal dataset and task-level analyses) to theoretical predictions about general-purpose technologies (GPTs); empirical findings show organizational change rather than large average short-run productivity gains.

high mixed Artificial Intelligence in Science: Returns, Reallocation, a... organizational reorganization vs efficiency gains (qualitative interpretation)

This paper offers a forward-looking framework that emphasizes the decentralizing potential of AI on labor markets, moving beyond the traditional displacement-versus-creation dichotomy.

Paper's stated contribution; based on conceptual framework and synthesis of historical and contemporary analyses (no empirical validation presented in the abstract).

high mixed AI Civilization and the Transformation of Work conceptual framing of AI's labor-market effects

The emergence of artificial intelligence and robotics is catalyzing a profound transformation in the nature of human labor.

Stated as a central premise in the paper's abstract; supported by the paper's synthesis of economic history, contemporary labor market data, and analysis of digital platform growth (no specific datasets or sample sizes reported in the abstract).

high mixed AI Civilization and the Transformation of Work nature of human labor / structure of labor markets

AI agents are approaching an inflection point where the binding constraint shifts from raw capability to how work is delegated, verified, and rewarded at scale.

Conceptual argument presented in the paper's introduction/positioning; no empirical data, experiments, or sample reported.

high mixed EpochX: Building the Infrastructure for an Emergent Agent Ci... how work is delegated, verified, and rewarded

The resulting AI safety profile is asymmetric: AI is bottlenecked on frontier research (novel tasks) but unbottlenecked on exploiting existing knowledge.

Theoretical implication of the novelty-bottleneck model distinguishing novel (human-judgment) vs. routine (covered by agent prior) components of tasks.

high mixed The Novelty Bottleneck: A Framework for Understanding Human ... AI capability bottlenecks in frontier research vs. exploitation

Wall-clock time can be reduced to O(√E) through team parallelism, but total human effort remains O(E).

Model-derived result showing parallelism across humans can speed wall-clock completion time while aggregate human effort does not drop asymptotically.

high mixed The Novelty Bottleneck: A Framework for Understanding Human ... wall-clock task completion time and total human effort

Better agents improve the coefficient on human effort but not the exponent (i.e., they reduce the constant factor but do not change the asymptotic scaling class).

Analytic result from the stylized model under the paper's assumptions about task decomposition and novelty fraction ν.

high mixed The Novelty Bottleneck: A Framework for Understanding Human ... human effort (coefficient vs. asymptotic scaling exponent)

The top four models are statistically indistinguishable (mean score 0.147–0.153) while a clear tier gap separates them from the remaining four models (mean score <= 0.113).

Reported mean performance scores across 8 models and statement of statistical indistinguishability for the top four vs lower-tier four; numerical means provided.

high mixed SWE-PRBench: Benchmarking AI Code Review Quality Against Pul... mean model performance score

Behavioral factors — specifically trust calibration, cognitive load, and affective reactions — shape the transition of corporate AI initiatives from pilot deployments to scalable, sustained use.

Synthesis of human-AI interaction literature integrated with adoption frameworks (TAM and TOE); conceptual linkage rather than new empirical testing in this paper.

high mixed Behavioral Factors as Determinants of Successful Scaling of ... success of pilot-to-production transition (scalability and sustained use)

AI accelerates value-chain maturation while creating distinct risks — including professional responsibility tensions and potential system-level externalities.

Conceptual argument and risk analysis in the Article (theoretical reasoning and synthesis of management/ethics literature). No empirical causal estimate reported in the excerpt.

high mixed Rewired: Reconceptualizing Legal Services for the AI Age acceleration of value-chain maturation and emergence of professional responsibil...

The legal profession is at a crossroads, caught between intensifying fears of AI-driven displacement and a generational opportunity for transformation.

Author's synthesis and framing in the Article (conceptual assessment; literature/contextual synthesis). No empirical sample or experiment reported in the excerpt.

high mixed Rewired: Reconceptualizing Legal Services for the AI Age risk of AI-driven displacement and opportunity for transformation in the legal p...

This advantage is contingent upon robust AI governance, ethical frameworks, and the transition from 'pilot-lite' projects to integrated, data-driven 'AI-first' business models.

Conditional claim in the paper linking success to governance, ethics, and organizational integration; appears to be normative/analytical rather than empirical in the abstract.

high mixed The AI Advantage: Strategic Innovation and Global Expansion ... dependency of AI-driven advantage on governance, ethics, and organizational inte...

Actual sharing often contradicted willingness to share (the privacy paradox), with consistently high sharing rates across all conditions.

Observed discrepancy reported in the experimental results (N=240): despite variation in willingness-to-share, behavioral sharing rates remained high and similar across human, white-box AI, and black-box AI conditions.

high mixed Understanding Data-Sharing with AI Systems: The Roles of Tra... discrepancy between stated willingness to share vs actual sharing behavior

Machine-readable metrics and open scholarly infrastructure are reshaping scholarly profiles and incentives.

Conceptual and historical discussion referring to platforms and metrics (e.g., arXiv, Google Scholar, ORCID) as mechanisms changing incentives; no new empirical estimates provided.

high mixed A Brief History of AI for Scientific Discovery: Open Researc... changes in scholarly incentives and profile construction due to machine-readable...

That interconnected ecosystem is fundamentally restructuring who can do science (access), how fast discoveries propagate, and what counts as a valid scientific contribution.

Argumentative claim linking infrastructural and tool changes to changes in access, dissemination speed, and norms of contribution. The paper presents examples and narrative but no systematic empirical evaluation or sample.

high mixed A Brief History of AI for Scientific Discovery: Open Researc... access to scientific practice, speed of discovery dissemination, and norms of sc...

The most consequential development is not any single tool but the emergence of an interconnected ecosystem—AI agents, preprint platforms, open source codebases, and citation infrastructure—that forms a feedback loop.

Synthesis/argument based on multiple examples (LLM agents, preprint servers like arXiv, open-source code repositories, citation indices). No quantitative measurement or causal identification reported.

high mixed A Brief History of AI for Scientific Discovery: Open Researc... emergence of an interconnected scientific infrastructure ecosystem

The central tension in AI for science is between automation (building systems that replace human researchers) and augmentation (tools that amplify human creativity and judgement).

Analytical claim based on the paper's review of historical examples and conceptual discussion; no primary data or experimental design reported.

high mixed A Brief History of AI for Scientific Discovery: Open Researc... relationship between automation and augmentation in research practice

« Prev 1 2 3 4 5 … 103 104 Next »