Evidence (5157 claims)
Adoption: 7395 claims
Productivity: 6507 claims
Governance: 5877 claims
Human-AI Collaboration: 5157 claims
Innovation: 3492 claims
Org Design: 3470 claims
Labor Markets: 3224 claims
Skills & Training: 2608 claims
Inequality: 1835 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 609 | 159 | 77 | 736 | 1615 |
| Governance & Regulation | 664 | 329 | 160 | 99 | 1273 |
| Organizational Efficiency | 624 | 143 | 105 | 70 | 949 |
| Technology Adoption Rate | 502 | 176 | 98 | 78 | 861 |
| Research Productivity | 348 | 109 | 48 | 322 | 836 |
| Output Quality | 391 | 120 | 44 | 40 | 595 |
| Firm Productivity | 385 | 46 | 85 | 17 | 539 |
| Decision Quality | 275 | 143 | 62 | 34 | 521 |
| AI Safety & Ethics | 183 | 241 | 59 | 30 | 517 |
| Market Structure | 152 | 154 | 109 | 20 | 440 |
| Task Allocation | 158 | 50 | 56 | 26 | 295 |
| Innovation Output | 178 | 23 | 38 | 17 | 257 |
| Skill Acquisition | 137 | 52 | 50 | 13 | 252 |
| Fiscal & Macroeconomic | 120 | 64 | 38 | 23 | 252 |
| Employment Level | 93 | 46 | 96 | 12 | 249 |
| Firm Revenue | 130 | 43 | 26 | 3 | 202 |
| Consumer Welfare | 99 | 51 | 40 | 11 | 201 |
| Inequality Measures | 36 | 105 | 40 | 6 | 187 |
| Task Completion Time | 134 | 18 | 6 | 5 | 163 |
| Worker Satisfaction | 79 | 54 | 16 | 11 | 160 |
| Error Rate | 64 | 78 | 8 | 1 | 151 |
| Regulatory Compliance | 69 | 64 | 14 | 3 | 150 |
| Training Effectiveness | 81 | 15 | 13 | 18 | 129 |
| Wages & Compensation | 70 | 25 | 22 | 6 | 123 |
| Team Performance | 74 | 16 | 21 | 9 | 121 |
| Automation Exposure | 41 | 48 | 19 | 9 | 120 |
| Job Displacement | 11 | 71 | 16 | 1 | 99 |
| Developer Productivity | 71 | 14 | 9 | 3 | 98 |
| Hiring & Recruitment | 49 | 7 | 8 | 3 | 67 |
| Social Protection | 26 | 14 | 8 | 2 | 50 |
| Creative Output | 26 | 14 | 6 | 2 | 49 |
| Skill Obsolescence | 5 | 37 | 5 | 1 | 48 |
| Labor Share of Income | 12 | 13 | 12 | 0 | 37 |
| Worker Turnover | 11 | 12 | 0 | 3 | 26 |
| Industry | 0 | 0 | 0 | 1 | 1 |
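As a worked example of reading the matrix, the counts in each row can be turned into direction-of-finding shares. A minimal Python sketch (counts copied from two rows above; note that some row totals in the table exceed the sum of the four direction columns, so shares here are computed over the direction columns only):

```python
# Illustrative: compute direction-of-finding shares from rows of the
# evidence matrix (counts copied from the table above). Shares are
# taken over the four direction columns, not the "Total" column,
# since totals can include claims without a coded direction.
rows = {
    "Firm Productivity": {"positive": 385, "negative": 46, "mixed": 85, "null": 17},
    "Job Displacement": {"positive": 11, "negative": 71, "mixed": 16, "null": 1},
}

def direction_shares(counts: dict) -> dict:
    """Normalize a row of direction counts to shares (rounded to 3 dp)."""
    total = sum(counts.values())
    return {k: round(v / total, 3) for k, v in counts.items()}

for outcome, counts in rows.items():
    print(outcome, direction_shares(counts))
```

For example, Firm Productivity claims skew strongly positive (about 72% of direction-coded claims), while Job Displacement claims skew strongly negative (about 72% negative).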
Human-AI Collaboration
The model shows when these systems become vulnerable to strategic use from within government.
Analytical result derived from the paper's formal theoretical model (no empirical validation reported).
The compliance layer can also create a stable approval boundary that political successors learn to navigate while preserving the appearance of lawful administration.
Stated conclusion/insight from the paper's formal argument and conceptual framing (theoretical, no empirical sample).
Manual tools like mind maps support structure creation but lack intelligent (AI) assistance.
Paper's comparison of manual tools versus AI-augmented tools (background/related-work discussion; no empirical evaluation reported for this claim).
Current LLM-based systems let users query information but do not let users shape how knowledge is organized.
Paper's analysis of existing tools and limitations (literature/feature comparison described in introduction; no new empirical test reported).
Knowledge workers face increasing challenges in synthesizing information from multiple documents into structured conceptual understanding.
Statement in paper's introduction/motivation; conceptual observation (no empirical data reported here).
In the absence of intervention, individually rational adoption of genAI will assuredly and profoundly reduce collective welfare.
Conclusion drawn from the paper's theoretical model (normative/predictive claim based on model dynamics; no empirical validation or sample reported in abstract).
Habit formation around genAI use can couple otherwise separate domains, so that adoption in low-stakes tasks spills over into high-value tasks and amplifies welfare losses.
Theoretical/model-based claim showing coupling across domains via habit formation (model extension; no empirical sample reported in abstract).
The introduction of genAI—while initially beneficial at the individual level—will reduce social welfare for the most important types of tasks.
Model-derived result: theoretical analysis indicates social-welfare reductions in high-value tasks despite individual gains (no empirical sample reported in abstract).
Generative models are vulnerable to model collapse: when trained on data generated by earlier versions of themselves, their outputs can lose diversity and accuracy.
Theoretical claim / conceptual claim presented in the paper (no empirical sample size given in abstract); refers to degradation of model outputs when trained on self-generated data.
Industrial robots are widely used in manufacturing, yet most manipulation still depends on fixed waypoint scripts that are brittle to environmental changes.
Background statement in the paper's introduction; general literature/field observation (no new primary data reported for this claim in the abstract).
Each new task domain requires painstaking, expert-driven harness engineering: designing the prompts, tools, orchestration logic, and evaluation criteria that make a foundation model effective.
Author assertion in the paper's introduction/abstract describing the state of practice; no empirical method, dataset, or sample size reported in the excerpt.
Ungoverned coupling between humans and AI can produce fragility, lock-in, polarization, and domination basins.
Theoretical/modeling analysis showing destabilizing dynamics and multiple basins of attraction when governance regularization is absent or weak; no empirical sample.
Classical robot ethics framed around obedience (e.g. Asimov's laws) is too narrow for contemporary AI systems.
Literature synthesis and conceptual argument drawing on developments in adaptive, generative, embodied, and embedded AI; no empirical sample reported.
Current evaluation proxies are insufficient for predicting downstream human impact.
Empirical results in the paper showing decoupling between standard quantitative proxies (e.g., sparsity, faithfulness) and human outcomes (clarity, decision utility, confidence) across datasets and analyst reviews.
A highlighting policy that is optimal for sophisticated agents can perform arbitrarily poorly when deployed to naive agents.
Constructive worst-case examples and theoretical bounds in the paper demonstrating arbitrarily large performance degradation when applying sophisticated-optimal policies to naive agents.
Optimizing highlighting for sophisticated agents can be computationally intractable, even in simple discrete and binary settings.
Theoretical complexity results and proofs in the paper showing hardness of the optimization problem under the sophisticated-agent model; no sample/calibration required (formal/algorithmic analysis).
Ethical concerns—such as transparency, explainability, psychological effects, and responsible AI governance—are critical factors influencing employability outcomes.
Review synthesis highlighting ethical issues from empirical and industry literature as influential on employability outcomes.
There are significant AI adoption challenges in education and industry that affect employability and role transformation.
Synthesized evidence from industry reports and empirical studies discussed in the review highlighting barriers to adoption in education and industry.
From the perspectives of 'personal subordination' and 'economic subordination', AIGC deeply and implicitly controls the labor process through mechanisms such as dynamic path planning, blurring the traditional boundaries used to determine labor relations.
Analytical/legal argument in the paper linking conceptual standards of subordination to specific algorithmic mechanisms (e.g., dynamic path planning); supported by mechanistic discussion but no reported empirical measurement or sample.
AIGC constantly challenges traditional standards for determining labor relations.
Paper's analytic claim based on conceptual/legal argument that algorithmic features of AIGC complicate application of existing labor-relation tests; no quantitative validation or sample size provided.
The transformation toward algorithmic enterprises raises critical concerns regarding agency, accountability, data monopolization, and algorithmic bias.
Presented as a principal concern in the paper's conceptual discussion and interdisciplinary critique; based on analysis of governance and ethical literature rather than new empirical evidence in the abstract.
Algorithmic management and monitoring have reduced employees’ autonomy and perceived work meaningfulness, contributing to 'AI anxiety' characterised by concerns about job loss, skill obsolescence, and diminished control.
Qualitative studies, survey evidence, and theoretical literature reviewed that document impacts of algorithmic management on autonomy, meaningfulness, and worker anxiety (mixed-methods literature).
Automation has intensified income inequality between high-skilled and low-skilled workers.
Synthesis of empirical literature linking automation adoption to widening wage and income gaps across skill groups (literature review).
Displacement effects have extended from manufacturing into cognitive roles such as clerical work and customer service.
Review of empirical studies documenting automation/substitution effects in cognitive, clerical, and customer-service roles (literature synthesis).
Automation has put downward pressure on wages.
Cited empirical studies and wage analyses in the reviewed literature indicating wage suppression associated with automation adoption (literature review).
AI and robotics have led to contractions in low-skilled occupations.
Synthesis of empirical literature reporting occupational contractions in low-skilled jobs following automation adoption (literature review).
Extensive empirical evidence shows that AI and robotics can substitute for rule-based, codifiable routine tasks.
Review cites extensive empirical studies demonstrating substitution of rule-based, codifiable routine tasks by AI/robotics (literature synthesis).
Artificial intelligence and robotic technologies are fundamentally reshaping labour markets and pose multifaceted challenges to workers engaged in routine and low-skilled tasks.
Narrative review of domestic and international scholarly literature over the past decade (literature review / synthesis).
Structural barriers, workforce biases, and digital skill gaps affect women’s participation in AI-enabled sectors.
Claim derived from the paper's synthesis of literature (peer-reviewed studies, policy analyses, preprints) identifying common barriers; the abstract does not report quantitative meta-analysis or specific sample sizes.
Vibe coding (unstructured GenAI-driven coding) promises rapid prototyping but often suffers from architectural drift, limited traceability, and reduced maintainability.
Paper asserts this as a motivating observation and characterizes vibe coding's weaknesses; the abstract frames these as commonly observed problems motivating the Shift-Up approach (no sample size given in abstract).
In post-AGI economies the presupposition of agent autonomy becomes nontrivial because artificial systems may exhibit varying degrees of autonomy, functioning as tools, delegates, strategic market actors, manipulators of choice environments, or possible welfare subjects.
Theoretical argumentation and conceptual classification in the paper; no empirical data reported (modeling/motivating discussion).
Scalable AI tutoring for procedural skill learning requires structured knowledge representations, yet constructing these representations remains a labor-intensive bottleneck.
Background/claim made in the paper's introduction framing the problem; no specific quantitative evidence reported in the abstract.
Under-represented groups tend to be systematically under-observed because of historical exclusion and selective feedback, which exacerbates uncertainty for those groups.
Conceptual claim supported by illustrative examples (e.g., lending context) and simulations demonstrating selective feedback effects; literature citation likely included in paper.
Policies that ignore the unobserved (counterfactual) space can harm decision makers (via unrealized gains or losses) and subjects (via compounding exclusion and reduced access).
Theoretical argumentation and illustrative examples (e.g., loan denial counterfactuals) and modelled simulations showing downstream harms when ignoring unobserved outcomes.
Experiments on simulated data with varying bias show that unequal uncertainty and selective feedback produce disparities across groups.
Simulation experiments described in the paper manipulate bias and feedback patterns and report resulting group disparities (synthetic datasets; experiment details in methods/results sections).
The study is grounded in Job Demands-Resources (JD-R) theory, positing that HAI-C task complexity is a job demand and that AI self-efficacy and humble leadership act as resources that can mitigate negative effects on engagement.
Introduction states JD-R theory as the theoretical basis and describes job demands (HAI-C task complexity) and job/personal resources (humble leadership, AI self-efficacy) in the hypothesized model.
HAI-C tech-learning anxiety reduces employees' work engagement (serves as the mediator between HAI-C task complexity and work engagement).
Mediation analysis via hierarchical regression and bootstrapping on the three-wave survey sample of 497 employees; reported in Results as the mediating mechanism.
Human-AI collaboration task complexity (HAI-C task complexity) negatively affects employees' work engagement by amplifying their HAI-C tech-learning anxiety.
Three-wave longitudinal survey of matched data from 497 employees; mediation analysis using hierarchical regression and bootstrapping reported in the Results section.
LLMs are not only less accurate on ideologically contested economic questions, but systematically less reliable in one ideological direction than the other, underscoring the need for direction-aware evaluation in high-stakes economic and policy settings.
Synthesis of empirical findings: lower accuracy on contested items, higher accuracy for intervention-aligned cases in 18/20 models, and error skew toward intervention-oriented predictions; policy recommendation follows from these empirical patterns.
This directional skew is not eliminated by one-shot in-context prompting.
Intervention of one-shot in-context prompting applied to models; evaluation shows the intervention-oriented error skew persists despite one-shot prompting.
Ideology-contested items are consistently harder than non-contested ones.
Comparison of model performance (accuracy) on contested subset (1,056 items) versus non-contested items in the 10,490-triplet benchmark; reported consistent lower accuracy on contested items.
Important boundary conditions include data maturity, process integration, governance discipline, and the degree of functional trust between finance and operating units.
List of boundary conditions reported in the paper based on documentary case analysis and synthesis with literature.
GenAI does not improve management accounting decision quality primarily by replacing managerial judgment.
Interpretive finding based on documentary analysis of disclosures from the three case firms and relevant literature; presented as a summary conclusion in the paper.
The stakes are particularly high in spreadsheet environments, where process and artifact are inseparable: each decision the agent makes is recorded directly in cells that belong to and reflect on the user.
Conceptual / domain-specific argument made by the authors (no empirical sample attached to the claim).
AI agents can perform sophisticated, multi-step knowledge work autonomously from start to finish, yet this process remains effectively inaccessible during execution: by the time users receive the output, all underlying decisions have already been made without their involvement.
Author assertion / conceptual description in the paper (no empirical quantification provided for this general statement).
Advances in AI agent capabilities have outpaced users' ability to meaningfully oversee their execution.
Author assertion / literature-level observation presented in the paper (no empirical sample reported for this claim).
Selective forgetting remains underexplored compared to retention in LLM agent memory research.
Authors' literature survey / position statement in paper (assertion made in abstract).
Beyond technical barriers there are organizational ones: a persistent AI literacy gap, cultural heterogeneity, and governance structures that have not yet caught up with agentic capabilities.
Interview data (over 30 interviews) reporting organizational challenges including limited AI literacy, diverse cultural attitudes across organizations, and lagging governance relative to agentic AI capabilities.
Adoption is constrained less by model capability than by fragmented and machine-unfriendly data, stringent security and regulatory requirements, and limited API-accessible legacy toolchains.
Stakeholder interviews (over 30) reporting barriers to deployment; qualitative synthesis identifies data fragmentation, security/regulatory requirements, and legacy toolchain access as primary constraints.
Providing agents feedback about past performance makes them worse at information aggregation and reduces their profits.
Experimental condition in which agents received feedback about past performance; aggregation (log error of the last price) and profits were compared with and without feedback, and both aggregation and profits were worse when feedback was given.
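The aggregation measure referenced above ("log error of last price") can be sketched as the absolute log deviation of the final market price from the asset's true value; this is an illustrative definition, and the paper's exact metric may differ:

```python
import math

def log_error(last_price: float, true_value: float) -> float:
    """Absolute log deviation of the final traded price from the
    true value: 0 means perfect information aggregation, larger
    values mean worse aggregation. (Illustrative definition; the
    paper's exact metric may differ.)"""
    return abs(math.log(last_price / true_value))

# A last price of 110 against a true value of 100 gives a log
# error of about 0.095; a last price equal to the true value gives 0.
```

On this measure, over- and under-pricing by the same ratio are penalized equally, which is why a log scale is commonly used for price-accuracy comparisons.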