Evidence (5157 claims)
Adoption: 7395 claims
Productivity: 6507 claims
Governance: 5877 claims
Human-AI Collaboration: 5157 claims
Innovation: 3492 claims
Org Design: 3470 claims
Labor Markets: 3224 claims
Skills & Training: 2608 claims
Inequality: 1835 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 609 | 159 | 77 | 736 | 1615 |
| Governance & Regulation | 664 | 329 | 160 | 99 | 1273 |
| Organizational Efficiency | 624 | 143 | 105 | 70 | 949 |
| Technology Adoption Rate | 502 | 176 | 98 | 78 | 861 |
| Research Productivity | 348 | 109 | 48 | 322 | 836 |
| Output Quality | 391 | 120 | 44 | 40 | 595 |
| Firm Productivity | 385 | 46 | 85 | 17 | 539 |
| Decision Quality | 275 | 143 | 62 | 34 | 521 |
| AI Safety & Ethics | 183 | 241 | 59 | 30 | 517 |
| Market Structure | 152 | 154 | 109 | 20 | 440 |
| Task Allocation | 158 | 50 | 56 | 26 | 295 |
| Innovation Output | 178 | 23 | 38 | 17 | 257 |
| Skill Acquisition | 137 | 52 | 50 | 13 | 252 |
| Fiscal & Macroeconomic | 120 | 64 | 38 | 23 | 252 |
| Employment Level | 93 | 46 | 96 | 12 | 249 |
| Firm Revenue | 130 | 43 | 26 | 3 | 202 |
| Consumer Welfare | 99 | 51 | 40 | 11 | 201 |
| Inequality Measures | 36 | 105 | 40 | 6 | 187 |
| Task Completion Time | 134 | 18 | 6 | 5 | 163 |
| Worker Satisfaction | 79 | 54 | 16 | 11 | 160 |
| Error Rate | 64 | 78 | 8 | 1 | 151 |
| Regulatory Compliance | 69 | 64 | 14 | 3 | 150 |
| Training Effectiveness | 81 | 15 | 13 | 18 | 129 |
| Wages & Compensation | 70 | 25 | 22 | 6 | 123 |
| Team Performance | 74 | 16 | 21 | 9 | 121 |
| Automation Exposure | 41 | 48 | 19 | 9 | 120 |
| Job Displacement | 11 | 71 | 16 | 1 | 99 |
| Developer Productivity | 71 | 14 | 9 | 3 | 98 |
| Hiring & Recruitment | 49 | 7 | 8 | 3 | 67 |
| Social Protection | 26 | 14 | 8 | 2 | 50 |
| Creative Output | 26 | 14 | 6 | 2 | 49 |
| Skill Obsolescence | 5 | 37 | 5 | 1 | 48 |
| Labor Share of Income | 12 | 13 | 12 | 0 | 37 |
| Worker Turnover | 11 | 12 | 0 | 3 | 26 |
| Industry | 0 | 0 | 0 | 1 | 1 |
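As a worked example of reading the matrix, the counts in each row can be turned into direction-of-finding shares. A minimal Python sketch (counts copied from two rows above; note that some row totals in the table exceed the sum of the four direction columns, so shares here are computed over the direction columns only):

```python
# Illustrative: compute direction-of-finding shares from rows of the
# evidence matrix (counts copied from the table above). Shares are
# taken over the four direction columns, not the "Total" column,
# since totals can include claims without a coded direction.
rows = {
    "Firm Productivity": {"positive": 385, "negative": 46, "mixed": 85, "null": 17},
    "Job Displacement": {"positive": 11, "negative": 71, "mixed": 16, "null": 1},
}

def direction_shares(counts: dict) -> dict:
    """Normalize a row of direction counts to shares (rounded to 3 dp)."""
    total = sum(counts.values())
    return {k: round(v / total, 3) for k, v in counts.items()}

for outcome, counts in rows.items():
    print(outcome, direction_shares(counts))
```

For example, Firm Productivity claims skew strongly positive (about 72% of direction-coded claims), while Job Displacement claims skew strongly negative (about 72% negative).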
Human-AI Collaboration
The model shows when these systems become vulnerable to strategic use from within government.
Analytical result derived from the paper's formal theoretical model (no empirical validation reported).
The compliance layer can also create a stable approval boundary that political successors learn to navigate while preserving the appearance of lawful administration.
Stated conclusion/insight from the paper's formal argument and conceptual framing (theoretical, no empirical sample).
Manual tools like mind maps support structure creation but lack intelligent (AI) assistance.
Paper's comparison of manual tools versus AI-augmented tools (background/related-work discussion; no empirical evaluation reported for this claim).
Current LLM-based systems let users query information but do not let users shape how knowledge is organized.
Paper's analysis of existing tools and limitations (literature/feature comparison described in introduction; no new empirical test reported).
Knowledge workers face increasing challenges in synthesizing information from multiple documents into structured conceptual understanding.
Statement in paper's introduction/motivation; conceptual observation (no empirical data reported here).
In the absence of intervention, individually rational adoption of genAI will assuredly and profoundly reduce collective welfare.
Conclusion drawn from the paper's theoretical model (normative/predictive claim based on model dynamics; no empirical validation or sample reported in abstract).
Habit formation around genAI use can couple otherwise separate domains, so that adoption in low-stakes tasks spills over into high-value tasks and amplifies welfare losses.
Theoretical/model-based claim showing coupling across domains via habit formation (model extension; no empirical sample reported in abstract).
The introduction of genAI—while initially beneficial at the individual level—will reduce social welfare for the most important types of tasks.
Model-derived result: theoretical analysis indicates social-welfare reductions in high-value tasks despite individual gains (no empirical sample reported in abstract).
Generative models are vulnerable to model collapse: when trained on data generated by earlier versions of themselves, their outputs can lose diversity and accuracy.
Theoretical claim / conceptual claim presented in the paper (no empirical sample size given in abstract); refers to degradation of model outputs when trained on self-generated data.
Industrial robots are widely used in manufacturing, yet most manipulation still depends on fixed waypoint scripts that are brittle to environmental changes.
Background statement in the paper's introduction; general literature/field observation (no new primary data reported for this claim in the abstract).
Each new task domain requires painstaking, expert-driven harness engineering: designing the prompts, tools, orchestration logic, and evaluation criteria that make a foundation model effective.
Author assertion in the paper's introduction/abstract describing the state of practice; no empirical method, dataset, or sample size reported in the excerpt.
Ungoverned coupling between humans and AI can produce fragility, lock-in, polarization, and domination basins.
Theoretical/modeling analysis showing destabilizing dynamics and multiple basins of attraction when governance regularization is absent or weak; no empirical sample.
Classical robot ethics framed around obedience (e.g. Asimov's laws) is too narrow for contemporary AI systems.
Literature synthesis and conceptual argument drawing on developments in adaptive, generative, embodied, and embedded AI; no empirical sample reported.
Current evaluation proxies are insufficient for predicting downstream human impact.
Empirical results in the paper showing decoupling between standard quantitative proxies (e.g., sparsity, faithfulness) and human outcomes (clarity, decision utility, confidence) across datasets and analyst reviews.
A highlighting policy that is optimal for sophisticated agents can perform arbitrarily poorly when deployed to naive agents.
Constructive worst-case examples and theoretical bounds in the paper demonstrating arbitrarily large performance degradation when applying sophisticated-optimal policies to naive agents.
Optimizing highlighting for sophisticated agents can be computationally intractable, even in simple discrete and binary settings.
Theoretical complexity results and proofs in the paper showing hardness of the optimization problem under the sophisticated-agent model; no sample/calibration required (formal/algorithmic analysis).
Ethical concerns—such as transparency, explainability, psychological effects, and responsible AI governance—are critical factors influencing employability outcomes.
Review synthesis highlighting ethical issues from empirical and industry literature as influential on employability outcomes.
There are significant AI adoption challenges in education and industry that affect employability and role transformation.
Synthesized evidence from industry reports and empirical studies discussed in the review highlighting barriers to adoption in education and industry.
From the perspectives of 'personal subordination' and 'economic subordination', AIGC deeply and implicitly controls the labor process through mechanisms such as dynamic path planning, blurring the traditional boundaries used to determine labor relations.
Analytical/legal argument in the paper linking conceptual standards of subordination to specific algorithmic mechanisms (e.g., dynamic path planning); supported by mechanistic discussion but no reported empirical measurement or sample.
AIGC constantly challenges traditional standards for determining labor relations.
Paper's analytic claim based on conceptual/legal argument that algorithmic features of AIGC complicate application of existing labor-relation tests; no quantitative validation or sample size provided.
The transformation toward algorithmic enterprises raises critical concerns regarding agency, accountability, data monopolization, and algorithmic bias.
Presented as a principal concern in the paper's conceptual discussion and interdisciplinary critique; based on analysis of governance and ethical literature rather than new empirical evidence in the abstract.
Algorithmic management and monitoring have reduced employees’ autonomy and perceived work meaningfulness, contributing to 'AI anxiety' characterised by concerns about job loss, skill obsolescence, and diminished control.
Qualitative studies, survey evidence, and theoretical literature reviewed that document impacts of algorithmic management on autonomy, meaningfulness, and worker anxiety (mixed-methods literature).
Automation has intensified income inequality between high-skilled and low-skilled workers.
Synthesis of empirical literature linking automation adoption to widening wage and income gaps across skill groups (literature review).
Displacement effects have extended from manufacturing into cognitive roles such as clerical work and customer service.
Review of empirical studies documenting automation/substitution effects in cognitive, clerical, and customer-service roles (literature synthesis).
Automation has put downward pressure on wages.
Cited empirical studies and wage analyses in the reviewed literature indicating wage suppression associated with automation adoption (literature review).
AI and robotics have led to contractions in low-skilled occupations.
Synthesis of empirical literature reporting occupational contractions in low-skilled jobs following automation adoption (literature review).
Extensive empirical evidence shows that AI and robotics can substitute for rule-based, codifiable routine tasks.
Review cites extensive empirical studies demonstrating substitution of rule-based, codifiable routine tasks by AI/robotics (literature synthesis).
Artificial intelligence and robotic technologies are fundamentally reshaping labour markets and pose multifaceted challenges to workers engaged in routine and low-skilled tasks.
Narrative review of domestic and international scholarly literature over the past decade (literature review / synthesis).
Structural barriers, workforce biases, and digital skill gaps affect women’s participation in AI-enabled sectors.
Claim derived from the paper's synthesis of literature (peer-reviewed studies, policy analyses, preprints) identifying common barriers; the abstract does not report quantitative meta-analysis or specific sample sizes.
Vibe coding (unstructured GenAI-driven coding) promises rapid prototyping but often suffers from architectural drift, limited traceability, and reduced maintainability.
Paper asserts this as a motivating observation and characterizes vibe coding's weaknesses; the abstract frames these as commonly observed problems motivating the Shift-Up approach (no sample size given in abstract).
In post-AGI economies the presupposition of agent autonomy becomes nontrivial because artificial systems may exhibit varying degrees of autonomy, functioning as tools, delegates, strategic market actors, manipulators of choice environments, or possible welfare subjects.
Theoretical argumentation and conceptual classification in the paper; no empirical data reported (modeling/motivating discussion).
Scalable AI tutoring for procedural skill learning requires structured knowledge representations, yet constructing these representations remains a labor-intensive bottleneck.
Background/claim made in the paper's introduction framing the problem; no specific quantitative evidence reported in the abstract.
Under-represented groups tend to be systematically under-observed because of historical exclusion and selective feedback, which exacerbates uncertainty for those groups.
Conceptual claim supported by illustrative examples (e.g., lending context) and simulations demonstrating selective feedback effects; literature citation likely included in paper.
Policies that ignore the unobserved (counterfactual) space can harm decision makers (via unrealized gains or losses) and subjects (via compounding exclusion and reduced access).
Theoretical argumentation and illustrative examples (e.g., loan denial counterfactuals) and modelled simulations showing downstream harms when ignoring unobserved outcomes.
Experiments on simulated data with varying bias show that unequal uncertainty and selective feedback produce disparities across groups.
Simulation experiments described in the paper manipulate bias and feedback patterns and report resulting group disparities (synthetic datasets; experiment details in methods/results sections).
The study is grounded in Job Demands-Resources (JD-R) theory, positing that HAI-C task complexity is a job demand and that AI self-efficacy and humble leadership act as resources that can mitigate negative effects on engagement.
Introduction states JD-R theory as the theoretical basis and describes job demands (HAI-C task complexity) and job/personal resources (humble leadership, AI self-efficacy) in the hypothesized model.
HAI-C tech-learning anxiety reduces employees' work engagement (serves as the mediator between HAI-C task complexity and work engagement).
Mediation analysis via hierarchical regression and bootstrapping on the three-wave survey sample of 497 employees; reported in Results as the mediating mechanism.
Human-AI collaboration task complexity (HAI-C task complexity) negatively affects employees' work engagement by amplifying their HAI-C tech-learning anxiety.
Three-wave longitudinal survey of matched data from 497 employees; mediation analysis using hierarchical regression and bootstrapping reported in the Results section.
LLMs are not only less accurate on ideologically contested economic questions, but systematically less reliable in one ideological direction than the other, underscoring the need for direction-aware evaluation in high-stakes economic and policy settings.
Synthesis of empirical findings: lower accuracy on contested items, higher accuracy for intervention-aligned cases in 18/20 models, and error skew toward intervention-oriented predictions; policy recommendation follows from these empirical patterns.
This directional skew is not eliminated by one-shot in-context prompting.
Intervention of one-shot in-context prompting applied to models; evaluation shows the intervention-oriented error skew persists despite one-shot prompting.
Ideology-contested items are consistently harder than non-contested ones.
Comparison of model performance (accuracy) on contested subset (1,056 items) versus non-contested items in the 10,490-triplet benchmark; reported consistent lower accuracy on contested items.
Important boundary conditions include data maturity, process integration, governance discipline, and the degree of functional trust between finance and operating units.
List of boundary conditions reported in the paper based on documentary case analysis and synthesis with literature.
GenAI does not improve management accounting decision quality primarily by replacing managerial judgment.
Interpretive finding based on documentary analysis of disclosures from the three case firms and relevant literature; presented as a summary conclusion in the paper.
The stakes are particularly high in spreadsheet environments, where process and artifact are inseparable: each decision the agent makes is recorded directly in cells that belong to and reflect on the user.
Conceptual / domain-specific argument made by the authors (no empirical sample attached to the claim).
AI agents can perform sophisticated, multi-step knowledge work autonomously from start to finish, yet this process remains effectively inaccessible during execution: by the time users receive the output, all underlying decisions have already been made without their involvement.
Author assertion / conceptual description in the paper (no empirical quantification provided for this general statement).
Advances in AI agent capabilities have outpaced users' ability to meaningfully oversee their execution.
Author assertion / literature-level observation presented in the paper (no empirical sample reported for this claim).
Selective forgetting remains underexplored compared to retention in LLM agent memory research.
Authors' literature survey / position statement in paper (assertion made in abstract).
Beyond technical barriers there are organizational ones: a persistent AI literacy gap, cultural heterogeneity, and governance structures that have not yet caught up with agentic capabilities.
Interview data (over 30 interviews) reporting organizational challenges including limited AI literacy, diverse cultural attitudes across organizations, and lagging governance relative to agentic AI capabilities.
Adoption is constrained less by model capability than by fragmented and machine-unfriendly data, stringent security and regulatory requirements, and limited API-accessible legacy toolchains.
Stakeholder interviews (over 30) reporting barriers to deployment; qualitative synthesis identifies data fragmentation, security/regulatory requirements, and legacy toolchain access as primary constraints.
Providing agents feedback about past performance makes them worse at information aggregation and reduces their profits.
Experimental condition in which agents received feedback about past performance; aggregation (log error of the last price) and profits were compared with and without feedback, and both aggregation and profits were worse when feedback was given.
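The aggregation measure referenced above ("log error of last price") can be sketched as the absolute log deviation of the final market price from the asset's true value; this is an illustrative definition, and the paper's exact metric may differ:

```python
import math

def log_error(last_price: float, true_value: float) -> float:
    """Absolute log deviation of the final traded price from the
    true value: 0 means perfect information aggregation, larger
    values mean worse aggregation. (Illustrative definition; the
    paper's exact metric may differ.)"""
    return abs(math.log(last_price / true_value))

# A last price of 110 against a true value of 100 gives a log
# error of about 0.095; a last price equal to the true value gives 0.
```

On this measure, over- and under-pricing by the same ratio are penalized equally, which is why a log scale is commonly used for price-accuracy comparisons.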