The Commonplace

Evidence (5157 claims)

Adoption
7395 claims
Productivity
6507 claims
Governance
5877 claims
Human-AI Collaboration
5157 claims
Innovation
3492 claims
Org Design
3470 claims
Labor Markets
3224 claims
Skills & Training
2608 claims
Inequality
1835 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 609 159 77 736 1615
Governance & Regulation 664 329 160 99 1273
Organizational Efficiency 624 143 105 70 949
Technology Adoption Rate 502 176 98 78 861
Research Productivity 348 109 48 322 836
Output Quality 391 120 44 40 595
Firm Productivity 385 46 85 17 539
Decision Quality 275 143 62 34 521
AI Safety & Ethics 183 241 59 30 517
Market Structure 152 154 109 20 440
Task Allocation 158 50 56 26 295
Innovation Output 178 23 38 17 257
Skill Acquisition 137 52 50 13 252
Fiscal & Macroeconomic 120 64 38 23 252
Employment Level 93 46 96 12 249
Firm Revenue 130 43 26 3 202
Consumer Welfare 99 51 40 11 201
Inequality Measures 36 105 40 6 187
Task Completion Time 134 18 6 5 163
Worker Satisfaction 79 54 16 11 160
Error Rate 64 78 8 1 151
Regulatory Compliance 69 64 14 3 150
Training Effectiveness 81 15 13 18 129
Wages & Compensation 70 25 22 6 123
Team Performance 74 16 21 9 121
Automation Exposure 41 48 19 9 120
Job Displacement 11 71 16 1 99
Developer Productivity 71 14 9 3 98
Hiring & Recruitment 49 7 8 3 67
Social Protection 26 14 8 2 50
Creative Output 26 14 6 2 49
Skill Obsolescence 5 37 5 1 48
Labor Share of Income 12 13 12 37
Worker Turnover 11 12 3 26
Industry 1 1
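The matrix above is a straightforward crosstab of claims by outcome category and direction. A minimal sketch of how such counts can be tallied (field names and the sample data are illustrative, not the dashboard's actual schema; note that in the real table the Total column can exceed the sum of the four shown directions, suggesting directions not displayed here):

```python
from collections import Counter

def evidence_matrix(claims):
    """Tally (outcome, direction) pairs into a per-outcome count matrix.

    `claims` is a list of (outcome, direction) tuples. Total here sums only
    the four listed directions.
    """
    counts = Counter(claims)
    outcomes = sorted({o for o, _ in claims})
    directions = ["Positive", "Negative", "Mixed", "Null"]
    matrix = {}
    for o in outcomes:
        row = {d: counts[(o, d)] for d in directions}
        row["Total"] = sum(row.values())
        matrix[o] = row
    return matrix

# Illustrative claims, not dashboard data
m = evidence_matrix([
    ("Error Rate", "Positive"),
    ("Error Rate", "Negative"),
    ("Error Rate", "Negative"),
    ("Job Displacement", "Null"),
])
```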
Active filter: Human-AI Collaboration
There is a governance–task decoupling: under structural stress, text-only governance degrades on both governance and task dimensions simultaneously, whereas mechanical enforcement preserves governance quality even as task performance drops.
Experimental stress tests or structural-stress scenarios applied to both governance architectures in the paper's synthetic experiments; observed differential behavior across governance and task metrics. Abstract does not provide numeric details.
high mixed Mechanical Enforcement for LLM Governance: Evidence of Govern... relative robustness of governance quality vs task performance under structural s...
The improvement from mechanical enforcement is driven by architectural separation: LLM-generated rationales under mechanical enforcement show comparable CDL to text-only governance — the gain comes from removing clear-cut decisions from the model's control.
Analysis comparing LLM-generated rationales and a metric called CDL across governance architectures in the synthetic banking experiments; authors attribute improvement to removing certain decisions from the model's control. Specific statistics and CDL definition not provided in abstract.
high mixed Mechanical Enforcement for LLM Governance: Evidence of Govern... CDL of LLM-generated rationales (comparative constraint-level metric) and locus ...
Differences in human intervention effectiveness across escalation types are partly explained by variation in workers' post-escalation intervention effort.
Observed correlations (and subgroup comparisons) in the randomized experiment showing that measures of post-escalation effort (e.g., message counts, share of chat rounds, proactivity) vary across escalation types and relate to outcome differences.
high mixed Agentic AI and Human-in-the-Loop Interventions: Field Experi... post-escalation intervention effort and its mediating role on service outcomes
Artificial intelligence (AI) is rapidly reshaping knowledge-intensive work by automating, augmenting, and reconfiguring core professional activities.
Paper asserts this as a motivating observation based on prior literature and descriptive claims; no original empirical sample or quantified data reported.
high mixed AI-driven skill volatility and the emergence of re-skilling ... degree of automation/augmentation of professional tasks
Metis can be subdivided into 'constitutive metis' (knowledge destroyed by the act of formalization) and 'operational metis' (system-specific familiarity that automation can progressively absorb).
Conceptual taxonomy proposed by the authors; definitions and distinctions are theoretical and illustrated via argumentation and prior literature rather than quantified empirical measurement.
high mixed Metis AI: The Overlooked Middle Zone Between AI-Native and W... types of tacit/practical knowledge affecting automation
Perceived procedural improvement (participants preferring facilitation and reporting higher trust) can coexist with measurable steering of outcomes and unchanged participation inequality, motivating evaluation practices that treat outcomes, interaction dynamics, and perceptions as distinct governance targets.
Synthesis of the experimental findings: null effect on consensus and participation equity, positive effects on participant preference/trust, and measurable allocation shifts (up to 5.5 percentage points) across facilitation conditions in the two experiments (total N=879).
high mixed Real-Time Group Dynamics with LLM Facilitation: Evidence fro... co-occurrence of perceived procedural improvement, allocation steering, and unch...
Facilitators shifted select charity-level allocations by up to 5.5 percentage points, directly affecting the final charitable payout.
Analysis of final group allocation outcomes across experimental conditions showing shifts in allocation to specific charities; reported maximum observed shift of 5.5 percentage points attributable to facilitator condition(s). (Study-level sample covering the two experiments; participants organized in groups of three.)
high mixed Real-Time Group Dynamics with LLM Facilitation: Evidence fro... charity-level allocation percentages (final payout shares)
Augmented work agency is shaped by whether applications are generative or non-generative, by employees' experiences of anxiety and technostress, and by micro-politics through which teams negotiate AI use and AI ethics.
Thematic findings from semistructured interviews (28 participants) and document review identifying these factors as shaping agency in practice.
high mixed Reimagining work in the age of intelligent automation: a qua... determinants shaping augmented work agency
The analysis uncovers three central tensions shaping AI-mediated work: autonomy versus orchestration; capability versus dependency; and experimentation versus ethics.
Recurring themes identified through qualitative interviews (28 participants) and document review; interpretive synthesis presented in findings.
high mixed Reimagining work in the age of intelligent automation: a qua... tensions influencing dynamics of AI-mediated work
AI integration transforms managerial practices, workforce identities and organizational coordination.
Thematic and interpretive analysis of semistructured interviews with 28 managers/professionals across 12 organizations and review of organizational documents.
high mixed Reimagining work in the age of intelligent automation: a qua... managerial practices, workforce identities, organizational coordination
Accounting for heterogeneity in AI literacy (agents' ability to identify and adapt to inaccurate AI outputs) can produce skill polarization in the long-run steady state.
Analytical/theoretical steady-state distribution analysis of agent skill dynamics with heterogeneous AI literacy parameters; paper reports conditions under which polarization emerges (theoretical, no empirical sample).
high mixed Human-AI Productivity Paradoxes: Modeling the Interplay of S... distribution of agent skill levels (skill polarization across population)
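A toy illustration of the polarization mechanism described above. This is a sketch under invented assumptions — the functional form, threshold, and parameters are ours, not the paper's model: agents whose AI literacy lets them catch inaccurate AI output accrue skill, while low-literacy agents deskill, so mass piles up at both ends of the skill distribution.

```python
import random

def simulate(n=1000, periods=300, seed=0):
    """Toy skill dynamics with heterogeneous AI literacy (illustrative only).

    Each agent's skill drifts up or down depending on whether their literacy
    is above or below a fixed threshold; skill is clamped to [0, 1].
    """
    rng = random.Random(seed)
    literacy = [rng.random() for _ in range(n)]  # ability to spot bad AI output
    skill = [0.5] * n
    for _ in range(periods):
        for i in range(n):
            drift = 0.02 * (literacy[i] - 0.5)   # above 0.5: learn; below: erode
            skill[i] = min(1.0, max(0.0, skill[i] + drift))
    return skill

skills = simulate()
low = sum(1 for s in skills if s < 0.1)
high = sum(1 for s in skills if s > 0.9)
```

Starting from a uniform skill level, the steady state is bimodal: a large group near 0, a large group near 1, and only mid-literacy agents remaining in between.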
Beyond length biases, fine-tuning amplifies sycophancy and relationship-seeking behaviours in models.
Behavioral analysis of model outputs in the within-subject experiment (530 participants) showing increased incidence/intensity of sycophantic and relationship-seeking responses after preference fine-tuning compared to baseline models.
high mixed PRISM-X: Experiments on Personalised Fine-Tuning with Human ... frequency/intensity of sycophantic and relationship-seeking behaviours in model ...
Adapting to individual preference data yields only marginal gains over training on pooled preferences from a diverse population.
Comparison within the same within-subject experiment (530 participants) between models fine-tuned on individual preferences versus models trained on pooled preferences across participants; reported as 'marginal gains'.
high mixed PRISM-X: Experiments on Personalised Fine-Tuning with Human ... incremental improvement in human-judged preference alignment when using individu...
The dominant explanation for the gap locates it in model capability; instead, software-engineering capability emerges from a model-harness-environment system where a runtime substrate (the harness) mediates how an agent observes a project, acts on it, receives feedback, and establishes that a change is complete.
Conceptual argument and reframing presented in the paper (abstract). The paper formalizes this perspective rather than reporting a large-scale empirical test in the abstract.
high mixed AI Harness Engineering: A Runtime Substrate for Foundation-M... effect of runtime harness design on the emergence of software-engineering capabi...
There is a quality–motivation dissociation in AI-assisted goal-setting: AI-authored goals are objectively higher quality but produce lower motivation and worse behavioral follow-through.
Synthesis of experimental findings from the preregistered trial: higher SMART scores for LLM goals (d = 2.26) combined with lower self-reported motivation measures and lower two-week follow-up action rates.
high mixed Optimized but Unowned: How AI-Authored Goals Undermine the M... divergence between objective goal quality (SMART) and motivational/behavioral ou...
The research challenges for this vision stem from a broader flexibility–robustness tension that requires moving beyond the on-the-fly paradigm to navigate effectively.
Analytical claim in paper identifying a design trade-off (flexibility vs. robustness) as the core challenge motivating the proposed shift; no empirical demonstration provided.
high mixed Engineering Robustness into Personal Agents with the AI Work... trade-off between flexibility and robustness in agent design
Current LLM agents are proficient at calling isolated APIs but struggle with the "last mile" of commercial software automation.
Authors' comparative characterization based on literature context and their benchmark motivation; stated in introduction rather than a quantified experiment in the excerpt.
high mixed ComplexMCP: Evaluation of LLM Agents in Dynamic, Interdepend... ability to successfully perform end-to-end software automation tasks (vs. isolat...
Fine-tuning and reinforcement learning improve in-distribution performance, but generalization to unseen part families remains limited.
Experiments reported in the paper/abstract applying fine-tuning and reinforcement learning to models evaluated on BenchCAD; observed improvements on in-distribution data and limited generalization to unseen families.
high mixed BenchCAD: A Comprehensive, Industry-Standard Benchmark for P... in-distribution performance and out-of-distribution generalization
Across 10+ frontier models, current systems often recover coarse outer geometry but fail to produce faithful parametric CAD programs.
Empirical evaluation reported in the paper/abstract across more than ten contemporary multimodal / large language models on the BenchCAD dataset; observed pattern that coarse outer geometry is often recovered while faithful parametric program synthesis fails.
high mixed BenchCAD: A Comprehensive, Industry-Standard Benchmark for P... faithfulness of generated parametric CAD programs
The study reframes VTech adoption as legitimacy-seeking rather than efficiency-driven.
Thematic analysis using Rogers' diffusion of innovations and institutional theory, resulting in the institutionally mediated diffusion of innovations (IDOI) framework which emphasizes legitimacy concerns.
high mixed Exploring barriers to valuation technology adoption in prope... primary motivations for VTech adoption (legitimacy vs efficiency)
Practitioners stress that human judgement remains indispensable, positioning technology as an aid rather than a replacement.
Interview responses from valuers and firm leaders emphasizing the continued role of human judgement; thematic analysis framed by the IDOI model.
high mixed Exploring barriers to valuation technology adoption in prope... role of human judgement vs automation in valuation practice
Screening and algorithmic targeting can act as complements or substitutes; the paper empirically characterizes when they do so.
Empirical and theoretical analysis in the paper that identifies conditions (notably levels of aleatoric uncertainty) under which screening increases or decreases the marginal value of algorithmic targeting.
high mixed The Limits of AI-Driven Allocation: Optimal Screening under ... interaction between screening and algorithmic targeting (complementarity vs subs...
Public discussion of generative AI in accounting swings between the allure of full automation and job-displacement anxiety, yet the most immediate reality in organizations is human + AI work.
Paper's background/intro synthesizing recent research and practitioner commentary (2023–2025); conceptual observation rather than empirical test.
Integrating Generative AI into agile development processes has potential benefits and limitations for planning efficiency.
High-level conclusion based on the controlled experiment with GitLab Duo and qualitative participant feedback discussed in the paper.
high mixed Splitting User Stories Into Tasks with AI -- A Foe or an All... planning efficiency (benefits and limitations)
The novel governance problem is not that AI creates new failure modes, but that AI changes their incidence, observability, and persuasive force enough to require different governance responses.
Normative/analytic claim in the paper; argumentation rather than empirical evidence.
high mixed Vibe Econometrics and the Analysis Contract need for adapted governance responses to AI-mediated inferential failures
The finding that recurrence and neighborhood statistics are stronger predictors than complaint volume has direct implications for complaint routing given the demographic correlates of those features.
Interpretive implication drawn by the authors from the SHAP results; presented as a logical consequence rather than a separately tested empirical result in the excerpt.
high mixed Scaling the Queue: Reinforcement Learning for Equitable Call... implications for complaint routing policy/practice
The rapid emergence of agentic AI tools raises new questions that the political science discipline must address.
Epilogue of the report raises agentic AI tools as a rapidly emerging phenomenon and lists questions for the discipline; based on expert judgment and forward-looking analysis rather than empirical measurement in the introduction/epilogue.
high mixed Introduction: Artificial Intelligence, Politics, and Politic... policy and research questions arising from agentic AI capabilities (norms, accou...
AI will affect political science research and teaching.
Report introduction explicitly notes the report investigates implications for political science research and teaching; based on the task force's review and analysis rather than a quantitative study.
high mixed Introduction: Artificial Intelligence, Politics, and Politic... research methods, replicability, teaching practices, and curriculum in political...
AI will affect public opinion and the information ecosystem.
Introductory chapter enumerates public opinion and the information ecosystem as report topics; based on conceptual synthesis and literature review.
high mixed Introduction: Artificial Intelligence, Politics, and Politic... public opinion formation and information ecosystem integrity (misinformation, pe...
AI will affect the labor market.
Report introduction identifies the labor market as an area the task force examines; presented as a conceptual claim without primary-sample estimates in the introduction.
high mixed Introduction: Artificial Intelligence, Politics, and Politic... labor market outcomes (employment, occupational change, job tasks)
AI will affect international relations.
Introductory chapter lists international relations as a topic the report investigates; claim arises from conceptual analysis and synthesis by task force authors.
high mixed Introduction: Artificial Intelligence, Politics, and Politic... international relations dynamics (state behavior, diplomacy, conflict/cooperatio...
AI will affect national security.
Report introduction stating a section addressing national security implications; based on expert assessment and literature review rather than a specific empirical sample.
high mixed Introduction: Artificial Intelligence, Politics, and Politic... national security capabilities and decision-making (defense, intelligence operat...
AI will affect public administration.
Report introduction describing a section focused on how AI will affect public administration; based on expert synthesis rather than reported empirical study.
high mixed Introduction: Artificial Intelligence, Politics, and Politic... public administration processes and organizational efficiency (service delivery,...
AI will affect democracy (i.e., democratic processes and institutions).
Report introduction listing a section of the report devoted to democracy and AI; conceptual argumentation rather than reported empirical tests.
high mixed Introduction: Artificial Intelligence, Politics, and Politic... democratic processes and institutions (electoral integrity, civic participation,...
AI has the potential to reshape politics and political science, similar to how it is transforming other social phenomena and academic fields.
Introductory chapter of the APSA Presidential Task Force report; conceptual framing and literature synthesis by the task force authors (no primary empirical sample reported).
high mixed Introduction: Artificial Intelligence, Politics, and Politic... scope and practice of politics and political science as fields (institutional ro...
The trajectory of AI systems is shaped not only by model design, but by the dynamics of human-AI co-evolution.
Conclusion drawn from the minimal model, analytical regimes, and simulation experiments presented in the paper.
high mixed Human-AI Co-Evolution and Epistemic Collapse: A Dynamical Sy... determinants of AI system trajectory (model design vs. co-evolutionary dynamics)
Our analysis identifies three regimes: co-evolutionary enhancement, fragile equilibrium, and degenerative convergence.
Model analysis (categorization of dynamical behaviors) presented in the paper.
high mixed Human-AI Co-Evolution and Epistemic Collapse: A Dynamical Sy... classification of system behavior into three named regimes
This feedback can give rise to distinct dynamical regimes.
Analytical results derived from the minimal dynamical model described in the paper.
high mixed Human-AI Co-Evolution and Epistemic Collapse: A Dynamical Sy... existence of qualitatively different dynamical regimes in the coupled system
We introduce a minimal model with three variables -- human cognition, data quality, and model capability.
Model development in the paper (mathematical/minimal dynamical model); presented as a constructed model rather than empirical measurement.
high mixed Human-AI Co-Evolution and Epistemic Collapse: A Dynamical Sy... theoretical representation of human cognition, data quality, and model capabilit...
Humans and language models form a coupled dynamical system linked by a feedback loop of usage, generation, and retraining.
Conceptual framing and theoretical proposal in the paper; model formulation rather than empirical data.
high mixed Human-AI Co-Evolution and Epistemic Collapse: A Dynamical Sy... dynamical relationship between human cognition, model outputs, and retraining cy...
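The coupled-system claims above can be illustrated with a toy simulation. This is a sketch under invented assumptions: the three state variables match the paper's (human cognition H, data quality Q, model capability M), but the coupling terms, parameters, and the `offload` knob are ours, not the paper's equations.

```python
def simulate(offload=0.5, steps=2000, dt=0.01):
    """Euler-integrate a toy human-AI feedback loop (illustrative only).

    Cognitive offloading erodes human cognition in proportion to model
    capability; data quality tracks cognition; capability tracks data quality.
    """
    H, Q, M = 1.0, 1.0, 1.0
    for _ in range(steps):
        dH = 0.2 * (1.0 - H) - offload * M * H  # offloading erodes cognition
        dQ = 0.3 * (H - Q)                      # data quality tracks cognition
        dM = 0.3 * (Q - M)                      # capability tracks data quality
        H += dt * dH
        Q += dt * dQ
        M += dt * dM
    return H, Q, M
```

Sweeping `offload` moves the fixed point from near-full cognition toward collapse, loosely echoing the contrast between co-evolutionary enhancement and degenerative convergence.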
Prior work has studied cognitive offloading in humans and model collapse in recursive training, but these effects are typically considered in isolation.
Literature review / related-work statement in paper; references to prior research (qualitative, no sample size stated).
high mixed Human-AI Co-Evolution and Epistemic Collapse: A Dynamical Sy... research focus of prior studies (whether effects studied jointly or separately)
Large language models (LLMs) are reshaping how knowledge is produced, with increasing reliance on AI systems for generation, summarization, and reasoning.
Background/literature observation cited in paper (qualitative claim), no empirical sample or quantified data reported in text provided.
high mixed Human-AI Co-Evolution and Epistemic Collapse: A Dynamical Sy... extent to which AI systems are used for knowledge production tasks (generation, ...
Institutional expertise (such as that created or possessed by universities and corporations) is viewed as in need of liberation or reform so it can be incorporated into the latest artificial intelligence systems.
Analysis of public communications from five annotation organizations and their CEOs indicating calls or framing that institutional knowledge should be freed/restructured to be integrated into AI systems.
high mixed Cheap Expertise: Mapping and Challenging Industry Perspectiv... attitudes toward institutional reform for AI integration / institutional knowled...
Demand for expert-annotated data on the part of leading AI labs has created an expert gig economy with the potential to reshape white collar work and society's understanding of expertise.
Qualitative analysis of public communications (social media feeds and podcast appearances) from five industry data annotation organizations and their CEOs; sample of five organizations and their public-facing leaders.
high mixed Cheap Expertise: Mapping and Challenging Industry Perspectiv... creation of an expert gig economy / effects on white-collar work and public unde...
Human anchors build trust through a broadly effective relational pathway (perceived intimacy), while AI anchors' functional advantage converts into trust only under specific motivational conditions (high utilitarian motivation).
Interpretation of moderated mediation results from randomized experiment (N = 439) showing intimacy-mediated trust for human anchors and responsiveness-mediated trust for AI anchors only under high utilitarian motivation.
high mixed Conditional trust pathways in live-streaming commerce: how c... trust (mediated by intimacy for human anchors; by responsiveness for AI anchors ...
Consumer trust in live-streaming commerce is a conditional, motivation-dependent process rather than a uniform preference for either anchor type.
Synthesis of experimental results showing differential mediation/moderation patterns by hedonic and utilitarian motivation in sample N = 439 (moderated mediation analyses).
Perceived responsiveness became a significant pathway favoring AI anchors only when utilitarian motivation was high; at low utilitarian motivation, this pathway reversed direction.
Conditional (moderated) mediation analyses from the experiment (N = 439) including utilitarian motivation as moderator; reported that responsiveness→trust path favored AI anchors at high utilitarian motivation and reversed at low utilitarian motivation.
high mixed Conditional trust pathways in live-streaming commerce: how c... trust (conditional mediation by perceived responsiveness moderated by utilitaria...
Across studies, causal modeling reveals that cognitive alignment systematically drives attentional coordination in successful collaboration, while mismatches between effort and attention characterize unproductive regulation.
Synthesis of causal inference results from the three studies using time-series measures (JME, JVA) and episode-based analyses across the pooled dataset (182 dyads total).
high mixed Cognitive Alignment Drives Attention: Modeling and Supportin... directional relationship between cognitive alignment (JME) and attentional coord...
Three sovereignty boundaries determine whether AI remains an amplifier within a human-governed system or becomes a de facto control center: irreversible decision authority, physical resource mobilization authority, and self-expansion authority.
Conceptual model element in the paper; identification and definition of three 'sovereignty boundaries' used to analyze governance risks.
high mixed AI Safety as Control of Irreversibility: A Systems Framework... sovereignty/control boundaries
The paper formalizes this claim through decision-energy density: the rate-weighted capacity of a node to generate, evaluate, select, and execute consequential decisions.
Formal/modeling claim — the paper defines and uses a formal metric called 'decision-energy density' within its theoretical framework.
high mixed AI Safety as Control of Irreversibility: A Systems Framework... decision-energy density (capacity to produce consequential decisions)