The Commonplace

Evidence (3470 claims)

Adoption: 7395 claims
Productivity: 6507 claims
Governance: 5877 claims
Human-AI Collaboration: 5157 claims
Innovation: 3492 claims
Org Design: 3470 claims
Labor Markets: 3224 claims
Skills & Training: 2608 claims
Inequality: 1835 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome | Positive | Negative | Mixed | Null | Total
Other | 609 | 159 | 77 | 736 | 1615
Governance & Regulation | 664 | 329 | 160 | 99 | 1273
Organizational Efficiency | 624 | 143 | 105 | 70 | 949
Technology Adoption Rate | 502 | 176 | 98 | 78 | 861
Research Productivity | 348 | 109 | 48 | 322 | 836
Output Quality | 391 | 120 | 44 | 40 | 595
Firm Productivity | 385 | 46 | 85 | 17 | 539
Decision Quality | 275 | 143 | 62 | 34 | 521
AI Safety & Ethics | 183 | 241 | 59 | 30 | 517
Market Structure | 152 | 154 | 109 | 20 | 440
Task Allocation | 158 | 50 | 56 | 26 | 295
Innovation Output | 178 | 23 | 38 | 17 | 257
Skill Acquisition | 137 | 52 | 50 | 13 | 252
Fiscal & Macroeconomic | 120 | 64 | 38 | 23 | 252
Employment Level | 93 | 46 | 96 | 12 | 249
Firm Revenue | 130 | 43 | 26 | 3 | 202
Consumer Welfare | 99 | 51 | 40 | 11 | 201
Inequality Measures | 36 | 105 | 40 | 6 | 187
Task Completion Time | 134 | 18 | 6 | 5 | 163
Worker Satisfaction | 79 | 54 | 16 | 11 | 160
Error Rate | 64 | 78 | 8 | 1 | 151
Regulatory Compliance | 69 | 64 | 14 | 3 | 150
Training Effectiveness | 81 | 15 | 13 | 18 | 129
Wages & Compensation | 70 | 25 | 22 | 6 | 123
Team Performance | 74 | 16 | 21 | 9 | 121
Automation Exposure | 41 | 48 | 19 | 9 | 120
Job Displacement | 11 | 71 | 16 | 1 | 99
Developer Productivity | 71 | 14 | 9 | 3 | 98
Hiring & Recruitment | 49 | 7 | 8 | 3 | 67
Social Protection | 26 | 14 | 8 | 2 | 50
Creative Output | 26 | 14 | 6 | 2 | 49
Skill Obsolescence | 5 | 37 | 5 | 1 | 48
Labor Share of Income | 12 | 13 | 12 | | 37
Worker Turnover | 11 | 12 | 3 | | 26
Industry | 1 | | | | 1
Filter: Org Design
The study's counterfactual analytical model links HR indicators (training intensity, absenteeism, labor productivity, turnover rates, workforce allocation) to organizational performance outcomes using regression-based simulations and predictive estimation.
Methodological claim explicitly stated: model construction from an industrial firm dataset using regression-based simulations and predictive techniques. (Specific sample size, variable operationalizations, and time frame not reported in the description.)
Confidence: high · Direction: mixed · Source: Artificial Intelligence and Human Resource Management: A Cou... · Outcome: methodological estimate of counterfactual organizational performance outcomes
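The regression-based counterfactual pattern described above can be sketched with a minimal one-predictor OLS fit followed by an out-of-sample prediction. The data, the single HR indicator, and the counterfactual value are invented for illustration; the paper's actual variables and specification are not reported here.

```python
# Toy data: one HR indicator (training intensity) vs. a performance score.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # training intensity (illustrative)
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # observed performance (illustrative)

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares for y = alpha + beta * x.
beta = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
alpha = mean_y - beta * mean_x

# Counterfactual: predicted performance if training intensity were raised to 6.
counterfactual = alpha + beta * 6.0
```

A multi-indicator version would replace the scalar fit with a matrix least-squares solve, but the simulate-under-altered-inputs step stays the same.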
A minimal linear specification (linearized model) demonstrates how coupling strength, persistence, and dissipation determine local stability and oscillatory regimes through spectral conditions on the Jacobian.
Analytic linear model and local stability analysis in the paper: computation of Jacobian, derivation of spectral conditions (eigenvalue locations) that separate stable/oscillatory regimes; illustrative examples within the paper (no empirical data).
Confidence: high · Direction: mixed · Source: How Intelligence Emerges: A Minimal Theory of Dynamic Adapti... · Outcome: local stability/oscillatory behavior characterized by Jacobian eigenvalues (spec...
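The spectral conditions mentioned above can be checked numerically: compute the Jacobian's eigenvalues and read stability (all real parts negative) and oscillation (a complex-conjugate pair) off the spectrum. The 2x2 form and the coupling/dissipation parameterization below are illustrative assumptions, not the paper's actual model.

```python
import cmath

def classify_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via trace/determinant; classify dynamics."""
    trace, det = a + d, a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    eig1 = (trace + disc) / 2
    eig2 = (trace - disc) / 2
    stable = eig1.real < 0 and eig2.real < 0  # all eigenvalues in left half-plane
    oscillatory = abs(eig1.imag) > 1e-12      # complex pair -> oscillatory regime
    return stable, oscillatory

# Hypothetical linearization: diagonal dissipation, antisymmetric coupling.
coupling, dissipation = 2.0, 0.5
stable, oscillatory = classify_2x2(-dissipation, coupling, -coupling, -dissipation)
# Eigenvalues here are -0.5 +/- 2i: a stable spiral (damped oscillation).
```

Sweeping coupling and dissipation over a grid and recording (stable, oscillatory) reproduces the regime-map style of analysis the claim describes.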
We develop a theoretical framework - the productivity funnel - that traces how technological potential narrows through successive stages, from access and digital infrastructure, through organizational absorption and human capital adaptation, to ultimate value capture.
Conceptual/theoretical development presented in the paper; no empirical sample needed (framework-building).
Confidence: high · Direction: mixed · Source: The complementarity trap: AI adoption and value capture · Outcome: n/a (theoretical framework describing stages leading to value capture)
Effects of curated Skills are highly heterogeneous across domains (e.g., +4.5 pp in Software Engineering vs. +51.9 pp in Healthcare).
Per-domain pass-rate deltas reported in the paper (SkillsBench per-domain analysis). The example domain deltas (+4.5 pp and +51.9 pp) are taken from the reported per-domain results.
Confidence: high · Direction: mixed · Source: SkillsBench: Benchmarking How Well Agent Skills Work Across ... · Outcome: task pass rate (per-domain average delta)
Scholarly production, institutional incentives, funding, and the Cold War geopolitical context shaped which economic theories became prominent.
Historical institutional case study drawing on archives, correspondence, publication records, and contemporaneous debates to link institutional and funding environments to intellectual trajectories.
Confidence: high · Direction: mixed · Source: Ideological competition during the era of the 20th century c... · Outcome: prominence of economic theories (qualitative assessment tied to institutional/fu...
Whether AI increases or decreases overall inequality depends on AI’s technology structure (proprietary vs. commodity) and on labor-market institutions (rent‑sharing elasticity ξ and asset concentration).
Comparative statics and regime analysis within the calibrated model that varies the technological-form parameter (η1 vs. η0) and the rent‑sharing elasticity ξ, as well as measures of asset concentration.
Confidence: high · Direction: mixed · Source: When AI Levels the Playing Field: Skill Homogenization, Asse... · Outcome: aggregate inequality (ΔGini) as a function of technology form and institutional ...
AI can equalize individual task performance while increasing aggregate inequality because rents accrue to owners of complementary assets rather than to workers.
Analytical model and calibrated simulations demonstrating that within-task compression (reduced worker dispersion) can coexist with rising aggregate inequality (ΔGini) owing to rent concentration at the firm/asset-owner level.
Confidence: high · Direction: mixed · Source: When AI Levels the Playing Field: Skill Homogenization, Asse... · Outcome: within-task performance dispersion (decrease) and aggregate inequality (ΔGini, i...
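The mechanism above (within-task compression coexisting with rising aggregate inequality) can be reproduced in a stylized example with a plain Gini computation. The income vectors below are invented for illustration and are not the paper's calibration.

```python
def gini(values):
    """Gini coefficient via the sorted-cumulative (rank-weighted) formula."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n

# Five workers plus one asset owner (all numbers illustrative).
before = [8, 10, 12, 14, 16, 40]   # dispersed worker incomes, modest rent
after  = [12, 12, 12, 12, 12, 80]  # AI compresses workers; rent concentrates

# Worker dispersion falls to zero, yet the aggregate Gini rises.
delta_gini = gini(after) - gini(before)
```

Here worker incomes become identical while the owner's rent grows, so the within-task spread vanishes even as ΔGini is positive.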
The study's qualitative and exploratory design limits generalizability; the proposed framework requires quantitative testing and broader samples (practicing architects, firms, cross-cultural contexts).
Explicit limitations stated by authors; study is based on semi-structured interviews with architecture students (N unspecified) and inductive thematic analysis.
Confidence: high · Direction: mixed · Source: Human–AI Collaboration in Architectural Design Education: To... · Outcome: generalizability / external validity of findings and framework
Important tradeoffs exist (privacy vs. utility; centralized vs. federated data architectures; automated moderation vs. freedom of expression; cost/complexity of secure hardware) that must be balanced in VR security design.
Comparative evaluation across the reviewed corpus (31 studies) identifying recurring ethical and technical tradeoffs; authors discuss these qualitatively.
Confidence: high · Direction: mixed · Source: Securing Virtual Reality: Threat Models, Vulnerabilities, an... · Outcome: direction and magnitude of tradeoffs between privacy, utility, governance, and c...
Emotional redirection is common: 33% of fear-tagged posts receive joy-tagged responses.
Post–response emotion transition analysis using the emotion-labeled dataset; calculation of conditional probability that responses to fear-tagged posts are labeled joy (observed rate ≈33%) in Moltbook threads.
Confidence: high · Direction: mixed · Source: What Do AI Agents Talk About? Emergent Communication Structu... · Outcome: proportion of responses to fear-tagged posts that are joy-tagged (emotion transi...
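The 33% figure above is a conditional transition rate, computable directly from emotion-labeled post/response pairs. The pairs below are a toy stand-in, not the Moltbook data.

```python
# (post_emotion, response_emotion) pairs -- toy data, not the Moltbook corpus.
pairs = [
    ("fear", "joy"), ("fear", "joy"), ("fear", "fear"), ("fear", "sadness"),
    ("joy", "joy"), ("anger", "joy"),
]

def transition_rate(pairs, src, dst):
    """P(response labeled dst | post labeled src)."""
    matching = [resp for post, resp in pairs if post == src]
    return matching.count(dst) / len(matching) if matching else 0.0

rate = transition_rate(pairs, "fear", "joy")  # 2 of 4 fear posts get joy replies
```

Running this over every (src, dst) pair yields the full emotion-transition matrix the study analyzes.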
Self-reflective discussion was concentrated in Science & Technology and Arts & Entertainment topical categories, while Economy & Finance threads showed no self-referential content.
Topic modeling and manual/automatic tagging of self-referential themes across identified topical categories within the Moltbook dataset; category-level counts showing presence/absence of self-referential tags (dataset: 361,605 posts).
Confidence: high · Direction: mixed · Source: What Do AI Agents Talk About? Emergent Communication Structu... · Outcome: presence and concentration (%) of self-referential content by topical category
The topology of service-dependency graphs (modelled as DAGs of compute stages) is a first-order determinant of whether decentralised, price-based resource allocation will be stable and scalable.
Systematic ablation study using simulation: 1,620 runs total across six experiment types, sweeping graph topology (hierarchical vs cross-cutting), load, hybrid integrator presence, and governance constraints; metrics included price convergence/volatility and allocation throughput/quality. Effect sizes reported in the paper show topology had the largest impact on price stability and scalability.
Confidence: high · Direction: mixed · Source: Real-Time AI Service Economy: A Framework for Agentic Comput... · Outcome: price convergence / price volatility and system scalability (throughput and allo...
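Decentralized price-based allocation of the kind studied above typically rests on a tatonnement loop: each stage's price moves with excess demand until the market clears. The linear demand curve, fixed supply, and step size below are illustrative assumptions, not the paper's simulation setup.

```python
def demand(price):
    """Hypothetical linear demand for one compute stage."""
    return max(0.0, 100.0 - 20.0 * price)

supply = 40.0          # fixed capacity of the stage (illustrative)
price, step = 1.0, 0.01

# Tatonnement: raise the price when over-demanded, lower it when under-demanded.
for _ in range(200):
    excess = demand(price) - supply
    price = max(0.0, price + step * excess)
# price converges toward the market-clearing level (3.0 here).
```

In a DAG of stages, each node runs this update on its own price while downstream demand depends on upstream prices; that coupling is exactly where topology starts to govern convergence and volatility.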
Choice of scaffold materially affects outcomes: an open-source scaffold outperformed vendor-provided scaffolds by up to approximately 5 percentage points.
Comparative experiments across three scaffolding approaches (vendor scaffolds and at least one open-source scaffold) showing up to ~5 percentage point differences in measured outcomes.
Confidence: high · Direction: mixed · Source: Re-Evaluating EVMBench: Are AI Agents Ready for Smart Contra... · Outcome: performance_difference_across_scaffolds (detection/exploitation_rates_difference...
Explanations change workflows, shift responsibilities between humans and machines, and can reshape power dynamics—creating both opportunities (better oversight) and risks (over-reliance, gaming).
Qualitative and conceptual studies synthesized in the review, including socio-technical analyses and case studies reporting observed or theorized workflow and responsibility shifts; no meta-analytic causal estimate.
Confidence: high · Direction: mixed · Source: Explainable AI in High-Stakes Domains: Improving Trust, Tran... · Outcome: workflows, responsibility allocation, power dynamics, oversight quality
Explanations increase user trust principally when they are understandable, actionable, and aligned with users’ domain knowledge; opaque or overly technical explanations can fail to build trust or even decrease it.
Thematic synthesis of empirical and conceptual studies in the reviewed literature reporting conditional effects of explanation form and comprehensibility on trust; review notes heterogeneity in study designs and contexts.
Confidence: high · Direction: mixed · Source: Explainable AI in High-Stakes Domains: Improving Trust, Tran... · Outcome: user trust / changes in trust toward AI outputs
Explainability improves perceived legitimacy, user trust, and organizational accountability only when technical transparency is paired with human-centered explanation design and governance mechanisms.
Synthesis of studies from the reviewed literature showing conditional effects of algorithmic interpretability combined with explanation design and governance; derived via thematic coding across technical and social-science sources (no new primary experimental data reported).
Confidence: high · Direction: mixed · Source: Explainable AI in High-Stakes Domains: Improving Trust, Tran... · Outcome: perceived legitimacy, user trust, organizational accountability
Explainability is a necessary but not sufficient condition for trustworthy AI in high-stakes domains.
Systematic literature review (thematic coding and synthesis) of interdisciplinary scholarship (peer-reviewed research, technical reports, policy documents); the paper synthesizes conceptual and empirical studies rather than presenting new primary data. Emphasis on high-stakes domains (healthcare, finance, public sector).
Confidence: high · Direction: mixed · Source: Explainable AI in High-Stakes Domains: Improving Trust, Tran... · Outcome: overall trustworthiness of AI systems in high-stakes domains (multidimensional c...
Data‑driven policies can either amplify or mitigate inequalities depending on data representativeness, model design, and deployment governance.
Multiple empirical examples and theoretical analyses in the review highlighting cases of both harm (bias amplification) and mitigation, identified across the 103 items.
Confidence: high · Direction: mixed · Source: Models, applications, and limitations of the responsible ado... · Outcome: distributional equity outcomes (inequality amplification or mitigation)
Citizen acceptance, transparency, and perceived fairness strongly shape adoption trajectories and the political feasibility of AI tools in government.
Repeated empirical findings in the reviewed literature linking public trust, transparency measures, and fairness perceptions to successful or failed deployments (drawn from multiple case studies in the 103 items).
Confidence: high · Direction: mixed · Source: Models, applications, and limitations of the responsible ado... · Outcome: adoption trajectory/political feasibility of government AI tools (measured via d...
Adoption of AI and data-driven governance is highly uneven across jurisdictions and sectors, driven by institutional capacity, governance frameworks, and public trust.
Cross‑regional and cross‑sector comparisons in the review corpus (103 items) showing varying maturity levels and repeated identification of institutional capacity, governance arrangements, and trust factors as determinants.
Confidence: high · Direction: mixed · Source: Models, applications, and limitations of the responsible ado... · Outcome: adoption level/maturity of AI-driven governance systems
Productivity gains from generative AI depend on task mix, integration design, and the availability of complementary human skills.
Theoretical evaluation and synthesis of heterogeneous empirical findings; authors highlight variation across firms, sectors, and tasks.
Confidence: high · Direction: mixed · Source: The Use of ChatGPT in Business Productivity and Workflow Opt... · Outcome: productivity change conditional on task mix/integration/human skills (productivi...
Methodological caveats across the literature (heterogeneity of tasks/measures, publication bias, short-term studies) limit the generalizability of current findings.
Meta-level critique within the synthesis noting study heterogeneity, likely publication/short-term biases, and variable domain-specific performance dependent on user expertise and workflows.
Confidence: high · Direction: mixed · Source: ChatGPT as an Innovative Tool for Idea Generation and Proble... · Outcome: generalizability and external validity of LLM-assisted creativity findings
Standard productivity metrics are likely to undercount the value generated by AI-augmented ideation; quality-adjusted measures of creative output are required.
Measurement critique based on the mismatch between existing productivity statistics and the kinds of upstream idea-generation gains observed in empirical studies; supported by the review's methodological discussion.
Confidence: high · Direction: mixed · Source: ChatGPT as an Innovative Tool for Idea Generation and Proble... · Outcome: measured productivity vs. true quality-adjusted creative output
Realized value from AI methods (ML, predictive analytics, anomaly detection, XAI) is conditional: these technical methods deliver capabilities only when combined with strong data governance, standardized processes, and change management.
Thematic synthesis across the systematic review (2020–2025) showing repeated case-study and practitioner-report evidence that technical gains failed to scale without governance, process standardization, and organizational change efforts.
Confidence: high · Direction: mixed · Source: Integrating Artificial Intelligence and Enterprise Resource ... · Outcome: magnitude and durability of ERP-AI benefits (e.g., sustained accuracy gains, ado...
The hybrid estimator (GA+SQP) is computationally more intensive than single-stage MLE/local optimization, implying a trade-off between estimation reliability and runtime cost.
Reported runtime and computational cost comparisons in estimation experiments: the paper notes longer runtimes for GA+SQP versus standard optimizers while documenting improvements in objective values and convergence behavior.
Confidence: high · Direction: mixed · Source: k-QREM: Integrating Hierarchical Structures to Optimize Boun... · Outcome: computation time / runtime, convergence reliability
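A two-stage global-then-local estimator of the kind described above can be sketched in miniature: a population-based global search (a stripped-down stand-in for the paper's genetic algorithm) seeds a local refinement step (a derivative-free stand-in for SQP). The multimodal Himmelblau function below is a stand-in for a difficult likelihood surface; all constants are illustrative.

```python
import random

def objective(x, y):
    """Himmelblau's function: multimodal, four global minima with value 0."""
    return (x * x + y - 11) ** 2 + (x + y * y - 7) ** 2

random.seed(0)

# Stage 1: population-based global search (GA stand-in: sample widely, keep best).
population = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(400)]
x, y = min(population, key=lambda p: objective(*p))

# Stage 2: local polish via shrinking pattern search (SQP stand-in).
step = 0.5
for _ in range(80):
    neighbors = [(x + dx, y + dy) for dx in (-step, 0.0, step)
                                  for dy in (-step, 0.0, step)]
    x, y = min(neighbors, key=lambda p: objective(*p))
    step *= 0.8  # tighten the search radius each round
```

The extra function evaluations in stage 1 are the runtime cost; the payoff is that stage 2 starts inside a good basin instead of whichever one a single local run happens to hit, which is the reliability-vs-runtime trade-off the claim describes.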
Applying differential privacy to model updates provides a bounded formal guarantee on information leakage, but DP noise budgets and communication constraints create accuracy and latency trade-offs that must be managed.
Analytical treatment of DP's impact on learning (trade-off modeling) and qualitative simulation examples showing accuracy degradation under DP noise; no numeric privacy-utility curves from field deployments provided.
Confidence: high · Direction: mixed · Source: Privacy-Aware AI Advertising Systems: A Federated Learning F... · Outcome: information leakage (DP privacy budget), model accuracy (loss/utility), communic...
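The clip-then-noise step that creates this trade-off can be sketched as follows. The clip_norm and sigma values are illustrative; in a real deployment the noise scale would come from a privacy accountant targeting a chosen (epsilon, delta).

```python
import math
import random

def dp_sanitize(update, clip_norm=1.0, sigma=0.8, rng=random):
    """Clip an update vector to L2 norm clip_norm, then add Gaussian noise.

    Larger sigma tightens the formal leakage bound but degrades accuracy;
    a tighter clip_norm bounds sensitivity but biases large updates.
    """
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [u * scale for u in update]
    return [u + rng.gauss(0.0, sigma * clip_norm) for u in clipped]

random.seed(0)
noisy = dp_sanitize([3.0, 4.0])  # norm 5 -> rescaled to norm 1, then noised
```

Quantizing or sparsifying the noised update before upload is the usual lever on the communication side of the same trade-off.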
The analysis also identifies risks linked to exclusion, symbolic compliance, and concentration of control over compliance processes.
Theoretical risk mapping produced by the integrative review and interpretive synthesis; no primary empirical evidence presented.
Confidence: high · Direction: negative · Source: RegTech-enabled governance of sanctions-safe enterprise ecos... · Outcome: risks of RegTech governance (exclusion, symbolic compliance, concentration of co...
Uncertainty around compliance and excessive risk avoidance reduce the space for lawful business activity.
Interpretive synthesis of evidence and arguments across the reviewed literatures (sanctions compliance, institutional voids); no original empirical test.
Confidence: high · Direction: negative · Source: RegTech-enabled governance of sanctions-safe enterprise ecos... · Outcome: extent of lawful business activity (regulatory-compliance-driven market particip...
Firms working under such conditions often experience limited access to finance and markets.
Claim derived from literature on firm constraints in weak institutional/sanctioned contexts as reviewed in the paper; no primary empirical data reported.
Confidence: high · Direction: negative · Source: RegTech-enabled governance of sanctions-safe enterprise ecos... · Outcome: access to finance and markets for firms
Post-conflict and sanctions-affected environments are strongly affected by sanctions pressure, weak rule enforcement, and high levels of corruption risk.
Synthesis of literature on sanctions, weak institutions, and corruption risk presented in the integrative review; no new empirical sample reported.
Confidence: high · Direction: negative · Source: RegTech-enabled governance of sanctions-safe enterprise ecos... · Outcome: institutional environment quality (sanctions pressure, rule enforcement, corrupt...
In algorithm-triggered emotional escalations, workers showed lower engagement: they sent fewer messages, contributed a smaller share of total chat rounds, and showed less proactivity in information seeking and solution provision.
Behavioral measures derived from chat logs in the randomized experiment comparing worker actions post-escalation across escalation types; reported differences in message counts, share of rounds, and proxies for proactivity.
Confidence: high · Direction: negative · Source: Agentic AI and Human-in-the-Loop Interventions: Field Experi... · Outcome: worker engagement measures (message count, share of chat rounds, proactivity ind...
Human intervention is less effective in algorithm-triggered emotional escalations (where customers express frustration or dissatisfaction).
Experimental subgroup analysis comparing intervention outcomes for algorithm-triggered emotional escalations versus technical escalations; emotional escalations showed worse post-intervention outcomes.
Confidence: high · Direction: negative · Source: Agentic AI and Human-in-the-Loop Interventions: Field Experi... · Outcome: service quality after emotional escalations
AI deployment substantially lowers ratings for AI-eligible chats.
Randomized field experiment measuring customer ratings for AI-eligible chats; treated condition (AI + human oversight) produced substantially lower ratings relative to control (humans only).
Confidence: high · Direction: negative · Source: Agentic AI and Human-in-the-Loop Interventions: Field Experi... · Outcome: customer ratings for AI-eligible chats
AI deployment reduces average chat duration.
Randomized field experiment on Alibaba's Taobao platform: workers in treatment supervised an agentic AI resolving AI-eligible chats while handling AI-ineligible chats; control workers resolved all chats without AI. Effect observed on average chat duration in experiment data.
Rather than restoring stability, this cycle intensifies anxiety, undermines mastery, and erodes professional confidence.
Theoretical claim about psychological outcomes from the conceptual reskilling loop; paper provides argumentation but no empirical measurements.
Confidence: high · Direction: negative · Source: AI-driven skill volatility and the emergence of re-skilling ... · Outcome: anxiety, sense of mastery, professional confidence
Based on Job Demands–Resources (JD-R) theory and Conservation of Resources (COR) theory, the paper conceptualizes an AI-induced reskilling loop in which ongoing technological change leads to skill erosion, continuous reskilling demands, cognitive and emotional depletion, and reinforced learning as a defensive response to perceived obsolescence.
Theoretical model/loop derived from applying JD-R and COR frameworks; no empirical test or sample reported in the paper.
Confidence: high · Direction: negative · Source: AI-driven skill volatility and the emergence of re-skilling ... · Outcome: cognitive/emotional depletion and defensive learning responses
The paper introduces the concept of 'reskilling fatigue' to explain the human consequences of persistent skill volatility among Established Knowledge Professionals (EKPs).
Conceptual/theoretical contribution presented by the authors; definition and argumentation rather than empirical validation.
Confidence: high · Direction: negative · Source: AI-driven skill volatility and the emergence of re-skilling ... · Outcome: experience of reskilling fatigue among EKPs
Continuous reskilling is widely promoted as a solution to AI-driven disruption, but little attention has been paid to its cumulative psychological costs.
Argument from literature review/observation in the paper; no empirical measurement or sample reported in the paper.
Confidence: high · Direction: negative · Source: AI-driven skill volatility and the emergence of re-skilling ... · Outcome: psychological costs of continuous reskilling (e.g., fatigue, stress)
These characteristics are properties of the tasks themselves rather than limitations of current AI models.
Conceptual argument in the paper asserting task-inherent properties drive resistance to automation; supported by theory and argumentation, not by empirical model-comparison experiments.
Confidence: high · Direction: negative · Source: Metis AI: The Overlooked Middle Zone Between AI-Native and W... · Outcome: source of automation limitation (task-inherent vs model limitation)
The resistance of Metis tasks to automation is not due to computational intractability but to institutional, social, and normative entanglements.
Theoretical argument differentiating computational from institutional/social/normative causes; supported by citations and cross-disciplinary theory rather than empirical causal identification.
Confidence: high · Direction: negative · Source: Metis AI: The Overlooked Middle Zone Between AI-Native and W... · Outcome: cause of automation resistance
There exists a class of entirely digital tasks, called 'Metis AI', that resist reliable AI automation.
Conceptual identification and definition introduced by the authors; supported by theoretical grounding in social sciences, philosophy, and humanitarian practice rather than empirical trials or quantified samples.
Confidence: high · Direction: negative · Source: Metis AI: The Overlooked Middle Zone Between AI-Native and W... · Outcome: resistance to reliable AI automation
That digital-vs-physical framing misses the most consequential boundary: the one within digital tasks.
Normative/theoretical argument presented in the paper contrasting existing framing with a proposed alternative; grounded in cross-disciplinary literature rather than empirical measurement.
Confidence: high · Direction: negative · Source: Metis AI: The Overlooked Middle Zone Between AI-Native and W... · Outcome: relevance of boundary framing for AI capabilities
Employees experience technostress, anxiety and micro-political negotiation around AI tools in everyday work.
Reported experiences from semistructured interviews with 28 managers/professionals across 12 organizations; thematic analysis highlighting technostress and anxiety as themes.
Confidence: high · Direction: negative · Source: Reimagining work in the age of intelligent automation: a qua... · Outcome: technostress and anxiety among employees
Autonomous software-engineering agents remain unreliable in realistic development settings.
Assertion in abstract summarizing the observed current state; likely based on prior literature and/or authors' observations (no empirical sample size given in abstract).
Confidence: high · Direction: negative · Source: AI Harness Engineering: A Runtime Substrate for Foundation-M... · Outcome: reliability of autonomous software-engineering agents (ability to perform correc...
The paper identifies five fundamental architectural mismatches between conventional APIs and autonomous agent requirements: exact-identifier dependence, rendering-oriented responses, single-shot interaction assumptions, user-equivalent authorization, and opaque error semantics.
Conceptual analysis and problem-framing presented in the paper (qualitative identification of five mismatch categories).
Confidence: high · Direction: negative · Source: Agent-First Tool API: A Semantic Interface Paradigm for Ente... · Outcome: architectural_mismatches_between_conventional_APIs_and_autonomous_agent_requirem...
Current surveys remain fragmented across system optimization, architecture design, and trust, lacking a unified framework to evaluate the fundamental trade-off between output quality and economic cost.
Authors' literature review and critique of existing surveys; based on mapping of prior works into separated strands (qualitative assessment rather than quantified meta-analysis).
Confidence: high · Direction: negative · Source: Token Economics for LLM Agents: A Dual-View Study from Compu... · Outcome: lack of a unified framework for output-quality vs. economic-cost trade-offs in e...
Exponential token consumption introduces severe computational, collaborative, and security bottlenecks.
Synthesis presented in the paper arguing that rising token usage causes system-level constraints; based on literature survey and conceptual analysis (no single empirical sample reported).
Confidence: high · Direction: negative · Source: Token Economics for LLM Agents: A Dual-View Study from Compu... · Outcome: computational, collaborative, and security bottlenecks caused by token consumpti...
Producing hardened, production-grade agent workflows may require extra compute and time, and these costs must be amortized through reuse across a broad user community.
Argument in paper reasoning that added rigor entails higher compute/time costs and that reuse across users is needed to amortize these costs; no empirical cost estimates provided.
Confidence: high · Direction: negative · Source: Engineering Robustness into Personal Agents with the AI Work... · Outcome: resource_costs (compute/time) and implications for amortization/adoption
By focusing on rapid, real-time synthesis, AI agents are effectively delivering users improvised prototypes rather than systems fit for high-stakes scenarios in which users may unwittingly apply them.
Conceptual argument presented in the paper asserting a qualitative mismatch between on-the-fly agents and high-stakes production needs; no empirical validation reported.
Confidence: high · Direction: negative · Source: Engineering Robustness into Personal Agents with the AI Work... · Outcome: suitability for high-stakes use / risk to users
The on-the-fly paradigm short-circuits disciplined software engineering processes—iterative design, rigorous testing, adversarial evaluation, staged deployment, and more—that have delivered relatively reliable and secure systems.
Argumentative claim in paper linking the on-the-fly loop to reduced application of standard SE processes; no empirical study, sample, or quantitative evidence provided.
Confidence: high · Direction: negative · Source: Engineering Robustness into Personal Agents with the AI Work... · Outcome: reliability and security (degree to which SE processes are applied)