The Commonplace

Evidence (6491 claims)

Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
Filter: Human-AI Collaboration
In the short run, with fixed human capital, wages, and job boundaries, AI raises productivity by reducing the time required to perform steps.
Model distinction between short-run (fixed job design and skills) and long-run horizons; short-run optimization shows AI reduces expected execution times for steps, thereby raising productivity.
high positive Chaining Tasks, Redefining Work: A Theory of AI Automation time required to complete production steps (task completion time)
Aggregating heterogeneous firms that deploy a commonly available AI technology yields an aggregate production function that admits a constant elasticity of substitution (CES) representation with three inputs: aggregate manual labor, aggregate AI-assisted labor, and aggregate capital.
Theoretical aggregation argument drawing on Houthakker (1955) and Levhari (1968), deriving a macro-level CES representation from a microfounded algorithmic cost function defined by firms' joint optimization over AI deployment and job design.
high positive Chaining Tasks, Redefining Work: A Theory of AI Automation form of the aggregate production function (CES representation and separability o...
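For orientation, a generic three-input CES production function can be written as below. This is the textbook single-nest form, not necessarily the representation derived in the paper: the symbols $Y$, $A$, $L_M$ (aggregate manual labor), $L_A$ (aggregate AI-assisted labor), $K$ (aggregate capital), the share weights $\alpha_i$, and the elasticity $\sigma$ are all assumptions introduced here, and the paper's version may nest or separate the inputs differently.

```latex
Y = A \left[ \alpha_M L_M^{\frac{\sigma - 1}{\sigma}}
           + \alpha_A L_A^{\frac{\sigma - 1}{\sigma}}
           + \alpha_K K^{\frac{\sigma - 1}{\sigma}} \right]^{\frac{\sigma}{\sigma - 1}},
\qquad \alpha_M + \alpha_A + \alpha_K = 1, \quad \sigma > 0, \ \sigma \neq 1 .
```

In this form, $\sigma$ is the common elasticity of substitution between any pair of inputs; as $\sigma \to 1$ the expression approaches a Cobb-Douglas function.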
Improvements in AI quality generate non-linear effects on labor demand and wages because firms' cost-minimizing AI deployment and job designs change discretely at particular AI quality thresholds (microfoundation for the productivity J-curve).
Theoretical analysis of discrete switches in the cost-minimizing arrangement as AI success probability and execution times change; characterization of threshold effects and discussion linking to the J-curve phenomenon (model results and comparative statics).
high positive Chaining Tasks, Redefining Work: A Theory of AI Automation labor demand and wages response to AI quality improvements (non-linear threshold...
Adjacency to AI-executed steps increases the likelihood that a given step is executed by AI (local complementarities): a step is more likely to be AI-executed in occupations where its neighboring steps are also AI-executed.
Empirical comparisons of conceptually similar steps across occupations paired with workflow adjacency information and realized AI execution outcomes from Anthropic’s Economic Index; statistical tests reported in the paper.
high positive Chaining Tasks, Redefining Work: A Theory of AI Automation probability (or likelihood) that a step is AI-executed conditional on neighborin...
AI-executed steps co-occur in contiguous chains rather than being randomly scattered across a production workflow.
Empirical analysis linking O*NET tasks to human assessments of AI exposure (Eloundou et al., 2024), realized AI execution outcomes from Anthropic’s Economic Index (Handa et al., 2025), and GPT-generated workflow orderings for occupations; statistical tests comparing observed contiguity to random/scaled baselines reported in the paper.
high positive Chaining Tasks, Redefining Work: A Theory of AI Automation contiguity of AI-executed steps in occupation workflows
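One way to test whether AI-executed steps cluster into chains is a permutation test on a single workflow. The sketch below is a generic illustration, not the paper's procedure: the excerpt does not specify its test statistic or baselines, and the function names and the boolean encoding of steps (True = AI-executed) are assumptions made here.

```python
import random

def adjacent_ai_pairs(steps):
    """Count neighboring step pairs in which both steps are AI-executed."""
    return sum(a and b for a, b in zip(steps, steps[1:]))

def contiguity_p_value(steps, n_shuffles=10_000, seed=0):
    """Share of random orderings at least as contiguous as the observed one.

    `steps` is an ordered list of booleans (hypothetical encoding:
    True means the step is AI-executed). Shuffling keeps the number of
    AI-executed steps fixed and randomizes only their positions.
    """
    rng = random.Random(seed)
    observed = adjacent_ai_pairs(steps)
    shuffled = list(steps)
    at_least_as_contiguous = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if adjacent_ai_pairs(shuffled) >= observed:
            at_least_as_contiguous += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (at_least_as_contiguous + 1) / (n_shuffles + 1)
```

A small p-value indicates that the observed workflow has more adjacent AI-executed pairs than random placement would typically produce. A full analysis would pool many occupations rather than test one workflow at a time.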
Platforms should implement AIGC-sensitive distribution algorithms and precise governance frameworks to ensure the long-term health of online content platforms.
Policy/recommendation derived from the paper's empirical findings on consumption preferences, producer behaviors, and the moderating role of distribution algorithms.
high positive Scale over Preference: The Impact of AI-Generated Content on... long-term platform health (qualitative recommendation target)
AIGC creators achieve aggregate engagement comparable to HGC creators by producing content at high volume (a 'scale-over-preference' dynamic).
Analysis of creation and engagement patterns in the dataset showing that AIGC creators compensate for lower per-item engagement by higher production volume, yielding comparable aggregate engagement levels to HGC creators.
high positive Scale over Preference: The Impact of AI-Generated Content on... aggregate engagement per creator (total engagement across produced items)
Consumers show a marked preference for Human-Generated Content (HGC) over Artificial Intelligence-Generated Content (AIGC).
Comparative analysis of consumption behavior in the longitudinal dataset; the paper reports consumption metrics that indicate higher consumer preference for HGC versus AIGC (e.g., relative engagement per item).
high positive Scale over Preference: The Impact of AI-Generated Content on... consumer preference (relative engagement per content type)
AI facilitates access to distant knowledge domains.
Theoretical model (Schumpeterian quality-ladder recombinant-innovation framework). The paper models R&D as recombining ideas across a knowledge space and shows analytically that AI increases firms' ability to combine ideas across longer distances.
high positive Bridging Distant Ideas: the Impact of AI on R&D and Recombin... access to distant knowledge domains (distance of recombinations)
A statistical recalibration technique called conformal prediction can correct this overconfidence, expanding the intervals to achieve the intended coverage.
Application of conformal prediction to the LLM interval outputs in the experiment, resulting in expanded intervals that attain the target coverage.
high positive Bayesian Elicitation with LLMs: Model Size Helps, Extra "Rea... coverage of recalibrated credible intervals (post-conformal prediction)
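The recalibration idea can be illustrated with split conformal prediction applied to intervals. This is a minimal sketch under stated assumptions, not the paper's code: it assumes a held-out calibration set of elicited intervals with known true values, and the function names are hypothetical.

```python
import math

def conformal_margin(lowers, uppers, truths, target_coverage=0.9):
    """Margin to add to both ends of each interval to reach the target coverage.

    The nonconformity score is the distance by which the true value falls
    outside the interval (negative when the truth lies inside it).
    """
    scores = sorted(
        max(lo - y, y - hi) for lo, hi, y in zip(lowers, uppers, truths)
    )
    n = len(scores)
    # Finite-sample-corrected rank used in split conformal prediction.
    rank = math.ceil((n + 1) * target_coverage)
    if rank > n:
        # Too few calibration points to guarantee the requested coverage.
        return math.inf
    return scores[rank - 1]

def recalibrate(lower, upper, margin):
    """Widen (or, for a negative margin, narrow) an interval symmetrically."""
    return lower - margin, upper + margin
```

When the elicited intervals are overconfident, the margin comes out positive and every new interval is widened by that amount. The coverage guarantee relies on the calibration items and the new items being exchangeable, which is an assumption about the data rather than something the code can check.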
Larger, more capable models produce more accurate estimates.
Empirical experiment asking eleven LLMs to estimate population statistics (health prevalence rates, personality trait distributions, labor market figures) and comparing accuracy across models of different capability.
high positive Bayesian Elicitation with LLMs: Model Size Helps, Extra "Rea... accuracy of population-statistic estimates
The paper proposes five architectural requirements for genuine human oversight systems.
Stated methodological/prescriptive contribution of the paper (a proposal rather than an empirical finding); no sample size or empirical validation reported in the provided excerpt.
high positive Beyond Symbolic Control: Societal Consequences of AI-Driven ... design requirements for systems enabling genuine human oversight
The proposed framework outlines a pathway toward large-scale cooperative intelligence and offers a constructive perspective on the coevolution of human and artificial agents in the informational ecosystems of the future.
Claim about the paper's contribution; based on conceptual synthesis and theoretical framing rather than empirical validation.
high positive A Case for Coevolution emergence of large-scale cooperative intelligence
A voluntary ecosystem of free rational agents, human and artificial, who cooperate through transparent and fair exchange of information maximizes their adaptive capacity and long-term well-being.
Normative proposition in the paper derived from theoretical principles (information theory, collective intelligence); presented as a proposed ideal rather than an empirically tested policy.
high positive A Case for Coevolution adaptive capacity and long-term well-being of participating agents
Emerging opportunities exist for stabilizing these ecosystems through new forms of informational verification and monitoring made possible by advanced artificial agents.
Forward-looking claim grounded in conceptual analysis of capabilities of advanced agents; proposed as an opportunity in the paper rather than demonstrated empirically.
high positive A Case for Coevolution stability of informational ecosystems via verification and monitoring tools
Systems that preserve diversity of exploration while minimizing barriers to information exchange exhibit superior capacity for discovery and adaptation in complex environments.
Theoretical claim supported by the paper's appeal to principles from information theory, adaptive systems, and collective intelligence; presented as an argument rather than as empirically validated result.
high positive A Case for Coevolution capacity for discovery and adaptation
Increasing the strictness of algorithmic control paradoxically increases the evolutionary fitness of coordinated resistance (e.g., coordinated log-offs).
Results from the EGT model and simulations showing fitness/payoff changes for coordinated resistance strategies as platform surveillance strictness parameter increases; model-only (no empirical N reported).
high positive THE RED QUEEN in the DASHBOARD: CO-EVOLUTIONARY DYNAMICS of ... evolutionary fitness (payoff) of coordinated resistance strategies
The future of transformative transformer-based AI is fundamentally many, not one.
Concluding synthesis and normative prediction based on the paper's theoretical arguments and literature synthesis; no empirical data or quantified projection provided in the excerpt.
high positive The Future of AI is Many, Not One architectural and organizational form of future transformative AI (multi-agent/d...
Developing diverse AI teams addresses critics' concerns that current models are constrained by past data and lack the creative insight required for innovation.
Argumentative claim drawing on conceptual critique of current models and the proposed remedy of diverse AI teams; supported by referenced disciplinary literatures but no empirical validation provided in the excerpt.
high positive The Future of AI is Many, Not One creative insight and capacity for innovation in AI systems
Having a diverse team broadens the search for solutions, delays premature consensus, and allows for the pursuit of unconventional approaches.
Theoretical/argumentative claim referencing literature in complex systems and organizational behavior as support; no quantitative evidence or sample reported in the excerpt.
high positive The Future of AI is Many, Not One search breadth, timing of consensus formation, and pursuit of unconventional sol...
Deep intellectual breakthroughs should be expected to come from epistemically diverse groups of AI agents working together rather than singular superintelligent agents.
Predictive/theoretical claim motivated by referenced research and formal results in complex systems, organizational behavior, and philosophy of science; no empirical experiment or sample size given in the excerpt.
high positive The Future of AI is Many, Not One occurrence of deep intellectual breakthroughs (scientific/innovative discoveries...
We should abandon the individual approach if we're hoping for AI to support groundbreaking innovation and scientific discovery.
Normative prescription based on theoretical argument and synthesis of literature from complex systems, organizational behavior, and philosophy of science; no empirical trial or quantified evaluation reported in the excerpt.
high positive The Future of AI is Many, Not One ability of AI to support groundbreaking innovation and scientific discovery
With further development, this approach may exceed traditional methods regarding risk accuracy and help drive innovation in the insurance industry.
Forward-looking claim by the authors extrapolating from current prototype results and potential improvements; no empirical evidence provided that it already exceeds traditional methods.
high positive AI in Insurance: Adaptive Questionnaires for Improved Risk P... risk assessment accuracy and industry innovation
ARQuest shows great potential to improve user satisfaction and streamline insurance processes.
Interpretation based on experimental findings (fewer questions, user preference) and the proposed framework; forward-looking claim rather than a fully established empirical result.
high positive AI in Insurance: Adaptive Questionnaires for Improved Risk P... user satisfaction and process streamlining
Adaptive versions were preferred by users for their more fluid and engaging experience.
User preference reported from the experiments (qualitative/user feedback or preference metric); specific measures and sample size not provided in excerpt.
high positive AI in Insurance: Adaptive Questionnaires for Improved Risk P... user preference / perceived fluidity and engagement
Adaptive versions powered by GPT models required fewer questions.
Experimental result reported in paper comparing question counts between adaptive GPT-powered questionnaires and traditional questionnaires; no numeric counts or sample sizes provided in the excerpt.
high positive AI in Insurance: Adaptive Questionnaires for Improved Risk P... number of questions required (survey length / task completion effort)
Techniques such as social media image analysis, geographic data categorization, and Retrieval Augmented Generation (RAG) are used to extract meaningful user insights and guide targeted follow-up questions.
Described methods/techniques used within the ARQuest system implementation in the paper.
high positive AI in Insurance: Adaptive Questionnaires for Improved Risk P... ability to extract user insights and guide follow-up questions
The ARQuest framework introduces a new approach to underwriting by using Large Language Models (LLMs) and alternative data sources to create personalized and adaptive questionnaires.
Methodological contribution described in the paper (framework design); description of components and intended function rather than a quantified outcome.
high positive AI in Insurance: Adaptive Questionnaires for Improved Risk P... personalization and adaptiveness of questionnaires
Only interventions that reshape risk allocation can plausibly shift stable system-level behaviour.
Argument based on the paper's game-theoretic reasoning and stylised example (theoretical claim; no empirical testing reported in the abstract).
high positive Incentives, Equilibria, and the Limits of Healthcare AI: A G... ability of interventions to shift stable system-level behaviour
Artificial intelligence (AI) is widely promoted as a promising technological response to healthcare capacity and productivity pressures.
Author assertion in the paper's introduction/abstract, based on literature/policy discourse (no empirical sample or quantitative analysis reported in the abstract).
high positive Incentives, Equilibria, and the Limits of Healthcare AI: A G... promotion of AI as a solution to healthcare capacity and productivity pressures
We open-source the complete benchmark, including scenario specifications, ground truth templates, tool implementations, and evaluation scripts.
Paper statement committing to open-sourcing the benchmark components and artifacts.
high positive PHMForge: A Scenario-Driven Agentic Benchmark for Industrial... availability of open-source benchmark artifacts
We evaluated leading agent frameworks (ReAct, Cursor Agent, Claude Code) paired with frontier LLMs (Claude Sonnet 4.0, GPT-4o, Granite-3.0-8B).
Paper reports extensive evaluations using the listed agent frameworks and LLM models paired together to run the benchmark scenarios.
high positive PHMForge: A Scenario-Driven Agentic Benchmark for Industrial... evaluation coverage across agent frameworks and LLMs
Execution-based evaluators were implemented with task-commensurate metrics: MAE/RMSE for regression, F1-score for classification, and categorical matching for health assessments.
Paper statement describing the evaluation methodology and the specific metrics used for regression, classification and health-assessment tasks.
high positive PHMForge: A Scenario-Driven Agentic Benchmark for Industrial... metricized evaluation of model outputs (MAE/RMSE, F1, categorical matching)
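The metrics named above can be sketched in a few lines of plain Python. This is an illustrative version, not the benchmark's evaluation scripts: the function names are hypothetical, F1 is shown for the binary case only (the benchmark may average across classes), and categorical matching is assumed here to mean exact label agreement.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error, e.g. for remaining-useful-life regression."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large misses more than MAE does."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )

def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def categorical_match(expected, predicted):
    """Fraction of health assessments whose label matches exactly."""
    return sum(e == p for e, p in zip(expected, predicted)) / len(expected)
```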
We construct 65 specialized tools across two MCP servers to enable interactions for the benchmark.
Paper statement reporting the number of specialized tools (65) and that they are deployed across two MCP servers as part of the benchmark implementation.
high positive PHMForge: A Scenario-Driven Agentic Benchmark for Industrial... number of specialized tools and server deployment
The benchmark encompasses 75 expert-curated scenarios spanning 7 industrial asset classes (including turbofan engines, bearings, electric motors, gearboxes, and aero-engines) across 5 core task categories: Remaining Useful Life (RUL) Prediction, Fault Classification, Engine Health Analysis, Cost-Benefit Analysis, and Safety/Policy Evaluation.
Explicit statement in paper listing the number of scenarios (75), number of asset classes (7) and enumerating the 5 task categories; benchmark construction described by authors.
high positive PHMForge: A Scenario-Driven Agentic Benchmark for Industrial... count and coverage of benchmark scenarios, asset classes, and task categories
PHMForge is the first comprehensive benchmark specifically designed to evaluate LLM agents on Prognostics and Health Management (PHM) tasks through realistic interactions with domain-specific MCP servers.
Paper statement introducing PHMForge as a benchmark and describing its construction to evaluate LLM agents via MCP servers; benchmark implementation is presented in the manuscript.
high positive PHMForge: A Scenario-Driven Agentic Benchmark for Industrial... availability of a domain-specific benchmark for LLM agents
Design implication: adaptive AI coaching systems should align support intensity with individual readiness, rather than assuming universal effectiveness.
Authors' design recommendation derived from experimental results showing heterogeneous effects by personality profile.
high positive Not My Truce: Personality Differences in AI-Mediated Workpla... appropriateness of intervention intensity (design recommendation)
The system is in production, serving 21 industry verticals with 650+ agents.
Deployment claim reported in paper (production system metrics: number of verticals and agents).
high positive Ontology-Constrained Neural Reasoning in Enterprise Agentic ... production deployment scale (industry verticals served, agent count)
We propose a framework for output-side ontological validation (response validation, reasoning verification, compliance checking).
Proposed framework described in paper (conceptual/procedural proposal; not described as empirically validated in abstract).
high positive Ontology-Constrained Neural Reasoning in Enterprise Agentic ... output-side ontological validation capability
We introduce ontology-constrained tool discovery via SQL-pushdown scoring.
Methodological/implementation contribution described in the paper (technical mechanism introduced).
high positive Ontology-Constrained Neural Reasoning in Enterprise Agentic ... tool discovery constrained by ontology using SQL-pushdown scoring
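The general idea of pushing tool scoring down into SQL can be sketched with Python's built-in sqlite3 module. Everything specific here is an assumption for illustration: the excerpt does not describe the paper's schema or scoring formula, so the tables (`tools`, `tool_roles`, `tool_keywords`), the keyword-overlap score, and the function names are hypothetical.

```python
import sqlite3

def build_demo_catalog():
    """Create a tiny in-memory tool catalog (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tools (name TEXT PRIMARY KEY);
        CREATE TABLE tool_roles (tool TEXT, role TEXT);
        CREATE TABLE tool_keywords (tool TEXT, term TEXT);
    """)
    conn.executemany("INSERT INTO tools VALUES (?)",
                     [("invoice_lookup",), ("tax_report",), ("hr_leave",)])
    conn.executemany("INSERT INTO tool_roles VALUES (?, ?)",
                     [("invoice_lookup", "accountant"),
                      ("tax_report", "accountant"),
                      ("hr_leave", "hr_manager")])
    conn.executemany("INSERT INTO tool_keywords VALUES (?, ?)",
                     [("invoice_lookup", "invoice"),
                      ("invoice_lookup", "vat"),
                      ("tax_report", "vat"),
                      ("hr_leave", "invoice")])
    return conn

def discover_tools(conn, role, query_terms, limit=3):
    """Rank tools permitted for `role` by keyword overlap, computed in SQL.

    The role filter plays the part of the ontology constraint; filtering,
    scoring, and ranking all run inside the database, so only the top
    candidates are returned to the application.
    """
    placeholders = ",".join("?" for _ in query_terms)
    sql = f"""
        SELECT t.name, COUNT(k.term) AS score
        FROM tools AS t
        JOIN tool_roles AS r ON r.tool = t.name
        JOIN tool_keywords AS k ON k.tool = t.name
        WHERE r.role = ? AND k.term IN ({placeholders})
        GROUP BY t.name
        ORDER BY score DESC, t.name
        LIMIT ?
    """
    return conn.execute(sql, [role, *query_terms, limit]).fetchall()
```

Only the placeholder count is interpolated into the query string; the role and search terms are passed as bound parameters, which avoids SQL injection.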
Improvements from ontology coupling are greatest where LLM parametric knowledge is weakest—particularly in Vietnam-localized domains.
Observed pattern reported from the controlled experiment across the five industries, with stronger improvements in Vietnam-localized domains (no per-industry sample sizes reported in abstract).
high positive Ontology-Constrained Neural Reasoning in Enterprise Agentic ... relative improvement magnitude by domain / localization
Ontology-coupled agents significantly outperform ungrounded agents on Role Consistency (p < .001, W = .614).
Controlled experiment with 600 runs; statistical test reported (p-value and W statistic provided in abstract).
Ontology-coupled agents significantly outperform ungrounded agents on Regulatory Compliance (p = .003, W = .318).
Controlled experiment with 600 runs; statistical test reported (p-value and W statistic provided in abstract).
Ontology-coupled agents significantly outperform ungrounded agents on Metric Accuracy (p < .001, W = .460).
Controlled experiment with 600 runs; statistical test reported (p-value and W statistic provided in abstract).
We formalize the concept of asymmetric neurosymbolic coupling, wherein symbolic ontological knowledge constrains agent inputs (context assembly, tool discovery, governance thresholds) while proposing mechanisms for extending this coupling to constrain agent outputs (response validation, reasoning verification, compliance checking).
Theoretical/formalization contribution described in the paper (conceptual and methodological development).
high positive Ontology-Constrained Neural Reasoning in Enterprise Agentic ... asymmetric neurosymbolic coupling formalization and proposed mechanisms
Our approach introduces a three-layer ontological framework--Role, Domain, and Interaction ontologies--that provides formal semantic grounding for LLM-based enterprise agents.
Design contribution described in the paper (formal model specification).
high positive Ontology-Constrained Neural Reasoning in Enterprise Agentic ... existence of a formal three-layer ontology for semantic grounding
We present a neurosymbolic architecture implemented within the Foundation AgenticOS (FAOS) platform that addresses these limitations through ontology-constrained neural reasoning.
System design and implementation claim: description of architecture and its implementation in the FAOS platform (technical/design evidence reported in paper).
high positive Ontology-Constrained Neural Reasoning in Enterprise Agentic ... ability to constrain LLM reasoning (reduce hallucination, domain drift, improve ...
The analysis identifies seventeen emerging occupational categories benefiting from reinstatement effects, concentrated in human-AI collaboration, AI governance, and domain-specific AI operations roles.
Modeling/taxonomy result reported in the paper listing 17 emerging occupational categories characterized as benefiting from reinstatement effects (human-AI collaboration, governance, operations).
high positive Agentic AI and Occupational Displacement: A Multi-Regional T... emergence/creation of occupational categories (employment opportunities)
Our findings indicate increasing agent activity in open-source projects.
Trend analysis reported in the paper showing growth in agent-originated activity within the assembled dataset of PRs and associated metadata.
high positive Investigating Autonomous Agent Contributions in the Wild: Ac... agent activity / contributions in open-source projects over time
Effective collaboration with AI for software engineering (SE) tasks may benefit from functional design rather than replicating human SEI traits, thereby redefining collaboration as functional alignment.
Authors' conclusion and recommendation derived from qualitative interview evidence (10 practitioners) and the proposed concept of functional equivalents.
high positive Bridging the Socio-Emotional Gap: The Functional Dimension o... effectiveness of human-AI collaboration in SE tasks