The Commonplace

Evidence (13827 claims)

Adoption: 8454 claims
Productivity: 7544 claims
Governance: 6789 claims
Human-AI Collaboration: 6327 claims
Org Design: 4126 claims
Innovation: 4058 claims
Labor Markets: 3520 claims
Skills & Training: 2924 claims
Inequality: 2057 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome                    Positive  Negative  Mixed  Null  Total
Other                           749       195     97   889   1979
Governance & Regulation         815       391    188   121   1539
Organizational Efficiency       771       189    124    83   1177
Technology Adoption Rate        624       233    123    96   1084
Research Productivity           410       121     56   331    929
Output Quality                  466       177     59    47    749
Decision Quality                320       174     75    42    618
Firm Productivity               435        55     88    20    604
AI Safety & Ethics              214       276     65    33    593
Market Structure                178       166    122    24    495
Task Allocation                 206        64     70    31    376
Skill Acquisition               165        57     60    17    299
Innovation Output               201        27     41    18    288
Employment Level                105        51    107    13    278
Fiscal & Macroeconomic          131        69     43    26    276
Consumer Welfare                116        63     42    11    232
Firm Revenue                    149        46     26     3    224
Inequality Measures              44       122     49     6    221
Task Completion Time            169        29      8    12    219
Worker Satisfaction              89        61     20    12    182
Error Rate                       69        91     10     2    172
Regulatory Compliance            76        68     14     5    163
Training Effectiveness           92        19     13    19    145
Wages & Compensation             77        36     25     6    144
Automation Exposure              51        54     22    12    142
Team Performance                 86        17     27     9    140
Developer Productivity           94        17     14     6    132
Job Displacement                 12        80     20     1    113
Hiring & Recruitment             51         7      8     3     69
Skill Obsolescence                5        45      6     1     57
Creative Output                  31        16      7     2     57
Social Protection                27        16      8     2     53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
We evaluate collaborative performance from consensus-based routing among self-interested heterogeneous agents in AgentSociety on real-world datasets.
Empirical evaluation / experiments using real-world datasets to measure collaborative performance under consensus-based routing among heterogeneous agents.
high positive AgentSociety: Incentivizing Agentic Social Intelligence collaborative performance from consensus-based routing
We characterize the Nash equilibrium showing that agent payoffs are reflective of their marginal contributions.
Analytical game-theoretic characterization/proof of Nash equilibrium in the paper.
high positive AgentSociety: Incentivizing Agentic Social Intelligence agent payoffs relative to marginal contributions
The mechanism incentivizes agents to selectively disclose information to their neighbor agents when doing so aligns with their self-interest, in order to garner influence.
Theoretical analysis and mechanism design arguments (and possibly supporting simulations) within the paper.
high positive AgentSociety: Incentivizing Agentic Social Intelligence information disclosure behavior and influence acquisition among agents
Delegation to more competent neighbor agents is incentive compatible and naturally generates a multi-agent routing path by consensus.
Formal theoretical proof/analysis presented in the paper (analytical/theoretical result).
high positive AgentSociety: Incentivizing Agentic Social Intelligence delegation behavior and emergence of routing paths (multi-agent routing by conse...
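The delegation claim above can be illustrated with a minimal sketch. It is not the paper's mechanism: it assumes each agent has a single known competence score and simply hands the task to its most competent neighbor whenever that neighbor is strictly more competent than itself. The names `delegation_path`, `EXAMPLE_COMPETENCE`, and `EXAMPLE_NEIGHBORS`, and all scores, are hypothetical.

```python
from typing import Dict, List

# Hypothetical competence scores and neighbor lists (illustration only).
EXAMPLE_COMPETENCE: Dict[str, float] = {"a": 0.3, "b": 0.5, "c": 0.9, "d": 0.4}
EXAMPLE_NEIGHBORS: Dict[str, List[str]] = {
    "a": ["b", "d"],
    "b": ["a", "c"],
    "c": ["b"],
    "d": ["a"],
}


def delegation_path(
    start: str,
    competence: Dict[str, float],
    neighbors: Dict[str, List[str]],
) -> List[str]:
    """Follow delegations from `start` until no neighbor is strictly more competent.

    Assumption: every agent delegates to its most competent neighbor if and
    only if that neighbor's score is strictly higher than its own. Because each
    hop strictly increases competence, the chain cannot cycle.
    """
    path = [start]
    current = start
    while True:
        better = [
            n for n in neighbors.get(current, [])
            if competence[n] > competence[current]
        ]
        if not better:
            return path  # the current agent keeps the task
        current = max(better, key=lambda n: competence[n])
        path.append(current)


if __name__ == "__main__":
    print(delegation_path("a", EXAMPLE_COMPETENCE, EXAMPLE_NEIGHBORS))
```

In this toy graph a task starting at agent `a` is routed through `b` to `c`, while agent `d` keeps its own task because its only neighbor is less competent.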
We propose AgentSociety, a mechanism that enables decentralized agentic collaboration grounded in liquid democracy and information diffusion from social choice theory.
Description and design of the AgentSociety mechanism in the paper (mechanism proposal / system design).
high positive AgentSociety: Incentivizing Agentic Social Intelligence ability of agents to operate autonomously, strategically communicate, behave col...
AI assistance can stabilize an overloaded workflow only when (i) the fraction of tasks handled by AI exceeds a critical threshold, and (ii) the human attention required for review and expected rework is lower than the attention required for manual completion.
Formal analytical conditions derived from the paper's queueing model (model-based theoretical result; no empirical sample reported).
high positive Queue & AI: When Faster Tasks Slow Down the Workflow organizational_efficiency
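The two conditions in this claim can be made concrete with a small sketch. It is not the paper's queueing model: it assumes a single human reviewer whose attention load is the arrival rate times the average attention per task, and that an AI-handled task costs the human a review time plus a rework time weighted by a rework probability. All function and parameter names are hypothetical.

```python
from typing import Optional


def human_load(arrival_rate: float, manual_time: float, ai_fraction: float,
               review_time: float, rework_prob: float, rework_time: float) -> float:
    """Human attention demanded per unit time (stable only if below 1)."""
    ai_cost = review_time + rework_prob * rework_time  # attention per AI-handled task
    per_task = (1 - ai_fraction) * manual_time + ai_fraction * ai_cost
    return arrival_rate * per_task


def critical_ai_fraction(arrival_rate: float, manual_time: float,
                         review_time: float, rework_prob: float,
                         rework_time: float) -> Optional[float]:
    """Smallest AI share that must be exceeded for the load to drop below 1.

    Returns 0.0 if the workflow is already stable without AI, and None if no
    AI share can stabilize it under these assumptions.
    """
    ai_cost = review_time + rework_prob * rework_time
    if arrival_rate * manual_time < 1:
        return 0.0  # not overloaded to begin with
    if ai_cost >= manual_time:
        return None  # condition (ii) fails: review plus rework is not cheaper
    threshold = (arrival_rate * manual_time - 1) / (
        arrival_rate * (manual_time - ai_cost)
    )
    return threshold if threshold < 1 else None


if __name__ == "__main__":
    # Hypothetical numbers: 10 tasks/hour, 0.12 h manual, 0.03 h review,
    # 10% rework probability, 0.12 h per rework.
    print(critical_ai_fraction(10, 0.12, 0.03, 0.10, 0.12))
```

Under these assumptions the threshold in condition (i) exists only when condition (ii) holds, which mirrors the "only when" structure of the claim.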
LLM-assisted systems make candidate generation, code comprehension, harness construction, proof-of-impact drafting, and report preparation cheaper at codebase scale.
Argument supported by analysis using public data from Anthropic's Mythos Preview and Mozilla Firefox collaborations (qualitative and illustrative examples; no sample size reported in the provided text).
high positive Demystifying the Mythos or Disrupting Bugonomics? From Zero-... cost/effort to produce candidate vulnerabilities (generation, comprehension, har...
The paper calls for action by stakeholders to consider human and environmental moderators when adopting AI.
Policy/recommendation statement in the paper's conclusion/abstract; normative recommendation rather than empirical finding.
high positive Position: Adopting AI in Practice Does Not Guarantee the Pro... stakeholder policies and actions regarding AI adoption and moderation
We revise the existing framework to redefine effective organizational determinants and shed light on practical implications including industry and education.
Authors' proposed theoretical revision of an existing framework and discussion of implications; presented as a conceptual contribution within the paper.
high positive Position: Adopting AI in Practice Does Not Guarantee the Pro... organizational determinants and practical implications for industry and educatio...
Most practitioners assume that AI brings productivity boosts owing to enhanced technical capabilities.
Statement of common practitioner belief reported by the authors in the paper's framing; no supporting survey or sample reported in the abstract.
high positive Position: Adopting AI in Practice Does Not Guarantee the Pro... perceived productivity benefits from AI
Adoption of Claude Code increases cumulative lifetime languages used by +0.51.
Panel analysis of 5,838 developers over 28 months using the Callaway & Sant'Anna estimator; treatment = first Claude-co-authored commit.
high positive Coding Beyond Your Training: Claude Code and the Technologic... cumulative lifetime programming languages (count)
Adoption of Claude Code increases the count of newly-used languages by +0.31.
Panel of 5,838 developers over 28 months with the staggered-rollout Callaway & Sant'Anna estimator; treatment = first Claude-co-authored commit; not-yet-treated controls.
high positive Coding Beyond Your Training: Claude Code and the Technologic... newly-used programming languages (monthly)
Adoption of Claude Code increases Shannon language entropy by +0.14.
Estimated with the doubly robust Callaway & Sant'Anna approach on the 5,838-developer panel over 28 months, using first Claude-co-authored commit as treatment.
high positive Coding Beyond Your Training: Claude Code and the Technologic... Shannon language entropy (diversity of languages used)
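For readers unfamiliar with the entropy outcome, here is a minimal sketch of Shannon entropy over a developer's language mix. The summary does not say which logarithm base or which unit of activity (commits, files, lines) the paper uses, so base 2 and per-language commit counts are assumptions, and `shannon_entropy` is a hypothetical helper name.

```python
import math
from typing import Dict


def shannon_entropy(counts: Dict[str, int], base: float = 2.0) -> float:
    """Shannon entropy H = -sum(p_i * log(p_i)) of a language-usage distribution.

    `counts` maps a language name to a non-negative activity count
    (assumed here to be commits). Languages with zero count contribute nothing.
    """
    total = sum(counts.values())
    if total <= 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log(p, base)
    return entropy


if __name__ == "__main__":
    # Hypothetical developer-months: one language versus an even two-way split.
    print(shannon_entropy({"Python": 40}))
    print(shannon_entropy({"Python": 20, "Go": 20}))
```

Entropy is zero when all activity is in one language and grows as activity spreads more evenly across more languages, which is why it serves as a diversity measure.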
Adoption of Claude Code increases the number of distinct programming languages used by a developer by +0.83.
Panel of 5,838 developers over 28 months with the staggered-rollout Callaway & Sant'Anna estimator; treatment = first Claude-co-authored commit.
high positive Coding Beyond Your Training: Claude Code and the Technologic... distinct programming languages used (monthly)
Adoption of Claude Code increases the number of repositories a developer contributes to by +1.5 (monthly).
Same panel (5,838 developers, 28 months) and estimator (Callaway & Sant'Anna). Treatment = first Claude-co-authored commit; not-yet-treated controls.
high positive Coding Beyond Your Training: Claude Code and the Technologic... repositories contributed to (monthly)
Adoption of Claude Code is associated with an increase of +41 monthly commits per developer.
Analysis of a panel of 5,838 GitHub developers observed monthly over 28 months, exploiting staggered rollout of Claude Code (May 2025–Jan 2026). Treatment defined by developer's first Claude-co-authored commit; not-yet-treated developers used as controls. Estimates from the doubly robust Callaway and Sant'Anna (2021) staggered-difference-in-differences estimator.
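The staggered design described above can be sketched in a few lines. This is a simplified, unconditional version of a group-time average treatment effect with not-yet-treated controls; it omits the covariate adjustment that makes the Callaway and Sant'Anna estimator doubly robust, and it does not aggregate across groups and periods. The function name `att_gt` and the toy panel are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def att_gt(outcomes: Dict[str, List[float]], first_treated: Dict[str, int],
           g: int, t: int) -> float:
    """Group-time effect for units first treated in period g, evaluated at period t.

    Compares the outcome change from the last pre-treatment period (g - 1) to t
    for the treated cohort against the same change for units not yet treated
    by period t (first_treated > t).
    """
    if g < 1 or t < g:
        raise ValueError("need g >= 1 and t >= g")
    base = g - 1
    treated = [u for u, g_u in first_treated.items() if g_u == g]
    controls = [u for u, g_u in first_treated.items() if g_u > t]
    if not treated or not controls:
        raise ValueError("need at least one treated and one not-yet-treated unit")
    treated_change = [outcomes[u][t] - outcomes[u][base] for u in treated]
    control_change = [outcomes[u][t] - outcomes[u][base] for u in controls]
    return mean(treated_change) - mean(control_change)


# Toy panel (hypothetical): a common trend of +1 per period and a +5 jump
# from each unit's first treated period onward.
EXAMPLE_OUTCOMES = {
    "A": [10, 11, 17, 18, 19],  # first treated in period 2
    "B": [20, 21, 22, 23, 29],  # first treated in period 4
    "C": [5, 6, 7, 8, 14],      # first treated in period 4
}
EXAMPLE_FIRST_TREATED = {"A": 2, "B": 4, "C": 4}

if __name__ == "__main__":
    print(att_gt(EXAMPLE_OUTCOMES, EXAMPLE_FIRST_TREATED, g=2, t=3))
```

Using not-yet-treated units as controls means later adopters stand in for the counterfactual trend only during periods before their own first Claude-co-authored commit.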
Case studies demonstrate exact power-water consistency between virtual attributions and physical generation-side withdrawals.
Simulation results on the IEEE 30-bus and 118-bus test systems, for which the paper claims exact consistency. Sample size: 2 test systems.
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... power-water consistency (alignment between attributed virtual water and physical...
Case studies on the IEEE 30-bus and 118-bus test systems demonstrate reliable convergence of the method.
Simulation experiments reported in the paper using two standard test systems (IEEE 30-bus and IEEE 118-bus). Sample size: 2 test systems.
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... convergence of the algorithm/method in simulations
Combined with fixed-point coordination, the framework enforces consistency between virtual water attribution and physical generation-side withdrawals.
Methodological claim about algorithmic properties (fixed-point coordination used to align attributions with physical withdrawals); supported by theoretical description and later case-study demonstrations.
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... consistency between virtual water attribution and physical generation withdrawal...
The framework represents dispatch optimization as a differentiable optimization layer embedded within a deep learning architecture, enabling efficient end-to-end learning of coordination policies while preserving operational feasibility.
Methodological description claiming an implementation approach (differentiable optimization layer within deep learning); evidence likely from algorithmic implementation and simulation experiments described later in the paper.
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... efficiency of end-to-end learning of coordination policies and preservation of o...
This paper develops an operational electricity-computation-water (ECW) nexus framework that internalizes virtual water impacts directly into power system dispatch.
Primary methodological contribution described in the paper (development and formulation of an ECW framework; implementation details implied but not quantified in the excerpt).
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... integration of virtual water impacts into dispatch optimization
The expansion of data centers (DCs) drives a sustained increase in electricity demand and associated water withdrawals at generation sites.
Background assertion in paper introduction; general empirical observation motivating the work (no specific dataset or sample size reported in the excerpt).
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... electricity demand and associated water withdrawals at power generation sites
The contribution is a benchmark-ready evaluation framework for runtime actuarial control of autonomous-agent side effects.
Paper presents the AAI, Authority Frontier, metrics (C_full, Capital@k), taxonomy, implementations and experimental traces; authors present it as benchmark-ready.
high positive Insuring Every Action: An Authority Frontier Framework for R... availability of a benchmark-ready evaluation framework
We report a live Postgres panel in which three Azure-hosted models propose actions through the same contract.
Live-panel experiment described in the paper using three Azure-hosted models interacting with a Postgres panel under the AAI contract.
high positive Insuring Every Action: An Authority Frontier Framework for R... models proposing actions under the contract in a live Postgres setup
We instantiate AAI across four agentic environments (database mutation, customer-service refund, and the public tau-bench retail and airline tool-use traces).
Empirical instantiation described in the paper across four named environments/traces.
high positive Insuring Every Action: An Authority Frontier Framework for R... successful instantiation of AAI across multiple agentic environments
The framework provides (i) a deterministic quote-bind-commit protocol with toll-bounded capability tokens; (ii) a universal seven-class action taxonomy mapping heterogeneous tool calls to comparable authority units; (iii) replay determinism and pathwise reserve coverage under alpha-spending; (iv) cross-domain normalization via full reserve demand C_full and capital metrics Capital@k.
System design and theoretical specification in the paper; described as implemented across experiments.
high positive Insuring Every Action: An Authority Frontier Framework for R... availability of protocol, taxonomy, determinism properties, and normalization me...
We develop the Authority Frontier, an evaluation primitive measuring how much autonomous authority the runtime releases at each level of reserve capital.
Methodological contribution (definition and formulation of the Authority Frontier) described in the paper; subsequently instantiated empirically in experiments.
high positive Insuring Every Action: An Authority Frontier Framework for R... amount of autonomous authority released as a function of reserve capital
We propose the Actuarial Action Interface (AAI), a deterministic runtime contract that prices each such action against a contractually fixed safe default under a time-consistent risk mapping, and gates execution against a per-boundary reserve capital budget.
Methodological design and proposal described in the paper (no empirical test reported for the claim itself).
high positive Insuring Every Action: An Authority Frontier Framework for R... ability to price actions and gate execution via a deterministic runtime contract
A profile-driven approach places humans and AI systems on shared scales, supporting comparisons that are predictive of novel-task performance, explanatory of why agents succeed or fail, and auditable.
Claim about anticipated benefits of the proposed profile-driven approach presented in the paper (theoretical argument; no empirical results reported).
high positive Reverse Turing Tests for Human-Machine Task Suitability Asse... predictive validity for novel-task performance; explanatory power; auditability ...
Suitability evaluations for task-assignment should be profile-driven — based on assessments that infer latent constructs such as capabilities and propensities from observed performance.
Core proposal of the position paper (conceptual/methodological recommendation; no empirical pilot or validation reported).
high positive Reverse Turing Tests for Human-Machine Task Suitability Asse... method for conducting suitability evaluations (profile-driven assessment of late...
As AI is integrated into the workplace, organisations increasingly face allocation decisions between human and machine workers, and these decisions are increasingly made or assisted by algorithms.
Position paper / conceptual argument in the paper's introduction (no empirical sample or quantitative data reported).
high positive Reverse Turing Tests for Human-Machine Task Suitability Asse... use of algorithms to make or assist allocation decisions between human and machi...
The paper proposes a policy architecture for 'shared gains' centered on learning equity, transition protections, accountable algorithmic management, and distribution-sensitive metrics beyond GDP.
Paper's normative policy proposal presented in abstract, based on the integrative framework and synthesis of secondary sources; no empirical sample size reported.
high positive ARTIFICIAL INTELLIGENCE, INEQUALITIES OF KNOWLEDGE AND RESOU... policy architecture elements for inclusive AI transitions
India's macro growth remains robust.
Statement in abstract referencing official Indian statistics (MoSPI–NSO GDP estimates, 2025); no numerical sample size provided in abstract.
Evidence indicates accelerating AI adoption among firms in advanced economies.
Abstract cites validated secondary sources including OECD (2026) and other global reports; no primary sample size reported in paper abstract.
high positive ARTIFICIAL INTELLIGENCE, INEQUALITIES OF KNOWLEDGE AND RESOU... rate of AI adoption among firms in advanced economies
AI is increasingly embedded in production, services, and workforce management.
Statement in paper's abstract supported by integrative socio-technical political economy framework and validated secondary sources (OECD, ILO, UNDP, WTO, WEF). No primary sample size reported.
high positive ARTIFICIAL INTELLIGENCE, INEQUALITIES OF KNOWLEDGE AND RESOU... degree of AI embedding in production, services, and workforce management
Future A2A collaboration networks cannot rely on unverified self-reporting alone; scalable collaboration requires mechanisms that balance open participation with verifiable execution and trustworthy evaluation.
Paper's concluding recommendation based on the empirical problems documented (low reuse, ranking manipulation, vacuous validations).
high positive Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent... policy / mechanism design for verification and evaluation
EvoMap's credit economy rewards agents for publishing valuable assets, encouraging participation at scale.
Description and analysis of the platform's reward mechanism and observed high participation (agent counts); empirical linkage between reward rules and publishing behavior discussed in the paper.
high positive Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent... participation / publishing activity
The experiment provides causal evidence that structured AI-based interventions can transform access to scientific feedback from a largely private advantage into a more widely distributed resource.
Causal inference based on randomized field experiment showing increased revision likelihood and broader uptake of LLM tools across diverse regions and author groups.
high positive Human-AI Collaboration in Science at Scale: A Global Large-s... access and distribution of scientific feedback (measured via treated authors' be...
Effects were strongest among teams with lower h-indexes and earlier career stages.
Heterogeneous treatment effects by team-level metrics (h-index) and career stage reported in the randomized experiment.
high positive Human-AI Collaboration in Science at Scale: A Global Large-s... treatment effect (e.g., revision likelihood) by team h-index and author career s...
Effects were strongest for manuscripts less embedded in the scholarly literature.
Heterogeneous treatment effects reported by manuscript-level embedding in literature (e.g., referencing/citation context) within the randomized experiment.
high positive Human-AI Collaboration in Science at Scale: A Global Large-s... treatment effect (e.g., revision likelihood) by degree of manuscript embeddednes...
Effects of AI feedback were strongest among authors from non-English-dominant research regions.
Heterogeneous treatment effects reported in the randomized experiment stratified by authors' geographic / language-dominance region; sample includes authors from 133 geographic regions.
high positive Human-AI Collaboration in Science at Scale: A Global Large-s... treatment effect on revision likelihood (or other measured outcomes) by region
Exposure to AI feedback increased authors' subsequent use of LLM tools in their future papers, suggesting longer-run shifts in scientific practice.
Follow-up measurements in the randomized field experiment tracking authors' later behavior (use of LLM tools in subsequent papers); comparison between treatment and control authors.
high positive Human-AI Collaboration in Science at Scale: A Global Large-s... subsequent use of LLM tools in future papers
Authors who received LLM-generated feedback had a significantly higher likelihood of revising their manuscripts, corresponding to a 12.55% relative increase over the baseline revision rate.
Randomized field experiment comparing treatment (LLM feedback) vs control; sample described as >31,000 arXiv preprints and >45,000 researchers; reported comparative revision rate and statistical significance.
high positive Human-AI Collaboration in Science at Scale: A Global Large-s... likelihood (probability) of revising manuscripts
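A relative increase is easy to misread as a percentage-point change, so a short arithmetic sketch may help. The baseline revision rate is not given in this summary; the 20% used below is purely hypothetical, and the helper names are illustrative.

```python
def relative_increase(baseline_rate: float, treated_rate: float) -> float:
    """Relative lift of the treated rate over the baseline rate."""
    return (treated_rate - baseline_rate) / baseline_rate


def treated_rate_from_lift(baseline_rate: float, lift: float) -> float:
    """Treated rate implied by a baseline rate and a relative lift."""
    return baseline_rate * (1 + lift)


if __name__ == "__main__":
    baseline = 0.20  # hypothetical baseline revision rate, not from the paper
    lift = 0.1255    # the 12.55% relative increase reported in the claim
    treated = treated_rate_from_lift(baseline, lift)
    print(f"treated rate: {treated:.4f}")
    print(f"percentage-point change: {(treated - baseline) * 100:.2f}")
```

With a hypothetical 20% baseline, a 12.55% relative increase corresponds to a change of about 2.5 percentage points, not 12.55 points.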
A difference-in-differences design centered on ChatGPT's release supports a causal interpretation of GenAI's local labor-market effects.
Quasi-experimental difference-in-differences analysis using ChatGPT's release as an event/shock, comparing outcomes across neighborhoods with different pre-existing GenAI exposure measures derived from 5 million job postings.
high positive Generative AI impacts on intra-urban inequality and skill pr... causal effect of GenAI exposure on neighborhood-level labor-market outcomes (e.g...
A human-centered approach is needed that integrates technological advancement with reskilling initiatives, labor protections, and inclusive policies.
Authors' prescriptive recommendation based on their thematic synthesis of the reviewed literature (2010–2024).
high positive Artificial Intelligence in Manufacturing policy and programmatic responses (reskilling, protections, inclusion)
The integration of AI into manufacturing offers substantial gains in efficiency, productivity, and operational performance.
Authors' systematic literature review of interdisciplinary studies (2010–2024) using thematic synthesis; synthesis of prior empirical and conceptual studies reporting efficiency/productivity effects of AI in manufacturing.
high positive Artificial Intelligence in Manufacturing efficiency, productivity, and operational performance
A-insensitivity increases with financial literacy, suggesting financially literate decision-makers perceive greater ambiguity in prediction accuracy.
Association reported in the incentivized laboratory experiment between participants' measured financial literacy and their measured a-insensitivity (correlational evidence; sample size not reported in abstract).
high positive Trusting human versus machine predictions as a decision unde... a-insensitivity (ambiguity-generated insensitivity)
Decision-makers hold more optimistic beliefs about the accuracy of ML analysts than about human analysts, and this greater optimism predicts higher trust in ML analysts relative to human analysts.
Incentivized laboratory experiment measuring participants' optimism about forecast accuracy for human vs. ML analysts and examining the relationship between those beliefs and expressed trust (correlational/regression evidence; sample size not reported in abstract).
high positive Trusting human versus machine predictions as a decision unde... optimism about forecast accuracy and trust in analyst
A human-centred approach underpinned by ongoing reskilling and ethical governance is vital for sustainable workforce evolution in the Indian IT sector.
Authors' policy recommendation derived from their literature synthesis and thematic analysis (qualitative conclusion).
high positive Human–AI Collaboration in the Indian IT Industry: A Qualitat... sustainability of workforce evolution (effect of human-centred reskilling and go...
The paper introduces a conceptual framework for hybrid intelligence within the Indian IT sector.
Authors present a new conceptual framework as part of this qualitative research article (conceptual contribution).
high positive Human–AI Collaboration in the Indian IT Industry: A Qualitat... conceptual framework introduction