The Commonplace

Evidence (4114 claims)

Adoption (8570 claims)
Productivity (7631 claims)
Governance (6869 claims)
Human-AI Collaboration (6491 claims)
Org Design (4175 claims)
Innovation (4114 claims)
Labor Markets (3566 claims)
Skills & Training (2966 claims)
Inequality (2066 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
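
The matrix can be read as direction shares per outcome. The sketch below is a minimal illustration, not part of the dashboard: it copies three rows from the table and divides each direction count by the row's Total. The function name `direction_shares` is hypothetical. In some rows the four direction counts add up to less than the Total (Firm Productivity sums to 600 against a Total of 606), so the shares are taken against the Total column and need not sum to 1.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
ROWS = {
    "Firm Productivity": (435, 57, 88, 20, 606),
    "Job Displacement": (12, 80, 20, 1, 113),
    "Error Rate": (69, 92, 10, 2, 173),
}

DIRECTIONS = ("positive", "negative", "mixed", "null")


def direction_shares(counts):
    """Return each direction's share of the row total, rounded to 3 places."""
    *by_direction, total = counts
    return {name: round(value / total, 3)
            for name, value in zip(DIRECTIONS, by_direction)}


if __name__ == "__main__":
    for outcome, counts in ROWS.items():
        print(outcome, direction_shares(counts))
```

On these rows, about 72% of Firm Productivity claims are positive, while about 71% of Job Displacement claims are negative.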
Filter: Innovation
AutoScientists is a decentralized team of AI agents that interpret a shared experimental state, self-organize into teams around promising hypotheses, critique proposals before using experimental compute, and share successes and failures to reduce redundant exploration.
System design and implementation described in the paper (architecture and agent protocols); qualitative description of agent behaviors and coordination mechanisms; demonstrated in experiments.
high positive AutoScientists: Self-Organizing Agent Teams for Long-Running... agent coordination and information sharing (qualitative description)
GENESIS is built on three composable primitives (agents, skills, hooks) and a knowledge layer (SYNAPSE) that doubles as the source of ground truth and the recipient of every artifact the framework produces, making capabilities compound across runs.
Architectural description in the paper; claim about knowledge base acting as ground truth and enabling capability compounding (design-level claim). No quantitative evaluation given in the abstract.
high positive GENESIS: Harnessing AI Agents for Autonomous 6G RAN Synthesi... accumulation/compounding of capabilities across runs (longitudinal improvement o...
GENESIS is an agentic AI framework that converts intents (e.g., a specification clause, a telemetry anomaly, or a research hypothesis) into solutions validated with over-the-air experiments, fed back into a persistent knowledge base.
System design / implementation claim presented in the paper (description of proposed framework). The abstract does not report empirical evaluation metrics or sample size.
high positive GENESIS: Harnessing AI Agents for Autonomous 6G RAN Synthesi... ability to produce solutions validated by over-the-air experiments (end-to-end R...
Large Language Models (LLMs) have compressed comparable R&D work in general software engineering from days to minutes.
Paper's stated comparison/claim (likely based on prior reports or authors' experience); no experimental details or sample size provided in the abstract.
high positive GENESIS: Harnessing AI Agents for Autonomous 6G RAN Synthesi... time to complete R&D/software engineering tasks
Operational reasoning paradigms such as ReasonOps may become foundational infrastructure for next-generation trustworthy AI ecosystems.
Author's forward-looking argument / conjecture about the potential future impact and adoption of operational reasoning paradigms; presented as an argument rather than demonstrated empirically in the excerpt.
high positive ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... future adoption / foundational role of operational reasoning paradigms
The paper presents the ReasonOps architecture, demonstrates its workflow using an autonomous braking system analysis example, and discusses its potential role in future safety-critical autonomous AI systems.
Author statement about the paper's content and demonstration (explicitly claims an architecture and an example walkthrough); evidence is the paper's own descriptive content.
high positive ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... presence of architecture and example demonstration in the paper
The proposed paradigm integrates semantic interpretation, autoformalization, symbolic reasoning, theorem proving, runtime assurance, probabilistic reliability estimation, and adaptive correction into a unified reasoning lifecycle.
Author claim about the architecture and components of ReasonOps; presented as a proposed integrated lifecycle in the paper (no empirical evaluation reported in excerpt).
high positive ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... integration of multiple reasoning and assurance components
ReasonOps treats reasoning as a continuously monitored, verifiable, reliability-aware operational process rather than an isolated inference task.
Author description of the ReasonOps paradigm and its operational stance (conceptual framework described in paper).
high positive ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... operationalization of reasoning processes (monitoring, verification, reliability...
This paper introduces ReasonOps, a unified operational paradigm for trustworthy verified reasoning systems.
Declarative claim about the paper's contribution (introduction of a named paradigm); supported by the paper itself (architectural description and example claimed).
high positive ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... existence/introduction of an operational paradigm (ReasonOps)
Recent advances in theorem proving, autoformalization, symbolic reasoning, and tool-augmented language models demonstrate substantial progress toward machine-assisted formal reasoning.
Author statement citing multiple research directions (theorem proving, autoformalization, symbolic reasoning, tool-augmented LMs); no specific empirical results or quantitative studies provided in excerpt.
high positive ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... progress toward machine-assisted formal reasoning
Large Language Models (LLMs) have transformed artificial intelligence from primarily generative systems into increasingly capable reasoning agents.
Author assertion in paper's introduction; conceptual argument referencing recent developments in LLMs (no empirical study or sample size reported in text excerpt).
high positive ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... capability of LLMs to perform reasoning
There exists a data supply chain that runs from individual translators through language service providers (LSPs) and platforms to model developers.
Mapping and descriptive analysis of industry supply chains and intermediary roles provided in the paper; conceptual and empirical examples of flows of translation data from translators to model developers. No numerical sample reported.
high positive Translators as Invisible Teachers of AI: Copyright, Translat... structure and flow of translation data across actors
Article 30-4 of the Japanese Copyright Act legitimates a mode of use the paper terms 'appropriation without consumption'—i.e., mining works for statistical features rather than reading or experiencing them.
Textual/legal analysis of Article 30-4 of the Japanese Copyright Act and its interpretation; comparative legal reading presented in the paper. No numerical sample reported.
high positive Translators as Invisible Teachers of AI: Copyright, Translat... legal legitimation of non-experiential mining of copyrighted works
The development of statistical machine translation (SMT), neural machine translation (NMT), the Transformer architecture, and multilingual large language models (LLMs) cannot be disentangled from the accumulation of translation data (TM/parallel corpora).
Historical and technical literature review linking MT/NLP methodological advances to the availability and use of parallel corpora and TM; comparative analysis of model development histories described in the paper. No numerical sample reported.
high positive Translators as Invisible Teachers of AI: Copyright, Translat... dependence of major MT/LLM advances on accumulated translation data
Translation memories (TM) and parallel corpora preserve a one-to-one correspondence between source and target text and therefore constitute extraordinarily valuable supervised training data for machine translation.
Conceptual argument and literature review of machine translation practice (discussion of TM/parallel corpora as supervised training data); examples and descriptive evidence from MT research and industry practice presented in the paper. No numerical sample reported.
high positive Translators as Invisible Teachers of AI: Copyright, Translat... value of translation data as supervised training inputs for MT
To balance promotion of innovation with preservation of human creativity, it is essential to revise existing laws and introduce novel approaches such as defining a specific intellectual property right for AI-generated works or designating ownership among associated human agents.
Normative recommendation derived from the paper's comparative legal analysis and discussion of enforcement challenges (no empirical sample size).
high positive Examining the Challenges of Intellectual Property in AI-Gene... policy/legislative reforms to IP law for AI-generated works
Artificial intelligence systems are capable of autonomously generating artistic, literary, musical works, and even inventions without direct human intervention.
Stated as part of the paper's premise and supported by the paper's literature/theoretical review of advances in AI creative and inventive capabilities (no empirical sample size reported).
high positive Examining the Challenges of Intellectual Property in AI-Gene... existence/capability of AI to autonomously generate creative works and invention...
The proposed policy framework contributes to establishing a foundation for Vietnam to proactively embrace the Agent Economy safely and effectively.
Claim in abstract about the intended contribution/impact of the proposed framework; no empirical evaluation or measured outcomes presented.
high positive Regulatory Policy for the Agent Economy in the Digital Age: ... capacity of Vietnam to embrace Agent Economy safely/effectively (foundation-buil...
The Agent Economy promises substantial gains in productivity and innovation.
Asserted in paper abstract as an anticipated outcome; no empirical measurement, sample size, or quantified effect provided.
high positive Regulatory Policy for the Agent Economy in the Digital Age: ... productivity and innovation gains
Adoption of Claude Code increases cumulative lifetime languages used by +0.51.
Panel analysis of 5,838 developers over 28 months using the Callaway & Sant'Anna estimator; treatment = first Claude-co-authored commit.
high positive Coding Beyond Your Training: Claude Code and the Technologic... cumulative lifetime programming languages (count)
Adoption of Claude Code increases the count of newly-used languages by +0.31.
Same dataset and staggered-rollout estimator (Callaway & Sant'Anna), treatment = first Claude-co-authored commit; not-yet-treated controls.
high positive Coding Beyond Your Training: Claude Code and the Technologic... newly-used programming languages (monthly)
Adoption of Claude Code increases Shannon language entropy by +0.14.
Estimated with the doubly robust Callaway & Sant'Anna approach on the 5,838-developer panel over 28 months, using first Claude-co-authored commit as treatment.
high positive Coding Beyond Your Training: Claude Code and the Technologic... Shannon language entropy (diversity of languages used)
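
For readers unfamiliar with the diversity measure named above, Shannon entropy over a developer's language mix can be sketched as follows. This is an illustration under stated assumptions: the listing does not say what unit the paper counts (commits, files, or lines) or which logarithm base it uses, so the per-commit input and the natural-log default here are assumptions, and `shannon_entropy` is a hypothetical helper name.

```python
import math
from collections import Counter


def shannon_entropy(language_per_commit, base=math.e):
    """Shannon entropy H = -sum(p_i * log(p_i)) over language shares.

    language_per_commit: one language label per commit (assumed unit).
    base: logarithm base; e gives nats, 2 gives bits (assumed default).
    """
    counts = Counter(language_per_commit)
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())


if __name__ == "__main__":
    # Illustrative inputs only, not data from the paper.
    single = ["Python"] * 8
    mixed = ["Python"] * 4 + ["Rust"] * 4
    print(shannon_entropy(single))          # one language: no diversity
    print(shannon_entropy(mixed, base=2))   # two equally used languages
```

Entropy is zero when all activity is in one language and grows as activity spreads more evenly over more languages, which is why it complements a raw count of distinct languages.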
Adoption of Claude Code increases the number of distinct programming languages used by a developer by +0.83.
Same panel and staggered-rollout estimation as above (Callaway & Sant'Anna), treatment = first Claude-co-authored commit.
high positive Coding Beyond Your Training: Claude Code and the Technologic... distinct programming languages used (monthly)
Adoption of Claude Code increases the number of repositories a developer contributes to by +1.5 (monthly).
Same panel (5,838 developers, 28 months) and estimator (Callaway & Sant'Anna). Treatment = first Claude-co-authored commit; not-yet-treated controls.
high positive Coding Beyond Your Training: Claude Code and the Technologic... repositories contributed to (monthly)
Adoption of Claude Code is associated with an increase of +41 monthly commits per developer.
Analysis of a panel of 5,838 GitHub developers observed monthly over 28 months, exploiting staggered rollout of Claude Code (May 2025–Jan 2026). Treatment defined by developer's first Claude-co-authored commit; not-yet-treated developers used as controls. Estimates from the doubly robust Callaway and Sant'Anna (2021) staggered-difference-in-differences estimator.
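
The identification idea behind a staggered difference-in-differences with not-yet-treated controls can be sketched in a few lines. The code below is a simplified, unconditional version of a group-time average treatment effect, ATT(g, t): it compares the outcome change from period g-1 to t for units first treated in g against the same change for units not yet treated by t. It is not the doubly robust estimator the paper reports (there are no covariates, no propensity or outcome models, and no aggregation across groups), and the panel values are made up for illustration.

```python
from statistics import mean


def att_gt(panel, first_treated, g, t):
    """Unconditional ATT(g, t) using not-yet-treated units as controls.

    panel: {unit: {period: outcome}}
    first_treated: {unit: first treated period, or None if never treated}
    g: cohort (period of first treatment); t: evaluation period, t >= g.
    """
    if t < g:
        raise ValueError("this sketch only covers post-treatment periods (t >= g)")
    base = g - 1  # last pre-treatment period for cohort g
    treated = [u for u, start in first_treated.items() if start == g]
    controls = [u for u, start in first_treated.items()
                if start is None or start > t]

    def change(unit):
        return panel[unit][t] - panel[unit][base]

    return mean(change(u) for u in treated) - mean(change(u) for u in controls)


# Toy monthly-commit panel (illustrative numbers, not from the paper).
PANEL = {
    "a": {1: 10, 2: 10, 3: 50, 4: 52},  # adopts in period 3
    "b": {1: 20, 2: 21, 3: 22, 4: 23},  # adopts in period 5 (not yet treated)
    "c": {1: 5, 2: 5, 3: 6, 4: 6},      # never adopts
}
FIRST_TREATED = {"a": 3, "b": 5, "c": None}

if __name__ == "__main__":
    print(att_gt(PANEL, FIRST_TREATED, g=3, t=3))  # 39
    print(att_gt(PANEL, FIRST_TREATED, g=3, t=4))  # 40.5
```

Using not-yet-treated developers as controls means a later adopter contributes to the comparison group only in periods before its own first treated period.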
Case studies demonstrate exact power-water consistency between virtual attributions and physical generation-side withdrawals.
Simulation results on the IEEE 30-bus and 118-bus test systems; the paper reports exact consistency on both.
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... power-water consistency (alignment between attributed virtual water and physical...
Case studies on the IEEE 30-bus and 118-bus test systems demonstrate reliable convergence of the method.
Simulation experiments reported in the paper using two standard test systems (IEEE 30-bus and IEEE 118-bus). Sample size: 2 test systems.
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... convergence of the algorithm/method in simulations
Combined with fixed-point coordination, the framework enforces consistency between virtual water attribution and physical generation-side withdrawals.
Methodological claim about algorithmic properties (fixed-point coordination used to align attributions with physical withdrawals); supported by theoretical description and later case-study demonstrations.
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... consistency between virtual water attribution and physical generation withdrawal...
The framework represents dispatch optimization as a differentiable optimization layer embedded within a deep learning architecture, enabling efficient end-to-end learning of coordination policies while preserving operational feasibility.
Methodological description claiming an implementation approach (differentiable optimization layer within deep learning); evidence likely from algorithmic implementation and simulation experiments described later in the paper.
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... efficiency of end-to-end learning of coordination policies and preservation of o...
This paper develops an operational electricity-computation-water (ECW) nexus framework that internalizes virtual water impacts directly into power system dispatch.
Primary methodological contribution described in the paper (development and formulation of an ECW framework; implementation details implied but not quantified in the excerpt).
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... integration of virtual water impacts into dispatch optimization
The expansion of data centers (DCs) drives a sustained increase in electricity demand and associated water withdrawals at generation sites.
Background assertion in paper introduction; general empirical observation motivating the work (no specific dataset or sample size reported in the excerpt).
high positive From Accounting to Coordination: A Virtual Water-Aware Elect... electricity demand and associated water withdrawals at power generation sites
The aim is to keep autonomous agency composable while keeping accountability non-negotiable, so that coordination itself can become shared infrastructure for a human-AI society that is open, pluralistic, and governable.
Stated design/ethical objective in the paper; normative claim about intended social and governance outcomes rather than an empirically validated result.
high positive Foundation Protocol: A Coordination Layer for Agentic Societ... feasibility of composable autonomous agency combined with enforceable accountabi...
FP is designed to wrap and bridge existing protocols rather than replace them, enabling incremental adoption while reducing integration and governance overhead.
Design rationale/claim in the paper about interoperability and incremental adoption strategy; no empirical deployment, integration case studies, or measured overhead reductions presented.
high positive Foundation Protocol: A Coordination Layer for Agentic Societ... ability to interoperate with existing protocols and reduce integration/governanc...
FP treats policy, provenance, and audit as first-class concerns.
Design/architectural claim in the paper stating that policy, provenance, and audit are prioritized within FP; no empirical compliance or audit trials presented.
high positive Foundation Protocol: A Coordination Layer for Agentic Societ... integration of policy, provenance, and audit mechanisms into the protocol
FP provides economic primitives for metering, receipts, and settlement.
Design claim in the paper listing economic primitives as part of FP; no deployment or economic experiments reported.
high positive Foundation Protocol: A Coordination Layer for Agentic Societ... availability of built-in primitives for metering usage, issuing receipts, and pe...
FP supports native multi-party organization and event-based collaboration.
Feature/architecture claim in the paper describing native support for multi-party organization and event-driven collaboration; no empirical evaluation or user studies provided.
high positive Foundation Protocol: A Coordination Layer for Agentic Societ... support for multi-party organizational constructs and event-based collaboration ...
FP unifies heterogeneous entities, including agents, tools, resources, humans, institutions, and organizations.
Design specification/feature claim in the paper describing FP's data and entity model; no empirical interoperability study reported.
high positive Foundation Protocol: A Coordination Layer for Agentic Societ... ability to represent and integrate diverse entity types within the protocol
This paper introduces the Foundation Protocol (FP), a graph-first coordination layer for an emerging human-AI society.
Claim of authorship/introduction in the paper; architectural/design proposal rather than an evaluated system.
high positive Foundation Protocol: A Coordination Layer for Agentic Societ... existence of a proposed coordination layer (Foundation Protocol)
Agents need to form reliable relationships, organize multi-agent work, exchange value, support an AI economy, and stay safe and accountable under real-world oversight.
Normative/requirements statement in the paper describing necessary capabilities for scaled multi-agent systems; no empirical validation or experimental data provided.
high positive Foundation Protocol: A Coordination Layer for Agentic Societ... requirements for multi-agent operation (reliability of relationships, work organ...
Autonomous agents are moving from tools into a layer of social infrastructure: they browse, purchase, deploy software, manage systems, and increasingly interact with one another.
Statement in the paper's introductory/abstract text presenting an observed trend; conceptual/qualitative claim without empirical data or measured sample.
high positive Foundation Protocol: A Coordination Layer for Agentic Societ... degree of autonomous agent activity across social and economic functions (browsi...
The paper proposes five evaluation dimensions for AutoResearch systems: novelty, validity, impact, reliability, and provenance.
Paper explicitly proposes these five dimensions as an evaluation rubric; conceptual proposal.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... n/a (evaluation framework)
The field can be organized around five workflow conditions: literature and research grounding; hypothesis formation and planning; experimentation and tool use; feedback, validation, and review; and reporting and knowledge communication.
Authors propose this five-condition organizational framework as part of their survey and synthesis; conceptual contribution.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... n/a (framework/organizational taxonomy)
Vibe Research denotes the human-steered region of prompt-based assistance and human-verified execution within AutoResearch.
Paper-introduced terminology and conceptual delineation of a sub-region of the AutoResearch spectrum; definitional statement.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... n/a (terminology/definition)
AutoResearch is defined as the developmental spectrum of AI-powered scientific workflow automation.
Paper provides an explicit definitional framing (terminology introduced by authors); conceptual contribution rather than empirical finding.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... n/a (terminology/definition)
This shift marks a transition from task-level AI for science to workflow-level research automation.
Conceptual argument backed by literature survey and examples of systems that coordinate multiple research tasks; no single quantitative study reported.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... degree of automation along research workflows (task-level vs workflow-level)
Scientific research is being reshaped by AI systems that move beyond isolated assistance toward longer-horizon workflows spanning literature grounding, hypothesis generation, experimentation, validation, reporting, and revision.
Survey / conceptual synthesis of recent AI research systems and literature; paper presents this as an observed trend rather than reporting original empirical measurements.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... extent of AI integration across research workflows (literature grounding, hypoth...
XWind shows consistent gains across workload types, load levels, and GPU generations.
Reported experimental results spanning multiple workload types, different load levels, and various GPU generations (details in main paper); abstract states consistency of gains.
high positive XWind: A Cross-site Router for Large Language Model Inferenc... consistency of latency/performance gains across workloads, loads, and GPU genera...
XWind reduces P99 end-to-end latency by up to 98% over baselines such as power-capping and GPU idling.
Experimental results on the 64-GPU A100 testbed with emulated wind sites and Azure traces; comparison against baseline strategies including power-capping and GPU idling.
XWind reduces P99 end-to-end latency by up to 52% over the strongest contender, a baseline also proposed by the authors.
Experimental results on the 64-GPU A100 testbed with emulated wind sites and Azure traces; comparison against a 'strongest contender' baseline (described as another idea from the authors).
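
The two latency claims above are relative reductions in a tail percentile. As a reading aid, the sketch below shows one common way to compute a P99 (nearest-rank) and the percentage reduction between a baseline and a candidate. The percentile method the paper uses is not stated in the listing, so nearest-rank is an assumption, and the numbers are illustrative only.

```python
def p99(latencies):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    if not latencies:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies)
    # ceil(0.99 * n) in integer arithmetic to avoid float rounding surprises
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def percent_reduction(baseline, candidate):
    """Relative reduction of candidate versus baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


if __name__ == "__main__":
    # Illustrative latencies in milliseconds, not measurements from the paper.
    baseline_ms = list(range(1, 101))               # 1..100
    candidate_ms = [x / 4 for x in range(1, 101)]   # four times faster
    print(p99(baseline_ms))                                           # 99
    print(percent_reduction(p99(baseline_ms), p99(candidate_ms)))     # 75.0
```

An "up to" figure reports the best case across the tested configurations, so it bounds the reduction from above and says nothing about the typical case.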
The authors build XWind, a lightweight, reactive, and workload-agnostic AI inference router that uses only real-time signals (inference latency, KV-cache utilization, and queue depth) to dynamically configure sites and distribute requests under variable wind power.
System implementation described in paper; design specification lists the three real-time signals used.
high positive XWind: A Cross-site Router for Large Language Model Inferenc... ability to configure sites and distribute inference requests using only specifie...
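
To make the idea of routing on three real-time signals concrete, here is a deliberately simple sketch. It is not XWind's algorithm: the listing only names the signals, so the equal-weight score, the reference values used for normalization, and every identifier (`SiteSignals`, `load_score`, `pick_site`) are hypothetical. Site configuration under variable wind power is not modeled at all.

```python
from dataclasses import dataclass


@dataclass
class SiteSignals:
    name: str
    latency_ms: float     # recent inference latency
    kv_cache_util: float  # KV-cache utilization as a fraction in [0, 1]
    queue_depth: int      # requests currently waiting


def load_score(site, latency_ref_ms=1000.0, queue_ref=32):
    """Hypothetical load score: equal-weight mean of three normalized signals.

    latency_ref_ms and queue_ref are assumed normalization constants.
    """
    return (site.latency_ms / latency_ref_ms
            + site.kv_cache_util
            + site.queue_depth / queue_ref) / 3


def pick_site(sites):
    """Route the next request to the site with the lowest load score."""
    return min(sites, key=load_score).name


if __name__ == "__main__":
    # Illustrative readings, not data from the paper.
    sites = [
        SiteSignals("site-a", latency_ms=200.0, kv_cache_util=0.9, queue_depth=10),
        SiteSignals("site-b", latency_ms=400.0, kv_cache_util=0.3, queue_depth=4),
    ]
    print(pick_site(sites))  # site-b
```

Here site-b wins despite higher latency because its cache and queue have more headroom, which shows why a router might combine signals rather than rank on latency alone.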