The Commonplace

Evidence (8570 claims)

Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
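
The matrix rows can be read programmatically by splitting the trailing integers from the outcome name. The following is a minimal sketch, assuming each row keeps the flattened form "name, four direction counts, total" used in the matrix. The names `MatrixRow`, `parse_row`, `positive_share` and `unlabeled` are illustrative and not part of the page. Rows with fewer than five numbers (for example `Worker Turnover 11 12 3 26`) do not say which direction column is empty, so the sketch skips them rather than guessing.

```python
import re
from typing import NamedTuple, Optional


class MatrixRow(NamedTuple):
    outcome: str
    positive: int
    negative: int
    mixed: int
    null: int
    total: int


def parse_row(line: str) -> Optional[MatrixRow]:
    """Split 'Outcome name p n m z total' into fields; None if cells are missing."""
    match = re.match(r"^(.*?)((?:\s+\d+)+)\s*$", line)
    if match is None:
        return None
    numbers = [int(tok) for tok in match.group(2).split()]
    if len(numbers) != 5:
        # A blank cell was dropped during flattening; its column is unknown.
        return None
    return MatrixRow(match.group(1).strip(), *numbers)


def positive_share(row: MatrixRow) -> float:
    """Fraction of the row's claims whose direction is positive."""
    return row.positive / row.total


def unlabeled(row: MatrixRow) -> int:
    """Claims counted in Total but in none of the four direction columns."""
    return row.total - (row.positive + row.negative + row.mixed + row.null)


rows = [
    "Other 758 199 100 900 2007",
    "Firm Revenue 153 48 26 3 230",
    "Job Displacement 12 80 20 1 113",
    "Worker Turnover 11 12 3 26",
]
for line in rows:
    row = parse_row(line)
    if row is None:
        print(f"skipped (missing cells): {line}")
        continue
    print(f"{row.outcome}: {positive_share(row):.1%} positive, "
          f"{unlabeled(row)} unlabeled")
```

For several rows the four direction counts sum to less than the Total (for `Other`, 1957 against 2007). The page does not explain the difference, so the sketch reports it as a remainder without interpreting it.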
Providers hide the model, the tokenizer, and the execution to protect their IP, mitigate jailbreaks, and preserve user privacy, which means an auditor can only inspect proofs the provider supplies.
Conceptual/architectural claim about current commercial provider practices and their implications for auditability (argumentation in paper).
high negative Token Inflation: How Dishonest Providers Can Overcharge for ... auditability (availability of independent evidence)
Limited data, resource constraints and skill gaps significantly influence the pace and form of AI adoption in SMEs.
Synthesis of barriers identified across multiple studies in the 2016-2024 literature (review-level claim without a single quantitative estimate).
high negative The Role of Artificial Intelligence in Strengthening Financi... pace and form of AI adoption
Addressing ethical concerns, especially algorithmic bias, and maintaining human oversight remain essential for ensuring positive financial outcomes.
Argument and synthesis from the reviewed literature highlighting ethical risks and recommended governance (conceptual and empirical discussions across studies).
high negative The Role of Artificial Intelligence in Strengthening Financi... ethical risks (algorithmic bias) and governance needs (human oversight)
SMEs face barriers to AI adoption such as limited data, skill shortages, and high implementation costs.
Review synthesis of barriers reported in multiple studies from 2016-2024 (no pooled quantitative prevalence reported).
high negative The Role of Artificial Intelligence in Strengthening Financi... barriers to AI adoption (data availability, skills, costs)
Existing generative AI models do not directly optimize marketplace performance.
Stated as an observed limitation / motivation for the proposed method in the paper (conceptual claim; not an empirical test reported in the excerpt).
high negative Utility-Aware Multimodal Contrastive Learning for Product Im... marketplace performance (sales / demand)
Existing approaches either require a trusted central coordinator (cloud marketplaces), demand heavy blockchain infrastructure (Golem, BrokerChain), or lack an incentive layer entirely (BOINC, Petals).
Comparative characterization based on named existing platforms; presented as conceptual/qualitative analysis without empirical evaluation or quantified benchmarks.
Vast quantities of compute (GPU cycles on personal workstations, idle inference servers, and edge devices between jobs) go unused because no incentive-aligned protocol exists for their owners to share them safely and profitably.
Asserted in the paper's problem statement; no empirical data, sample, or measurement reported — presented as observed motivation.
Both humans and AI contribute wrong answers.
Reported error contributions from both human participants and AI agents in the experimental task.
high negative AI, Take the Wheel: What Drives Delegation and Trust in Huma... contribution of incorrect answers by humans and by AI
Humans over-rely on AI when it misleads them; this occurs in 1.7% of opportunities.
Aggregate analysis of adoption decisions in the experiment (reported percentage of over-reliance on misleading AI suggestions).
high negative AI, Take the Wheel: What Drives Delegation and Trust in Huma... rate of over-reliance on incorrect AI suggestions
Humans under-rely on correct AI suggestions, missing 3.9% of opportunities.
Aggregate analysis of adoption decisions in the experiment (reported percentage of missed opportunities to rely on correct AI suggestions).
high negative AI, Take the Wheel: What Drives Delegation and Trust in Huma... rate of missed correct AI suggestions (under-reliance)
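
The over-reliance and under-reliance figures are rates over decision opportunities. The following is a minimal sketch of how such rates could be computed, assuming each decision records whether the AI suggestion was correct and whether the human adopted it, and assuming the denominator is all recorded decisions (the excerpt does not state the paper's exact denominator). The `Decision` type, its field names and the sample counts are hypothetical, chosen only to reproduce the reported percentages.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Decision:
    ai_correct: bool      # was the AI suggestion correct?
    human_adopted: bool   # did the human go with the AI suggestion?


def reliance_rates(decisions: Sequence[Decision]) -> Tuple[float, float]:
    """Return (over_reliance_rate, under_reliance_rate) over all decisions."""
    n = len(decisions)
    if n == 0:
        raise ValueError("need at least one decision")
    # Over-reliance: the human adopted a suggestion that was wrong.
    over = sum(1 for d in decisions if not d.ai_correct and d.human_adopted)
    # Under-reliance: the human rejected a suggestion that was right.
    under = sum(1 for d in decisions if d.ai_correct and not d.human_adopted)
    return over / n, under / n


# Illustrative data only: 1000 decisions built to match 1.7% and 3.9%.
sample = (
    [Decision(ai_correct=True, human_adopted=True)] * 900
    + [Decision(ai_correct=False, human_adopted=False)] * 44
    + [Decision(ai_correct=False, human_adopted=True)] * 17
    + [Decision(ai_correct=True, human_adopted=False)] * 39
)
over_rate, under_rate = reliance_rates(sample)
print(f"over-reliance: {over_rate:.1%}, under-reliance: {under_rate:.1%}")
```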
Organizations increasingly deploy separate purpose-built AI tools across professional domains, often hiring domain specialists for each, recreating the staffing models AI was expected to transform.
Stated as an observational/introductory claim in the paper (no empirical data or sample size reported to support the general trend).
high negative Augment Engineering: A Methodology for Multi-Tool AI Orchest... deployment of separate purpose-built AI tools and hiring of domain specialists (...
LLMs rely heavily on simulations when designing algorithms, and simulation-designed algorithms are notorious for breaking when transferred to real hardware.
Paper's claim grounded in known transferability issues between simulation and hardware; no experimental quantification provided in the abstract.
high negative GENESIS: Harnessing AI Agents for Autonomous 6G RAN Synthesi... algorithm performance when moving from simulation to real hardware (failure/brea...
LLM pitfalls worsen on Radio Access Network (RAN) use cases: they hallucinate Application Programming Interfaces (APIs) and misread specifications, which kills interoperability of RAN components at the first mistake.
Author assertion / observed behavior reported in the paper (qualitative examples implied); no formal experiment or sample size provided in the abstract.
high negative GENESIS: Harnessing AI Agents for Autonomous 6G RAN Synthesi... interoperability / correctness of produced interfaces and implementations
Cellular research and development (R&D) is throttled by six structural processes that each consume months of manual engineering work per iteration: (i) synthesizing new features from standards or research papers into production code; (ii) conformance and interoperability testing; (iii) hardening against field anomalies and diverse deployment environments; (iv) data-driven optimization of network functionalities; (v) discovering and prototyping novel waveforms, functionalities, and capabilities for future standards; and (vi) securing the stack against vulnerabilities.
Author assertion in the paper (qualitative analysis / domain expertise). No empirical sample size or quantitative study reported in the abstract.
high negative GENESIS: Harnessing AI Agents for Autonomous 6G RAN Synthesi... time per R&D iteration (manual engineering work duration)
Existing approaches remain fragmented across formal verification, runtime assurance, neuro-symbolic reasoning and trustworthy Artificial Intelligence (AI) research communities.
Author claim about the state of the research landscape; asserted fragmentation without bibliometric or survey data provided in excerpt.
high negative ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... degree of integration/coordination across research communities
Current reasoning systems still suffer from hidden logical inconsistencies, hallucinated symbolic transitions, unsupported theorem applications, and limited reliability guarantees.
Author assertion identifying failure modes of current reasoning systems; presented qualitatively without quantitative error rates or experimental sample sizes in the excerpt.
high negative ReasonOps: A Unified Operational Paradigm for Trustworthy Ve... reliability / correctness of reasoning systems
Stochastic Tax can remain positive even when Agentic Technical Debt is minimized.
Theoretical claim in the paper's model and discussion: even with minimized debt (stock), the model predicts a nonzero recurring operating burden from stochastic agents; illustrated via examples and an accounts-payable simulation.
high negative Modeling Agentic Technical Debt and Stochastic Tax: A Standa... persistence of Stochastic Tax (recurring operating burden) under minimized Agent...
Stochastic Tax is a recurring flow of operating burden that arises when stochastic agents are used in business workflows.
Definition provided in the paper as part of the conceptual framework describing Stochastic Tax as a flow (recurring operating burden) associated with stochastic agents in workflows.
high negative Modeling Agentic Technical Debt and Stochastic Tax: A Standa... operating burden (recurring flow) arising from use of stochastic agents in busin...
There is an urgent question of how humans can effectively supervise and control an economy operated by AI agents when this system may expand beyond the capacity of traditional governance.
Framed as a central research/policy concern in the paper's abstract; conceptual argument rather than empirical finding.
high negative Regulatory Policy for the Agent Economy in the Digital Age: ... capacity of traditional governance to supervise/control AI-operated economy
The Agent Economy raises new regulatory challenges concerning data privacy, security, ethics, and the risk of job displacement.
Stated in paper abstract as identified risks; based on literature synthesis and comparative policy analysis approach (method described), but no empirical incidence metrics reported.
high negative Regulatory Policy for the Agent Economy in the Digital Age: ... regulatory challenges related to privacy, security, ethics, and job displacement...
Organizations implementing AI without responsible transition mechanisms may worsen workforce anxiety, skill obsolescence, inequality, and trust erosion.
Paper's theoretical/conceptual assertion about risks of poorly-managed AI adoption; no empirical validation reported in the excerpt.
high negative From Automation Panic to Workforce Resilience: A Governance ... workforce anxiety, skill obsolescence, inequality, trust
The International Monetary Fund estimates that nearly 40% of global employment is susceptible to AI, with exposure rising to 60% in advanced economies owing to cognitive task-oriented jobs.
Cited IMF estimate reported in the paper (reference to an IMF analysis; no sample size given in the excerpt).
high negative From Automation Panic to Workforce Resilience: A Governance ... share of employment susceptible/exposed to AI
Tenure negatively relates to AI use (OR = 0.846 per category).
Reported odds ratio from logistic regression for tenure categories predicting AI use; OR = 0.846 per tenure category.
high negative Determinants of Artificial Intelligence Adoption in Public S... active AI adoption (binary)
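
An odds ratio per category compounds multiplicatively: moving up k tenure categories multiplies the odds of AI use by 0.846 to the power k. The following is a minimal sketch of that arithmetic, assuming tenure enters the logistic regression as a single ordinal predictor (as "per category" suggests). The baseline probability of 0.50 is hypothetical and not taken from the paper.

```python
def shifted_probability(baseline_prob: float, odds_ratio: float, steps: int) -> float:
    """Probability after multiplying the baseline odds by odds_ratio ** steps."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    odds = baseline_odds * odds_ratio ** steps
    return odds / (1.0 + odds)


OR_TENURE = 0.846        # reported odds ratio per tenure category
BASELINE = 0.50          # hypothetical probability of AI use in the lowest category

for steps in range(4):
    p = shifted_probability(BASELINE, OR_TENURE, steps)
    print(f"{steps} categories up: odds x {OR_TENURE ** steps:.3f}, P(AI use) = {p:.3f}")
```

Under this hypothetical baseline, three categories up the odds are about 0.605 times the baseline odds and the probability of AI use falls from 0.50 to roughly 0.377.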
The requirement that review + expected rework attention be lower than manual completion attention is substantially more stringent than the requirement that AI merely generate faster drafts.
Comparative analytical argument based on the model's derived stability conditions (theoretical/model-based reasoning; no empirical sample reported).
high negative Queue & AI: When Faster Tasks Slow Down the Workflow developer productivity
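
The two requirements in this claim can be compared directly. The following is a minimal sketch, assuming "expected rework attention" is the probability of rework times the attention rework takes. That reading, the function names and the numbers are illustrative assumptions, not the paper's notation.

```python
def ai_drafts_faster(draft: float, manual: float) -> bool:
    """Weaker requirement: the AI produces a draft faster than manual completion."""
    return draft < manual


def ai_saves_attention(review: float, rework_prob: float, rework: float, manual: float) -> bool:
    """Stricter requirement: review + expected rework attention < manual attention."""
    if not 0.0 <= rework_prob <= 1.0:
        raise ValueError("rework_prob must be in [0, 1]")
    return review + rework_prob * rework < manual


# Hypothetical attention costs in minutes per task.
manual, draft, review, rework = 30.0, 2.0, 15.0, 45.0

for rework_prob in (0.2, 0.4):
    print(
        f"rework_prob={rework_prob}: "
        f"drafts faster={ai_drafts_faster(draft, manual)}, "
        f"saves attention={ai_saves_attention(review, rework_prob, rework, manual)}"
    )
```

With these numbers the draft is always faster (2 minutes against 30), but at a rework probability of 0.4 the human spends 15 + 18 = 33 minutes of attention, more than doing the task manually.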
Under congestion, reviewers rationally raise the risk threshold for checking AI outputs, reducing scrutiny precisely when it would matter the most.
Analytical implication derived from the queueing model presented in the paper (theoretical/model-based inference; no empirical validation reported).
Mean-based metrics (e.g., tasks completed per worker-hour or mean handle time) can misrepresent AI's effects in workflows where tasks accumulate and compete for scarce human attention.
Argument and analysis presented in the paper; theoretical reasoning and illustrative queueing model (no empirical sample reported).
high negative Queue & AI: When Faster Tasks Slow Down the Workflow task completion time
Regardless of apparent performance advances in AI technology, human and environmental factors of the organization may substantially attenuate — or even negate — the effective productivity benefits.
Conceptual argument in the paper; theoretical reasoning and literature synthesis (no primary empirical data reported in the abstract).
high negative Position: Adopting AI in Practice Does Not Guarantee the Pro... realized productivity benefits from AI deployment
Adopting AI in organizational practice does not guarantee productivity gains, because human and environmental factors critically moderate the relationship between AI deployment and realized productivity improvements.
Position paper's conceptual argument presented in the abstract; no empirical sample or quantitative study reported.
high negative Position: Adopting AI in Practice Does Not Guarantee the Pro... productivity gains (realized productivity improvements)
AI evaluation methods (benchmarks, red teaming, leaderboards) cannot be easily applied to human workers or yield comparable metrics.
Conceptual critique in the paper contrasting standard AI evaluation methods with human evaluation (no empirical comparisons provided).
high negative Reverse Turing Tests for Human-Machine Task Suitability Asse... applicability and comparability of AI evaluation methods when applied to humans
Common criteria used to assess people (e.g., education, experience, references) cannot feasibly scale to AI systems.
Argumentative claim in the paper contrasting human hiring/evaluation practices with AI system assessment (conceptual; no empirical validation provided).
high negative Reverse Turing Tests for Human-Machine Task Suitability Asse... scalability of human assessment criteria to AI systems
Human and machine workers may 'compete' for a given task, reproducing aspects of adversarial games.
Theoretical/assertional claim in the paper (conceptual discussion; no empirical data provided).
high negative Reverse Turing Tests for Human-Machine Task Suitability Asse... competitive interaction between human and AI workers for tasks
The increased use of algorithms in allocation decisions creates a Reverse Turing Test dynamic wherein the machine is now the judge.
Conceptual framing and argument presented in the paper (theoretical description; no empirical test reported).
high negative Reverse Turing Tests for Human-Machine Task Suitability Asse... judgment role of algorithms in human-machine task assignment
AI-driven efficiency pressures in IT services may compress billable work and alter hiring and wage structures, raising transition risks even for technical workers.
Abstract cites high-reliability sector evidence (Reuters 2026a; Nasscom) to support this sector-specific claim; no sample size provided in abstract.
high negative ARTIFICIAL INTELLIGENCE, INEQUALITIES OF KNOWLEDGE AND RESOU... compression of billable work, changes to hiring and wage structures, transition ...
Labor-market segmentation and digital capability gaps in India create distributional vulnerabilities.
Abstract cites Indian official statistics and household/labor surveys (PLFS, HCES, MoSPI–NSO) and integrates sector evidence; no specific sample size reported in abstract.
high negative ARTIFICIAL INTELLIGENCE, INEQUALITIES OF KNOWLEDGE AND RESOU... distributional vulnerabilities arising from labor-market segmentation and digita...
Refined exposure measures imply widespread task transformation rather than uniform job destruction, with accelerated skill change as a central risk for vulnerable workers.
Abstract cites labor-market analyses and ILO (2025) as the basis for refined exposure measures and conclusions; no sample size stated in abstract.
high negative ARTIFICIAL INTELLIGENCE, INEQUALITIES OF KNOWLEDGE AND RESOU... task transformation versus job destruction and skill change risk for vulnerable ...
Global frameworks warn that uneven readiness may produce a 'Next Great Divergence' between countries.
Cited global reports in abstract (UNDP 2025, WTO 2025, OECD 2026) which are summarized as issuing this warning; no primary data sample size reported in paper abstract.
high negative ARTIFICIAL INTELLIGENCE, INEQUALITIES OF KNOWLEDGE AND RESOU... uneven readiness leading to increased divergence between countries
Persistent adoption gaps among groups suggest unequal access to AI-enabled productivity.
Abstract references global reports (OECD, WEF, UNDP, WTO) and sector evidence indicating adoption gaps; no numerical sample size given.
high negative ARTIFICIAL INTELLIGENCE, INEQUALITIES OF KNOWLEDGE AND RESOU... adoption gaps and unequal access to AI-enabled productivity
AI may widen capability inequality—inequalities in access to knowledge, digital infrastructure, computational resources, and organizational adoption—thereby shaping income opportunities and socio-economic security for low-income groups.
Argument presented using the paper's socio-technical political economy framework and validated secondary sources (OECD, ILO, UNDP, WTO, WEF) and official Indian statistics; no direct empirical sample from this paper reported.
high negative ARTIFICIAL INTELLIGENCE, INEQUALITIES OF KNOWLEDGE AND RESOU... capability inequality and downstream income/socio-economic security for low-inco...
Design choices that prioritize scalable growth introduce trade-offs in reusability, evolution, and auditability in A2A collaboration networks.
Synthesis of empirical findings (low reuse, manipulable rankings, unverified validations) connecting design incentives to negative side-effects.
high negative Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent... trade-offs among scalability, reusability, evolution, auditability
EvoMap relies on agents to provide local execution logs as evidence that uploaded assets function correctly; because these validations are not independently verified, over 84% of approved assets bypass quality checks using vacuous tests (e.g., console.log).
Empirical audit of validation logs and acceptance tests reported in the paper showing >84% of approved assets used trivial/vacuous checks.
high negative Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent... verification/validation quality of assets
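
A validation that only prints output cannot fail, which is what makes it vacuous. The following is a deliberately narrow heuristic for flagging such tests, assuming the test source is JavaScript-like text. It is not the audit procedure used in the paper, and `is_vacuous` is a hypothetical name.

```python
import re

# Matches a line that is nothing but a single console.log(...) call.
CONSOLE_LOG_ONLY = re.compile(r"^\s*console\.log\(.*\)\s*;?\s*$")


def is_vacuous(test_source: str) -> bool:
    """True when the test has no code lines, or every code line is a console.log call."""
    code_lines = [
        line for line in test_source.splitlines()
        if line.strip() and not line.strip().startswith("//")
    ]
    if not code_lines:
        return True
    return all(CONSOLE_LOG_ONLY.match(line) for line in code_lines)


examples = {
    "log only": 'console.log("ok");',
    "real check": 'const r = run();\nassert.strictEqual(r, 42);',
    "comment plus log": '// validated locally\nconsole.log("done")',
}
for label, source in examples.items():
    print(f"{label}: vacuous={is_vacuous(source)}")
```

The heuristic will miss other kinds of trivial checks (for example an assertion that is always true), so it gives a lower bound on vacuous tests rather than a full audit.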
Agents can trivially manipulate their assets' scores by falsifying self-reported metadata.
Demonstrations/analyses in the paper that changing metadata values leads to predictable changes in GDI scores; examples like claimed lines-of-code manipulation are provided.
high negative Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent... vulnerability to manipulation of ranking
An asset's GDI rank is heavily dictated by unverified, self-reported metadata (e.g., claimed lines of code modified).
Correlation/causal analysis in the paper showing strong dependence of GDI scores on self-reported metadata fields rather than objective performance measures.
high negative Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent... drivers of ranking (metadata vs objective performance)
EvoMap employs an algorithm (GDI) to score and rank shared assets, and this scoring system is flawed.
Paper description of the GDI ranking algorithm and empirical analyses illustrating problems with how it operates.
high negative Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent... quality of ranking/scoring
Rewards become highly concentrated among a small fraction of agents.
Distributional analysis of credits/rewards across agents (inequality/concentration observed in reward allocation).
high negative Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent... reward concentration / inequality
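
Concentration of this kind is commonly summarized with a top-share figure or a Gini coefficient. The following is a minimal sketch of both. The excerpt does not say which measure the paper uses, and the credit values below are made up for illustration.

```python
import math
from typing import Sequence


def top_share(rewards: Sequence[float], fraction: float) -> float:
    """Share of total rewards held by the top `fraction` of agents."""
    if not rewards:
        raise ValueError("rewards must not be empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(rewards, reverse=True)
    k = max(1, math.ceil(fraction * len(ordered)))
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0


def gini(rewards: Sequence[float]) -> float:
    """Gini coefficient: 0 means equal rewards, values near 1 mean concentration."""
    ordered = sorted(rewards)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum((i + 1) * x for i, x in enumerate(ordered))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical credits for ten agents.
credits = [500, 300, 50, 40, 30, 25, 20, 15, 10, 10]
print(f"top 20% share: {top_share(credits, 0.2):.0%}")
print(f"Gini: {gini(credits):.2f}")
```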
98% of assets are never reused.
Empirical reuse metric computed across the asset corpus reported in the paper.
Because rewards favor publication over adoption, agents mass-produce assets to accumulate credits.
Observed publishing behavior (large numbers of assets per agent) and the platform's incentive structure; paper links publication-focused rewards to high per-agent asset counts.
high negative Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent... publishing behavior / task allocation
Rewards are tied primarily to publication rather than adoption.
Analysis of reward allocation rules and empirical patterns showing reward issuance linked to publication events more than measured reuse/adoption.
high negative Behind EvoMap: Characterizing a Self-Evolving Agent-to-Agent... reward allocation (publication vs. adoption)
These findings challenge the prevailing theory of skill-biased technological change.
Empirical observation that high-skill, high-exposure neighborhoods experienced wage stagnation post-2023 despite continued inflows of high-skilled workers, interpreted in contrast to predictions of skill-biased technological change.
high negative Generative AI impacts on intra-urban inequality and skill pr... validity of skill-biased technological change predictions (skill premium dynamic...
Since 2023, high-exposure neighborhoods have experienced wage stagnation even as they continue to attract high-skilled workers (a 'high-skill trap').
Temporal analysis of job-posting wage signals in Beijing neighborhoods (2018-2024) using the GenAI Exposure Index to compare wage trajectories before and after 2023 between high- and low-exposure neighborhoods.
high negative Generative AI impacts on intra-urban inequality and skill pr... wage levels / wage growth (stagnation)
GenAI exposure is highly concentrated in the city's core districts, deepening the intra-urban AI divide.
Spatial analysis of a neighborhood-level GenAI Exposure Index constructed from 5 million Beijing job postings (2018-2024), where task-level assessments were aggregated across five leading large language models to measure exposure by neighborhood.
high negative Generative AI impacts on intra-urban inequality and skill pr... GenAI exposure concentration across neighborhoods / intra-urban AI divide
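
The index construction described here is a two-stage aggregation: across models, then across postings within a neighborhood. The following is a minimal sketch assuming an unweighted mean at both stages and one score per model per posting. The excerpt does not give the paper's actual weighting or scale, and the function name and sample scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Sequence, Tuple


def neighborhood_exposure(
    postings: Iterable[Tuple[str, Sequence[float]]],
) -> Dict[str, float]:
    """Average model scores per posting, then average postings per neighborhood."""
    by_area: Dict[str, List[float]] = defaultdict(list)
    for neighborhood, model_scores in postings:
        if not model_scores:
            continue  # no model rated this posting; skip it
        by_area[neighborhood].append(mean(model_scores))
    return {area: mean(scores) for area, scores in by_area.items()}


# Hypothetical exposure scores in [0, 1] from five models per posting.
postings = [
    ("core", [0.8, 0.9, 0.7, 0.8, 0.8]),
    ("core", [0.6, 0.6, 0.6, 0.6, 0.6]),
    ("suburb", [0.2, 0.3, 0.1, 0.2, 0.2]),
]
for area, score in neighborhood_exposure(postings).items():
    print(f"{area}: {score:.2f}")
```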