The Commonplace

Evidence (2340 claims)

Adoption (5267 claims)
Productivity (4560 claims)
Governance (4137 claims)
Human-AI Collaboration (3103 claims)
Labor Markets (2506 claims)
Innovation (2354 claims)
Org Design (2340 claims)
Skills & Training (1945 claims)
Inequality (1322 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 378 106 59 455 1007
Governance & Regulation 379 176 116 58 739
Research Productivity 240 96 34 294 668
Organizational Efficiency 370 82 63 35 553
Technology Adoption Rate 296 118 66 29 513
Firm Productivity 277 34 68 10 394
AI Safety & Ethics 117 177 44 24 364
Output Quality 244 61 23 26 354
Market Structure 107 123 85 14 334
Decision Quality 168 74 37 19 301
Fiscal & Macroeconomic 75 52 32 21 187
Employment Level 70 32 74 8 186
Skill Acquisition 89 32 39 9 169
Firm Revenue 96 34 22 152
Innovation Output 106 12 21 11 151
Consumer Welfare 70 30 37 7 144
Regulatory Compliance 52 61 13 3 129
Inequality Measures 24 68 31 4 127
Task Allocation 75 11 29 6 121
Training Effectiveness 55 12 12 16 96
Error Rate 42 48 6 96
Worker Satisfaction 45 32 11 6 94
Task Completion Time 78 5 4 2 89
Wages & Compensation 46 13 19 5 83
Team Performance 44 9 15 7 76
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 18 17 9 5 50
Job Displacement 5 31 12 48
Social Protection 21 10 6 2 39
Developer Productivity 29 3 3 1 36
Worker Turnover 10 12 3 25
Skill Obsolescence 3 19 2 24
Creative Output 15 5 3 1 24
Labor Share of Income 10 4 9 23
Filter: Org Design
Returns to AI are heterogeneous across firms; estimating treatment effects requires attention to selection, complementarities, and dynamic adoption pipelines.
Methodological argument referencing treatment-effect literature and observed firm heterogeneity; supported by conceptual examples rather than a single empirical treatment-effect estimate.
high neutral Modern Management in the Age of Artificial Intelligence: Str... heterogeneity in returns to AI adoption (firm-level productivity or performance ...
A subset of four datasets included settings in which the AI provided explanations of its decision.
Paper states that four of the datasets involved AI explanations (explicitly stated in abstract).
high null result Beyond AI advice -- independent aggregation boosts human-AI ... presence_of_AI_explanation
The study compared HCT to the AI-as-advisor approach using 10 datasets from various domains, including medical diagnostics and misinformation discernment.
Paper reports an empirical comparison across 10 datasets spanning multiple domains (explicitly stated in abstract).
The hybrid confirmation tree (HCT) elicits a human judgment and an AI judgment independently; if they agree, that decision is accepted, and if they disagree, a second human breaks the tie.
Description of the HCT method in the paper (procedural/design specification).
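The HCT procedure described in this claim can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function name `hct_decide`, the `second_human` callable, and the example labels are hypothetical, and the sketch assumes the second human's judgment is taken as final whenever the first human and the AI disagree.

```python
from typing import Callable, TypeVar

Label = TypeVar("Label")


def hct_decide(human: Label, ai: Label, second_human: Callable[[], Label]) -> Label:
    """Return the decision for one case under a hybrid confirmation tree.

    `human` and `ai` are judgments elicited independently of each other.
    `second_human` is only called when they disagree (assumed tie-break rule).
    """
    # Agreement between the two independent judgments is accepted as-is.
    if human == ai:
        return human
    # Disagreement: a second human is consulted and their judgment decides.
    return second_human()


if __name__ == "__main__":
    # Hypothetical labels for a misinformation-discernment case.
    print(hct_decide("false", "false", lambda: "true"))  # agreement path
    print(hct_decide("false", "true", lambda: "true"))   # tie-break path
```

Passing the second human as a callable reflects the point that the extra human effort is only spent on cases where the first two judgments disagree.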
We implement a rigorously controlled execution-based testbed featuring Git worktree isolation and explicit global memory to evaluate agent coordination frameworks.
Methodological description in the paper indicating the testbed design choices (Git worktree isolation, explicit global memory) used to ensure controlled, reproducible execution of agent-generated code.
high null result An Empirical Study of Multi-Agent Collaboration for Automate... experimental reproducibility and isolation (testbed design)
We benchmark a single-agent baseline against two multi-agent paradigms: a subagent architecture (parallel exploration with post-hoc consolidation) and an agent team architecture (experts with pre-execution handoffs) using a rigorously controlled, execution-based testbed.
Description of experimental setup in the paper: an execution-based testbed with Git worktree isolation and explicit global memory; experiments explicitly compare single-agent, subagent, and agent-team architectures under fixed computational time budgets.
high null result An Empirical Study of Multi-Agent Collaboration for Automate... comparative performance of agent architectures (benchmarking setup)
Methods combine targeted literature synthesis, comparative conceptual analysis, and framework building (with recent scholarly and institutional sources reviewed).
Explicit methodological statement in the paper describing the review and analytic approach; no primary-data methods used.
high null result Behavioral Factors as Determinants of Successful Scaling of ... methodological approach (literature synthesis and conceptual framework developme...
AI coding assistants are a high-visibility class of corporate AI and are given special attention as an illustrative case in the paper.
Paper specifically calls out AI coding assistants as a focal example in the conceptual analysis and discussion; based on literature review rather than original measurement.
high null result Behavioral Factors as Determinants of Successful Scaling of ... role of coding assistants as illustrative case for scaling and behavioral dynami...
The Article translates these insights into risk-sensitive guideposts for modernizing governance of AI-enabled tools and emerging modalities, from agentic systems to blockchain-deployed smart contracts.
Prescriptive/conceptual policy guidance presented in the Article (normative recommendations; governance framework).
high null result Rewired: Reconceptualizing Legal Services for the AI Age provision of governance guideposts for AI-enabled legal technologies
The Innovation Frontier traces LegalTech’s evolution from 2000s-vintage e-discovery to generative AI.
Historical/chronological analysis in the Article (literature review/history of LegalTech provided by authors).
high null result Rewired: Reconceptualizing Legal Services for the AI Age narrative/historical scope of LegalTech evolution covered in the Article
The Legal Services Value Chain disaggregates the lifecycle of a legal matter into five distinct nodes of activity.
Model description in the Article (conceptual architecture; decomposition of legal work).
high null result Rewired: Reconceptualizing Legal Services for the AI Age number and structure of nodes in the proposed value-chain model
The Article develops two core organizing models: the Legal Services Value Chain and the Innovation Frontier.
Explicit claim in the Article describing conceptual/model contributions (theoretical/model-building).
high null result Rewired: Reconceptualizing Legal Services for the AI Age presence of two organizing conceptual models in the Article
This Article provides a practical framework for navigating the shifting terrain of legal innovation and AI.
Statement of purpose in the Article (conceptual contribution; framework development). No empirical validation reported in the excerpt.
high null result Rewired: Reconceptualizing Legal Services for the AI Age existence of a practical framework for legal-AI governance and strategy
Three interlocking threads characterize AI for science: (1) AI as research instrument, (2) AI for research infrastructure, and (3) the reshaping of scholarly profiles and incentives by machine-readable metrics.
Conceptual framework presented in the paper; organization of topics rather than empirical measurement. The paper indicates these threads are followed through historical and contemporary examples.
high null result A Brief History of AI for Scientific Discovery: Open Researc... conceptual decomposition of AI-for-science developments
The history of artificial intelligence for scientific discovery is not a two-year story about chatbots learning to write papers; it is a sixty-year story beginning with DENDRAL (1965).
Historical narrative / literature review citing early systems such as DENDRAL (1965) and subsequent developments in scholarly infrastructure (arXiv, Google Scholar, ORCID). No empirical sample or statistical test reported.
high null result A Brief History of AI for Scientific Discovery: Open Researc... historical scope and timeline of AI for scientific discovery
Both the positive (approach) and negative (avoidance) AI job crafting pathways failed to significantly affect life satisfaction, indicating domain specificity of AI-related psychological mechanisms.
Analysis of the same multi-source, multi-wave dataset of 287 employee–leader dyads; tests of effects on life satisfaction showed non-significant results for both pathways.
For readers less familiar with the Bayesian and decision-theoretic language, key terms are defined in a glossary at the end of the article.
Statement about the article's structure and supporting material (presence of glossary noted in the article).
high null result Retraining as Approximate Bayesian Inference availability of glossary/terminology definitions
The gap between a continuously updated belief state and your frozen deployed model is 'learning debt.'
Terminology/definition introduced by the author in the article (glossary and definitional exposition).
high null result Retraining as Approximate Bayesian Inference definition/labeling of model staleness
Model retraining is usually treated as an ongoing maintenance task.
Author's descriptive claim in the article; presented as an observation about prevailing practice (no empirical sample or data reported).
high null result Retraining as Approximate Bayesian Inference how retraining is operationalized (treated as maintenance)
The study was conducted by the Mohammed bin Rashid School of Government’s Future of Government Center, in collaboration with global AI pioneers.
Authorship and collaboration statement in the report.
high null result Charting AI Governance Future in the Arab Region: A Policy R... institutional authorship and collaboration on the study
The report highlights the key findings of a field study covering ten Arab countries to explore the realities and challenges of AI governance.
Report statement describing the geographic scope of the field study (explicitly: ten Arab countries).
high null result Charting AI Governance Future in the Arab Region: A Policy R... geographic coverage of the field study (number of countries)
The recommendations are based on regional research that included hundreds of leaders active in the AI domains, from the public and private sectors.
Report statement claiming participant base of the underlying research (described as 'hundreds of leaders').
high null result Charting AI Governance Future in the Arab Region: A Policy R... scope and participant coverage of the underlying research
Data sources include field research conducted in 2024 and public reports from the Ministry of Industry and Information Technology and the National Bureau of Statistics.
Paper statement describing data provenance: field surveys in 2024 (n=326) plus public reports from MIIT and National Bureau of Statistics.
high null result Research on the Adoption of Artificial Intelligence and Proc... data provenance / sources
The visualization avoided redistributing value.
Reported result from the within-subjects experiment (N=32) stating that the visualization did not redistribute value between parties (i.e., it improved outcomes/efficiency without changing value split).
high null result From Overload to Convergence: Supporting Multi-Issue Human-A... distribution of value between negotiating parties (value split / surplus allocat...
Human-like presentations did not raise conformity pressure.
Reported experimental result: manipulation of presentation style (human-like vs not) and measurement of conformity pressure; the abstract states that human-like presentation increased perceived usefulness/agency without increasing conformity pressure. No quantitative details provided in abstract.
Larger panels yielded no gains in accuracy relative to a single AI.
Reported experimental comparison manipulating panel size in the study (three tasks). The abstract states that larger panels did not produce accuracy gains versus a single AI. (No sample size or numerical effect reported in abstract.)
The paper proposes an original 'Revenue-Sharing as Infrastructure' (RSI) model in which the platform offers its AI infrastructure for free and takes a percentage of the revenues generated by developers' applications, reversing the traditional upstream payment logic.
Theoretical model proposal and conceptual description in the paper; presented as original contribution (no empirical implementation reported).
high null result Revenue-Sharing as Infrastructure: A Distributed Business Mo... business model design (revenue-sharing vs pay-upfront)
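The payment logic that the RSI model reverses can be illustrated with a small arithmetic sketch. Everything here is an assumption for illustration: the function names, the per-call price, and the share rate are hypothetical and do not come from the paper, which reports no empirical implementation.

```python
def pay_per_use_cost(calls: int, price_per_call: float) -> float:
    """Upstream logic: the developer pays for usage, whatever the app earns."""
    return calls * price_per_call


def revenue_share_fee(app_revenue: float, share_rate: float) -> float:
    """RSI-style logic: infrastructure is free; the platform takes a revenue share."""
    if not 0.0 <= share_rate <= 1.0:
        raise ValueError("share_rate must be between 0 and 1")
    return app_revenue * share_rate


def break_even_revenue(calls: int, price_per_call: float, share_rate: float) -> float:
    """App revenue at which both schemes cost the developer the same amount."""
    if share_rate <= 0.0:
        raise ValueError("share_rate must be positive to compute a break-even point")
    return pay_per_use_cost(calls, price_per_call) / share_rate


if __name__ == "__main__":
    # Hypothetical figures: 1,000 calls at 0.50 per call versus a 25% revenue share.
    print(pay_per_use_cost(1000, 0.5))
    print(revenue_share_fee(0.0, 0.25))
    print(break_even_revenue(1000, 0.5, 0.25))
```

Under these assumptions, a developer whose application earns nothing pays nothing under revenue sharing, whereas pay-per-use charges for consumption regardless of revenue; above the break-even revenue, the revenue share becomes the more expensive option for the developer.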
Recent literature distinguishes three generations of business models: a first generation modeled on cloud computing (pay-per-use), a second characterized by diversification (freemium, subscriptions), and a third, emerging generation exploring multi-layer market architectures with revenue-sharing mechanisms.
Literature review and conceptual synthesis presented in the paper; no empirical study or sample reported.
high null result Revenue-Sharing as Infrastructure: A Distributed Business Mo... classification of business model generations
We evaluate our approach on spapi, a production in-vehicle API system at Volvo Group involving 192 endpoints, 420 properties, and 776 CAN signals across six functional domains.
Case study / evaluation dataset description (explicit counts provided in paper).
high null result LLM-Powered Workflow Optimization for Multidisciplinary Soft... evaluation dataset scale and scope (endpoints, properties, CAN signals, domains)
We document a systematic pattern we call the 'Intent-Source Divide' (experiential vs transactional intent is associated with different source mixes).
Labeling of the observed consistent association between query intent (experiential vs transactional) and citation-source mix in the audited dataset of Google Gemini responses.
high null result The End of Rented Discovery: How AI Search Redistributes Pow... association between query intent and source mix
We audit 1,357 grounding citations from Google Gemini across 156 hotel queries in Tokyo.
Manual audit of Google Gemini grounding citations for 156 hotel queries in Tokyo; counted 1,357 grounding citations.
high null result The End of Rented Discovery: How AI Search Redistributes Pow... number of grounding citations audited
This study uses a mixed-method research design combining quantitative ROI modelling and cost–benefit analysis, qualitative synthesis of secondary enterprise case studies, and architectural analysis of Azure-native GenAI services.
Explicit methodological description in the abstract of the paper.
high null result Measuring Business ROI of Generative AI Adoption on Azure Cl... research design / methods
This Article presents the results of an experiment in which a transcript of a hypothetical client interview involving potential disability discrimination, retaliation, and wrongful termination claims was submitted to each AI system, with prompts requesting identification and assessment of viable legal theories.
Methodological description of the experiment: one hypothetical client interview transcript fed to each of four AI engines with prompts to identify and assess legal theories.
high null result Robot Wingman: Using AI to Assess an Employment Termination experimental procedure (input and prompts)
The paper derives formal conditions under which the inversion (smaller, orchestrated models outperforming frontier models) holds.
Mathematical derivations and stated sufficient/necessary conditions presented in the paper.
high null result Punctuated Equilibria in Artificial Intelligence: The Instit... parameter conditions for comparative performance inversion
We develop the Institutional Fitness Manifold, a mathematical framework that evaluates AI systems along four dimensions: capability, institutional trust, affordability, and sovereign compliance.
Theoretical/model development presented in the paper (formal definition of the manifold and its four dimensions).
high null result Punctuated Equilibria in Artificial Intelligence: The Instit... institutional fitness evaluated across four dimensions
There have been five eras of AI development since 1943, and within the current Generative AI Era there are four distinct epochs, each initiated by a discontinuous event.
Descriptive/historical classification within the paper (counts of eras and epochs; named initiating events such as the transformer and the 'DeepSeek Moment').
high null result Punctuated Equilibria in Artificial Intelligence: The Instit... count and classification of historical AI eras/epochs
Despite fears of mass unemployment, aggregate labor-market data through 2025 show limited labor-market disruption from generative AI.
Review of aggregate employment and labor-market studies and macro-level data through 2025 cited in the brief; methods include analyses of employment statistics and macro labor indicators (no single sample size reported).
high null result AI, Productivity, and Labor Markets: A Review of the Empiric... aggregate employment / labor-market disruption
We scored rule-breaking and abuse outcomes with an independent rubric-based judge across 28,112 transcript segments from multi-agent governance simulations.
Reported methodology: multi-agent governance simulations with agents in formal governmental roles, outcomes evaluated by an independent rubric-based judge; explicit sample count of 28,112 transcript segments.
high null result I Can't Believe It's Corrupt: Evaluating Corruption in Multi... rule-breaking and abuse outcomes (as assessed by rubric-based judge)
Controlled experiments were run with N = 250 across five content types to validate the mechanisms.
Experimental methods reported in the paper: controlled experiments with specified sample size and content-type breakdown.
high null result Governed Memory: A Production Architecture for Multi-Agent W... experimental sample size and content-type breadth (N=250, 5 content types)
Research agenda: empirical microdata on managerial time use, task-level automation, performance outcomes, and wage impacts are needed to quantify substitution versus complementarity and to evaluate human-in-the-loop designs' effects on firm performance and distributional outcomes.
Explicit methodological recommendation within the paper; identifies gaps due to the paper's conceptual (non-empirical) approach.
high null result Comparative analysis of strategic vs. computational thinking... availability and use of microdata on managerial tasks, automation, firm performa...
There is a need for longitudinal and cross‑country empirical research to measure how hybrid work and AI tools affect promotion rates, network centrality, productivity, privacy harms, trust, and long‑term career trajectories.
Statement of research gaps derived from the paper's methodological approach (conceptual synthesis and secondary case studies) and absence of longitudinal/cross‑cultural primary data.
high null result The Sociology of Remote Work and Organisational Culture: How... research gap existence (need for longitudinal and cross‑country empirical studie...
Practical recommendations for firms and policymakers include investing in training for AI curation/evaluation/coordination, experimenting with decentralised decision rights and governance safeguards, and monitoring competitive dynamics related to model/platform providers.
Policy and practitioner takeaways explicitly presented in the discussion/implications sections, deriving from the conceptual framework and mapped literature.
high null result Generative AI and the algorithmic workplace: a bibliometric ... recommended organisational and policy actions
The paper recommends a research agenda for AI economists: causal microeconometric studies (DiD, IVs, RCTs), structural models with hybrid human–AI agents, measurement work on GenAI use, distributional analysis and policy evaluation.
Explicit recommendations listed in the implications and research agenda sections; logical follow‑on from bibliometric findings about gaps in causal and measurement evidence.
high null result Generative AI and the algorithmic workplace: a bibliometric ... recommended methodological directions for future empirical and theoretical resea...
Bibliometric mapping profiles the intellectual structure and evolution of the field but does not establish causal effects of GenAI on organisational outcomes.
Methodological limitation explicitly stated in the paper; bibliometric approach (co‑word, citation, thematic mapping) is descriptive and historical in scope.
high null result Generative AI and the algorithmic workplace: a bibliometric ... methodological limitation (inability to infer causality from bibliometric mappin...
Co‑word and thematic analyses reveal six coherent conceptual clusters that bridge technical AI topics (e.g., LLMs, GANs) with managerial themes (e.g., autonomy, coordination, decision‑making).
Thematic mapping and co‑word network analysis performed on the 212‑paper corpus; identification of six clusters reported in results.
high null result Generative AI and the algorithmic workplace: a bibliometric ... number and thematic composition of conceptual clusters (six clusters linking tec...
Bibliometric and conceptual tools (VOSviewer, Bibliometrix) were used to identify performance trends, co‑word structures, thematic maps, and conceptual evolution in the GenAI–organisation literature.
Methods section: use of VOSviewer for network visualization and Bibliometrix for bibliometric statistics, co‑word analysis, thematic mapping and Sankey thematic evolution.
high null result Generative AI and the algorithmic workplace: a bibliometric ... types of bibliometric analyses applied (performance trends, co‑word structures, ...
The study analysed a corpus of 212 Scopus‑indexed publications covering 2018–2025 to map emergent literature on Generative AI and organisational change.
Bibliometric dataset constructed from Scopus; sample size = 212 peer‑reviewed articles; time window 2018–2025; analyses performed with Bibliometrix and VOSviewer.
high null result Generative AI and the algorithmic workplace: a bibliometric ... size and timeframe of bibliometric corpus (number of publications, 2018–2025)
Because the study is cross-sectional and self-report, causal claims are limited and generalizability is restricted to Generation Z (limitation noted in the paper).
Authors' limitations: cross-sectional/self-report design and sample restricted to Generation Z; these constraints are reported in the paper.
high null result Trust in AI-Driven Marketing and its Impact on Brand Loyalty... Inference validity / generalizability
Study design: cross-sectional self-report survey of 450 Generation Z consumers analyzed with Structural Equation Modeling (SPSS AMOS).
Methods section reporting sample size (n = 450), target population (Generation Z), cross-sectional survey design, and analysis technique (SEM using SPSS AMOS).
The measurement and structural model show good to excellent fit and reliable constructs (CFI = 0.980, TLI = 0.974, RMSEA = 0.062, SRMR = 0.031).
Reported psychometric/model-fit indices from SEM analysis (SPSS AMOS) on sample of 450 respondents.
high null result Trust in AI-Driven Marketing and its Impact on Brand Loyalty... Model fit / construct validity
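To make the reported fit indices easier to read, the sketch below compares them with commonly cited rule-of-thumb cutoffs. The cutoffs are an assumption brought in for illustration (CFI and TLI of at least 0.95, RMSEA of at most 0.06 under a stricter convention or 0.08 under a looser one, SRMR of at most 0.08); the listing does not state which thresholds the authors applied, and the names `STRICT_CUTOFFS`, `LOOSE_CUTOFFS`, and `check_fit` are hypothetical.

```python
# Fit indices as reported in the claim above.
REPORTED = {"CFI": 0.980, "TLI": 0.974, "RMSEA": 0.062, "SRMR": 0.031}

# Rule-of-thumb cutoffs (assumption; not taken from the paper).
# Each entry is (direction, threshold): ">=" means higher is better.
STRICT_CUTOFFS = {
    "CFI": (">=", 0.95),
    "TLI": (">=", 0.95),
    "RMSEA": ("<=", 0.06),
    "SRMR": ("<=", 0.08),
}
# Looser convention that treats RMSEA up to 0.08 as acceptable.
LOOSE_CUTOFFS = dict(STRICT_CUTOFFS, RMSEA=("<=", 0.08))


def check_fit(indices: dict, cutoffs: dict) -> dict:
    """Return, per index, whether the value satisfies its cutoff."""
    results = {}
    for name, value in indices.items():
        direction, threshold = cutoffs[name]
        if direction == ">=":
            results[name] = value >= threshold
        else:
            results[name] = value <= threshold
    return results


if __name__ == "__main__":
    print(check_fit(REPORTED, STRICT_CUTOFFS))
    print(check_fit(REPORTED, LOOSE_CUTOFFS))
```

Under these assumed cutoffs, CFI, TLI, and SRMR pass comfortably, while RMSEA = 0.062 sits just above the stricter 0.06 threshold and within the looser 0.08 one, which is consistent with a "good to excellent" description.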