The Commonplace

Evidence (5267 claims)

Adoption: 5267 claims
Productivity: 4560 claims
Governance: 4137 claims
Human-AI Collaboration: 3103 claims
Labor Markets: 2506 claims
Innovation: 2354 claims
Org Design: 2340 claims
Skills & Training: 1945 claims
Inequality: 1322 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 378 106 59 455 1007
Governance & Regulation 379 176 116 58 739
Research Productivity 240 96 34 294 668
Organizational Efficiency 370 82 63 35 553
Technology Adoption Rate 296 118 66 29 513
Firm Productivity 277 34 68 10 394
AI Safety & Ethics 117 177 44 24 364
Output Quality 244 61 23 26 354
Market Structure 107 123 85 14 334
Decision Quality 168 74 37 19 301
Fiscal & Macroeconomic 75 52 32 21 187
Employment Level 70 32 74 8 186
Skill Acquisition 89 32 39 9 169
Firm Revenue 96 34 22 152
Innovation Output 106 12 21 11 151
Consumer Welfare 70 30 37 7 144
Regulatory Compliance 52 61 13 3 129
Inequality Measures 24 68 31 4 127
Task Allocation 75 11 29 6 121
Training Effectiveness 55 12 12 16 96
Error Rate 42 48 6 96
Worker Satisfaction 45 32 11 6 94
Task Completion Time 78 5 4 2 89
Wages & Compensation 46 13 19 5 83
Team Performance 44 9 15 7 76
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 18 17 9 5 50
Job Displacement 5 31 12 48
Social Protection 21 10 6 2 39
Developer Productivity 29 3 3 1 36
Worker Turnover 10 12 3 25
Skill Obsolescence 3 19 2 24
Creative Output 15 5 3 1 24
Labor Share of Income 10 4 9 23
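
The matrix above is flattened text: outcome names contain spaces, and some rows print fewer than four direction counts. A minimal sketch of how such rows could be read back, assuming the last integer on each line is the Total and the preceding integers are direction counts in header order. The helper names `parse_row` and `positive_share` are hypothetical and not part of the site. The printed Total is used as the denominator because the four direction columns do not always sum to it (for Governance & Regulation they sum to 729 against a Total of 739); the source does not say why.

```python
import re

# A few rows copied verbatim from the Evidence Matrix above.
ROWS = """\
Governance & Regulation 379 176 116 58 739
Firm Productivity 277 34 68 10 394
AI Safety & Ethics 117 177 44 24 364
Firm Revenue 96 34 22 152
"""


def parse_row(line):
    """Split a flattened row into (outcome, direction counts, total).

    The outcome name may contain spaces, so the trailing run of integers
    is peeled off the end of the line; the last integer is the Total.
    """
    match = re.match(r"^(.*?)((?:\s+\d+)+)\s*$", line)
    if match is None:
        raise ValueError(f"no counts found in: {line!r}")
    numbers = [int(tok) for tok in match.group(2).split()]
    return match.group(1).strip(), numbers[:-1], numbers[-1]


def positive_share(line):
    """Return Positive / Total for a row, or None when a cell is missing.

    With fewer than four direction counts we cannot tell which column
    is absent, so no share is reported for that row.
    """
    _, counts, total = parse_row(line)
    if len(counts) != 4:
        return None
    return counts[0] / total


for line in ROWS.splitlines():
    outcome, counts, total = parse_row(line)
    share = positive_share(line)
    label = "n/a (missing cell)" if share is None else f"{share:.1%}"
    print(f"{outcome:<26} total={total:<5} positive share={label}")
```

Rows such as Firm Revenue are reported as ambiguous instead of being assigned to columns by guesswork.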
Filter: Adoption
This yields a common scale (bits of usable information) for comparing a wide range of interventions, contexts, and models.
Theoretical implication of the authors' formalization combining Bayesian persuasion and V-usable information (paper argues for a common information scale measured in bits).
high null result Mecha-nudges for Machines bits of usable information as a comparability metric
To formalize mecha-nudges, we combine the Bayesian persuasion framework with V-usable information, a generalization of Shannon information that is observer-relative.
Methodological/theoretical development described in the paper (formal combination of two theoretical frameworks).
high null result Mecha-nudges for Machines formal representation of information available to observers/agents
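
For orientation, a sketch of the standard definition of predictive V-information from the usable-information literature. The notation is an assumption and may differ from the paper's: \mathcal{V} is the family of predictors available to the observer, and f[x] is the distribution over Y that predictor f outputs given side information x (or \varnothing for none). Taking logarithms base 2 gives a quantity in bits.

```latex
\begin{align}
H_{\mathcal{V}}(Y \mid X) &= \inf_{f \in \mathcal{V}} \mathbb{E}_{x, y \sim X, Y}\bigl[-\log_2 f[x](y)\bigr] \\
H_{\mathcal{V}}(Y \mid \varnothing) &= \inf_{f \in \mathcal{V}} \mathbb{E}_{y \sim Y}\bigl[-\log_2 f[\varnothing](y)\bigr] \\
I_{\mathcal{V}}(X \to Y) &= H_{\mathcal{V}}(Y \mid \varnothing) - H_{\mathcal{V}}(Y \mid X)
\end{align}
```

When \mathcal{V} contains every possible predictor, this quantity reduces to Shannon mutual information; a restricted \mathcal{V} is what makes the measure observer-relative.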
We introduce mecha-nudges: changes to how choices are presented that systematically influence AI agents without degrading the decision environment for humans.
Conceptual/definitional contribution made in the paper (novel concept introduced by authors).
high null result Mecha-nudges for Machines influence on AI agents while preserving human decision environment
Nudges are subtle changes to the way choices are presented to human decision-makers (e.g., opt-in vs. opt-out by default) that shift behavior without restricting options or changing incentives.
Background/definition stated in the paper (conceptual; references to standard behavioral-economics definition of nudges).
high null result Mecha-nudges for Machines behavioral response to choice presentation
Data sources include field research conducted in 2024 and public reports from the Ministry of Industry and Information Technology and the National Bureau of Statistics.
Paper statement describing data provenance: field surveys in 2024 (n=326) plus public reports from MIIT and National Bureau of Statistics.
high null result Research on the Adoption of Artificial Intelligence and Proc... data provenance / sources
We conduct an in-depth case study of SWE-bench GitHub issue resolution using two representative models, GPT-5 mini and DeepSeek v3.2.
Descriptive: authors report running an in-depth case study on the SWE-bench GitHub issue resolution dataset using two named models (GPT-5 mini and DeepSeek v3.2).
high null result Computational Arbitrage in AI Model Markets execution of a case study on SWE-bench GitHub issue resolution with two named mo...
The paper proposes an original 'Revenue-Sharing as Infrastructure' (RSI) model in which the platform offers its AI infrastructure for free and takes a percentage of the revenues generated by developers' applications, reversing the traditional upstream payment logic.
Theoretical model proposal and conceptual description in the paper; presented as original contribution (no empirical implementation reported).
high null result Revenue-Sharing as Infrastructure: A Distributed Business Mo... business model design (revenue-sharing vs pay-upfront)
Recent literature distinguishes three generations of business models: a first generation modeled on cloud computing (pay-per-use), a second characterized by diversification (freemium, subscriptions), and a third, emerging generation exploring multi-layer market architectures with revenue-sharing mechanisms.
Literature review and conceptual synthesis presented in the paper; no empirical study or sample reported.
high null result Revenue-Sharing as Infrastructure: A Distributed Business Mo... classification of business model generations
We evaluate our approach on spapi, a production in-vehicle API system at Volvo Group involving 192 endpoints, 420 properties, and 776 CAN signals across six functional domains.
Case study / evaluation dataset description (explicit counts provided in paper).
high null result LLM-Powered Workflow Optimization for Multidisciplinary Soft... evaluation dataset scale and scope (endpoints, properties, CAN signals, domains)
The analysis relies on partial least squares path modeling (PLS-PM) to test eight predictions linking technological perceptions, organizational factors, and adoption outcomes.
Author-stated analytical method: PLS-PM; eight predictions tested; uses the survey data described above.
high null result Artificial Intelligence Adoption in Talent Acquisition: Effe... analytical approach / hypothesis testing
The study uses cross-sectional survey data from 523 human resource professionals and hiring managers representing 184 organizations across multiple industries in the United States.
Author-stated sample description in the paper: cross-sectional survey; 523 HR professionals/hiring managers; 184 organizations; multiple industries; U.S.
high null result Artificial Intelligence Adoption in Talent Acquisition: Effe... sample composition / data source
The research methodology is based on an input-oriented envelope model, used to assess how far labor resources and labor markets have been transformed by the spread of artificial intelligence.
Methodological statement in the paper specifying the use of an input-oriented envelope model applied to a sample of European Union countries.
high null result Artificial intelligence as a driver of economic growth: Chal... method of measurement / assessment approach
We document a systematic pattern we call the 'Intent-Source Divide': experiential and transactional query intents are associated with different source mixes.
Labeling of the observed consistent association between query intent (experiential vs transactional) and citation-source mix in the audited dataset of Google Gemini responses.
high null result The End of Rented Discovery: How AI Search Redistributes Pow... association between query intent and source mix
We audit 1,357 grounding citations from Google Gemini across 156 hotel queries in Tokyo.
Manual audit of Google Gemini grounding citations for 156 hotel queries in Tokyo; counted 1,357 grounding citations.
high null result The End of Rented Discovery: How AI Search Redistributes Pow... number of grounding citations audited
The model yields two limits on the speed of learning and adoption: a structural limit determined by prerequisite reachability and an epistemic limit determined by uncertainty about the target.
Theoretical result stated in the paper (model-derived identification of two distinct limiting factors on learning speed).
high null result A Mathematical Theory of Understanding speed of learning / adoption
Teaching is modeled as sequential communication with a latent target.
Modeling assumption explicitly stated in the paper (formalization of teaching in the theoretical framework).
high null result A Mathematical Theory of Understanding model specification (teaching process)
The paper models the learner as a mind: an abstract learning system characterized by a prerequisite structure over concepts.
Modeling assumption explicitly stated in the paper (definition of the 'mind' in the theoretical model).
high null result A Mathematical Theory of Understanding model specification (representation of learner)
The findings provide evidence against concerns that AI mediation undermines people's ability to distinguish truth from lies.
Synthesis of experimental results showing unchanged lie-detection accuracy despite declines in perceived trust/confidence.
high null result Through the Looking-Glass: AI-Mediated Video Communication R... ability to distinguish truth from lies (lie-detection accuracy)
Participants were no more inclined to suspect those using AI tools of lying.
Experimental comparisons assessing participants' propensity to suspect AI-mediated speakers of deception showed no increase in suspicion for users of AI tools.
high null result Through the Looking-Glass: AI-Mediated Video Communication R... inclination to suspect AI-mediated speakers of lying
Participants' actual judgment accuracy (ability to detect lies) remained unchanged across AI-mediated and non-AI-mediated videos.
Primary experimental result comparing lie-detection accuracy (truthful vs deceptive statements) across the three AI mediation conditions in the preregistered experiments (N = 2,000).
high null result Through the Looking-Glass: AI-Mediated Video Communication R... judgment accuracy (lie-detection accuracy)
We conducted two preregistered online experiments (N = 2,000).
Methods statement in the paper: two preregistered online experiments with a combined sample size of 2,000 participants.
high null result Through the Looking-Glass: AI-Mediated Video Communication R... study design / sample size (methodological claim)
The study collected data from 293 questionnaire respondents and 12 interview participants.
Mixed-methods data collection reported in the paper: n=293 survey respondents and n=12 interviewees.
high null result The Impact of Artificial Intelligence on Financial Inclusion... study sample / data collection
The study synthesises findings from 36 peer-reviewed articles published between 2015 and 2025.
Systematic literature synthesis / review of peer-reviewed articles; sample = 36 articles (2015–2025) as stated in the paper.
high null result The Influence of Automation on Tax Compliance Behaviour scope of evidence base (number of articles reviewed)
This research deepens theoretical understanding by integrating CE principles, Industry 4.0 architectures, green innovation theory, and lifecycle assessment into a unified conceptual framework.
Authors' description of theoretical contribution in the abstract, based on their synthesis of the bibliometric and systematic review findings.
high null result Artificial intelligence as a catalyst for the circular econo... conceptual/theoretical integration (framework development)
This study offers the first comprehensive mixed-methods assessment of how AI transforms industrial production ecosystems in the post-ChatGPT era.
Authors' methodological/novelty claim in the abstract; supported by description of methods (bibliometric analysis of 196 articles and systematic review of 104 studies).
high null result Artificial intelligence as a catalyst for the circular econo... novelty / comprehensiveness of the study
We construct a multidimensional energy justice index to analyze AI’s net effects, pathways, and institutional dependencies.
Methodological statement: authors create an energy justice index (multidimensional) used as dependent variable in empirical analysis.
high null result Artificial intelligence adoption for advancing energy justic... multidimensional energy justice index
This study uses a panel dataset for 30 Chinese provinces from 2008 to 2022.
Statement of dataset coverage in the paper: 30 provinces, years 2008–2022 (panel data).
high null result Artificial intelligence adoption for advancing energy justic... dataset coverage (30 provinces, 2008–2022)
This study uses a mixed-method research design combining quantitative ROI modelling and cost–benefit analysis, qualitative synthesis of secondary enterprise case studies, and architectural analysis of Azure-native GenAI services.
Explicit methodological description in the abstract of the paper.
high null result Measuring Business ROI of Generative AI Adoption on Azure Cl... research design / methods
Ninety-five high-quality studies were analyzed using principal component analysis and k-means clustering.
The paper states that screening produced 95 high-quality studies, which were then analyzed with PCA and k-means clustering.
high null result AI Governance Risk Tiering for Sustainable Digital Infrastru... number of studies analyzed and analytical methods applied
A systematic literature review of 450 records from major databases was conducted using PRISMA 2020 guidelines.
Statement in the paper describing methods: systematic literature review using PRISMA 2020; initial search returned 450 records from major databases.
high null result AI Governance Risk Tiering for Sustainable Digital Infrastru... number of records screened in systematic review
This Article presents the results of an experiment in which a transcript of a hypothetical client interview involving potential disability discrimination, retaliation, and wrongful termination claims was submitted to each AI system, with prompts requesting identification and assessment of viable legal theories.
Methodological description of the experiment: one hypothetical client interview transcript fed to each of four AI engines with prompts to identify and assess legal theories.
high null result Robot Wingman: Using AI to Assess an Employment Termination experimental procedure (input and prompts)
Industrial robot penetration is used as a proxy measure for AI adoption in Chinese provinces.
Paper explicitly states industrial robot penetration was used as the proxy for AI adoption in the empirical analysis.
high null result Nonlinear effects of ageing population and AI on China’s GDP... AI adoption (proxied by industrial robot penetration)
The study uses panel data on 31 Chinese provinces for the period 2000–2022 and employs panel threshold regression models with ageing and AI adoption as threshold variables.
Paper description: panel data from 31 provinces (2000–2022); use of panel threshold regression models; threshold variables specified as ageing and AI adoption (industrial robot penetration).
high null result Nonlinear effects of ageing population and AI on China’s GDP... methodological approach (panel threshold regression)
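
As a reading aid, the generic single-threshold fixed-effects panel specification is sketched below. This is the textbook form, not necessarily the paper's exact model: the regressors, the outcome variable, and the number of thresholds are not given in this summary, and all symbols are assumptions. Here q_{it} is the threshold variable (ageing or AI adoption, per the description above), \gamma is the estimated threshold, and \mu_i is a province fixed effect.

```latex
y_{it} = \mu_i + \beta_1^{\top} x_{it}\,\mathbb{1}(q_{it} \le \gamma) + \beta_2^{\top} x_{it}\,\mathbb{1}(q_{it} > \gamma) + \varepsilon_{it}
```

The slope on x_{it} switches from \beta_1 to \beta_2 once q_{it} crosses \gamma, which is how such models capture nonlinear effects.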
Specification and implementation are available at https://github.com/chelof100/acp-framework-en
Repository URL provided in the specification text; points to the stated implementation and documentation artifacts.
high null result Agent Control Protocol: Admission Control for Agent Actions availability of specification and implementation at the given URL
The specification defines more than 62 verifiable requirements and 12 prohibited behaviors.
Quantitative claims stated in the specification about requirement and prohibited-behavior counts.
high null result Agent Control Protocol: Admission Control for Agent Actions number of verifiable requirements and prohibited behaviors
The v1.13 release includes an OpenAPI 3.1.0 specification for all HTTP endpoints.
Specification/repository statement indicating an OpenAPI 3.1.0 specification is provided for HTTP endpoints.
high null result Agent Control Protocol: Admission Control for Agent Actions presence of OpenAPI 3.1.0 specification covering HTTP endpoints
The v1.13 release includes 51 signed conformance test vectors (Ed25519 + SHA-256).
Repository/specification statement listing 51 signed conformance test vectors and the signature/hash algorithms used.
high null result Agent Control Protocol: Admission Control for Agent Actions count and cryptographic scheme of conformance test vectors
The v1.13 release includes a Go reference implementation of 22 packages covering all L1-L4 capabilities.
Repository statement describing a Go reference implementation comprising 22 packages, with claimed coverage of all L1-L4 capabilities.
high null result Agent Control Protocol: Admission Control for Agent Actions number of Go packages in the reference implementation and claimed coverage of co...
The v1.13 specification comprises 36 technical documents organized into five conformance levels (L1-L5).
Explicit quantitative statement in the specification/repository describing document count and organization.
high null result Agent Control Protocol: Admission Control for Agent Actions number of technical documents and conformance-level organization
Existing financial question answering benchmarks primarily focus on company balance sheet data and rarely evaluate reasoning over how company stocks trade in the market or their interactions with fundamentals.
Literature/background claim made in the paper motivating the new benchmark; authors contrast prior benchmarks' focus on balance sheet data with the lack of market/trading-signal evaluation.
high null result FinTradeBench: A Financial Reasoning Benchmark for LLMs scope of existing financial QA benchmarks (focus on balance sheet data vs. tradi...
Retrieval provides limited benefit for trading-signal reasoning.
Experimental comparison reported in the paper showing that retrieval-augmentation had little impact on performance for trading-signal-focused questions.
high null result FinTradeBench: A Financial Reasoning Benchmark for LLMs change in performance on trading-signal-focused questions with retrieval
To ensure reliability at scale, we adopt a calibration-then-scaling framework that combines expert seed questions, multi-model response generation, intra-model self-filtering, numerical auditing, and human-LLM judge alignment.
Methodological claim in the paper describing the QA and annotation pipeline; the paper reports using these components as part of their reliability framework.
high null result FinTradeBench: A Financial Reasoning Benchmark for LLMs benchmark annotation and validation procedure
The benchmark is organized into three reasoning categories: fundamentals-focused, trading-signal-focused, and hybrid questions requiring cross-signal reasoning.
Direct description of the benchmark's taxonomy in the paper; the authors specify these three categories as the organizational structure for the 1,400 questions.
high null result FinTradeBench: A Financial Reasoning Benchmark for LLMs benchmark organization / task taxonomy
FinTradeBench contains 1,400 questions grounded in NASDAQ-100 companies over a ten-year historical window.
Statement in the paper describing the benchmark construction and scope; the paper reports the benchmark size (1,400 questions) and the dataset grounding (NASDAQ-100 over ten years).
high null result FinTradeBench: A Financial Reasoning Benchmark for LLMs benchmark size and scope (number of questions; data grounding)
The paper derives formal conditions under which the inversion (smaller, orchestrated models outperforming frontier models) holds.
Mathematical derivations and stated sufficient/necessary conditions presented in the paper.
high null result Punctuated Equilibria in Artificial Intelligence: The Instit... parameter conditions for comparative performance inversion
We develop the Institutional Fitness Manifold, a mathematical framework that evaluates AI systems along four dimensions: capability, institutional trust, affordability, and sovereign compliance.
Theoretical/model development presented in the paper (formal definition of the manifold and its four dimensions).
high null result Punctuated Equilibria in Artificial Intelligence: The Instit... institutional fitness evaluated across four dimensions
There have been five eras of AI development since 1943, and within the current Generative AI Era there are four distinct epochs, each initiated by a discontinuous event.
Descriptive/historical classification within the paper (counts of eras and epochs; named initiating events such as the transformer and the 'DeepSeek Moment').
high null result Punctuated Equilibria in Artificial Intelligence: The Instit... count and classification of historical AI eras/epochs
The study uses panel data for 30 Chinese provinces from 2013–2022 to measure urban circular economy efficiency (UCEE) with a Super-SBM model including undesirable outputs, track dynamics via the Global Malmquist–Luenberger index, and estimate spatial effects with a spatial Durbin model.
Methodological description in the abstract: explicit statement of data (30 provinces, 2013–2022) and the three methods used (Super-SBM with undesirable outputs, GML index, spatial Durbin model).
high null result How artificial intelligence and environmental regulation inf... use of Super-SBM measurement, GML dynamics, and spatial Durbin estimation (metho...
Despite fears of mass unemployment, aggregate labor-market data through 2025 show limited labor-market disruption from generative AI.
Review of aggregate employment and labor-market studies and macro-level data through 2025 cited in the brief; methods include analyses of employment statistics and macro labor indicators (no single sample size reported).
high null result AI, Productivity, and Labor Markets: A Review of the Empiric... aggregate employment / labor-market disruption
Open research challenges that define the research agenda include scaling beyond benchmarks, achieving compositionality over changes, metrics for validating specifications, handling rich logics, and designing human-AI specification interactions.
Authors' explicit enumeration of open problems and a proposed multi-disciplinary research agenda; presented as expert opinion rather than empirical finding.
high null result Intent Formalization: A Grand Challenge for Reliable Coding ... progress on research questions (research agenda advancement)