The Commonplace

Evidence (8625 claims)

Adoption: 8625 claims
Productivity: 7686 claims
Governance: 6917 claims
Human-AI Collaboration: 6574 claims
Org Design: 4189 claims
Innovation: 4131 claims
Labor Markets: 3588 claims
Skills & Training: 2985 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 761 200 101 904 2020
Governance & Regulation 829 400 191 122 1566
Organizational Efficiency 784 193 125 84 1197
Technology Adoption Rate 637 236 124 97 1103
Research Productivity 431 131 58 340 972
Output Quality 481 183 59 47 770
Decision Quality 332 177 82 49 647
Firm Productivity 439 57 88 20 610
AI Safety & Ethics 218 279 66 33 602
Market Structure 181 170 123 24 503
Task Allocation 214 64 72 33 388
Skill Acquisition 174 62 62 17 315
Innovation Output 204 27 45 18 295
Employment Level 105 54 108 13 282
Fiscal & Macroeconomic 132 69 43 26 277
Consumer Welfare 117 63 42 11 233
Firm Revenue 154 48 26 3 231
Task Completion Time 173 31 8 12 225
Inequality Measures 44 123 50 6 223
Worker Satisfaction 89 65 22 12 188
Error Rate 71 92 10 2 175
Regulatory Compliance 77 69 14 5 165
Automation Exposure 58 56 26 13 156
Training Effectiveness 96 21 14 19 152
Wages & Compensation 77 37 25 6 145
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 81 21 1 115
Hiring & Recruitment 52 7 8 3 70
Creative Output 32 20 8 3 64
Skill Obsolescence 5 47 6 1 59
Social Protection 28 16 8 2 54
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
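
As a rough illustration of how the matrix can be read, the sketch below computes direction shares for three rows copied from the table above. The names `ROWS`, `DIRECTIONS`, and `direction_shares` are hypothetical and not part of the site. The four direction counts do not always add up to the listed Total (Other lists 2020, but its four counts sum to 1966), so the sketch uses the sum of the four counts as the denominator; that choice is an assumption.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, listed_total).
ROWS = {
    "Other": (761, 200, 101, 904, 2020),
    "Firm Productivity": (439, 57, 88, 20, 610),
    "Job Displacement": (12, 81, 21, 1, 115),
}

DIRECTIONS = ("positive", "negative", "mixed", "null")


def direction_shares(counts):
    """Return each direction's share of the four listed direction counts."""
    directed = counts[:4]
    denom = sum(directed)  # assumption: ignore claims outside the four columns
    return {name: n / denom for name, n in zip(DIRECTIONS, directed)}


for outcome, counts in ROWS.items():
    shares = direction_shares(counts)
    # Gap between the listed Total and the four direction counts.
    unlisted = counts[4] - sum(counts[:4])
    summary = ", ".join(f"{name} {share:.0%}" for name, share in shares.items())
    print(f"{outcome}: {summary} (unlisted: {unlisted})")
```

Using the listed Total as the denominator instead would lower every share slightly for rows where the counts and the Total disagree.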
External environmental pressures did not show a significant role in the adoption process.
PLS-SEM results from the survey (n=110) reportedly found no significant effect of environmental/external pressures on AI adoption.
high null result Drivers of AI Adoption: The Role of Innovation Attributes, O... AI adoption (in relation to external/environmental pressure)
Data analysis involved Smart PLS-SEM, which facilitated reliability and validity assessment along with hypothesis evaluation.
Paper reports using SmartPLS for Partial Least Squares Structural Equation Modeling to assess reliability, validity, and test hypotheses.
high null result Drivers of AI Adoption: The Role of Innovation Attributes, O... analytical method used
The investigation was guided by the Technology-Organization-Environment (TOE) framework combined with innovation characteristics from Diffusion of Innovation theory.
Paper states theoretical frameworks used to design variables and hypotheses: TOE plus DOI innovation characteristics.
high null result Drivers of AI Adoption: The Role of Innovation Attributes, O... theoretical framing / constructs selection
A total of 110 valid responses were collected through an organized online survey using purposive sampling.
Reported sample description in the paper: online survey, purposive sampling, resulting in 110 valid responses.
high null result Drivers of AI Adoption: The Role of Innovation Attributes, O... sample_size / data_collection
The explanatory interface has no significant impact on situational trust.
Trust measured in different forms (situational, learned, cognitive, emotional) in the RCT; authors report no significant effect of explanatory interface on situational trust (N=120).
Under the sequential AI-assisted decision-making paradigm, the explanatory interface has no significant effect on immediate task performance.
Same randomized controlled experiment; authors report no significant effect of explanatory interface on immediate task performance in the sequential paradigm (N=120 total).
high null result How AI-Assisted Decision-Making Paradigms and Explainability... immediate task performance (task execution stage)
The study was a randomized controlled experiment with 120 pre-service teachers.
Randomized controlled experiment reported in the paper; sample described as 120 pre-service teachers.
The study uses a panel of 283 prefecture-level and above cities from 2012 to 2023 and a difference-in-differences (DID) identification strategy exploiting the establishment of national big data comprehensive pilot zones as a natural experiment.
Methodological description in the paper: sample composition (283 prefecture-level+ cities), time span (2012–2023), and the DID/natural experiment design to estimate policy effects.
high null result Study on the Impact of Establishing Big Data Comprehensive P... study design / methodological setup
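
The difference-in-differences logic named in this entry can be sketched as a two-group, two-period comparison of means. This is a minimal illustration, not the paper's specification, which uses a city panel covering 2012 to 2023. The outcome values are synthetic placeholders and the function name `did_estimate` is hypothetical.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Synthetic outcomes, for illustration only (not taken from the paper).
treated_pre = [1.0, 1.2, 0.8]
treated_post = [1.9, 2.1, 1.7]
control_pre = [1.1, 0.9, 1.0]
control_post = [1.4, 1.2, 1.3]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(f"DID estimate: {effect:.2f}")
```

The estimate is only interpretable as a policy effect under the parallel-trends assumption, that is, if the control group's change stands in for what the treated group would have experienced without the policy.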
Exploratory innovation does not show a significant direct association with long-term competitive performance.
PLS-SEM results from the survey of 104 Portuguese B2B managers show a non-significant direct path from exploratory innovation to performance.
high null result Generative AI Adoption in B2B Firms: Ethical Governance, Inn... long-term competitive performance
Data were analyzed using partial least squares structural equation modeling (PLS-SEM) implemented in SmartPLS 4.
Methods section statement in paper indicating use of PLS-SEM and SmartPLS 4 for data analysis.
The empirical analysis is based on a questionnaire survey administered to 324 respondents from Romanian organizations operating in IT, services, industry, and public administration.
Questionnaire survey described in paper; sample size explicitly stated as 324 respondents from Romanian organizations across IT, services, industry, and public administration.
high null result Decision-Making in Complex Systems Using AI-Based Decision S... sample description / data source
ARS's implementation can be found at https://github.com/t54-labs/AgenticRiskStandard.
Link to code repository provided in the abstract (factual statement pointing to implementation).
high null result Quantifying Trust: Financial Risk Management for Trustworthy... availability of ARS implementation in a public GitHub repository
As AI systems evolve into autonomous agents deployed in open environments and increasingly connected to payments or assets, the operational meaning of trust shifts to end-to-end outcomes: whether an agent completes tasks, follows user intent, and avoids failures that cause material or psychological harm.
Conceptual/argumentative claim presented in the paper (no empirical sample reported in the abstract).
high null result Quantifying Trust: Financial Risk Management for Trustworthy... agent task completion, alignment with user intent, avoidance of material or psyc...
Prior work on trustworthy AI emphasizes model-internal properties such as bias mitigation, adversarial robustness, and interpretability.
Summary statement about existing literature (no empirical data or sample reported in the abstract; asserted by authors as background).
high null result Quantifying Trust: Financial Risk Management for Trustworthy... research emphasis on model-internal properties (bias mitigation, adversarial rob...
On document intelligence (DocILE), our Code Factory variant matches Direct LLM on key field extraction (KILE: 80.0%).
Empirical evaluation reported on DocILE dataset of 5,680 invoices; KILE metric reported at 80.0%.
high null result Compiled AI: Deterministic Code Generation for LLM-Based Wor... key field extraction accuracy (KILE)
We evaluate on two task types: function-calling (BFCL, n=400) and document intelligence (DocILE, n=5,680 invoices).
Statement in paper specifying dataset/task types and sample sizes used in evaluation.
high null result Compiled AI: Deterministic Code Generation for LLM-Based Wor... evaluation datasets and sample sizes
All four models converge to similar skill profiles (3.6-point spread), suggesting that text-based automation feasibility may be more skill-dependent than model-dependent.
Comparison across 4 LLMs (LLaMA 3.3 70B, Mistral Large, Qwen 2.5 72B, Gemini 2.5 Flash) with reported 3.6-point spread in skill-profile SAFI scores.
high null result The AI Skills Shift: Mapping Skill Obsolescence, Emergence, ... variation (spread) in SAFI skill profiles across models
Explicit 'Sponsored' labels do not significantly reduce persuasion.
Experimental comparison including conditions with explicit 'Sponsored' labels; authors report no significant reduction in persuasion when labels were present (from the preregistered experiments).
high null result Commercial Persuasion in AI-Mediated Conversations effect of 'Sponsored' labels on sponsored product selection
A fifth of all products were randomly designated as sponsored and promoted in different ways.
Paper description of experimental manipulation: 20% of products (a fifth) were randomly designated as sponsored in the catalog.
high null result Commercial Persuasion in AI-Mediated Conversations sponsorship assignment (experimental manipulation)
We conducted two preregistered experiments with N = 2,012 participants.
Statement of experimental design in the paper (two preregistered experiments) with total sample size reported as N = 2,012.
high null result Commercial Persuasion in AI-Mediated Conversations study_design / sample_size
A pre-registered experiment evaluates this thesis in a commons production economy -- where agents share a finite resource pool and collaboratively produce value -- at 50-1,000 agent scale.
Paper states that a pre-registered experiment is planned/described; the experiment context (commons production economy) and planned scale (50-1,000 agents) are specified in the excerpt. No experimental outcomes or effect estimates are reported here.
high null result AgentCity: Constitutional Governance for Autonomous Agent Ec... alignment-through-accountability in a commons production economy (collective pro...
We instantiate SoP in AgentCity on an EVM-compatible layer-2 blockchain (L2) with a three-tier contract hierarchy (foundational, meta, and operational).
Reported implementation/instantiation described in the paper (system implementation claim). The paper states the platform (AgentCity) and technical details (EVM-compatible L2, three-tier contracts).
high null result AgentCity: Constitutional Governance for Autonomous Agent Ec... existence/implementation of SoP via AgentCity on L2 with three-tier hierarchy
In this architecture, smart contracts are the law itself -- the actual legislative output that agents produce and that governs their behavior.
Architectural/design claim in the paper describing conceptual role of smart contracts within SoP; presented as an intended property of the system.
high null result AgentCity: Constitutional Governance for Autonomous Agent Ec... role of smart contracts as legislative instrument for agent behavior
Agents discover, transact with, and delegate to agents owned by other parties without centralized oversight.
Asserted behavior pattern of autonomous agents in the paper's motivation; presented as descriptive claim rather than supported by a reported experiment or dataset in the excerpt.
high null result AgentCity: Constitutional Governance for Autonomous Agent Ec... ability of agents to discover, transact, and delegate across ownership boundarie...
Autonomous AI agents are beginning to operate across organizational boundaries on the open internet.
Stated as an empirical observation in the paper's introduction-level motivation; no specific dataset or sample is described in the text excerpt.
high null result AgentCity: Constitutional Governance for Autonomous Agent Ec... cross-organization operation of autonomous agents
The empirical basis of the study is industry data from the Bureau of National Statistics of the Republic of Kazakhstan for 2020–2024.
Statement in the paper specifying the data source and years used for calibration of the model.
high null result The Impact of Artificial Intelligence on the Structural Tran... data source and period used
The study's methodological framework integrates the Bass model of innovation diffusion, an expanded production function with endogenous technological progress and the task-oriented Acemoglu–Restrepo approach, plus a multi-criteria system of industry prioritisation.
Description of the paper's modelling approach in the methods section; model components identified explicitly in the paper.
high null result The Impact of Artificial Intelligence on the Structural Tran... methodological framework (model components)
We conducted a systematic review and bibliometric analysis of 627 articles.
Statement in abstract reporting a systematic review and bibliometric analysis; sample size explicitly given as 627 articles.
high null result Advancing Decision-Making through AI-Human Collaboration: A ... number of articles reviewed / literature corpus
This study uses data from 743 listed enterprises in China’s strategic emerging industries from 2014 to 2023 and employs mediation and moderation (interaction) tests to examine mechanisms (digital-green synergy, information asymmetry, financing constraints) and the moderating role of AI applications.
Statement of data and methods in the paper: panel of 743 listed firms (2014–2023); empirical strategy includes mediation analyses and moderation (interaction) tests.
high null result The Impact of Patient Capital on the High-Quality Developmen... research design / dataset description
This paper has been accepted at PEARC 2026.
Statement in the paper indicating conference acceptance.
The University's GIS Center Ecological Archive (849 curated datasets) serves as a single-agent baseline deployment of EnviSmart.
Reported deployment dataset count provided in the paper: 849 curated datasets used as a single-agent baseline.
high null result Exploring Robust Multi-Agent Workflows for Environmental Dat... number of curated datasets in baseline deployment
The study employed a mixed-methods approach: a quantitative survey of 150 leading Nigerian firms across finance, tech, and manufacturing, complemented by qualitative analysis of government policy and workforce interviews.
Methodological statement in the paper explicitly describing sample and methods (quantitative survey n=150; qualitative policy and interviews).
high null result Human Capital and the AI-Powered Future of Work: (Training, ... methodology (survey and qualitative analysis)
The experiment used stratified randomization across 32 strata with 255 treatment firms and 260 control firms; baseline characteristics are well balanced across groups.
Experimental design description: stratification by geography, traction score, and baseline AI use; reporting of allocation counts and balance tests in Table 2.
high null result Mapping AI into Production: A Field Experiment on Firm Perfo... randomization balance (baseline covariates)
Attrition from the accelerator was low (1.6%, eight ventures) and balanced across treatment and control.
Program enrollment and retention records for the 515 firms in the randomized accelerator; 8 firms attrited.
high null result Mapping AI into Production: A Field Experiment on Firm Perfo... attrition / program dropout rate
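
The counts reported in the two entries above can be cross-checked with a few lines of arithmetic. All figures come from the entries themselves; only the variable names are illustrative.

```python
# Allocation counts from the randomization entry.
treatment_firms = 255
control_firms = 260

# Ventures that left the accelerator, from the attrition entry.
attrited = 8

total_firms = treatment_firms + control_firms
attrition_rate = attrited / total_firms

print(f"randomized firms: {total_firms}")
print(f"attrition rate: {attrition_rate:.1%}")
```

The total comes to 515 firms and the rate rounds to 1.6%, consistent with the figures quoted in the entries.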
The gains from treatment are broad-based: there are no significant differential effects by baseline firm performance or founder technical background.
Heterogeneity/subgroup analyses in the randomized sample (515 firms) comparing treatment effects across strata defined by baseline traction and founder technical background.
high null result Mapping AI into Production: A Field Experiment on Firm Perfo... heterogeneity of treatment effects by baseline performance and founder technical...
Treated firms' demand for labor remains unchanged.
RCT with 515 firms; firms reported labor demand/changes, and the comparison between treatment and control groups showed no significant change.
high null result Mapping AI into Production: A Field Experiment on Firm Perfo... demand for labor / employment
AIGC and HGC exhibit distinct creation behaviors and consumption behaviors.
Descriptive comparisons in the longitudinal dataset showing differences in production rates, content volumes, and consumption patterns between AIGC and HGC.
high null result Scale over Preference: The Impact of AI-Generated Content on... creation behavior (upload frequency/volume) and consumption behavior (views/enga...
The paper uses a comprehensive longitudinal dataset comprising tens of millions of users from a leading Chinese video-sharing platform.
Statement in paper summarizing data source: a longitudinal dataset covering 'tens of millions of users' from a major Chinese video-sharing platform; used for descriptive and comparative analyses of creation and consumption behavior.
high null result Scale over Preference: The Impact of AI-Generated Content on... dataset coverage (number of users observed)
Increasing reasoning effort (low, medium, high) provides no consistent benefit to estimation performance.
Controlled variation of each model's reasoning effort (low/medium/high) while asking them to produce 95% credible intervals for population statistics.
high null result Bayesian Elicitation with LLMs: Model Size Helps, Extra "Rea... impact of prompting/reasoning effort on estimate accuracy and calibration
These chats were committed to public repositories as part of routine development, capturing in-the-wild behavior.
Data collection method: analysis of chat transcripts that were committed to public repositories (the authors state that the chats were collected from repositories and describe them as routine commits).
high null result Programming by Chat: A Large-Scale Behavioral Analysis of 11... degree to which collected chats represent in-the-wild developer behavior (public...
We analyze 74,998 developer messages from 11,579 chat sessions across 1,300 repositories and 899 developers using Cursor and GitHub Copilot.
Reported dataset counts in the paper (message, session, repository, developer counts) drawn from public commit histories of chats.
high null result Programming by Chat: A Large-Scale Behavioral Analysis of 11... number of developer messages / chat sessions / repositories / developers analyze...
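
For a sense of scale, the reported counts can be turned into simple averages. This is a back-of-the-envelope sketch: the four counts are from the entry above, while the `ratios` dictionary and its labels are illustrative. The averages say nothing about how messages or sessions are distributed across developers.

```python
# Dataset counts as reported in the entry.
messages = 74_998
sessions = 11_579
repositories = 1_300
developers = 899

ratios = {
    "messages per session": messages / sessions,
    "sessions per repository": sessions / repositories,
    "sessions per developer": sessions / developers,
}

for label, value in ratios.items():
    print(f"{label}: {value:.2f}")
```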
We evaluate APEX across three baselines and six scenarios using sample sizes 2–4x larger than initial experiments (N=20–40 per scenario).
Experimental design statement in the paper (three baselines, six scenarios, reported N range of 20–40 per scenario).
high null result APEX: Agent Payment Execution with Policy for Autonomous Age... experimental evaluation breadth (number of baselines/scenarios) and sample sizes...
The HTTP 402 protocol treats payment as a first-class protocol event, but most implementations rely on cryptocurrency rails.
Descriptive claim in the paper about the state of HTTP 402 and common implementations (literature/implementation survey-style claim in paper).
high null result APEX: Agent Payment Execution with Policy for Autonomous Age... implementation choice for HTTP 402 (use of cryptocurrency rails)
Green innovation does not yet significantly reduce carbon inequality.
Empirical results from the provincial panel analysis (2003–2021) showing that measures of green innovation are not associated with a statistically significant reduction in carbon inequality.
This paper employs a staggered difference-in-differences (DID) model using data from Chinese A-share listed manufacturing companies from 2012 to 2023 and uses the National Artificial Intelligence Innovative Application Pioneer Zone (AIIAPZ) policy as a quasi-natural experiment.
Staggered DID empirical design; sample described as Chinese A-share listed manufacturing firms, 2012–2023; AIIAPZ policy used as treatment assignment (quasi-natural experiment).
high null result Does Artificial Intelligence Improve the Operational Resilie... methodological design / identification strategy (use of staggered DID and policy...
Big data analytics and blockchain technologies show no significant correlations with exports to specific destinations (multivariate probit result).
Multivariate probit model of destination-specific export decisions showing non-significant coefficients for big data analytics and blockchain across destinations (sample size not reported in the excerpt).
high null result How Digitalization Shapes Export Potential: Firm-Level Insig... exporting to specific destination regions (binary/region-specific firm export de...
Adopting blockchain technologies does not have a statistically significant effect on a firm's likelihood of exporting (probit model result).
Probit regression analysis showing a non-significant coefficient for blockchain adoption (sample size not reported in the excerpt).
high null result How Digitalization Shapes Export Potential: Firm-Level Insig... likelihood/probability of exporting (firm-level)
Adopting big data analytics does not have a statistically significant effect on a firm's likelihood of exporting (probit model result).
Probit regression analysis showing a non-significant coefficient for big data analytics adoption (sample size not reported in the excerpt).
high null result How Digitalization Shapes Export Potential: Firm-Level Insig... likelihood/probability of exporting (firm-level)
We introduce the Agentic Task Exposure (ATE) score, a composite measure computed algorithmically from O*NET task data using calibrated adoption parameters (not a regression estimate), incorporating AI capability scores, workflow coverage factors, and logistic adoption velocity.
Methodological description in the paper; algorithmic construction from O*NET task data with specified calibrated adoption parameters and components (AI capability scores, workflow coverage, logistic adoption).
high null result Agentic AI and Occupational Displacement: A Multi-Regional T... NA (methodological construct for measuring exposure/adoption)
Code authoring and review are only a small part of the larger software engineering process; the resulting code must also be maintained and updated over time.
Conceptual/argumentative claim presented in the paper to motivate longitudinal analysis (not presented as an empirical estimate from the dataset).
high null result Investigating Autonomous Agent Contributions in the Wild: Ac... relative share of authoring/review versus maintenance in software development (c...