The Commonplace

Evidence (14055 claims)

Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
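
The counts in the matrix can be converted into direction shares per outcome. The following is a minimal Python sketch, assuming shares are computed over the four listed directions rather than the Total column, since in some rows (for example Firm Productivity, 606 versus 600) the Total exceeds the sum of the four listed counts. The names MATRIX, direction_shares, and unlisted_count are hypothetical and chosen only for illustration; the three rows are copied from the matrix.

```python
from typing import Dict, Tuple

# (positive, negative, mixed, null, total) copied from three matrix rows.
MATRIX: Dict[str, Tuple[int, int, int, int, int]] = {
    "Firm Productivity": (435, 57, 88, 20, 606),
    "Error Rate": (69, 92, 10, 2, 173),
    "Job Displacement": (12, 80, 20, 1, 113),
}

DIRECTIONS = ("positive", "negative", "mixed", "null")


def direction_shares(row: Tuple[int, int, int, int, int]) -> Dict[str, float]:
    """Share of each direction among the four listed counts (assumed denominator)."""
    counts = row[:4]
    listed = sum(counts)
    return {name: count / listed for name, count in zip(DIRECTIONS, counts)}


def unlisted_count(row: Tuple[int, int, int, int, int]) -> int:
    """Claims included in Total but not in any of the four listed columns."""
    return row[4] - sum(row[:4])


if __name__ == "__main__":
    for outcome, row in MATRIX.items():
        shares = direction_shares(row)
        summary = ", ".join(f"{name} {share:.1%}" for name, share in shares.items())
        print(f"{outcome}: {summary}; unlisted claims: {unlisted_count(row)}")
```

Using the sum of the listed counts as the denominator keeps the four shares adding to one even when the Total column includes claims that fall outside the four listed directions.
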
We analyzed over 1.5M assets and 128K agents in EvoMap.
Descriptive dataset statement in the paper reporting the scope of the empirical analysis (assets and agents counts).
We conducted a global large-scale randomized field experiment, delivering customized LLM-generated feedback for over 31,000 arXiv preprints across 150 fields and more than 45,000 researchers from 133 geographic regions.
Statement in paper describing experimental design and scale: randomized field experiment; sample described as >31,000 preprints, >45,000 researchers, 150 fields, 133 regions.
high null result Human-AI Collaboration in Science at Scale: A Global Large-s... n/a (description of experimental sample and coverage)
The study uses 5 million job postings from Beijing covering 2018–2024 as its primary data source.
Stated dataset scope and size in the paper's description of data.
high null result Generative AI impacts on intra-urban inequality and skill pr... dataset size and temporal coverage
We construct a neighborhood-level GenAI Exposure Index by aggregating task-level assessments from five leading large language models.
Methodological construction described in the paper: task-level GenAI suitability assessments from five LLMs applied to tasks in 5 million Beijing job postings (2018–2024), aggregated to the neighborhood level.
high null result Generative AI impacts on intra-urban inequality and skill pr... GenAI Exposure Index (measurement / adoption proxy)
Decision-makers (DMs) are similarly ambiguity-seeking and ambiguity-generated insensitive (a-insensitive) regardless of whether the analyst is human or a machine learning (ML) model.
Incentivized laboratory experiment in which participants' ambiguity attitudes were measured for forecasts attributed to human and ML analysts; comparison of ambiguity-seeking and a-insensitivity across analyst type reported in the paper (sample size not reported in abstract).
high null result Trusting human versus machine predictions as a decision unde... ambiguity attitude (ambiguity-seeking and a-insensitivity)
There is a significant deficiency in India-centric qualitative investigations on human-AI collaboration in the IT sector.
Authors' review of peer-reviewed literature and secondary data concluding a gap in India-focused qualitative studies (literature gap analysis). No numeric count provided.
high null result Human–AI Collaboration in the Indian IT Industry: A Qualitat... quantity/coverage of India-centric qualitative research
The same bias was not observed when imagining help from another human participant.
Empirical comparison reported in the abstract: predictions about receiving help from another human did not show the same faster-than-reality bias as predictions about AI assistance (from the same preregistered study, N = 1237).
high null result Cognitive offloading and the speedup illusion in human-AI in... predicted completion time when imagining help from another human
Actual completion times between independent completion and AI-assisted completion did not differ.
Empirical result reported in the abstract comparing measured completion times for independent vs. AI-assisted task completion in the preregistered study (N = 1237).
high null result Cognitive offloading and the speedup illusion in human-AI in... actual completion time
We conducted a preregistered large-scale behavioral study (N = 1237) to characterize mismatches between expectations and reality, with a focus on simple cognitive tasks.
Authors report study design and sample size in the abstract: preregistered behavioral experiment with N = 1237 participants.
high null result Cognitive offloading and the speedup illusion in human-AI in... study design / sample size (methodological claim)
Identification strategy exploits import lumpiness in product categories linked to automation technologies (including robots) to disentangle adoption effects from selection into adoption.
Methodological claim: use of import 'lumpiness' in automation-related product categories as a plausibly exogenous source of adoption variation within a difference-in-differences framework.
high null result Firm size and the automation wage premium identification strategy (exogeneity of adoption variation)
We integrate datasets on trade activities, firm, and worker characteristics for the population of Italian importing firms from 2011 to 2019.
Data integration described in abstract; population-level administrative datasets on trade, firm, and worker characteristics for Italian importing firms covering years 2011–2019.
high null result Firm size and the automation wage premium coverage of datasets (population of Italian importing firms 2011–2019)
The study examines the impact of AI technologies on Uzbekistan's labor market transformation in the context of implementing the national strategy 'Digital Uzbekistan - 2030' and the Strategy for the Development of AI Technologies until 2030.
Framing and scope statement in the paper; analysis based on national strategy documents, statistical data, industry reviews, and regulatory legal documents.
high null result The Impact of Artificial Intelligence During the Transformat... impact of AI in the context of national digital/AI strategies
The degree of persuasiveness for LLM-based narrative explanations did not meaningfully impact decision accuracy over a simple AI prediction alone.
Large-scale human behavioral experiment comparing decision accuracy with AI prediction alone versus AI prediction plus narrative explanations of varying persuasiveness (method described in paper).
The system was evaluated on a real 64-GPU A100 testbed emulating three wind-powered sites with Azure production traces.
Experimental evaluation described in abstract: 64-GPU A100 testbed, emulation of three sites, use of Azure production traces.
high null result XWind: A Cross-site Router for Large Language Model Inferenc... experimental evaluation setup
The paper includes comparisons against accelerated baselines (reported experimental comparisons).
Statement in experimental section that comparisons to accelerated baselines were performed; specific baselines and results are in the paper.
high null result CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolv... comparative performance vs. baselines
The paper examines the legal implications of overusing export controls.
Statement of the paper's analytic scope and structure (description of content).
high null result Strategic Stalemates: The Paradox of Export Controls in the ... legal implications of export control overuse
AI infrastructure decisions involve trade-offs across physical resource systems including energy, land, water, and labor.
Descriptive claim in the abstract and framing sections; supported by cited prior work on the economic, physical, and moral limits of AI development and by illustrative regional cases.
high null result The AI Infrastructure Triad in Regional Governance: How Regi... resource demands/trade-offs (energy, land, water, labor) associated with AI infr...
The evidence is used illustratively rather than as a full causal test.
Explicit methodological statement in the abstract describing the role of the evidence (coded comments and cases) as illustrative.
high null result The AI Infrastructure Triad in Regional Governance: How Regi... strength/type of empirical inference (illustrative vs causal)
The article interprets stakeholder and regional positions as different ways of prioritizing the triad's frontiers.
Analysis of the coded public comments and illustrative regional cases used to map stakeholder/regional positions onto the Progress/Sustainability/Equity triad.
high null result The AI Infrastructure Triad in Regional Governance: How Regi... mapping of stakeholder/regional positions onto triad priorities
The article draws on a previously coded dataset of 10,068 public comments submitted to the 2025 U.S. AI Action Plan.
Empirical resource used in the paper; dataset size explicitly reported as 10,068 coded public comments.
high null result The AI Infrastructure Triad in Regional Governance: How Regi... stakeholder/public comment content regarding the U.S. AI Action Plan
We sample 50 benchmark games from a 2,000-game generated pool and evaluate nine frontier and open-weight LLMs in a head-to-head tournament with over 36,000 matches.
Empirical setup reported in the paper's abstract: 50 sampled games, 2,000-game pool, nine LLMs, >36,000 head-to-head matches.
high null result GENSTRAT: Toward a Science of Strategic Reasoning in Large L... evaluation sample size / tournament scale (matches run)
We interviewed 24 product-focused individuals at a large technology firm about how AI has impacted their own work, their work within their product team, and their professional interactions.
Qualitative semi-structured interviews with 24 product-focused employees at a single large technology firm; sample size = 24.
high null result Beyond the Org Chart: AI and the Transformation of Invisible... description of sample and data collection
This study is a systematic literature review conducted following PRISMA 2020 guidelines synthesizing peer-reviewed studies published between 2019 and 2025 identified via searches in Scopus, Web of Science and Google Scholar.
Author-stated methodology in the paper: PRISMA 2020 systematic literature review covering 2019–2025 with database searches in Scopus, Web of Science, and Google Scholar.
high null result Yapay Zeka Sistemleri ve İnsan İşbirliğinin Psikolojik, Sosy... scope and coverage of literature search / methodological transparency
This scoping review adhered to the PRISMA-ScR guidelines and encompassed 29 peer-reviewed empirical studies published from 2020 to 2025.
Methods statement in the paper (explicit methodological description).
high null result The influence of AI-Driven Employee Performance Management (... scope and methodological adherence of the review (PRISMA-ScR; n=29 studies)
AI capability is conceptualized/measured as having sub-dimensions including technical infrastructure and management.
Measurement/model description in paper: AI capability broken into sub-dimensions (technical infrastructure, management); supported by survey instrument and measurement model using PLS-SEM on 251 firms.
high null result AI for decision-making: exploring the linkage from AI capabi... construct dimensionality of AI capability
The mixed-method approach, combining partial least squares–structural equation modeling (PLS-SEM) and fuzzy-set qualitative comparative analysis (fsQCA), was used for analyzing the survey data of 251 firms.
Methods statement in paper: authors report using a mixed-method approach (PLS-SEM and fsQCA) on survey data; sample size explicitly stated as 251 firms.
high null result AI for decision-making: exploring the linkage from AI capabi... research methodology / analytic approach
The paper identifies five major research gaps and proposes future research directions in intelligent international marketing.
Author-reported outcome of the paper's systematic review and content analysis (2010–2025); descriptive claim about the paper's contributions.
high null result Research on International Marketing in the Context of Intell... identification of research gaps and proposed directions
Prior productivity does not predict AI use.
Analysis linking prior productivity measures to reported AI adoption in the Census Bureau survey data; finding of no predictive relationship reported.
high null result The Adoption of Industrial AI in America predictive relationship between prior productivity and AI adoption
The analysis uses a mandatory, purpose-designed Census Bureau survey of approximately 28,500 establishments.
Census Bureau mandatory survey specifically designed for this study; sample size stated as approximately 28,500 establishments.
high null result The Adoption of Industrial AI in America survey sample size / data source
Large language models are routinely used as automated evaluators (to review code, moderate content, or score outputs), often with many items passing through one conversation.
Background/introductory claim in the paper describing common practice; not an experimental result but contextual motivation.
high null result AMEL: Accumulated Message Effects on LLM Judgments prevalence of LLM use as automated evaluators
Position of biased turns does not matter: five biased turns placed anywhere in a 50-turn history produce the same shift.
Follow-up experiment manipulating the positions of biased turns within 50-turn histories and observing equivalent bias magnitudes.
high null result AMEL: Accumulated Message Effects on LLM Judgments dependence of AMEL on the position of biased messages in conversation history
Bias does not grow with context length: 5 prior turns and 50 produce the same shift (Spearman |r| < 0.01; OLS slope p = 0.80).
Correlation and OLS analysis of bias magnitude versus context-length (number of prior turns) reported in the experiments.
high null result AMEL: Accumulated Message Effects on LLM Judgments relationship between context length and magnitude of AMEL
We conducted 75,898 API calls to 11 models from 4 providers (OpenAI, Anthropic, Google, and four open-source models).
Descriptive statement of the experimental scope reported in the paper: total number of API calls and models/providers tested.
high null result AMEL: Accumulated Message Effects on LLM Judgments experimental sample size / scope (number of API calls and models)
When execution is standardized on a cheaper Gemini Flash scaffold (separating planning from execution), a pooled 32-game planner bakeoff is consistent with near-equality (p ≈ 0.821).
Empirical experiment: 32-game planner-only comparison where execution was standardized; reported p-value ≈ 0.821 indicating no significant difference among planners.
high null result Evaluating Large Language Models as Live Strategic Agents: P... planner performance equality (pooled test)
We study this setting in a timed multi-phase Risk environment with explicit victory targets and repeated planning and execution cycles.
Methodological description of the experimental environment used in the paper (timed multi-phase Risk environment with explicit victory targets and repeated cycles).
high null result Evaluating Large Language Models as Live Strategic Agents: P... experimental environment description
Identification of effects uses within-firm variation with firm and city-by-year fixed effects.
Identification strategy reported in abstract: within-firm variation under firm and city-by-year fixed effects.
high null result Toward Sustainable Workforce Development: How AI Reshapes Sk... identification approach / econometric controls
The study measures four skill-category demand shares and their within-category importance from job-description text.
Methodological statement in abstract: measurement of four skill-category demand shares and within-category importance via job-description text.
high null result Toward Sustainable Workforce Development: How AI Reshapes Sk... skill-category demand shares and within-category importance
AI exposure is decomposed into displacement and augmentation components based on task routineness.
Methodological claim in abstract: decomposition of exposure into displacement and augmentation using a routineness criterion for tasks.
high null result Toward Sustainable Workforce Development: How AI Reshapes Sk... decomposed AI exposure measures (displacement vs augmentation)
The authors construct firm-by-year potential AI exposure via semantic matching between AI patent texts and detailed occupation task descriptions.
Method description in abstract: semantic matching of AI patent texts to occupation task descriptions to build firm-by-year exposure.
high null result Toward Sustainable Workforce Development: How AI Reshapes Sk... firm-by-year potential AI exposure (constructed measure)
The study uses approximately 67 million online job postings from two major Chinese recruitment platforms (2019–2024).
Statement in paper abstract describing dataset size and source (job postings from two major Chinese recruitment platforms over 2019–2024).
high null result Toward Sustainable Workforce Development: How AI Reshapes Sk... dataset size and coverage (number of job postings, platforms, years)
The study extends the Technology Acceptance Model (TAM), Dynamic Capabilities Theory, and the Technology-Organisation-Environment (TOE) framework into the qualitative, emerging-economy entrepreneurial context.
Authors' stated theoretical contribution based on mapping thematic results to TAM, Dynamic Capabilities, and TOE frameworks within analysis and discussion sections.
high null result Navigating the Intelligence Frontier: AI Adoption as a Succe... theoretical contribution / framework extension
This study employed an interpretivist, qualitative research design using sixteen in-depth semi-structured interviews with entrepreneurs across fintech, edtech, health-tech, logistics, retail, and SaaS in Delhi/NCR, India, and used Braun & Clarke's (2006) six-phase thematic analysis framework.
Explicit methodological description in the paper: interpretivist qualitative design; n=16 in-depth semi-structured interviews across specified sectors in Delhi/NCR; thematic analysis following Braun & Clarke (2006).
high null result Navigating the Intelligence Frontier: AI Adoption as a Succe... research design / data collection (qualitative interviews)
The study uses a qualitative approach based on 17 expert interviews with employees at startups.
Methods statement in paper specifying qualitative study design and sample size of 17 interviews.
high null result From Prompt To Process: Qualitative Insights On How Genai Us... study methodology and sample
Process-related insights into how GenAI transforms startups are limited.
Authors' literature positioning / gap statement in paper (no empirical metric provided).
high null result From Prompt To Process: Qualitative Insights On How Genai Us... availability of process-related insights in literature
The paper's findings are based on three pre-registered user studies with a combined sample size of N = 2691.
Statement in the paper's abstract reporting three pre-registered user studies and combined N = 2691.
high null result The efficiency-gain illusion: People underestimate the rate ... study sample description
Light AI users perform similarly to matched users who do not use AI.
Same controlled logical reasoning experiment with on-demand AI assistance comparing light AI users to matched non-users (sample size not stated in abstract).
high null result The Impact of AI Usage and Informativeness on Skill Developm... post-AI performance / skill development
We map that space through six interconnected elements: sociotechnical context, decision-making frameworks, human decision participants, AI capabilities, interaction, and holistic evaluation.
The paper's proposed analytical/framework contribution listing six elements (descriptive of the authors' mapping work).
high null result Addressing the Synergy Gap: The Six Elements of the Design S... n/a (framework description)
Most current work treats human-AI combination as an engineering problem and concentrates on interpretability, trust calibration, or interface design.
Authors' characterization of the existing literature and dominant research foci (qualitative literature assessment; no quantitative breakdown provided).
high null result Addressing the Synergy Gap: The Six Elements of the Design S... research focus/themes in human-AI combination literature
We call this persistent shortfall the 'synergy gap.'
Terminology/definition introduced by the authors in the paper (conceptual claim, not an empirical finding).
high null result Addressing the Synergy Gap: The Six Elements of the Design S... n/a (terminology defining a phenomenon)
Agentic payments are distinct from traditional automated systems because they emphasise autonomy, contextual reasoning and adaptability.
Conceptual distinction asserted in the abstract (comparative analysis between agentic payments and traditional automated systems).
high null result AI Agents in Payments: Applications, Risks and Regulations system characteristics (autonomy, contextual reasoning, adaptability)