The Commonplace

Evidence (13827 claims)

Adoption: 8454 claims
Productivity: 7544 claims
Governance: 6789 claims
Human-AI Collaboration: 6327 claims
Org Design: 4126 claims
Innovation: 4058 claims
Labor Markets: 3520 claims
Skills & Training: 2924 claims
Inequality: 2057 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 749 195 97 889 1979
Governance & Regulation 815 391 188 121 1539
Organizational Efficiency 771 189 124 83 1177
Technology Adoption Rate 624 233 123 96 1084
Research Productivity 410 121 56 331 929
Output Quality 466 177 59 47 749
Decision Quality 320 174 75 42 618
Firm Productivity 435 55 88 20 604
AI Safety & Ethics 214 276 65 33 593
Market Structure 178 166 122 24 495
Task Allocation 206 64 70 31 376
Skill Acquisition 165 57 60 17 299
Innovation Output 201 27 41 18 288
Employment Level 105 51 107 13 278
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 116 63 42 11 232
Firm Revenue 149 46 26 3 224
Inequality Measures 44 122 49 6 221
Task Completion Time 169 29 8 12 219
Worker Satisfaction 89 61 20 12 182
Error Rate 69 91 10 2 172
Regulatory Compliance 76 68 14 5 163
Training Effectiveness 92 19 13 19 145
Wages & Compensation 77 36 25 6 144
Automation Exposure 51 54 22 12 142
Team Performance 86 17 27 9 140
Developer Productivity 94 17 14 6 132
Job Displacement 12 80 20 1 113
Hiring & Recruitment 51 7 8 3 69
Skill Obsolescence 5 45 6 1 57
Creative Output 31 16 7 2 57
Social Protection 27 16 8 2 53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
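
As a rough illustration of how the matrix can be read, the sketch below computes each direction's share of an outcome's listed total for three rows copied from the table above. The dictionary layout and the function name `direction_shares` are assumptions made for this example and are not part of the site. Shares are taken against the listed Total because in some rows (for example, Other) the Total exceeds the sum of the four direction columns.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
ROWS = {
    "Firm Productivity": (435, 55, 88, 20, 604),
    "Error Rate": (69, 91, 10, 2, 172),
    "Job Displacement": (12, 80, 20, 1, 113),
}
DIRECTIONS = ("positive", "negative", "mixed", "null")


def direction_shares(row):
    """Return each direction's share of the listed total for one matrix row."""
    *counts, total = row
    return {name: count / total for name, count in zip(DIRECTIONS, counts)}


for outcome, row in ROWS.items():
    shares = direction_shares(row)
    # Round only for display; the underlying shares stay unrounded.
    print(outcome, {name: round(value, 3) for name, value in shares.items()})
```

Read this way, roughly 72% of Firm Productivity claims are positive, while roughly 71% of Job Displacement claims are negative.
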
PRISM successfully identifies and repairs production regressions caused by LLM behavioral drift within a 24-hour detection window.
Reported result in abstract claiming detection and repair of production regressions within 24 hours during deployment.
high positive PRISM: Prompt Reliability via Iterative Simulation and Monit... time-to-detection/repair of production regressions
PRISM achieves 99% production reliability across all evaluated agents.
Reported quantitative outcome in abstract from the three-week evaluation over 35 agents.
PRISM reduces median prompt authoring time from 2 days to under 30 minutes.
Reported quantitative outcome in abstract from the three-week evaluation across the 35 agents.
high positive PRISM: Prompt Reliability via Iterative Simulation and Monit... median prompt authoring time
We demonstrate the GDPR's extraterritorial scope for gaining access to elements such as employment contracts and NDAs that have never been provided to the workers concerned.
Reported legal/empirical demonstration in paper: GDPR requests resulting in access to employment contracts and nondisclosure agreements (NDAs) that workers had not previously received. (Exact number of successful requests not stated in the excerpt.)
high positive Auditing African Content Moderators' Working Conditions by U... access to employment contracts and NDAs via GDPR (extraterritorial application)
We audit the working conditions of content moderators in Kenya and Nigeria employed by business process outsourcing (BPO) companies by using the European General Data Protection Regulation (GDPR).
Method reported in paper: use of GDPR data-subject access / information requests to BPOs and platforms to obtain employment-related documents for content moderators in Kenya and Nigeria. (Sample size / number of requests not stated in the excerpt.)
high positive Auditing African Content Moderators' Working Conditions by U... use of GDPR to access employment-related documents for content moderators
Design principles that promote disagreement and decentralization—contextual grounding, community customization, continual adaptation, and polycentric governance—should be used so oversight is distributed across many legitimate centers rather than centralized in one institutional or moral chokepoint.
Normative design recommendations and governance proposals provided in the paper (argumentative; no empirical governance evaluation reported).
high positive Positive Alignment: Artificial Intelligence for Human Flouri... promotion of disagreement and decentralization in AI oversight/governance
A range of technical directions (e.g., data filtering and upsampling, pre- and post-training, evaluations, collaborative value collection) are relevant for supporting positive alignment across different phases of the LLM and agents lifecycle.
Prescriptive technical recommendations and research directions described by the authors (conceptual proposals, not reported empirical tests).
high positive Positive Alignment: Artificial Intelligence for Human Flouri... applicability of listed technical interventions to LLM/agent lifecycle for posit...
Several existing failures of alignment (e.g., engagement hacking, loss of human autonomy, failures in truth-seeking, low epistemic humility, error correction, lack of diverse viewpoints, and being primarily reactive rather than proactive) may be better addressed through positive alignment, including cultivating virtues and maximizing human flourishing.
Theoretical argument and illustrative examples presented in the paper (no experimental or observational results reported).
high positive Positive Alignment: Artificial Intelligence for Human Flouri... mitigation of specific alignment failures (engagement hacking, autonomy loss, tr...
Positive Alignment is a distinct and necessary agenda within AI alignment research.
Normative argumentation in the paper advocating for a separate research agenda (no empirical validation presented).
high positive Positive Alignment: Artificial Intelligence for Human Flouri... need for a distinct research agenda in alignment
Positive Alignment is the development of AI systems that (i) actively support human and ecological flourishing in a pluralistic, polycentric, context-sensitive, and user-authored way while (ii) remaining safe and cooperative.
Paper's definitional proposal / conceptual framing (normative definition rather than empirical evidence).
high positive Positive Alignment: Artificial Intelligence for Human Flouri... definition and intended properties of 'Positive Alignment' systems
Policy frameworks are necessary to govern verifiable machine intelligence in modern socio-technical infrastructures.
Normative recommendation and policy discussion in the paper; no empirical policy evaluation or legislative case studies are presented in the supplied text.
high positive Optimizing Process Based Reward Models through Reinforcement... existence/need for governance and regulation
Process-based supervision has broader implications for algorithmic fairness and can reduce black-box opacity.
High-level discussion in the paper linking process-verifiability to fairness and reduced opacity; no empirical fairness audits or quantitative fairness metrics reported in the provided text.
high positive Optimizing Process Based Reward Models through Reinforcement... algorithmic fairness / model opacity
Integrating reinforcement learning with process-oriented feedback can foster a more transparent AI ecosystem where the path to a conclusion is as scrutinized as the conclusion itself.
Conceptual claim and proposed benefit in the paper; presented as an argument rather than supported by empirical transparency or interpretability studies in the supplied text.
high positive Optimizing Process Based Reward Models through Reinforcement... transparency / interpretability of model reasoning
Process-based supervision significantly improves the reliability of models in high-stakes domains such as law, medicine, and engineering.
Asserted by the authors as an advantage of PRMs for high-stakes applications; presented as argumentation rather than backed by reported empirical trials or case-study sample sizes in the provided text.
high positive Optimizing Process Based Reward Models through Reinforcement... model reliability in high-stakes domains
Optimizing PRMs through reinforcement learning enhances the verifiability and robustness of multi-step reasoning in large-scale model architectures.
Central argumentative claim of the paper (theoretical proposal and conceptual analysis); no experimental results or quantitative evaluation provided in the text supplied.
high positive Optimizing Process Based Reward Models through Reinforcement... verifiability and robustness of multi-step reasoning
Process-Based Reward Models (PRMs) assign value to each distinct stage of a reasoning chain, providing a more granular signal for training than outcome-only approaches.
Methodological description and conceptual argument in the paper; described as a design/approach rather than empirically validated with data.
high positive Optimizing Process Based Reward Models through Reinforcement... training signal granularity / training effectiveness
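
To make the contrast in the claim above concrete, here is a minimal sketch of an outcome-only signal versus a per-step signal over the same reasoning chain. It is not the paper's method: `toy_step_scorer` is a hypothetical stand-in for a learned process reward model, and the arithmetic chain is invented for illustration.

```python
from typing import Callable, List


def outcome_only_rewards(steps: List[str], final_correct: bool) -> List[float]:
    """Outcome-only supervision: a single signal attached to the last step."""
    rewards = [0.0] * len(steps)
    if steps:
        rewards[-1] = 1.0 if final_correct else 0.0
    return rewards


def process_rewards(steps: List[str], score_step: Callable[[str], float]) -> List[float]:
    """Process-based supervision: every step receives its own score."""
    return [score_step(step) for step in steps]


def toy_step_scorer(step: str) -> float:
    """Hypothetical scorer (assumption): checks steps of the form 'a + b = c'."""
    left, right = step.split("=")
    a, b = left.split("+")
    return 1.0 if int(a) + int(b) == int(right) else 0.0


# Invented chain summing 2 + 3 + 4 + 1; the second step contains the error.
chain = ["2 + 3 = 5", "5 + 4 = 10", "10 + 1 = 11"]

print(outcome_only_rewards(chain, final_correct=False))  # no step is singled out
print(process_rewards(chain, toy_step_scorer))           # the faulty step scores 0.0
```

The outcome-only signal says only that the chain failed, whereas the per-step signal localizes the failure to the second step, which is the granularity the claim refers to.
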
Overall, the study provides a cross-sectoral empirical foundation for understanding how budget flexibility, governance, and technology interact to support resilient financial systems in uncertain economic environments.
Synthesis statement based on the paper's cross-sectoral comparative analysis combining firm 10-K data (four firms), Open Budget Survey, OECD database, GAO reports, and the Flexibility Index.
high positive Budgeting for Agility: A Cross-Sectoral Analysis of Fiscal F... resilience of financial systems to uncertainty
In the public sector, systems characterized by strong transparency frameworks and Medium-Term Expenditure Frameworks demonstrate higher alignment between planned and actual expenditures.
Cross-sectional analysis using Open Budget Survey 2023, OECD Budget Practices Database, and U.S. GAO oversight reports linking transparency and MTEFs to alignment between planned and actual expenditures.
high positive Budgeting for Agility: A Cross-Sectoral Analysis of Fiscal F... alignment between planned and actual expenditures (forecast/policy alignment)
Firms with decentralized budgeting structures and embedded predictive analytics exhibit lower forecast deviations and faster resource reallocation.
Comparative empirical analysis of four large firms using Form 10-K data (2019–2023) and the Flexibility Index to relate decentralization and AI integration to forecast deviations and reallocation speed.
high positive Budgeting for Agility: A Cross-Sectoral Analysis of Fiscal F... forecast deviation (predictive alignment) and speed of resource reallocation
Methodologically, the study demonstrates how expert reasoning can be operationalized as a benchmark for evaluating AI systems in urban infrastructure contexts, addressing gaps in empirical assessment and governance tools.
Study design: creation of Delphi-derived rubric from 20 experts and its use as an evaluation benchmark for six LLMs; reported as a methodological contribution.
high positive Governance risks of AI reasoning in urban infrastructure thr... feasibility of operationalizing expert reasoning as evaluation benchmark
The Delphi process elicited and refined expert reasoning criteria, producing a rubric that emphasized public safety, regulatory compliance, contextual judgment, financial stewardship, and system reliability.
Method: Delphi process with 20 infrastructure professionals that generated and refined reasoning criteria; resulting rubric content reported in paper.
high positive Governance risks of AI reasoning in urban infrastructure thr... content/themes of the expert-derived rubric
Policymakers should combine support for technological development with strategic investments in finance, trade integration, and public infrastructure to maximize AI's economic benefits and transform its potential into sustainable and inclusive growth.
Policy recommendation derived from the empirical findings (positive AI effects and positive interactions with financial innovation, trade openness, and government consumption) reported for 19 G20 countries (2005–2023) using GMM.
high positive Artificial intelligence and economic growth in G20 economies... economic growth (implied)
The interaction between AI and government final consumption expenditure helps strengthen economic growth by improving public infrastructure, institutional quality, and capacity to leverage new technologies.
GMM interaction specifications using panel data for 19 G20 countries (2005–2023); reported AI × government final consumption expenditure interaction coefficient is positive and statistically significant, with interpretation linking it to public infrastructure and institutional capacity.
The interaction between AI and trade openness is positive and significant, underscoring the role of international trade in technological diffusion and competitiveness to boost growth.
GMM interaction models on panel data (19 G20 countries, 2005–2023); reported AI × trade openness interaction coefficient is positive and statistically significant.
The interaction between AI and financial innovation has a positive and significant impact on economic growth, indicating that innovative finance mediates AI's technological potential into tangible economic gains.
GMM models with interaction terms using panel data of 19 G20 countries (2005–2023); reported AI × financial innovation interaction coefficient is positive and statistically significant.
AI-related innovation has a positive and significant effect on economic growth (linear model, GMM).
Panel analysis of 19 G20 countries (2005–2023) using the Generalized Method of Moments (GMM) linear model; reported positive and statistically significant coefficient for AI-related innovation.
In an empirical study of the Community Health Centers rollout, estimated spillovers account for a substantial share of the effect on older-adult mortality.
Empirical application reported in the paper applying the proposed methods to the Community Health Centers rollout; estimated spillover component contributes substantially to the measured effect on older-adult mortality (results from observational data analysis).
Monte Carlo simulations show the proposed estimators have small bias for these effects and the associated confidence intervals have coverage close to the nominal level.
Monte Carlo simulation evidence reported in the paper indicating small bias of the proposed estimators and coverage of confidence intervals close to nominal in the simulated settings.
high positive Identification and Estimation of Staggered Difference-in-Dif... estimator bias and confidence interval coverage
Synthetic scenarios in the paper illustrate that the revised metric distinguishes between frequent low-leverage use, semantically repetitive prompting, and more autonomous, higher-consequence AI-assisted work.
Paper includes synthetic scenario simulations/illustrations demonstrating metric behavior across different usage patterns (synthetic examples; no real-world sample reported).
high positive Intelligence Impact Quotient (IIQ): A Framework for Measurin... ability of the metric to discriminate types of AI use
The authors derive sub-daily update rules and a bounded interpretation layer for estimated efficiency and financial impact from the IIQ metric.
Analytic derivation in the methods: paper presents update rules (sub-daily) and an interpretation layer mapping IIQ to estimated efficiency and financial impact (theoretical derivation / worked examples). No empirical validation sample reported.
high positive Intelligence Impact Quotient (IIQ): A Framework for Measurin... estimated efficiency and financial impact
The formulation produces a raw Intelligence Adoption Index (IAI) and a normalized 0-1000 IIQ index for comparison between heterogeneous users and units.
Methodological description: authors define a raw IAI and describe normalization to a 0–1000 IIQ scale for comparability (model/specification). No empirical sample reported.
high positive Intelligence Impact Quotient (IIQ): A Framework for Measurin... normalized adoption/index score across users/units
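
The excerpt does not state how the raw IAI is mapped onto the 0-1000 IIQ scale. Purely as an illustration of what such a normalization could look like, the sketch below uses clipped min-max scaling; the function name `normalize_to_iiq`, the reference bounds `raw_min` and `raw_max`, and the scaling rule itself are all assumptions, not the authors' specification.

```python
def normalize_to_iiq(raw_iai: float, raw_min: float, raw_max: float) -> float:
    """Map a raw score onto 0-1000 using clipped min-max scaling (assumed rule)."""
    if raw_max <= raw_min:
        raise ValueError("raw_max must be greater than raw_min")
    fraction = (raw_iai - raw_min) / (raw_max - raw_min)
    # Clip so that scores outside the reference range stay within 0-1000.
    return 1000.0 * min(1.0, max(0.0, fraction))


# Hypothetical raw scores for three units, scaled against an assumed 0-10 range.
for raw in (-1.0, 5.0, 12.0):
    print(raw, normalize_to_iiq(raw, raw_min=0.0, raw_max=10.0))
```

A shared bounded scale of this kind is what would allow heterogeneous users and units to be compared on one index.
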
IIQ combines a novelty-weighted, time-decayed token stock with usage frequency, a grace-period recency gate, organizational leverage, task complexity, and autonomy to form its measurement.
Methodological formulation in the paper: component-level specification of the IIQ metric (theoretical specification / algorithmic description). No empirical validation sample reported.
high positive Intelligence Impact Quotient (IIQ): A Framework for Measurin... operationalization of AI usage (components driving the metric)
The Intelligence Impact Quotient (IIQ) is a composite metric intended to quantify the depth to which AI systems are integrated into organizational work and their impact.
Paper framing and definition: the authors introduce IIQ as a composite metric and describe its purpose as quantifying AI integration depth and impact (conceptual/methodological description). No empirical sample reported.
high positive Intelligence Impact Quotient (IIQ): A Framework for Measurin... depth of AI integration into work / AI impact
Experiments show consistent advantages in viewer engagement.
Reported experimental comparison vs named baselines; claim of consistent advantage in viewer engagement without numeric effect size provided in the excerpt.
Experiments show consistent advantages in tactfulness.
Reported experimental comparison vs named baselines; claim of consistent advantage in tactfulness without numeric effect size provided in the excerpt.
Experiments against GPT-5.4, Claude Sonnet 4.6, Gemini 3.1 Pro, and other baselines demonstrate gains of 18% on factual correctness.
Reported experimental comparison vs named baselines; specific numeric improvement stated (18% gain on factual correctness). Evaluation dataset or sample size not provided in the excerpt.
Experiments against GPT-5.4, Claude Sonnet 4.6, Gemini 3.1 Pro, and other baselines demonstrate gains of 23% on informativeness.
Reported experimental comparison vs named baselines; specific numeric improvement stated (23% gain on informativeness). Evaluation dataset or sample size not provided in the excerpt.
We fine-tune a large language model on this data to deliver empathetic, commercially oriented responses, adapting to viewer intent through empathetic amplification, evidence-backed rebuttal, and humor-mediated deflection.
Methodological contribution: fine-tuning an LLM on the collected annotated data, described in the paper.
high positive VerbalValue: A Socially Intelligent Virtual Host for Sales-D... ability to produce empathetic, commercially oriented responses
We collect and annotate 1,475 live-commerce interactions spanning diverse viewer intents.
Dataset creation reported in the methods: explicitly states 1,475 annotated live-commerce interactions.
high positive VerbalValue: A Socially Intelligent Virtual Host for Sales-D... size of annotated dataset
We construct a domain knowledge base of product specifications and a curated sales terminology lexicon that anchor product-related responses in verified expertise.
Methodological contribution described in the paper: construction of a domain knowledge base and curated sales terminology lexicon.
high positive VerbalValue: A Socially Intelligent Virtual Host for Sales-D... availability of domain knowledge and sales lexicon (artifact creation)
A skilled live-commerce host is not merely a narrator, but a sales agent who converts viewer curiosity into purchase intent through expert product knowledge, emotionally intelligent response tactics, and entertainment that serves as a vehicle for product exposure.
Conceptual description in the paper's introduction; no empirical data or experimental method cited in the excerpt.
high positive VerbalValue: A Socially Intelligent Virtual Host for Sales-D... purchase intent / sales conversion
The document contributes to ongoing G7 and OECD efforts to promote the diffusion of innovative, trustworthy, and productivity-enhancing AI in line with the OECD AI Principles.
Descriptive claim about the paper's intended contribution to G7/OECD efforts and alignment with OECD AI Principles (self-declared by the paper).
high positive Einführung von KI in kleinen und mittleren Unternehmen policy-aligned promotion of trustworthy, productivity-enhancing AI
The findings underscore that governments should support strategies that accelerate AI adoption in small and medium-sized enterprises (SMEs) and foster a digital transformation that benefits everyone.
Policy recommendation based on the paper's synthesis of data analysis and case studies; presented as the paper's conclusion (no causal estimate provided in excerpt).
high positive Einführung von KI in kleinen und mittleren Unternehmen effectiveness of government strategies to accelerate AI adoption in SMEs /...
The document introduces a taxonomy of AI-using SMEs based on digital maturity, complexity of use, and scope of application, intended to support policymaking.
Descriptive statement that the paper develops a taxonomy (method: taxonomy construction based on those three dimensions); presented as part of the paper's contributions (no empirical validation details in excerpt).
high positive Einführung von KI in kleinen und mittleren Unternehmen categorization/taxonomy of AI-using SMEs
This discussion paper was prepared by the OECD Secretariat at the request of the G7 Presidency to provide background material for the G7's discussions on a blueprint for AI adoption in SMEs.
Descriptive statement of the paper's provenance and purpose (administrative/factual about document preparation).
high positive Einführung von KI in kleinen und mittleren Unternehmen preparation and purpose of the discussion paper
Under Canada's 2025 G7 Presidency, accelerating AI adoption in SMEs was declared a top priority.
Factual statement about G7 policy priorities as reported by the paper (administrative/policy fact reported by OECD secretariat).
high positive Einführung von KI in kleinen und mittleren Unternehmen political/policy prioritization of AI adoption in SMEs
Artificial intelligence (AI) is a promising approach to boosting productivity and innovation in firms, particularly small and medium-sized enterprises (SMEs).
Authoritative statement in the paper's summary; based on literature review and general argumentation rather than a specific empirical test reported in this excerpt.
high positive Einführung von KI in kleinen und mittleren Unternehmen productivity and innovation in firms (particularly SMEs)
The framework contributes to improving understanding of enterprise coordination and governance under constrained legal conditions and offers a basis for future analytical and empirical research.
Author-stated contribution of the paper based on the developed theoretical framework; positioned as foundation for future work.
high positive RegTech-enabled governance of sanctions-safe enterprise ecos... conceptual contribution to understanding enterprise coordination and governance
The analysis identifies theoretical conditions under which such governance may support verifiable integrity, adaptive compliance, and access to formal markets.
Theoretical conditions derived from the review and theory synthesis (no empirical testing reported in this paper).
high positive RegTech-enabled governance of sanctions-safe enterprise ecos... verifiable integrity, adaptive compliance, access to formal markets
The study develops a theory-based framework explaining how RegTech-supported governance may, under specified conditions, enable sanctions-safe enterprise ecosystems during post-conflict reconstruction.
Primary contribution of the paper: theory synthesis built from integrative review of five literature streams (RegTech, sanctions compliance, institutional voids, supply-chain governance, algorithmic accountability).
high positive RegTech-enabled governance of sanctions-safe enterprise ecos... potential for RegTech-supported governance to enable sanctions-safe enterprise e...