The Commonplace

Evidence (13827 claims)

Adoption
8454 claims
Productivity
7544 claims
Governance
6789 claims
Human-AI Collaboration
6327 claims
Org Design
4126 claims
Innovation
4058 claims
Labor Markets
3520 claims
Skills & Training
2924 claims
Inequality
2057 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 749 195 97 889 1979
Governance & Regulation 815 391 188 121 1539
Organizational Efficiency 771 189 124 83 1177
Technology Adoption Rate 624 233 123 96 1084
Research Productivity 410 121 56 331 929
Output Quality 466 177 59 47 749
Decision Quality 320 174 75 42 618
Firm Productivity 435 55 88 20 604
AI Safety & Ethics 214 276 65 33 593
Market Structure 178 166 122 24 495
Task Allocation 206 64 70 31 376
Skill Acquisition 165 57 60 17 299
Innovation Output 201 27 41 18 288
Employment Level 105 51 107 13 278
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 116 63 42 11 232
Firm Revenue 149 46 26 3 224
Inequality Measures 44 122 49 6 221
Task Completion Time 169 29 8 12 219
Worker Satisfaction 89 61 20 12 182
Error Rate 69 91 10 2 172
Regulatory Compliance 76 68 14 5 163
Training Effectiveness 92 19 13 19 145
Wages & Compensation 77 36 25 6 144
Automation Exposure 51 54 22 12 142
Team Performance 86 17 27 9 140
Developer Productivity 94 17 14 6 132
Job Displacement 12 80 20 1 113
Hiring & Recruitment 51 7 8 3 69
Skill Obsolescence 5 45 6 1 57
Creative Output 31 16 7 2 57
Social Protection 27 16 8 2 53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
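The matrix is easier to compare across outcomes when each row is turned into shares. The sketch below is a minimal illustration using four rows copied from the matrix; the function names are hypothetical. It assumes that shares should be taken against the stated Total, since for some rows (for example Firm Productivity) the four direction columns sum to less than the Total.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
ROWS = {
    "Firm Productivity": (435, 55, 88, 20, 604),
    "Error Rate": (69, 91, 10, 2, 172),
    "Job Displacement": (12, 80, 20, 1, 113),
    "Skill Obsolescence": (5, 45, 6, 1, 57),
}


def direction_shares(row):
    """Return each direction's share of the row's stated Total (an assumed denominator)."""
    positive, negative, mixed, null, total = row
    return {
        "positive": positive / total,
        "negative": negative / total,
        "mixed": mixed / total,
        "null": null / total,
    }


def unclassified(row):
    """Claims counted in Total but not in any of the four direction columns."""
    *directions, total = row
    return total - sum(directions)


for outcome, row in ROWS.items():
    shares = direction_shares(row)
    print(f"{outcome}: positive {shares['positive']:.1%}, "
          f"negative {shares['negative']:.1%}, unclassified {unclassified(row)}")
```

Read this way, Job Displacement and Skill Obsolescence are the rows where negative findings dominate, while Firm Productivity is mostly positive.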
European institutions (in particular the European AI Office) should issue guidance on how systems designed for sustained social or emotional interaction should be assessed in the implementation of the AI Act.
Policy recommendation contained in the text; prescriptive argument rather than an empirical finding; no supporting data or empirical evaluation provided.
high positive Governing Relational AI: China’s Regulation of Anthropomorph... issuance of regulatory guidance by European institutions
Existing regulatory frameworks will need to consider risks that arise not only from system outputs but also from longer-term patterns of human–AI interaction.
Normative recommendation based on the document's argument that conversational AI generates risks through sustained interaction; no empirical method or data reported.
high positive Governing Relational AI: China’s Regulation of Anthropomorph... scope of regulatory risk assessment (outputs vs. long-term interaction patterns)
The paper proposes five evaluation dimensions for AutoResearch systems: novelty, validity, impact, reliability, and provenance.
Paper explicitly proposes these five dimensions as an evaluation rubric; conceptual proposal.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... n/a (evaluation framework)
The field can be organized around five workflow conditions: literature and research grounding; hypothesis formation and planning; experimentation and tool use; feedback, validation, and review; and reporting and knowledge communication.
Authors propose this five-condition organizational framework as part of their survey and synthesis; conceptual contribution.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... n/a (framework/organizational taxonomy)
Vibe Research denotes the human-steered region of prompt-based assistance and human-verified execution within AutoResearch.
Paper-introduced terminology and conceptual delineation of a sub-region of the AutoResearch spectrum; definitional statement.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... n/a (terminology/definition)
AutoResearch is defined as the developmental spectrum of AI-powered scientific workflow automation.
Paper provides an explicit definitional framing (terminology introduced by authors); conceptual contribution rather than empirical finding.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... n/a (terminology/definition)
This shift marks a transition from task-level AI for science to workflow-level research automation.
Conceptual argument backed by literature survey and examples of systems that coordinate multiple research tasks; no single quantitative study reported.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... degree of automation along research workflows (task-level vs workflow-level)
Scientific research is being reshaped by AI systems that move beyond isolated assistance toward longer-horizon workflows spanning literature grounding, hypothesis generation, experimentation, validation, reporting, and revision.
Survey / conceptual synthesis of recent AI research systems and literature; paper presents this as an observed trend rather than reporting original empirical measurements.
high positive AutoResearch AI: Towards AI-Powered Research Automation for ... extent of AI integration across research workflows (literature grounding, hypoth...
XWind shows consistent gains across workload types, load levels, and GPU generations.
Reported experimental results spanning multiple workload types, different load levels, and various GPU generations (details in main paper); abstract states consistency of gains.
high positive XWind: A Cross-site Router for Large Language Model Inferenc... consistency of latency/performance gains across workloads, loads, and GPU genera...
XWind reduces P99 end-to-end latency by up to 98% over baselines such as power-capping and GPU idling.
Experimental results on the 64-GPU A100 testbed with emulated wind sites and Azure traces; comparison against baseline strategies including power-capping and GPU idling.
XWind reduces P99 end-to-end latency by up to 52% over the strongest contender (also our idea).
Experimental results on the 64-GPU A100 testbed with emulated wind sites and Azure traces; comparison against a 'strongest contender' baseline (described as another idea from the authors).
We build XWind, a lightweight, reactive, and workload-agnostic AI inference router that uses only real-time signals (inference latency, KV-cache utilization, and queue depth) to dynamically configure sites and distribute requests under variable wind power.
System implementation described in paper; design specification lists the three real-time signals used.
high positive XWind: A Cross-site Router for Large Language Model Inferenc... ability to configure sites and distribute inference requests using only specifie...
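To make the idea of routing on real-time signals concrete, here is a minimal sketch. It is not XWind's actual policy, which the claim does not specify. The `SiteSignals` type, the `load_score` rule (take the worst of the three normalized signals), and the normalization constants `latency_budget_ms` and `max_queue` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SiteSignals:
    name: str
    latency_ms: float     # recent inference latency at the site
    kv_cache_util: float  # fraction of KV cache in use, 0.0 to 1.0
    queue_depth: int      # requests currently waiting at the site


def load_score(site, latency_budget_ms=1000.0, max_queue=32):
    """Hypothetical bottleneck score; lower is better. Not XWind's actual rule."""
    return max(
        site.latency_ms / latency_budget_ms,
        site.kv_cache_util,
        site.queue_depth / max_queue,
    )


def pick_site(sites):
    """Send the next request to the site with the lowest bottleneck score."""
    return min(sites, key=load_score)


sites = [
    SiteSignals("site-a", latency_ms=200.0, kv_cache_util=0.9, queue_depth=4),
    SiteSignals("site-b", latency_ms=400.0, kv_cache_util=0.5, queue_depth=8),
]
print(pick_site(sites).name)
```

Taking the maximum rather than a weighted sum means a site that is saturated on any one signal is avoided even if the other two look healthy; a weighted sum would be an equally plausible choice.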
Site-wise right-sizing combined with spatial complementarity of wind energy keeps aggregate fleet utilization on par with traditional deployments.
Feasibility/analytical evaluation in the paper (presumably simulations/analysis of site sizing and spatial complementarity); specific methods/details not in abstract.
high positive XWind: A Cross-site Router for Large Language Model Inferenc... aggregate fleet utilization
Our feasibility analysis shows that 890+ GW of wind capacity lies within 50 ms network round trip time of Azure data centers.
Feasibility analysis mapping wind capacity to Azure data center network latency; result reported as aggregate capacity (890+ GW).
high positive XWind: A Cross-site Router for Large Language Model Inferenc... wind capacity within 50 ms RTT of Azure data centers
AI Greenferencing brings modular AI compute to renewable energy sources (focusing on wind), allowing AI footprint expansion, generating local behind-the-meter demand for renewable sites, and helping ease the growing strain on power utilities.
Conceptual/proposed deployment model described in the paper; feasibility analysis described elsewhere in the paper supports feasibility but exact empirical backing for all claimed benefits not specified in abstract.
high positive XWind: A Cross-site Router for Large Language Model Inferenc... local demand generation at renewable sites and reduction in grid strain
CHRONOS achieves a total privacy loss of epsilon = 4.25 at delta = 10^-6 under zCDP composition in the reported experiments.
Reported privacy accounting result in experimental section (zCDP composition).
high positive CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolv... privacy budget (epsilon, delta)
Measured latency for CHRONOS is 161 ms.
Reported experimental latency metric in paper.
Across the benchmarks CHRONOS attains 2.74 queries per second throughput.
Reported experimental throughput metric in paper.
high positive CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolv... throughput (queries per second)
The paper reports empirical results across four benchmarks showing CHRONOS achieves a recall@10 of 0.937.
Experimental evaluation across four benchmarks reported in paper (four benchmarks stated).
The paper includes a scalability analysis for 500 sellers (multi-epoch settlement).
Scalability analysis reported in paper explicitly referencing 500 sellers.
high positive CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolv... scalability with respect to number of sellers
CHRONOS releases a privatized affinity matrix per epoch using the Gaussian mechanism; all retrieval and ranking are post-processing and thus incur no extra privacy cost.
System design and privacy mechanism description in paper (Gaussian mechanism; post-processing argument).
high positive CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolv... privacy accounting / composition (privacy cost per epoch and downstream operatio...
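The privacy accounting described here can be sketched with the standard zCDP facts: a Gaussian mechanism with L2 sensitivity Δ and noise scale σ satisfies ρ-zCDP with ρ = Δ²/(2σ²), ρ adds up across releases, and ρ-zCDP implies (ρ + 2·sqrt(ρ·ln(1/δ)), δ)-DP. The code below is an illustrative sketch only. The paper's epoch count, sensitivity, and noise scale are not given in this entry, and its accounting may use a tighter conversion, so the function names and the back-solved ρ are assumptions.

```python
import math


def gaussian_rho(sensitivity, sigma):
    """zCDP parameter of one Gaussian release with the given L2 sensitivity."""
    return sensitivity ** 2 / (2 * sigma ** 2)


def compose(rhos):
    """zCDP composes additively across per-epoch releases."""
    return sum(rhos)


def zcdp_to_eps(rho, delta):
    """Standard conversion from rho-zCDP to (eps, delta)-DP."""
    return rho + 2 * math.sqrt(rho * math.log(1 / delta))


def rho_for_eps(eps, delta):
    """Invert the conversion: the rho whose converted epsilon equals eps."""
    log_term = math.log(1 / delta)
    root = math.sqrt(log_term + eps) - math.sqrt(log_term)
    return root ** 2


# Total rho budget that would correspond to the reported eps = 4.25 at delta = 1e-6,
# assuming the standard conversion above.
total_rho = rho_for_eps(4.25, 1e-6)
print(total_rho, zcdp_to_eps(total_rho, 1e-6))
```

Because retrieval and ranking only post-process the released matrix, they add nothing to the composed ρ; only the per-epoch Gaussian releases enter `compose`.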
Layer three uses EXP3-IX to achieve O(sqrt(T log T)) regret while enforcing (epsilon, delta)-differential privacy via moments accounting.
Theoretical regret bound and privacy-preserving algorithmic design described in paper (EXP3-IX with moments accounting).
high positive CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolv... regret of the online allocation algorithm
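For readers unfamiliar with EXP3-IX, the following is a minimal sketch of the plain, non-private algorithm: exponential weights with an implicit-exploration loss estimate. It omits the differential-privacy layer and moments accounting that the paper describes, and the toy loss function and parameter choices (learning rate eta, exploration gamma = eta / 2) are assumptions for illustration rather than the paper's settings.

```python
import math
import random


def exp3_ix(loss_fn, n_arms, horizon, rng):
    """Run non-private EXP3-IX; loss_fn(t, arm) must return a loss in [0, 1]."""
    eta = math.sqrt(2 * math.log(n_arms) / (n_arms * horizon))
    gamma = eta / 2  # implicit-exploration parameter
    cum_est = [0.0] * n_arms  # cumulative estimated losses per arm
    total_loss = 0.0
    probs = [1.0 / n_arms] * n_arms
    for t in range(horizon):
        low = min(cum_est)  # shift exponents for numerical stability
        weights = [math.exp(-eta * (c - low)) for c in cum_est]
        norm = sum(weights)
        probs = [w / norm for w in weights]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        loss = loss_fn(t, arm)
        total_loss += loss
        # IX estimate: divide by (probability + gamma) instead of probability alone.
        cum_est[arm] += loss / (probs[arm] + gamma)
    return probs, total_loss


def toy_loss(t, arm):
    """Hypothetical environment: arm 0 is consistently the best choice."""
    return 0.1 if arm == 0 else 0.9


final_probs, total = exp3_ix(toy_loss, n_arms=3, horizon=5000, rng=random.Random(0))
print(final_probs, total / 5000)
```

The extra gamma in the denominator biases the loss estimates slightly downward but caps their size, which is what gives EXP3-IX its high-probability regret guarantees.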
Layer two conditions Shapley valuation on detected changepoints and provides finite-sample error guarantees under noise.
Methodological description plus finite-sample theoretical guarantees under noise presented in paper.
high positive CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolv... accuracy/error of Shapley-based valuations
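As a rough illustration of the valuation idea, the sketch below computes exact Shapley values for a small set of sellers and repeats the computation per segment between changepoints. It is not the paper's estimator: the per-segment split, the toy utility, and the names `shapley_values` and `segment_shapley` are assumptions, and exact enumeration only scales to a handful of players.

```python
from itertools import permutations


def shapley_values(players, utility):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += utility(with_p) - utility(coalition)
            coalition = with_p
    return {p: v / len(orderings) for p, v in totals.items()}


def segment_shapley(players, utility_by_segment):
    """Hypothetical conditioning: one valuation per segment between changepoints."""
    return {seg: shapley_values(players, u) for seg, u in utility_by_segment.items()}


BASE = {"a": 1.0, "b": 2.0, "c": 3.0}


def toy_utility(coalition):
    """Toy utility: additive base values plus a bonus of 3 when a and b cooperate."""
    bonus = 3.0 if {"a", "b"} <= coalition else 0.0
    return sum(BASE[p] for p in coalition) + bonus


values = shapley_values(list(BASE), toy_utility)
print(values)
```

In the toy game the cooperation bonus is split evenly between `a` and `b`, and the values sum to the utility of the full coalition, which is the efficiency property of the Shapley value.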
The monotone-envelope guarantee in layer one reduces bound looseness to 1.8 to 3.2 times observed loss.
Empirical/theoretical comparison of bound looseness vs. observed loss reported in paper (range reported as 1.8–3.2×).
high positive CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolv... tightness of recall-loss bound (bound looseness ratio)
Layer one of CHRONOS applies neural-ODE temporal decay to shortcut edges and provides a per-query expected recall-loss bound of O(Pq lambda delta t).
Theoretical bound and method description (neural-ODE temporal decay) presented in paper; no empirical sample size stated for the bound itself.
high positive CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolv... recall (expected recall-loss per query)
The AI-driven econometric approach outperforms traditional approaches by delivering more accurate forecasting and more timely policy recommendations.
Explicit claim in the paper that the approach outperforms traditional methods by producing more accurate forecasts and timelier recommendations; the excerpt contains no quantitative comparison, performance metrics, statistical tests, or sample sizes.
high positive AI-Augmented Econometrics: Transforming Labor Market Analysi... forecasting accuracy and timeliness of policy recommendations
The framework relies on distributed data processing and MLOps pipelines to enable system scalability and continuous model improvement.
System architecture description in the paper stating use of distributed processing and MLOps pipelines; no performance benchmarks, scalability tests, or deployment metrics are reported in the excerpt.
high positive AI-Augmented Econometrics: Transforming Labor Market Analysi... system scalability and continuous model improvement
The proposed approach uses ensemble models and deep learning combined with econometric methods to ensure both model interpretability and robust findings.
Methodological claim in the paper describing use of ensemble and deep learning models integrated with econometric techniques; no reported evaluation metrics, interpretability measures, or robustness tests in the provided text.
high positive AI-Augmented Econometrics: Transforming Labor Market Analysi... model interpretability and robustness of findings
Combining structured economic indicators with unstructured data from job postings and skill descriptions provides a real-time picture of employment patterns, wage changes, and skill requirements.
Paper describes integrating structured and unstructured data sources (economic indicators, job postings, skill descriptions) to produce a real-time view; no empirical metrics, evaluation sample, or quantitative validation given in the excerpt.
high positive AI-Augmented Econometrics: Transforming Labor Market Analysi... employment patterns, wage changes, skill requirements (real-time measurement)
An AI-based econometric system that incorporates machine learning algorithms and extensible data processing can enhance labor market predictions and research compared with traditional econometric models.
Methodological description in the paper stating development of an AI-based econometric system that incorporates ML and extensible data processing; no sample size or empirical evaluation statistics provided in the text excerpt.
high positive AI-Augmented Econometrics: Transforming Labor Market Analysi... labor market predictions / research quality
The study advances multilevel propositions and outlines a research agenda for examining legitimacy in hybrid human–AI decision systems.
Paper presents multilevel theoretical propositions and a suggested agenda for future empirical research (conceptual contribution; no empirical validation reported).
high positive Decision Legitimacy in AI-Enabled Organizations: A Multileve... presence of multilevel propositions and proposed research directions
Human judgment remains essential for contextual interpretation and accountability in hybrid human–AI decision systems.
Conceptual claim advanced through theoretical argumentation and literature references in the paper (no empirical sample reported).
high positive Decision Legitimacy in AI-Enabled Organizations: A Multileve... role of human judgment in contextual interpretation and accountability
Legitimacy of AI-enabled decisions depends on transparency, explainability, and perceived fairness.
Conceptual argument and literature synthesis in the paper emphasizing transparency, explainability, and fairness as determinants (no empirical sample reported).
high positive Decision Legitimacy in AI-Enabled Organizations: A Multileve... decision legitimacy as a function of transparency, explainability, perceived fai...
AI enhances efficiency and consistency in organizational decision-making.
Theoretical claim supported by referenced literature and conceptual argumentation within the paper (no empirical test or sample reported).
high positive Decision Legitimacy in AI-Enabled Organizations: A Multileve... efficiency and consistency of decisions
Procedural, distributive, and cognitive legitimacy are key dimensions of decision legitimacy in AI-enabled organizations.
Conceptual development in the paper drawing on institutional theory, socio-technical systems, and behavioral decision-making; literature synthesis and theoretical argumentation (no empirical sample reported).
high positive Decision Legitimacy in AI-Enabled Organizations: A Multileve... procedural legitimacy; distributive legitimacy; cognitive legitimacy
Fitted on Pythia models of up to 6.9B parameters trained on up to 180B tokens, the Shannon Scaling Law predicts an unseen 12B model up to 307B tokens at pooled R^2=0.847, while monotonic baselines collapse.
Specific extrapolation experiment reported: model fit trained on models <=6.9B and <=180B tokens (Pythia), then used to predict behavior of an unseen 12B model up to 307B tokens; pooled R^2 reported as 0.847 and monotonic baselines reported to fail.
high positive LLMs as Noisy Channels: A Shannon Perspective on Model Capac... extrapolative predictive performance measured by pooled R^2 when predicting loss...
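The entry does not define "pooled R^2". One common reading, assumed here, is a single coefficient of determination computed over all held-out points from all curves concatenated together, rather than an average of per-curve scores. The sketch below implements that reading; the function name and the toy numbers are hypothetical and are not the paper's data.

```python
def pooled_r2(curves):
    """R^2 over all points pooled together.

    curves: list of (observed, predicted) pairs of equal-length sequences.
    """
    obs = [y for observed, _ in curves for y in observed]
    pred = [p for _, predicted in curves for p in predicted]
    mean = sum(obs) / len(obs)
    ss_res = sum((y - p) ** 2 for y, p in zip(obs, pred))  # residual sum of squares
    ss_tot = sum((y - mean) ** 2 for y in obs)              # total sum of squares
    return 1.0 - ss_res / ss_tot


# Toy check: a perfect predictor scores 1.0, and predicting the pooled mean scores 0.0.
perfect = [([1.0, 2.0], [1.0, 2.0]), ([3.0, 4.0], [3.0, 4.0])]
mean_only = [([1.0, 2.0], [2.5, 2.5]), ([3.0, 4.0], [2.5, 2.5])]
print(pooled_r2(perfect), pooled_r2(mean_only))
```

Under this definition a baseline can score below zero when extrapolating, which is one way a monotonic fit could "collapse" on curves that contain loss basins.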
The Shannon Scaling Law consistently outperforms classical scaling laws and recent perturbation-aware laws, achieving strong R^2 scores and accurately capturing loss basins missed by prior approaches.
Empirical model comparison reported in the paper: goodness-of-fit comparisons (R^2) between the proposed Shannon Scaling Law and prior scaling laws / perturbation-aware variants, with qualitative claims about capturing loss basins.
high positive LLMs as Noisy Channels: A Shannon Perspective on Model Capac... goodness-of-fit (R^2) to observed loss/ performance curves and ability to captur...
We validate our theory through experiments on Pythia and OLMo2 under perturbations, including Gaussian noise, quantization and supervised fine-tuning on math, QA and code tasks.
Empirical experiments reported in the paper using Pythia and OLMo2 model families, testing various perturbations and tasks (math, QA, code).
high positive LLMs as Noisy Channels: A Shannon Perspective on Model Capac... empirical behavior of models under perturbations (robustness and fit to the prop...
Export controls often unintentionally boost China's self-reliance and R&D.
Argument in the paper that restrictions spur domestic substitution and investment in R&D in the targeted country (qualitative/historical reasoning; no quantified estimate provided).
high positive Strategic Stalemates: The Paradox of Export Controls in the ... China's domestic R&D capacity and technological self-reliance
Export controls are strategic tools in U.S.-China AI competition.
Analytical argument in the paper connecting export controls to broader strategic aims in great-power competition over AI; qualitative policy analysis rather than empirical measurement.
high positive Strategic Stalemates: The Paradox of Export Controls in the ... use of export controls as strategic instruments
Since October 2022, the U.S. Bureau of Industry and Security (BIS) has progressively tightened restrictions on exports of advanced computing components to China.
Factual timeline asserted in the paper referencing BIS policy actions beginning October 2022 (policy documents and announcements invoked).
high positive Strategic Stalemates: The Paradox of Export Controls in the ... degree of U.S. export restrictions on advanced computing components to China
Controls cover advanced chips, capital, personnel, and critical minerals for semiconductors.
Enumerative claim in the paper listing categories of items and flows targeted by export controls (policy documents and examples cited).
high positive Strategic Stalemates: The Paradox of Export Controls in the ... categories of goods/flows subject to export controls
Export controls have become central to U.S.-China tech rivalry, especially in AI.
Policy analysis in the paper citing recent U.S. measures (e.g., BIS actions) and Chinese responses; contextual argumentation rather than a quantitative study.
high positive Strategic Stalemates: The Paradox of Export Controls in the ... centrality of export controls in bilateral tech competition
Export control is a policy and legal tool to protect national interests by regulating exports of sensitive goods and technology to foreign nations.
Descriptive/legal characterization presented in the paper (normative definition and overview of export control regimes).
high positive Strategic Stalemates: The Paradox of Export Controls in the ... scope and use of export control as a policy instrument
These findings challenge narratives that automation and digitalization induce net job loss in manufacturing.
Interpretation based on the paper's empirical results showing positive effects of digital transformation on labor demand and demand for skilled workers (Chinese A-share manufacturing firms, 2011–2024). (Sample size not stated in provided text.)
high positive How Does Digital Transformation Reshape Manufacturing Firms'... implication for automation-induced job loss narrative
Digital transformation enhances employees' digital literacy.
Mechanism analysis reported in the paper using firm-level measures of employee digital skills/digital literacy as an intermediate outcome (Chinese A-share manufacturing firms, 2011–2024). (Sample size not stated in provided text.)
high positive How Does Digital Transformation Reshape Manufacturing Firms'... employees' digital literacy
Increased total factor productivity (driven by digital transformation) promotes both the amount of labor demanded and the intensity of factor input.
Mechanism/mediation analysis linking digital transformation → TFP → labor demand and factor-input intensity in the firm-level regressions (Chinese A-share manufacturing firms, 2011–2024). (Sample size not stated in provided text.)
high positive How Does Digital Transformation Reshape Manufacturing Firms'... labor demand and intensity of factor input
Digital transformation enhances firms' total factor productivity (TFP).
Mechanism analysis / mediation analysis reported in the paper using firm-level data (Chinese A-share manufacturing firms, 2011–2024). (Sample size not stated in provided text.)
high positive How Does Digital Transformation Reshape Manufacturing Firms'... total factor productivity
Digital transformation increases firms' demand for highly educated, high-skilled workers.
Regression analysis on Chinese A-share listed manufacturing firms (2011–2024); analysis of worker composition/skill-demand reported by the authors. (Sample size not stated in provided text.)
high positive How Does Digital Transformation Reshape Manufacturing Firms'... demand for highly educated high-skilled workers
Digital transformation significantly increases the quantity of firm labor demand.
Regression analysis using data from Chinese A-share listed manufacturing firms between 2011 and 2024; mechanism and heterogeneity analyses reported in the paper. (Sample size not stated in provided text.)
high positive How Does Digital Transformation Reshape Manufacturing Firms'... quantity of firm labor demand