The Commonplace

Evidence (13827 claims)

Adoption: 8454 claims
Productivity: 7544 claims
Governance: 6789 claims
Human-AI Collaboration: 6327 claims
Org Design: 4126 claims
Innovation: 4058 claims
Labor Markets: 3520 claims
Skills & Training: 2924 claims
Inequality: 2057 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 749 195 97 889 1979
Governance & Regulation 815 391 188 121 1539
Organizational Efficiency 771 189 124 83 1177
Technology Adoption Rate 624 233 123 96 1084
Research Productivity 410 121 56 331 929
Output Quality 466 177 59 47 749
Decision Quality 320 174 75 42 618
Firm Productivity 435 55 88 20 604
AI Safety & Ethics 214 276 65 33 593
Market Structure 178 166 122 24 495
Task Allocation 206 64 70 31 376
Skill Acquisition 165 57 60 17 299
Innovation Output 201 27 41 18 288
Employment Level 105 51 107 13 278
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 116 63 42 11 232
Firm Revenue 149 46 26 3 224
Inequality Measures 44 122 49 6 221
Task Completion Time 169 29 8 12 219
Worker Satisfaction 89 61 20 12 182
Error Rate 69 91 10 2 172
Regulatory Compliance 76 68 14 5 163
Training Effectiveness 92 19 13 19 145
Wages & Compensation 77 36 25 6 144
Automation Exposure 51 54 22 12 142
Team Performance 86 17 27 9 140
Developer Productivity 94 17 14 6 132
Job Displacement 12 80 20 1 113
Hiring & Recruitment 51 7 8 3 69
Skill Obsolescence 5 45 6 1 57
Creative Output 31 16 7 2 57
Social Protection 27 16 8 2 53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
Simulating the calibrated endogenous-automation model under an 'A.I. as a continuation of historical patterns' calibration yields growth rates reaching only 2.5% by 2075.
Forward simulations of an endogenous-growth model calibrated to historical private business sector patterns (model + calibration + simulation).
high positive Past Automation and Future A.I.: How Weak Links Tame the Gro... projected economy-wide growth rate by 2075
The main benefit of automation is that it allows production of a task to shift from slowly-improving human labor to rapidly-improving machines.
Theoretical argument within the task-based model and supporting historical accounting showing faster capital-augmenting productivity growth relative to labor.
high positive Past Automation and Future A.I.: How Weak Links Tame the Gro... contribution of automation to productivity/TFP growth
At the task level, capital productivity has grown at least 3 percentage points per year faster than labor productivity.
Historical task-level growth accounting across sectors using BEA/BLS data and the paper's task-based decomposition; statement appears in abstract and introduction summarizing empirical findings across sectors.
high positive Past Automation and Future A.I.: How Weak Links Tame the Gro... gap in growth rates between capital productivity and labor productivity at the t...
Historically, TFP growth is driven primarily by improvements in capital productivity.
Growth accounting using a task-based model applied to aggregate U.S. data (BEA and BLS) and industry-level data; theoretical decomposition separating capital-augmenting, labor-augmenting, and "other" productivity components.
Economists strongly favor targeted policy interventions such as AI-focused worker retraining (71.8% support) over broad structural interventions like job guarantees (13.7% support) or universal basic income (37.4% support).
Survey items asking respondents to indicate normative support for six policy proposals; reported support percentages for the economist group for specific policies (retraining, job guarantee, UBI).
high positive Forecasting the Economic Effects of AI policy support percentages among economists
Economists (as a group) forecast GDP growth of 3.5% under the rapid AI scenario.
Conditional forecasts reported in Key Findings (economist subgroup forecasts under the rapid progress scenario).
high positive Forecasting the Economic Effects of AI annual GDP growth under rapid AI scenario (economists)
The median respondent in each group expects annual U.S. GDP growth of about 2.5% (unconditional forecast).
Unconditional (all-things-considered) survey forecasts of annual GDP growth elicited from respondents across five groups; compared in text to government and private-sector baseline forecasts (typical medium-run 2.0% and long-run 1.7%).
high positive Forecasting the Economic Effects of AI annual GDP growth (unconditional forecast)
The average economist assigns a 61.4% probability to moderate or rapid AI progress by 2030.
Survey responses from the economist respondent group, reporting the mean subjective probability assigned to the combined 'moderate' and 'rapid' scenario categories.
high positive Forecasting the Economic Effects of AI probability assigned to moderate or rapid AI progress by 2030
The median respondent in each group expects substantial advances in AI capabilities by 2030.
Survey of five respondent groups (academic economists, AI-company employees, AI policy researchers, highly accurate forecasters, and the general public) eliciting unconditional and conditional forecasts about AI capabilities and economic outcomes (details and sample sizes referenced in Section 2.1, not provided in excerpt).
high positive Forecasting the Economic Effects of AI AI capability progress by 2030
Organizations and policymakers that treat work-time policy as foundational economic planning will better position their economies to harness AI's benefits while mitigating systemic instability.
Policy-prescriptive conclusion based on cross-disciplinary analysis; no empirical trial or quantification offered in the summary.
high positive A Shorter Workweek as Economic Infrastructure: Managing AI-D... economic resilience / ability to harness AI benefits and mitigate instability
Work-time reduction can distribute productivity gains more equitably.
Argument supported by examination of historical work-time transitions and pilot programs referenced in the article; no empirical effect sizes or sample details in the summary.
high positive A Shorter Workweek as Economic Infrastructure: Managing AI-D... distribution of productivity gains / equity in gains
Coordinated reduction in working hours helps maintain aggregate demand.
The paper's synthesis of historical transitions and pilot programs and argument about distribution of productivity gains; no quantitative evidence or sample sizes provided in the summary.
high positive A Shorter Workweek as Economic Infrastructure: Managing AI-D... aggregate demand / consumption
Gradual, policy-led reduction in standard working hours can preserve employment.
Claim based on examination of historical work-time transitions, contemporary pilot programs, and cross-sector implementation strategies referenced in the paper; no specific studies or sample sizes cited in the summary.
high positive A Shorter Workweek as Economic Infrastructure: Managing AI-D... employment levels / preservation of jobs
Platforms should implement AIGC-sensitive distribution algorithms and precise governance frameworks to ensure the long-term health of online content platforms.
Policy/recommendation derived from the paper's empirical findings on consumption preferences, producer behaviors, and the moderating role of distribution algorithms.
high positive Scale over Preference: The Impact of AI-Generated Content on... long-term platform health (qualitative recommendation target)
AIGC creators achieve aggregate engagement comparable to HGC creators by producing content at high volume (a 'scale-over-preference' dynamic).
Analysis of creation and engagement patterns in the dataset showing that AIGC creators compensate for lower per-item engagement by higher production volume, yielding comparable aggregate engagement levels to HGC creators.
high positive Scale over Preference: The Impact of AI-Generated Content on... aggregate engagement per creator (total engagement across produced items)
Consumers show a marked preference for Human-Generated Content (HGC) over Artificial Intelligence-Generated Content (AIGC).
Comparative analysis of consumption behavior in the longitudinal dataset; the paper reports consumption metrics that indicate higher consumer preference for HGC versus AIGC (e.g., relative engagement per item).
high positive Scale over Preference: The Impact of AI-Generated Content on... consumer preference (relative engagement per content type)
AI facilitates access to distant knowledge domains.
Theoretical model (Schumpeterian quality-ladder recombinant-innovation framework). The paper models R&D as recombining ideas across a knowledge space and shows analytically that AI increases firms' ability to combine ideas across longer distances.
high positive Bridging Distant Ideas: the Impact of AI on R&D and Recombin... access to distant knowledge domains (distance of recombinations)
Systematic quality auditing should be standard practice for complex agentic tasks.
Normative recommendation based on the authors' methodological and empirical findings that auditing revealed substantial benchmark issues affecting evaluation of agent capabilities.
high positive ELT-Bench-Verified: Benchmark Quality Issues Underestimate A... recommendation for adoption of systematic quality auditing (policy/practice prop...
Re-evaluating on ELT-Bench-Verified yields significant improvement attributable entirely to benchmark correction.
Re-evaluation of agent performance on the revised benchmark; the authors report a significant improvement and attribute it to the benchmark corrections. No quantitative effect sizes or sample sizes provided in the excerpt.
high positive ELT-Bench-Verified: Benchmark Quality Issues Underestimate A... change in agent performance after benchmark correction
Based on these findings, we construct ELT-Bench-Verified, a revised benchmark with refined evaluation logic and corrected ground truth.
Development and release of a revised benchmark (ELT-Bench-Verified) incorporating refined evaluation logic and corrected ground truth as described in the paper.
high positive ELT-Bench-Verified: Benchmark Quality Issues Underestimate A... existence of a revised benchmark with corrected evaluation and ground truth
We develop an Auditor-Corrector methodology that combines scalable LLM-driven root-cause analysis with rigorous human validation (inter-annotator agreement Fleiss' kappa = 0.85) to audit benchmark quality.
Description of a methodology combining LLM root-cause analysis and human validation; human validation reported with inter-annotator agreement Fleiss' kappa = 0.85.
high positive ELT-Bench-Verified: Benchmark Quality Issues Underestimate A... benchmark audit reliability as measured by inter-annotator agreement (Fleiss' ka...
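For readers unfamiliar with the agreement statistic cited above, the following is a minimal sketch of how Fleiss' kappa is computed from a table of rating counts. It is the standard textbook formula, not the authors' code; the function name `fleiss_kappa` and the input layout are assumptions made for illustration.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j (hypothetical layout)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_cats = len(counts[0])

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    total = n_items * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_cats)]
    p_chance = sum(p * p for p in p_cat)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_chance) / (1.0 - p_chance)
```

A value of 1.0 means perfect agreement and 0 means agreement no better than chance, so the reported 0.85 sits near the top of the scale.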
Re-evaluating ELT-Bench with upgraded large language models reveals that the extraction and loading stage is largely solved, while transformation performance improves significantly.
Re-evaluation performed using upgraded LLMs comparing performance across ELT pipeline stages; specific performance metrics or sample sizes not reported in the excerpt.
high positive ELT-Bench-Verified: Benchmark Quality Issues Underestimate A... performance on extraction/loading and transformation stages of ELT pipeline cons...
Constructing Extract-Load-Transform (ELT) pipelines is a labor-intensive data engineering task and a high-impact target for AI automation.
Statement in the paper framing ELT pipeline construction as labor-intensive and high-impact; no empirical data or sample size reported in the provided excerpt.
high positive ELT-Bench-Verified: Benchmark Quality Issues Underestimate A... labor intensity and suitability for AI automation (qualitative claim)
Competition law assessments of a dominant undertaking’s conduct must consider not only the product market but also the labor market, particularly in cases of significant market structure changes.
Conclusion stated in abstract summarizing the paper’s findings; supported by the paper's legal analysis and referenced case law (no empirical sample provided in abstract).
high positive Employee Poaching as An Abuse of Dominance Under Article 102... scope of competition law assessment (inclusion of labor market considerations)
Poaching employees is an inherent aspect of competition for highly qualified talent and is particularly pronounced among tech giants.
Statement in abstract; general observation supported by literature/case-law references implied in paper (no specific empirical sample or quantitative method reported in abstract).
high positive Employee Poaching as An Abuse of Dominance Under Article 102... frequency/prevalence of employee poaching among firms (not quantitatively measur...
A statistical recalibration technique called conformal prediction can correct the overconfidence of the LLM-produced intervals, expanding them to achieve the intended coverage.
Application of conformal prediction to the LLM interval outputs in the experiment, resulting in expanded intervals that attain the target coverage.
high positive Bayesian Elicitation with LLMs: Model Size Helps, Extra "Rea... coverage of recalibrated credible intervals (post-conformal prediction)
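The recalibration step described above can be sketched with split conformal prediction applied to interval outputs. The excerpt does not say which conformal variant the paper uses, so the scoring rule below (distance by which the truth falls outside the stated interval, as in conformalized quantile regression) is an assumption, and the names `conformal_margin` and `recalibrate` are hypothetical.

```python
import math
from typing import List, Tuple

Interval = Tuple[float, float]


def conformal_margin(
    cal_intervals: List[Interval], cal_truths: List[float], alpha: float = 0.1
) -> float:
    """Margin to add to each side of an interval so that coverage on new,
    exchangeable data is at least 1 - alpha (split conformal guarantee)."""
    # Score > 0 means the truth fell outside the interval by that distance;
    # score <= 0 means it fell inside.
    scores = sorted(
        max(lo - y, y - hi) for (lo, hi), y in zip(cal_intervals, cal_truths)
    )
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the conformal quantile
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return scores[k - 1]


def recalibrate(interval: Interval, margin: float) -> Interval:
    """Widen (or, for a negative margin, narrow) an interval symmetrically."""
    lo, hi = interval
    return (lo - margin, hi + margin)
```

When the raw intervals are overconfident, many calibration scores are positive, the margin comes out positive, and every interval is expanded, which matches the behavior the claim describes. A negative margin would indicate intervals that were wider than necessary.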
Larger, more capable models produce more accurate estimates.
Empirical experiment asking eleven LLMs to estimate population statistics (health prevalence rates, personality trait distributions, labor market figures) and comparing accuracy across models of different capability.
high positive Bayesian Elicitation with LLMs: Model Size Helps, Extra "Rea... accuracy of population-statistic estimates
Applying the Method of Moments Quantile Regression (MMQR) allows the study to capture heterogeneous impacts of robotics across performance levels.
Authors describe use of MMQR in methodology and justify it as appropriate for detecting heterogeneity across quantiles of the dependent variable (value added).
high positive Automation and growth in the European Union: sectoral insigh... heterogeneity of estimated impacts across quantiles
The study uses panel data from Eurostat, the International Federation of Robotics (2024), and World Robotics covering three key sectors in selected EU countries.
Data sources explicitly listed in the paper (Eurostat, IFR 2024, World Robotics); the scope is described as three key sectors in selected EU countries.
high positive Automation and growth in the European Union: sectoral insigh... data coverage / sample scope
Policymakers should support automation through fiscal incentives, invest in reskilling programs, and develop innovation strategies tailored to specific sectors to foster inclusive and sustainable growth.
Policy recommendations derived from empirical findings showing heterogeneous effects of robot density, R&D and human capital across sectors; authors explicitly recommend fiscal incentives, reskilling, and sector-targeted innovation strategies.
high positive Automation and growth in the European Union: sectoral insigh... policy intervention recommendations aiming at inclusive and sustainable growth
The paper’s novelty lies in its differentiated, cross-sectoral approach integrating technological adoption (robotics) with sectoral gross value added using advanced econometric techniques (MMQR).
Authors state the study's contribution is differentiated cross-sectoral analysis and use of MMQR to capture heterogeneous impacts; methodological description provided in paper.
high positive Automation and growth in the European Union: sectoral insigh... methodological contribution / sectoral analysis of value creation
The positive effect of robot density on value added is particularly strong in higher-performing sectors (i.e., at higher quantiles of the value-added distribution).
Results from MMQR showing heterogeneous impacts across performance levels/quantiles; authors state larger positive coefficients of robot density at upper quantiles.
high positive Automation and growth in the European Union: sectoral insigh... gross value added across quantiles (sector performance levels)
Increased robot density significantly enhances value added.
Empirical analysis using panel data (Eurostat, International Federation of Robotics 2024, World Robotics) estimated with Method of Moments Quantile Regression (MMQR); gross value added used as dependent variable and robot density as a core explanatory variable; authors report statistically significant positive coefficients.
high positive Automation and growth in the European Union: sectoral insigh... gross value added (value added)
The paper proposes five architectural requirements for genuine human oversight systems.
Stated methodological/prescriptive contribution of the paper (a proposal rather than an empirical finding); no sample size or empirical validation reported in the provided excerpt.
high positive Beyond Symbolic Control: Societal Consequences of AI-Driven ... design requirements for systems enabling genuine human oversight
The proposed framework outlines a pathway toward large-scale cooperative intelligence and offers a constructive perspective on the coevolution of human and artificial agents in the informational ecosystems of the future.
Claim about the paper's contribution; based on conceptual synthesis and theoretical framing rather than empirical validation.
high positive A Case for Coevolution emergence of large-scale cooperative intelligence
A voluntary ecosystem of free rational agents, human and artificial, who cooperate through transparent and fair exchange of information maximizes their adaptive capacity and long-term well-being.
Normative proposition in the paper derived from theoretical principles (information theory, collective intelligence); presented as a proposed ideal rather than an empirically tested policy.
high positive A Case for Coevolution adaptive capacity and long-term well-being of participating agents
Emerging opportunities exist for stabilizing these ecosystems through new forms of informational verification and monitoring made possible by advanced artificial agents.
Forward-looking claim grounded in conceptual analysis of capabilities of advanced agents; proposed as an opportunity in the paper rather than demonstrated empirically.
high positive A Case for Coevolution stability of informational ecosystems via verification and monitoring tools
Systems that preserve diversity of exploration while minimizing barriers to information exchange exhibit superior capacity for discovery and adaptation in complex environments.
Theoretical claim supported by the paper's appeal to principles from information theory, adaptive systems, and collective intelligence; presented as an argument rather than as empirically validated result.
high positive A Case for Coevolution capacity for discovery and adaptation
Increasing the strictness of algorithmic control paradoxically increases the evolutionary fitness of coordinated resistance (e.g., coordinated log-offs).
Results from the evolutionary game theory (EGT) model and simulations showing fitness/payoff changes for coordinated resistance strategies as the platform surveillance strictness parameter increases; model-only (no empirical N reported).
high positive THE RED QUEEN in the DASHBOARD: CO-EVOLUTIONARY DYNAMICS of ... evolutionary fitness (payoff) of coordinated resistance strategies
The primary contribution is a controlled agent-payment infrastructure and reference architecture that demonstrates how agentic access monetization can be adapted to fiat systems without discarding security and policy guarantees.
Summary of the paper's claimed contribution (architectural demonstration and reference implementation).
high positive APEX: Agent Payment Execution with Policy for Autonomous Age... existence of a controlled agent-payment infrastructure adapting monetization to ...
Repeated trial runs show low variance across scenarios (reported with 95% confidence intervals), indicating high reproducibility.
Reported statistical characterization from repeated trials in the paper (statement of low variance and 95% confidence intervals across scenarios).
high positive APEX: Agent Payment Execution with Policy for Autonomous Age... variance / reproducibility across scenarios (95% CIs reported)
Security mechanisms impose low latency overhead (19.6ms average).
Performance measurement reported in the paper's experiments (average latency overhead reported as 19.6ms).
high positive APEX: Agent Payment Execution with Policy for Autonomous Age... latency overhead introduced by security mechanisms
Security mechanisms achieve 100% block rate for both replay attacks and invalid tokens.
Experimental security evaluation reported in the paper (block rate reported at 100% for replay attacks and invalid tokens).
high positive APEX: Agent Payment Execution with Policy for Autonomous Age... block rate for replay attacks and invalid tokens
The system uses FastAPI, SQLite, and Python standard libraries, making it transparent, inspectable, and reproducible.
Implementation stack specified in the paper and availability of reference implementation; asserted reproducibility.
high positive APEX: Agent Payment Execution with Policy for Autonomous Age... technology stack and reproducibility/inspectability of the implementation
APEX implements a challenge–settle–consume lifecycle with HMAC-signed short-lived tokens, idempotent settlement handling, and policy-aware payment approval.
Implementation details described in the methods/architecture section and supported by the provided reference implementation.
high positive APEX: Agent Payment Execution with Policy for Autonomous Age... presence of challenge–settle–consume lifecycle and specific security/payment mec...
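To make the token mechanism named above concrete, here is a minimal sketch of HMAC-signed short-lived tokens with one-time consumption, using only the Python standard library. It is not APEX's reference implementation: the token layout, the in-memory nonce set, and the names `issue_token` and `consume_token` are assumptions for illustration, and settlement handling and policy approval are left out.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional

SECRET = secrets.token_bytes(32)  # hypothetical server-side signing key
_consumed = set()  # nonces already used; a real system would persist these


def _sign(payload: bytes) -> str:
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def issue_token(resource: str, ttl_s: int = 60, now: Optional[float] = None) -> str:
    """Issue a token bound to one resource that expires after ttl_s seconds."""
    now = time.time() if now is None else now
    claims = {"res": resource, "exp": now + ttl_s, "nonce": secrets.token_hex(8)}
    payload = json.dumps(claims, sort_keys=True).encode()
    return base64.urlsafe_b64encode(payload).decode() + "." + _sign(payload)


def consume_token(token: str, resource: str, now: Optional[float] = None) -> bool:
    """Return True once per valid token; reject forged, expired, or replayed ones."""
    now = time.time() if now is None else now
    try:
        body, sig = token.split(".")
        payload = base64.urlsafe_b64decode(body.encode())
    except ValueError:  # wrong number of parts or invalid base64
        return False
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(sig.encode(), _sign(payload).encode()):
        return False
    claims = json.loads(payload)
    if claims["res"] != resource or now > claims["exp"]:
        return False
    if claims["nonce"] in _consumed:  # replay: this token was already used
        return False
    _consumed.add(claims["nonce"])
    return True
```

The signature is checked before the payload is parsed, so unsigned input never reaches the JSON decoder, and recording the nonce only after every other check passes means a rejected request cannot burn a valid token.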
We present APEX, an implementation-complete research system that adapts HTTP 402-style payment gating to UPI-like fiat workflows while preserving policy-governed spend control, tokenized access verification, and replay resistance.
System design and implementation presented in the paper (codebase built using FastAPI, SQLite, Python; demonstration/implementation claimed).
high positive APEX: Agent Payment Execution with Policy for Autonomous Age... ability to adapt HTTP 402-style gating to UPI-like fiat while preserving spend c...
API providers need request-level monetization with programmatic spend governance.
Normative recommendation in the paper (argumentation rather than empirical evidence).
high positive APEX: Agent Payment Execution with Policy for Autonomous Age... need for request-level monetization and spend governance
Autonomous agents are moving beyond simple retrieval tasks to become economic actors that invoke APIs, sequence workflows, and make real-time decisions.
Framing statement / literature-motivated claim in the paper's introduction (qualitative argumentation, no experimental sample reported).
high positive APEX: Agent Payment Execution with Policy for Autonomous Age... agents invoking APIs, sequencing workflows, and making real-time decisions (agen...
The future of transformative transformer-based AI is fundamentally many, not one.
Concluding synthesis and normative prediction based on the paper's theoretical arguments and literature synthesis; no empirical data or quantified projection provided in the excerpt.
high positive The Future of AI is Many, Not One architectural and organizational form of future transformative AI (multi-agent/d...
Developing diverse AI teams addresses critics' concerns that current models are constrained by past data and lack the creative insight required for innovation.
Argumentative claim drawing on conceptual critique of current models and the proposed remedy of diverse AI teams; supported by referenced disciplinary literatures but no empirical validation provided in the excerpt.
high positive The Future of AI is Many, Not One creative insight and capacity for innovation in AI systems