The Commonplace

Evidence (3566 claims)

Adoption (8570 claims)
Productivity (7631 claims)
Governance (6869 claims)
Human-AI Collaboration (6491 claims)
Org Design (4175 claims)
Innovation (4114 claims)
Labor Markets (3566 claims)
Skills & Training (2966 claims)
Inequality (2066 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
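
As a rough illustration of how the matrix can be read, the sketch below computes each direction's share of a row's listed total and ranks a few rows by their negative share. The four rows are copied from the table above; the function names and the choice to divide by the listed Total are assumptions made for this example, not something the page specifies. Note that for some rows the listed Total exceeds the sum of the four direction columns (Firm Productivity lists 606, while its columns sum to 600), and the page does not explain the difference.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
MATRIX = {
    "Job Displacement": (12, 80, 20, 1, 113),
    "Skill Obsolescence": (5, 46, 6, 1, 58),
    "Firm Productivity": (435, 57, 88, 20, 606),
    "Hiring & Recruitment": (52, 7, 8, 3, 70),
}


def direction_shares(row):
    """Return each direction's share of the row's listed total.

    Assumption: shares are taken against the listed Total column,
    which can be larger than the sum of the four direction counts.
    """
    positive, negative, mixed, null, total = row
    counts = {
        "positive": positive,
        "negative": negative,
        "mixed": mixed,
        "null": null,
    }
    return {name: count / total for name, count in counts.items()}


def negative_share_ranking(matrix):
    """Rank outcome categories from highest to lowest negative share."""
    ranked = [
        (outcome, direction_shares(row)["negative"])
        for outcome, row in matrix.items()
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for outcome, share in negative_share_ranking(MATRIX):
        print(f"{outcome}: {share:.1%} negative")
```

On these four rows, Skill Obsolescence and Job Displacement come out with the highest negative shares (46 of 58 and 80 of 113 claims respectively).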
Filter: Labor Markets
Relying on preference signals risks learning spurious proxies and producing unstable behavior under distribution shift.
Theoretical argument supported by examples of spurious proxies in ML and by observations in RLHF-trained models; the paper cites literature showing proxy behavior but does not present a unified empirical quantification specific to RLHF across many tasks.
medium negative Via Negativa for AI Alignment: Why Negative Constraints Are ... frequency of spurious-proxy-driven failures and degradation in behavior under di...
Positive preference signals are continuous, context-dependent, and entangled with surface correlates (e.g., agreement with the user), which causes models trained on them to pick up spurious proxies and exhibit sycophancy and brittleness.
Conceptual/theoretical argument in the paper describing structural properties of preference spaces, supported by cited observations of sycophantic behavior in models trained with preference-based objectives. No single definitive empirical quantification is provided within the paper; supporting examples are drawn from recent literature.
medium negative Via Negativa for AI Alignment: Why Negative Constraints Are ... incidence of sycophantic behavior and brittleness (e.g., tendency to agree with ...
There is a risk of manipulation and misinformation if argument mining/synthesis is unregulated or misaligned with social incentives, creating externalities that may justify public intervention.
Conceptual risk assessment combining known misinformation dynamics and AI capabilities; no empirical incident data provided.
medium negative Argumentative Human-AI Decision-Making: Toward AI Agents Tha... incidence of manipulation/misinformation attributable to argument-mining/synthes...
Increased error risk and weaker explainability from GLAI will raise malpractice and liability exposure for firms and lawyers, driving up insurance and compliance costs.
Legal-risk analysis and economic reasoning connecting explainability/liability to insurance costs; no empirical cost studies presented.
medium negative Why Avoid Generative Legal AI Systems? Hallucination, Overre... malpractice/liability exposure levels and associated insurance/compliance costs
The combination of hallucination and professional overreliance strains existing regulatory goals (e.g., explainability, human oversight) within European AI governance frameworks.
Legal and regulatory analysis mapping technical and behavioral risks onto European AI governance goals; references to statutory/regulatory texts and policy debates. Qualitative argumentation rather than empirical test.
medium negative Why Avoid Generative Legal AI Systems? Hallucination, Overre... compatibility between GLAI deployment dynamics and regulatory obligations (e.g.,...
Fabricated or opaque intermediate data and reasoning in GLAI weaken explainability, making it difficult to provide meaningful explanations about how outputs were produced.
Conceptual analysis of token-prediction architectures, literature on explainability limits of LLMs, and legal/regulatory analysis referencing explainability requirements. No empirical measurement.
medium negative Why Avoid Generative Legal AI Systems? Hallucination, Overre... quality/meaningfulness of explanations about model outputs (explainability)
Hallucinated content produced by GLAI is often linguistically fluent and persuasive, increasing the risk that legal professionals will accept it without verification.
Literature synthesis on model fluency and behavioral literature on trust in coherent authoritative outputs, plus illustrative vignettes. No original experimental data or sample size.
medium negative Why Avoid Generative Legal AI Systems? Hallucination, Overre... rate of professional acceptance or uncritical reliance on fluent but incorrect o...
This architectural mismatch (token-prediction vs. formal legal reasoning) contributes to confident but factually incorrect outputs (hallucinations) in GLAI.
Technical/conceptual analysis plus synthesis of existing literature on hallucinations in generative models; illustrative examples and vignettes provided. No primary empirical measurement in the paper.
medium negative Why Avoid Generative Legal AI Systems? Hallucination, Overre... incidence and nature of hallucinated (factually incorrect) outputs produced by G...
Top-performing community submissions (including baselines and competition entries) still leave a performance gap relative to elite human play on battling tasks.
Paper reports comparative evaluation results showing win-rate and other metrics for heuristic, RL, LLM baselines and community submissions versus human (elite) benchmarks; analysis highlights a remaining gap.
medium negative The PokeAgent Challenge: Competitive and Long-Context Learni... performance gap measured primarily by win-rate (Battling) and strategic robustne...
Misalignment or poor meta-control could produce persistent unsafe behaviors in autonomous learners; governance and oversight mechanisms will be crucial.
Risk analysis based on conceptual failure modes for meta-control; no empirical incidents reported in the paper.
medium negative Why AI systems don't learn and what to do about it: Lessons ... frequency and severity of unsafe behaviors; successful governance interventions
Current models transfer poorly across domains, are brittle in nonstationary environments, and are inefficient in physical/embodied tasks.
Synthesis of known challenges from prior literature and practical experience; paper cites these as motivating observations rather than reporting new data.
medium negative Why AI systems don't learn and what to do about it: Lessons ... cross-domain generalization; robustness under nonstationarity; sample efficiency...
Current models have limited meta-control and do not autonomously decide when to explore, imitate, consult prior knowledge, or consolidate.
Conceptual critique based on typical ML training pipelines and limited on-line decision-making modules; no empirical tests in paper.
medium negative Why AI systems don't learn and what to do about it: Lessons ... autonomy in meta-decisions (e.g., fraction of exploration/imitative acts chosen ...
There is weak integration between passive observation (supervised/representation learning) and active experimentation (reinforcement/exploratory learning) in current systems.
Observation of methodological separation in current literature and systems; conceptual discussion in the paper.
medium negative Why AI systems don't learn and what to do about it: Lessons ... performance on mixed observation-action tasks; ability to combine passive and ac...
Current AI models lack the architectures and control mechanisms required for sustained, autonomous learning in dynamic real-world settings.
Conceptual/theoretical analysis presented in the paper; synthesis of limitations observed in existing literature and practices (no new empirical data provided).
medium negative Why AI systems don't learn and what to do about it: Lessons ... ability to sustain autonomous learning in dynamic real-world environments
Public‑interest concerns (bias, misuse, systemic risk) may be harder to mitigate via simple transparency rules; policies should emphasize outcome‑based regulations, mandatory behavioral testing, and marketplace disclosure obligations for stressed scenarios.
Policy implication derived from the non‑rule‑encodability thesis; no empirical policy evaluation included.
medium negative Why the Valuable Capabilities of LLMs Are Precisely the Unex... effectiveness of transparency-based vs outcome-based regulatory approaches
Standard contracts and regulatory audits that rely on inspection of rule sets or source code will be insufficient to assess model behavior or risk; regulators and buyers must rely more on behavior‑based testing, standards, and outcome measures.
Policy and regulatory argument derived from the main theorem about non‑rule‑encodability; no empirical regulatory studies presented.
medium negative Why the Valuable Capabilities of LLMs Are Precisely the Unex... effectiveness of rule‑based audits/regulatory inspections for assessing model ri...
Full interpretability via rule extraction may be impossible for the most valuable parts of LLM competence, limiting the utility of some transparency approaches for safety and auditing.
Argumentative consequence of the main theoretical claim and structural mismatch; supported by historical limitations of rule‑based systems; no empirical tests reported.
medium negative Why the Valuable Capabilities of LLMs Are Precisely the Unex... feasibility of fully extracting human‑readable rules from LLMs (interpretability...
There is a structural mismatch between explicit human cognitive tools (rules, checklists) and the pattern‑rich, high‑dimensional competence encoded in LLMs.
Theoretical/structural argument about distributed statistical representations in LLMs versus discrete rules; no experimental quantification provided.
medium negative Why the Valuable Capabilities of LLMs Are Precisely the Unex... alignment/mismatch between human‑readable rules and LLM representations/competen...
Historical expert systems failed to generalize or scale to complex, ambiguous tasks, contrasting with LLMs' broader empirical successes.
Historical case analysis and literature review-style discussion of expert systems versus contemporary LLM performance; no new quantitative historical dataset provided.
medium negative Why the Valuable Capabilities of LLMs Are Precisely the Unex... generalization and scalability of rule‑based expert systems
LEAFE's benefits depend on informative, actionable feedback; environments with noisy or adversarial feedback may limit improvements.
Limitations stated in the paper noting sensitivity to feedback quality; conceptual reasoning that the method relies on extracting actionable signals from environment feedback.
medium negative Internalizing Agency from Reflective Experience Change in Pass@k or recovery performance under degraded/noisy feedback (qualitat...
Outcome-driven post-training (optimizing final rewards) underutilizes rich environment feedback and causes 'distribution sharpening' — policies overfit a narrow set of successful behaviors and fail to broaden problem-solving/recovery capacity in long-horizon settings.
Problem diagnosis in the paper supported by comparison of outcome-driven RL (GRPO) performance versus LEAFE and by conceptual argument about how optimizing final success signals can narrow behavioral support; supported by empirical observations of poorer recovery/generalization in baselines.
medium negative Internalizing Agency from Reflective Experience Breadth of problem-solving/recovery capacity (inferred from failure modes and Pa...
If left unchecked, managerial short-termism combined with AI adoption can create a feedback loop where firms cut labor to boost short-term profits, undermining aggregate demand and eroding the market that sustains those profits.
Conceptual macroeconomic and organizational synthesis drawing on theory and historical patterns; no new empirical time-series demonstrating this loop in current AI-driven layoffs.
medium negative A Shorter Workweek as a Policy Response to AI-Driven Labor D... sequence of firm-level layoffs, short-term profits, aggregate demand decline, su...
Work-time reduction policies carry distributional and implementation risks (heterogeneous effects by occupation, firm size, capital intensity; risk of hidden wage cuts) that require careful compensation rules and monitoring.
Theoretical reasoning and references to heterogeneous outcomes in prior work-hour studies; no new empirical quantification of heterogeneity in AI-era implementations.
medium negative A Shorter Workweek as a Policy Response to AI-Driven Labor D... heterogeneous employment/wage effects across occupations/firms; incidence of wag...
Lower household demand resulting from payroll cuts can precipitate further cost-cutting and automation, creating a self-reinforcing feedback loop that risks persistent demand shortfalls and higher structural unemployment.
Theoretical models of demand-driven adjustment and cited historical patterns; conceptual argument rather than empirical causal identification in contemporary AI contexts.
medium negative A Shorter Workweek as a Policy Response to AI-Driven Labor D... aggregate demand, subsequent rounds of layoffs/automation adoption, structural u...
AI-justified layoffs are driven more by managerial short-termism and misaligned executive incentives than by immediate technological necessity.
Interdisciplinary conceptual synthesis drawing on labor-economics theory, organizational behavior literature linking executive compensation/short-termism to layoffs, and selected prior empirical studies; no new firm-level causal identification or large-scale dataset provided.
medium negative A Shorter Workweek as a Policy Response to AI-Driven Labor D... frequency/extent of layoffs attributed to AI (vs. attributable to managerial inc...
Distributional impacts of AI are uneven: younger workers and individuals with lower formal education face greater disruption.
Descriptive breakdowns of occupational vulnerability and employment changes by demographic groups (age and education) derived from labor statistics and vulnerability mapping; supported by qualitative case observations. Exact subgroup sample sizes not given.
medium negative The AI Transition: Assessing Vulnerability and Structural Re... employment change / displacement risk by age cohort and education level
Routine service and administrative occupations show the highest vulnerability to automation and displacement from AI.
Occupational vulnerability mapping using task/routine exposure methods and descriptive employment trend analysis across occupations; supported by employer survey responses and case-study observations. Sample sizes for surveys/mapping not provided in summary.
medium negative The AI Transition: Assessing Vulnerability and Structural Re... occupational vulnerability / risk of displacement (automation exposure index or ...
Passive monitoring and predictive models are insufficient for governing the complex dynamics of a tech-driven economy.
Conceptual critique based on economic cybernetics literature and the author's expert assessment; no empirical test comparing governance regimes is provided.
medium negative DIGITAL TRANSFORMATION OF THE RUSSIAN FEDERATION’S SOCIOECON... governance adequacy/effectiveness (ability to steer socio-economic outcomes)
Digitalization is deepening digital inequality (unequal access to digital tools, skills, and benefits) across social groups and regions.
Qualitative analysis and expert assessment; the paper calls for new metrics but does not present systematic empirical measures of inequality.
medium negative DIGITAL TRANSFORMATION OF THE RUSSIAN FEDERATION’S SOCIOECON... digital inequality (access to internet/digital services, digital literacy rates)
Digital transformation can generate technological unemployment if not managed with appropriate retraining and social protection measures.
Expert assessment and literature-informed argumentation in the paper; no empirical longitudinal analysis isolating technology-driven job losses presented.
medium negative DIGITAL TRANSFORMATION OF THE RUSSIAN FEDERATION’S SOCIOECON... technological unemployment (job losses attributable to automation/AI adoption)
Forced or poorly regulated digitalization risks exacerbating social stratification.
Conceptual argument supported by qualitative analysis of policy documents and expert assessment; no empirical causal estimates provided.
medium negative DIGITAL TRANSFORMATION OF THE RUSSIAN FEDERATION’S SOCIOECON... social stratification (income/wealth inequality measures, social mobility proxie...
Manufacturing and Retail experienced net employment contractions attributable mainly to task automation and substitution.
Simulated employment-level series and net change calculations by sector (Manufacturing, Retail) across 2020–2024 in the paper's dataset, together with literature-derived mechanisms emphasizing automation/substitution in these sectors (systematic review of selected publishers 2020–2024).
medium negative AI-Driven Transformation of Labor Markets: Skill Shifts, Hyb... Employment levels and net change by sector (Manufacturing, Retail)
Explainability, trust, and demonstrated real-world effectiveness are key demand-side frictions; small-scale laboratory gains rarely translate into broad clinical uptake without workflow fit.
Adoption studies, qualitative interviews with clinicians and purchasers, and observations that many high-performing lab models see limited clinical use due to workflow and trust issues.
medium negative Human-AI interaction and collaboration in radiology: from co... adoption rates, clinician trust/acceptance measures, implementation success rate...
Hidden costs can arise from increased liability exposure, workflow redesign burden, and potential productivity loss during transition periods.
Qualitative deployment studies and procurement narratives reporting unanticipated legal, operational, and productivity impacts during early rollouts.
medium negative Human-AI interaction and collaboration in radiology: from co... measures of productivity during rollout, documented workflow redesign time/costs...
Human-AI collaboration can also generate harms, including automation bias, deskilling, and workflow disruption.
Behavioral laboratory experiments, simulation/reader studies demonstrating automation bias, qualitative reports and observational deployment accounts documenting workflow frictions and concerns about reduced trainee exposure.
medium negative Human-AI interaction and collaboration in radiology: from co... rates of over-reliance on AI, diagnostic error rates attributable to automation ...
Trust, verification costs, and legal/governance requirements remain consequential even with AI mediation and may limit or shape adoption.
Theoretical discussion of governance and verification costs; no empirical measurement of these costs in adopter firms provided.
medium negative AI as a universal collaboration layer: Eliminating language ... verification/trust costs; legal/governance compliance costs; adoption barriers
AI-mediated interpretation and action carry risks related to quality, bias, and misalignment, which can produce miscommunication or incorrect automated actions.
Paper's discussion section raising caveats; conceptual risk analysis without empirical incident data; references to general concerns in AI safety literature (no new empirical evidence provided).
medium negative AI as a universal collaboration layer: Eliminating language ... incidence of miscommunication/errors attributable to AI mediation; bias metrics;...
Despite positive outcomes, challenges such as workforce displacement, ethical concerns, and limited access to AI technologies were identified as barriers to full adoption.
Study respondents reported barriers in the survey; descriptive statistics summarized the prevalence of workforce displacement concerns, ethical issues, and limited access to AI technologies as impediments to broader adoption.
medium negative Entrepreneurship in the Era of Artificial Intelligence: Rede... barriers to AI adoption (perceived workforce displacement, ethical concerns, lim...
There is a growing tension between relatively rigid education and training systems and the rapidly changing skill requirements of digitally driven labor markets.
Argument motivated and supported by comparative assessment of international practices and systemic analysis; descriptive/comparative evidence rather than quantified empirical testing.
medium negative EDUCATIONAL AND PROFESSIONAL STRATEGIES FOR PREPARING HUMAN ... alignment between education/training systems and labor market skill requirements
Analyses of online job postings indicate significant declines in demand for highly automatable and entry-level roles.
Empirical studies using online job-posting data described in the paper (methods: job-posting frequency/trend analysis; sample size/timeframe not specified in the excerpt).
medium negative The Impact of Generative AI on the Future of Employment: Opp... job demand (posting volume) for highly automatable positions and entry-level rol...
Since the public release of ChatGPT in November 2022, concerns regarding job displacement, wage reduction, and labor market restructuring have intensified.
Temporal observation in the paper referencing heightened public and policy concerns after ChatGPT's release; based on cited literature and discourse (no sample size given).
medium negative The Impact of Generative AI on the Future of Employment: Opp... perceived risk: job displacement, wage reduction, labor market restructuring
Low‑skill installation and maintenance jobs have increased, but wage levels and upward mobility for these jobs remain lower than those in high‑skill industries.
Finding reported from the literature review and cited reports/studies indicating growth in low‑skill installation/maintenance employment alongside comparative analyses of wages and career mobility; no specific datasets or sample sizes provided in the summary.
medium negative Job Polarization in Solar Power Plants: A Systematic Literat... number of low‑skill installation/maintenance jobs; wage levels; measures of upwa...
Job polarization is occurring in solar power plants as a result of automation or digital transformation and changes in required skill sets.
Synthesis from the systematic literature review and referenced reports/studies indicating links between automation/digitalization and occupational shifts in solar plants; specific studies and sample sizes not provided in the summary.
medium negative Job Polarization in Solar Power Plants: A Systematic Literat... degree of job polarization (shift in job distribution across skill levels) withi...
The paper argues that urgent policy intervention is needed to rebalance the benefits of AI against the ethical ramifications of these technologies, with particular emphasis on job displacement.
Author conclusion drawn from the stated literature-based analysis; the excerpt does not list the specific studies, empirical findings, or criteria used to reach this policy recommendation.
medium negative A Study on Work-Life Balance of Women Employees in the IT Se... need for policy intervention to address ethical implications and job displacemen...
Concern has grown about the ethical implications of task automation and the resulting AI-driven job displacement.
Author statement based on a review of (unspecified) novel studies and existing literature; no empirical sample size, instrumentation, or quantitative measure of 'concern' reported in the provided text.
medium negative A Study on Work-Life Balance of Women Employees in the IT Se... level of concern about ethical implications of AI-driven automation and job disp...
The limitations of systems that prioritize academic pathways constrain workforce adaptability and inclusive labor market development.
Argument based on synthesis of empirical studies and secondary data connecting education pathway composition to workforce adaptability and inclusiveness (presented as a policy-relevant conclusion rather than a quantified causal estimate).
medium negative Balancing Higher Education, Vocational Training, and Lifelon... workforce adaptability and inclusiveness of labor market outcomes
Skills mismatch in the labor market is structural and linked to education systems that prioritize academic pathways without adequate support for vocational and continuing training.
Integrated interpretation of comparative evidence and secondary data showing imbalances between academic and vocational provision and associated labor-market frictions (paper frames this as a structural conclusion; specific causal tests not described in the summary).
medium negative Balancing Higher Education, Vocational Training, and Lifelon... skills mismatch magnitude and its structural drivers (education system compositi...
Expansion of intermediate vocational skills has been limited relative to the expansion of higher education.
Comparative evidence and secondary data showing smaller increases in intermediate vocational qualifications compared with higher education attainment (specific metrics/country coverage not provided in the summary).
medium negative Balancing Higher Education, Vocational Training, and Lifelon... supply/attainment of intermediate vocational qualifications
The risk to the tax system is heightened by the federal government’s dependence on individual labor income even as economic value shifts toward mobile capital and AI ownership by large firms.
Analytical claim in the paper linking tax base dependence to shifts in economic value; no empirical measurement of 'mobile capital' or quantified shift included in the excerpt.
medium negative Taxing AI vulnerability of tax base (share of revenue from labor income) given shifts towa...
AI threatens to disrupt the tax system’s ability to fulfill its fundamental goals of raising revenue, redistributing income, and regulating taxpayer behavior.
Normative/policy argument made in the paper (no empirical testing or quantified projections provided in the excerpt).
medium negative Taxing AI tax system performance on revenue raising, income redistribution, and behavioral...