The Commonplace

Evidence (8570 claims)

Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
Filter: Adoption
All models are severely overconfident: their 95% intervals contain the true value only 9–44% of the time, far below the expected 95%.
Analysis of model-produced 95% credible intervals across elicited population statistics, measuring empirical coverage rates reported between 9% and 44%.
high negative Bayesian Elicitation with LLMs: Model Size Helps, Extra "Rea... empirical coverage rate of 95% credible intervals
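To make the coverage metric in this entry concrete, here is a minimal sketch of how empirical coverage of 95% intervals could be computed. The function name and the interval and truth values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def empirical_coverage(intervals, truths):
    """Fraction of (low, high) intervals that contain the matching true value."""
    if len(intervals) != len(truths):
        raise ValueError("intervals and truths must have the same length")
    if not intervals:
        raise ValueError("need at least one interval")
    hits = sum(1 for (low, high), t in zip(intervals, truths) if low <= t <= high)
    return hits / len(intervals)


# Hypothetical elicited 95% intervals and true values (made up for illustration).
intervals = [(10.0, 12.0), (0.2, 0.3), (45.0, 50.0), (3.0, 9.0), (100.0, 105.0)]
truths = [15.0, 0.25, 61.0, 4.0, 98.0]

coverage = empirical_coverage(intervals, truths)
print(f"empirical coverage: {coverage:.0%} (nominal: 95%)")
```

A well-calibrated model would score close to 0.95 on this measure; values far below that indicate intervals that are too narrow.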
Policy enforcement reduces total spending by 27.3%.
Quantitative result reported from the paper's experiments across baselines and scenarios (paper reports a 27.3% reduction attributed to policy enforcement).
In many deployment contexts, especially countries with strong real-time fiat systems like UPI, relying on crypto rails is misaligned with regulatory and infrastructure realities.
Contextual/argumentative claim in the paper contrasting crypto reliance with fiat systems such as UPI (no empirical country-level sample reported).
high negative APEX: Agent Payment Execution with Policy for Autonomous Age... alignment between payment-rail assumptions and regulatory/infrastructure realiti...
Traditional questionnaires yielded slightly higher accuracy in risk assessment.
Result reported from the two experiments comparing traditional questionnaires to adaptive ARQuest versions; no numeric accuracy or sample size provided in the excerpt.
Insurers must blindly trust users' responses, increasing the chances of fraud.
Stated as a motivating problem in the paper; presented as logical/empirical concern rather than supported by a reported study within the paper.
high negative AI in Insurance: Adaptive Questionnaires for Improved Risk P... fraud risk from self-reported responses
Insurance application processes often rely on lengthy and standardized questionnaires that struggle to capture individual differences.
Descriptive claim in paper introduction arguing limitations of standard questionnaires; no experiment or sample size reported for this assertion.
high negative AI in Insurance: Adaptive Questionnaires for Improved Risk P... ability of standardized questionnaires to capture individual differences
AI's disproportionate benefits for lagging regions help narrow interprovincial emission gaps.
Heterogeneity analysis reported in the provincial panel (2003–2021) showing stronger AI-related reductions in emissions inequality for lagging regions compared to advanced regions.
high negative Artificial intelligence, green innovation, and regional carb... interprovincial emission gaps (carbon inequality)
Green innovation is concentrated in coastal provinces and has not effectively diffused to inland areas, limiting its ability to reduce regional carbon inequality.
Spatial distribution analysis within the provincial panel showing geographic concentration of green innovation activity in coastal provinces and limited diffusion inland.
high negative Artificial intelligence, green innovation, and regional carb... geographic concentration of green innovation (diffusion to inland areas)
AI reduces carbon inequality primarily through improved energy efficiency, enhanced environmental monitoring, and more efficient resource allocation, disproportionately benefiting lagging regions and narrowing interprovincial emission gaps.
Mechanism analysis reported in the paper based on the provincial panel (2003–2021) linking AI development to proximate channels (energy efficiency, monitoring, resource allocation) and heterogeneous impacts across regions.
high negative Artificial intelligence, green innovation, and regional carb... carbon inequality (interprovincial emission gaps)
AI development significantly reduces carbon inequality, particularly when measured by the Gini index.
Empirical analysis using a provincial panel dataset covering 2003–2021; carbon inequality measured with the Gini index; reported statistically significant negative association between AI development and Gini-measured carbon inequality.
high negative Artificial intelligence, green innovation, and regional carb... carbon inequality (Gini index)
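For readers unfamiliar with the inequality measure named in this entry, the following is a minimal sketch of a Gini coefficient computed over a list of regional values. The emissions figures are hypothetical, and the paper's exact variable definition (for example per-capita versus total emissions) is not stated here, so treat this as an illustration of the index only.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need a non-empty list with a positive sum")
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    # Rank-weighted formula on sorted data:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical emissions for four provinces (illustrative numbers only).
emissions = [2.0, 4.0, 6.0, 8.0]
print(f"Gini: {gini(emissions):.2f}")
```

A fall in this index over time would correspond to narrowing emission gaps between provinces.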
Cross-equipment generalization is poor, with 42.7% performance on held-out datasets.
Paper reports held-out dataset evaluation showing 42.7% (presumably accuracy or task completion) for cross-equipment generalization.
high negative PHMForge: A Scenario-Driven Agentic Benchmark for Industrial... held-out dataset performance (cross-equipment generalization)
Multi-asset reasoning causes a 14.9 percentage point degradation in performance.
Paper reports a 14.9 percentage point performance degradation attributed to multi-asset reasoning in comparative analyses.
high negative PHMForge: A Scenario-Driven Agentic Benchmark for Industrial... performance degradation (percentage points) when reasoning across multiple asset...
There are systematic failures in tool orchestration, with 23% incorrect sequencing.
Paper reports a measured incorrect sequencing rate of 23% during evaluation of agent tool orchestration across scenarios.
high negative PHMForge: A Scenario-Driven Agentic Benchmark for Industrial... rate of incorrect tool sequencing
Even top-performing configurations achieve only 68% task completion.
Reported aggregated performance result from the benchmark evaluation across the tested frameworks and LLMs (paper statement). The benchmark contains 75 scenarios (used as evaluation instances).
Improvements in operational resilience (OR) effectively reduce corporate operational risk.
Further analysis reported in the paper linking higher OR to lower operational risk measures for firms in the sample.
high negative Does Artificial Intelligence Improve the Operational Resilie... corporate operational risk (reduction)
AI promotes operational resilience by reducing management agency conflicts.
Mechanism (mediation) tests reported in the paper showing AI associated with reductions in measures of agency/management conflict, which in turn relate to OR improvements.
high negative Does Artificial Intelligence Improve the Operational Resilie... management agency conflicts (reduction)
Specific occupations such as credit analysts, judges, and sustainability specialists reach ATE scores of 0.43-0.47 by 2030.
Reported model outputs / ATE score estimates for individual occupations within the paper's 2025-2030 regional application.
high negative Agentic AI and Occupational Displacement: A Multi-Regional T... ATE score (automation exposure) for named occupations
Applying the ATE framework across five major US technology regions (Seattle-Tacoma, San Francisco Bay Area, Austin, New York, and Boston) over a 2025-2030 horizon, 93.2% of the 236 analyzed occupations across six information-intensive SOC groups cross the moderate-risk threshold (ATE >= 0.35) in Tier 1 regions by 2030.
Modeling/application of the ATE score to O*NET-derived tasks for 236 occupations in six SOC groups across five named US regions with forecasts for 2025-2030; explicit numeric result reported (93.2%).
high negative Agentic AI and Occupational Displacement: A Multi-Regional T... proportion of occupations crossing ATE moderate-risk threshold (automation expos...
Agentic AI systems execute end-to-end workflows (multi-step reasoning, tool invocation, autonomous decision-making) and substantially expand occupational displacement risk beyond what existing task-level analyses capture.
Theoretical extension of the Acemoglu-Restrepo task exposure framework described in the paper; conceptual argument contrasting prior automation (subtask substitution) with agentic AI (end-to-end workflow automation). No empirical sample size reported for this conceptual claim.
high negative Agentic AI and Occupational Displacement: A Multi-Regional T... occupational displacement risk (automation exposure)
Agent contributions are associated with more churn over time compared to human-authored code.
Longitudinal comparison between agent-generated and human-authored contributions reported in the paper (churn/survival estimates described; association between agent contributions and higher churn asserted).
high negative Investigating Autonomous Agent Contributions in the Wild: Ac... code churn rate over time (agent-generated vs human-authored)
Informal workers cannot capture augmentation rents: the estimated coefficient for H^A in informal sector is negative (beta_2 = -0.044).
Subsample or interaction estimate from the augmented Mincer regression using the same merged dataset (N = 105,517); reported coefficient beta_2 = -0.044 for informal workers.
high negative Augmented Human Capital: A Unified Theory and LLM-Based Meas... wages (return to H^A for informal workers)
Unbalanced or poorly governed adoption of Big Data and AI contributes to increased systemic risk, cybersecurity vulnerability, regulatory fragmentation and third-party dependence on BigTech platforms.
Argument based on qualitative literature review and synthesis of international empirical studies and comparative sector analysis; no single-sample empirical study in this paper.
high negative Implications of Big Data Technologies for the Resilience of ... systemic risk; cybersecurity vulnerability; regulatory fragmentation; third-part...
Extreme automation (high AI intensity) causes employment decline.
Part of the U-shaped relationship reported by the paper's empirical results; described qualitatively in the abstract/summary.
high negative Impact Of Artificial Intelligence (AI) On Employment employment decline
Task orchestration is the most under-researched dimension among the five workplace-design components.
Finding from the PRISMA-guided systematic review of 120 papers, which mapped coverage across the five dimensions and identified task orchestration as having the least research attention.
high negative From Automation to Augmentation: A Framework for Designing H... volume/coverage of research on task orchestration
Decision authority allocation emerges as the binding constraint for Society 5.0 transitions.
Result synthesized from the systematic review and theoretical analysis mapping the five workplace-design dimensions; stated as the binding constraint in the paper's findings.
high negative From Automation to Augmentation: A Framework for Designing H... constraint on transitions to human-centric (Society 5.0) technology integration
The environmental impact of AI is weaker in energy-efficient countries.
Heterogeneity analysis in the paper dividing sample by energy-efficiency status (energy-efficient vs. energy-inefficient countries) shows a smaller AI→CO2 association in energy-efficient countries (104-country panel, 2000–2023).
high negative Artificial Intelligence: A Blessing or a Curse for Climate A... CO2 emissions (heterogeneous AI effect by energy efficiency)
Advanced digital infrastructure (DII) significantly mitigates the positive effect of AI on CO2 emissions.
Moderation analysis in the panel regressions (104 countries, 2000–2023) including interaction terms between AI adoption and digital infrastructure; results reported that stronger DII reduces the environmental impact of AI.
high negative Artificial Intelligence: A Blessing or a Curse for Climate A... CO2 emissions (AI effect moderated by digital infrastructure)
High institutional quality (GQI) significantly mitigates the positive effect of AI on CO2 emissions.
Moderation analysis in the panel regressions (same 104-country sample, 2000–2023) including interaction terms between AI adoption and governance quality; reported results indicate the AI→CO2 effect is weaker when GQI is stronger.
high negative Artificial Intelligence: A Blessing or a Curse for Climate A... CO2 emissions (AI effect moderated by governance quality)
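The two moderation entries above rest on interaction terms. As a sketch of how such a term is usually read, assume a regression of the form CO2 = b_ai * AI + b_int * (AI * M) + ..., where M is the moderator (digital infrastructure or governance quality). The coefficient values below are hypothetical and are not the paper's estimates.

```python
def marginal_effect_of_ai(beta_ai, beta_interaction, moderator):
    """Marginal effect of AI on CO2 when the model includes an AI x moderator term."""
    return beta_ai + beta_interaction * moderator


# Hypothetical coefficients: a positive baseline AI effect and a negative interaction.
BETA_AI = 0.30
BETA_INTERACTION = -0.20

for level in (0.0, 0.5, 1.0):
    effect = marginal_effect_of_ai(BETA_AI, BETA_INTERACTION, level)
    print(f"moderator = {level:.1f} -> marginal effect of AI on CO2 = {effect:.2f}")
```

With a negative interaction coefficient, the estimated AI effect on emissions shrinks as the moderator rises, which is what "mitigates the positive effect" means in these entries.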
The literature shows persistent gaps in empirical validation, standardized evaluation methods, and sector-specific comparative analyses of agentic AI in financial services.
Review-level assessment noting limited empirical studies, heterogeneous evaluation metrics, and few direct cross-sector comparisons up to mid-2024.
high negative A Comparative & Systematic Review of Literature on the I... availability/quality of empirical validation and evaluation standards
Significant implementation barriers persist, notably workforce transformation challenges, legacy system integration difficulties, and trust deficits.
Thematic synthesis across empirical and conceptual papers in the review reporting implementation barriers and change management issues.
high negative A Comparative & Systematic Review of Literature on the I... implementation barriers (workforce, legacy systems, trust)
Ethical concerns—including bias, lack of transparency, and regulatory compliance risks—remain critical for agentic AI in financial services and necessitate layered governance and human-AI collaboration.
Collation of ethical, legal, and governance issues reported across the reviewed multidisciplinary studies and normative discussions.
high negative A Comparative & Systematic Review of Literature on the I... prevalence/severity of ethical and regulatory risks and governance needs
Insurance is comparatively underrepresented in the literature and in reported agentic AI deployments compared with banking and investment.
Review finding (counts/themes across included studies indicating fewer studies/applications in insurance relative to banking and investment).
high negative A Comparative & Systematic Review of Literature on the I... relative representation/adoption across financial subsectors
When predictions from the two sources conflict, the AI agent aligns more frequently with the prompt, despite its lower accuracy.
Analysis of cases where prompt-based and revealed-data-based AI predictions differed; reported frequency with which the AI's action matched the prompt versus the revealed-preference prediction.
high negative Should I State or Should I Show? Aligning AI with Human Pref... frequency of AI alignment with prompt versus revealed-preference prediction in c...
Task complexity shapes substitution: low-complexity tasks see high substitution, while high-complexity tasks favor limited partial automation.
Calibration of the model to O*NET tasks + expert survey + GPT-4o decompositions; implementation results reported for computer vision showing substitution varies with task complexity.
high negative Economics of Human and AI Collaboration: When is Partial Aut... degree of labor substitution as a function of task complexity
AI systems exhibit predictable but diminishing returns to data, compute, and model size (scaling-law experiments), implying the cost of higher accuracy is convex: good performance may be inexpensive, but near-perfect accuracy is disproportionately costly.
Scaling-law experiments estimating performance as a function of data, compute, and model size; described experimental estimation of production function.
high negative Economics of Human and AI Collaboration: When is Partial Aut... marginal returns to inputs (data, compute, model size) and marginal cost of accu...
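The convex cost of accuracy described in this entry can be illustrated with a simple power law. This sketch assumes error falls as error = A * compute ** (-ALPHA); the functional form and the constants A and ALPHA are illustrative assumptions, not the paper's estimated production function.

```python
A = 1.0      # hypothetical scale constant
ALPHA = 0.5  # hypothetical scaling exponent (diminishing returns)


def error_at(compute):
    """Error rate implied by the assumed power law at a given compute budget."""
    return A * compute ** (-ALPHA)


def compute_for_error(target_error):
    """Compute budget needed to reach a target error (inverse of the power law)."""
    if target_error <= 0:
        raise ValueError("target_error must be positive")
    return (A / target_error) ** (1.0 / ALPHA)


for target in (0.1, 0.01, 0.001):
    print(f"target error {target}: compute needed ~ {compute_for_error(target):,.0f}")
```

Under these assumed constants, each tenfold cut in error requires a hundredfold increase in compute, so moderate accuracy is cheap while near-perfect accuracy becomes disproportionately expensive.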
Indonesia's current employment law framework is reactive, focusing on post-termination compensation that has not been able to address the long-term impact of AI disruption.
Normative analysis of legislation and findings from the reviewed literature; conclusion reported by the study's authors.
high negative Reformasi Hukum Ketenagakerjaan di Era Artificial Intelligen... legal policy orientation (reactive vs. proactive) and adequacy of handling impacts ...
There is not yet any explicit regulation of retraining obligations or of mechanisms for fairly distributing the benefits of technology in Indonesia's current employment law framework.
Finding from an analysis of national legislation (the Job Creation Law, UU Cipta Kerja, and its implementing regulations) and the literature examined in the normative study.
high negative Reformasi Hukum Ketenagakerjaan di Era Artificial Intelligen... regulatory gap on retraining obligations and distribution mechanisms for ...
The adoption of AI raises legal challenges concerning the protection of workers' rights, social justice, and the sustainability of the employment system.
Normative analysis of the socio-economic consequences of AI, synthesized from national (SINTA) and international literature; the conceptual and comparative approaches are described in the methods.
high negative Reformasi Hukum Ketenagakerjaan di Era Artificial Intelligen... need for legal protection of workers' rights and social justice
The rapid development of Artificial Intelligence (AI) has brought fundamental changes to the structure of Indonesia's labor market, with a rising risk of human jobs being replaced by automation technology.
Background statement supported by a literature review of SINTA-indexed national journals and reputable international journals (method: normative legal research using statutory, conceptual, and comparative approaches).
high negative Reformasi Hukum Ketenagakerjaan di Era Artificial Intelligen... risk of job replacement by automation (job displacement risk)
The common claim that generative AI simply amplifies the Dunning–Kruger effect is too coarse to capture the available evidence.
Paper's synthesis of heterogeneous empirical findings from human–AI interaction, learning research, and model evaluation used to critique the uniform-amplification interpretation; no single empirical countertest reported.
high negative Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupli... validity of the 'amplified Dunning–Kruger' interpretation
LLM use degrades metacognitive accuracy and flattens the classic competence–confidence gradient across skill groups (i.e., reduces calibration and narrows differences in self-assessed confidence by skill level).
Synthesis of studies from human–AI interaction and learning research reported in the paper that document worsened calibration and a reduction in the competence–confidence gradient when users rely on LLM outputs; the paper does not report a single combined sample size.
high negative Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupli... metacognitive accuracy / calibration and competence–confidence gradient
New technologies are initially skill intensive (demand more college-educated workers) but become less so as they age (they get standardized and accessible to less-skilled workers).
Empirical descriptive evidence from novel text-based data combining patent text and job postings (building on Kalyani et al., 2025) tracking technologies and their changing demand for skills as they age.
high negative THE SKILL PREMIUM IN TIMES OF RAPID TECHNOLOGICAL CHANGE demand for college-educated workers by technology age
Observed declines in browsing time due to ChatGPT adoption are concentrated in website categories such as search and news, which are highly exposed to substitution by generative AI.
Category-level browsing time changes across website classification; concentration of declines in categories identified as highly overlap-exposed to chatbot capabilities using web-scraping and LLM site-level overlap classification.
high negative https://arxiv.org/pdf/2603.03144 browsing time on search and news website categories
High-income and younger households adopt generative AI substantially faster than low-income and older counterparts, and this gap is widening over time ('generative AI divide').
Descriptive heterogeneity analysis using Comscore household demographics (income and age bins) and observed adoption trajectories across 2021–2024; authors report widening gap rather than convergence.
high negative https://arxiv.org/pdf/2603.03144 heterogeneity in adoption rates by income and age (inequality in adoption)
Most of today's agents remain isolated tools or closed-ecosystem orchestrators rather than socially integrated participants in open networks.
Author claim/assessment presented as current-state analysis; no empirical breakdown or study sample provided in the text.
high negative Synergy: A Next-Generation General-Purpose Agent for Open Ag... degree of social integration / openness of agent deployments
Prominent studies predict substantial job displacement due to automation.
Paper asserts this as background, referencing the existence of prominent studies in the literature (no specific citations or sample sizes provided in the abstract).
high negative AI Civilization and the Transformation of Work job losses / displacement
The literature singles out endemic data quality issues, algorithmic bias, governance frameworks, and regulatory compliance as concerns that require trusted AI and sustainable digital finance ecosystems.
Synthesis from the reviewed literature noting recurring concerns and limitations reported across studies; the paper lists these as major challenges identified in the field.
high negative Artificial intelligence in sustainable finance and Environme... prevalence of data quality issues, algorithmic bias, governance and regulatory c...
AI can worsen financial and market performance if it crowds out normal R&D.
Paper's empirical analysis and interpretation linking AI dependence to poorer financial/market performance through displacement of standard R&D activities; presented as a study finding.
high negative The 'Intelligent Trap' in Corporate Finance—A Study Based on... financial and market performance
High AI dependency disclosed in financial reports does not improve firms' financial health and may even endanger it.
Empirical results drawn from the study's analysis of listed new energy vehicle and automobile manufacturers (2013–2023); statement appears in the paper's findings/conclusions.
high negative The 'Intelligent Trap' in Corporate Finance—A Study Based on... financial health / corporate financial condition
AI dependency reduces financial safety for listed new energy vehicle and automobile manufacturers.
Empirical analysis of a sample of listed new energy vehicle and automobile manufacturers covering 2013–2023; the paper reports data analysis showing AI dependency reduces financial safety.
high negative The 'Intelligent Trap' in Corporate Finance—A Study Based on... financial safety / corporate financial risk