The Commonplace

Evidence (4049 claims)

Adoption (5126 claims)
Productivity (4409 claims)
Governance (4049 claims)
Human-AI Collaboration (2954 claims)
Labor Markets (2432 claims)
Org Design (2273 claims)
Innovation (2215 claims)
Skills & Training (1902 claims)
Inequality (1286 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 369 105 58 432 972
Governance & Regulation 365 171 113 54 713
Research Productivity 229 95 33 294 655
Organizational Efficiency 354 82 58 34 531
Technology Adoption Rate 277 115 63 27 486
Firm Productivity 273 33 68 10 389
AI Safety & Ethics 112 177 43 24 358
Output Quality 228 61 23 25 337
Market Structure 105 118 81 14 323
Decision Quality 154 68 33 17 275
Employment Level 68 32 74 8 184
Fiscal & Macroeconomic 74 52 32 21 183
Skill Acquisition 85 31 38 9 163
Firm Revenue 96 30 22 148
Innovation Output 100 11 20 11 143
Consumer Welfare 66 29 35 7 137
Regulatory Compliance 51 61 13 3 128
Inequality Measures 24 66 31 4 125
Task Allocation 64 6 28 6 104
Error Rate 42 47 6 95
Training Effectiveness 55 12 10 16 93
Worker Satisfaction 42 32 11 6 91
Task Completion Time 71 5 3 1 80
Wages & Compensation 38 13 19 4 74
Team Performance 41 8 15 7 72
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 17 15 9 5 46
Job Displacement 5 28 12 45
Social Protection 18 8 6 1 33
Developer Productivity 25 1 2 1 29
Worker Turnover 10 12 3 25
Creative Output 15 5 3 1 24
Skill Obsolescence 3 18 2 23
Labor Share of Income 7 4 9 20
Filtered by: Governance
If cognitive interlocks are widely adopted, many negative externalities can be internalized and AI-driven productivity gains can be realized more sustainably; absent such controls, equilibrium may drift toward higher error rates and systemic incidents.
Long-run equilibrium argument based on theoretical reasoning and conditional claims; no longitudinal or cross-firm empirical evidence presented.
speculative mixed Overton Framework v1.0: Cognitive Interlocks for Integrity i... long-run system outcomes (error rates, incident frequency, net productivity) con...
Labor demand effects are ambiguous: junior/entry-level demand may be reduced for some tasks while demand for verification and higher-skill roles may rise.
Economic reasoning, early observational signals, and theoretical task-reallocation frameworks; empirical longitudinal evidence is limited or absent.
speculative mixed ChatGPT as a Tool for Programming Assistance and Code Develo... labor demand by skill level and occupation (employment levels, hiring rates)
The effectiveness of generative AI depends critically on human-AI workflows: prompt design, iterative refinement, and human vetting materially affect outcomes.
Qualitative analyses of interaction patterns and experiments manipulating prompting/iteration showing variation in outcomes; many studies report improved outputs after iterative prompting and human-in-the-loop refinement.
medium-high mixed ChatGPT as an Innovative Tool for Idea Generation and Proble... variation in output quality based on prompt design; changes in output after iter...
Market demand is likely to bifurcate: high-value clinical markets will require rigorous explainability and neuroscientific grounding (higher willingness-to-pay), while research and consumer segments may tolerate black-box models (lower margins).
Market segmentation argument built from differing end-user requirements and tolerance for opaque models; presented as a projected implication rather than an empirically tested market study.
speculative mixed Explainable Artificial Intelligence (XAI) for EEG Analysis: ... market segmentation / willingness-to-pay across segments
Teams often produce evaluation outputs (tests, metrics, user feedback) but lack mechanisms, processes, or technical levers to convert those outputs into actionable engineering or product changes—a novel “results-actionability gap.”
Recurring theme from the 19 practitioner interviews and coding; authors explicitly articulate and label this gap based on participants' reports.
medium-high negative Results-Actionability Gap: Understanding How Practitioners E... ability to translate evaluation outputs into concrete product/engineering change...
The study confirms several previously documented evaluation challenges with LLMs: model unpredictability, metric mismatch, high human-evaluation costs, and difficulty reproducing failures.
Interview data from 19 practitioners; thematic analysis flagged these recurring problems as reported by participants and aligned with prior literature.
medium-high negative Results-Actionability Gap: Understanding How Practitioners E... presence and prevalence of known evaluation challenges
Emergent quality hierarchies among agents imply winner-take-most dynamics in informational value and potential market concentration in agent quality.
Observed formation of quality hierarchies in agent interactions and documented economic interpretation; this is a hypothesis/implication drawn from qualitative patterns rather than measured market outcomes.
speculative negative When Openclaw Agents Learn from Each Other: Insights from Em... distribution of informational value / concentration of agent quality
Security of LLM-based MASs functions as an economic externality: failures can impose social costs (misinformation, poor collective decisions), and absent liability or market incentives providers may underinvest in robustness.
Economic reasoning and implication section in the paper—conceptual argument linking the technical vulnerability to economic externality and incentive misalignment. No empirical economic data provided in the summary.
speculative negative Don't Trust Stubborn Neighbors: A Security Framework for Age... investment in defenses (underprovision) and social costs from MAS security failu...
Analytical conditions on stubbornness and influence weights identify when a single adversary can dominate network dynamics (i.e., influence propagation criteria derived from FJ fixed-point analysis).
Mathematical/theoretical analysis of FJ model fixed points and influence propagation in the paper; derivation of conditions relating agent stubbornness and interpersonal trust weights to steady-state influence.
medium-high negative Don't Trust Stubborn Neighbors: A Security Framework for Age... theoretical criteria predicting when an agent's influence weight leads to domina...
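The FJ (Friedkin-Johnsen) fixed-point claim above can be illustrated with a minimal sketch. It assumes the standard FJ update, in which each agent mixes its neighbors' current opinions with its own initial opinion according to a stubbornness weight; the paper's exact notation and conditions are not given in this listing, so the function name `fj_steady_state`, the trust matrix `W`, and all numeric values are hypothetical.

```python
def fj_steady_state(W, stubbornness, x0, iters=200):
    """Iterate the standard Friedkin-Johnsen update to an approximate fixed point.

    Update rule (an assumption, not taken from the paper):
        x_i <- (1 - s_i) * sum_j W[i][j] * x_j + s_i * x0_i
    W is a row-stochastic trust matrix, s_i in [0, 1] is agent i's stubbornness.
    """
    n = len(x0)
    x = list(x0)
    for _ in range(iters):
        x = [
            (1.0 - stubbornness[i]) * sum(W[i][j] * x[j] for j in range(n))
            + stubbornness[i] * x0[i]
            for i in range(n)
        ]
    return x


# Hypothetical 3-agent network: agent 0 is a fully stubborn adversary
# holding opinion 1.0; agents 1 and 2 start at 0.0 and split their trust
# between the adversary and each other.
W = [
    [1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
x0 = [1.0, 0.0, 0.0]

# Agents with zero stubbornness are pulled all the way to the adversary.
dominated = fj_steady_state(W, [1.0, 0.0, 0.0], x0)

# Moderately stubborn agents keep part of their initial opinion.
resisted = fj_steady_state(W, [1.0, 0.5, 0.5], x0)

print(dominated)  # agents 1 and 2 approach 1.0
print(resisted)   # agents 1 and 2 approach 1/3
```

In this toy setting the single adversary dominates only when the other agents have no anchoring to their own initial opinions, which is the kind of relationship between stubbornness and trust weights the claim describes.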
If models frequently leak or misuse preferences in third‑party contexts, users and organizations will discount the value of personalization or demand stronger controls, increasing costs for deploying memory features and reducing consumer surplus.
Economic reasoning and implication drawn from the observed misapplication behavior; no empirical user adoption or market data provided in the study to directly support this claim.
speculative negative BenchPreS: A Benchmark for Context-Aware Personalized Prefer... Projected changes in trust, adoption costs, and consumer surplus (not empiricall...
The failure mode (misapplication of preferences to third parties) creates negative externalities (privacy violations, normative harms, misinformation, contractual breaches) that markets and platforms may not internalize without regulation or design changes.
Economic interpretation and argumentation building on the empirical failure mode; these harms are hypothesized implications rather than measured outcomes in the paper.
speculative negative BenchPreS: A Benchmark for Context-Aware Personalized Prefer... Projected negative externalities on third parties (not directly measured in stud...
Unclear liability frameworks increase perceived and real costs and can slow adoption by hospitals and insurers.
Policy analyses and procurement narratives that cite liability uncertainty as a barrier to procurement and deployment.
medium-high negative Human-AI interaction and collaboration in radiology: from co... time-to-adoption, procurement decisions citing liability concerns, insurance/cov...
Up-front implementation costs commonly include procurement, integration with PACS/EMR, UI/UX development, regulatory compliance, and staff training; recurring costs include monitoring, data labeling, software updates, and cybersecurity.
Implementation reports, vendor and hospital accounts, and qualitative studies documenting cost categories (specific dollar amounts vary across settings and are rarely published in detail).
medium-high negative Human-AI interaction and collaboration in radiology: from co... implementation capital expenditures, annual operating expenditures
Without continuous support for upskilling/reskilling and inclusive policies, AI risks becoming a source of exclusion rather than an enabler of human advancement.
Normative conclusion derived from reviewed literature and thematic interpretation in the qualitative study (literature-based; evidence is secondary and not quantified).
speculative negative THE IMPACT OF ARTIFICIAL INTELLIGENCE IN THE WORKPLACE: OPPO... social inclusion versus exclusion related to AI adoption
The authors estimate 70-75% automation potential for research literature synthesis.
Quantitative estimate offered by the authors (70-75%) as part of function-by-function analysis; no described empirical evaluation or sample supporting the figure.
speculative negative Are Universities Becoming Obsolete in the Age of Artificial ... percent automation potential for research literature synthesis
Knowledge transmission (teaching/lecturing) shows 75-80% AI substitutability.
Authors' quantitative estimate presented in the analysis (75-80%); the paper does not detail empirical methods or validation samples for this percentage.
speculative negative Are Universities Becoming Obsolete in the Age of Artificial ... percent substitutability/automation potential of knowledge transmission
Administrative tasks face 75-80% disruption risk from AI.
Paper provides a quantitative estimate (75-80%) as part of its functional disruption assessment; no empirical methodology, dataset, or sample size is described to support the numeric range.
speculative negative Are Universities Becoming Obsolete in the Age of Artificial ... percent disruption/substitutability of administrative tasks
Aggregation and linkage across data sources can reveal intimate, predictive traits that were not foreseeable to the data subject at the time of sale.
Conceptual argument with references to documented cases and literature on data linkage and inference; relies on illustrative examples rather than original empirical experiments.
medium-high negative Data and privacy: Putting markets in (their) place Extent to which data aggregation yields unforeseen sensitive inferences about in...
The United States shows a more market-driven (firm-dominated) patenting profile and comparatively weaker integration between AI and robotics patent trajectories.
Country-level and actor-type decomposition for U.S. patent filings (1980–2019), showing higher firm share of patents and weaker long-run association/cointegration between core AI and AI-enhanced robotics series compared with China (as reported in the paper).
medium-high negative The "Gold Rush" in AI and Robotics Patenting Activity. Do in... share of patents by firms in U.S.; strength of long-run integration between U.S....
There is a risk of a two‑tier market where high‑quality temporal‑preserving enhancements are costly, increasing inequality in experiential welfare and cognitive capital.
Speculative socioeconomic implication based on cost/access arguments and distributional concerns; no inequality modeling or empirical pricing data provided.
speculative negative XChronos and Conscious Transhumanism: A Philosophical Framew... distributional inequality in access to temporal‑quality enhancements and resulti...
Technical expansion without an accompanying theory of lived temporality risks increasing capabilities while degrading the qualitative depth of human experience (presence, attentional flow, felt meaning).
Argumentative claim supported by philosophical analysis and literature synthesis (neurophenomenology, attention economics); no empirical test reported.
speculative negative XChronos and Conscious Transhumanism: A Philosophical Framew... qualitative depth of human experience (presence, attentional flow, felt meaning)
High-quality, equitable climate information displays public-good characteristics (nonrival, nonexcludable at scale), so private incentives alone will underprovide geographically representative data and shared infrastructure.
Economic reasoning supported by observed concentration of compute and model development (mapping) and standard public-goods theory; no formal empirical market model estimated in the paper.
medium-high negative The Rise of AI in Weather and Climate Information and its Im... Level of provision of geographically representative data/shared infrastructure u...
Full replacement of physicians would require breakthroughs in robust generalization, embodied capabilities, and legal/regulatory change—currently lacking.
Conceptual inference based on documented limitations (OOD generalization, lack of embodied/sensorimotor capability, unsettled legal/regulatory environment) summarized in the review.
speculative negative Will AI Replace Physicians in the Near Future? AI Adoption B... feasibility/timeline for physician replacement
Acquisition workforce capacity, which is shrinking, functions as a critical scarce input in defense AI economics; reduced human capital lowers the Department's ability to extract value from AI investments and to internalize externalities, decreasing effective returns to AI procurement.
Institutional trend evidence of workforce reductions combined with economic analysis treating institutional capacity as an input factor. No empirical quantification of returns or elasticity provided—this is analytical inference.
speculative negative FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... effective returns to AI procurement given acquisition workforce capacity (theore...
Ambiguous standards increase uncertainty for contracting officers, raising the risk that they will either over-rely on vendor claims or inconsistently enforce requirements, both of which harm procurement integrity.
Policy-text analysis identifying vague criteria combined with qualitative analysis of procurement decision workflows; argument based on measurement and enforcement friction literature. No empirical study of contracting officer behavior provided.
speculative negative FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... consistency and reliability of contracting officer enforcement and reliance on v...
Lower governance barriers and ambiguous procurement criteria (e.g., undefined 'model objectivity') can skew market competition toward suppliers that prioritize rapid iteration and opaque practices over rigorous assurance, harming traceability and quality.
Market-effects reasoning grounded in policy changes (document analysis) and qualitative institutional analysis of measurement/enforcement frictions. No market-share or supplier-behavior data provided.
speculative negative FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... market composition and supplier incentives (favoring speed/opacity vs. assurance...
Mandating permissive contract terms and enabling waivers reduces private incentives for contractors to invest in safety and compliance, creating classical moral-hazard problems in defense AI procurement.
Economic reasoning and principal–agent analysis applied to the documented contractual changes (primary-source policy text). No empirical measurement of contractor investment behavior provided; claim is theoretical/inferential.
speculative negative FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... contractor incentives to invest in safety and compliance (theoretical inference)
A mismatch between expanded waiver authority (Barrier Removal Board) and declining acquisition oversight capacity creates procurement-integrity and systemic risks: faster acquisition concurrent with weakened institutional checks increases likelihood of improper procurement decisions and unchecked deployment of unsafe or unvetted AI models.
Synthesis of primary-source policy analysis, institutional staffing trend evidence, and qualitative risk/scenario assessment using principal–agent and moral-hazard frameworks. This is a conceptual risk projection rather than an empirically derived probability estimate.
speculative negative FEATURE COMMENT: Governance as a "Blocker": How the Pentagon... probability and nature of procurement-integrity failures and deployments of unsa...
Emerging agentic/AGI capabilities introduce new failure modes and governance challenges that standard ML oversight may not cover.
Emerging literature, theoretical analyses, and expert opinion summarized in the synthesis; authors note limited empirical long-term data and characterize this as an emergent risk.
speculative negative Framework for Government Policy on Agentic and Generative AI... governance risk / novel failure modes
Centralized provision of high-quality coding models by a few vendors could produce vendor lock-in and increase platform power in software development inputs.
Market-structure analysis and industry observations synthesized in the paper; the claim is forward-looking and not established by longitudinal market data within the review.
speculative negative ChatGPT as a Tool for Programming Assistance and Code Develo... market concentration measures (e.g., HHI), indicators of vendor lock-in (switchi...
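The HHI (Herfindahl-Hirschman Index) named as a concentration measure in the entry above is the sum of squared market shares. A minimal sketch follows, with shares in percentage points so the index runs from near 0 to 10000; the function name `hhi` and the vendor shares are hypothetical and do not come from the paper.

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman Index: sum of squared market shares (in percent)."""
    return sum(s * s for s in shares_percent)


# Hypothetical coding-model market with three vendors.
concentrated = hhi([50, 30, 20])

# Hypothetical market split evenly across four vendors.
even_split = hhi([25, 25, 25, 25])

print(concentrated)  # 3800
print(even_split)    # 2500
```

A higher value indicates that supply is concentrated in fewer vendors, which is the direction the claim warns about.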
This reversal of the burden of proof creates moral-hazard-like behavior: incentives for speed reduce verification effort.
Theoretical argument built on the micro-coercion mechanism and economic reasoning; no empirical validation provided.
speculative negative Overton Framework v1.0: Cognitive Interlocks for Integrity i... verification effort per artifact (e.g., reviewer time), proportion of unchecked ...
Under time pressure, developers adopt an implicit default of accepting plausible machine outputs unless they can disprove them (the 'micro-coercion of speed'), effectively reversing the burden of proof.
Behavioral mechanism posited from descriptive reasoning and thought experiments; no behavioral experiments, surveys, or observational data reported.
speculative negative Overton Framework v1.0: Cognitive Interlocks for Integrity i... developer acceptance rate of machine-generated outputs under time pressure; rate...
DAR dynamics (authority states, hysteresis, safe-exit times) introduce path-dependence and switching costs that should be treated as state variables in production and decision models of human–AI joint work.
Theoretical implications section arguing these elements add path-dependence and switching costs to economic/production models; analytic reasoning, not empirical measurement.
medium-high negative Human–AI Handovers: A Dynamic Authority Reversal Framework f... switching_costs; path_dependence_indicators; effect_on_throughput
Concentration risks exist because high fixed costs for safe integration and model adaptation may favor larger incumbents or platform providers.
Conceptual economic reasoning and practitioner commentary synthesized in the review; no empirical market-structure analysis or sample-based evidence included here.
speculative negative The Effectiveness of ChatGPT in Customer Service and Communi... market concentration indicators and barriers to entry related to AI integration ...
Imported AI systems may impose foreign values and norms, risking erosion of indigenous knowledge and social cohesion.
Normative and conceptual argument supported by cited case studies and policy analyses; no original anthropological or sociological fieldwork in the paper.
low-medium negative Towards Responsible Artificial Intelligence Adoption: Emergi... indicators of indigenous knowledge retention, measures of cultural alignment of ...
Deployed AI systems can produce algorithmic bias that harms marginalized groups when models are trained on skewed or non‑representative data.
Synthesis of prior empirical findings and case studies on algorithmic bias and fairness in ML systems; paper does not present new empirical tests.
medium-high negative Towards Responsible Artificial Intelligence Adoption: Emergi... fairness metrics, disparate error rates, incidence of discriminatory outcomes fo...
Human reviewers may over-trust machine-generated language and explanations (automation bias), reducing the likelihood of detecting fraudulent outputs.
Reference to automation-bias literature and conceptual examples; threat modeling and illustrative vignettes in the article.
medium-high negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... detection rate of fraudulent outputs by human reviewers when outputs are machine...
Existing internal audit and compliance frameworks focus on access, transaction, and system controls, not on content-generation integrity.
Literature and standards review combined with threat-control mapping demonstrating gaps in content/provenance coverage.
medium-high negative Prompt Engineering or Prompt Fraud? Governance Challenges fo... coverage of content-generation integrity within existing audit/compliance framew...
AI systems and economic models are biased toward European languages because of lack of vernacular corpora; investing in high-quality corpora for African vernaculars (e.g., Cameroon Pidgin) is necessary to avoid misallocation of resources.
Policy implication extrapolated from the study's finding that vernacular mediation materially affects outcomes, combined with general knowledge about data-driven AI bias; no empirical AI-modeling tests in the paper.
speculative negative (current state) / positive (recommended investment) From Linguistic Hybridity to Development Sovereignty: Pidgin... AI model performance and allocation bias (inferred, not measured)
There are research opportunities to measure returns to 'teaching' (causal impact of configuring agents on human skill accumulation and earnings) and to model agent-platform ecosystems with network effects, spillovers, and endogenous quality hierarchies.
Author-stated research agenda and proposed empirical questions derived from the observed phenomena; not empirical results but recommended directions.
speculative null result When Openclaw Agents Learn from Each Other: Insights from Em... need for future causal estimates of returns to teaching and formal models of eco...
Recommended research priorities include hierarchical/temporal-decomposition methods, continual learning, robust adaptation to non-stationarity, and causal/structured reasoning to handle multi-factor interactions.
Paper discussion linking observed failure modes to methodological gaps and proposing research directions to address limitations; these are recommendations rather than experimentally validated claims.
speculative null result RetailBench: Evaluating Long-Horizon Autonomous Decision-Mak... suggested research directions to improve robustness (proposed, not empirically v...
Recommended future research includes scalable interoperability solutions, longitudinal lifecycle value validation, human‑centred adoption strategies, and sustainability assessment methods.
Authors' explicit recommendations at the end of the review based on identified gaps in the literature.
speculative null result Digital Twins Across the Asset Lifecycle: Technical, Organis... priority research areas to address current evidence gaps
Researchers should combine qualitative studies with administrative/matched employer–employee data and experimental/quasi-experimental designs (pilot rollouts, staggered adoption) to identify causal effects of AI on tasks, productivity, and wages.
Methodological recommendation by authors based on limitations of their qualitative study (15 UX designers) and the need to quantify observed phenomena; not an empirical claim tested in the paper.
speculative null result The Values of Value in AI Adoption: Rethinking Efficiency in... recommended measurement approaches for causal identification (task allocation, p...
AI economics should prioritize causal identification of who benefits and who loses when AI is introduced into credit and other financial services, and model endogenous platform behavior including competition and regulatory responses.
Research agenda proposed by the authors based on identified gaps in the literature; prescriptive guidance rather than empirically tested claims.
speculative null result Financial Inclusion in the Age of FinTech Platforms: Opportu... research priorities (causal identification, endogenous platform behavior) rather...
Regulatory tools to consider include algorithmic impact assessments, data portability/interoperability mandates, fairness enforcement, sandboxing with post-deployment audits, and macroprudential tools for platform risk.
Policy recommendation derived from literature review and gap analysis; framed as suggested instruments rather than tested interventions.
speculative null result Financial Inclusion in the Age of FinTech Platforms: Opportu... effectiveness of regulatory tools on consumer protection, competition, and syste...
To measure and monitor these effects, researchers should track firm-level adoption of AI features, fulfillment automation intensity, platform-mediated market entry, and task-level labor shifts.
Author recommendations based on gaps identified in the case-based and multi-modal empirical work and the sensitivity of results to adoption measures; not an empirical finding but a methodological claim.
speculative null result Artificial Intelligence–Enabled E-Commerce Systems and Autom... measurement coverage metrics (availability/quality of adoption and task-shift da...
The threshold for taxing AI may be crossed once AI becomes sufficiently capable in substituting humans across cognitive tasks.
Model-based comparative-static/threshold analysis showing that higher AI substitutability for cognitive tasks increases the likelihood that cognitive workers will consider switching to manual jobs, thereby meeting the model's tax-initiation condition.
speculative positive Workers' Incentives and the Optimal Taxation of AI whether/when the model's tax-initiation threshold is crossed as a function of AI...
Developing domain-specific vernacular NLP and speech models (health, agriculture, education) would help replicate pragmatic features (proverbs, registers) that enable epistemic appropriation.
Policy/research recommendation based on qualitative findings that proverbs and registers confer legitimacy and facilitate knowledge transfer; no experimental NLP work reported in study.
speculative positive From Linguistic Hybridity to Development Sovereignty: Pidgin... potential improvement in vernacular AI-assisted advisory effectiveness (proposed...
Local-language (vernacular) inclusion improves economic returns to development interventions by increasing comprehension and adoption, thereby improving program cost-effectiveness.
Logical extrapolation from observed higher comprehension and adoption rates in the field sample (N = 45); no direct economic cost–benefit analysis reported in the study—claim framed as implication for AI economics.
speculative positive From Linguistic Hybridity to Development Sovereignty: Pidgin... implied economic return / cost-effectiveness (inferred from uptake/comprehension...
Building and maintaining an open-access disclosure repository would enable comparability, aggregation, and public appraisal of environmental pressures.
Policy recommendation derived from conceptual analysis; no implemented repository or empirical evaluation reported.
speculative positive A golden opportunity: Corporate sustainability reporting as ... data accessibility, comparability, and ability to aggregate environmental disclo...