The Commonplace

Evidence (4333 claims)

Adoption: 5539 claims
Productivity: 4793 claims
Governance: 4333 claims
Human-AI Collaboration: 3326 claims
Labor Markets: 2657 claims
Innovation: 2510 claims
Org Design: 2469 claims
Skills & Training: 2017 claims
Inequality: 1378 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 402 112 67 480 1076
Governance & Regulation 402 192 122 62 790
Research Productivity 249 98 34 311 697
Organizational Efficiency 395 95 70 40 603
Technology Adoption Rate 321 126 73 39 564
Firm Productivity 306 39 70 12 432
Output Quality 256 66 25 28 375
AI Safety & Ethics 116 177 44 24 363
Market Structure 107 128 85 14 339
Decision Quality 177 76 38 20 315
Fiscal & Macroeconomic 89 58 33 22 209
Employment Level 77 34 80 9 202
Skill Acquisition 92 33 40 9 174
Innovation Output 120 12 23 12 168
Firm Revenue 98 34 22 154
Consumer Welfare 73 31 37 7 148
Task Allocation 84 16 33 7 140
Inequality Measures 25 77 32 5 139
Regulatory Compliance 54 63 13 3 133
Error Rate 44 51 6 101
Task Completion Time 88 5 4 3 100
Training Effectiveness 58 12 12 16 99
Worker Satisfaction 47 32 11 7 97
Wages & Compensation 53 15 20 5 93
Team Performance 47 12 15 7 82
Automation Exposure 24 22 9 6 62
Job Displacement 6 38 13 57
Hiring & Recruitment 41 4 6 3 54
Developer Productivity 34 4 3 1 42
Social Protection 22 10 6 2 40
Creative Output 16 7 5 1 29
Labor Share of Income 12 5 9 26
Skill Obsolescence 3 20 2 25
Worker Turnover 10 12 3 25
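The matrix can be read more easily as shares than as raw counts. The following is a minimal sketch, assuming a handful of rows copied from the table above; the function names are illustrative and not part of the dashboard. Note that the four direction counts do not always add up to the Total column (for Governance & Regulation, 402 + 192 + 122 + 62 = 778 against a reported total of 790), so the shares below are taken against the reported total and need not sum to 1.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
# Only rows with all four direction cells present are used here.
MATRIX = {
    "Governance & Regulation": (402, 192, 122, 62, 790),
    "AI Safety & Ethics": (116, 177, 44, 24, 363),
    "Market Structure": (107, 128, 85, 14, 339),
    "Firm Productivity": (306, 39, 70, 12, 432),
    "Inequality Measures": (25, 77, 32, 5, 139),
}


def direction_shares(row):
    """Return each direction's share of the row's reported total."""
    positive, negative, mixed, null, total = row
    return {
        "positive": positive / total,
        "negative": negative / total,
        "mixed": mixed / total,
        "null": null / total,
    }


def net_negative(matrix):
    """Outcome categories where negative claims outnumber positive ones."""
    return sorted(name for name, row in matrix.items() if row[1] > row[0])


if __name__ == "__main__":
    for name, row in MATRIX.items():
        shares = direction_shares(row)
        print(f"{name}: positive {shares['positive']:.1%}, "
              f"negative {shares['negative']:.1%}")
    print("Net negative:", net_negative(MATRIX))
```

Among the sampled rows, this flags AI Safety & Ethics, Inequality Measures, and Market Structure as categories where negative findings dominate.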
Active filter: Governance
Data privacy, confidentiality, and cross-border data transfer concerns are important barriers to deployment.
Challenges enumerated from case studies and literature; specific organizational concerns cited in cases (Xiaomi, Deloitte) and in regulatory discussion.
medium | negative | Explore the Impact of Generative AI on Finance and Taxation | deployment constraints related to data privacy (e.g., blocked data flows, need f...
Automation and human–robot assemblages can reproduce subjugation and vulnerability affecting care workers and marginalized users, requiring attention to distributional justice and labor-market impacts.
Illustrative vignettes from healthcare robotics and literature synthesis on care ethics and labor impacts; no quantitative labor-market analysis presented.
medium | negative | Examining ethical challenges in human–robot interaction usin... | distributional impacts on wages, bargaining power, welfare, and vulnerability of...
Legal liability regimes and insurance products may systematically under- or mis-assign costs of harm in socio-technical assemblages when primordial ethical demands are considered.
Conceptual argument and suggested modeling directions; no empirical simulation or insurance-market data presented.
medium | negative | Examining ethical challenges in human–robot interaction usin... | accuracy of cost assignment in liability/insurance regimes for socio-technical h...
Treating responsibility as a Levinasian, asymmetrical moral obligation implies it operates as a non-contractible externality that markets and contracts may fail to internalize, creating persistent externalities in AI deployment that standard economic models may miss.
Theoretical implication derived from philosophical argument applied to economic concepts; suggested consequences but no formal models or empirical validation in the paper.
medium | negative | Examining ethical challenges in human–robot interaction usin... | degree to which markets/contracts internalize asymmetrical moral obligations (th...
Simple pluralist or multi-principle balancing approaches risk reproducing structural subordination by failing to foreground the asymmetrical ethical demand toward vulnerable Others.
Normative critique supported by cross-disciplinary literature (care ethics, mediation, STS) and illustrative examples; no empirical test of pluralist approaches’ effects.
medium | negative | Examining ethical challenges in human–robot interaction usin... | tendency of pluralist balancing approaches to reproduce structural subordination...
The Levinasian framework helps reveal how human–robot interactions can both expose and reproduce systemic vulnerabilities, subjugation, and unaddressed harms (termed 'Problem C' — attribution of responsibility and distributed agency).
Theoretical diagnosis supported by interdisciplinary literature synthesis and illustrative vignettes from healthcare robotics, autonomous vehicles, and algorithmic governance. No quantitative prevalence data.
medium | negative | Examining ethical challenges in human–robot interaction usin... | presence/manifestation of systemic vulnerabilities, subjugation, and unaddressed...
Absent interoperability, divergence in data and AI rules will raise transaction costs, reduce trade gains, and create opportunities for regulatory arbitrage.
Economic reasoning and scenario-based projections; asserted as an outcome of mechanism analysis rather than demonstrated with quantitative estimates.
medium | negative | Path Analysis of Digital Economy and Reconstruction of Inter... | transaction costs, aggregate trade gains, incidence of regulatory arbitrage
Explainability, auditability, or data-localization requirements could favor larger vendors with compliance capacity, increasing market concentration and affecting competition among AI suppliers.
Market-structure argument grounded in regulatory-compliance burden analysis and comparative examples; not supported by empirical market data in the study.
medium | negative | ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... | market concentration and competition among AI vendors (supplier market structure...
Legal uncertainty and strict procedural requirements increase compliance costs and regulatory risk, which can slow AI adoption by firms and public agencies.
Theoretical economic implications drawn from legal analysis and comparative observations; no empirical measurement of costs or adoption rates in the study.
medium | negative | ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... | AI adoption rate and investment risk (speed and likelihood of procurement/invest...
AI can restrict or reshape human administrative discretion in legally sensitive ways.
Doctrinal analysis of statutory specificity and formal procedural requirements in civil-law contexts, illustrated with Vietnam as the exemplar case; comparative observations.
medium | negative | ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... | scope of administrative discretion (degree of human decision-making latitude)
Five qualitatively distinct D3 reflexive failure modes were identified in model responses, including categorical self-misidentification and false-positive self-attribution.
Qualitative coding and taxonomy reported in Results: five D3 categories cataloged with examples; identification based on analysis of model responses to narrative dilemmas (sample drawn from the study runs).
medium | negative | Literary Narrative as Moral Probe : A Cross-System Framework... | enumeration and qualitative descriptions of reflexive failure modes observed in ...
A probe composed of deliberately unresolvable moral dilemmas embedded in literary (science-fiction) narrative resists surface performance and exposes a measurable gap between performed and authentic moral reasoning.
Experimental application of the probe to 13 distinct LLM systems across 24 experimental conditions (13 blind, 4 declared re-tests, 7 ceiling-probe runs), with scoring and qualitative coding showing discriminating failure modes and a measurable gap in responses.
medium | negative | Literary Narrative as Moral Probe : A Cross-System Framework... | discriminative power of the probe (ability to expose failures/gaps) operationali...
Existing AI moral-evaluation benchmarks largely measure surface-level, correct-sounding answers rather than genuine moral-reasoning capacity.
Comparative argument based on study results showing a measurable gap when applying the authors' narrative-based probe (unresolvable SF dilemmas) versus standard benchmarks; empirical support comes from experiments across 24 conditions and 13 systems showing systems produce plausible-sounding but reflexive/invalid reasoning on the narrative probe.
medium | negative | Literary Narrative as Moral Probe : A Cross-System Framework... | gap between polished/surface moral answers and deeper/authentic moral-reasoning ...
Capabilities and data advantages for certain vendors could lead to market concentration and platform dominance in AI-driven educational feedback.
Expert concern synthesized from the workshop of 50 scholars about market dynamics; theoretical warning without empirical market-structure analysis in the report.
medium | negative | The Future of Feedback: How Can AI Help Transform Feedback t... | market concentration measures (market share, Herfindahl index); entry barriers; ...
Differential access to high-quality AI feedback systems and bias in training data can exacerbate educational inequalities and harm marginalized groups.
Expert consensus and thematic analysis from the 50-scholar workshop, raising equity and bias risks; no empirical subgroup effectiveness estimates included.
medium | negative | The Future of Feedback: How Can AI Help Transform Feedback t... | access disparities; differential effectiveness by subgroup; measures of algorith...
Learners may over-rely on AI feedback or game systems to obtain desirable responses, reducing effortful learning.
Workshop participant concerns synthesized qualitatively; cited as risk and an open empirical question—no experimental data provided.
medium | negative | The Future of Feedback: How Can AI Help Transform Feedback t... | learner reliance on AI (usage patterns); changes in effortful learning behaviors...
Reliance on single-agent outputs or non-diverse agent ensembles can understate substantive uncertainty and bias conclusions in automated policy evaluation or AI-assisted empirical research.
Observed substantial agent-to-agent variability (NSEs) in the experiment (150 agents) demonstrating that single-agent results do not capture between-agent methodological uncertainty; imbalance between model families further implies potential bias if only one family is used.
medium | negative | Nonstandard Errors in AI Agents | degree to which single-agent point estimates fail to capture between-agent dispe...
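The between-agent dispersion described in this claim can be illustrated with a short calculation. This is a sketch only: it assumes a nonstandard error is summarized as the sample standard deviation of point estimates across agents, which may differ from the paper's exact definition, and every number below is a hypothetical placeholder rather than a result from the study.

```python
import statistics


def nonstandard_error(estimates):
    """Sample standard deviation of point estimates across agents.

    Assumption: this is one common way to summarize between-agent
    dispersion; the source paper's definition is not given here.
    """
    return statistics.stdev(estimates)


def understates_uncertainty(single_agent_se, estimates):
    """True when one agent's reported standard error is smaller than
    the dispersion of estimates across all agents."""
    return single_agent_se < nonstandard_error(estimates)


if __name__ == "__main__":
    # Hypothetical point estimates from five agents given the same task.
    agent_estimates = [0.12, 0.18, 0.05, 0.21, 0.09]
    # Hypothetical standard error reported by a single agent.
    reported_se = 0.02
    print(nonstandard_error(agent_estimates))
    print(understates_uncertainty(reported_se, agent_estimates))
```

The comparison makes the claim concrete: a single agent's confidence interval reflects sampling noise only, while the spread across agents also reflects differing methodological choices.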
The post-exemplar convergence largely reflected imitation of exemplar choices rather than demonstrated understanding or principled correction by agents.
Qualitative and behavioral analysis of agents' post-exposure outputs showing direct adoption of exemplar measures/procedures and lack of substantive justification or mechanistic reasoning indicating comprehension; inference based on content of agent code and writeups after exposure.
medium | negative | Nonstandard Errors in AI Agents | qualitative indicators of reasoning/comprehension in agents' outputs (textual ju...
Chat-like interfaces commonly activate misleading beliefs including overtrust in correctness/robustness, attribution of goals or moral agency, and underestimation of hallucination/bias/privacy risks.
Aggregated observations from literature in HCI and ethics; suggested examples rather than empirical prevalence estimates; no sample size given.
medium | negative | Why We Need to Destroy the Illusion of Speaking to A Human: ... | incidence of overtrust, attribution of agency, and underestimation of model fail...
Natural conversational style creates the impression the system is human-like, intentional, or reliably knowledgeable.
Conceptual claim supported by synthesis of prior work on anthropomorphism and conversational interfaces; no new quantitative data provided.
medium | negative | Why We Need to Destroy the Illusion of Speaking to A Human: ... | user beliefs about system humanness, intentionality, and perceived reliability
Reliance on preference signals risks learning spurious proxies and produces unstable behavior under distribution shift.
Theoretical argument supported by examples of spurious proxies in ML and by observations in RLHF-trained models; the paper cites literature showing proxy behavior but does not present a unified empirical quantification specific to RLHF across many tasks.
medium | negative | Via Negativa for AI Alignment: Why Negative Constraints Are ... | frequency of spurious-proxy-driven failures and degradation in behavior under di...
Positive preference signals are continuous, context-dependent, and entangled with surface correlates (e.g., agreement with the user), which causes models trained on them to pick up spurious proxies and exhibit sycophancy and brittleness.
Conceptual/theoretical argument in the paper describing structural properties of preference spaces, supported by cited observations of sycophantic behavior in models trained with preference-based objectives. No single definitive empirical quantification is provided within the paper; supporting examples are drawn from recent literature.
medium | negative | Via Negativa for AI Alignment: Why Negative Constraints Are ... | incidence of sycophantic behavior and brittleness (e.g., tendency to agree with ...
There is a risk of manipulation and misinformation if argument mining/synthesis is unregulated or misaligned with social incentives, creating externalities that may justify public intervention.
Conceptual risk assessment combining known misinformation dynamics and AI capabilities; no empirical incident data provided.
medium | negative | Argumentative Human-AI Decision-Making: Toward AI Agents Tha... | incidence of manipulation/misinformation attributable to argument-mining/synthes...
Increased error risk and weaker explainability from GLAI will raise malpractice and liability exposure for firms and lawyers, driving up insurance and compliance costs.
Legal-risk analysis and economic reasoning connecting explainability/liability to insurance costs; no empirical cost studies presented.
medium | negative | Why Avoid Generative Legal AI Systems? Hallucination, Overre... | malpractice/liability exposure levels and associated insurance/compliance costs
The combination of hallucination and professional overreliance strains existing regulatory goals (e.g., explainability, human oversight) within European AI governance frameworks.
Legal and regulatory analysis mapping technical and behavioral risks onto European AI governance goals; references to statutory/regulatory texts and policy debates. Qualitative argumentation rather than empirical test.
medium | negative | Why Avoid Generative Legal AI Systems? Hallucination, Overre... | compatibility between GLAI deployment dynamics and regulatory obligations (e.g.,...
Fabricated or opaque intermediate data and reasoning in GLAI weaken explainability, making it difficult to provide meaningful explanations about how outputs were produced.
Conceptual analysis of token-prediction architectures, literature on explainability limits of LLMs, and legal/regulatory analysis referencing explainability requirements. No empirical measurement.
medium | negative | Why Avoid Generative Legal AI Systems? Hallucination, Overre... | quality/meaningfulness of explanations about model outputs (explainability)
Hallucinated content produced by GLAI is often linguistically fluent and persuasive, increasing the risk that legal professionals will accept it without verification.
Literature synthesis on model fluency and behavioral literature on trust in coherent authoritative outputs, plus illustrative vignettes. No original experimental data or sample size.
medium | negative | Why Avoid Generative Legal AI Systems? Hallucination, Overre... | rate of professional acceptance or uncritical reliance on fluent but incorrect o...
This architectural mismatch (token-prediction vs. formal legal reasoning) contributes to confident but factually incorrect outputs (hallucinations) in GLAI.
Technical/conceptual analysis plus synthesis of existing literature on hallucinations in generative models; illustrative examples and vignettes provided. No primary empirical measurement in the paper.
medium | negative | Why Avoid Generative Legal AI Systems? Hallucination, Overre... | incidence and nature of hallucinated (factually incorrect) outputs produced by G...
Observed failure modes during the workflow included hypothesis creep, definition-alignment bugs (mismatch between informal and formal definitions), and agent avoidance behaviors (agents delegating or failing to complete tasks).
Qualitative analysis and post-mortem reported in the paper based on the single project workflow and logs; specific failure modes enumerated by authors from their process observations.
medium | negative | Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau E... | presence and types of failure modes observed in the workflow (hypothesis creep, ...
Absence of governance and observability could increase social costs of accidents and induce conservative regulation that stifles beneficial adoption.
Policy reasoning and historical regulatory responses to systemic risks; conceptual projection without quantitative modeling of regulatory impact.
medium | negative | The Internet of Physical AI Agents: Interoperability, Longev... | social cost of accidents, regulatory restrictiveness, adoption rates
Strong proprietary stacks and incompatible protocols could create winner‑take‑all or oligopolistic market outcomes due to network effects and switching costs.
Market‑structure theory and historical platform examples (e.g., dominant tech platforms); argument is conceptual and not backed by new empirical market analysis in the paper.
medium | negative | The Internet of Physical AI Agents: Interoperability, Longev... | market concentration (e.g., market share distribution), barriers to entry
Without these architectural commitments, the economic costs — stranded assets, safety incidents, reduced innovation, and high coordination costs — will be substantial.
Predictive economic argument built from historical IoT/Internet lessons and systems reasoning; no quantitative cost estimates or econometric analysis in the paper.
medium | negative | The Internet of Physical AI Agents: Interoperability, Longev... | economic costs: stranded assets, safety incident frequency, innovation rates, co...
Poor governance and observability in agent networks would make accountability, certification, and regulation difficult.
Policy and governance reasoning with illustrative domain examples; conceptual argument without empirical governance case studies or metrics.
medium | negative | The Internet of Physical AI Agents: Interoperability, Longev... | ease of accountability/certification/regulation; observability coverage
Weak or brittle security and trust mechanisms across distributed agent ecosystems will pose serious risks.
Lessons drawn from IoT security failures and conceptual threat analysis; no new penetration testing or security metrics presented.
medium | negative | The Internet of Physical AI Agents: Interoperability, Longev... | security/trust robustness of agent ecosystems (vulnerabilities, compromise rates...
Lifecycle mismatch — rapidly evolving AI software embedded in long‑lived physical assets — risks premature ossification or expensive retrofits.
Systems engineering reasoning and historical analogies to embedded systems/IoT lifecycles; no quantitative lifecycle modeling or case study data in the paper.
medium | negative | The Internet of Physical AI Agents: Interoperability, Longev... | frequency/cost of ossification and expensive retrofits; expected upgrade cost
Aligning deployments with frameworks like the EU AI Act will influence cross-border competitiveness and create compliance costs that small operators may struggle to bear, possibly concentrating deployment among larger firms or those using third-party governance services.
Policy-economic analysis drawing on regulatory compliance cost logic and barriers to entry; supported by conceptual examples rather than empirical cross-sectional firm data.
medium | negative | Resilience Meets Autonomy: Governing Embodied AI in Critical... | market concentration and competitiveness effects (number/size distribution of de...
Requiring bounded autonomy and hybrid governance raises upfront costs (designing constraints, verification, auditing) and ongoing operational costs (human oversight, training, compliance), which will affect deployment timing and scale across sectors.
Economic reasoning and descriptive analysis of compliance/operational cost categories; no empirical cost-sample or econometric estimation provided.
medium | negative | Resilience Meets Autonomy: Governing Embodied AI in Critical... | change in deployment costs and timing (capital and operational expenditures, tim...
Purely capability-driven autonomy can exacerbate crises when AI actions interact with novel dynamics or other automated systems.
Analytical reasoning supported by crisis-management literature and illustrative interaction scenarios between automated agents; thought experiments rather than empirical validation.
medium | negative | Resilience Meets Autonomy: Governing Embodied AI in Critical... | change in crisis propagation/severity attributable to autonomous AI decisions (i...
Embodied AI in critical infrastructure is vulnerable to cascading failures and crisis dynamics outside training distributions.
Conceptual synthesis of crisis-dynamics and cascading-failure literature; analytical characterization of limitations in current embodied-AI training paradigms; illustrative thought experiments (no new empirical field data).
medium | negative | Resilience Meets Autonomy: Governing Embodied AI in Critical... | vulnerability to cascading/systemic failures (probability or severity of cascade...
Public‑interest concerns (bias, misuse, systemic risk) may be harder to mitigate via simple transparency rules; policies should emphasize outcome‑based regulations, mandatory behavioral testing, and marketplace disclosure obligations for stressed scenarios.
Policy implication derived from the non‑rule‑encodability thesis; no empirical policy evaluation included.
medium | negative | Why the Valuable Capabilities of LLMs Are Precisely the Unex... | effectiveness of transparency-based vs outcome-based regulatory approaches
Standard contracts and regulatory audits that rely on inspection of rule sets or source code will be insufficient to assess model behavior or risk; regulators and buyers must rely more on behavior‑based testing, standards, and outcome measures.
Policy and regulatory argument derived from the main theorem about non‑rule‑encodability; no empirical regulatory studies presented.
medium | negative | Why the Valuable Capabilities of LLMs Are Precisely the Unex... | effectiveness of rule‑based audits/regulatory inspections for assessing model ri...
Full interpretability via rule extraction may be impossible for the most valuable parts of LLM competence, limiting the utility of some transparency approaches for safety and auditing.
Argumentative consequence of the main theoretical claim and structural mismatch; supported by historical limitations of rule‑based systems; no empirical tests reported.
medium | negative | Why the Valuable Capabilities of LLMs Are Precisely the Unex... | feasibility of fully extracting human‑readable rules from LLMs (interpretability...
There is a structural mismatch between explicit human cognitive tools (rules, checklists) and the pattern‑rich, high‑dimensional competence encoded in LLMs.
Theoretical/structural argument about distributed statistical representations in LLMs versus discrete rules; no experimental quantification provided.
medium | negative | Why the Valuable Capabilities of LLMs Are Precisely the Unex... | alignment/mismatch between human‑readable rules and LLM representations/competen...
Historical expert systems failed to generalize or scale to complex, ambiguous tasks, contrasting with LLMs' broader empirical successes.
Historical case analysis and literature review-style discussion of expert systems versus contemporary LLM performance; no new quantitative historical dataset provided.
medium | negative | Why the Valuable Capabilities of LLMs Are Precisely the Unex... | generalization and scalability of rule‑based expert systems
High governance costs in regulated/high-risk domains can slow adoption of agentic systems, concentrating deployment in less regulated uses or among large firms that can afford governance infrastructure.
Economic reasoning about fixed and marginal governance costs and firm-level adoption decisions; no empirical adoption data presented.
medium | negative | Runtime Governance for AI Agents: Policies on Paths | rate of adoption of agentic systems across firm sizes and regulated domains
Path-dependent behavior increases the complexity of principal–agent contracting and moral hazard between platforms, enterprise customers, and downstream users, requiring richer contract terms (acceptable paths, logging, audit rights).
Economic theory reasoning and applied contract/design implications discussed; no empirical contract-study data.
medium | negative | Runtime Governance for AI Agents: Policies on Paths | complexity of contractual arrangements (number/complexity of contract clauses or...
Path-dependent policies complicate ex post auditing and simple rule-based regulation; regulators may prefer standards requiring runtime evaluation and logging to be enforceable in practice.
Conceptual argument about limits of auditing when important state is ephemeral and about how runtime logging enables ex post review; illustrative policy examples mapping to runtime requirements.
medium | negative | Runtime Governance for AI Agents: Policies on Paths | enforceability of regulation (ease of ex post compliance verification)
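To make the idea of a path-dependent policy with runtime logging concrete, here is a minimal sketch. It is not the paper's method: the class name, the action names, and the specific rule are all hypothetical, chosen only to show why a decision can depend on earlier steps and why a log is needed for later review.

```python
from dataclasses import dataclass, field


@dataclass
class PathMonitor:
    """Evaluates a policy over the whole action path and logs each decision.

    Hypothetical example: the rule and the action names are assumptions.
    """
    path: list = field(default_factory=list)  # actions allowed so far
    log: list = field(default_factory=list)   # (step index, action, allowed)

    def allowed(self, action: str) -> bool:
        # Path-dependent rule: "send_external" is blocked once
        # "read_confidential" appears anywhere earlier in the path.
        blocked = action == "send_external" and "read_confidential" in self.path
        return not blocked

    def step(self, action: str) -> bool:
        decision = self.allowed(action)
        # Record every decision so the path can be reviewed after the fact.
        self.log.append((len(self.path), action, decision))
        if decision:
            self.path.append(action)
        return decision


if __name__ == "__main__":
    monitor = PathMonitor()
    for action in ["send_external", "read_confidential", "send_external"]:
        print(action, monitor.step(action))
    print(monitor.log)
```

The same action is allowed at the first step and refused at the third, which a static rule over single actions could not express; the log preserves the state that would otherwise be lost once the run ends.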
Outdated or inconsistent facts—especially when visual inputs are involved—can reduce user trust, raise liability risks, and increase oversight costs in high-stakes domains.
Argumentative implications in the paper linking empirical findings (outdated/inconsistent outputs) to downstream product risk, trust, and oversight cost concerns; not directly measured empirically.
medium | negative | V-DyKnow: A Dynamic Benchmark for Time-Sensitive Knowledge i... | projected impacts on trust, liability, and oversight costs (qualitative)
Static-training regimes create recurring economic costs: organizations must choose between expensive retraining/continuous fine-tuning and engineering around external retrieval/RAG systems to keep facts current.
Analytic discussion in paper on maintenance costs and trade-offs; economic argumentation rather than primary empirical measurement.
medium | negative | V-DyKnow: A Dynamic Benchmark for Time-Sensitive Knowledge i... | economic maintenance cost trade-offs (qualitative analysis)
Multimodal retrieval-augmented generation (RAG) designs that condition on time-stamped external evidence do not guarantee cross-modal propagation of updated facts.
Experiments implementing multimodal RAG pipelines where models are conditioned on retrieved, time-stamped evidence; evaluation shows that retrieved evidence does not always override outdated internal knowledge across both text and image prompts.
medium | negative | V-DyKnow: A Dynamic Benchmark for Time-Sensitive Knowledge i... | effectiveness of RAG in updating model outputs across modalities