The Commonplace

Evidence (2432 claims)

Adoption (5126 claims)
Productivity (4409 claims)
Governance (4049 claims)
Human-AI Collaboration (2954 claims)
Labor Markets (2432 claims)
Org Design (2273 claims)
Innovation (2215 claims)
Skills & Training (1902 claims)
Inequality (1286 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 369 105 58 432 972
Governance & Regulation 365 171 113 54 713
Research Productivity 229 95 33 294 655
Organizational Efficiency 354 82 58 34 531
Technology Adoption Rate 277 115 63 27 486
Firm Productivity 273 33 68 10 389
AI Safety & Ethics 112 177 43 24 358
Output Quality 228 61 23 25 337
Market Structure 105 118 81 14 323
Decision Quality 154 68 33 17 275
Employment Level 68 32 74 8 184
Fiscal & Macroeconomic 74 52 32 21 183
Skill Acquisition 85 31 38 9 163
Firm Revenue 96 30 22 148
Innovation Output 100 11 20 11 143
Consumer Welfare 66 29 35 7 137
Regulatory Compliance 51 61 13 3 128
Inequality Measures 24 66 31 4 125
Task Allocation 64 6 28 6 104
Error Rate 42 47 6 95
Training Effectiveness 55 12 10 16 93
Worker Satisfaction 42 32 11 6 91
Task Completion Time 71 5 3 1 80
Wages & Compensation 38 13 19 4 74
Team Performance 41 8 15 7 72
Hiring & Recruitment 39 4 6 3 52
Automation Exposure 17 15 9 5 46
Job Displacement 5 28 12 45
Social Protection 18 8 6 1 33
Developer Productivity 25 1 2 1 29
Worker Turnover 10 12 3 25
Creative Output 15 5 3 1 24
Skill Obsolescence 3 18 2 23
Labor Share of Income 7 4 9 20
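
The matrix above is whitespace-separated, and the multi-word outcome names make the column boundaries easy to misread. The following is a minimal sketch, not the site's own tooling, of how a row can be split and turned into direction shares. It assumes that the outcome name is everything before the trailing run of integers and that the last integer is the Total column; the function names are hypothetical. Rows with fewer than four counts before the Total (for example Firm Revenue) had an empty cell dropped, and the sketch refuses to guess which column it was.

```python
import re

# A few rows copied verbatim from the matrix above.
ROWS = [
    "AI Safety & Ethics 112 177 43 24 358",
    "Task Completion Time 71 5 3 1 80",
    "Firm Revenue 96 30 22 148",
]

DIRECTIONS = ("Positive", "Negative", "Mixed", "Null")


def parse_row(line):
    """Split a flattened row into (outcome, counts, total).

    Assumption: the outcome name is everything before the trailing run of
    integers, and the last integer is the Total column.
    """
    match = re.match(r"^(.*?)((?:\s+\d+)+)$", line)
    if match is None:
        raise ValueError(f"no trailing counts in row: {line!r}")
    numbers = [int(tok) for tok in match.group(2).split()]
    return match.group(1).strip(), numbers[:-1], numbers[-1]


def direction_shares(line):
    """Return each direction's share of Total, or None if a cell is missing."""
    _outcome, counts, total = parse_row(line)
    if len(counts) != len(DIRECTIONS):
        return None  # an empty cell was dropped; its column is unknown
    return {name: count / total for name, count in zip(DIRECTIONS, counts)}


for row in ROWS:
    print(parse_row(row)[0], direction_shares(row))
```

The shares need not sum to one: in several rows the Total exceeds the sum of the four listed directions (the Other row lists 964 claims across the four directions against a Total of 972).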
Filter: Labor Markets
Models exhibit inherent deviations from real rulings.
Empirical comparison of LLM outputs to CJOL judgments showing systematic differences (based on the paper's reported comparisons across the dataset).
high negative LLM Safety in Judicial AI: A Stress Test of Social Media Inf... magnitude and frequency of deviations between LLM outputs and actual court judgm...
Rather than broad job losses, evidence points to a reallocation at the entry level: AI automates tasks typically assigned to junior staff, shifting the nature of entry-level roles.
Synthesis of firm- and task-level empirical studies reported in the brief documenting automation of routine/junior tasks and changes in job-task composition; specific sample sizes vary by cited study and are not provided in the brief.
high negative AI, Productivity, and Labor Markets: A Review of the Empiric... automation of entry-level/junior tasks and changes to entry-level job content
Large-scale AI models have significant energy and resource costs, creating a notable environmental footprint that must be addressed.
Narrative integration of prior empirical studies measuring compute, energy consumption, and embodied emissions of large models (cited literature); the review does not present new quantitative measurements itself.
high negative The Evolution and Societal Impact of Artificial Intelligence... energy consumption, carbon emissions, and resource use associated with large-sca...
As AI is deployed in safety-critical domains, reliability, regulation, and human-oriented system design become essential to avoid harms.
Review of literature on safety-critical systems, human–machine interaction studies, and regulatory policy discussions; the paper reports this as a consensus implication rather than presenting new empirical tests.
high negative The Evolution and Societal Impact of Artificial Intelligence... system reliability/safety and risk of harm in safety-critical deployments
The current literature is skewed toward descriptive and engineering work; there is a lack of causal, field‑experimental evidence on NLP interventions' effects on customer behavior and firm profits.
Review coding of study types in the sample (engineering/descriptive vs. experimental/causal) showing few field experiments or causal designs.
high negative Natural language processing in bank marketing: a systematic ... presence vs. absence of causal/experimental studies measuring effects on custome...
Important gaps include customer acquisition, personalization at scale, use of external text sources (social media, news, reviews), operational process improvement, and cross‑channel integration.
Gap detection via low‑density regions in the UMAP thematic map of sentence‑transformer embeddings and manual review showing low article counts for these topics within the 109‑article sample.
high negative Natural language processing in bank marketing: a systematic ... topical coverage by customer journey stage and source type (acquisition, persona...
Existing literature on NLP in marketing is concentrated around customer retention tasks (e.g., churn prediction, complaint handling, relationship management).
Thematic clustering from sentence‑transformer embeddings of article text combined with UMAP visualization, and manual review of article topics and keywords identifying frequent retention‑related themes.
high negative Natural language processing in bank marketing: a systematic ... topical frequency/coverage by customer journey stage (retention)
NLP applications in bank marketing are severely under‑studied.
Descriptive result from the PRISMA review showing only 8/109 articles focused on NLP in bank marketing (≈7%), plus thematic mapping showing sparse coverage in bank‑marketing/NLP intersection.
high negative Natural language processing in bank marketing: a systematic ... proportion and absolute count of studies at the intersection of NLP and bank mar...
Vietnam's civil-law features—statutory specificity, formal procedures, and constitutional principles like legal certainty and fairness—make straightforward AI deployment legally fraught.
Close textual analysis of Vietnam's statutes, constitutional provisions, and administrative procedures (doctrinal legal analysis); no quantitative sample.
high negative ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... legal compatibility of AI deployment (degree of legal obstacles to deployment)
Automated decisions complicate assigning responsibility and hinder judicial and administrative reviewability.
Doctrinal examination of accountability and review mechanisms in administrative law plus comparative institutional analysis of automated decision-making governance.
high negative ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... clarity of accountability (ability to assign responsibility) and effectiveness o...
Opaque AI models risk violating notice, reason-giving, and appeal rights protected under administrative due process.
Analysis of procedural due-process requirements (notice, reason-giving, appeal) in Vietnam's legal framework and assessment of opacity issues in algorithmic systems; qualitative reasoning, no empirical testing.
high negative ARTIFICIAL INTELLIGENCE AND ADMINISTRATIVE GOVERNANCE: A CRI... compliance with due-process requirements (notice, reasons, appealability)
Provider incentives may be misaligned (e.g., optimizing for engagement or test performance instead of durable learning), requiring contracts, regulation, or purchaser design to align incentives.
Consensus from an interdisciplinary workshop (50 scholars) highlighting incentive risks and market-design considerations; descriptive, not empirical.
high negative The Future of Feedback: How Can AI Help Transform Feedback t... provider optimization metrics (engagement/test performance) vs. durable learning...
Extensive learner data needed to personalize AI feedback raises privacy and data-governance concerns (consent, storage, usage).
Qualitative consensus from workshop participants (50 scholars) noting data-collection requirements and governance risks; no empirical governance studies included.
high negative The Future of Feedback: How Can AI Help Transform Feedback t... volume/type of learner data collected; privacy risk indicators; compliance with ...
Automated feedback may not capture pedagogical nuances expert teachers use (motivation, socio-emotional cues, complex reasoning), limiting pedagogical fit.
Expert syntheses from the workshop of 50 scholars highlighting limits of automation relative to expert teacher judgment; no empirical comparisons presented.
high negative The Future of Feedback: How Can AI Help Transform Feedback t... coverage of socio-emotional and complex-reasoning cues in feedback; corresponden...
AI-generated feedback can be incorrect, misleading, or misaligned with learning objectives; assessing feedback quality is nontrivial.
Repeated concern raised across workshop participants (50 scholars) in qualitative synthesis; noted as a substantive risk and open challenge rather than empirically quantified here.
high negative The Future of Feedback: How Can AI Help Transform Feedback t... feedback factual correctness; alignment with stated learning objectives; rate of...
Generalization across domains and long-term robustness to adversarial adaptation require further validation.
The authors explicitly note the need for further validation; the experiments reported in the provided summary do not show broad domain coverage, longitudinal tests, or adversarial evolution studies.
high negative CoMAI: A Collaborative Multi-Agent Framework for Robust and ... generalization across domains; long-term robustness to adaptive adversaries
A modular system may increase engineering complexity and compute overhead compared to a single LLM endpoint.
Authors' caveat in the paper noting higher engineering and compute costs as a trade-off for modularity; the summary does not provide quantitative cost or latency measurements.
high negative CoMAI: A Collaborative Multi-Agent Framework for Robust and ... engineering complexity and compute/resource overhead
Quality of CoMAI depends on rubric design and on how the finite-state machine and agent prompts are specified.
Authors' noted limitation/caveat in the paper that system performance hinges on rubric and prompt/FSM design choices; this is a qualitative dependency rather than an empirically quantified effect in the summary.
high negative CoMAI: A Collaborative Multi-Agent Framework for Robust and ... assessment quality as a function of rubric/FSM/agent prompt design
Using C.A.P. entails trade-offs: potential increases in latency and compute cost and a risk of over-correction (unnecessary clarification).
Paper explicitly notes these trade-offs as part of the design discussion and proposes measuring latency, compute cost, and unnecessary clarification rate in evaluations; this is an acknowledged design risk rather than an empirically quantified result.
high negative A Context Alignment Pre-processor for Enhancing the Coherenc... response latency, compute cost per session, rate of unnecessary clarifications
Integration costs—domain modeling, human-in-the-loop protocols, and regulatory/liability frameworks—are significant barriers to deployment.
Conceptual assessment of operational and regulatory requirements; no quantified cost studies provided.
high negative Argumentative Human-AI Decision-Making: Toward AI Agents Tha... implementation cost and organizational burden for deploying argumentative AI sys...
AFs and LLMs may be gamed or misled; adversaries may exploit systems leading to strategic argumentation or manipulation.
Conceptual security/adversarial concern based on known vulnerabilities in ML and strategic behavior; no adversarial tests reported.
high negative Argumentative Human-AI Decision-Making: Toward AI Agents Tha... system vulnerability metrics / susceptibility to adversarial manipulation
Faithful extraction—aligning LLM-extracted arguments with formal AF primitives and ensuring fidelity to source evidence—is a key technical challenge.
Paper's explicit identification of failure modes and alignment issues; grounded in documented limitations of IE/LLMs (no empirical quantification here).
high negative Argumentative Human-AI Decision-Making: Toward AI Agents Tha... fidelity/alignment error rate between extracted elements and source evidence
Computational argumentation approaches have required heavy feature engineering and domain-specific knowledge to be effective.
Conceptual claim grounded in prior work and practical experience reported in the literature; no quantitative cost estimates provided in the paper.
high negative Argumentative Human-AI Decision-Making: Toward AI Agents Tha... engineering cost / domain modeling effort required for AF-based systems
Automation bias (human tendency to defer to automated outputs) compounds the risk that GLAI errors become embedded in legal processes.
Behavioral literature review on automation bias and trust in AI systems; applied to legal-context vignettes. No primary empirical test within the paper.
high negative Why Avoid Generative Legal AI Systems? Hallucination, Overre... likelihood of human operators deferring to GLAI outputs (automation bias effect)
Current models heavily rely on large static datasets and batch training and exhibit poor lifelong/continual learning.
Synthesis of common practices in contemporary ML (supervised pretraining and offline training paradigms); no new experiments provided.
high negative Why AI systems don't learn and what to do about it: Lessons ... continual learning performance; dependence on dataset size and batch training
Proactive AI at national scale amplifies concerns around transparency, accountability, privacy, and potential misuse, necessitating robust regulatory and ethical frameworks.
Normative and ethical analysis in the paper, supported by general literature on large-scale AI governance; no empirical assessment of regulatory effectiveness in Russia included.
high negative DIGITAL TRANSFORMATION OF THE RUSSIAN FEDERATION’S SOCIOECON... risks to transparency, accountability, privacy and potential for misuse
Aggregating informal and recommendation data raises privacy and consent issues in low-regulation contexts, requiring governance safeguards.
Policy and ethical consideration based on the nature of the data used; no specific privacy-impact assessment reported in the summary.
high negative AI-Driven Skill Mapping and Gig Economy Matching Algorithm f... privacy risk / consent compliance
NLP/ML systems can inherit biases from inputs (underrepresentation, noisy self-reports, biased recommendations) and may therefore disadvantage some youth unless transparency and fairness constraints are implemented.
Reasoned risk assessment grounded in known properties of ML/NLP; the pilot summary does not report an audit or measured bias outcomes.
high negative AI-Driven Skill Mapping and Gig Economy Matching Algorithm f... bias in match outcomes / differential access by demographic group
There are limited randomized controlled trials or longitudinal evaluations; few studies measure patient-relevant outcomes or economic impacts.
Literature synthesis noting scarcity of RCTs and long-term observational studies, and absence of widespread patient-outcome and cost-effectiveness evaluations in existing publications.
high negative Human-AI interaction and collaboration in radiology: from co... number of RCTs/longitudinal studies, frequency of patient outcome and economic o...
Many published studies focus on standalone algorithm accuracy rather than clinician–AI joint performance in routine workflows.
Review of the literature categorizing study designs (preponderance of algorithm development/validation studies, fewer reader-in-the-loop, simulation, or deployment studies).
high negative Human-AI interaction and collaboration in radiology: from co... proportion of studies reporting standalone algorithm metrics versus those report...
Advanced technologies' complexity and lack of explainability create risks for audit reliability and professional judgement.
Findings from literature synthesis and professional/regulatory perspectives included in the review; presented as an identified risk/challenge rather than quantified effect.
high negative Audit 5.0 and the Digital Transformation of Auditing: The Ro... audit reliability and the exercise of professional judgement in presence of opaq...
Audit 5.0 introduces key challenges: data quality and integration issues, complexity and explainability of advanced technologies, regulatory and ethical uncertainty, and skills shortages combined with cultural resistance.
Systematic literature review and synthesis of professional standards and regulatory perspectives; assertions based on reviewed literature rather than a single empirical dataset.
high negative Audit 5.0 and the Digital Transformation of Auditing: The Ro... barriers to adoption/readiness factors (data quality, explainability, regulatory...
Gaps in infrastructure readiness, digital awareness, and inclusive policy frameworks hinder equitable AI adoption among micro‑enterprises.
Cross‑study synthesis of barriers identified across the 55 included articles; infrastructural, awareness, and policy barriers are explicitly reported as recurring themes.
high negative Role of AI in Enhancing Work Efficiency and Opportunities fo... barriers to AI adoption (infrastructure readiness, digital awareness, policy inc...
Only 24.4% of at-risk workers have viable transition pathways, where 'viable' is defined as sharing at least 3 skills and achieving at least 50% skill transfer.
Analysis of job-to-job transitions on the validated knowledge graph using an operational definition of viable pathways (≥3 shared skills and ≥50% skill transfer); the proportion of at-risk workers meeting that criterion is reported as 24.4% (underlying at-risk worker count not given in the excerpt).
high negative Graph-Based Analysis of AI-Driven Labor Market Transitions: ... percentage of at-risk workers with viable transition pathways (per defined thres...
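
The viability thresholds in the entry above can be expressed as a simple predicate. This is a sketch under a stated assumption: the excerpt does not say how "skill transfer" is computed, so the code treats it as the fraction of the target job's skills that the source job already covers. The function name, parameters, and skill sets are hypothetical and are not taken from the paper.

```python
def is_viable_pathway(source_skills, target_skills, min_shared=3, min_transfer=0.5):
    """Apply the stated thresholds: >= 3 shared skills and >= 50% skill transfer.

    ASSUMPTION: "skill transfer" is the share of the target job's skills that
    the source job already has. The excerpt does not give the paper's formula.
    """
    source, target = set(source_skills), set(target_skills)
    if not target:
        return False
    shared = source & target
    transfer = len(shared) / len(target)
    return len(shared) >= min_shared and transfer >= min_transfer


# Hypothetical skill sets, for illustration only.
clerk = {"data entry", "spreadsheets", "scheduling", "filing"}
analyst = {"spreadsheets", "data entry", "scheduling", "sql", "reporting"}
nurse = {"patient care", "scheduling", "triage"}

print(is_viable_pathway(clerk, analyst))  # 3 shared skills, 3/5 transfer
print(is_viable_pathway(clerk, nurse))    # only 1 shared skill
```

Both conditions must hold: three shared skills are not enough when the target job requires many more skills than the worker already has.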
20.9% of jobs in the dataset face high automation risk.
Risk classification applied to the jobs represented in the knowledge graph (sample size: 9,978 job postings); proportion of jobs labeled as 'high automation risk' is reported as 20.9%.
high negative Graph-Based Analysis of AI-Driven Labor Market Transitions: ... proportion of jobs classified as high automation risk
Japan's population is shrinking, the share of working-age people is falling, and the number of elderly is growing fast.
Statement grounded in official national statistics referenced by the paper (demographic time series used to initialize and calibrate the system dynamics model).
high negative Fiscal Dynamics in Japan under Demographic Pressure total population size; share (%) of working-age population; number and share (%)...
A preregistered, nationally representative replication experiment in the United States (N = 1,200) replicates the causal finding that a labor-replacing (vs. labor-creating) AI frame reduces willingness to politically engage with future AI developments.
Preregistered randomized experiment (nationally representative US sample, N = 1,200) replicating the UK manipulation and measuring willingness to engage politically regarding AI.
high negative Perceiving AI as labor-replacing reduces democratic legitima... willingness to politically engage with future AI developments (self-reported)
A preregistered, nationally representative experiment in the United Kingdom (N = 1,202) shows that exposure to a labor-replacing (vs. labor-creating) AI frame causally reduces trust in democracy.
Preregistered randomized experiment (nationally representative UK sample, N = 1,202) manipulating AI framing (labor-replacing vs. labor-creating) and measuring trust/satisfaction with democratic institutions.
high negative Perceiving AI as labor-replacing reduces democratic legitima... trust in democracy / satisfaction with democratic institutions (post-manipulatio...
Large-scale survey data indicate that the public tends to view AI as labor-replacing rather than labor-creating.
Cross-sectional survey (N = 37,079 respondents across 38 European countries); descriptive analysis of responses about AI's labor market impact.
high negative Perceiving AI as labor-replacing reduces democratic legitima... public perception of AI's labor-market impact (labor-replacing vs. labor-creatin...
Only 12% of gig workers participate in retirement savings programs.
Survey and administrative measures of retirement-savings participation among gig workers in the 24-country sample.
high negative The Gig Economy and Labor Market Restructuring: Platform Wor... proportion of gig workers participating in retirement savings programs (%)
Only 23% of gig workers report access to employer-provided health insurance.
Self-reported benefits coverage from labor force surveys and linked administrative records for gig workers across the 24 OECD countries (2015–2025).
high negative The Gig Economy and Labor Market Restructuring: Platform Wor... proportion of gig workers reporting access to employer-provided health insurance...
Human judgment is constrained by bounded rationality, cognitive biases, and information-processing limitations.
Cited as established findings from prior research across decision sciences and related fields (extensive literature evidence referenced; no new empirical data in this paper's abstract).
high negative Reframing Organizational Decision-Making in the Age of Artif... human judgment accuracy/quality and cognitive processing capacity
Ireland exhibits the largest gender gap in advanced digital task use: approximately 44% of men versus 18% of women perform advanced digital tasks — a 26 percentage point gap, close to double the European average.
Country-level descriptive statistics from ESJS for Ireland reporting shares of men and women performing advanced digital tasks. (Exact Irish sample size not provided in the excerpt.)
high negative Squandered skills? Bridging the digital gender skills gap fo... Share (%) of men and women in Ireland performing advanced digital tasks; gender ...
Across Europe, women are around 15 percentage points less likely than men to perform advanced digital tasks in their jobs.
Empirical analysis of the European Skills and Jobs Survey (ESJS) (Cedefop, 2021) using regression-based estimates and descriptive statistics across European countries. (Exact sample size and country count not provided in the excerpt.)
high negative Squandered skills? Bridging the digital gender skills gap fo... Probability / share of workers performing advanced digital tasks (binary indicat...
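
The two entries above compare gaps in percentage points, which is a difference of shares rather than a ratio of shares. A short sketch of the arithmetic, using the approximate figures as reported (44%, 18%, and a Europe-wide gap of about 15 points); the variable and function names are illustrative.

```python
def percentage_point_gap(share_a, share_b):
    """Absolute gap between two shares that are both expressed in percent."""
    return share_a - share_b


ireland_gap = percentage_point_gap(44, 18)  # men vs. women, Ireland
europe_gap = 15                             # approximate Europe-wide gap
ratio = ireland_gap / europe_gap

print(ireland_gap, round(ratio, 2))
```

With these rounded inputs the Irish gap is 26 points, roughly 1.7 times the European figure.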
AI substitutes many routine tasks, including both manual and cognitive/rule-based activities, disproportionately affecting middle-skill occupations.
Task-based substitution reasoning within SBTC framework and cross-sectoral task analysis. The paper provides conceptual synthesis rather than presenting new microdata or quantified task-level estimates.
high negative Artificial Intelligence, Automation, and Employment Dynamics... employment and wages in routine / middle-skill occupations; task displacement
Key implementation challenges include data quality and integration, model interpretability, cybersecurity and privacy, regulatory/compliance uncertainty, skills gaps among accounting professionals, and implementation costs.
Identified by the paper through literature review and practitioner reports; these are presented as recurring barriers rather than quantified with a specific sample.
high negative Role of Artificial Intelligence in the Accounting Sector incidence/severity of implementation barriers (data quality scores, integration ...
Of the two regimes that emerge, the inequality-decreasing one arises when AI behaves like a broadly available commodity technology or when labor-market institutions share rents widely (high ξ).
Model regime characterization and calibrated counterfactuals showing falling wage dispersion and ΔGini under commodity-like AI assumptions or higher rent-sharing elasticity.
high negative When AI Levels the Playing Field: Skill Homogenization, Asse... wage dispersion and aggregate inequality (ΔGini)
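
For readers unfamiliar with the ΔGini outcome named above, the sketch below computes a Gini coefficient with the standard rank-weighted formula and takes the difference between two wage distributions. The wage vectors are hypothetical and are not the paper's calibration; the paper's own simulation method is not reproduced here.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with ranks i = 1..n over the values sorted in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical wage vectors with the same total, for illustration only.
wages_before = [20, 30, 50, 100]
wages_after = [30, 40, 50, 80]

delta_gini = gini(wages_after) - gini(wages_before)
print(round(gini(wages_before), 3), round(gini(wages_after), 3), round(delta_gini, 3))
```

A negative ΔGini, as in this toy example, corresponds to the falling wage dispersion described for the inequality-decreasing regime.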
Generative AI compresses within-task skill differences (reduces dispersion of individual task performance).
Theoretical task-based model and calibrated quantitative simulations (Method of Simulated Moments matching six empirical moments) showing reductions in within-task performance dispersion after introducing AI technology.
high negative When AI Levels the Playing Field: Skill Homogenization, Asse... within-task performance dispersion (skill/ability variance within a task)
Because the design is cross-sectional and the sampling is purposive and geographically constrained, causal inference and generalizability are limited.
Authors' stated limitations in the summary: cross-sectional design and purposive, geographically constrained sample (Karnataka, India).
high negative AI-driven stress management and performance optimization: A ... generalizability / causal inference (methodological limitation)
Workplace stress is associated with lower employee retention.
PLS-SEM analysis on a cross-sectional survey of N = 350 pharmaceutical workers in Karnataka, India (purposive sampling). Reported direct path: Stress → Retention, β = 0.321, p < 0.001. (Note: the paper interprets this as stress reducing retention; sign/coding conventions of the variables are not detailed in the summary.)
high negative AI-driven stress management and performance optimization: A ... employee retention (retention intent/behavior)