The Commonplace

Evidence (8653 claims)

Adoption (5884 claims)
Productivity (5127 claims)
Governance (4607 claims)
Human-AI Collaboration (3677 claims)
Labor Markets (2768 claims)
Innovation (2737 claims)
Org Design (2708 claims)
Skills & Training (2132 claims)
Inequality (1429 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 452 119 70 526 1183
Governance & Regulation 463 217 126 68 891
Research Productivity 277 103 36 304 726
Organizational Efficiency 451 107 78 43 683
Technology Adoption Rate 350 132 77 51 615
Firm Productivity 325 39 75 13 457
Output Quality 275 78 28 30 411
AI Safety & Ethics 125 191 47 27 392
Market Structure 119 134 89 14 361
Decision Quality 184 82 44 21 335
Fiscal & Macroeconomic 98 58 34 22 219
Employment Level 79 37 81 9 208
Skill Acquisition 105 37 42 9 193
Innovation Output 131 12 31 14 189
Firm Revenue 103 38 24 165
Task Allocation 97 18 37 9 163
Consumer Welfare 77 38 37 7 159
Inequality Measures 29 81 33 6 149
Regulatory Compliance 54 61 13 3 131
Task Completion Time 92 8 4 4 108
Worker Satisfaction 49 36 14 8 107
Error Rate 45 55 6 106
Training Effectiveness 60 13 12 16 102
Wages & Compensation 56 16 20 5 97
Team Performance 51 13 15 8 88
Automation Exposure 29 29 12 7 80
Job Displacement 7 46 13 66
Hiring & Recruitment 42 4 7 3 56
Developer Productivity 39 5 4 3 51
Social Protection 22 12 7 2 43
Creative Output 17 8 6 1 32
Skill Obsolescence 3 26 2 31
Labor Share of Income 12 8 10 30
Worker Turnover 10 12 3 25
Industry 1 1
Experiments run with multiple LLM backends (proprietary and open-source) show qualitatively consistent dynamics, indicating that the framework is robust to model choice.
Cross-backend comparisons and robustness checks are reported in the paper; several LLMs were used, though the exact models and counts are not specified in the summary.
medium | positive | An LLM-Driven Multi-Agent Simulation Framework for Coupled E... | qualitative consistency of macro dynamics (e.g., similarity in infection/economi...
Behavioral changes in the simulation emerge endogenously from cognitive reasoning rather than from parameterized switches, producing context-sensitive, heterogeneous responses.
Description of agent heterogeneity (differences in perceptions, priorities, and local conditions) and use of CoT reasoning per agent; reported emergent, diverse responses in experiments. (Degree of heterogeneity and quantitative heterogeneity metrics not provided in summary.)
medium | positive | An LLM-Driven Multi-Agent Simulation Framework for Coupled E... | heterogeneity in individual behaviors (context-sensitive changes in contacts, wo...
LLM-driven agents embedded in a Perception–Deliberation–Action (PDA) loop produce endogenous, human-like adaptive behaviors via Chain-of-Thought reasoning.
Multi-agent simulation where each agent is implemented as an LLM-driven cognitive unit running the PDA loop each timestep; agents use Chain-of-Thought (CoT) prompts/internal reasoning to make decisions. (Exact simulation sample size / population not specified in summary.)
medium | positive | An LLM-Driven Multi-Agent Simulation Framework for Coupled E... | agent-level behavioral adaptation patterns / “human-likeness” of decisions (e....
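As a rough illustration of the Perception–Deliberation–Action loop described in this entry, the sketch below steps one agent through a few timesteps. The `fake_llm` function is a hypothetical stand-in for a real LLM call, and the agent fields, prompt wording, and infection threshold are assumptions made for illustration, not the paper's implementation.

```python
from dataclasses import dataclass, field


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: returns reasoning plus a decision line."""
    # A real system would send `prompt` to a model; this fixed rule only mimics a reply
    # and, unlike a real model, ignores the persona text in the prompt.
    risky = "infection_rate=high" in prompt
    reasoning = "Cases are high, so I should limit contacts." if risky else "Risk looks low."
    decision = "stay_home" if risky else "go_to_work"
    return f"{reasoning}\nDECISION: {decision}"


@dataclass
class Agent:
    name: str
    risk_tolerance: str  # assumed trait, used only to vary the prompt
    history: list = field(default_factory=list)

    def perceive(self, env: dict) -> str:
        # Perception: compress the environment into a short textual observation.
        level = "high" if env["infection_rate"] > 0.1 else "low"
        return f"infection_rate={level}"

    def deliberate(self, observation: str) -> str:
        # Deliberation: ask the (stubbed) LLM to reason step by step, then parse the decision.
        prompt = (
            f"You are {self.name} with {self.risk_tolerance} risk tolerance.\n"
            f"Observation: {observation}\n"
            "Think step by step, then end with 'DECISION: <action>'."
        )
        reply = fake_llm(prompt)
        return reply.rsplit("DECISION:", 1)[-1].strip()

    def act(self, action: str) -> str:
        # Action: record the chosen behaviour; a simulator would apply it to the world.
        self.history.append(action)
        return action


def step(agent: Agent, env: dict) -> str:
    """One timestep of the loop: perceive, deliberate, act."""
    return agent.act(agent.deliberate(agent.perceive(env)))


agent = Agent("worker_1", "low")
actions = [step(agent, {"infection_rate": r}) for r in (0.02, 0.15, 0.05)]
print(actions)
```

The behaviour change at the second timestep comes from the deliberation step rather than from a hard-coded switch in the simulator, which is the property the claim emphasises.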
Task‑based, dynamic exposure measures and real‑time data enable earlier detection of displacement risks and reallocation needs than static, occupation‑level extrapolations.
Conceptual argument and proposed architecture; no empirical timing comparison or lead-time statistics provided.
medium | positive | Enhancing BLS Methodologies for Projecting AI's Impact on Em... | detection lead time for displacement risks; timeliness of signals indicating rea...
LLMs can be used to score task automation/augmentation plausibility and to detect emergent tasks.
Methodological proposal describing use of LLMs for semantic mapping/scoring of tasks; no empirical validation or accuracy metrics for LLM task scoring provided in the paper.
medium | positive | Enhancing BLS Methodologies for Projecting AI's Impact on Em... | task-level automation/augmentation plausibility scores; detection of emergent ta...
Modeling nonlinearity (threshold adoption, network spillovers, complementarities) and path dependence in adoption dynamics is necessary rather than relying on linear extrapolation.
Theoretical argument and model suggestions (S‑curve diffusion, agent-based models) in the paper; no empirical comparison demonstrating superior performance provided.
medium | positive | Enhancing BLS Methodologies for Projecting AI's Impact on Em... | accuracy of adoption dynamics forecasts; capture of threshold and spillover effe...
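A small numerical sketch can show why linear extrapolation may mislead when adoption follows an S-curve. The logistic form and the parameter values (`ceiling`, `rate`, `midpoint`) are illustrative assumptions, not figures from the paper.

```python
import math


def logistic_adoption(t, ceiling=1.0, rate=0.8, midpoint=5.0):
    """S-curve adoption share at time t (all parameters are illustrative)."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def linear_extrapolation(t, t0, t1):
    """Extend the straight line through the S-curve at t0 and t1 out to time t."""
    slope = (logistic_adoption(t1) - logistic_adoption(t0)) / (t1 - t0)
    return logistic_adoption(t1) + slope * (t - t1)


# Compare the two forecasts when only the early periods (t = 0 and t = 2) are observed.
for t in (2, 5, 8, 12, 40):
    print(t, round(logistic_adoption(t), 3), round(linear_extrapolation(t, 0, 2), 3))
```

Under these assumptions the linear forecast understates adoption around the midpoint and eventually exceeds the saturation ceiling, two failure modes that a threshold-aware model avoids.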
Applying causal inference methods (difference‑in‑differences, synthetic controls, instrumental variables, structural counterfactuals) can distinguish automation (task substitution) from augmentation (productivity/role change) and estimate net employment effects.
Methodological recommendation with examples of applicable identification strategies; no specific empirical applications or results reported in the paper.
medium | positive | Enhancing BLS Methodologies for Projecting AI's Impact on Em... | causal estimates separating substitution vs augmentation effects; net employment...
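To make the difference-in-differences idea concrete, here is a minimal two-group, two-period sketch. The employment index values are invented for illustration only; the paper reports no empirical application.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Classic 2x2 difference-in-differences: change in treated minus change in control."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical employment indices for AI-exposed and comparison occupations.
exposed_pre = [100.0, 102.0, 98.0]
exposed_post = [94.0, 96.0, 95.0]
comparison_pre = [100.0, 101.0, 99.0]
comparison_post = [98.0, 99.0, 97.0]

effect = did_estimate(exposed_pre, exposed_post, comparison_pre, comparison_post)
print(effect)  # -3.0
```

The estimate is only interpretable as a causal effect if the two groups would have followed parallel trends without the exposure, which is the key identifying assumption of this design.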
Integrating multiple data streams (CPS, LEHD/LODES, UI wage records, administrative microdata, job ads, occupational manuals, enterprise adoption surveys) yields richer gross‑flows and skills measurement than using single data sources.
Proposed data-integration strategy and references to candidate datasets; no empirical demonstration or quantified improvement in measurement presented.
medium | positive | Enhancing BLS Methodologies for Projecting AI's Impact on Em... | quality of gross‑flows estimates (transition rates, spell durations), comprehens...
A dynamic Occupational AI Exposure Score (OAIES) can quantify exposure at the task level using LLMs, job‑task matrices (e.g., O*NET), and real‑time job ad / workplace data to capture evolving capability of AI systems.
Methodological description of OAIES construction (mapping tasks to occupations, LLM scoring, weighting by time use/criticality); no empirical implementation or validation data presented in the paper.
medium | positive | Enhancing BLS Methodologies for Projecting AI's Impact on Em... | OAIES scores (task- and occupation-level exposure measures) with uncertainty int...
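The aggregation step described in this entry (task-level scores weighted by time use and criticality) could be sketched as follows. The weighting formula, the `Task` fields, and the example scores are assumptions for illustration; the paper gives no implementation, and the exposure values here stand in for LLM-produced scores.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    exposure: float     # hypothetical LLM-assigned exposure score in [0, 1]
    time_share: float   # share of working time spent on the task
    criticality: float  # importance of the task to the occupation, in [0, 1]


def occupation_exposure(tasks):
    """Weighted mean of task exposure, with weight = time_share * criticality (assumed)."""
    weights = [t.time_share * t.criticality for t in tasks]
    total = sum(weights)
    if total == 0:
        raise ValueError("tasks carry no weight")
    return sum(w * t.exposure for w, t in zip(weights, tasks)) / total


# Illustrative occupation with three tasks; all numbers are made up.
tasks = [
    Task("draft reports", exposure=0.8, time_share=0.5, criticality=1.0),
    Task("client meetings", exposure=0.2, time_share=0.3, criticality=1.0),
    Task("site inspection", exposure=0.1, time_share=0.2, criticality=0.5),
]
score = occupation_exposure(tasks)
print(round(score, 3))
```

Re-running the aggregation whenever task scores or time-use weights are refreshed is what would make such an index dynamic rather than a one-off estimate.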
Measurement and forecasting should move away from occupation-level forecasts toward task-level, continuously updated indicators linked to real-world adoption measures (firm purchases, API usage, procurement).
Recommendation in the paper motivated by rapid changes in AI capabilities and limitations of static indices; evidence basis is methodological argument and examples of richer adoption measures rather than a quantified evaluation of forecast improvements.
medium | positive | Recent Methodologies on AI and Labour - a Desk Review | forecast accuracy and timeliness of AI exposure indicators
Policy should prioritise flexible reskilling and retraining programs targeted at high-risk tasks and low-skilled workers, informed by task-level exposure maps.
Policy implication recommended by the paper drawing on distributional findings (higher displacement risk for low-skilled tasks) and the availability of task-level exposure indices; evidence basis combines empirical pattern synthesis and normative recommendation rather than an RCT or program evaluation.
medium | positive | Recent Methodologies on AI and Labour - a Desk Review | effectiveness of reskilling/training programs in mitigating displacement and imp...
Think tanks and international organisations are emphasising scenario planning under differing initial adoption conditions to inform reskilling and labour-market policy.
References to policy and scenario work by organisations named in the paper (TBI, IPPR, IMF; TBI 2024, IPPR 2024, Korinek 2023); evidence basis is published scenario reports and policy papers rather than experimental data.
medium | positive | Recent Methodologies on AI and Labour - a Desk Review | policy scenario outputs (projected employment/wage/productivity under alternativ...
Practical measures (task selection, oversight, verification, governance) enable responsible deployment of GenAI that balances firm-level goals with individual consultants' skill development.
Recommendations synthesized from interviews with practitioners and the TGAIF framework; presented as practice guidance rather than experimentally tested interventions.
medium | positive | Where Automation Meets Augmentation: Balancing the Double-Ed... | responsible deployment indicators (compliance with oversight procedures, balance...
The Task–GenAI Fit (TGAIF) framework maps task characteristics to GenAI capabilities to guide decisions about when and how to use GenAI effectively in consulting processes.
Framework inductively derived from interview data in the study; authors present mapping logic based on task features and reported GenAI capabilities. Evidence is conceptual and qualitative rather than empirically validated.
medium | positive | Where Automation Meets Augmentation: Balancing the Double-Ed... | appropriateness of GenAI role for specific consulting tasks (decision guidance)
Generative AI offers efficiency and scaling opportunities in consulting.
Reported repeatedly in practitioner interviews summarized by the authors; qualitative impressions rather than measured productivity gains. No sample size or effect size is reported.
medium | positive | Where Automation Meets Augmentation: Balancing the Double-Ed... | operational efficiency (e.g., time-to-complete tasks, ability to scale deliverab...
A closed interaction loop—MLLM ingesting multimodal inputs (visual, machine feedback, user actions) and outputting structured commands and AR overlays—reduces user cognitive load during machine operation.
System architecture described in the paper plus empirical finding of reduced subjective workload in the CMM case study; supports the claim that the interaction loop contributes to cognitive-load reduction. (Causal attribution to loop structure is inferred rather than directly isolated experimentally.)
medium | positive | Augmented Reality-Based Training System Using Multimodal Lan... | Cognitive load (subjective workload measures) and qualitative alignment of guida...
An iterative, scenario-refined prompt engineering structure enables the LLM (ChatGPT in this study) to generate task-specific, contextualized guidance that aligns with real-time user actions and machine state.
System design and methods: authors describe developing and refining a prompt structure across multiple machine-operation scenarios and using ChatGPT as the generative engine to produce stepwise instructions and contextual overlay content. Evidence is methodological and qualitative within the paper's development process.
medium | positive | Augmented Reality-Based Training System Using Multimodal Lan... | Quality/alignment of LLM-generated guidance with scenario context and real-time ...
Participants reported lower perceived workload and improved usability when using the AR-MLLM system.
Subjective workload/usability questionnaires were administered in the CMM case study; authors report reduced reported workload under AR-MLLM guidance. (Questionnaire instrument, scales, and sample size not specified in the summary.)
medium | positive | Augmented Reality-Based Training System Using Multimodal Lan... | Subjective workload/usability (self-reported measures)
Participants completed assigned CMM tasks faster when using the AR-MLLM system compared to baseline/traditional training.
Task execution time was recorded in the CMM case study; authors report statistically meaningful reductions in completion time with AR-MLLM guidance versus baseline. (Summary does not give numerical effect sizes or sample size.)
medium | positive | Augmented Reality-Based Training System Using Multimodal Lan... | Task execution time (duration to complete assigned operations)
The AR-MLLM system achieved high measurement/feature-activity accuracy (participants performed correct measurements under AR-MLLM guidance).
Measurement/feature activity correctness was measured in the CMM case study; authors report high measurement accuracy under the AR-MLLM condition. (Exact rates and sample size not provided in the summary.)
medium | positive | Augmented Reality-Based Training System Using Multimodal Lan... | Measurement/feature activity accuracy (correctness of performed measurements)
The AR-MLLM system achieved high task-recognition accuracy (the system correctly identified the current task/step).
Measured task recognition accuracy in the CMM case study; authors report 'high' recognition accuracy for the system. (Exact numeric accuracy and sample size not specified in the summary.)
medium | positive | Augmented Reality-Based Training System Using Multimodal Lan... | Task recognition accuracy (system correctly identifying current task/step)
An AR + multimodal LLM (AR-MLLM) training system can substantially improve training and execution in complex machine operations (demonstrated on a Coordinate Measuring Machine).
Case-study experiment in the paper where human participants performed CMM measurement tasks both with and without the AR-MLLM system; metrics collected included task recognition accuracy, measurement activity correctness, task completion time, and subjective workload/usability. (Participant sample size not specified in the provided summary.)
medium | positive | Augmented Reality-Based Training System Using Multimodal Lan... | Overall training and execution performance (aggregated: task accuracy, task comp...
AI methods such as transfer learning, active learning, and Bayesian approaches improve data efficiency and uncertainty quantification in drug discovery and preclinical modeling.
Methodological literature and exemplar studies summarized in the review describing these approaches; heterogeneous examples, no quantitative synthesis.
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | data efficiency (number of experiments/samples needed), calibration of uncertain...
Clear regulatory alignment (e.g., preparation of credibility plans and qualified digital endpoints) reduces regulatory uncertainty, de-risks investment, and raises adoption rates of AI tools.
Policy and regulatory framework analysis in the review; references to regulatory guidance and qualification processes (narrative, forward-looking).
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | regulatory uncertainty (qualitative), investment adoption rates in AI tools, pac...
Economic value from AI adoption concentrates with data-rich firms and platforms that own large, high-quality datasets and validation pipelines.
Economic analysis and theoretical arguments in the paper (narrative), supported by observed market patterns cited in the literature; no formal empirical valuation provided.
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | firm returns/competitive advantage attributable to dataset ownership and validat...
Adopting equity-by-design (including diverse, non‑European datasets and subgroup evaluation) reduces model bias and improves global generalizability of AI models.
Recommendations and examples in the review; draws on literature documenting subgroup performance differences and bias remediation strategies (narrative evidence).
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | subgroup performance disparities, generalizability across populations/geographie...
AI-enabled trial innovations—such as integration with new approach methodologies (NAMs), adaptive and covariate-adjusted designs, and digital biomarkers—can reduce trial inefficiency while preserving scientific and ethical standards.
Narrative review of trial design optimization methods, examples of adaptive and covariate-adjusted analyses, and digital endpoint qualification discussions; case examples and methodological papers referenced without meta-analysis.
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | trial efficiency metrics (sample size, duration, cost) and maintenance of scient...
Synthesis-aware and physics-informed molecular design increases the downstream feasibility (synthetic accessibility and developability) of AI-designed compounds.
Methodological literature and case examples of synthesis-aware generative models and physics-informed approaches summarized in the narrative review (heterogeneous studies, no pooled estimate).
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | synthetic success rate, developability indicators (e.g., ADMET proxies), time/co...
External validation, explicit applicability-domain reporting, and subgroup performance reporting improve model reliability and support regulatory alignment.
Technical best-practice recommendations and analysis of evolving regulatory frameworks discussed in the review; examples of regulatory guidance and credibility-plan concepts (narrative).
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | model reliability/generalizability metrics and likelihood of regulatory acceptan...
Structure-prediction tools and advances in structural biology can accelerate target identification and validation workflows.
Discussion of structural biology datasets (cryo-EM/X-ray and predicted structures) and use cases in the narrative review; examples include use of predicted structures to inform target characterization (heterogeneous examples).
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | time to target validation and throughput of target characterization
AI-assisted molecular design can improve lead/compound quality (e.g., potency, selectivity, developability) when using synthesis-aware and physics-informed approaches.
Review of method papers and case examples of synthesis-aware generative models and physics-informed neural networks in de novo design; examples drawn from cheminformatics and molecular design studies (heterogeneous, narrative).
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | compound/lead quality metrics (potency, selectivity, developability, synthetic f...
AI can raise early-phase (e.g., Phase I/II) success rates when effectively applied with the technical and governance controls described.
Case studies and literature examples summarized in the narrative review reporting improved early-phase outcomes under AI-supported discovery programs; heterogeneous sample sizes and contexts, no aggregated effect estimate.
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | early-phase clinical success rate (probability of progression through Phase I/II...
Artificial intelligence (AI) can materially shorten drug development timelines when models are predictive, interpretable, and integrated with causal/mechanistic priors, synthesis- and physics-aware molecular design, rigorous external validation (with defined applicability domains), and governance aligned to regulatory requirements.
Narrative synthesis and case examples from recent literature reviewed in the paper; heterogeneous studies and case reports across discovery and early development domains (no pooled/meta-analytic effect size provided).
medium | positive | Artificial Intelligence in Drug Discovery and Development: R... | drug development timeline (project duration from discovery to early development ...
Labor complementarities with agentic AI will shift resources toward oversight, interpretation, and coordination roles rather than routine task execution.
Economic and organizational reasoning; literature synthesis on skill complementarities; no empirical labor-market data analyzed in the paper.
medium | positive | Visioning Human-Agentic AI Teaming: Continuity, Tension, and... | allocation of labor hours/roles toward oversight and coordination tasks
Principal–agent contracting frameworks must be extended to account for evolving agent objectives and open-ended action spaces; contracts should be dynamic and include continuous renegotiation and monitoring.
Theoretical extension and recommendations based on economic reasoning; proposed formal models for future work.
medium | positive | Visioning Human-Agentic AI Teaming: Continuity, Tension, and... | adequacy of static contracting frameworks vs. proposed dynamic contracts
Projection congruence — alignment of forecasts/plans across heterogeneous agents — becomes a central metric for assessing alignment in agentic human–AI teams.
Conceptual modeling and proposal in the paper; introduced as a new measurable construct (projection congruence indices) for future empirical work.
medium | positive | Visioning Human-Agentic AI Teaming: Continuity, Tension, and... | degree of congruence in projected trajectories between human and AI teammates
The DAR framework reframes human oversight as a dynamic, auditable process whose micro-level mechanics and macro-level legitimacy have direct economic consequences for productivity, contracting, regulation, and welfare.
Synthesis claim based on the conceptual framework, formal modeling, derived propositions, and policy/economics implications sections. The claim is theoretical and synthesizing rather than empirically validated.
medium | positive | Human–AI Handovers: A Dynamic Authority Reversal Framework f... | productivity_metrics; contracting_outcomes; regulatory_costs; welfare_measures (...
The Reversal Register will create granular, time-stamped administrative data valuable for structural estimation of trust, error externalities, and productivity comparisons between automation and human judgment.
Design claim linking register contents to potential econometric uses; no empirical data shown—claim about potential data utility.
medium | positive | Human–AI Handovers: A Dynamic Authority Reversal Framework f... | data_granularity (timestamped_entries per decision); suitability_for_structural_...
Reversal Register logs can enable descriptive and causal analyses of handovers and support experimental/quasi-experimental tests (e.g., randomized hysteresis thresholds, A/B override policies).
Implied empirical strategies and instrumentation described; paper outlines how register data would be used for experiments and causal inference. No empirical implementation or sample reported.
medium | positive | Human–AI Handovers: A Dynamic Authority Reversal Framework f... | feasibility_of_experiments; causal_identification_quality; availability_of_time-...
Operationalizing reversible AI leadership via DAR can preserve human accountability while enabling AI-led decisions where appropriate.
Conceptual argument supported by the combined use of authority states, Reversal Register logging, and override mechanisms; no field validation provided.
medium | positive | Human–AI Handovers: A Dynamic Authority Reversal Framework f... | human_accountability_metrics (e.g., attribution clarity); reversibility_rate; co...
DAR incorporates stabilizing mechanisms—hysteresis bands and safe-exit timers—to reduce rapid oscillation of authority and improve stability of handovers.
Formal model components and design proposals (hysteresis and timers) with conceptual argument that these damp oscillation; no empirical validation reported.
medium | positive | Human–AI Handovers: A Dynamic Authority Reversal Framework f... | oscillation_frequency / authority_state_stability; handover_rate; dwell_time
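The damping effect of a hysteresis band combined with a timer can be illustrated with a small state machine. This is a sketch under stated assumptions: the threshold values, the confidence signal, and the reading of a safe-exit timer as a minimum dwell time before the next handover are all hypothetical, not taken from the paper's formal model.

```python
def run_handover(signal, upper=0.7, lower=0.5, min_dwell=2):
    """Track who holds authority ("human" or "ai") over a confidence signal.

    Authority moves to the AI only when confidence reaches `upper`, returns to the
    human only when it falls to `lower`, and no switch is allowed until the current
    holder has kept authority for `min_dwell` steps (assumed timer semantics).
    """
    state, dwell, handovers = "human", min_dwell, 0
    trace = []
    for x in signal:
        target = state
        if state == "human" and x >= upper:
            target = "ai"
        elif state == "ai" and x <= lower:
            target = "human"
        if target != state and dwell >= min_dwell:
            state, dwell = target, 0
            handovers += 1
        else:
            dwell += 1
        trace.append(state)
    return trace, handovers


# Hypothetical AI-confidence readings hovering around 0.6.
signal = [0.58, 0.62, 0.59, 0.61, 0.72, 0.64, 0.58, 0.66, 0.48, 0.55]

# Single threshold at 0.6 with no timer: authority flips on every small crossing.
_, naive = run_handover(signal, upper=0.6, lower=0.6, min_dwell=0)
# Hysteresis band [0.5, 0.7] plus a two-step dwell requirement.
trace, damped = run_handover(signal)
print(naive, damped)  # 6 2
```

On this signal the single-threshold rule produces six handovers while the banded rule produces two, which is the kind of reduction in oscillation the claim describes.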
Improved targeting and dynamic personalization increase marketing ROI by raising conversion rates and lowering customer acquisition costs (CAC).
Economic implication based on observed performance improvements in conversions and resource allocation in case studies; no comprehensive ROI/CAC empirical analysis or sample-size-backed estimates are given.
medium | positive | Personalized Content Selection in Marketing Using BERT and G... | marketing ROI, conversion rate, customer acquisition cost (CAC)
Online A/B or multi-armed tests comparing the BERT–GPT pipeline with RAG+RL against baseline marketing automation produce measurable uplifts in CTR, engagement, conversion rate, retention, and revenue per user.
Paper reports that online experiments were conducted measuring these outcomes and observing uplifts; however, the paper does not provide numeric uplift magnitudes, confidence intervals, or sample sizes.
medium | positive | Personalized Content Selection in Marketing Using BERT and G... | CTR, engagement, conversion rate, retention, revenue per user
Privacy-preserving techniques such as federated learning, differential privacy (DP), and homomorphic encryption can mitigate privacy leakage while enabling model updates and secure aggregation.
Methods section describes applying federated learning with DP mechanisms on gradient updates and homomorphic encryption for aggregation; feasibility is argued but no empirical privacy-utility trade-off results are provided.
medium | positive | Personalized Content Selection in Marketing Using BERT and G... | privacy leakage bounds (DP epsilon), model utility (accuracy/CTR) under DP/feder...
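One common way to apply differential privacy to federated gradient updates is to clip each client's update and add Gaussian noise before averaging. The sketch below shows that pattern in plain Python; the clipping norm, the noise multiplier, and the function names are illustrative assumptions, and the paper does not specify its mechanism at this level of detail. No privacy budget (epsilon) is computed here.

```python
import math
import random


def clip(update, max_norm):
    """Scale an update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]


def private_average(updates, max_norm=1.0, noise_multiplier=1.0, rng=None):
    """Average clipped client updates after adding Gaussian noise to their sum."""
    rng = rng or random.Random(0)
    clipped = [clip(u, max_norm) for u in updates]
    sigma = noise_multiplier * max_norm  # noise scale tied to the clipping bound
    dim = len(updates[0])
    noisy_sum = [
        sum(u[i] for u in clipped) + rng.gauss(0.0, sigma) for i in range(dim)
    ]
    return [s / len(updates) for s in noisy_sum]


# Hypothetical gradient updates from three clients.
client_updates = [[3.0, 4.0], [0.3, 0.4], [-0.6, 0.8]]
print(private_average(client_updates, rng=random.Random(42)))
```

Clipping bounds how much any single client can move the aggregate, and the noise masks the remainder; the cost is lower model utility, which is the trade-off the entry notes the paper leaves unquantified.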
Comparative evaluations and case studies show consistent improvements over traditional marketing automation across engagement and conversion metrics, driven by better intent recognition, contextually appropriate messaging, and adaptive delivery policies.
Reported comparative evaluations (offline metrics and online A/B tests) and case studies attributing gains to improved intent recognition and adaptive policies; empirical details (sample sizes, statistical significance) are not reported in the paper.
medium | positive | Personalized Content Selection in Marketing Using BERT and G... | engagement metrics, conversion metrics (CTR, conversions), attribution to intent...
Continuous online adaptation of models and policies—updating from streaming user interactions—enables per-session and lifetime personalization that improves engagement and conversion outcomes.
Modeling pipeline includes streaming updates and online adaptation; evaluations include online experiments and retention/engagement measurements. (No numerical magnitudes or update frequencies provided.)
medium | positive | Personalized Content Selection in Marketing Using BERT and G... | per-session CTR, engagement metrics, conversion rate, retention
An RL layer that formulates content selection as a contextual bandit / policy optimisation problem improves content selection and delivery using real-time reward signals (CTR, dwell time, conversions).
Paper describes RL-based policy optimisation using reward signals (CTR, session length, conversion events, LTV proxies) and reports online experiments/A/B tests where adaptive policies outperform static rules; exact algorithms and sample sizes not detailed.
medium | positive | Personalized Content Selection in Marketing Using BERT and G... | CTR, session length (dwell time), conversion events, lifetime value proxies
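Content selection framed as a contextual bandit can be sketched with a simple epsilon-greedy policy. Since the entry notes that the paper does not detail its algorithms, epsilon-greedy is a swapped-in illustration, and the contexts, content names, and click probabilities below are invented for the simulation.

```python
import random
from collections import defaultdict


class EpsilonGreedyBandit:
    """Keeps a running mean reward per (context, arm) and mostly exploits the best arm."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)
        self.values = defaultdict(float)

    def best_arm(self, context):
        return max(self.arms, key=lambda a: self.values[(context, a)])

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit the current estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return self.best_arm(context)

    def update(self, context, arm, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n.
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Hypothetical click probabilities per (user segment, content item).
CLICK_PROB = {
    ("new_visitor", "discount_banner"): 0.8,
    ("new_visitor", "product_guide"): 0.2,
    ("returning_customer", "discount_banner"): 0.2,
    ("returning_customer", "product_guide"): 0.8,
}

bandit = EpsilonGreedyBandit(["discount_banner", "product_guide"])
env = random.Random(1)
for _ in range(4000):
    context = env.choice(["new_visitor", "returning_customer"])
    arm = bandit.select(context)
    reward = 1.0 if env.random() < CLICK_PROB[(context, arm)] else 0.0  # simulated click
    bandit.update(context, arm, reward)

print(bandit.best_arm("new_visitor"), bandit.best_arm("returning_customer"))
```

A production system would replace the simulated click with live reward signals such as CTR or conversion events, and would likely use a richer context representation than a single segment label.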
RAG anchors generated content to up-to-date product/catalog/contextual knowledge and reduces hallucinations, increasing factuality of marketing messages.
Architectural description of RAG combining retrieved structured/unstructured knowledge with generative models; factuality/reduction in hallucinations evaluated in offline generation quality assessments using human raters and automatic factuality metrics.
medium | positive | Personalized Content Selection in Marketing Using BERT and G... | factuality scores, rate of hallucinated assertions in generated content
GPT-family decoders generate tailored marketing content (ad copy, email text, chat responses) that matches user context and tone more effectively than template-based generation.
System uses GPT conditioned on user context and product info; generation quality evaluated via human raters and automatic relevance/factuality metrics in offline evaluations. (No quantitative effect sizes reported.)
medium | positive | Personalized Content Selection in Marketing Using BERT and G... | generation relevance, tone match, human-rated content quality, automatic relevan...
An integrated BERT–GPT pipeline augmented with retrieval-augmented generation (RAG) and reinforcement learning (RL) substantially outperforms conventional rule-based or template-driven marketing automation.
Comparative evaluations and case studies reported in the paper, including online A/B or multi-armed tests comparing the full pipeline vs baseline automation and measuring CTR, engagement, conversion rate, retention, and revenue per user. (Sample sizes and statistical details are not specified in the paper.)
medium | positive | Personalized Content Selection in Marketing Using BERT and G... | click-through rate (CTR), engagement metrics, conversion rate, retention, revenu...