The Commonplace

Evidence (6491 claims)

Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
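
As a reading aid, the counts in a matrix row can be converted into direction shares. The following is a minimal sketch using three rows copied from the table above; the helper name `direction_shares` is illustrative and not part of the site. Shares are computed over the four listed direction counts rather than the Total column, because in some rows (Firm Productivity, for example) the listed counts do not sum to the stated total.

```python
# Rows copied from the matrix above: (positive, negative, mixed, null).
ROWS = {
    "Firm Productivity": (435, 57, 88, 20),
    "Job Displacement": (12, 80, 20, 1),
    "Error Rate": (69, 92, 10, 2),
}

def direction_shares(counts):
    """Return each direction's share of the listed counts (hypothetical helper)."""
    labels = ("positive", "negative", "mixed", "null")
    total = sum(counts)
    return {label: n / total for label, n in zip(labels, counts)}

for outcome, counts in ROWS.items():
    shares = direction_shares(counts)
    print(f"{outcome}: " + ", ".join(f"{k} {v:.0%}" for k, v in shares.items()))
```

Read this way, Firm Productivity claims are mostly positive, while Job Displacement claims are mostly negative.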
Filter: Human-AI Collaboration
Across 21 scientific problems spanning six domains, SimpleTES discovers state-of-the-art solutions using gpt-oss models.
Empirical experiments across 21 problems in six domains using gpt-oss models, as reported in the paper.
high positive Evaluation-driven Scaling for Scientific Discovery ability to discover state-of-the-art solutions (solution quality / discovery suc...
We introduce Simple Test-time Evaluation-driven Scaling (SimpleTES), a general framework that strategically combines parallel exploration, feedback-driven refinement, and local selection.
Methodological contribution described in the paper (framework design and algorithmic description).
high positive Evaluation-driven Scaling for Scientific Discovery framework design combining parallel exploration, feedback-driven refinement, and...
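
To make the three named ingredients concrete, here is a toy sketch of one plausible reading of "parallel exploration, feedback-driven refinement, and local selection". It is not the SimpleTES algorithm from the paper: the evaluator, the refinement rule, and every name (`evaluate`, `refine`, `toy_search`) are assumptions chosen only for illustration, applied to a one-dimensional maximization problem.

```python
import random

def evaluate(x):
    """Toy evaluator (assumption): higher is better, peak at x = 3."""
    return -(x - 3.0) ** 2

def refine(x, score, rng):
    """Feedback-driven step (assumption): perturb more when the score is poor."""
    step = min(1.0, 0.1 + abs(score) ** 0.5)
    return x + rng.uniform(-step, step)

def toy_search(num_chains=4, rounds=100, proposals=3, seed=0):
    rng = random.Random(seed)
    # Parallel exploration: independent chains from different starting points.
    chains = [rng.uniform(-10, 10) for _ in range(num_chains)]
    for _ in range(rounds):
        for i, current in enumerate(chains):
            score = evaluate(current)
            candidates = [refine(current, score, rng) for _ in range(proposals)]
            # Local selection: each chain keeps the best of its own pool only.
            chains[i] = max(candidates + [current], key=evaluate)
    return max(chains, key=evaluate)

best = toy_search()
print(best, evaluate(best))
```

Because each chain keeps its current candidate in the selection pool, a chain's score never gets worse from one round to the next.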
We propose seven interface primitives operationalizing verification-centered HCI.
Design contribution: specification of seven interface primitives within the paper (conceptual/design proposal); no user-study or empirical validation reported.
high positive The Instrumental Dissolution of Typing: Why AI Challenges th... existence and specification of interface primitives for verification-centered HC...
We map synthetic literacy -- oral input generating literate output -- as the defining feature of this transition.
Conceptual mapping and theoretical framing within the paper; supported by examples from technology trends but no empirical evaluation reported.
high positive The Instrumental Dissolution of Typing: Why AI Challenges th... emergence of synthetic literacy (oral-to-literate workflows)
Knowledge workers become adversarial auditors rather than keystroke-producers.
Projected role-shift based on the verification-bottleneck thesis and interdisciplinary supporting arguments; no empirical longitudinal workforce study reported.
high positive The Instrumental Dissolution of Typing: Why AI Challenges th... dominant work tasks/roles of knowledge workers (generation vs. auditing)
The central contribution identifies the verification bottleneck: as AI collapses production friction, the primary constraint shifts from generation to evaluation.
Theoretical argument supported by literature synthesis across multiple fields; no direct experimental quantification provided.
high positive The Instrumental Dissolution of Typing: Why AI Challenges th... relative constraint: generation vs. evaluation (verification) in knowledge work
We contribute design guidelines for specialized AI and articulate a vision for 'ecosystem-aware' Humble AI.
Paper's stated contributions (design guidelines and conceptual vision) described in the abstract.
high positive Learning from AVA: Early Lessons from a Curated and Trustwor... design guidance / conceptual framework
Qualitatively, participants used AVA as a specialized 'evidence engine'; reasoned abstention clarified scope boundaries, and trust was calibrated through institutional provenance and page-anchored citations.
Qualitative findings from surveys and 20 interviews reported in the paper (participant quotations and thematic analysis implied in abstract).
high positive Learning from AVA: Early Lessons from a Curated and Trustwor... user behavior and trust calibration (use as evidence engine; role of abstention ...
Difference-in-Differences estimates associate sustained engagement with 2.4-3.9 hours saved weekly.
Quantitative claim reported in the paper based on Difference-in-Differences analysis of usage/engagement data from the evaluation (implicit sample drawn from the >2,200 participants).
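
For readers unfamiliar with the method, the basic two-group, two-period Difference-in-Differences estimator can be sketched as follows. This is the textbook form, not necessarily the specification used in the paper, and the numbers are hypothetical rather than data from the AVA evaluation.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on group means."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical weekly hours spent on a task; NOT data from the AVA evaluation.
effect = did_estimate(treated_pre=8.0, treated_post=5.0,
                      control_pre=8.0, control_post=7.0)
print(effect)  # -2.0: change for engaged users net of the control-group trend
```

The estimate is only interpretable as an effect if the two groups would have followed parallel trends without the intervention.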
AVA operationalizes epistemic humility through two mechanisms: citation verifiability (tracing claims to sources) and reasoned abstention (declining unsupported queries with justification and redirection).
Design claim describing implemented mechanisms in the platform; described in the paper as operational features.
high positive Learning from AVA: Early Lessons from a Curated and Trustwor... epistemic humility operationalization (citation verifiability and reasoned abste...
AVA's multi-agent pipeline enables users to query and receive evidence-based syntheses.
System design and capability claim in the paper (description of multi-agent pipeline producing evidence-based syntheses).
high positive Learning from AVA: Early Lessons from a Curated and Trustwor... output: evidence-based syntheses
AVA is a GenAI platform built on a curated library of over 4,000 World Bank Reports with multilingual capabilities.
System description provided in the paper; statement of dataset size and functionality (library count and multilingual support).
high positive Learning from AVA: Early Lessons from a Curated and Trustwor... system corpus size / multilingual capability
Code-generating Artificial Intelligence has gained popularity within both professional and educational programming settings over the past several years.
Background statement in the paper's introduction (observational claim about recent trends in AI adoption).
high positive Fast and Forgettable: A Controlled Study of Novices' Perform... adoption/popularity of code-generating AI
The emotional effect of the human teammate was significantly more positive and arousing compared to working with Copilot.
Subjective emotion measures (valence/arousal) collected in the study; reported significant differences favoring human teammate on positivity and arousal (n=22).
high positive Fast and Forgettable: A Controlled Study of Novices' Perform... emotional valence and arousal during task
Several dimensions of participants' workload were significantly reduced when using GitHub Copilot.
Subjective workload measures collected during the experiment; multiple workload dimensions reported as significantly lower in the Copilot condition (n=22).
high positive Fast and Forgettable: A Controlled Study of Novices' Perform... subjective workload (multiple dimensions)
Participants performed significantly better with GitHub Copilot than with their human teammate.
Experimental comparison of task performance between Copilot-assisted individual condition and human pair condition; statistical significance reported in results (sample size n=22).
high positive Fast and Forgettable: A Controlled Study of Novices' Perform... programming performance on timed Python tasks
Evaluation demonstrates speed improvements of 6-7 minutes over traditional methods.
Reported empirical timing result in paper abstract: 6-7 minutes (presumably time to validate a change) compared to traditional methods (no further detail or sample size in abstract).
high positive Aether: Network Validation Using Agentic AI and Digital Twin validation time (speed)
Evaluation demonstrates diagnostic coverage of 92-96%.
Reported empirical range in paper abstract (92-96% diagnostic coverage over evaluated cases; specific n not provided in abstract).
Evaluation demonstrates promising results in error detection (100%).
Reported empirical result in paper abstract: 100% error detection over evaluated scenarios (no sample size given in abstract).
By orchestrating agent collaboration atop this digital twin, Aether enables automated, rapid network change validation while reducing manual effort, minimizing errors, and improving operational agility and cost-effectiveness.
High-level claim supported by system design and subsequent empirical evaluation reported in paper (evaluation details referenced in abstract).
high positive Aether: Network Validation Using Agentic AI and Digital Twin automation, manual effort, error rates, operational agility, cost-effectiveness
Aether agents use a unified Network Digital Twin integrating modeling, simulation, and emulation to maintain a consistent, up-to-date network view for verification and testing.
Design claim describing the digital twin's capabilities (modeling, simulation, emulation) as part of the system; presented in paper text.
high positive Aether: Network Validation Using Agentic AI and Digital Twin consistency and freshness of network view for verification/testing
Aether features an agentic architecture with five specialized Network Operations AI agents that collaboratively handle the change validation lifecycle from intent analysis to network verification and testing.
System architecture claim in paper describing five specialized agents (design specification; no empirical sample size).
high positive Aether: Network Validation Using Agentic AI and Digital Twin architectural decomposition into five agents
Aether integrates Generative Agentic AI with a multi-functional Network Digital Twin to automate and streamline network change validation workflows.
Paper describes Aether system design and architecture combining agentic AI and a digital twin (design-level claim; architectural description).
high positive Aether: Network Validation Using Agentic AI and Digital Twin automation/streamlining of change validation workflows
A common response to these worries stresses that the goods derived from work can be found elsewhere, often in better activities, suggesting that the proliferation of AI-powered automation does not threaten the meaningfulness of people’s lives.
Description of a commonly offered counterargument in the literature and popular debate (conceptual/literature-summary; no empirical data or sample reported).
high positive Is artificial intelligence a threat to meaningful work and l... argument that non-work activities can replace meaning from work (impact on meani...
The study uses a combination of cognitive systems theory, diplomatic negotiation models, and empirical Human-in-the-Loop experiments as its methodological basis.
Methods description in the paper listing theoretical foundations and empirical HITL experiments as components of the study design.
high positive Strategic Cognition and Artificial Diplomacy: Designing Huma... methodological approach (integration of theory and HITL experiments)
The paper outlines recommendations for international norm development, capacity building, and the creation of interoperable, transparent AI systems for diplomacy.
Policy recommendation section of the paper proposing international norms, capacity-building measures, and interoperable transparent system design.
high positive Strategic Cognition and Artificial Diplomacy: Designing Huma... policy recommendations proposed (norm development, capacity building, interopera...
Experimental HITL data indicate a 17% reduction in cognitive bias for hybrid human-AI teams.
Human-in-the-Loop (HITL) experiments reported in the paper; comparison of cognitive bias measures between hybrid teams and baseline (sample size not provided in summary).
high positive Strategic Cognition and Artificial Diplomacy: Designing Huma... cognitive bias (reduction)
Experimental HITL data indicate that hybrid human-AI teams achieved 23% faster consensus-building.
Human-in-the-Loop (HITL) experiments reported in the paper; experimental comparison between hybrid human-AI teams and baseline (details on sample size not reported in summary).
high positive Strategic Cognition and Artificial Diplomacy: Designing Huma... time to consensus (consensus-building speed)
The framework is validated through real-world and simulated case studies, including UN ceasefire mediation, EU sentiment-monitoring for conflict diplomacy, and African Union peacekeeping planning.
Validation reported via a set of real-world and simulated case studies described in the paper (case study methodology; specific cases named).
high positive Strategic Cognition and Artificial Diplomacy: Designing Huma... case-study-based validation of framework applicability
Each layer augments a core dimension of diplomatic reasoning, enabling interpretable AI contributions, foresight analysis, culturally sensitive framing, and legally compliant outputs.
Conceptual mapping of each proposed layer to functional capabilities described in the paper; claimed alignment with interpretability, foresight, cultural framing, and legal compliance.
high positive Strategic Cognition and Artificial Diplomacy: Designing Huma... interpretability, foresight analysis, culturally sensitive framing, legal compli...
The study proposes a five-layer Human-AI collaboration architecture tailored to multilateral diplomacy consisting of: (1) Context Modeling, (2) Scenario Generation, (3) Cognitive Interfacing, (4) Decision Support, and (5) Ethical-Normative Governance.
Architectural proposal in the paper based on synthesis of literature and design choices; claimed as the output of the conceptual framework.
high positive Strategic Cognition and Artificial Diplomacy: Designing Huma... definition of five-layer architecture (components enumerated)
This paper develops the concept of Artificial Diplomacy as a structured interface between human strategic cognition and machine-supported reasoning.
Theoretical development drawing on cognitive systems theory and diplomatic negotiation models; described design and conceptual argumentation in the paper.
high positive Strategic Cognition and Artificial Diplomacy: Designing Huma... conceptualization of 'Artificial Diplomacy' (design of an interface)
These divergences (between simulation and human data and across scenarios) provide crucial insights for the future design of human-centered AI agents.
Paper conclusion in abstract indicating practical implications and discussion of how divergences vary across contexts and what that implies for design.
With actual human subjects, AI attributes—particularly transparency—were much more impactful than personality traits.
Abstract reporting results from the human-subjects experiment (N=290) indicating AI attributes, especially chain-of-thought transparency, had greater impact.
high positive Imperfectly Cooperative Human-AI Interactions: Comparing the... relative_influence_on_outcomes (AI_attributes_vs_personality)
In simulation experiments, personality traits and AI attributes were comparably influential on outcomes.
Abstract claim summarizing simulation experiment results (based on the 2,000 simulated runs) that personality and AI attributes were influential.
high positive Imperfectly Cooperative Human-AI Interactions: Comparing the... influence_on_interaction_outcomes
Policymakers can reinforce these conditions by shifting from technology-neutral principles to auditable process standards that couple AI investment with reskilling and data-quality obligations.
Policy recommendation based on the study's findings and synthesis; presented as a normative implication rather than empirically tested within the study. (Sample size not reported.)
high positive Overcoming Resistance to Change: Artificial Intelligence in ... policy effectiveness in reinforcing safe, equitable AI adoption
Leaders should fund training coverage and design (not just headline hours), equip non-specialists to interpret model outputs, pair performance artefacts with participatory routines, and treat explainability as a usability requirement to achieve durable, auditable value in safety-critical energy contexts.
Prescriptive recommendation based on a 'field-tested playbook' synthesised from the multi-case qualitative study (interviews, surveys, documents). The claim is drawn from authors' interpretation of cross-case patterns rather than causal inference. (Sample size not reported.)
high positive Overcoming Resistance to Change: Artificial Intelligence in ... durable, auditable value / legitimacy and sustained use
Structured upskilling and precise recourse mechanisms are associated with higher confidence, productivity, and clearer sustainability pathways.
Observed association in multi-case qualitative data: interviews, staff/manager surveys, and policy documents; triangulated through thematic coding and cross-case synthesis. (Sample size not reported.)
high positive Overcoming Resistance to Change: Artificial Intelligence in ... worker confidence and productivity; clarity of sustainability pathways
A tight workflow fit that minimises cognitive overhead at the decision point accelerates legitimate use and strengthens links to emissions monitoring and predictive-maintenance outcomes.
Synthesised from interviews, Likert-scale surveys of technical staff and managers, and internal workflow/policy documents across multiple cases in the energy sector. (Sample size not reported.)
high positive Overcoming Resistance to Change: Artificial Intelligence in ... rate of legitimate use (adoption) and effectiveness of emissions monitoring and ...
Communicative governance — e.g. model cards, bias tests, validation reports, and explicit appeal rights — earns trust, curbs shadow workarounds, and improves safety culture.
Reported from thematic coding of interviews, surveys of staff and managers, and documentary evidence across multiple cases; triangulation claimed. (Sample size not reported.)
high positive Overcoming Resistance to Change: Artificial Intelligence in ... trust, incidence of shadow workarounds, and safety culture
Broad-based capability building beyond specialist teams prevents benefits from concentrating in expert enclaves and reduces brittle scale.
Derived from cross-case thematic synthesis of interviews, Likert surveys of mid-level managers and technical staff, and internal policy/strategy document analysis (multi-case qualitative evidence). (Sample size not reported.)
high positive Overcoming Resistance to Change: Artificial Intelligence in ... distribution of benefits across organisation and scalability of AI use
Three reinforcing levers shape adoption outcomes: (1) broad-based capability building beyond specialist teams, (2) communicative governance that couples transparency with contestability, and (3) a tight workflow fit that minimises cognitive overhead at the decision point.
Qualitative, multi-case design triangulating a semi-structured interview with a senior manager, Likert-scale surveys of mid-level managers and technical staff, and analysis of internal policies and strategy documents; thematic coding with intercoder reliability and cross-case synthesis. (Sample size not reported.)
high positive Overcoming Resistance to Change: Artificial Intelligence in ... adoption outcomes / legitimate use
The framework demonstrates how digital intelligence can enhance supply chain resilience while supporting, rather than replacing, human decision-making (human-centric/planner-centered decision support).
Framework design emphasizes human-centric decision support; field deployment reported to be planner-centered (paper claims support rather than replacement of human decision-making).
high positive Enhancing Supply Chain Resilience in Textile SMEs: A Human-C... human-centric support vs. automation replacing planners
The results indicate that upstream textile SMEs can leverage publicly visible e-commerce signals to enhance production planning responsiveness, minimize inventory exposure and dye-lot disruptions, and strengthen resilience to demand uncertainty through planner-centered digital decision support.
Synthesis claim based on model results, validation of comment volume as sales proxy, Monte Carlo-based production guidance, decision dashboard design, and the 12-month field study outcomes.
high positive Enhancing Supply Chain Resilience in Textile SMEs: A Human-C... production planning responsiveness, inventory exposure, dye-lot disruptions, res...
This research extends the C2M paradigm from downstream retail contexts to upstream textile SMEs and proposes an integrated and operationally feasible intelligence framework for resource-constrained manufacturers.
Conceptual claim supported by the methodological development, large-scale e-commerce data modeling, and a field deployment at one SME reported in paper.
high positive Enhancing Supply Chain Resilience in Textile SMEs: A Human-C... extension and operational feasibility of C2M paradigm for upstream textile SMEs
In the same 12-month field study, implementation resulted in a 16% increase in capacity utilization.
Field deployment measurements reported in paper for one Taiwanese dyeing SME over 12 months.
In the same 12-month field study, implementation resulted in a 31% decrease in dye lot changeovers.
Field deployment measurements reported in paper for one Taiwanese dyeing SME over 12 months.
high positive Enhancing Supply Chain Resilience in Textile SMEs: A Human-C... number of dye lot changeovers
In a 12-month field study at a Taiwanese dyeing SME, implementation resulted in a 28% reduction in inventory value.
Field deployment and before-after (or intervention) measurement reported in paper over 12 months at one Taiwanese dyeing SME.
Forecasts were translated into production guidance using Monte Carlo simulation and a decision dashboard.
Description of operationalization methods in paper: Monte Carlo simulation and a planner-facing decision dashboard used to convert forecasts into production guidance.
high positive Enhancing Supply Chain Resilience in Textile SMEs: A Human-C... operational production guidance derived from forecasts (method implementation)
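
A minimal sketch of how a forecast might be turned into production guidance with Monte Carlo simulation. The paper's actual procedure is not described in this summary, so the demand model (normal around the forecast, truncated at zero), the service-level rule, and the function name `recommend_quantity` are all assumptions made for illustration.

```python
import random

def recommend_quantity(forecast_mean, forecast_sd, service_level=0.9,
                       n_draws=10_000, seed=0):
    """Simulate demand and return the quantity covering `service_level` of draws."""
    rng = random.Random(seed)
    # Assumption: demand is normal around the forecast, truncated at zero.
    draws = sorted(max(0.0, rng.gauss(forecast_mean, forecast_sd))
                   for _ in range(n_draws))
    index = min(n_draws - 1, int(service_level * n_draws))
    return draws[index]

# Hypothetical forecast values, chosen only to exercise the function.
qty = recommend_quantity(forecast_mean=1000, forecast_sd=150)
print(round(qty))
```

Raising `service_level` increases the recommended quantity, which trades higher inventory exposure for a lower risk of falling short of demand; a planner-facing dashboard could expose that trade-off directly.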
Consumer comment volume was validated as a proxy for sales activity, facilitating demand estimation.
Validation analysis reported in paper linking consumer comment volume to sales activity (methodological validation; specific statistical details not provided in abstract).
high positive Enhancing Supply Chain Resilience in Textile SMEs: A Human-C... validity of consumer comment volume as proxy for sales activity