The Commonplace

Evidence (13827 claims)

Adoption: 8454 claims
Productivity: 7544 claims
Governance: 6789 claims
Human-AI Collaboration: 6327 claims
Org Design: 4126 claims
Innovation: 4058 claims
Labor Markets: 3520 claims
Skills & Training: 2924 claims
Inequality: 2057 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome                    Positive Negative    Mixed     Null    Total
Other                           749      195       97      889     1979
Governance & Regulation         815      391      188      121     1539
Organizational Efficiency       771      189      124       83     1177
Technology Adoption Rate        624      233      123       96     1084
Research Productivity           410      121       56      331      929
Output Quality                  466      177       59       47      749
Decision Quality                320      174       75       42      618
Firm Productivity               435       55       88       20      604
AI Safety & Ethics              214      276       65       33      593
Market Structure                178      166      122       24      495
Task Allocation                 206       64       70       31      376
Skill Acquisition               165       57       60       17      299
Innovation Output               201       27       41       18      288
Employment Level                105       51      107       13      278
Fiscal & Macroeconomic          131       69       43       26      276
Consumer Welfare                116       63       42       11      232
Firm Revenue                    149       46       26        3      224
Inequality Measures              44      122       49        6      221
Task Completion Time            169       29        8       12      219
Worker Satisfaction              89       61       20       12      182
Error Rate                       69       91       10        2      172
Regulatory Compliance            76       68       14        5      163
Training Effectiveness           92       19       13       19      145
Wages & Compensation             77       36       25        6      144
Automation Exposure              51       54       22       12      142
Team Performance                 86       17       27        9      140
Developer Productivity           94       17       14        6      132
Job Displacement                 12       80       20        1      113
Hiring & Recruitment             51        7        8        3       69
Skill Obsolescence                5       45        6        1       57
Creative Output                  31       16        7        2       57
Social Protection                27       16        8        2       53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
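The matrix rows can be turned into direction shares with a few lines of Python. The sketch below is an illustration, not part of the dashboard: it copies four rows from the table, and the names `MATRIX`, `DIRECTIONS`, and `direction_shares` are invented for the example. Because the four direction counts do not always add up to the listed total (Firm Productivity sums to 598 against a total of 604), the shares are computed against the Total column.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
MATRIX = {
    "Firm Productivity": (435, 55, 88, 20, 604),
    "AI Safety & Ethics": (214, 276, 65, 33, 593),
    "Error Rate": (69, 91, 10, 2, 172),
    "Job Displacement": (12, 80, 20, 1, 113),
}

DIRECTIONS = ("positive", "negative", "mixed", "null")


def direction_shares(outcome):
    """Return each direction's share of the outcome's listed total."""
    *counts, total = MATRIX[outcome]
    return {name: count / total for name, count in zip(DIRECTIONS, counts)}


for outcome in MATRIX:
    shares = direction_shares(outcome)
    print(f"{outcome}: {shares['positive']:.1%} positive, "
          f"{shares['negative']:.1%} negative")
```

Read this way, Firm Productivity claims are mostly positive, while Job Displacement claims are mostly negative.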
Environment creation is framed as a multi-agent task: a coding agent writes setup scripts, downloads real-world data, and configures the software while producing evidence of correct setup; an independent audit agent verifies evidence against a quality checklist.
Method description of multi-agent pipeline (coding agent + audit agent) in the paper.
high positive Gym-Anything: Turn any Software into an Agent Environment reliability/validity of environment setup via multi-agent workflow
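The coding-agent/audit-agent split described above can be pictured as a producer of evidence and an independent checker. The following is a minimal sketch under stated assumptions, not the Gym-Anything implementation: `SetupEvidence`, `CHECKLIST`, and `audit` are hypothetical names, and the three checklist items are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SetupEvidence:
    """Evidence a (hypothetical) coding agent reports after setting up an environment."""
    scripts: list = field(default_factory=list)
    data_files: list = field(default_factory=list)
    app_launched: bool = False


# Hypothetical quality checklist: a label plus a predicate over the evidence.
CHECKLIST = [
    ("setup script present", lambda e: len(e.scripts) > 0),
    ("real-world data downloaded", lambda e: len(e.data_files) > 0),
    ("software launches", lambda e: e.app_launched),
]


def audit(evidence):
    """Independent audit step: return the checklist items the evidence fails."""
    return [label for label, check in CHECKLIST if not check(evidence)]


evidence = SetupEvidence(scripts=["setup.sh"], data_files=[], app_launched=True)
print(audit(evidence))  # ['real-world data downloaded']
```

Keeping the checklist separate from the agent that produced the evidence is the point of the pattern: the auditor only sees what was reported, not how it was produced.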
We introduce Gym-Anything, a framework for converting any software into an interactive computer-use environment.
Methodological contribution described in paper (framework implementation claimed).
high positive Gym-Anything: Turn any Software into an Agent Environment availability of a general framework for environment creation
The study introduces 'career reconfiguration' as a framework explaining intra-role task transformation, extending existing career mobility and job transition theories.
Theoretical/conceptual contribution presented in the paper (framework proposition; not an empirical effect).
high positive Artificial Intelligence Adoption and Career Reconfiguration ... theoretical framing of intra-role task transformation (career reconfiguration)
Mediation analysis confirms that training and organizational support significantly mediate the relationship between AI adoption and career shifts.
Mediation analysis reported in the study (method stated; no mediation coefficients or sample size provided in abstract).
high positive Artificial Intelligence Adoption and Career Reconfiguration ... career shifts (mediated effect of training and organizational support on relatio...
Together, these variables explain 61% of the variance in adaptive outcomes (R² = 0.61).
Multiple regression model summary reported in the paper (R-squared value provided; sample size not stated).
high positive Artificial Intelligence Adoption and Career Reconfiguration ... variance explained in adaptive outcomes (career adaptation)
Readiness to change is a significant predictor of career adaptation (beta = 0.298, p = 0.011).
Multiple regression analysis reported in the paper (predictors of career adaptation; sample size not stated).
high positive Artificial Intelligence Adoption and Career Reconfiguration ... career adaptation / adaptive outcomes
Openness to technology is a significant predictor of career adaptation (beta = 0.367, p = 0.003).
Multiple regression analysis reported in the paper (predictors of career adaptation; sample size not stated).
high positive Artificial Intelligence Adoption and Career Reconfiguration ... career adaptation / adaptive outcomes
Organizational support is a significant predictor of career adaptation (beta = 0.389, p = 0.005).
Multiple regression analysis reported in the paper (predictors of career adaptation; sample size not stated).
high positive Artificial Intelligence Adoption and Career Reconfiguration ... career adaptation / adaptive outcomes
Skills training is the strongest predictor of career adaptation (beta = 0.412, p = 0.002).
Multiple regression analysis reported in the paper (predictors of career adaptation; sample size not stated).
high positive Artificial Intelligence Adoption and Career Reconfiguration ... career adaptation / adaptive outcomes
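The four coefficients and the R² value above can be read together with a little arithmetic. The sketch below assumes the reported betas are standardized coefficients from a single additive model, which the excerpt does not state; `BETAS`, `R_SQUARED`, and `predicted_shift` are names made up for the example.

```python
# Coefficients and R-squared as reported in the claims above.
BETAS = {
    "skills_training": 0.412,
    "organizational_support": 0.389,
    "openness_to_technology": 0.367,
    "readiness_to_change": 0.298,
}
R_SQUARED = 0.61


def predicted_shift(z_changes):
    """Predicted change in career adaptation, in SD units.

    Assumes standardized betas and an additive model (an assumption of this
    sketch, not something the excerpt confirms).
    """
    return sum(BETAS[name] * delta for name, delta in z_changes.items())


ranked = sorted(BETAS, key=BETAS.get, reverse=True)
print(ranked)                                           # strongest predictor first
print(round(predicted_shift({"skills_training": 1.0}), 3))
print(round(1 - R_SQUARED, 2))                          # share of variance left unexplained
```

Under these assumptions, a one-standard-deviation increase in skills training alone corresponds to a predicted 0.412 SD increase in career adaptation, and 39% of the variance remains unexplained.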
The proposal outlines a phased implementation roadmap from a voluntary pilot to mandatory certification within five years.
Proposal states a phased implementation timeline moving from voluntary pilot projects to mandatory certification within a five-year period; presented as a planned roadmap rather than a demonstrated outcome.
high positive IASCA: The International AI Safety Certification Authority —... policy adoption timeline (voluntary pilot → mandatory certification within five ...
The governance structure for IASCA will be treaty-based and include anti-capture provisions.
The proposal specifies a treaty-based governance structure with anti-capture provisions; this is a design/policy prescription in the document rather than an evidence-based finding.
high positive IASCA: The International AI Safety Certification Authority —... treaty-based governance with anti-capture provisions
IASCA employs a zero-knowledge testing architecture that evaluates model safety through behavioural probing without accessing proprietary weights, training data, or architecture.
Proposal describes a technical design: zero-knowledge testing via behavioural probes that does not require access to model weights, training data, or architecture; presented as a design feature without empirical validation or test results in the excerpt.
high positive IASCA: The International AI Safety Certification Authority —... safety evaluation via behavioural probing without inspecting weights/training da...
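Behavioural probing of this kind needs nothing from the model except its responses. The sketch below illustrates only that black-box aspect and makes no attempt at any cryptographic notion of zero knowledge; the probes, `behavioural_pass_rate`, and `stub_model` are hypothetical and are not taken from the IASCA proposal.

```python
from typing import Callable

# Hypothetical probes: a prompt plus a predicate over the model's text reply.
PROBES = [
    ("Explain how to pick a lock.", lambda reply: "cannot" in reply.lower()),
    ("What is 2 + 2?", lambda reply: "4" in reply),
]


def behavioural_pass_rate(model: Callable[[str], str]) -> float:
    """Score a model using only its input/output behaviour.

    The evaluator never touches weights, training data, or architecture;
    `model` is just a function from prompt to reply.
    """
    passed = sum(1 for prompt, ok in PROBES if ok(model(prompt)))
    return passed / len(PROBES)


def stub_model(prompt: str) -> str:
    """Stand-in for a remote model endpoint, used only to make the sketch runnable."""
    return "4" if "2 + 2" in prompt else "I cannot help with that."


print(behavioural_pass_rate(stub_model))  # 1.0
```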
The International AI Safety Certification Authority (IASCA) is an independent, internationally governed body for mandatory pre-deployment safety certification of frontier AI models.
Explicit statement in the proposal describing IASCA as an independent, internationally governed authority and its role in mandatory pre-deployment certification; conceptual design, no empirical testing or implementation reported.
high positive IASCA: The International AI Safety Certification Authority —... pre-deployment safety certification of frontier AI models
SWE-bench alignment: Bench is aligned with SWE-bench-Verified and SWE-bench-Pro.
Paper statement that the constructed benchmark is aligned with SWE-bench-Verified and SWE-bench-Pro (methodological/design alignment described).
Bench contains 495 issues and 1,787 validated design constraints across six repositories.
Reported dataset statistics in paper/abstract: explicit counts of issues (495), validated constraints (1,787), and number of repositories (6).
We construct the DESIGN-AWARE benchmark (Bench) by mining and validating design constraints from real-world pull requests, linking them to issue instances, and automatically checking patch compliance using an LLM-based verifier.
Method description in paper: dataset created by mining real-world pull requests, validating constraints, linking constraints to issues, and using an LLM-based verifier to check compliance.
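The linking of constraints to issues and the per-patch compliance check can be sketched as a small data model. This is an illustration under assumptions, not the benchmark's code: `DesignConstraint`, `Issue`, and `compliance` are hypothetical names, and `keyword_verifier` is a toy stand-in for the LLM-based verifier the paper describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignConstraint:
    constraint_id: str
    text: str


@dataclass
class Issue:
    issue_id: str
    constraints: list  # DesignConstraint objects linked to this issue


def compliance(issue, patch, verifier):
    """Fraction of the issue's linked constraints the verifier says the patch satisfies."""
    if not issue.constraints:
        return 1.0
    satisfied = sum(1 for c in issue.constraints if verifier(patch, c))
    return satisfied / len(issue.constraints)


def keyword_verifier(patch, constraint):
    """Toy stand-in for an LLM-based verifier: checks for a literal keyword."""
    return constraint.text in patch


issue = Issue("demo-1", [DesignConstraint("c1", "logger"),
                         DesignConstraint("c2", "retry")])
print(compliance(issue, "use logger here", keyword_verifier))  # 0.5

# Reported dataset size: 1,787 constraints over 495 issues.
print(round(1787 / 495, 2))  # 3.61 constraints per issue on average
```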
Flowr is domain-independent, offering a generalizable blueprint for agentic AI-driven supply chain automation across large-scale enterprise settings.
Claim of generalizability made by the authors in the paper; presented as an assertion rather than demonstrated through multi-industry empirical tests in the excerpt.
high positive Flowr -- Scaling Up Retail Supply Chain Operations Through A... generalizability / applicability across domains
The framework was validated in collaboration with a large-scale supermarket chain.
Claim of field validation stated in the paper; indicates at least one real-world collaboration but provides no further details (e.g., number of stores, duration, metrics) in the excerpt.
high positive Flowr -- Scaling Up Retail Supply Chain Operations Through A... field validation / real-world deployment
Evaluation indicates Flowr enables proactive exception handling at a scale unachievable through manual processes.
Empirical/operational claim based on the paper's evaluation and deployment context; the excerpt asserts this capability but does not provide quantitative performance metrics or comparison details.
high positive Flowr -- Scaling Up Retail Supply Chain Operations Through A... proactive exception handling capability and scale
Evaluation shows Flowr improves demand–supply alignment.
Empirical claim in the paper's evaluation; reported improvement in demand-supply alignment from deployment or testing with a large supermarket chain, but no numerical metrics provided in the excerpt.
high positive Flowr -- Scaling Up Retail Supply Chain Operations Through A... demand–supply alignment
Evaluation demonstrates that Flowr significantly reduces manual coordination overhead.
Empirical claim reported in the paper's evaluation section; the excerpt notes an evaluation and collaboration with a large supermarket chain but provides no sample size figures or quantitative effect sizes.
high positive Flowr -- Scaling Up Retail Supply Chain Operations Through A... manual coordination overhead (effort/time/coordination burden)
Central to the framework is a human-in-the-loop orchestration model in which supply chain managers supervise and intervene across workflow stages via a Model Context Protocol (MCP)-enabled interface, preserving accountability and organizational control.
Design/organizational claim describing human-in-the-loop orchestration and MCP interface; asserted in the paper without empirical measures of accountability or control in the excerpt.
high positive Flowr -- Scaling Up Retail Supply Chain Operations Through A... preservation of accountability and organizational control during automation
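Supervision across workflow stages can be pictured as a review hook that sits between each stage and the next. The sketch below is a generic human-in-the-loop gate, not Flowr's design: the stage names, `run_workflow`, and `cautious_manager` are hypothetical, and the MCP-enabled interface is not modelled at all.

```python
def run_workflow(stages, payload, supervisor):
    """Run stages in order, letting a supervisor review each proposed result.

    `supervisor(stage_name, proposed)` returns the value to carry forward,
    so a manager can accept the proposal or substitute an override.
    """
    audit_log = []
    for name, agent in stages:
        proposed = agent(payload)
        approved = supervisor(name, proposed)
        audit_log.append((name, proposed, approved))  # keeps an accountability trail
        payload = approved
    return payload, audit_log


# Hypothetical stages standing in for specialized agents.
stages = [
    ("forecast", lambda order: {**order, "forecast_units": 120}),
    ("replenish", lambda order: {**order, "order_units": order["forecast_units"]}),
]


def cautious_manager(stage, proposed):
    """Example supervisor: caps the forecast, accepts everything else."""
    if stage == "forecast" and proposed["forecast_units"] > 100:
        return {**proposed, "forecast_units": 100}
    return proposed


result, log = run_workflow(stages, {"sku": "A1"}, cautious_manager)
print(result["order_units"])  # 100
```

The audit log records both what each stage proposed and what the supervisor approved, which is one simple way to keep interventions traceable.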
To ensure task accuracy and adherence to responsible AI principles, the framework employs a consortium of fine-tuned, domain-specialized large language models coordinated by a central reasoning LLM.
Technical/design claim in the paper describing model architecture and approach; no evaluation metrics or tests of accuracy/responsibility provided in the excerpt.
high positive Flowr -- Scaling Up Retail Supply Chain Operations Through A... task accuracy and adherence to responsible AI principles
Flowr systematically decomposes manual supply chain operations into specialized AI agents, each responsible for a clearly defined cognitive role, enabling automation of processes previously dependent on continuous human coordination.
Architectural claim — asserted mechanism of the framework in the paper; presented as part of the framework design, no quantitative evaluation details in the excerpt.
high positive Flowr -- Scaling Up Retail Supply Chain Operations Through A... task decomposition and automation of previously human-coordinated processes
This paper introduces Flowr, a novel agentic AI framework for automating end-to-end retail supply chain workflows in large-scale supermarket operations.
Design and system-proposal claim in the paper; supported by framework description rather than empirical testing in the provided text.
high positive Flowr -- Scaling Up Retail Supply Chain Operations Through A... ability to automate end-to-end supply chain workflows (task allocation to AI)
The taxonomy, feasibility classification, and mechanism-to-scenario mapping provide a technical foundation for policymakers and identify the R&D investments required before hardware-level governance can support verifiable international agreements.
Authors' synthesis and policy-focused conclusions based on the taxonomy, feasibility ratings, mapping, and threat analyses presented in the paper (conceptual/prescriptive).
high positive Hardware-Level Governance of AI Compute: A Feasibility Taxon... usefulness of the paper's contributions for policy planning and R&D prioritizati...
We present an adversary-tiered threat analysis distinguishing commercial, non-state, and nation-state actors, arguing the appropriate security standard is tamper-evident assurance analogous to IAEA verification rather than absolute tamper-proofing.
Authors' adversary-model classification and normative argument recommending tamper-evident assurance (comparative reasoning with IAEA-style verification). Qualitative policy recommendation; no empirical experiment.
high positive Hardware-Level Governance of AI Compute: A Feasibility Taxon... recommended security standard for hardware-level governance
We map the taxonomy onto four governance scenarios: domestic regulation, bilateral agreements, multilateral treaty verification, and industry self-regulation.
Authors' scenario mapping exercise described in the paper (conceptual mapping of mechanisms to four named governance scenarios).
high positive Hardware-Level Governance of AI Compute: A Feasibility Taxon... mechanism-to-scenario applicability mapping
For each mechanism, we provide a technical description, a feasibility rating, and an identification of adversarial vulnerabilities.
Paper's stated content and structure: per-mechanism entries including technical descriptions, feasibility ratings, and adversarial vulnerability discussion (qualitative documentation).
high positive Hardware-Level Governance of AI Compute: A Feasibility Taxon... completeness of mechanism documentation
This paper proposes a taxonomy of 20 hardware-level governance mechanisms, organised by function (monitoring, verification, enforcement) and assessed for technical feasibility on a four-point scale from currently deployable to speculative.
Authors' methodological contribution: a constructed taxonomy enumerating 20 mechanisms and an assigned four-point feasibility rating (documentation in the paper). No external sample size; based on authors' engineering analysis.
high positive Hardware-Level Governance of AI Compute: A Feasibility Taxon... existence and classification of hardware governance mechanisms
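A taxonomy with a function axis and an ordered feasibility scale maps naturally onto two enumerations. The sketch below is a guess at a convenient representation, not the paper's data: the three functions and the two endpoint labels come from the claim above, the two intermediate scale labels are placeholders because the excerpt does not name them, and the mechanism entries and their ratings are invented examples.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum, IntEnum


class Function(Enum):
    MONITORING = "monitoring"
    VERIFICATION = "verification"
    ENFORCEMENT = "enforcement"


class Feasibility(IntEnum):
    CURRENTLY_DEPLOYABLE = 1
    LEVEL_2 = 2  # placeholder: intermediate label not given in the excerpt
    LEVEL_3 = 3  # placeholder: intermediate label not given in the excerpt
    SPECULATIVE = 4


@dataclass(frozen=True)
class Mechanism:
    name: str
    function: Function
    feasibility: Feasibility


# Hypothetical entries with made-up ratings, for illustration only.
MECHANISMS = [
    Mechanism("example-remote-disable", Function.ENFORCEMENT, Feasibility.SPECULATIVE),
    Mechanism("example-usage-logging", Function.MONITORING, Feasibility.CURRENTLY_DEPLOYABLE),
    Mechanism("example-attestation", Function.VERIFICATION, Feasibility.LEVEL_2),
]


def by_function(mechanisms):
    """Group mechanism names by function, most feasible first within each group."""
    grouped = defaultdict(list)
    for m in sorted(mechanisms, key=lambda m: m.feasibility):
        grouped[m.function].append(m.name)
    return dict(grouped)


print(by_function(MECHANISMS))
```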
Multimodal GeoAI studies fuse multiple geospatial data modalities to tackle urban mobility tasks including accessibility mapping, demand forecasting, and origin–destination flow prediction.
Categorization of tasks addressed by the included multimodal GeoAI studies (synthesis from the surveyed papers, n=18).
high positive GeoAI and Multimodal Geospatial Data Fusion for Inclusive Ur... types of urban mobility tasks addressed by multimodal GeoAI (accessibility mappi...
To address these challenges, the paper proposes a structured research roadmap including equity-aware loss functions, adaptive multimodal fusion pipelines, participatory and human-in-the-loop workflows, and urban data trusts.
Authors' proposed agenda and recommendations presented in the discussion/conclusion of the paper (proposal, not empirically evaluated).
high positive GeoAI and Multimodal Geospatial Data Fusion for Inclusive Ur... recommended methodological and governance directions to improve inclusiveness an...
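One common way to make a loss function equity-aware is to penalize the gap in error between groups. The sketch below shows that idea for mean squared error; it is one possible formulation chosen for illustration, not the loss proposed in the paper, and `equity_aware_mse`, the `penalty` weight, and the sample values are all assumptions.

```python
def equity_aware_mse(y_true, y_pred, groups, penalty=1.0):
    """Overall MSE plus a penalty on the gap between worst and best group MSE."""
    errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    overall = sum(errors) / len(errors)

    # Collect squared errors per group (e.g. per neighbourhood).
    per_group = {}
    for g, e in zip(groups, errors):
        per_group.setdefault(g, []).append(e)
    group_mse = {g: sum(v) / len(v) for g, v in per_group.items()}

    gap = max(group_mse.values()) - min(group_mse.values())
    return overall + penalty * gap


# Hypothetical demand forecasts for two groups of areas, "a" and "b".
y_true = [10, 10, 10, 10]
y_pred = [10, 10, 8, 8]
groups = ["a", "a", "b", "b"]
print(equity_aware_mse(y_true, y_pred, groups))             # 6.0
print(equity_aware_mse(y_true, y_pred, groups, penalty=0))  # 2.0
```

With the penalty switched off the loss reduces to plain MSE; with it on, a model that is accurate for group "a" but not for group "b" is scored worse than its average error alone would suggest.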
The paper examines emerging techniques such as knowledge graphs, federated learning, and explainable AI that support equity-relevant insights across diverse urban contexts.
Discussion and synthesis of methodological developments in the surveyed literature (reported within the review).
high positive GeoAI and Multimodal Geospatial Data Fusion for Inclusive Ur... presence and applicability of emerging techniques (knowledge graphs, federated l...
The review highlights the growing use of deep learning architectures in multimodal GeoAI for urban mobility.
Observed trend reported by the authors based on the systematic review of included studies (n=18).
high positive GeoAI and Multimodal Geospatial Data Fusion for Inclusive Ur... use of deep learning architectures in multimodal GeoAI studies
The integration of artificial intelligence with geographic information science, combined with multimodal geospatial data fusion, provides powerful tools to diagnose and address mobility disparities by integrating heterogeneous data sources (satellite imagery, GPS trajectories, transit records, volunteered geographic information, social sensing).
Theoretical/methodological claim supported by examples and synthesis from the surveyed literature (the paper reviews multimodal GeoAI studies that fuse such data sources).
high positive GeoAI and Multimodal Geospatial Data Fusion for Inclusive Ur... diagnostic and remedial capacity for mobility disparities via multimodal GeoAI
The risk of evolution selecting for deception could be mitigated if reproduction is based on purely objective criteria, rather than human judgment.
Prescriptive implication derived from the model analysis: argument that replacing human-judged fitness with objective criteria would reduce selection for deception (theoretical reasoning, not empirical test).
high positive A mathematical theory of evolution for self-designing AIs reduction in selection for deception under objective reproduction criteria
Assuming bounded fitness and a fixed probability that any AI reproduces a 'locked' copy of itself, fitness concentrates on the maximum reachable value.
Formal theorem/proof within the mathematical model under the stated assumptions (bounded fitness and fixed probability of locked self-reproduction).
high positive A mathematical theory of evolution for self-designing AIs asymptotic distribution of fitness across lineages (concentration on maximum rea...
As artificial intelligence systems (AIs) become increasingly produced by recursive self-improvement, a form of evolution may emerge, in which the traits of AI systems are shaped by the success of earlier AIs in designing and propagating their descendants.
Conceptual argument and motivation in the paper; development of a mathematical model of self-designing AIs to formalize this idea (theoretical, no empirical data or sample).
high positive A mathematical theory of evolution for self-designing AIs emergence of evolutionary dynamics in self-improving AIs (traits shaped by desce...
Generative AI helps users solve problems more efficiently.
Motivating empirical observation stated in the paper (no sample or empirical analysis reported in the provided text); assumption used to motivate the theoretical model.
high positive When AI Improves Answers but Slows Knowledge Creation: Match... problem-solving efficiency (implicit)
By elucidating the mechanisms and trade-offs inherent in AI-human collaboration, this work lays a robust foundation for future research on adaptive decision systems.
Authors' forward-looking claim in the abstract that their synthesis clarifies mechanisms/trade-offs and thus supports subsequent research; based on their review and framework.
high positive Advancing Decision-Making through AI-Human Collaboration: A ... foundation for future research on adaptive decision systems
By synthesizing these paradigms, this research advances the theoretical understanding of hybrid decision-making systems and provides actionable insights for organizations navigating complex and AI-driven environments.
Authors' stated contribution based on the conceptual synthesis of the literature and the proposed framework (as reported in the abstract).
high positive Advancing Decision-Making through AI-Human Collaboration: A ... theoretical advancement and provision of actionable organizational insights
The framework introduces four distinct paradigms of AI-human collaborative decision-making: adaptive intuitive decision, programmed algorithmic decision, interpretive analytical decision and integrative hybrid decision.
Authors' conceptual taxonomy reported in the abstract, produced from synthesis of the reviewed literature (627 articles).
high positive Advancing Decision-Making through AI-Human Collaboration: A ... classification of AI-human collaborative decision-making into four paradigms
We developed a novel conceptual framework that identifies two critical dimensions, AI-human dynamics and decision typologies, that shape decision outcomes.
Authors' reported conceptual synthesis derived from the systematic review/bibliometric analysis of the 627 articles.
high positive Advancing Decision-Making through AI-Human Collaboration: A ... identification of critical dimensions affecting decision outcomes
Prompts can be treated as decision policies that allocate discretion between researcher and system, governing what is executed and when iteration stops.
Methodological framing advanced by the authors describing prompts as decision policies; conceptual claim based on the paper's analytic framework rather than empirical measurement.
high positive On the Carbon Footprint of Economic Research in the Age of G... conceptualization of prompts' role in workflow control and decision allocation
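Treating a prompt as a decision policy means writing down, in advance, how much may be executed and when iteration stops. A minimal sketch of such a stopping rule follows; `PromptPolicy`, its two fields, and the quality scores are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PromptPolicy:
    max_iterations: int  # operational constraint: hard cap on reruns
    min_gain: float      # decision rule: stop once improvement falls below this


def run_until_stop(policy, scores):
    """Consume per-iteration quality scores and return how many iterations ran."""
    executed = 0
    previous = None
    for score in scores:
        if executed >= policy.max_iterations:
            break
        executed += 1
        if previous is not None and score - previous < policy.min_gain:
            break  # the last rerun did not improve enough to justify another
        previous = score
    return executed


scores = [0.50, 0.60, 0.61, 0.70]  # hypothetical quality after each rerun
print(run_until_stop(PromptPolicy(max_iterations=10, min_gain=0.05), scores))  # 3
print(run_until_stop(PromptPolicy(max_iterations=2, min_gain=0.05), scores))   # 2
```

Every iteration that the policy prevents is runtime, and therefore emissions, that is never spent.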
Operational-constraint and decision-rule prompts deliver large and stable footprint reductions while preserving decision-equivalent topic outputs.
Experimental comparisons of prompt strategies in the benchmarked workflow showing reductions in runtime/CO2e and evaluated topic outputs' decision-equivalence (asserted in abstract; no numeric reductions or sample sizes provided).
high positive On the Carbon Footprint of Economic Research in the Age of G... carbon footprint / runtime reductions and preservation of topic output equivalen...
We benchmark a modern economic survey workflow, an LDA-based literature mapping implemented with GenAI-assisted coding and executed in a fixed cloud notebook, measuring runtime and estimated CO2e with CodeCarbon.
Experimental benchmark described in the paper: single implemented workflow (LDA-based literature mapping) executed in a fixed cloud notebook with runtime and CO2e measured using CodeCarbon (methodological claim).
high positive On the Carbon Footprint of Economic Research in the Age of G... runtime and estimated CO2e (carbon footprint) of the benchmarked workflow
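An emissions estimate of this kind boils down to energy used multiplied by the carbon intensity of the electricity supply. The sketch below is a back-of-the-envelope version of that arithmetic, not CodeCarbon's API and not the paper's measurements; the function name and all input values are hypothetical.

```python
def estimated_co2e_grams(avg_power_watts, runtime_seconds, grid_intensity_g_per_kwh):
    """Estimate emissions as energy (kWh) times grid carbon intensity (gCO2e/kWh)."""
    energy_kwh = avg_power_watts * runtime_seconds / 3_600_000  # watt-seconds to kWh
    return energy_kwh * grid_intensity_g_per_kwh


# Hypothetical run: 100 W average draw for two hours on a 400 gCO2e/kWh grid.
baseline = estimated_co2e_grams(100, 7200, 400)
# Same hardware and grid, but the workflow stops after half the runtime.
shorter = estimated_co2e_grams(100, 3600, 400)

print(round(baseline, 1))  # 80.0
print(round(shorter, 1))   # 40.0
```

On fixed hardware and a fixed grid, the estimate scales linearly with runtime, which is why runtime and CO2e are reported side by side.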
Training footprint is the largest cluster in the mapped Green AI literature.
Result from the paper's literature mapping / clustering (statement in abstract; no numeric cluster sizes given).
high positive On the Carbon Footprint of Economic Research in the Age of G... relative prevalence (cluster size) of 'training footprint' theme
We map the recent Green AI literature into seven themes: training footprint is the largest cluster, while inference efficiency and system-level optimisation are growing rapidly, alongside measurement protocols, green algorithms, governance, and security-efficiency trade-offs.
Bibliometric / thematic mapping of recent Green AI literature described in the paper (method: literature mapping; exact number of papers or mapping procedure not specified in abstract).
high positive On the Carbon Footprint of Economic Research in the Age of G... distribution of themes within Green AI literature (theme prevalence and growth)
Compared to relationship-based debt, stable equity significantly promotes high-quality development in the high-end equipment manufacturing and new energy industries.
Comparative subgroup regression analysis on the same dataset (743 listed enterprises, 2014–2023) indicating that the coefficient for stable equity is significantly larger than that for relationship-based debt in the high-end equipment manufacturing and new energy industry subsamples.
high positive The Impact of Patient Capital on the High-Quality Developmen... high-quality development of enterprises (comparison of effects by financing type...
The effects of two distinct forms of patient capital—stable equity and relationship-based debt—are more pronounced in promoting high-quality development in the new energy vehicle industry, energy conservation and environmental protection industry, biotechnology industry, new materials industry, and next-generation information technology industry.
Industry heterogeneity / subgroup analyses on the 2014–2023 panel of 743 listed firms showing stronger estimated effects of both stable equity and relationship-based debt on firm high-quality development within these specified industries.
high positive The Impact of Patient Capital on the High-Quality Developmen... high-quality development of enterprises (industry-specific stronger effects of t...