The Commonplace

Evidence (4114 claims)

Adoption: 8570 claims
Productivity: 7631 claims
Governance: 6869 claims
Human-AI Collaboration: 6491 claims
Org Design: 4175 claims
Innovation: 4114 claims
Labor Markets: 3566 claims
Skills & Training: 2966 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
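
As a rough illustration of how the matrix can be read, the sketch below computes the share of each direction for three rows copied from the table. It assumes that shares are taken over the four listed directions, and that any gap between their sum and the Total column reflects claims without a listed direction; the page itself does not say this. The function and variable names are illustrative.

```python
# Rows copied from the Evidence Matrix: (outcome, positive, negative, mixed, null, total).
ROWS = [
    ("Firm Productivity", 435, 57, 88, 20, 606),
    ("Error Rate", 69, 92, 10, 2, 173),
    ("Job Displacement", 12, 80, 20, 1, 113),
]

DIRECTIONS = ("positive", "negative", "mixed", "null")


def direction_shares(counts):
    """Return each direction's share of the claims that have a listed direction."""
    listed = sum(counts)
    return {name: value / listed for name, value in zip(DIRECTIONS, counts)}


def unlisted(counts, total):
    """Claims in Total not covered by the four directions (assumed meaning of the gap)."""
    return total - sum(counts)


for outcome, *counts, total in ROWS:
    shares = direction_shares(counts)
    summary = ", ".join(f"{name} {share:.0%}" for name, share in shares.items())
    print(f"{outcome}: {summary} (unlisted: {unlisted(counts, total)})")
```

Reading rows this way makes it easier to compare outcomes with very different claim counts, for example the mostly negative Job Displacement row against the mostly positive Firm Productivity row.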
Filter: Innovation
The results demonstrate a 'less is more' pattern: a simpler combination (memory + reflection) yields better performance than added architectural complexity.
Authors' interpretation of the ablation study results showing that adding multiple extra mechanisms degraded performance compared to the memory+reflection configuration.
high | positive | AEL: Agent Evolving Learning for Open-Ended Environments | relative performance of simpler vs. more complex agent configurations
A nine-variant ablation reveals that memory and reflection together produce a 58% cumulative improvement over the stateless baseline.
Ablation study with nine variants on the sequential portfolio benchmark; authors report a 58% cumulative improvement when combining memory and reflection versus the stateless baseline.
high | positive | AEL: Agent Evolving Learning for Open-Ended Environments | cumulative improvement in performance relative to stateless baseline
AEL outperforms five published self-improving methods and all non-LLM baselines while maintaining the lowest variance among all LLM-based approaches on the benchmark.
Comparative empirical evaluation on the same sequential portfolio benchmark, comparing AEL to five published self-improving methods and multiple non-LLM and LLM baselines (reported relative ranking and variance).
high | positive | AEL: Agent Evolving Learning for Open-Ended Environments | relative performance (ranking) and variance across methods
On a sequential portfolio benchmark (10 sector-diverse tickers, 208 episodes, 5 random seeds), AEL achieves a Sharpe ratio of 2.13 ± 0.47.
Empirical experiment on the sequential portfolio benchmark with 10 tickers, 208 episodes, evaluated across 5 random seeds (reported Sharpe ratio and standard deviation).
high | positive | AEL: Agent Evolving Learning for Open-Ended Environments | Sharpe ratio (portfolio performance metric)
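
For readers unfamiliar with how a figure such as 2.13 ± 0.47 is typically produced, here is a minimal sketch. It assumes the Sharpe ratio is the mean of per-episode excess returns divided by their sample standard deviation, without annualization, and that the ± term is the standard deviation across seeds; the paper's exact definition is not given in the summary above. The returns are synthetic and exist only to make the example runnable; they are not the paper's data.

```python
import random
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


def summarize_across_seeds(per_seed_returns):
    """Return (mean, standard deviation) of the Sharpe ratio over independent runs."""
    ratios = [sharpe_ratio(returns) for returns in per_seed_returns]
    return statistics.mean(ratios), statistics.stdev(ratios)


def synthetic_returns(seed, episodes=208):
    """Placeholder per-episode returns (hypothetical; not taken from the benchmark)."""
    rng = random.Random(seed)
    return [rng.gauss(0.002, 0.01) for _ in range(episodes)]


# Five seeds and 208 episodes mirror the benchmark setup described above.
mean_sr, std_sr = summarize_across_seeds([synthetic_returns(seed) for seed in range(5)])
print(f"Sharpe ratio: {mean_sr:.2f} ± {std_sr:.2f}")
```

Reporting the spread across seeds alongside the mean is what allows a variance comparison between methods on the same benchmark.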
We introduce Agent Evolving Learning (AEL), a two-timescale framework in which a Thompson Sampling bandit at the fast timescale learns which memory retrieval policy to apply each episode, while LLM-driven reflection at the slow timescale diagnoses failure patterns and injects causal insights into the agent's decision prompt.
Methodological description of the algorithmic design proposed in the paper (a design claim, so no experimental sample size applies).
high | positive | AEL: Agent Evolving Learning for Open-Ended Environments | framework architecture / learning framework
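
The fast-timescale component described above can be sketched as a Beta-Bernoulli Thompson Sampling bandit that picks one retrieval policy per episode. This is a generic illustration and not the authors' implementation: the policy names, the binary success signal, and the simulated success rates are all hypothetical, and the slow-timescale LLM reflection step is only marked with a comment.

```python
import random


class ThompsonSamplingBandit:
    """Beta-Bernoulli Thompson Sampling over a fixed set of arms."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) is a uniform prior over each arm's success probability.
        self.successes = {arm: 1.0 for arm in arms}
        self.failures = {arm: 1.0 for arm in arms}

    def select(self):
        """Sample a plausible success rate per arm and pick the highest draw."""
        draws = {
            arm: self.rng.betavariate(self.successes[arm], self.failures[arm])
            for arm in self.successes
        }
        return max(draws, key=draws.get)

    def update(self, arm, success):
        """Update the chosen arm's posterior with a binary outcome."""
        if success:
            self.successes[arm] += 1.0
        else:
            self.failures[arm] += 1.0


# Hypothetical retrieval policies and success rates, used only to drive the simulation.
TRUE_SUCCESS_RATE = {"recent": 0.3, "similar": 0.5, "high_reward": 0.7}


def simulate(episodes=500, seed=0):
    env = random.Random(seed)
    bandit = ThompsonSamplingBandit(TRUE_SUCCESS_RATE, seed=seed)
    pulls = {arm: 0 for arm in TRUE_SUCCESS_RATE}
    for _ in range(episodes):
        policy = bandit.select()  # fast timescale: choose a retrieval policy
        success = env.random() < TRUE_SUCCESS_RATE[policy]
        bandit.update(policy, success)
        pulls[policy] += 1
        # Slow timescale (not modeled here): periodic LLM reflection on failures.
    return pulls


print(simulate())
```

Sampling from the posterior, instead of always choosing the arm with the best average so far, keeps some exploration alive while concentrating pulls on policies that have worked.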
This survey provides scholars and practitioners with a structured understanding of how agentic AI is reshaping financial markets and identifies critical research directions to ensure these systems enhance both operational efficiency and market resilience.
Statement of contribution in the paper; based on the paper's literature review, taxonomy, and identified research agenda.
high | positive | Agentic Artificial Intelligence in Finance: A Comprehensive ... | clarity for research/practice and identification of research directions to impro...
Agentic AI offers substantial potential for enhanced market efficiency, liquidity provision, and risk management.
Survey synthesis of foundational research, market applications, and technical architectures suggesting potential benefits; no original empirical evaluation reported.
high | positive | Agentic Artificial Intelligence in Finance: A Comprehensive ... | market efficiency, liquidity provision, risk management
The emergence of agentic AI represents a fundamental transformation in financial markets, characterized by autonomous systems capable of reasoning, planning, and adaptive decision-making with minimal human intervention.
Conceptual claim stated in the survey's introduction and synthesis of recent advances; based on literature review and theoretical framing rather than new empirical data.
high | positive | Agentic Artificial Intelligence in Finance: A Comprehensive ... | degree of autonomy and decision-making capability of AI systems in financial mar...
Countries around the world are rushing to encourage greater investment and growth in their domestic AI industries.
Statement/observation presented in the paper's introduction; based on the paper's descriptive overview of global policy activity (literature review / policy survey implied). No sample size reported.
high | positive | Fighting for Democracy Amid the AI Race: Designing Tech In... | government encouragement of AI investment and growth
Dynamic combinations of AI and organizational structure can help managers overcome traditional trade-offs between scale and scope, opening pathways for scalable, cross-market expansion.
Managerial implication drawn from the paper's longitudinal case study of ByteDance; qualitative inference from observed organizational practices and AI deployment patterns.
high | positive | Scaling high and wide: How firms leverage AI and organizatio... | managerial ability to overcome scale–scope trade-offs and enable cross-market ex...
AI transforms the scale–scope nexus from being a trade-off into a source of strategic advantage.
Synthesis and theoretical claim derived from longitudinal case study of ByteDance showing simultaneous scaling and diversification enabled by AI and organizational design.
high | positive | Scaling high and wide: How firms leverage AI and organizatio... | ability to simultaneously achieve scale and scope (strategic advantage from comb...
AI reverses the conventional logic of the resource-based view: rather than valuable resources enabling diversification, diversification amplifies the value of resources.
Theoretical argument supported by the ByteDance case study; paper presents this as a theorized inversion based on observed patterns in the single-case study.
high | positive | Scaling high and wide: How firms leverage AI and organizatio... | amplification of resource value as a result of diversification
The value of cross-domain AI learning is contingent on access to structurally related data that allow what is learned in one domain to transfer to another.
Claim derived from the ByteDance longitudinal case study showing conditions for successful cross-domain AI transfer (qualitative evidence emphasizing data structure/relatedness).
high | positive | Scaling high and wide: How firms leverage AI and organizatio... | effectiveness of transfer learning across domains (dependence on structurally re...
AI evolves and improves through self-learning and cross-fertilization across domains, becoming increasingly valuable as learning accumulates.
Theoretical claim supported by longitudinal observations from the ByteDance case study (qualitative evidence from repeated AI deployments over time).
high | positive | Scaling high and wide: How firms leverage AI and organizatio... | AI capability improvement/value accumulation over time
ByteDance leveraged AI and adaptive organizational design to scale rapidly and diversify across industries and markets without incurring rising costs or coordination complexity.
Longitudinal single-case (qualitative) study of ByteDance described in the paper; method reported as a longitudinal case study of one firm.
high | positive | Scaling high and wide: How firms leverage AI and organizatio... | ability to scale and diversify across industries and markets (growth and diversi...
Effective governance of AI as a dual-use technology will likely require a multilateral institutional architecture functionally analogous (though not identical) to the role performed by the IAEA in the nuclear domain, with explicit safeguards against co-option of hardware controls for domestic repression.
Normative institutional design argument and analogy to the IAEA presented in the paper (policy proposal; comparative institutional analysis).
high | positive | The Open-Weight Paradox: Why Restricting Access to AI Models... | need for multilateral institutional governance to manage dual-use AI
Hardware-layer governance, including chip-level attestation mechanisms such as FlexHEG, trusted execution environments, confidential computing, and complementary software-layer safeguards, offers a defense-in-depth alternative to the current binary framing of openness vs restriction.
Proposed governance architecture and technical discussion in the paper citing concrete mechanisms (technical-proposal and conceptual analysis; no experimental or deployment data reported in the summary).
high | positive | The Open-Weight Paradox: Why Restricting Access to AI Models... | effectiveness of hardware-plus-software safeguards as an alternative governance ...
The global concentration of compute infrastructure makes open-weight models one of the most viable pathways to sovereign AI capacity in the Global South.
Analysis of global compute infrastructure concentration and pathway mapping in the paper (conceptual/structural analysis; no numerical sample provided in the summary).
high | positive | The Open-Weight Paradox: Why Restricting Access to AI Models... | pathways to sovereign AI capacity (access/adoption of open-weight models)
Long-term prospects of agentic AI include catalyzing accelerated innovation in physical design via autonomous algorithm discovery, continuous tool improvement, and closed-loop learning from large design corpora.
Forward-looking conclusion in the paper; framed as the authors' projection based on survey synthesis rather than as an empirically demonstrated outcome in the abstract.
high | positive | Invited: Agentic AI for Physical Design R&D: Status and Pros... | autonomous algorithm discovery, continuous tool improvement, closed-loop learnin...
Interfaces between agentic systems and traditional EDA frameworks are a key area of focus and enable tighter integration of agent capabilities into existing design workflows.
Survey highlights interfaces between agents and EDA frameworks as a focus area; claim is descriptive of research direction rather than reporting empirical outcomes.
high | positive | Invited: Agentic AI for Physical Design R&D: Status and Pros... | development and importance of interfaces between agents and EDA frameworks
Autonomous agents can explore the heuristic spaces for placement, routing, and partitioning.
Presented as an emphasized capability/area of research in the survey; the abstract asserts this possibility but does not report empirical benchmarks or sample sizes.
high | positive | Invited: Agentic AI for Physical Design R&D: Status and Pros... | autonomous exploration of heuristic spaces (placement, routing, partitioning)
Tool-integrated agents can be used for algorithm evolution, debugging, and workflow automation in physical design R&D.
Paper emphasizes this as a primary area of application in the survey; rationale and examples are discussed but no quantitative trial sizes are given in the abstract.
high | positive | Invited: Agentic AI for Physical Design R&D: Status and Pros... | use of agents for algorithm evolution, debugging, and workflow automation
Agentic AI systems can comprehend user specifications, modify code, run EDA tools, analyze results, perform multi-step reasoning, and iteratively refine design heuristics—unlike earlier ML uses that focused narrowly on prediction or optimization subroutines.
Descriptive claim in the paper contrasting agentic AI capabilities with earlier ML approaches; presented as an overview of functional capabilities rather than empirical measurement.
high | positive | Invited: Agentic AI for Physical Design R&D: Status and Pros... | breadth of tasks agentic AI systems can perform (spec comprehension, code modifi...
Recent advances in large language models (LLMs) and tool-using autonomous agents present new opportunities for accelerating research and development in physical design.
Stated as a central thesis in the paper's abstract/survey; based on the authors' synthesis of recent advances and emerging applications (no empirical sample or quantified evaluation reported in the abstract).
high | positive | Invited: Agentic AI for Physical Design R&D: Status and Pros... | acceleration of research and development in physical design
Local governments should develop coordinated AI policy mixes, align differentiated policy pathways with regional conditions, and prioritize technology R&D support, talent cultivation and collaboration, and application demonstration and promotion to sustain long-term regional competitiveness.
Authors' policy recommendations derived from the fsQCA findings and interpretation of which conditions are recurrent/core across configurations.
high | positive | How Can Artificial Intelligence Policies Promote the Sustain... | regional science and technology industrial competitiveness (policy recommendatio...
Technology R&D support, talent cultivation and collaboration, and application demonstration and promotion are the most recurrent core policy conditions across the identified configurations.
Frequency/core-condition analysis within the fsQCA configurations reported by the authors showing these three policy instruments repeatedly appear as core conditions.
high | positive | How Can Artificial Intelligence Policies Promote the Sustain... | regional science and technology industrial competitiveness
The study identifies three driving pathways to sustained competitiveness: (supply and demand)-environmental resonance; demand-driven (supply-environmental) assurance; and supply–demand complementarity, which together cover five specific configurations.
Reported fsQCA solution paths (three aggregated driving pathways and five specific configurations) derived from the analysis of provincial AI policy instruments.
high | positive | How Can Artificial Intelligence Policies Promote the Sustain... | regional science and technology industrial competitiveness
Sustained competitiveness is achieved through multiple equivalent configurations of policy instruments (i.e., policy instrument combinations rather than single instruments).
fsQCA results reported in the paper showing multiple configurations (solution paths) that are associated with high regional competitiveness.
high | positive | How Can Artificial Intelligence Policies Promote the Sustain... | regional science and technology industrial competitiveness
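
The fsQCA entries above refer to configurations of policy conditions that are each associated with the outcome. As background, the sketch below shows the standard set-theoretic consistency and coverage measures used to judge such configurations. The membership scores and condition names are hypothetical and are not taken from the paper, which may apply additional thresholds or calibration steps.

```python
def configuration_membership(*conditions):
    """Fuzzy AND: a case's membership in a configuration is its minimum across conditions."""
    return [min(scores) for scores in zip(*conditions)]


def consistency(config, outcome):
    """Degree to which configuration membership is a subset of outcome membership."""
    overlap = sum(min(x, y) for x, y in zip(config, outcome))
    return overlap / sum(config)


def coverage(config, outcome):
    """Share of the outcome's total membership accounted for by the configuration."""
    overlap = sum(min(x, y) for x, y in zip(config, outcome))
    return overlap / sum(outcome)


# Hypothetical calibrated membership scores for four regions (illustration only).
rd_support = [0.9, 0.7, 0.3, 0.8]
talent = [0.8, 0.6, 0.2, 0.9]
competitiveness = [0.9, 0.5, 0.4, 0.7]

config = configuration_membership(rd_support, talent)
print(f"consistency = {consistency(config, competitiveness):.3f}")
print(f"coverage    = {coverage(config, competitiveness):.3f}")
```

Several configurations can each show high consistency for the same outcome, which is the sense in which the study describes multiple equivalent pathways.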
Under these conditions (alignment of forces and AI-driven ideation cost reductions), PIM offers a framework for organising governed discovery in real time and provides the methodological foundation for later applied work.
The paper presents PIM as a proposed framework and positioning statement for future applied research and implementations (theoretical proposal; no applied trials reported).
high | positive | Probabilistic Innovation Methodology: A Scientific Methodolo... | feasibility of using PIM to organise real-time governed discovery
Organised attacks on complex problems can generate an epistemic mode transition: a shift from predominantly Knightian uncertainty toward probabilistically characterisable innovation dynamics as relevant structures become more visible, decomposed, coordinated, and testable.
The paper states and formalises this methodological claim within PIM as a central proposition (theoretical argumentation; no empirical validation reported).
high | positive | Probabilistic Innovation Methodology: A Scientific Methodolo... | degree of uncertainty characterization (Knightian vs probabilistic)
When problem-relevant causal, informational, and coordinative forces become sufficiently aligned, the epistemic character of search changes and open-ended uncertainty can be progressively transformed into structured probabilistic search.
The claim is presented as the central theoretical argument and formalised within the PIM conceptual framework (theoretical/model-based argumentation; no empirical sample).
high | positive | Probabilistic Innovation Methodology: A Scientific Methodolo... | epistemic character of search (shift from Knightian uncertainty to probabilistic...
Sustainable development outcomes in MENA economies are driven not only by technology adoption but by the interaction between digital infrastructure, AI, and institutional readiness.
Regression models including interaction terms between digital transformation, AI measures, and indicators of institutional readiness within the System GMM analysis.
There is significant regional heterogeneity: Gulf Cooperation Council (GCC) countries exhibit stronger effects of digital transformation and AI on sustainable development than non-GCC MENA economies.
Subgroup/interaction analyses by region (GCC vs non-GCC) within the System GMM framework reported differential coefficients.
Artificial intelligence (AI) has a positive but weaker impact on sustainable development relative to digital transformation, reflecting its complementary and maturity-dependent role within the digital ecosystem.
Same System GMM regressions on panel of MENA economies (2010–2023) that include measures of AI and digital transformation; reported positive but smaller coefficient for AI.
Digital transformation is the primary driver of sustainable development in MENA economies, exerting a stronger and more consistent effect than AI.
Dynamic panel data analysis of MENA economies (2010–2023) using System GMM; reported comparative effect sizes of digital transformation vs. AI in regression results.
In the ICT industry, Tobin's Q significantly increased following AI adoption (heterogeneous positive effect).
Subgroup/heterogeneity analysis within the main sample (KOSDAQ firms 2018–2025), estimating the post-adoption effect of AI on Tobin's Q in firms classified as ICT.
high | positive | The Dynamic Causal Effects of Corporate AI Adoption on Profi... | Tobin's Q (market value) in ICT-industry firms
The Barcelona Declaration offers a promising forum for boundary governance.
Policy recommendation pointing to an existing initiative (Barcelona Declaration) as a suitable forum; stated without empirical evaluation in the excerpt.
high | positive | Market Dynamics, Governance and Open Research Metadata in th... | suitability of the Barcelona Declaration as a forum for boundary governance
Governance should calibrate the annulus, not abolish it: thin enough to serve research efficiently, wide enough to sustain innovation.
Normative policy recommendation from the authors; based on their conceptual framework rather than on empirical policy evaluation in the excerpt.
high | positive | Market Dynamics, Governance and Open Research Metadata in th... | optimal governance calibration of the annulus balancing research efficiency and ...
Artificial intelligence reshapes the annulus by lowering barriers to basic structuring.
Conceptual claim in the paper; asserted as an effect of AI on metadata production without empirical estimates in the excerpt.
high | positive | Market Dynamics, Governance and Open Research Metadata in th... | barriers to basic structuring of metadata
States can adjust their foreign policies to this fact by focusing on resilience, technological sovereignty, strategic decoupling, and coordination through alliances.
Policy-prescriptive recommendations based on the paper's theoretical framework and analysis; no empirical testing or sample size reported in the abstract.
high | positive | ARTIFICIAL INTELLIGENCE AND THE WEAPONIZATION OF ECONOMIC IN... | effectiveness of foreign policy adjustments (resilience, sovereignty, decoupling...
Time Series Augmented Generation (TSAG) enables LLM agents to delegate quantitative tasks to verifiable external tools.
Description of TSAG framework in paper stating delegation mechanism to external verifiable tools for quantitative computations.
high | positive | Time Series Augmented Generation for Financial Applications | delegation capability to external tools
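
The delegation idea can be illustrated with a small dispatcher: the agent emits a structured tool call, and a deterministic function performs the arithmetic so the result can be recomputed and checked. This is a sketch under stated assumptions, not the TSAG interface; the tool names, the call format, and the `dispatch` helper are hypothetical.

```python
def simple_moving_average(values, window):
    """Deterministic tool: trailing average over each full window."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window for i in range(len(values) - window + 1)]


def percent_change(values):
    """Deterministic tool: period-over-period change in percent."""
    return [(later - earlier) / earlier * 100 for earlier, later in zip(values, values[1:])]


# Registry of tools the agent is allowed to call (hypothetical names).
TOOLS = {
    "simple_moving_average": simple_moving_average,
    "percent_change": percent_change,
}


def dispatch(tool_call):
    """Execute a structured tool call of the form {"tool": name, "args": {...}}."""
    name = tool_call["tool"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**tool_call["args"])


# In a real system the agent would produce this call; here it is written by hand.
call = {"tool": "simple_moving_average", "args": {"values": [10.0, 11.0, 12.0, 13.0], "window": 2}}
result = dispatch(call)

# Verification: rerunning the same tool on the same arguments must reproduce the result.
assert result == simple_moving_average(**call["args"])
print(result)
```

Keeping the numbers out of free-form generation means a tool-use check reduces to comparing the requested tool and arguments against what the question requires.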
We publicly release the evaluation framework and empirical insights to foster standardized research on reliable financial AI.
Paper states that the framework, benchmark, and empirical results are released publicly by the authors.
high | positive | Time Series Augmented Generation for Financial Applications | public release of resources
The results demonstrate that capable agents can achieve near-perfect tool-use accuracy with minimal hallucination, validating the tool-augmented paradigm.
Empirical results from the authors' experiments on the 100-question benchmark across multiple agents; paper states agents achieve 'near-perfect' tool-use accuracy and 'minimal' hallucination.
high | positive | Time Series Augmented Generation for Financial Applications | tool-use accuracy; hallucination rate
We apply this methodology in a large-scale empirical study using our framework, Time Series Augmented Generation (TSAG), where an LLM agent delegates quantitative tasks to verifiable, external tools.
Paper reports applying the TSAG framework in an empirical study in which agents call external tools to perform quantitative computations; described as 'large-scale' and implemented by the authors.
high | positive | Time Series Augmented Generation for Financial Applications | use of external/verifiable tools by LLM agents
We introduce a novel evaluation methodology and benchmark designed to rigorously measure an LLM agent's reasoning for financial time-series analysis.
Paper describes a new methodology and benchmark (Time Series Augmented Generation, TSAG) developed by the authors for evaluating LLM reasoning on financial time-series tasks.
high | positive | Time Series Augmented Generation for Financial Applications | existence of a new evaluation methodology / benchmark
Effective evaluation-driven loop scaling is a central axis for advancing LLM-driven scientific discovery, and SimpleTES provides a simple yet practical framework for realizing these gains.
High-level claim supported by the aggregate experimental results and discussion in the paper.
high | positive | Evaluation-driven Scaling for Scientific Discovery | impact of scaling evaluation-driven discovery loops on LLM-driven scientific dis...
When post-trained on successful trajectories, models not only improve efficiency on seen problems but also generalize to unseen problems, discovering solutions that base models fail to uncover.
Experiments in which models were post-trained on successful SimpleTES trajectories and evaluated on both seen and unseen problems (paper claim of improved efficiency and generalization).
high | positive | Evaluation-driven Scaling for Scientific Discovery | post-training efficiency on seen problems and generalization to unseen problems ...
SimpleTES produces trajectory-level histories that naturally supervise feedback-driven learning.
Methodological claim and supporting experiments where SimpleTES generates solution trajectories that are then used as supervision for learning.
high | positive | Evaluation-driven Scaling for Scientific Discovery | availability and usefulness of trajectory-level histories for supervision
We discovered new Erdős minimum overlap constructions that surpass the best-known results.
Reported novel combinatorial constructions (Erdős minimum overlap) in the experiments that improve on prior best-known results.
high | positive | Evaluation-driven Scaling for Scientific Discovery | quality of Erdős minimum overlap constructions (best-known benchmarks)
We designed quantum circuit routing policies that reduce gate overhead by 24.5%.
Experimental results reported for quantum circuit routing tasks showing a 24.5% reduction in gate overhead when using SimpleTES-designed policies.
high | positive | Evaluation-driven Scaling for Scientific Discovery | quantum circuit gate overhead