The Commonplace

Evidence (8066 claims)

Adoption: 5586 claims
Productivity: 4857 claims
Governance: 4381 claims
Human-AI Collaboration: 3417 claims
Labor Markets: 2685 claims
Innovation: 2581 claims
Org Design: 2499 claims
Skills & Training: 2031 claims
Inequality: 1382 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 417 113 67 480 1091
Governance & Regulation 419 202 124 64 823
Research Productivity 261 100 34 303 703
Organizational Efficiency 406 96 71 40 616
Technology Adoption Rate 323 128 74 38 568
Firm Productivity 307 38 70 12 432
Output Quality 260 71 27 29 387
AI Safety & Ethics 118 179 45 24 368
Market Structure 107 128 85 14 339
Decision Quality 177 75 37 19 312
Fiscal & Macroeconomic 89 58 33 22 209
Employment Level 74 34 78 9 197
Skill Acquisition 98 36 40 9 183
Innovation Output 121 12 24 13 171
Firm Revenue 98 35 24 157
Consumer Welfare 73 31 37 7 148
Task Allocation 87 16 34 7 144
Inequality Measures 25 76 32 5 138
Regulatory Compliance 54 61 13 3 131
Task Completion Time 89 7 4 3 103
Error Rate 44 51 6 101
Training Effectiveness 58 12 12 16 99
Worker Satisfaction 47 33 11 7 98
Wages & Compensation 54 15 20 5 94
Team Performance 47 12 15 7 82
Automation Exposure 27 26 10 6 72
Job Displacement 6 39 13 58
Hiring & Recruitment 40 4 6 3 53
Developer Productivity 34 4 3 1 42
Social Protection 22 11 6 2 41
Creative Output 16 7 5 1 29
Labor Share of Income 12 6 9 27
Skill Obsolescence 3 20 2 25
Worker Turnover 10 12 3 25
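
As a rough illustration of how one row of the matrix can be read, the sketch below converts a row's direction counts into shares. The function name `direction_shares` is hypothetical and not part of the site. Because the four direction counts do not always add up to the Total column (the Other row lists 417 + 113 + 67 + 480 = 1077 against a total of 1091), the sketch assumes normalization by the sum of the listed directions rather than by Total.

```python
from typing import Dict


def direction_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each direction's share of the listed counts in one matrix row."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("row has no claims")
    return {direction: n / total for direction, n in counts.items()}


# Row copied from the matrix: Task Completion Time 89 7 4 3 (Total 103).
task_completion_time = {"Positive": 89, "Negative": 7, "Mixed": 4, "Null": 3}
shares = direction_shares(task_completion_time)

for direction, share in shares.items():
    print(f"{direction}: {share:.1%}")
```

For this row, the Positive share works out to roughly 86% of the listed claims.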
There exists an optimal level of big data sharing that best balances economic development against privacy, thereby maximizing individuals' welfare.
Analytical optimization within the theoretical macro model: model yields an interior optimum for data-sharing intensity that trades off economic gains and privacy costs (derivation/analytical result; no empirical test).
high positive Study on the impact of big data sharing on individuals’ welf... individuals' welfare maximization via optimal data-sharing level
Structured intent representations (PPS) can improve alignment and usability in human–AI interaction, especially in tasks where user intent is inherently ambiguous.
Synthesis of the experimental findings (natural-language-rendered PPS scored better on goal_alignment overall, with task-dependent gains concentrated in high-ambiguity business tasks) and the preliminary user survey.
A preliminary retrospective survey (N = 20) suggests a 66.1% reduction in follow-up prompts required, from 3.33 to 1.13 rounds, when using PPS.
Authors report a small retrospective survey of N = 20 respondents comparing the self-reported number of follow-up prompt rounds required before and after adopting PPS.
high positive Evaluating 5W3H Structured Prompting for Intent Alignment in... number_of_follow-up_prompt_rounds_required
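
The reported 66.1% figure can be checked directly from the two round counts. A minimal sketch, with `percent_reduction` as a hypothetical helper name rather than anything from the paper:

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, expressed in percent."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return (before - after) / before * 100.0


# Mean follow-up rounds reported in the survey: 3.33 before PPS, 1.13 after.
reduction = percent_reduction(3.33, 1.13)
print(f"{reduction:.1f}%")  # 66.1%
```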
We introduce goal_alignment, a user-intent-centered evaluation dimension, and find that natural-language-rendered PPS outperforms both simple prompts and raw PPS JSON on this metric.
Experimental comparison across the three prompt conditions using the goal_alignment evaluation dimension applied to the collected outputs (540 outputs across 60 tasks and 3 models), as judged by an LLM judge.
The Institutional Scaling Law predicts that the next phase transition will be driven not by larger models but by better-orchestrated systems of domain-specific models adapted to specific institutional niches.
Predictive conclusion derived from the Institutional Scaling Law and theoretical analysis in the paper. No empirical validation or sample size reported in the excerpt.
high positive The Institutional Scaling Law: Non-Monotonic Fitness, Capabi... drivers of the next phase transition in AI (orchestration of domain-specific sys...
A Symbiogenetic Scaling correction demonstrates that orchestrated systems of domain-specific models can outperform frontier generalists in their native deployment environments.
Theoretical correction/derivation and comparative analysis within the paper (no empirical sample or quantitative benchmark reported in the excerpt).
high positive The Institutional Scaling Law: Non-Monotonic Fitness, Capabi... performance of orchestrated domain-specific model systems versus frontier genera...
A mixed-methods empirical research agenda is presented, proposing a future PLS-SEM approach to test the mediating role of the cognitive flywheel and the moderating effect of fractal governance on organizational resilience.
Methodological proposal described in the paper (research design and proposed analytic approach); no executed empirical study or sample reported.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... organizational_resilience (as mediator/moderator relationships to be tested)
Fractal governance architecture is proposed to mitigate systemic vulnerabilities such as automation bias.
Conceptual proposal of a governance design in the paper; no empirical test or sample provided.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... reduction_in_automation_bias / improvement_in_decision_quality
The cognitive flywheel is the central mechanism of this dynamic capability, and the paper operationalizes it.
Theoretical operationalization within the paper (concept definition and proposed operational measures); no empirical measurement or sample reported.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... mechanism_operationalization (cognitive_flywheel)
The co-evolutionary dynamic is formalized using coupled non-linear differential equations and time decay integrals.
Mathematical formalization reported in the paper (modeling methods described); no empirical parameter estimation or sample provided.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... existence_of_mathematical_model/formal_framework
Dynamic cognitive advantage arises from the historical, recursive, structural coupling of human semantic intent and machine syntactic processing (a co-evolutionary dynamic).
Conceptual theory introduced and argued in the paper (mechanism-level proposition); formalization provided but no empirical validation.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... competitive_differentiation/innovation_output
Conceptualizing the enterprise as a complex adaptive system operating far from thermodynamic equilibrium provides a more appropriate framing for organizations integrating AI and enables the theory of dynamic cognitive advantage.
Theoretical development and conceptual argumentation within the paper; formal framing rather than empirical test; no sample reported.
high positive Governing Human–AI Co-Evolution: Intelligentization Capabili... competitive_differentiation/innovation_output
We propose a multi-agent discussion framework wherein specialized agents collaboratively process extensive product information, distributing cognitive load to alleviate single-agent attention bottlenecks and capturing critical decision factors through structured dialogue.
Method description: multi-agent discussion architecture described and implemented; claimed to distribute cognitive load and reduce single-agent attention bottlenecks (design + reported behavior).
high positive MALLES: A Multi-agent LLMs-based Economic Sandbox with Consu... reduction of single-agent attention bottlenecks / distributed processing of prod...
To enhance simulation stability, we implement a mean-field mechanism designed to model the dynamic interactions between the product environment and customer populations, effectively stabilizing sampling processes within high-dimensional decision spaces.
Method description: implementation of a mean-field mechanism within the simulator; paper asserts this design stabilizes sampling in high-dimensional decision spaces (method + reported simulation behavior).
high positive MALLES: A Multi-agent LLMs-based Economic Sandbox with Consu... simulation stability / stabilized sampling processes
We introduce a preference learning paradigm in which LLMs are economically aligned via post-training on extensive, heterogeneous transaction records across diverse product categories.
Method description: post-training LLMs on heterogeneous transaction records across product categories to align preferences (methodological / training procedure described).
high positive MALLES: A Multi-agent LLMs-based Economic Sandbox with Consu... ability of models to internalize consumer preferences via post-training
This paper introduces a Multi-Agent Large Language Model-based Economic Sandbox (MALLES) as a unified simulation framework applicable to cross-domain and cross-category scenarios.
Paper description: design and implementation of MALLES, presented as a unified framework leveraging large-scale LLM generalization for cross-domain/cross-category simulation (methodological contribution).
high positive MALLES: A Multi-agent LLMs-based Economic Sandbox with Consu... existence and applicability of MALLES as a unified simulation framework
Leaders' AI symbolization lessens AI's negative impact on employees' emotional exhaustion.
Moderation analysis in the four-stage longitudinal study of 285 finance professionals; leader AI symbolization tested as moderator of AI usage -> emotional exhaustion path.
high positive Autonomous enhancement or emotional depletion? The dual-path... emotional exhaustion (moderated by leaders' AI symbolization)
Leaders' AI symbolization strengthens AI's positive effect on employees' sense of self-determination.
Moderation analysis within the same four-stage longitudinal survey of 285 finance professionals; leader AI symbolization tested as moderator of AI usage -> sense of self-determination path.
high positive Autonomous enhancement or emotional depletion? The dual-path... sense of self-determination (moderated by leaders' AI symbolization)
AI usage can boost innovative work behavior by enhancing employees' sense of self-determination.
Four-stage longitudinal study (survey) of finance professionals (N=285); mediation analysis testing AI usage -> sense of self-determination -> innovative work behavior, grounded in SOR theory.
high positive Autonomous enhancement or emotional depletion? The dual-path... innovative work behavior (mediated by sense of self-determination)
Retrieval substantially improves reasoning over textual fundamentals.
Result reported from experiments comparing zero-shot prompting with retrieval-augmented settings on fundamentals-focused questions; the paper asserts that retrieval substantially improved reasoning over textual fundamentals.
high positive FinTradeBench: A Financial Reasoning Benchmark for LLMs improvement in reasoning/performance on fundamentals-focused questions with retr...
Human-AI systems should be designed under a cognitive sustainability constraint so that gains in hybrid performance do not come at the cost of degradation in human expertise.
Normative recommendation in the paper based on the conceptual/mathematical framework and the identified trade-off; presented as an argument rather than an empirically validated policy outcome in the excerpt.
high positive Cognitive Amplification vs Cognitive Delegation in Human-AI ... preservation of human expertise under human-AI design choices
Together, these quantities provide a low-dimensional metric space for evaluating whether human-AI systems achieve genuine synergistic performance and whether such performance is cognitively sustainable for the human component over time.
Claim about the utility of the defined metrics, supported within the paper by the conceptual/mathematical framework and the proposed metric definitions (theoretical demonstration rather than reported empirical validation in the excerpt).
high positive Cognitive Amplification vs Cognitive Delegation in Human-AI ... hybrid human-AI performance and cognitive sustainability
The paper defines a set of operational metrics: the Cognitive Amplification Index (CAI*), the Dependency Ratio (D), the Human Reliance Index (HRI), and the Human Cognitive Drift Rate (HCDR).
Explicit listing of newly proposed operational metrics in the paper; this is a descriptive claim about the paper's content (theoretical definitions), no sample size or empirical estimation provided in the excerpt.
high positive Cognitive Amplification vs Cognitive Delegation in Human-AI ... operational metrics for human-AI cognitive interaction (CAI*, D, HRI, HCDR)
The paper introduces a conceptual and mathematical framework to distinguish cognitive amplification (AI improves hybrid human-AI performance while preserving human expertise) from cognitive delegation (reasoning is progressively outsourced to AI).
Explicit contribution claim in the paper (description of a conceptual and mathematical framework); evidence consists of the model and formal definitions presented in the paper (no external empirical validation reported in the excerpt).
high positive Cognitive Amplification vs Cognitive Delegation in Human-AI ... mode of human-AI interaction (amplification vs delegation)
Artificial intelligence generates positive spatial spillovers for UCEE (positive effects on neighboring regions).
Spatial Durbin model reported in the abstract indicating positive spillover coefficients for artificial intelligence.
high positive How artificial intelligence and environmental regulation inf... UCEE index (spatial spillover effect of AI)
The Global Malmquist–Luenberger (GML) index and its efficiency change (EC) and technological change (TC) components stay above 1, indicating sustained efficiency gains dominated by technological progress.
GML index and decomposition results reported in the abstract based on the panel data and GML computation.
high positive How artificial intelligence and environmental regulation inf... GML index and its EC and TC components (measures of productivity/efficiency chan...
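
To make the reading of "above 1" concrete, the sketch below assumes the usual multiplicative decomposition of a Malmquist-type index, GML = EC × TC; the abstract as excerpted does not state the formula, so this is an assumption. The function name and the component values are hypothetical and are not results from the paper.

```python
def gml_index(ec: float, tc: float) -> float:
    """Combine efficiency change (EC) and technological change (TC).

    Assumes the multiplicative decomposition GML = EC * TC.
    Values above 1 indicate improvement between periods.
    """
    if ec <= 0 or tc <= 0:
        raise ValueError("EC and TC must be positive")
    return ec * tc


# Hypothetical component values, for illustration only.
ec, tc = 1.02, 1.05
gml = gml_index(ec, tc)

# The larger component contributes more to the overall change.
dominant = "technological change" if tc > ec else "efficiency change"
print(f"GML = {gml:.3f}, dominated by {dominant}")
```

With both components above 1, the combined index is also above 1, and the larger component (here TC) accounts for most of the gain.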
Nationally, the average UCEE index rises from about 0.3 to above 0.7 over the sample period.
Computed UCEE index results from the Super-SBM model applied to the panel of 30 provinces (2013–2022) as reported in the abstract.
high positive How artificial intelligence and environmental regulation inf... UCEE index (average, national)
Recent advances in large language models, tool-using agents, and financial machine learning are shifting financial automation from isolated prediction tasks to integrated decision systems that can perceive information, reason over objectives, and generate or execute actions.
Literature synthesis and conceptual statement in the paper's introduction describing recent technological advances and their effects on financial automation; no empirical sample size reported.
high positive AI Agents in Financial Markets: Architecture, Applications, ... shift in type of financial automation (from isolated prediction to integrated de...
SOL-ExecBench reframes GPU kernel benchmarking from beating a mutable software baseline to closing the remaining gap to hardware Speed-of-Light.
Conceptual/positioning claim made by the authors about the intended shift in benchmarking perspective enabled by SOL-ExecBench.
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... benchmarking_objective_shift_toward_hardware_efficiency
To support robust evaluation of agentic optimizers, we provide a sandboxed harness with GPU clock locking, L2 cache clearing, isolated subprocess execution, and static analysis-based checks against common reward-hacking strategies.
Method/tool claim in paper describing the provided evaluation harness and its engineered controls (list of features included).
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... evaluation_robustness_and_integrity_of_benchmarking
We report a SOL Score that quantifies how much of the gap between a release-defined scoring baseline and the hardware SOL bound a candidate kernel closes.
Paper defines the SOL Score metric and states its interpretive meaning (fraction of gap closed between baseline and hardware SOL bound).
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... fraction_of_gap_closed_to_hardware_bound
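
One way to picture a "fraction of gap closed" metric is sketched below. The formula is an assumption (linear in runtime, 0 at the baseline and 1 at the SOL bound) and may differ from the paper's actual SOL Score definition; the function name and the runtimes are hypothetical.

```python
def gap_closed_fraction(baseline_ms: float, candidate_ms: float, sol_ms: float) -> float:
    """Fraction of the baseline-to-SOL runtime gap that the candidate closes.

    Assumed form (not taken from the paper): linear in runtime,
    0.0 at the scoring baseline and 1.0 at the hardware SOL bound.
    """
    gap = baseline_ms - sol_ms
    if gap <= 0:
        raise ValueError("baseline must be slower than the SOL bound")
    return (baseline_ms - candidate_ms) / gap


# Hypothetical runtimes in milliseconds, for illustration only.
score = gap_closed_fraction(baseline_ms=10.0, candidate_ms=7.0, sol_ms=4.0)
print(score)  # 0.5
```

Under this assumed form, a candidate that only matches the baseline scores 0, and a candidate slower than the baseline scores below 0.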
SOL-ExecBench measures performance against analytically derived Speed-of-Light (SOL) bounds computed by SOLAR, our pipeline for deriving hardware-grounded SOL bounds, yielding a fixed target for hardware-efficient optimization.
Methodological claim: introduction of SOLAR pipeline to compute analytic hardware-grounded SOL bounds and use of those bounds as benchmark targets, as described in the paper.
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... proximity_to_hardware_speed_of_light_bounds
The benchmark covers forward and backward workloads across BF16, FP8, and NVFP4, including kernels whose best performance is expected to rely on Blackwell-specific capabilities.
Paper description of benchmark coverage (workload direction and data types; inclusion of kernels tied to Blackwell hardware features).
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... coverage_of_workloads_and_datatypes
We present SOL-ExecBench, a benchmark of 235 CUDA kernel optimization problems extracted from 124 production and emerging AI models spanning language, diffusion, vision, audio, video, and hybrid architectures, targeting NVIDIA Blackwell GPUs.
Paper reports construction of the benchmark with counts: 235 CUDA kernel problems and 124 source models; descriptive dataset claim in the manuscript.
high positive SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GP... benchmark_problem_count_and_coverage
Given these findings, policymakers should favor 'strategic forbearance'—apply existing laws rather than create new regulations that could stifle innovation and diffusion of AI.
Authors' normative policy recommendation based on their interpretation of the reviewed empirical literature (risk–benefit assessment); this is a prescriptive conclusion rather than an empirical finding, so no sample size applies.
high positive AI, Productivity, and Labor Markets: A Review of the Empiric... regulatory approach to AI governance (strategy of forbearance vs. new regulation...
Generative AI lowers entry costs for startups, facilitating new firm entry and product development.
Cited empirical and descriptive evidence in the literature review indicating reduced development costs and faster product prototyping enabled by AI tools; the brief does not provide a pooled sample size or a single quantitative estimate.
high positive AI, Productivity, and Labor Markets: A Review of the Empiric... barriers to entry / startup costs and rate of new product development
Generative AI significantly boosts productivity in specific tasks like coding, writing, and customer service—often by 15% to 50%.
Synthesis/review of empirical literature through 2025 (multiple empirical studies of task-level impacts, including field and lab studies and observational analyses); the brief reports aggregate effect ranges but does not list a single pooled sample size.
high positive AI, Productivity, and Labor Markets: A Review of the Empiric... task-level productivity in coding, writing, and customer service
The study contributes to theory by empirically integrating technological, human, and institutional dimensions within a single architectural framework, moving beyond isolated analyses of digital credit.
Author-stated contribution based on combining measures of algorithmic credit systems, human capability, and institutional design and testing interactions in the same regression models.
high positive Architecting financial well-being in algorithmic credit syst... theoretical contribution / integrative framework
Moderation analysis reveals that higher levels of human capability and stronger institutional design amplify the positive effects of algorithmic credit systems and mitigate their adverse effects (i.e., they strengthen repayment and resilience effects and reduce financial stress).
Reported moderation analyses using interaction terms in the regression models on the 400-user cross-sectional sample; results described as significant moderation by human capability and institutional design.
high positive Architecting financial well-being in algorithmic credit syst... conditional effects on repayment behavior, financial resilience, and financial s...
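
Moderation through an interaction term can be sketched as follows. The model form y = b0 + b_x·x + b_m·m + b_xm·x·m is the generic textbook specification, assumed here because the excerpt does not give the study's exact equations; the class name and all coefficient values are hypothetical and are not estimates from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModerationModel:
    """y = intercept + b_x*x + b_m*m + b_xm*x*m (all coefficients hypothetical)."""

    intercept: float
    b_x: float   # main effect of the predictor x
    b_m: float   # main effect of the moderator m
    b_xm: float  # interaction (moderation) effect

    def predict(self, x: float, m: float) -> float:
        return self.intercept + self.b_x * x + self.b_m * m + self.b_xm * x * m

    def simple_slope(self, m: float) -> float:
        """Effect of a one-unit change in x when the moderator equals m."""
        return self.b_x + self.b_xm * m


model = ModerationModel(intercept=1.0, b_x=0.5, b_m=0.25, b_xm=0.25)
low, high = model.simple_slope(-1.0), model.simple_slope(1.0)
print(low, high)  # 0.25 0.75
```

A positive interaction coefficient means the predictor's effect grows as the moderator increases, which is the sense in which a moderator "amplifies" a positive effect.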
Algorithmic credit systems are positively associated with financial resilience.
Regression analyses reported show a positive relationship between algorithmic credit system use and measures of financial resilience in the sample of 400 users.
Algorithmic credit systems are positively associated with repayment behavior.
Multiple regression results reported in the study indicate a positive association between use of algorithmic credit systems and repayment behavior, based on a cross-sectional survey of 400 users.
Measurement reliability and validity were established through Cronbach's alpha and principal component analysis.
Paper states that Cronbach’s alpha and principal component analysis (PCA) were used to establish measurement reliability and validity.
high positive Architecting financial well-being in algorithmic credit syst... measurement reliability/validity
The study used a quantitative, explanatory, cross-sectional design and employed multiple regression and moderation analyses to assess relationships among algorithmic credit systems, human capability, institutional design, and financial-wellbeing outcomes.
Methods described explicitly: quantitative explanatory cross-sectional design; analytical methods named as multiple regression and moderation analyses.
high positive Architecting financial well-being in algorithmic credit syst... research design / analytic methods
Data were collected from 400 users of algorithmic and digitally mediated credit platforms.
Study reports a quantitative, explanatory, cross-sectional survey of users; sample size explicitly stated as 400.
high positive Architecting financial well-being in algorithmic credit syst... sample_size / data source
Institutional design (enforceable rules, auditable logs, human oversight on high-impact actions) is a precondition for safe delegation of real authority to LLM agents; systems should be stress-tested under governance-like constraints before assignment of real authority.
Policy recommendation derived from simulation findings that governance structure strongly influences corruption-related outcomes and that safeguards alone are not consistently sufficient; grounded in experiments and rubric-assessed outcomes across 28,112 transcript segments.
high positive I Can't Believe It's Corrupt: Evaluating Corruption in Multi... safety of delegation to LLM agents (compliance with rules, avoidance of abuse)
Among models operating below saturation, governance structure is a stronger driver of corruption-related outcomes than model identity.
Comparative analysis within the multi-agent governance simulations across different authority structures and model identities; outcomes aggregated and compared across regimes (based on the 28,112 transcript segments scored).
high positive I Can't Believe It's Corrupt: Evaluating Corruption in Multi... corruption-related outcomes / rule-breaking
Integrity in institutional AI should be treated as a pre-deployment requirement rather than a post-deployment assumption.
Argument and recommendation based on results from multi-agent governance simulations evaluating rule-breaking and abuse; conclusions drawn from aggregate outcomes across simulated regimes and interventions (see study of 28,112 transcript segments).
high positive I Can't Believe It's Corrupt: Evaluating Corruption in Multi... institutional integrity / safety of delegation to LLM agents
The AgentDS benchmark datasets are open-sourced and available at https://huggingface.co/datasets/lainmn/AgentDS.
Paper includes link to the open-source datasets and the AgentDS website.
The strongest solutions arise from human-AI collaboration.
Analysis of competition results showing top-performing submissions employed human-AI collaborative approaches rather than AI-only baselines (results from 29 teams / 80 participants).
high positive AgentDS Technical Report: Benchmarking the Future of Human-A... performance of human-AI collaborative solutions
We introduce AgentDS, a benchmark and competition designed to evaluate both AI agents and human-AI collaboration performance in domain-specific data science.
Paper describes the creation of the AgentDS benchmark and an associated competition as the study's primary methodological contribution.
high positive AgentDS Technical Report: Benchmarking the Future of Human-A... benchmark for evaluating AI agents and human-AI collaboration