The Commonplace

Evidence (13661 claims)

Adoption: 8339 claims
Productivity: 7479 claims
Governance: 6715 claims
Human-AI Collaboration: 6267 claims
Org Design: 4098 claims
Innovation: 3987 claims
Labor Markets: 3488 claims
Skills & Training: 2888 claims
Inequality: 2016 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome | Positive | Negative | Mixed | Null | Total
Other | 740 | 192 | 95 | 871 | 1945
Governance & Regulation | 796 | 388 | 185 | 119 | 1512
Organizational Efficiency | 765 | 186 | 123 | 82 | 1166
Technology Adoption Rate | 610 | 227 | 121 | 95 | 1061
Research Productivity | 409 | 121 | 56 | 331 | 928
Output Quality | 464 | 174 | 58 | 47 | 743
Decision Quality | 318 | 173 | 75 | 42 | 615
Firm Productivity | 432 | 55 | 88 | 20 | 601
AI Safety & Ethics | 214 | 273 | 65 | 33 | 589
Market Structure | 175 | 165 | 120 | 24 | 489
Task Allocation | 206 | 64 | 70 | 31 | 376
Skill Acquisition | 161 | 57 | 57 | 16 | 291
Innovation Output | 201 | 27 | 41 | 18 | 288
Fiscal & Macroeconomic | 130 | 69 | 43 | 26 | 275
Employment Level | 104 | 50 | 105 | 13 | 274
Consumer Welfare | 116 | 62 | 42 | 11 | 231
Firm Revenue | 149 | 45 | 26 | 3 | 223
Inequality Measures | 43 | 120 | 49 | 6 | 218
Task Completion Time | 164 | 29 | 8 | 12 | 214
Worker Satisfaction | 89 | 60 | 20 | 12 | 181
Error Rate | 69 | 89 | 9 | 2 | 169
Regulatory Compliance | 74 | 67 | 14 | 4 | 159
Training Effectiveness | 91 | 19 | 13 | 19 | 144
Wages & Compensation | 77 | 33 | 25 | 6 | 141
Team Performance | 86 | 17 | 27 | 9 | 140
Automation Exposure | 49 | 50 | 22 | 12 | 136
Developer Productivity | 91 | 17 | 14 | 5 | 128
Job Displacement | 12 | 80 | 19 | 1 | 112
Hiring & Recruitment | 51 | 7 | 8 | 3 | 69
Creative Output | 31 | 16 | 7 | 2 | 57
Skill Obsolescence | 5 | 43 | 6 | 1 | 55
Social Protection | 27 | 16 | 8 | 2 | 53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
The Process Performance Index (PI) is positively associated with abnormal earnings.
Empirical regression results using the panel dataset (3,515 firms; 20,076 firm-year observations) reporting a positive association between PI and abnormal earnings.
A Process Performance Index (PI) is constructed to measure AI-enabled operational capability across resource allocation efficiency, coordination effectiveness, and production performance dimensions.
Authors describe construction of PI using multi-dimensional indicators and the AHP–EWM weighting plus FCE aggregation procedure.
high positive A Data-Driven Evaluation Framework for Quantifying the Impac... Process Performance Index (PI) as a measure of AI-enabled operational capability
This study proposes a data-driven evaluation framework that integrates the Feltham–Ohlson enterprise value assessment with a multi-level performance evaluation framework (hybrid AHP–EWM weighting and Fuzzy Comprehensive Evaluation aggregation) to quantify the impact of AI on industrial process performance and enterprise value creation.
Methodological description in the paper: authors describe integrating Feltham–Ohlson valuation with AHP–EWM weighting and FCE aggregation to form a unified evaluation framework.
high positive A Data-Driven Evaluation Framework for Quantifying the Impac... ability to evaluate AI-driven process performance and enterprise value
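The entries above name a hybrid AHP–EWM weighting step without showing how it works. The following is a minimal sketch of the entropy weight method (EWM) half only, under stated assumptions: the indicator matrix is invented, the subjective (AHP-style) weights are supplied by hand rather than derived from pairwise comparisons, and the linear blend with `alpha` is one common convention, not necessarily the rule the paper uses. The fuzzy comprehensive evaluation step is not shown. Function names are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Entropy-method weights for the columns of a non-negative matrix.

    Rows are observations (e.g. firms), columns are indicators.
    Indicators whose values vary more across rows receive more weight.
    """
    n_rows = len(matrix)
    if n_rows < 2:
        raise ValueError("need at least two rows")
    divergences = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total <= 0:
            raise ValueError("each indicator needs a positive column sum")
        shares = [value / total for value in column]
        # Normalized Shannon entropy of the column, in [0, 1].
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n_rows)
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    total_divergence = sum(divergences)
    if total_divergence == 0:
        raise ValueError("all indicators are uniform; weights are undefined")
    return [d / total_divergence for d in divergences]

def blend_weights(subjective, objective, alpha=0.5):
    """Linear blend of subjective and objective weights (assumed rule)."""
    return [alpha * s + (1 - alpha) * o for s, o in zip(subjective, objective)]

# Invented data: three firms, two indicators; the second indicator is uniform.
indicators = [[0.9, 0.5], [0.1, 0.5], [0.5, 0.5]]
ewm = entropy_weights(indicators)
print(ewm)
print(blend_weights([0.4, 0.6], ewm))
```

In this toy input the second indicator is identical across firms, so the entropy method gives it almost no weight; the blend then pulls the final weights back toward the hand-supplied ones.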
New tendencies in managerial AI research and practice include explainable AI, human–AI collaboration, knowledge management, enterprise analytics, and algorithmic management.
Descriptive finding from the paper's literature synthesis (topics emphasized in the review); no quantitative prevalence or counts provided in the abstract.
high positive Artificial intelligence, machine learning, and deep learning... emergent research and practice topics / adoption tendencies
Machine Learning and Deep Learning enhance employee productivity, business intelligence, process mining, and data-driven decision-making by enabling prediction, perception, and adaptive learning solutions.
Claim synthesized in the review from multiple studies identified via PRISMA screening; abstract does not list the number or identity of underlying empirical studies.
high positive Artificial intelligence, machine learning, and deep learning... employee productivity, effectiveness of business intelligence and process mining...
AI-based technologies can greatly enhance managerial efficiency by automating repetitive activities, improving resource allocation, enabling intelligent scheduling, and supporting predictive modelling and strategic planning.
Summary conclusion from the paper's literature review (PRISMA methodology referenced); no quantitative meta-analytic effect sizes provided in abstract.
high positive Artificial intelligence, machine learning, and deep learning... managerial efficiency, automation of repetitive tasks, resource allocation, sche...
Machine Learning, Artificial Intelligence, and Deep Learning are tools that can optimize managerial decisions, enable intelligent automation, streamline workflows, and improve organizational performance.
Synthesis claim from the paper's PRISMA-based literature review (no numeric sample size reported in the abstract).
high positive Artificial intelligence, machine learning, and deep learning... managerial decision quality, automation, workflow streamlining, organizational p...
The paper ends with strategic suggestions to foster inclusive growth and orchestrate disruption, contributing evidence-based insights to the future of work in Africa.
Description of the paper's conclusions/recommendations drawn from its systematic review; represents the paper's stated contribution rather than an empirical claim about external data.
high positive The Impact of AI-Driven Automation on Semi and Unskilled Wor... policy recommendations and strategic guidance for inclusive growth and managed d...
The technologies are capable of raising productivity.
Synthesis from the paper's systematic review indicating productivity gains associated with AI/automation in the literature; no quantified meta-analytic estimate provided in the summary.
high positive The Impact of AI-Driven Automation on Semi and Unskilled Wor... productivity increases associated with AI adoption
Future research should explore hybrid frameworks that combine LLM reasoning with quantitative optimization for cost-sensitive environments.
Recommendation in conclusion based on observed results (LLMs perform reasonably but lag optimized methods and transaction costs matter).
high positive Few-Shot Portfolio Optimization: Can Large Language Models O... recommended research direction (hybrid LLM + optimization frameworks)
A transaction cost analysis revealed that low-turnover LLM strategies retain their competitiveness post-costs, surpassing cap-weighted benchmarks.
Post-transaction-cost analysis reported in results: LLM strategies with low turnover remained competitive after applying transaction cost assumptions and exceeded performance of cap-weighted benchmark.
high positive Few-Shot Portfolio Optimization: Can Large Language Models O... post-cost portfolio performance relative to cap-weighted benchmark
LLM-generated portfolios outperformed naive diversification (Sharpe ratio up to 0.741).
Backtest results comparing LLM-generated portfolios against naive diversification; reported Sharpe ratio value (up to 0.741) for LLM strategies.
high positive Few-Shot Portfolio Optimization: Can Large Language Models O... Sharpe ratio (risk-adjusted return) of portfolios
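For readers unfamiliar with the metric in the entry above: the Sharpe ratio is the mean excess return divided by the standard deviation of excess returns, usually annualized. A minimal sketch, assuming daily simple returns, a constant per-period risk-free rate, and 252 trading days per year; the entry does not state the paper's conventions, and the function name and sample returns are hypothetical.

```python
import math
import statistics

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period simple returns.

    risk_free_rate is expressed per period. The defaults are illustrative
    assumptions, not the settings used in the paper.
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    excess = [r - risk_free_rate for r in returns]
    mean_excess = statistics.mean(excess)
    stdev_excess = statistics.stdev(excess)  # sample standard deviation
    if stdev_excess == 0:
        raise ValueError("returns have zero variance")
    return (mean_excess / stdev_excess) * math.sqrt(periods_per_year)

# Hypothetical daily returns, for illustration only.
sample_returns = [0.01, -0.005, 0.007, 0.002, -0.003]
print(round(sharpe_ratio(sample_returns), 3))
```

Because the ratio scales with the square root of the number of periods per year, figures computed from daily, weekly, or monthly returns are comparable only after the same annualization is applied.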
Findings underscore the importance of robust evaluation frameworks for deploying VLMs in visually rich and safety-critical environments.
Synthesis/recommendation based on experimental results showing that visual inputs (images and colors) can influence VLM decisions and that mitigation effectiveness varies by model.
high positive The Effects of Visual Priming on Cooperative Behavior in Vis... need for/importance of robust evaluation frameworks for VLM safety and reliabili...
Policy frameworks, reskilling initiatives, and institutional adaptations are required to ensure inclusive technological progress.
Prescriptive conclusion presented in abstract based on the review and synthesis; no empirical validation or sample sizes provided in abstract.
high positive AI and the Transformation of Human Employment: Challenges, O... effectiveness of policy and reskilling to ensure inclusion
AI simultaneously generates demand for higher-order problem solving, emotional intelligence, and human-AI collaboration skills.
Explicit finding reported in abstract from the review of interdisciplinary literature; no quantified effect sizes or sample sizes provided in abstract.
high positive AI and the Transformation of Human Employment: Challenges, O... demand for higher-order skills / skill acquisition requirements
The majority of AI’s effect on potential GDP in the period under review was due to increased labor productivity and the optimization of existing processes.
Attribution/decomposition within the scenario analysis of aggregated industry data indicating productivity and process-optimization channels as principal contributors.
high positive THE IMPACT OF AI ON POTENTIAL GDP AND LONG-TERM ECONOMIC GRO... labor productivity and process optimization contributions to GDP
Artificial intelligence has become a significant factor in the growth of Russia’s potential GDP.
Findings reported from the scenario analysis and aggregated industry data reviewed in the paper and syntheses of Russian analytical sources.
high positive THE IMPACT OF AI ON POTENTIAL GDP AND LONG-TERM ECONOMIC GRO... contribution of AI to potential GDP
AI implementation during 2023–2025 was accompanied by a positive contribution to Russia’s potential GDP.
Analysis of aggregated industry data and a scenario approach using Russian-language sources (Ministry of Digital Development, HSE, Digital Economy ANO, analytical reviews).
For memory workloads requiring stable facts and stateful computation, architecture matters more than retrieval scale or model strength alone.
Conclusion drawn by the authors based on comparative experimental results reported in the paper (xmemory vs retrieval/model-strength baselines); excerpt provides aggregate benchmark comparisons but not full experimental details.
high positive From Unstructured Recall to Schema-Grounded Memory: Reliable... relative importance of system architecture versus retrieval/model strength for m...
On the application-level task, xmemory reaches 95.2% accuracy, outperforming specialised memory systems, code-generated Markdown harnesses, and customer-facing frontier-model application harnesses.
Empirical evaluation on an application-level task reported in the paper showing 95.2% accuracy for xmemory and claiming it outperforms several classes of alternative systems; excerpt lacks details on the task, dataset size, or baseline numeric results.
high positive From Unstructured Recall to Schema-Grounded Memory: Reliable... accuracy on an application-level memory task
On the end-to-end memory benchmark, xmemory reaches 97.10% F1, compared with 80.16%-87.24% across the third-party baselines.
Empirical evaluation on the paper's end-to-end memory benchmark reporting F1 scores for xmemory and a range for third-party baselines; the excerpt does not provide dataset size or statistical significance details.
high positive From Unstructured Recall to Schema-Grounded Memory: Reliable... F1 score on an end-to-end memory benchmark
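The F1 score reported above is the harmonic mean of precision and recall. A minimal sketch, assuming exact set-based matching of extracted records against a gold set; the excerpt does not describe how the paper's benchmark matches records, and the function name and sample facts are hypothetical.

```python
def f1_score(predicted, gold):
    """F1 over two collections of hashable items (exact-match assumption)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(gold)          # share of gold items that were found
    return 2 * precision * recall / (precision + recall)

# Hypothetical (subject, field, value) records, for illustration only.
gold_facts = {
    ("alice", "city", "Paris"),
    ("alice", "role", "engineer"),
    ("bob", "city", "Oslo"),
}
predicted_facts = {
    ("alice", "city", "Paris"),
    ("bob", "city", "Oslo"),
    ("bob", "role", "analyst"),
}
print(f1_score(predicted_facts, gold_facts))
```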
On the structured extraction benchmark (judge-in-the-loop configuration) the system reaches 90.42% object-level accuracy and 62.67% output accuracy, above all tested frontier structured-output baselines.
Empirical evaluation on the paper's structured extraction benchmark in the judge-in-the-loop configuration; the excerpt reports the numeric accuracies and states they exceed tested frontier structured-output baselines. The excerpt does not specify dataset size or number of runs.
high positive From Unstructured Recall to Schema-Grounded Memory: Reliable... object-level accuracy and output accuracy on a structured extraction benchmark
This iterative, schema-aware write-path design shifts interpretation from the read path to the write path: reads become constrained queries over verified records rather than repeated inference over retrieved prose.
Conceptual claim about how the proposed architecture affects system behavior; supported by the architectural description in the paper rather than explicit quantitative evidence in the excerpt.
high positive From Unstructured Recall to Schema-Grounded Memory: Reliable... nature of read queries (constrained queries over verified records vs repeated in...
We present an iterative, schema-aware write path that decomposes memory ingestion into object detection, field detection, and field-value extraction, with validation gates, local retries, and stateful prompt control.
Description of the proposed method/architecture in the paper (methodological contribution); no numeric evaluation attached to the description in the excerpt.
high positive From Unstructured Recall to Schema-Grounded Memory: Reliable... design/components of memory ingestion pipeline
Reliable external AI memory must be schema-grounded (schemas define what must be remembered, what may be ignored, and which values must never be inferred).
Normative assertion supported by the paper's proposed design and subsequent experimental results (the paper introduces a schema-grounded approach and evaluates it against benchmarks), though the excerpt does not give full methodological details or sample sizes for this claim alone.
high positive From Unstructured Recall to Schema-Grounded Memory: Reliable... reliability/stability of external AI memory
To manage AI legibility, creators perform four recurring forms of invisible authenticity labor: epistemic verification, linguistic naturalization, narrative restructuring, and performative embodiment.
Authors identify and name four recurrent practices from coding and analysis of 16 in-depth interviews with creators on Xiaohongshu and Douyin describing specific downstream repair and performance work.
high positive AI passing and invisible authenticity labor: trust vulnerabi... types of labor performed to conceal/humanize AI outputs
Creators engage in 'AI passing': strategic efforts to conceal and humanize AI-assisted drafts so that outputs plausibly appear human-authored.
Concept introduced based on analysis of 16 in-depth interviews with creators on Xiaohongshu and Douyin describing tactics to hide AI involvement and present content as human-authored.
high positive AI passing and invisible authenticity labor: trust vulnerabi... use of concealment/humanization strategies for AI outputs
Latency relaxation expands feasible geography for placing inference workloads.
Result reported from the paper's modeling and stylized simulation (energy-latency frontier analysis showing marginal cost/carbon benefits from relaxing latency budgets).
high positive AI Inference as Relocatable Electricity Demand: A Latency-Co... geographic feasibility of relocating inference demand as a function of latency b...
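The intuition behind this entry can be illustrated with a small filter: a region is feasible for a workload only if its round-trip latency fits the budget, so a looser budget admits more regions and can expose a cleaner one. A minimal sketch; the region names, latencies, and carbon intensities are invented for illustration and are not taken from the paper's simulation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    name: str
    round_trip_ms: float     # network latency from the user (hypothetical)
    carbon_g_per_kwh: float  # grid carbon intensity (hypothetical)

REGIONS = [
    Region("local", 10, 450),
    Region("regional", 60, 300),
    Region("remote-hydro", 180, 30),
]

def feasible_regions(regions, latency_budget_ms):
    """Regions whose round-trip latency fits within the budget."""
    return [r for r in regions if r.round_trip_ms <= latency_budget_ms]

def cleanest_feasible(regions, latency_budget_ms):
    """Lowest-carbon feasible region, or None if no region fits the budget."""
    candidates = feasible_regions(regions, latency_budget_ms)
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.carbon_g_per_kwh)

for budget in (20, 100, 250):
    names = [r.name for r in feasible_regions(REGIONS, budget)]
    print(budget, names, cleanest_feasible(REGIONS, budget).name)
```

The feasible set can only grow as the budget rises, which is why the best available carbon intensity (or cost, under the same logic) never gets worse when latency is relaxed.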
The paper provides a transparent stylized simulation over representative global compute regions to show how heterogeneous latency tolerance separates workloads into local, regional, and energy-oriented execution layers.
Empirical/methodological evidence from a stylized simulation described in the paper; uses representative global compute regions and latency-tolerance heterogeneity to categorize workloads.
high positive AI Inference as Relocatable Electricity Demand: A Latency-Co... assignment of workloads into execution layers (local, regional, energy-oriented)...
AI inference is becoming a persistent and geographically distributed source of electricity demand.
Statement/assertion in the paper's introduction framing the motivation; no empirical sample or experiment reported in the provided text.
high positive AI Inference as Relocatable Electricity Demand: A Latency-Co... electricity demand (geographic distribution and persistence)
Effective governance requires coordinated action across technical, organizational, and regulatory domains (e.g., system-level audits, vendor guidelines, continuous monitoring, documentation across dependency chains) to establish meaningful accountability in distributed development environments.
Policy and technical recommendations derived from literature review, regulatory analysis, and the paper's conceptual findings (recommendation, not empirically validated).
high positive How Supply Chain Dependencies Complicate Bias Measurement an... effectiveness of governance measures in producing meaningful accountability for ...
Claw-Eval-Live suggests that workflow-agent evaluation should be grounded twice, in fresh external demand and in verifiable agent action.
Conclusion/recommendation drawn from the benchmark design and experimental findings; conceptual claim advocating evaluation grounded in external demand signals and verifiable actions.
high positive Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-Wor... evaluation grounding (use of fresh external demand signals and verifiable agent ...
The release contains 105 tasks spanning controlled business services and local workspace repair, and evaluates 13 frontier models under a shared public pass rule.
Benchmark release statistics reported in the paper: explicit counts of tasks and evaluated models (105 tasks; 13 models).
high positive Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-Wor... benchmark scope (number of tasks) and evaluation breadth (number of models)
For grading, Claw-Eval-Live records execution traces, audit logs, service state, and post-run workspace artifacts, using deterministic checks when evidence is sufficient and structured LLM judging only for semantic dimensions.
Grading methodology described in the paper: instrumentation and hybrid deterministic/LLM-judging approach documented by authors (procedural description).
high positive Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-Wor... grading/verifiability pipeline (traces, logs, deterministic checks, structured L...
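The "deterministic first, judge only when needed" rule described above can be sketched as a simple fallback. This is an illustration under stated assumptions, not the benchmark's actual grader: the evidence dictionary keys, the `report.md` file name, and the stub judge are all hypothetical, and a real semantic judge would call a model rather than test for non-empty text.

```python
from typing import Callable, Optional, Tuple

# Hypothetical types: a deterministic check returns True/False when the
# recorded evidence settles the question, or None when it does not.
DeterministicCheck = Callable[[dict], Optional[bool]]
SemanticJudge = Callable[[dict], bool]

def grade_dimension(
    evidence: dict, check: DeterministicCheck, judge: SemanticJudge
) -> Tuple[bool, str]:
    """Return (passed, method); the judge runs only if the check abstains."""
    verdict = check(evidence)
    if verdict is not None:
        return verdict, "deterministic"
    return judge(evidence), "judge"

def report_file_exists(evidence: dict) -> Optional[bool]:
    artifacts = evidence.get("workspace_artifacts")
    if artifacts is None:
        return None  # no artifact listing recorded, so evidence is insufficient
    return "report.md" in artifacts

def stub_judge(evidence: dict) -> bool:
    # Stand-in for a structured LLM judge; here it only checks for any text.
    return bool(evidence.get("summary_text"))

print(grade_dimension({"workspace_artifacts": ["report.md"]}, report_file_exists, stub_judge))
print(grade_dimension({"summary_text": "Invoice reconciled."}, report_file_exists, stub_judge))
```

Returning the method alongside the verdict keeps a record of which results rest on verifiable evidence and which rest on a judge's reading.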
Each release is constructed from public workflow-demand signals, with ClawHub Top-500 skills used in the current release, and materialized as controlled tasks with fixed fixtures, services, workspaces, and graders.
Description of release construction in the methods: uses public workflow-demand data and ClawHub Top-500 skills; tasks are materialized with controlled fixtures and graders (procedural detail from the paper).
high positive Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-Wor... composition of benchmark releases (source signals and materialization strategy)
We introduce Claw-Eval-Live, a live benchmark for workflow agents that separates a refreshable signal layer, updated across releases from public workflow-demand signals, from a reproducible, time-stamped release snapshot.
Methodological contribution described in the paper; design and architecture of the benchmark are presented by the authors (design description, no external sample needed).
high positive Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-Wor... benchmark design (refreshable signal layer vs. time-stamped snapshot)
LLM agents are expected to complete end-to-end units of work across software tools, business services, and local workspaces.
Framing/background statement in the paper describing expected capabilities of workflow agents; no empirical sample size reported for this expectation.
high positive Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-Wor... ability to complete end-to-end units of work
Substantive technological competencies play an important role in shaping network resilience and complement structure-based perspectives in understanding innovation networks.
Synthesis of empirical findings from composite metric identification and disruption simulations on the 282,778-patent-derived networks showing capability-based removals have stronger impacts than structure-only removals.
high positive Technological capability and innovation network resilience: ... role of technological competency in network resilience
A composite technological capability metric can be constructed (from textual and network information) to identify core innovators beyond simple topological measures.
Construction and application of a composite metric combining text-derived technological value and network features on 282,778 patents; used to identify core innovators.
high positive Technological capability and innovation network resilience: ... ability to identify core innovators
Latent Dirichlet Allocation (LDA) on the patent texts delineates fine-grained technological domains within the Chinese AI patent corpus.
Text-mining method applied to a corpus of 282,778 Chinese AI patents using LDA to extract topic/domains.
high positive Technological capability and innovation network resilience: ... granular technological domain delineation
This study develops a multidimensional, knowledge-driven evaluation framework that integrates text mining with complex network analysis to identify core innovators.
Methodological description: framework built using Latent Dirichlet Allocation (LDA) on 282,778 Chinese AI patents, construction of a composite technological capability metric, and simulation of targeted disruptions across collaboration and knowledge networks.
high positive Technological capability and innovation network resilience: ... identification of core innovators
Managing evolutionary dynamics in software is as urgent as AGI alignment for safeguarding society’s co-evolution with its machines.
Author's concluding normative claim in the abstract; argument based on scenario analysis rather than comparative empirical evidence.
high positive Digital Darwinism: steering the evolution of artificial life... relative urgency of managing software evolutionary dynamics versus AGI alignment
Governance should shift focus from aligning goals to steering evolution; the paper proposes four guidance instruments: replication-rate thresholds (modeled on epidemiological R0), a public vulnerability registry for self-modifying code, tiered digital biosafety levels, and adaptive regulatory sandboxes.
Normative policy recommendation spelled out in the abstract; based on the paper's scenario analysis and argumentation rather than empirical validation.
high positive Digital Darwinism: steering the evolution of artificial life... proposed governance instruments to manage software evolutionary dynamics
Cloud platforms, open-source software supply chains, and crypto-economic incentives provide, at electronic speed, the three preconditions of evolution: replication, variation, and differential fitness.
Conceptual/mechanistic claim supported by theoretical argumentation and scenario-building in the paper (no empirical test or sample reported).
high positive Digital Darwinism: steering the evolution of artificial life... presence of replication, variation, and differential fitness in software ecosyst...
The proposed framework balances AI-driven productivity with the epistemic sovereignty necessary to manage increasingly opaque software ecosystems.
Normative/architectural claim about the proposed framework; presented conceptually in the paper without reported empirical testing in the excerpt.
high positive Cognitive Atrophy and Systemic Collapse in AI-Dependent Soft... balance between productivity gains and maintenance of epistemic sovereignty (hum...
To preserve long-term resilience, engineering leaders must move beyond prompt-based development to implement rigorous human-in-the-loop pedagogical standards.
Prescriptive recommendation based on the paper's conceptual analysis; no randomized trials or empirical validation of this intervention reported in the excerpt.
high positive Cognitive Atrophy and Systemic Collapse in AI-Dependent Soft... long-term resilience of engineering organizations when using human-in-the-loop p...
The findings offer practical insights for construction firms to enhance innovation performance through effective AI integration and help engineers better leverage AI tools in design and project management workflows.
Authors' stated practical implications based on their empirical findings (survey results linking AI capability, decision-making quality, and innovation performance).
Algorithmic transparency positively moderates the relationship between AI capability and decision-making quality.
Moderation analysis reported on questionnaire data (Credamo, time-lagged) with n=435; authors state a positive moderating effect of algorithmic transparency.
Decision-making quality mediates the relationship between AI capability and innovation performance.
Mediation analysis reported on the same survey dataset (time-lagged Credamo survey) with n=435 using established measurement scales; stated in results.
AI capability is positively associated with innovation performance.
Authors report statistical analysis of questionnaire data collected via the Credamo platform (time-lagged design) using established scales; sample size n=435; result stated in findings.