The Commonplace

Evidence (4131 claims)

Adoption
8625 claims
Productivity
7686 claims
Governance
6917 claims
Human-AI Collaboration
6574 claims
Org Design
4189 claims
Innovation
4131 claims
Labor Markets
3588 claims
Skills & Training
2985 claims
Inequality
2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 761 200 101 904 2020
Governance & Regulation 829 400 191 122 1566
Organizational Efficiency 784 193 125 84 1197
Technology Adoption Rate 637 236 124 97 1103
Research Productivity 431 131 58 340 972
Output Quality 481 183 59 47 770
Decision Quality 332 177 82 49 647
Firm Productivity 439 57 88 20 610
AI Safety & Ethics 218 279 66 33 602
Market Structure 181 170 123 24 503
Task Allocation 214 64 72 33 388
Skill Acquisition 174 62 62 17 315
Innovation Output 204 27 45 18 295
Employment Level 105 54 108 13 282
Fiscal & Macroeconomic 132 69 43 26 277
Consumer Welfare 117 63 42 11 233
Firm Revenue 154 48 26 3 231
Task Completion Time 173 31 8 12 225
Inequality Measures 44 123 50 6 223
Worker Satisfaction 89 65 22 12 188
Error Rate 71 92 10 2 175
Regulatory Compliance 77 69 14 5 165
Automation Exposure 58 56 26 13 156
Training Effectiveness 96 21 14 19 152
Wages & Compensation 77 37 25 6 145
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 81 21 1 115
Hiring & Recruitment 52 7 8 3 70
Creative Output 32 20 8 3 64
Skill Obsolescence 5 47 6 1 59
Social Protection 28 16 8 2 54
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
Filter: Innovation
The emergence of AI agents—systems where large language models serve as the primary reasoning engine, dynamically generating and discarding code as an instrumental resource—constitutes a fundamental restructuring of the software paradigm rather than an incremental improvement.
Argument based on first-principles analysis of complexity scaling and conceptual comparison between traditional software and agentic systems (theoretical analysis presented in the paper).
high positive The End of Software Engineering: How AI Agents Are Fundament... nature of the software development paradigm (static-code-centric vs LLM-driven a...
ALE is intended not merely as another leaderboard, but as an instrument for closing the gap between benchmark success and GDP-relevant impact.
Author-stated intent and high-level goal of the benchmark.
high positive Agents' Last Exam alignment of benchmark evaluation with GDP-relevant impact (economic impact of A...
ALE is designed as a living benchmark: its task pool grows continuously as new workflows and industries are onboarded.
Design and maintenance policy described by the authors.
high positive Agents' Last Exam continuous expansion of benchmark task pool
ALE was developed in collaboration with 250+ industry experts.
Author statement specifying collaborator count.
high positive Agents' Last Exam number of industry experts involved in development
This paper introduces Agents' Last Exam (ALE), a benchmark designed to evaluate AI agents on long-horizon, economically valuable, real-world tasks with verifiable outcomes.
Description of benchmark introduced by the authors (design claim).
high positive Agents' Last Exam AI agent performance on long-horizon real-world tasks (verifiable outcomes / tas...
Recent AI systems have achieved strong results on a wide range of benchmarks.
Statement in paper (background/context); refers to existing benchmark results in the literature (no specific benchmarks or datasets named in this excerpt).
high positive Agents' Last Exam performance on existing AI benchmarks
Understanding the evolution of LLM-augmented search is critical for organizations seeking to maintain brand relevance in an AI-augmented information landscape.
Prescriptive concluding claim in paper; based on the authors' synthesis of observed trends and conceptual analysis rather than empirical validation in the provided excerpt.
high positive SEARCH ENGINE OPTIMIZATION: HOW LLM-GENERATED SUMMARIES ARE ... organizational ability to maintain brand relevance
Public examples referenced include the reported PocketOS and Replit agentic database-deletion incidents and Moffatt v. Air Canada as an adjudicated output/reliance case.
The paper cites specific public incidents and a legal case as examples supporting its discussion.
high positive From Control Boundary to Insurance Claim: Reconstructing AI-... use of real-world examples and adjudicated case to illustrate AI reconstruction ...
The paper makes three contributions: it defines the AI-specific reconstruction problem, operationalizes that problem through CER, and specifies claim-grade evidence for AI reconstruction.
Author-stated contributions in the paper; descriptive of the paper's goals and deliverables.
high positive From Control Boundary to Insurance Claim: Reconstructing AI-... conceptual/operational contributions delivered by the paper (definition, operati...
The paper introduces CER, a use-case-level diagnostic for AI residual risk transfer: C (control boundary) asks whether the system had an enforceable operating envelope; E (evidence reconstruction) asks whether the system state and causal chain can be reconstructed from retained artifacts; R (insurance response) asks whether the reconstructed loss is insured (coverage available and placed, and proof needed to support claim recovery).
Framework introduction and operationalization described in the paper; presented as the paper's primary methodological contribution.
high positive From Control Boundary to Insurance Claim: Reconstructing AI-... diagnostic ability to evaluate residual risk transfer via control boundaries, ev...
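The three CER questions in the claim above can be pictured as a simple per-use-case checklist. The sketch below is only an illustration under stated assumptions: the class name `CERAssessment`, its field names, and the `gaps` helper are hypothetical and do not come from the paper, which may operationalize CER quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CERAssessment:
    """Hypothetical record for one AI use case; all names are illustrative."""

    control_boundary: bool         # C: an enforceable operating envelope existed
    evidence_reconstruction: bool  # E: state and causal chain recoverable from retained artifacts
    insurance_response: bool       # R: reconstructed loss is insured and provable for a claim

    def gaps(self) -> list[str]:
        """Return the CER questions answered 'no', in C, E, R order."""
        answers = {
            "C": self.control_boundary,
            "E": self.evidence_reconstruction,
            "R": self.insurance_response,
        }
        return [letter for letter, ok in answers.items() if not ok]


# Example: the envelope was enforceable, but artifacts were not retained,
# so the loss cannot be reconstructed or supported in a claim.
example = CERAssessment(
    control_boundary=True,
    evidence_reconstruction=False,
    insurance_response=False,
)
print(example.gaps())  # ['E', 'R']
```

Treating each question as a yes/no flag is a simplification chosen for brevity; a graded or evidence-linked answer may be closer to how such a diagnostic would be used in practice.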
The paper addresses losses in which the insured's AI system is in the causal chain, including externally triggered failures such as prompt injection, retrieval-augmented generation (RAG) poisoning, malicious tool output, credential misuse, and data poisoning.
Scope statement in the paper listing specific failure modes; descriptive rather than empirical.
high positive From Control Boundary to Insurance Claim: Reconstructing AI-... coverage of AI-caused loss modes (identification of failure types relevant to re...
The relevant question for such losses is not only what loss occurred, but what the system was allowed to do, what it actually did, and whether that reconstructed loss can support insurance claim recovery.
Conceptual framing provided in the paper; presented as the diagnostic/analytic focus rather than backed by empirical data in the excerpt.
high positive From Control Boundary to Insurance Claim: Reconstructing AI-... completeness of reconstruction (allowed actions, actual actions) needed to estab...
AI losses that arise through an insured organization's generative or agentic AI system require state reconstruction, not merely event reconstruction, because the relevant state changes as the system reasons, retrieves, calls tools, and acts.
Argument presented in the paper as a conceptual/theoretical claim about the nature of AI-system-caused losses; no empirical sample or quantitative study reported in the excerpt.
high positive From Control Boundary to Insurance Claim: Reconstructing AI-... need for state reconstruction (vs. event-only reconstruction) to support insuran...
The future of agentic-AI insurance lies not in a single monoline product but in a layered ecosystem of complementary coverages supported by improved governance, transparency, telemetry, and regulatory clarity.
Analytic conclusion/recommendation based on the paper's risk taxonomy, actuarial framework, and parallels to cyber insurance; forward-looking synthesis rather than empirical causal evidence.
high positive Insurance of Agentic AI recommended market design for agentic-AI insurance (layered ecosystem vs single ...
A coordinated insurance architecture integrating cyber, technology errors and omissions, product liability, performance-warranty, and affirmative AI-liability coverages with explicit allocation mechanisms and dedicated AI aggregates is proposed.
Design proposal in the paper detailing a layered insurance architecture combining multiple coverages and allocation mechanisms; conceptual design not empirically tested.
high positive Insurance of Agentic AI proposed coordinated insurance architecture for agentic AI
The paper proposes an actuarial framework based on exposure assessment, scenario analysis, dependency mapping, and accumulation-risk management, drawing parallels to the evolution of cyber insurance.
Proposed actuarial approach described in the paper, invoking methods like scenario analysis and dependency mapping and analogizing to cyber insurance development; methodological proposal without empirical validation.
high positive Insurance of Agentic AI actuarial framework components for agentic-AI insurance
The paper develops a framework for understanding underwriting, pricing, reinsurance, and product-design implications for agentic-AI insurance.
Methodological contribution stated in the paper: proposed actuarial/underwriting framework (exposure assessment, scenario analysis, dependency mapping, accumulation-risk management); conceptual development rather than empirical validation.
high positive Insurance of Agentic AI framework for underwriting/pricing/reinsurance/product design
Large-scale online experiments demonstrate consistent relative improvements in device cold-start engagement.
Reported results from large-scale online experiments in Tubi production (no numerical effect sizes or sample sizes provided in excerpt).
high positive Bridging the Semantic-Collaborative Gap: An Asymmetric Graph... device cold-start engagement
Large-scale online experiments demonstrate consistent relative improvements in impression acquisition.
Reported results from large-scale online experiments in Tubi production (no numerical effect sizes or sample sizes provided in excerpt).
high positive Bridging the Semantic-Collaborative Gap: An Asymmetric Graph... impression acquisition (number/rate of impressions for content)
Large-scale online experiments demonstrate consistent relative improvements in promotion speed.
Reported results from large-scale online experiments in Tubi production (no numerical effect sizes or sample sizes provided in excerpt).
high positive Bridging the Semantic-Collaborative Gap: An Asymmetric Graph... promotion speed (how quickly new content is promoted)
Large-scale online experiments demonstrate consistent relative improvements in content cold-start engagement.
Reported results from large-scale online experiments in Tubi production (no numerical effect sizes or sample sizes provided in excerpt).
high positive Bridging the Semantic-Collaborative Gap: An Asymmetric Graph... content cold-start engagement
After training, the learned content encoder generates embeddings for both warm and newly ingested content, enabling implicit graph completion through retrieval of warm surrogate neighbors.
Functional claim based on model training and retrieval behavior described in paper (mechanistic claim; supported by described architecture and training procedure).
high positive Bridging the Semantic-Collaborative Gap: An Asymmetric Graph... ability to generate embeddings for new content and enable implicit graph complet...
The RHS content tower does not use ID-based embeddings, content-side subgraphs, neighbor aggregation, or interaction-derived representations, forcing the content encoder to map intrinsic features into a collaborative-filtering-aware embedding space.
Design choice and intended representational effect described in paper (architectural constraints and claimed representational consequence).
high positive Bridging the Semantic-Collaborative Gap: An Asymmetric Graph... content encoder representation (mapping intrinsic features into CF-aware embeddi...
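The idea of retrieving warm surrogate neighbors for newly ingested content can be illustrated with a plain nearest-neighbor lookup in embedding space. This is a minimal sketch, not the paper's method: the embeddings are hand-written vectors rather than outputs of a learned content encoder, cosine similarity is an assumed choice of metric, and the names `cosine`, `warm_surrogates`, and the item IDs are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def warm_surrogates(new_emb: list[float], warm_embs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the IDs of the k warm items whose embeddings are closest to new_emb."""
    ranked = sorted(
        warm_embs,
        key=lambda item_id: cosine(new_emb, warm_embs[item_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings standing in for content-encoder outputs (assumed values).
warm = {
    "warm_a": [1.0, 0.0],
    "warm_b": [0.8, 0.6],
    "warm_c": [0.0, 1.0],
}
neighbors = warm_surrogates([0.9, 0.1], warm, k=2)
print(neighbors)  # ['warm_a', 'warm_b']
```

In this picture, a cold item with no interaction history borrows the collaborative signal of its nearest warm neighbors, which is one way to read "implicit graph completion".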
The article proposes a Strategic Action Framework to support more inclusive and context-responsive AI ecosystems.
Policy recommendation/framework presented by the authors as a conclusion; not empirically evaluated within the study.
high positive Compressed professionalization in informal economies: a soci... Strategic Action Framework (policy intervention)
Empirical observations show that youth mobilize AI tools for translation, content creation, customer engagement, and micro-entrepreneurial activities, enabling partial and situational approximation of selected formal-sector practices.
Qualitative interview data from the 125 semi-structured interviews in three DRC cities, used as illustrative grounding for observed uses of AI by youth.
high positive Compressed professionalization in informal economies: a soci... use of AI for translation, content creation, customer engagement, and micro-entr...
By bridging established knowledge with emerging governance challenges, this study advances a more comprehensive understanding of platform governance and outlines future research avenues related to technological change, dynamic capabilities, and ecosystem perception.
Authors' stated contribution based on their integrative framework and literature synthesis of 644 publications.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... advancement of understanding of platform governance and identification of future...
The paper proposes a research agenda that examines how emerging technologies, including algorithmic governance, generative AI, and agentic systems, are reshaping governance practices.
Paper's concluding/prospective section proposing future research directions; conceptual proposal rather than empirical test.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... proposed future research topics concerning the impact of emerging technologies o...
The identified governance mechanisms foster innovation in platform ecosystems.
Claim based on the paper's integrative synthesis of 644 publications indicating governance's role in fostering innovation.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... innovation outcomes in platform ecosystems associated with governance mechanisms
The identified governance mechanisms ensure quality in platform ecosystems.
Argument and synthesis from the systematic literature review of 644 publications as presented in the paper's framework.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... quality assurance in platform ecosystems via governance mechanisms
The identified governance mechanisms (incentives, control, boundary resources) enable platform owners to coordinate value creation.
Argument based on the integrative framework derived from the systematic literature review (644 publications).
high positive Mission: Orchestration – Governance Mechanisms And Future Re... coordination of value creation by platform owners using governance mechanisms
There are three core types of governance mechanisms that enable platform owners to coordinate value creation, ensure quality, and foster innovation: incentives, control, and boundary resources.
Synthesis and classification resulting from the systematic literature review of 644 publications, producing an integrative framework that identifies the three mechanism types.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... identification of three governance mechanism types (incentives, control, boundar...
This study conducts a systematic literature review of 644 publications to synthesize the governance landscape and develop an integrative framework.
Methodological statement from the paper reporting the authors performed a systematic literature review analyzing 644 publications.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... number of publications reviewed and use of SLR to develop framework
Platform owners orchestrate complementor participation through governance mechanisms.
Synthesis and conceptual argument based on the systematic literature review of 644 publications.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... use of governance mechanisms by platform owners to orchestrate participation
Digital platform ecosystems rely on loosely coupled complementors to jointly create value with platform owners.
Synthesis of prior literature via the paper's systematic literature review (644 publications); conceptual framing in the literature on platform ecosystems.
high positive Mission: Orchestration – Governance Mechanisms And Future Re... reliance of platform ecosystems on loosely coupled complementors for joint value...
The productivity-enhancing effect of fintech is stronger in regions with higher levels of economic development.
Heterogeneity/subsample analysis reported for regional economic development levels using the sample of Chinese A-share listed manufacturing firms (2015–2023); paper states fintech's effect on TFP is more pronounced in more economically developed regions (no subgroup sample sizes or quantitative estimates provided in the excerpt).
high positive Research on the Impact of Financial Technology on the Total ... corporate total factor productivity (heterogeneous effect by regional developmen...
The productivity-enhancing effect of fintech is more pronounced in high-tech industries.
Heterogeneity/subsample analysis in the paper using the sample of Chinese A-share listed manufacturing firms (2015–2023); paper reports stronger fintech–TFP effects in high-tech industry subsample (no subgroup sample sizes or coefficients provided in the excerpt).
high positive Research on the Impact of Financial Technology on the Total ... corporate total factor productivity (heterogeneous effect by industry tech-inten...
The positive effect of fintech on corporate total factor productivity operates primarily through the channels of supply chain finance and innovation effects.
Mediation/mechanism analysis reported in the study using the same sample of Chinese A-share listed manufacturing firms (2015–2023); the paper identifies supply chain finance and innovation as the primary channels (specific mediation estimates not provided in the excerpt).
high positive Research on the Impact of Financial Technology on the Total ... corporate total factor productivity (through supply chain finance and innovation...
Fintech development can significantly enhance corporate total factor productivity for Chinese A-share listed manufacturing firms.
Empirical analysis on a sample of Chinese A-share listed manufacturing enterprises covering 2015–2023; result described as statistically significant in the paper (specific estimation methods and sample size not provided in the excerpt).
high positive Research on the Impact of Financial Technology on the Total ... corporate total factor productivity
Digital platforms have established a new regime (surveillance capitalism) that converts human experience into data and turns it into economic value.
Theoretical and conceptual analysis; the study draws on Zuboff's surveillance capitalism approach. No empirical sample or quantitative evidence reported.
high positive GÖZETİM KAPİTALİZMİNİN HUKUKSAL TEMELLERİ: FOUCAULTCU BİR AN... digital platforms' conversion of human experience into data and its economic value...
The field can be organized around an integrated decision-system framework consisting of five connected constructs—delegation frontier, reliance wedge, decision-useful XAI, meaningful oversight, and reflexive AI loop—to support cumulative research on investment, trading, credit, asset management, risk, compliance, and financial regulation.
Proposal of a conceptual framework grounded in the paper’s integrative literature review (no empirical validation or sample size reported in the abstract).
high positive Human–AI hybrid finance: from AI tools to decision systems utility of the proposed decision-system framework for structuring future researc...
The review integrates evidence on methods, data, scenarios, explainability, trust, governance, financial large language models (FinLLMs), and agentic finance.
Descriptive claim about the scope of this paper’s literature synthesis (the review itself; content-based rather than empirical).
high positive Human–AI hybrid finance: from AI tools to decision systems breadth of topics integrated in the review
The central question is moving from model performance to decision architecture: how authority, oversight, and accountability should be allocated across financial workflows.
Argument based on synthesis of prior literature across relevant fields (conceptual review; no single empirical study or sample size reported).
high positive Human–AI hybrid finance: from AI tools to decision systems allocation of authority, oversight, and accountability in financial decision wor...
AI is moving from a predictive tool to a component of human–AI hybrid financial decision systems.
Integrative conceptual literature review synthesizing work across finance, management, human–computer interaction (HCI), and AI (no primary empirical sample reported).
high positive Human–AI hybrid finance: from AI tools to decision systems role of AI within financial decision workflows (predictive tool vs. integrated d...
The benchmark is publicly available at: https://github.com/ant-research/meta-agent-challenge.
Statement of public release and URL provided in the paper.
high positive The Meta-Agent Challenge: Are Current Agents Capable of Auto... public availability / open-source access
MAC provides a rigorous, open-source benchmark for autonomous AI research and development and offers an empirical proxy for evaluating recursive self-improvement.
Claim about the utility and intended purpose of the released benchmark; supported by the benchmark's design and experiments described in the paper.
high positive The Meta-Agent Challenge: Are Current Agents Capable of Auto... empirical evaluation of recursive self-improvement
The few meta-agents that do match human-engineered baselines are dominated by proprietary frontier models.
Experimental observations reported in the paper indicating that successful meta-agents rely on proprietary frontier models; details (counts, model names) not provided in abstract.
high positive The Meta-Agent Challenge: Are Current Agents Capable of Auto... composition of successful meta-agents (proprietary vs non-proprietary)
To ensure evaluation integrity, the framework is secured by multi-layer defenses against reward hacking.
Methodological claim in paper about security measures implemented in the benchmark.
high positive The Meta-Agent Challenge: Are Current Agents Capable of Auto... resistance to reward hacking / evaluation integrity
In MAC, a code agent (the meta-agent) is given a sandboxed environment, an evaluation API, and a time limit, and must iteratively program an agent artifact that maximizes performance on a held-out test set across five domains.
Method description of the benchmark setup; specification includes 'held-out test set across five domains'.
high positive The Meta-Agent Challenge: Are Current Agents Capable of Auto... performance on a held-out test set
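The setup described above (iterate against an evaluation API until a time limit expires, keeping the best artifact) can be sketched as a generic propose-and-evaluate loop. Everything here is an assumption for illustration: `meta_agent_loop`, `propose`, and `evaluate` are hypothetical names, the artifact is a toy integer, and the sketch says nothing about MAC's real API, sandbox, or how the held-out test set is scored.

```python
import time
from typing import Callable, Optional, Tuple


def meta_agent_loop(
    propose: Callable[[Optional[int]], int],
    evaluate: Callable[[int], float],
    time_limit_s: float,
    max_iters: int = 1000,
) -> Tuple[Optional[int], float]:
    """Repeatedly propose a candidate artifact, score it, and keep the best one."""
    deadline = time.monotonic() + time_limit_s
    best_artifact: Optional[int] = None
    best_score = float("-inf")
    for _ in range(max_iters):
        if time.monotonic() >= deadline:
            break  # time budget exhausted
        candidate = propose(best_artifact)  # build on the best artifact so far
        score = evaluate(candidate)         # stand-in for a call to an evaluation API
        if score > best_score:
            best_artifact, best_score = candidate, score
    return best_artifact, best_score


# Toy problem: the "artifact" is an integer and the score peaks at 3.
def toy_propose(prev: Optional[int]) -> int:
    return 0 if prev is None else prev + 1


def toy_evaluate(artifact: int) -> float:
    return -float((artifact - 3) ** 2)


best, score = meta_agent_loop(toy_propose, toy_evaluate, time_limit_s=5.0, max_iters=10)
print(best, score)  # 3 -0.0
```

The loop scores candidates with the same function it optimizes; in a benchmark with a held-out test set, the final artifact would be scored separately on data the loop never sees.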
We introduce the Meta-Agent Challenge (MAC), an evaluation framework designed to test the capacity of frontier models for autonomous agent development.
Paper contribution: description of a new evaluation framework (methodological introduction).
high positive The Meta-Agent Challenge: Are Current Agents Capable of Auto... capacity of models to develop autonomous agents
The findings imply that research evaluation and science policy should adopt assessment frameworks that distinguish between recombinant and conceptual forms of creativity and recognize that different modes of AI adoption produce different types of scientific contribution.
Policy/recommendation statement grounded in the paper's empirical findings on heterogeneous creativity effects by AI research mode.
high positive Does Artificial Intelligence Advance Science? policy recommendation for research evaluation frameworks