The Commonplace

Evidence (8570 claims)

Adoption
8570 claims
Productivity
7631 claims
Governance
6869 claims
Human-AI Collaboration
6491 claims
Org Design
4175 claims
Innovation
4114 claims
Labor Markets
3566 claims
Skills & Training
2966 claims
Inequality
2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 758 199 100 900 2007
Governance & Regulation 826 400 191 122 1563
Organizational Efficiency 777 193 124 84 1189
Technology Adoption Rate 635 233 124 97 1098
Research Productivity 422 128 57 336 954
Output Quality 476 179 59 47 761
Decision Quality 328 177 81 47 640
Firm Productivity 435 57 88 20 606
AI Safety & Ethics 218 277 65 33 599
Market Structure 180 170 123 24 502
Task Allocation 213 64 72 33 387
Skill Acquisition 170 61 61 17 309
Innovation Output 203 27 43 18 292
Employment Level 105 54 107 13 281
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 117 63 42 11 233
Firm Revenue 153 48 26 3 230
Task Completion Time 173 31 8 12 225
Inequality Measures 44 122 49 6 221
Worker Satisfaction 89 65 22 12 188
Error Rate 69 92 10 2 173
Regulatory Compliance 77 69 14 5 165
Automation Exposure 56 56 26 13 154
Training Effectiveness 94 21 13 19 149
Wages & Compensation 77 36 25 6 144
Team Performance 86 17 27 10 141
Developer Productivity 95 17 14 6 133
Job Displacement 12 80 20 1 113
Hiring & Recruitment 52 7 8 3 70
Creative Output 31 18 8 3 61
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 19 17 53
Worker Turnover 11 12 3 26
Industry 1 1
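As a reading aid for the matrix above, the direction counts in a row can be converted into shares. The sketch below copies three rows from the table. The function name `direction_shares` is illustrative and not part of the site. Shares are computed against the sum of the four listed directions rather than the Total column, because in some rows (for example Other and Governance & Regulation) the Total exceeds that sum.

```python
# Rows copied from the matrix: (positive, negative, mixed, null).
ROWS = {
    "Firm Productivity": (435, 57, 88, 20),
    "Job Displacement": (12, 80, 20, 1),
    "Error Rate": (69, 92, 10, 2),
}

LABELS = ("positive", "negative", "mixed", "null")


def direction_shares(counts):
    """Return each direction's share of the four listed counts in one row."""
    total = sum(counts)
    return {label: n / total for label, n in zip(LABELS, counts)}


for outcome, counts in ROWS.items():
    shares = direction_shares(counts)
    print(f"{outcome}: {shares['negative']:.1%} of listed claims are negative")
```

Shares make rows of very different size comparable, which raw counts do not.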
Filter: Adoption
Agent-written code introduces more security vulnerabilities than code authored by humans.
Comparative analysis of security vulnerabilities attributed to agent-authored code versus human-authored code within the SWE-chat dataset (method details not specified in excerpt).
high negative SWE-chat: Coding Agent Interactions From Real Users in the W... security vulnerabilities introduced by agent-written code versus human-written c...
Just 44% of all agent-produced code survives into user commits.
Empirical measurement of code provenance and survival within the SWE-chat dataset: proportion of agent-produced code that becomes part of subsequent user commits across sessions.
high negative SWE-chat: Coding Agent Interactions From Real Users in the W... survival/usefulness of agent-produced code (proportion incorporated into commits...
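The survival metric above is a proportion, and one simple way to compute such a proportion is at the line level. The sketch below is only an illustration under that assumption. The excerpt does not say how SWE-chat attributes or matches code, and `survival_rate` is a hypothetical helper, not the dataset's method.

```python
def survival_rate(agent_lines, committed_lines):
    """Fraction of non-blank agent-written lines found verbatim in the commit.

    Assumption: exact match after stripping whitespace. A real provenance
    analysis would likely need fuzzier matching and per-session tracking.
    """
    committed = {line.strip() for line in committed_lines if line.strip()}
    produced = [line.strip() for line in agent_lines if line.strip()]
    if not produced:
        return 0.0
    survived = sum(1 for line in produced if line in committed)
    return survived / len(produced)


# Hypothetical example: three agent lines, one of which reaches the commit.
agent = ["def f(x):", "    return x + 1", "print(f(1))", ""]
commit = ["def f(x):", "    return x + 2"]
print(survival_rate(agent, commit))
```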
Despite rapidly improving capabilities, coding agents remain inefficient in natural settings.
Authors' summary claim supported by dataset-derived metrics such as agent code survival rate (44%) and user pushback (44% of turns); observational analysis of SWE-chat.
high negative SWE-chat: Coding Agent Interactions From Real Users in the W... overall agent efficiency in natural developer workflows (qualitative synthesis)
Regulated deployment imposes four load-bearing systems properties — deterministic replay, auditable rationale, multi-tenant isolation, statelessness for horizontal scale — and stateful architectures violate them by construction.
Conceptual/architectural argument presented in the paper (theoretical analysis), not an empirical measurement in the abstract.
high negative Stateless Decision Memory for Enterprise AI Agents compatibility of stateful architectures with regulatory/system properties
Evaluation of four leading AI platforms shows that standard RAG-based approaches achieve an average of only 15% accuracy when information is insufficient.
Empirical evaluation described in paper: four AI platforms tested on benchmark; reported average accuracy of 15% for RAG-based approaches on cases with insufficient information.
high negative Learning When Not to Decide: A Framework for Overcoming Fact... accuracy on cases where information is insufficient (inconclusive cases)
Unemployment insurance adjudication has seen rapid integration of AI systems, and the question of additional fact-finding poses the most significant bottleneck for a system that affects millions of applicants annually.
Contextual/introductory claim in paper; references to domain-scale impact and bottleneck; no specific numeric study sample provided in excerpt.
high negative Learning When Not to Decide: A Framework for Overcoming Fact... scale of impact (number of applicants affected) and fact-finding bottleneck in a...
A well-known limitation of AI systems is presumptuousness: the tendency of AI systems to provide confident answers when information may be lacking.
Statement in paper framing the problem; general literature/contextual claim (no specific experiment cited in the excerpt).
high negative Learning When Not to Decide: A Framework for Overcoming Fact... tendency to provide confident answers when information is lacking (presumptuousn...
Critical gaps persist in explainability, regulatory alignment, ethical governance, and context-specific validation.
Authors' synthesis and Conclusion listing persistent shortcomings identified across the reviewed literature.
high negative AI-Driven Financial Risk Management and Decision Intelligenc... presence of gaps in explainability, regulation, ethics, and validation
Integration of decision intelligence principles into AI applications for financial risk management in emerging markets is nascent.
Authors' synthesis noting limited presence of decision intelligence frameworks or hybrid human-AI decision processes across the reviewed literature.
high negative AI-Driven Financial Risk Management and Decision Intelligenc... degree of decision intelligence integration
There is limited empirical validation of AI approaches in emerging market settings.
Review finding described in Results and Conclusion: comparatively few studies provide robust, context-specific empirical validation for emerging markets despite general claims of effectiveness.
high negative AI-Driven Financial Risk Management and Decision Intelligenc... extent of empirical validation in emerging markets
Recent policy and academic discourse has increasingly acknowledged the infeasibility of fullstack AI sovereignty, but has not yet provided an integrating theoretical architecture for governing dependence under these conditions.
Literature/policy-discourse claim made in the paper (review/interpretation). No empirical sampling or quantitative evidence reported in the provided text.
high negative Digital Sovereignty in the Global Cognitive-Informational Or... feasibility of full technological autonomy (fullstack AI sovereignty) and the pr...
The concentration of AI-related infrastructures is coalescing into distinct geocognitive power poles whose competing infrastructural ecosystems generate structural asymmetries that position small and medium-sized states within regimes of cognitive-informational dependence.
Theoretical/geopolitical argument introduced in the paper (conceptual framing). No empirical sample size or quantitative measurement provided in the excerpt.
high negative Digital Sovereignty in the Global Cognitive-Informational Or... structural asymmetries and dependence of small and medium-sized states on domina...
There is a growing concentration of computational capacity, data ecosystems, and advanced model architectures within a limited number of technological actors, signaling the emergence of a cognitive-informational order in which influence is exercised through the architectures that shape how knowledge is generated, interpreted, and operationalized.
Theoretical/observational assertion in the paper (conceptual synthesis). No empirical details, sample sizes, or quantitative analyses provided in the supplied text.
high negative Digital Sovereignty in the Global Cognitive-Informational Or... concentration of technological capabilities and resulting influence over knowled...
The observed negative OPM effect is consistent with short-term 'J-curve' transition costs (process redesign and capability buildup) during early AI adoption.
Interpretation of empirical patterns (short-term decline in OPM concurrent with no ROA change) offered by the authors as an explanatory mechanism; not presented as separately estimated or experimentally tested.
high negative The Dynamic Causal Effects of Corporate AI Adoption on Profi... operating profit margin dynamics / transition costs interpretation
AI adoption had a significantly negative impact on the operating profit margin (OPM).
Causal analysis of KOSDAQ-listed companies (2018–2025) with AI-adoption timing identified via multi-step, contextually validated text analysis of DART business reports; endogeneity addressed using two-way fixed effects (TWFE) and Propensity Score Matching (PSM).
high negative The Dynamic Causal Effects of Corporate AI Adoption on Profi... operating profit margin (OPM)
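The method note above names two-way fixed effects (TWFE). For a balanced panel, the TWFE coefficient on a single regressor can be obtained by double-demeaning both variables (subtracting firm means and year means, then adding back the grand mean) and running a simple regression on the result. The sketch below shows only that mechanic. The paper's actual specification, controls, and PSM step are not given in the excerpt, and the panel values are invented for illustration.

```python
def demean_two_way(panel):
    """Remove unit (row) and period (column) means from a balanced panel."""
    n, t = len(panel), len(panel[0])
    row_means = [sum(row) / t for row in panel]
    col_means = [sum(panel[i][j] for i in range(n)) / n for j in range(t)]
    grand = sum(row_means) / n
    return [
        [panel[i][j] - row_means[i] - col_means[j] + grand for j in range(t)]
        for i in range(n)
    ]


def twfe_slope(y, d):
    """TWFE coefficient of outcome y on regressor d (rows: firms, cols: years)."""
    y_t, d_t = demean_two_way(y), demean_two_way(d)
    num = sum(a * b for ry, rd in zip(y_t, d_t) for a, b in zip(ry, rd))
    den = sum(b * b for rd in d_t for b in rd)
    return num / den


# Hypothetical staggered adoption: firm 0 adopts in year 2, firm 1 in year 3,
# firm 2 never adopts. The outcome is built with a true effect of -2.
d = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 0, 0, 0]]
firm_fx, year_fx = [10.0, 5.0, 7.0], [0.0, 1.0, 2.0, 3.0]
y = [[firm_fx[i] + year_fx[j] - 2.0 * d[i][j] for j in range(4)] for i in range(3)]
print(twfe_slope(y, d))
```

Because the firm and year effects are removed by the demeaning, the estimate depends only on within-firm, within-year variation in adoption.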
For agentic systems, there are three structural breaks: decision diffusion, evidence fragmentation, and responsibility ambiguity.
Analytical identification and labeling of three specific structural problems for agentic AI within the paper's argumentation.
high negative Governed Auditable Decisioning Under Uncertainty: Synthesis ... types of structural governance failures in agentic AI
The paper introduces the 'cascade of uncertainty', showing how governance failures propagate through serial dependencies between framework layers.
Conceptual/theoretical model introduced and analyzed in the paper (cascade model linking framework layers and failure propagation).
high negative Governed Auditable Decisioning Under Uncertainty: Synthesis ... propagation of governance failure/uncertainty across framework layers
Agentic AI systems encounter structural breaks that prevent normal framework fillability.
Paper's analytic assessment reports that agentic AI systems cause structural breaks undermining the framework's ability to fill DES-properties.
high negative Governed Auditable Decisioning Under Uncertainty: Synthesis ... framework fillability / governance evidence coverage in agentic systems
Classical ML systems achieve only minimal DES-property fillability.
Analytic comparison in the paper classifies classical ML systems as providing minimal governance evidence fillability.
When automated decision systems fail, organizations frequently discover that formally compliant governance infrastructure cannot reconstruct what happened or why.
Asserted by the paper as an observed problem motivating the study; presented as a general empirical/experiential claim (literature/examples synthesis) rather than a controlled empirical estimate.
high negative Governed Auditable Decisioning Under Uncertainty: Synthesis ... ability of governance infrastructure to reconstruct decisions (post-hoc explaina...
Training data scarcity is an emerging challenge for organizations that aim to train proprietary LLMs.
Paper highlights training data scarcity as a challenge in its analysis and discussion sections (qualitative observation).
high negative Buy Or Build? A Practitioner’s Framework for Large Language ... feasibility of training proprietary LLMs (availability of training data)
A gender gap persists, concentrated in the most exposed occupations.
Stratified/descriptive and regression analyses of the 2024 EWCS showing gender differences in self-reported generative AI adoption, with the gap largest among occupations with highest exposure; sample >36,600 workers across 35 countries.
high negative Generative AI at Work: From Exposure to Adoption across 35 E... self-reported adoption of generative AI by gender
AI is driving states to reconsider interdependence not as the source of peace, but as a battlefield of power.
Normative and interpretive conclusion drawn from the paper's analysis of AI's geopolitical implications; no empirical data or sample reported in the abstract.
high negative ARTIFICIAL INTELLIGENCE AND THE WEAPONIZATION OF ECONOMIC IN... states' strategic framing of interdependence (from peace-building to power conte...
AI is redefining foreign policy in a multipolar world by making the line between economic cooperation and strategic vulnerability indistinct.
Theoretical claim and synthesis in the paper's thesis; no empirical evidence or sample size provided in the abstract.
high negative ARTIFICIAL INTELLIGENCE AND THE WEAPONIZATION OF ECONOMIC IN... ambiguity between economic cooperation and strategic vulnerability in foreign po...
AI is reshaping economic relationships between countries that were previously sources of mutually beneficial relations into instruments of coercion.
The paper presents a theoretical analysis drawing on international political economy and foreign policy theory; no empirical measurements reported in the abstract.
high negative ARTIFICIAL INTELLIGENCE AND THE WEAPONIZATION OF ECONOMIC IN... transformation of international economic relationships from cooperation to coerc...
AI enhances the weaponization of economic interdependence by enabling states to monitor, predict, manipulate, and disrupt transnational networks with unprecedented accuracy.
The paper advances a theoretical argument and synthesis of international political economy and foreign policy literatures; no empirical sample or quantitative data reported in the abstract.
high negative ARTIFICIAL INTELLIGENCE AND THE WEAPONIZATION OF ECONOMIC IN... capacity to monitor, predict, manipulate, and disrupt transnational networks
The infrastructure for cross-user agent collaboration is entirely absent, let alone the governance mechanisms needed to secure it.
Authors' claim in the paper framing the research gap; presented as observational/argumentative (no empirical audit reported).
high negative ClawNet: Human-Symbiotic Agent Network for Cross-User Autono... availability of cross-user collaboration infrastructure and governance mechanism...
Current AI agent frameworks have made remarkable progress in automating individual tasks, yet all existing systems serve a single user.
Statement in paper's introduction/positioning; conceptual survey-style claim (no empirical study or systematic benchmark reported).
high negative ClawNet: Human-Symbiotic Agent Network for Cross-User Autono... automation scope (single-user vs multi-user)
Standard benchmarks often fail to isolate an agent's core ability to parse queries and orchestrate computations.
Paper asserts that existing/standard benchmarks do not adequately isolate parsing and computation-orchestration abilities, motivating the new benchmark.
high negative Time Series Augmented Generation for Financial Applications benchmark adequacy for isolating parsing/computation orchestration
As multimodal AI achieves human-parity understanding of speech and gesture, [the keyboard's] necessity dissolves.
Theoretical claim supported by multidisciplinary review (history, neuroscience, technology, organizational studies); no quantified empirical test reported.
high negative The Instrumental Dissolution of Typing: Why AI Challenges th... necessity/usage of keyboard as default input
General-purpose LLMs pose misinformation risks for development and policy experts because they lack the epistemic humility needed for verifiable outputs.
Conceptual/argumentative claim stated in the paper's motivation; no empirical test reported in the abstract.
high negative Learning from AVA: Early Lessons from a Curated and Trustwor... misinformation risk / epistemic humility
Traditional machine learning approaches, including the baseline methodology proposed in previous studies, typically optimize global predictive accuracy and therefore fail to capture business-critical outcomes, especially the identification of high-risk clients.
Conceptual critique and literature/contextual claim in the paper; contrasted with the study's business-aware methods (no direct external benchmarking numbers provided in the abstract).
high negative Advanced Insurance Risk Modeling for Pseudo-New Customers Us... identification of high-risk clients
Classifying customers without a prior history at a given company is particularly challenging due to the absence of historical behavior, extreme class imbalance, heavy-tailed loss distributions, and strict operational constraints.
Argumentation / problem statement in the paper (no empirical test reported); descriptive characterization of the insurance cold-start classification problem.
high negative Advanced Insurance Risk Modeling for Pseudo-New Customers Us... classification difficulty
Thin training coverage fosters anxiety about substitution and slows diffusion of AI tools.
Reported associations from surveys of mid-level managers and technical staff, interviews, and document analysis across cases; thematic coding identified links between limited training, worker anxiety, and slower diffusion. (Sample size not reported.)
high negative Overcoming Resistance to Change: Artificial Intelligence in ... worker anxiety and speed of diffusion/adoption
There exist inequalities in the emergence of algorithmic bias and in transparency of these systems.
Paper states that inequalities and lack of transparency were observed/identified (citing Memarian, 2023; Bello, 2023; Gambacorta et al., 2024) and discusses these as findings.
high negative A Machine Learning Perspective on FinTech-Driven Inclusion: ... inequalities related to algorithmic bias and transparency
Algorithmic bias in automated credit scoring systems may block marginalized groups from accessing financial services.
Explicit statement in the introduction citing prior literature (Agboola, 2025; Nwafor et al., 2024; Oguntibeju, 2024) and motivating the study.
high negative A Machine Learning Perspective on FinTech-Driven Inclusion: ... access to credit for marginalized groups
In the geographical network, both technological diversity and technological proximity inhibit main path formation, implying macro-regional evolution requires specialized focus and complementary knowledge.
ERGM results for the geographical diffusion layer showing negative (inhibitory) associations for diversity and proximity variables; interpreted in regional evolution context.
high negative Mapping China’s digital transformation: a multilayer network... effect of diversity and proximity on main path formation (geographical layer)
The study identified significant implementation challenges including algorithmic bias, digital divide concerns, data privacy risks, and low technology readiness among HR teams in Tier 2 cities.
Synthesis of qualitative case study findings from 4 organizations plus survey responses (N=150) reporting barriers and risks encountered during adoption.
high negative A Study on the Effectiveness of Technology-Driven Recruitmen... implementation challenges / risks
Current LLMs are unreliable delegates: they introduce sparse but severe errors that silently corrupt documents, compounding over long interaction.
Qualitative and quantitative analysis of errors observed across the DELEGATE-52 experiments (19 LLMs) showing sparse, high-severity, and silently introduced errors that accumulate over long workflows.
high negative LLMs Corrupt Your Documents When You Delegate error severity and silent corruption over time
Degradation severity is exacerbated by document size, length of interaction, or presence of distractor files.
Additional experiments and analyses varying document size, interaction length, and presence of distractor files reported in the paper showing increased degradation under these conditions.
high negative LLMs Corrupt Your Documents When You Delegate severity of document degradation / error rate
Agentic tool use does not improve performance on DELEGATE-52.
Additional experiments reported in the paper that compare plain LLM delegation vs. agentic tool-using configurations on DELEGATE-52 and find no performance improvement from agentic tool use.
high negative LLMs Corrupt Your Documents When You Delegate task performance on DELEGATE-52 (document quality/corruption)
Even frontier models (Gemini 3.1 Pro, Claude 4.6 Opus, GPT 5.4) corrupt an average of 25% of document content by the end of long workflows.
Reported results from the experiment evaluating 19 LLMs on DELEGATE-52; these named models are highlighted and an average corruption fraction (25%) is reported at the end of long workflows.
high negative LLMs Corrupt Your Documents When You Delegate proportion of document content corrupted
Our large-scale experiment with 19 LLMs reveals that current models degrade documents during delegation.
Large-scale experiment reported in the paper evaluating 19 LLMs on DELEGATE-52 long delegated workflows; observed document degradation across models.
high negative LLMs Corrupt Your Documents When You Delegate document degradation / output quality
Inherent algorithmic opacity and historical data biases tend to give rise to obvious group prejudices based on gender, educational background, age, and regional origin, thereby further exacerbating the structural inequalities that exist in the current employment market.
Claim made in abstract referencing known sources of algorithmic bias (opacity, historical data bias) and listing affected group attributes; presented as a problem motivating the study, without specific empirical statistics in the abstract.
high negative Job Search Game Under an Algorithmic Black Box: Generation o... group prejudice / structural inequalities in employment
Small and medium-sized practices face challenges of skill gaps and resource constraints that hinder adoption of technology and data analytics.
Consistent findings across included studies highlighting barriers in small and medium-sized practices (SMPs).
high negative The Use of Technology and Data Analytics in Modern Auditing:... ability to adopt and implement technology/data analytics
AI adoption is reinforcing existing structural disparities within the BRICS bloc, creating a two‑tier productivity hierarchy (China & India vs. Brazil, Russia & South Africa).
Observed divergence in TFP trajectories and differing links between AI indicators and TC/EC across the five BRICS economies; comparative analysis shows stronger frontier-shifting effects in China and India and weaker or negative effects in the other three economies.
high negative AI-driven productivity dynamics in BRICS economies: Evidence... Cross-country divergence in Total Factor Productivity (TFP) growth and its compo...
Brazil, Russia, and South Africa experience stagnation or decline in both efficiency and technological advancement over 2005–2023.
Malmquist TFP decomposition (EC and TC) for each BRICS economy showing flat or negative trends in EC and TC for Brazil, Russia, and South Africa during 2005–2023.
high negative AI-driven productivity dynamics in BRICS economies: Evidence... Efficiency Change (EC) and Technological Change (TC) components of the Malmquist...
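The Malmquist decomposition mentioned above splits TFP change between two periods into efficiency change (EC, catching up to the frontier) and technological change (TC, movement of the frontier itself), with TFP change = EC × TC. The sketch below uses the standard decomposition and assumes output-oriented distance functions, so values below 1 indicate decline. The paper's orientation and estimation method are not stated in the excerpt, and the input numbers are invented.

```python
from math import sqrt


def malmquist(d_t_t, d_t1_t1, d_t_t1, d_t1_t):
    """Return (EC, TC, TFP change) from four distance-function values.

    d_t_t:   D^t(x^t, y^t)           period-t data against the period-t frontier
    d_t1_t1: D^{t+1}(x^{t+1}, y^{t+1})
    d_t_t1:  D^t(x^{t+1}, y^{t+1})   period-(t+1) data against the period-t frontier
    d_t1_t:  D^{t+1}(x^t, y^t)       period-t data against the period-(t+1) frontier
    """
    ec = d_t1_t1 / d_t_t
    tc = sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    return ec, tc, ec * tc


# Hypothetical economy: efficiency unchanged, frontier moving backwards.
ec, tc, tfp = malmquist(d_t_t=0.8, d_t1_t1=0.8, d_t_t1=0.7, d_t1_t=0.9)
print(f"EC={ec:.3f} TC={tc:.3f} TFP change={tfp:.3f}")
```

A flat EC combined with TC below 1, as in this invented case, is the kind of pattern the claim describes as stagnation or decline.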
While achieving financial autonomy, firms are also exposed to new constraints as their reliance shifts to third-party software, technological infrastructures, and opaque algorithms (Gaviyau & Godi, 2025; Suhrab et al., 2026).
Stated with citations to Gaviyau & Godi (2025) and Suhrab et al. (2026); presented as an observed/paraphrased risk or unintended consequence in the paper. No empirical sample details in the excerpt.
high negative Re-Evaluation of Resource Dependence in AI Enabled SME Finan... increased reliance/dependency on third-party technology and opaque algorithms (n...
SMEs face various financial constraints and rely heavily on traditional financial institutions for their survival (Kadzima et al., 2025).
Statement supported by citation to Kadzima et al. (2025); presented as a literature-supported empirical generalization in the paper's background/introduction. No sample size or empirical details given in the excerpt.
high negative Re-Evaluation of Resource Dependence in AI Enabled SME Finan... financial constraints / reliance on traditional financial institutions
Fluency is not reliability: without structures that stabilise both human and model reasoning, AI cannot be trusted or governed where it matters most.
Central thesis/claim of the paper; normative argument synthesising the paper's observations and proposals rather than an empirically tested finding provided here.
high negative The Missing Knowledge Layer in AI: A Framework for Stable Hu... trustworthiness/governability of AI in high-stakes contexts