The Commonplace

Evidence (7953 claims)

Adoption: 5539 claims
Productivity: 4793 claims
Governance: 4333 claims
Human-AI Collaboration: 3326 claims
Labor Markets: 2657 claims
Innovation: 2510 claims
Org Design: 2469 claims
Skills & Training: 2017 claims
Inequality: 1378 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 402 112 67 480 1076
Governance & Regulation 402 192 122 62 790
Research Productivity 249 98 34 311 697
Organizational Efficiency 395 95 70 40 603
Technology Adoption Rate 321 126 73 39 564
Firm Productivity 306 39 70 12 432
Output Quality 256 66 25 28 375
AI Safety & Ethics 116 177 44 24 363
Market Structure 107 128 85 14 339
Decision Quality 177 76 38 20 315
Fiscal & Macroeconomic 89 58 33 22 209
Employment Level 77 34 80 9 202
Skill Acquisition 92 33 40 9 174
Innovation Output 120 12 23 12 168
Firm Revenue 98 34 22 154
Consumer Welfare 73 31 37 7 148
Task Allocation 84 16 33 7 140
Inequality Measures 25 77 32 5 139
Regulatory Compliance 54 63 13 3 133
Error Rate 44 51 6 101
Task Completion Time 88 5 4 3 100
Training Effectiveness 58 12 12 16 99
Worker Satisfaction 47 32 11 7 97
Wages & Compensation 53 15 20 5 93
Team Performance 47 12 15 7 82
Automation Exposure 24 22 9 6 62
Job Displacement 6 38 13 57
Hiring & Recruitment 41 4 6 3 54
Developer Productivity 34 4 3 1 42
Social Protection 22 10 6 2 40
Creative Output 16 7 5 1 29
Labor Share of Income 12 5 9 26
Skill Obsolescence 3 20 2 25
Worker Turnover 10 12 3 25
JobMatchAI optimizes utility across skill fit, experience, location, salary, and company preferences.
Paper claims the system's objective/utility function includes these factors and that the reranking/optimization accounts for them. No optimization algorithm details, weighting, or empirical utility gains are given in the excerpt.
medium positive JobMatchAI An Intelligent Job Matching Platform Using Knowle... aggregate utility across factors: skill fit, experience, location, salary, compa...
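The excerpt gives no weighting or optimization details, so the following is only a minimal sketch of what an aggregate utility over the five named factors could look like, assuming a simple weighted sum. The weights, the 0-to-1 score scale, and the names `WEIGHTS`, `utility`, and `rerank` are all hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: weighted-sum utility over the five factors named in the claim.
# The weights below are illustrative assumptions, NOT values reported by the paper.
FACTORS = ("skill_fit", "experience", "location", "salary", "company_preference")

WEIGHTS = {
    "skill_fit": 0.40,
    "experience": 0.20,
    "location": 0.15,
    "salary": 0.15,
    "company_preference": 0.10,
}


def utility(scores: dict, weights: dict = WEIGHTS) -> float:
    """Aggregate per-factor scores (assumed to lie in [0, 1]) into one number."""
    return sum(weights[factor] * scores[factor] for factor in FACTORS)


def rerank(candidates: list) -> list:
    """Return candidates sorted by descending utility."""
    return sorted(candidates, key=lambda c: utility(c["scores"]), reverse=True)


# Made-up example postings.
jobs = [
    {"title": "A", "scores": {"skill_fit": 0.9, "experience": 0.5, "location": 0.2,
                              "salary": 0.6, "company_preference": 0.4}},
    {"title": "B", "scores": {"skill_fit": 0.6, "experience": 0.8, "location": 0.9,
                              "salary": 0.7, "company_preference": 0.5}},
]
print([job["title"] for job in rerank(jobs)])  # ['B', 'A']
```

With these assumed weights, posting B (utility about 0.69) outranks posting A (about 0.62) despite A's higher skill fit, which is the kind of trade-off a multi-factor objective is meant to capture.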
JobMatchAI is production-ready.
Paper explicitly describes JobMatchAI as "production-ready" and also claims a hosted website and installable package (artifacts consistent with deployment readiness). No formal certification, deployment metrics, or uptime/performance SLAs are provided in the excerpt.
medium positive JobMatchAI An Intelligent Job Matching Platform Using Knowle... production readiness (availability of deployable artifacts such as hosted site a...
For AI agent tool design, surfacing contextual information outperforms prescribing procedural workflows.
Authors' conclusion drawn from the suite of experiments (GraphRAG vs TDD prompting vs auto-improvement) showing better regression reduction and/or resolution when contextual information is surfaced.
medium positive TDAD: Test-Driven Agentic Development - Reducing Code Regres... effectiveness in reducing regressions and improving resolution when using contex...
An autonomous auto-improvement loop raised resolution from 12% to 60% on a 10-instance subset with 0% regression.
Reported experiment on a 10-instance subset where an auto-improvement loop was applied (numbers provided in the excerpt).
medium positive TDAD: Test-Driven Agentic Development - Reducing Code Regres... resolution rate (increase from 12% to 60%) and regression rate (reported as 0%) ...
Smaller models benefit more from contextual information (which tests to verify) than from procedural instructions (how to do TDD).
Inferred from comparative results across models (Qwen3-Coder 30B vs Qwen3.5-35B-A3B) and interventions (contextual test-surfacing vs TDD prompting) reported in the paper.
medium positive TDAD: Test-Driven Agentic Development - Reducing Code Regres... relative improvement in regression rate and resolution when providing contextual...
When deployed as an agent skill, GraphRAG improved resolution from 24% to 32%.
Empirical comparison reported in the evaluation on SWE-bench Verified (same experimental context as above).
medium positive TDAD: Test-Driven Agentic Development - Reducing Code Regres... resolution rate (percentage of issues/problems resolved)
TDAD's GraphRAG workflow reduced test-level regressions by 70% (from 6.08% to 1.82%).
Empirical result reported from the SWE-bench Verified evaluation using the GraphRAG workflow (sample details: Qwen3-Coder 30B on 100 instances and Qwen3.5-35B-A3B on 25 instances as reported).
medium positive TDAD: Test-Driven Agentic Development - Reducing Code Regres... test-level regression rate (percentage of tests that regressed)
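The 70% figure is a relative reduction, not a percentage-point change. A short check of the arithmetic, using only the two rates quoted in the claim (the function name `relative_reduction` is illustrative):

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from `before` to `after` (0.25 means 25% lower)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Test-level regression rates quoted in the claim, in percent.
rate_before = 6.08
rate_after = 1.82

drop = relative_reduction(rate_before, rate_after)
print(f"relative reduction: {drop:.1%}")  # relative reduction: 70.1%
print(f"absolute change: {rate_before - rate_after:.2f} percentage points")  # 4.26
```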
Partial validation against observed AIS vessel behavior shows PIER is consistent with the fastest real transits while exhibiting 23.1× lower variance.
Comparison between PIER trajectories and observed fastest transits in AIS data (details in paper); reported relative variance reduction of 23.1×.
medium positive Physics-informed offline reinforcement learning eliminates c... variance of transit times or fuel use compared to fastest observed AIS transits
PIER eliminates catastrophic fuel waste: great-circle routing produces extreme fuel consumption (>1.5× median) in 4.8% of voyages, while PIER reduces this to 0.5% (a 9-fold reduction).
Analysis on the same 2023 AIS validation dataset across seven Gulf of Mexico routes (840 episodes per method) comparing distribution tails of voyage fuel consumption; reported incidence rates (4.8% vs 0.5%).
medium positive Physics-informed offline reinforcement learning eliminates c... fraction of voyages with fuel consumption >1.5× median
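The tail metric in this claim (share of voyages whose fuel use exceeds 1.5× the median) can be sketched as follows. The sample values are made up for illustration and are not the paper's data; `tail_fraction` is a hypothetical helper name.

```python
from statistics import median


def tail_fraction(fuel_per_voyage, factor=1.5):
    """Share of voyages whose fuel use exceeds `factor` times the median."""
    threshold = factor * median(fuel_per_voyage)
    return sum(fuel > threshold for fuel in fuel_per_voyage) / len(fuel_per_voyage)


# Made-up fuel figures: nine ordinary voyages and one extreme outlier.
sample = [100, 102, 98, 101, 99, 100, 103, 97, 100, 180]
print(tail_fraction(sample))  # 0.1

# Ratio of the two incidence rates quoted in the claim (in percent).
print(4.8 / 0.5)  # 9.6
```

Using the rounded rates as quoted, the ratio is about 9.6, consistent with the reported 9-fold reduction.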
PIER reduces mean CO2 emissions by 10% relative to great-circle routing.
Offline evaluation using physics‑calibrated environments grounded in historical AIS data and ocean reanalysis products; validation on one full year (2023) of AIS across seven Gulf of Mexico routes with 840 episodes per method; reported mean reduction of 10% and bootstrap 95% CI for mean savings [2.9%, 15.7%].
medium positive Physics-informed offline reinforcement learning eliminates c... mean CO2 emissions per voyage (percent reduction vs great-circle routing)
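The excerpt reports a bootstrap 95% CI but does not describe the resampling scheme. The sketch below assumes a plain percentile bootstrap over per-voyage savings and runs on synthetic numbers, not the paper's data; `bootstrap_ci_mean` is a hypothetical helper.

```python
import random


def bootstrap_ci_mean(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    resampled_means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return resampled_means[lower_idx], resampled_means[upper_idx]


# Synthetic per-voyage savings in percent (NOT the paper's data).
data_rng = random.Random(42)
savings = [data_rng.gauss(10.0, 20.0) for _ in range(840)]

low, high = bootstrap_ci_mean(savings)
sample_mean = sum(savings) / len(savings)
print(f"mean = {sample_mean:.1f}%, 95% CI = [{low:.1f}%, {high:.1f}%]")
```

An interval whose lower bound stays above zero, as in the reported [2.9%, 15.7%], indicates that the mean saving is unlikely to be a resampling artifact, although it says nothing about bias in the underlying simulation.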
The system is in production at Personize.ai.
Deployment statement in the paper asserting production use at Personize.ai.
medium positive Governed Memory: A Production Architecture for Multi-Agent W... deployment status (production at Personize.ai)
The LoCoMo result confirms that governance and schema enforcement impose no retrieval quality penalty.
Interpretation in the paper linking LoCoMo benchmark accuracy (74.8%) to the conclusion that governance/schema enforcement did not degrade retrieval quality.
medium positive Governed Memory: A Production Architecture for Multi-Agent W... inferred retrieval quality impact of governance/schema enforcement (no penalty)
Governed Memory implements a closed-loop schema lifecycle with AI-assisted authoring and automated per-property refinement.
Design description in the paper describing the closed-loop schema lifecycle and AI-assisted authoring/refinement.
medium positive Governed Memory: A Production Architecture for Multi-Agent W... schema lifecycle process including AI-assisted authoring and per-property refine...
Governed Memory uses reflection-bounded retrieval with entity-scoped isolation.
Design description in the paper specifying reflection-bounded retrieval and entity-scoped isolation.
medium positive Governed Memory: A Production Architecture for Multi-Agent W... retrieval strategy (reflection-bounded) and isolation scope (entity-scoped)
Governed Memory uses tiered governance routing with progressive context delivery.
Design description in the paper listing tiered governance routing and progressive delivery as mechanisms.
medium positive Governed Memory: A Production Architecture for Multi-Agent W... governance routing strategy (tiered) and context delivery method (progressive)
Governed Memory implements a dual memory model combining open-set atomic facts with schema-enforced typed properties.
Design specification within the paper describing the dual memory model (architectural mechanism).
medium positive Governed Memory: A Production Architecture for Multi-Agent W... memory model design: open-set atomic facts + schema-enforced typed properties
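As a rough illustration of the dual memory idea, the sketch below pairs a free-form list of atomic facts with properties that are checked against a schema before being stored. The class name, the schema contents, and the validation rules are all assumptions made for illustration; the paper's actual data model is not given in the excerpt.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical schema: property name -> required Python type.
SCHEMA = {"company_size": int, "industry": str}


@dataclass
class EntityMemory:
    """Memory for one entity: open-set facts plus schema-checked properties."""
    entity_id: str
    facts: list = field(default_factory=list)       # open-set atomic facts
    properties: dict = field(default_factory=dict)  # schema-enforced values

    def add_fact(self, text: str) -> None:
        # Facts are free text, so nothing is validated here.
        self.facts.append(text)

    def set_property(self, name: str, value: Any) -> None:
        # Properties must be declared in the schema and match its type.
        expected = SCHEMA.get(name)
        if expected is None:
            raise KeyError(f"unknown property: {name}")
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")
        self.properties[name] = value


memory = EntityMemory("acme")
memory.add_fact("Mentioned a planned expansion during the last call.")
memory.set_property("company_size", 120)
print(memory.properties)  # {'company_size': 120}
```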
The paper presents Governed Memory, a shared memory and governance layer addressing the memory governance gap.
System architecture and design description in the paper (proposal of a shared memory and governance layer).
medium positive Governed Memory: A Production Architecture for Multi-Agent W... existence of an architecture called Governed Memory
The results confirm the positive impact of cognitive technologies on the development of entrepreneurial opportunities and innovative activity.
Conclusion drawn from the positive estimated association (0.33 coefficient) and the observed increases in the indices between 2020 and 2024 reported in the paper.
medium positive Innovative Cognitive Tools for Studying Market Opportunities... entrepreneurial opportunities and innovation activity (proxied by the Market Opp...
The Cognitive Tools Index and the Market Opportunity Index were -0.42 and -0.35 in 2020 and 0.94 and 0.92 in 2024, respectively.
Reported observed/computed index values for the years 2020 and 2024 in the study (data source and aggregation method not detailed in the excerpt).
medium positive Innovative Cognitive Tools for Studying Market Opportunities... Cognitive Tools Index and Market Opportunity Index (yearly values for 2020 and 2...
The empirical study for 2020–2024 showed that a one standard unit increase in the Cognitive Tools Index is associated with an average 0.33 increase in the Market Opportunity Index.
Estimated coefficient reported from the panel econometric model over 2020–2024 (model included lags and used instrumental approach; sample size and standard errors not provided in the excerpt).
medium positive Innovative Cognitive Tools for Studying Market Opportunities... Market Opportunity Index (effect of one standard unit change in Cognitive Tools ...
Pidgin significantly outperformed standard English on measures of knowledge transfer across agriculture, education, and health domains.
Aggregate analysis of questionnaire comprehension items (44-item instrument) across domain-specific modules administered to 45 participants; comparative language-performance results reported in study.
medium positive From Linguistic Hybridity to Development Sovereignty: Pidgin... domain-specific comprehension / knowledge transfer
Volunteers who used proverbs and vernacular registers were incorporated into local kinship structures, granted traditional titles, and perceived as legitimate development actors rather than outsiders.
Qualitative evidence from participant observation and discourse samples collected during fieldwork; interview and questionnaire items on perceptions of volunteer legitimacy and social integration.
medium positive From Linguistic Hybridity to Development Sovereignty: Pidgin... social integration indicators (kinship incorporation, traditional titles, percei...
Agricultural techniques taught in Pidgin were nearly universally adopted by recipients.
Self-reported adoption/behavior-change items in the 44-item questionnaire and corroborating qualitative observation of agricultural practice among participants in the sample (N = 45).
medium positive From Linguistic Hybridity to Development Sovereignty: Pidgin... adoption of agricultural innovations / reported behavior change
Pidgin-mediated interventions achieved large comprehension gains on health messaging, exceeding 30 percentage points compared with standard English.
Quantitative comparison derived from the 44-item field questionnaire (comprehension items) administered to the 45-participant sample; reported percentage-point difference (>30 pp) in health-message comprehension by language of instruction.
medium positive From Linguistic Hybridity to Development Sovereignty: Pidgin... health-message comprehension (percentage-point gain)
Using Cameroon Pidgin English as the primary medium for Peace Corps development work produced substantially better knowledge transfer, uptake, and social legitimacy than standard English.
Mixed-methods field study of Peace Corps interventions in Cameroon's Northwest: 44-item questionnaire administered to 45 participants across agriculture, education, and health; quantitative measures of comprehension and self-reported adoption; supplemented by qualitative observation and discourse samples.
medium positive From Linguistic Hybridity to Development Sovereignty: Pidgin... knowledge transfer (comprehension), behavioral uptake/adoption, social legitimac...
A hybrid strategic–computational framework, supported by governance mechanisms (human-in-the-loop checkpoints, escalation paths, accountability structures), is proposed to manage tensions and ensure responsible decision-making in AI-rich managerial contexts.
Synthesis-driven prescriptive framework produced by cross-framework analysis; conceptual recommendation rather than implementation evidence.
medium positive Comparative analysis of strategic vs. computational thinking... presence and effectiveness of hybrid governance mechanisms in managing human–alg...
Roles oriented to information processing, optimisation, and operational precision (monitor, disseminator, resource allocator) are substantially enhanced by computational thinking (automation, optimisation, algorithmic decision-support).
Theoretical mapping of computational capabilities onto Mintzberg’s information-processing roles; conceptual reasoning without empirical validation.
medium positive Comparative analysis of strategic vs. computational thinking... enhancement in information-processing tasks (accuracy, speed, automation potenti...
AI adoption will shift fact-checking tasks (more monitoring, less rote verification), creating a need for reskilling and new roles (AI tool operators, analysts); donor and public investments should fund capacity building for local organizations.
Workforce implications inferred from interview reports about changing task mixes and the study's interpretive recommendations.
medium positive Fact-Checking Platforms in the Middle East: A Comparative St... changes in task allocation, workforce skills, and need for capacity-building inv...
Investments should prioritize hybrid models where automation provides scale and humans handle contextual, adversarial, and legally sensitive judgments.
Recommendation based on interview findings about AI benefits and limitations and the study's interpretive synthesis.
medium positive Fact-Checking Platforms in the Middle East: A Comparative St... verification effectiveness and error mitigation in workflows
The study distills context-sensitive best practices for fact-checking in restrictive environments, including safety protocols, local partnerships, and hybrid verification workflows.
Synthesis of findings from document analysis and interviews producing a set of recommended practices documented in the study's outputs.
medium positive Fact-Checking Platforms in the Middle East: A Comparative St... recommended operational practices for safety and verification effectiveness
AI can lower verification costs and scale reach by automating tasks such as classification, clustering, alerting, and translation.
Interview reports from platform staff and interpretive analysis identifying AI-assisted use cases for prioritization, monitoring, and translation.
medium positive Fact-Checking Platforms in the Middle East: A Comparative St... verification cost/time and monitoring/translation capacity
Community reporting and audience-focused formats are used to improve engagement.
Platform outputs and staff interviews describing deployment of community-reporting mechanisms and tailored audience formats.
Platforms form partnerships with media outlets, academic institutions, and civil-society actors to amplify reach and secure data.
Interview accounts and organizational documents describing cross-sector partnerships and collaboration arrangements.
medium positive Fact-Checking Platforms in the Middle East: A Comparative St... audience reach and data access through partnerships
Transparent workflows and clear labeling are used to build credibility with audiences.
Document analysis of platform outputs and guidelines showing explicit workflow transparency and labeling practices, supported by interview statements.
medium positive Fact-Checking Platforms in the Middle East: A Comparative St... audience perceptions of credibility/trust
Platforms emphasize local-language expertise and culturally grounded sourcing as a strategy to improve verification and credibility.
Observed practices and platform guidelines derived from document analysis and staff interviews describing the use of local-language expertise and sourcing.
medium positive Fact-Checking Platforms in the Middle East: A Comparative St... verification quality and perceived credibility
Practical policy recommendation: require transparent documentation and third‑party auditing for high‑impact LLM deployments and subsidize public‑interest evaluation infrastructure.
Policy prescription supported by the paper's normative and economic analysis; no pilot implementation or empirical evaluation of the recommendation is provided.
medium positive LLM Alignment should go beyond Harmlessness–Helpfulness and ... policy adoption rates for documentation/auditing requirements and availability o...
Policy levers that can address alignment externalities include disclosure requirements (data provenance, evaluation practices), mandatory participatory evaluation for high‑impact systems, standards for auditing, procurement rules favoring participatory transparency, and liability/certification regimes.
Policy recommendation based on economic and governance reasoning and synthesis of prior regulatory proposals; no policy pilot data or impact evaluation is reported.
medium positive LLM Alignment should go beyond Harmlessness–Helpfulness and ... adoption of listed policy levers and subsequent changes in alignment-related out...
Economics research should develop multi‑dimensional metrics capturing welfare, distributional impacts, and autonomy rather than relying on single aggregate accuracy or safety scores.
Prescriptive recommendation grounded in critique of current benchmarking practices and theoretical desiderata; no new metric is empirically validated in the paper.
medium positive LLM Alignment should go beyond Harmlessness–Helpfulness and ... availability and adoption of multi‑dimensional metrics for welfare, distribution...
Dynamic constraints (continuous monitoring, feedback loops, and configurable safety settings that adapt post‑deployment) are preferable to static pre‑deployment-only safety fixes.
Conceptual argument and synthesis of deployment experience and monitoring literature; suggestions for operational tooling and monitoring rather than empirical evaluation.
medium positive LLM Alignment should go beyond Harmlessness–Helpfulness and ... responsiveness and adaptivity of safety mechanisms post‑deployment; reduction in...
Participatory governance, which includes varied stakeholders such as users, affected communities, domain experts, and regulators in design, evaluation, and deployment decisions, will improve alignment outcomes and legitimacy.
Theoretical and normative argument citing participatory design literature and ethical governance scholarship; paper offers procedural recommendations but no empirical trial of governance models.
medium positive LLM Alignment should go beyond Harmlessness–Helpfulness and ... stakeholder inclusion in governance processes and perceived legitimacy/effective...
Alignment should shift from static, post‑training constraints (one‑off fixes like safety filters or RLHF alone) to dynamic, participatory systems that explicitly protect pluralism, autonomy, and justice.
Normative argument and conceptual synthesis drawing on literature in AI safety, value alignment, and participatory design; prescriptive reasoning rather than original empirical results.
medium positive LLM Alignment should go beyond Harmlessness–Helpfulness and ... degree to which alignment processes protect pluralism, autonomy, and justice in ...
Investment choices in collaboration AI and digital infrastructure become central strategic decisions affecting firms' comparative advantage.
Management literature synthesis and illustrative multinational cases; argument is conceptual without firm‑level comparative empirical data presented in the paper.
medium positive The Sociology of Remote Work and Organisational Culture: How... firm comparative advantage; strategic investment in AI/digital infrastructure
AI collaboration tools (virtual assistants, meeting summarizers, asynchronous platforms) complement hybrid work by reducing coordination costs and supporting dispersed teamwork.
Conceptual integration of technology and organizational literature; supported by illustrative case examples of multinational organizations but not by new quantitative causal evidence.
medium positive The Sociology of Remote Work and Organisational Culture: How... coordination costs; dispersed teamwork effectiveness
Hybrid and remote work increase employee autonomy and work–life integration.
Conceptual synthesis of sociological and management literatures; supported by secondary data and illustrative case studies from multinational organizations. No primary quantitative analysis or sample size reported—based on comparative case illustrations and theoretical integration.
medium positive The Sociology of Remote Work and Organisational Culture: How... employee autonomy; work–life integration
Tariff reductions and expanded supply channels following CAFTA contributed as secondary channels to increased third‑country agricultural imports.
Paper documents tariff changes and supply‑channel expansion as part of mechanism analysis; difference-in-differences (DID) and mediator tests link tariff reductions and expanded channels to import outcomes.
medium positive How regional trade policy uncertainty affects agricultural i... tariff rates and measures of available supply channels (e.g., number of source m...
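For readers unfamiliar with the difference-in-differences (DID) design these CAFTA claims rely on, the basic two-group, two-period estimator can be sketched as below. The numbers are invented, and the paper's actual specification (controls, fixed effects, definition of treated and control groups) is not described in the excerpt.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two DID: change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Invented import values for illustration only.
treated_pre, treated_post = [9.0, 11.0], [15.0, 17.0]
control_pre, control_post = [8.0, 10.0], [10.0, 12.0]

print(did_estimate(treated_pre, treated_post, control_pre, control_post))  # 4.0
```

The estimate attributes to the policy only the part of the treated group's change that exceeds the control group's change, which is valid only if both groups would have followed parallel trends without the policy.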
CAFTA reduced logistics and service frictions (e.g., storage, logistics performance) relevant to agricultural imports.
Secondary channel analysis using logistics/storage indicators and related service frictions available in the data; assessed as mediators in the DID framework.
medium positive How regional trade policy uncertainty affects agricultural i... logistics/service friction indicators (storage capacity/use, logistics performan...
CAFTA widened China's trading‑partner and product diversity in agricultural imports, increasing both partner and product variety from third countries.
DID estimates on partner and product diversity metrics constructed from customs import records (2000–2014); reported changes in diversity as outcomes in the paper.
medium positive How regional trade policy uncertainty affects agricultural i... trading‑partner diversity (number of partners) and product diversity (number of ...
A complementary‑products linkage effect is a key mechanism: expanded channels and product complementarities make sourcing non‑ASEAN goods easier and more attractive.
Mechanism analysis using product‑level and partner‑level import data (China Customs) showing increased imports of complementary products and linkages consistent with this channel in DID estimates.
medium positive How regional trade policy uncertainty affects agricultural i... imports of complementary products and cross‑product linkage indicators (product ...
The primary spillover mechanism is a 'low‑cost import experience' effect: cheaper/consistent regional sourcing lowers firms' marginal cost of engaging additional foreign suppliers, encouraging imports from third countries.
Mechanism tests using mediator variables (cost/procurement indicators) within the DID framework and firm‑level data; reported as the main channel in the paper's analysis.
medium positive How regional trade policy uncertainty affects agricultural i... import uptake from third countries attributable to reductions in procurement/mar...
A new market will emerge for controls, certification, attestations, secure toolchains, and audited model deployments; compliance costs will shape comparative advantages among firms and countries.
Policy-market synthesis and analogies to certification markets in other regulated tech domains (qualitative).
medium positive Highly Autonomous Cyber-Capable Agents: Anticipating Capabil... size and growth of market for certification/compliance services and distribution...