The Commonplace
Home Dashboard Papers Evidence Digests 🎲

Evidence (5539 claims)

Adoption
5539 claims
Productivity
4793 claims
Governance
4333 claims
Human-AI Collaboration
3326 claims
Labor Markets
2657 claims
Innovation
2510 claims
Org Design
2469 claims
Skills & Training
2017 claims
Inequality
1378 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 402 112 67 480 1076
Governance & Regulation 402 192 122 62 790
Research Productivity 249 98 34 311 697
Organizational Efficiency 395 95 70 40 603
Technology Adoption Rate 321 126 73 39 564
Firm Productivity 306 39 70 12 432
Output Quality 256 66 25 28 375
AI Safety & Ethics 116 177 44 24 363
Market Structure 107 128 85 14 339
Decision Quality 177 76 38 20 315
Fiscal & Macroeconomic 89 58 33 22 209
Employment Level 77 34 80 9 202
Skill Acquisition 92 33 40 9 174
Innovation Output 120 12 23 12 168
Firm Revenue 98 34 22 154
Consumer Welfare 73 31 37 7 148
Task Allocation 84 16 33 7 140
Inequality Measures 25 77 32 5 139
Regulatory Compliance 54 63 13 3 133
Error Rate 44 51 6 101
Task Completion Time 88 5 4 3 100
Training Effectiveness 58 12 12 16 99
Worker Satisfaction 47 32 11 7 97
Wages & Compensation 53 15 20 5 93
Team Performance 47 12 15 7 82
Automation Exposure 24 22 9 6 62
Job Displacement 6 38 13 57
Hiring & Recruitment 41 4 6 3 54
Developer Productivity 34 4 3 1 42
Social Protection 22 10 6 2 40
Creative Output 16 7 5 1 29
Labor Share of Income 12 5 9 26
Skill Obsolescence 3 20 2 25
Worker Turnover 10 12 3 25
Clear
Adoption Remove filter
Algorithmic credit systems are positively associated with repayment behavior.
Multiple regression results reported in the study indicate a positive association between use of algorithmic credit systems and repayment behavior based on cross-sectional survey of 400 users.
Measurement reliability and validity were established through Cronbach's alpha and principal component analysis.
Paper states that Cronbach’s alpha and principal component analysis (PCA) were used to establish measurement reliability and validity.
high positive Architecting financial well-being in algorithmic credit syst... measurement reliability/validity
The study used a quantitative, explanatory, cross-sectional design and employed multiple regression and moderation analyses to assess relationships among algorithmic credit systems, human capability, institutional design, and financial-wellbeing outcomes.
Methods described explicitly: quantitative explanatory cross-sectional design; analytical methods named as multiple regression and moderation analyses.
high positive Architecting financial well-being in algorithmic credit syst... research design / analytic methods
Data were collected from 400 users of algorithmic and digitally mediated credit platforms.
Study reports a quantitative, explanatory, cross-sectional survey of users; sample size explicitly stated as 400.
high positive Architecting financial well-being in algorithmic credit syst... sample_size / data source
Empirical simulations of five game scenarios (ranging from repeated prisoner's dilemma to stylized repeated marketing promotion games) validate the theoretical predictions: AI agents naturally exhibit the proposed reasoning patterns and attain stable equilibrium behaviors intrinsically.
Simulation experiments reported in the paper across five distinct game scenarios; these simulations are presented as empirical validation of the theoretical results.
high positive Reasonably reasoning AI agents can avoid game-theoretic fail... frequency/occurrence of stable equilibrium behaviors (Nash-like play) in simulat...
Relaxing the common-knowledge payoff assumption—allowing stage payoffs to be unknown and each agent to observe only its own privately realized stochastic payoffs—still yields the same on-path Nash convergence guarantee.
Theoretical extension/proof in the paper showing convergence results hold under private, stochastic stage payoffs (no common-knowledge of payoffs).
high positive Reasonably reasoning AI agents can avoid game-theoretic fail... on-path Nash convergence under private, stochastic payoffs
We prove that 'reasonably reasoning' agents—agents capable of forming beliefs about others' strategies from previous observation and learning to best respond to these beliefs—eventually behave along almost every realized play path in a way that is weakly close to a Nash equilibrium of the continuation game.
Formal theoretical proof provided in the paper (mathematical analysis of agent belief-formation and best-response learning leading to on-path closeness to Nash equilibria).
high positive Reasonably reasoning AI agents can avoid game-theoretic fail... on-path proximity (weak closeness) to Nash equilibrium of the continuation game
Off-the-shelf reasoning AI agents can achieve Nash-like play zero-shot, without explicit post-training.
Stated claim in the paper supported by a combination of theoretical results (formal proofs about convergence properties of 'reasonably reasoning' agents) and empirical simulations across five game scenarios (including repeated prisoner's dilemma and stylized repeated marketing promotion games).
high positive Reasonably reasoning AI agents can avoid game-theoretic fail... attainment of Nash-like play / strategic equilibrium (zero-shot)
End-to-end verified pipelines can produce provably correct code from informal specifications.
The paper surveys early research demonstrating pipelines that go from informal specifications to formally verified code; the provided text does not include experimental sample sizes or benchmarks.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... provable correctness of generated code
AI-generated postconditions catch real-world bugs missed by prior methods.
Surveyed early research asserted by the paper indicating empirical instances where AI-generated postconditions found bugs that other methods missed; no numeric details provided in the excerpt.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... bugs detected / error detection rate
Interactive test-driven formalization improves program correctness.
Paper surveys early research that reportedly demonstrates this effect (described as 'interactive test-driven formalization that improves program correctness'); the excerpt does not include specific study details or sample sizes.
The central bottleneck is validating specifications: since there is no oracle for specification correctness other than the user, we need semi-automated metrics that can assess specification quality with or without code, through lightweight user interaction and proxy artifacts such as tests.
Analytical claim and research agenda item in the paper; motivates need for new metrics and interaction designs. No empirical validation or sample size reported in the excerpt.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... ability to validate specification correctness / specification quality
Intent formalization offers a tradeoff spectrum suitable to the reliability needs of different contexts: from lightweight tests that disambiguate likely misinterpretations, through full functional specifications for formal verification, to domain-specific languages from which correct code is synthesized automatically.
Conceptual framework proposed in the paper describing a spectrum of specification formality; presented as an argument rather than an empirical finding, with no sample sizes provided in the excerpt.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... suitability of specification approaches for reliability requirements
Intent formalization — translating informal user intent into checkable formal specifications — is the key challenge that will determine whether AI makes software more reliable or merely more abundant.
Normative argument presented by the authors as the central thesis of the paper; no empirical study or sample size cited in the provided text.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... software reliability (correctness relative to user intent)
Agentic AI systems can now generate code with remarkable fluency.
Authoritative assertion in the paper based on contemporary observations of large code-generating models; no empirical sample size or benchmark numbers reported in the text provided.
high positive Intent Formalization: A Grand Challenge for Reliable Coding ... code generation fluency / ability to produce code
The paper proposes design principles for effective, accountable, and adaptive sandboxes to contribute to debates on experimentalism in AI governance.
Stated contribution of the paper (descriptive claim about content; abstract does not list the principles or empirical testing).
high positive Experimentalism beyond ex ante regulation: A law and economi... existence and articulation of design principles for RSs
Regulatory sandboxes (RSs) have emerged as a potential solution to AI regulatory challenges.
Descriptive observation and normative framing within the paper; contextual reference to the EU AI Act's treatment of sandboxes (no empirical sample reported in the abstract).
high positive Experimentalism beyond ex ante regulation: A law and economi... adoption/emergence of RSs as a governance mechanism for AI
The authors provide a demo video, a hosted website, and an installable package demonstrating JobMatchAI.
Paper explicitly states availability of a demo video, a hosted website, and an installable package. No links, access dates, or artifact verification details are provided in the excerpt.
high positive JobMatchAI An Intelligent Job Matching Platform Using Knowle... availability of demonstration artifacts (video, hosted website, installable pack...
The authors provide a hybrid retrieval stack combining BM25, a skill knowledge graph, and semantic components to evaluate skill generalization.
Paper describes a hybrid retrieval stack composed of BM25, a knowledge graph, and semantic retrieval components intended for evaluation of skill generalization. No evaluation metrics or comparisons are included in the excerpt.
high positive JobMatchAI An Intelligent Job Matching Platform Using Knowle... retrieval stack composition (BM25 + knowledge graph + semantic components) inten...
The authors release JobSearch-XS benchmark.
Paper explicitly states release of the JobSearch-XS benchmark. No dataset size, annotation protocol, or access URL provided in the excerpt.
high positive JobMatchAI An Intelligent Job Matching Platform Using Knowle... availability of JobSearch-XS benchmark (artifact release)
JobMatchAI integrates Transformer embeddings, skill knowledge graphs, and interpretable reranking.
Statement in paper describing system architecture and components (implementation claim). No quantitative implementation details or component-level ablation results provided in the supplied excerpt.
high positive JobMatchAI An Intelligent Job Matching Platform Using Knowle... system design / component integration (presence of Transformer embeddings, knowl...
On the LoCoMo benchmark, the architecture achieves 74.8% overall accuracy.
Benchmark evaluation reported in the paper using the LoCoMo benchmark with a reported overall accuracy of 74.8%.
high positive Governed Memory: A Production Architecture for Multi-Agent W... overall accuracy on the LoCoMo benchmark (percentage)
Adversarial governance compliance was 100%.
Adversarial compliance testing reported in the paper (linked to the adversarial query experiments); reported compliance = 100%.
high positive Governed Memory: A Production Architecture for Multi-Agent W... governance compliance under adversarial queries (percentage)
There was zero cross-entity leakage across 500 adversarial queries.
Adversarial testing reported in the paper: 500 adversarial queries used to test cross-entity leakage; result = zero leakage.
high positive Governed Memory: A Production Architecture for Multi-Agent W... cross-entity information leakage (count/occurrence across 500 queries)
Progressive context delivery yielded a 50% token reduction.
Reported experimental result in the controlled experiments indicating token usage reduction from progressive delivery = 50%.
high positive Governed Memory: A Production Architecture for Multi-Agent W... token usage reduction (percentage)
Governance routing precision was 92% in the experiments.
Reported experimental metric from the controlled experiments (N=250, five content types) showing governance routing precision = 92%.
high positive Governed Memory: A Production Architecture for Multi-Agent W... governance routing precision (percentage)
The system achieved 99.6% fact recall (with complementary dual-modality coverage) in the controlled experiments.
Reported experimental result from the controlled experiments (N=250, five content types) as stated in the paper.
high positive Governed Memory: A Production Architecture for Multi-Agent W... fact recall (percentage recall of facts)
Immediate practical steps include improved documentation, stakeholder audits, and multi‑metric evaluation; medium‑term steps include standards for participatory evaluation and tooling for transparency and monitoring; long‑term steps include institutional governance, interoperable safety APIs, and public‑interest evaluation infrastructure.
Prescriptive roadmap in the paper based on conceptual analysis and prior literature; these are recommended policy/program milestones rather than empirically validated interventions.
high positive LLM Alignment should go beyond Harmlessness–Helpfulness and ... implementation status of the recommended immediate, medium‑term, and long‑term a...
Transparency (detailed documentation of data, objectives, evaluation processes, and deployment constraints; audit and contest mechanisms) is a necessary mechanism for accountable alignment.
Normative and practical argumentation supported by prior work on model cards, documentation standards, and auditing; no new audits are presented in the paper.
high positive LLM Alignment should go beyond Harmlessness–Helpfulness and ... availability and granularity of documentation and auditability of model developm...
Pluralistic evaluation—using multiple, diverse evaluation criteria and stakeholder‑informed metrics rather than single aggregated alignment scores—will better capture the values and harms at stake.
Argumentative rationale and literature synthesis advocating multi‑metric evaluation approaches; examples from prior evaluation critiques are referenced rather than new empirical comparison.
high positive LLM Alignment should go beyond Harmlessness–Helpfulness and ... evaluation coverage of diverse values, harms, and stakeholder perspectives
The Flourishing–Justice–Autonomy (FJA) framework should guide alignment efforts, emphasizing (1) Flourishing (human well‑being and meaningful opportunities), (2) Justice (distributional fairness and protection of vulnerable groups), and (3) Autonomy (informed choice and user control).
Prescriptive proposal grounded in conceptual analysis and synthesis of ethical and technical literature; the paper defines and motivates the three principles as its core normative contribution.
high positive LLM Alignment should go beyond Harmlessness–Helpfulness and ... alignment criteria operationalized as Flourishing, Justice, and Autonomy metrics...
The positive spillover effects of CAFTA on third‑country agricultural imports are concentrated in medium and large firms.
Heterogeneity analysis using firm‑size subgroup DID estimates derived from the China Industrial Enterprise Database (2000–2014) showing stronger effects for medium and large enterprises.
high positive How regional trade policy uncertainty affects agricultural i... firm‑level import increases from third countries, by firm size (medium/large vs ...
CAFTA induced spillovers that significantly increased China's agricultural imports from non‑ASEAN (third) countries.
Difference‑in‑differences (DID) estimation exploiting CAFTA as an exogenous shock; import outcomes drawn from China Customs Database 2000–2014; robustness checks reported (mediator tests and subgroup analyses).
high positive How regional trade policy uncertainty affects agricultural i... China's agricultural imports from non‑ASEAN countries (import volumes/values)
Total effect of trust on brand loyalty is approximately 0.800 (total β ≈ 0.800 = direct β 0.410 + indirect β ≈ 0.390), all reported as statistically significant (p < .001 for direct effects; p = .001 for indirect).
Path coefficients reported from SEM (n = 450) and arithmetic combination of direct and indirect standardized effects as reported in the paper.
high positive Trust in AI-Driven Marketing and its Impact on Brand Loyalty... Brand Loyalty (total effect of Trust)
Adoption intention for AI marketing strongly predicts brand loyalty (Adoption Intention → Brand Loyalty: standardized β = 0.717, p < .001).
Cross-sectional survey (n = 450 Gen Z); SEM (SPSS AMOS); reported standardized path coefficient β = 0.717 with p < .001.
Trust in AI-driven marketing directly increases Generation Z consumers' brand loyalty (Trust → Brand Loyalty: standardized β = 0.410, p < .001).
Cross-sectional survey (n = 450 Gen Z); SEM (SPSS AMOS); reported standardized path coefficient β = 0.410 with p < .001.
Trust in AI-driven marketing has a strong positive effect on Generation Z consumers' intention to adopt AI marketing (Trust → Adoption Intention: standardized β = 0.718, p < .001).
Cross-sectional survey (n = 450 Generation Z respondents); analysis via Structural Equation Modeling (SPSS AMOS); reported standardized path coefficient β = 0.718 with p < .001.
The main results are robust to inclusion of firm, industry, and year fixed effects, DID identification using the 2018 SCD pilot, and multiple robustness checks addressing potential confounders and endogeneity.
Authors report baseline regressions with firm/industry/year fixed effects, DID specifications exploiting the 2018 Supply Chain Innovation and Application Pilot Program as a quasi-natural experiment, and a battery of robustness tests (alternative specifications, controls, and checks).
high positive Supply Chain Digitalization and its Impact on Green Innovati... robustness of estimated SCD effects on corporate green innovation
The positive effect of SCD on green innovation is stronger for substantive green innovation (actual environmentally beneficial R&D and technologies) than for strategic green innovation (symbolic/labeling or reputation‑oriented activities).
Heterogeneous outcome analysis splitting green innovation into 'substantive' (e.g., green patents, technological R&D outputs) versus 'strategic' (signaling/compliance indicators); regression and DID estimates show larger and statistically significant coefficients for substantive measures compared to smaller or weaker effects on strategic measures.
high positive Supply Chain Digitalization and its Impact on Green Innovati... substantive green innovation (green patents, concrete environmental R&D outputs)...
Supply chain digitalization (SCD) significantly increases corporate green innovation among Chinese A-share listed firms (2012–2022).
Panel analysis of Chinese A-share listed firms over 2012–2022 using regression models with firm, industry, and year fixed effects; difference-in-differences (DID) identification exploiting the 2018 Supply Chain Innovation and Application Pilot Program as an exogenous shock to SCD; firm-level controls included; multiple robustness checks reported.
high positive Supply Chain Digitalization and its Impact on Green Innovati... corporate green innovation (aggregate measures of green innovation such as green...
Algorithmic transparency and interpretability are important so investors and regulators can understand how ESG inputs affect automated decision systems.
Normative recommendation grounded in literature on model risk, accountability, and regulatory needs; not an empirical finding but a consensus implication of reviewed work.
high positive SUSTAINABILITY ISSUES IN FINANCIAL ACCOUNTING RESEARCH model interpretability / stakeholder understanding / accountability
MYRIAD-EU synthesizes progress and remaining challenges and proposes concrete directions for continued research and practice in multi-hazard, multi-risk DRR.
Overall project scope: synthesis and reflection on interdisciplinary research and practice conducted across MYRIAD-EU (2021–2025), as reported in the paper.
high positive Reducing risk together: moving towards a more holistic appro... existence of a consolidated synthesis and recommended research/practice directio...
MYRIAD-EU conducted in-depth, place-based case studies co-produced with local stakeholders to test methods and tools for multi-risk assessment.
Reported methods include in-depth place-based case studies co-produced with local stakeholders as part of MYRIAD-EU activities (2021–2025).
high positive Reducing risk together: moving towards a more holistic appro... testing and validation of methods and tools via co-produced case studies
The main results are robust to inclusion of controls and a range of heterogeneity and moderation checks, supporting that findings are not driven by simple time trends or obvious confounders.
Reported robustness checks in the staggered-DID framework (control variables, alternative specifications, subgroup tests) and discussion of parallel-trends assumption.
high positive How Does Urban Green Data Center Policy Empower Corporate En... corporate energy utilization efficiency (stability of estimated policy effect ac...
Implementation of urban green data center pilot policies leads to measurable improvements in firms' energy utilization efficiency.
Staggered-adoption difference-in-differences (DID) using an unbalanced firm–year panel of Chinese A-share listed firms linked to prefecture-level cities (2012–2024); treatment is timing/location of urban green data center pilot designation; results reported as statistically significant and robust to controls and alternative specifications.
high positive How Does Urban Green Data Center Policy Empower Corporate En... corporate energy utilization efficiency
Mechanisms linking digital services to export performance include reduced transaction and search costs, platform network and scale effects, data as an input improving service quality and customization, and task‑level specialization changing comparative advantage.
Conceptual/theoretical synthesis drawing on multiple strands of literature and illustrative case studies presented in the review (no new causal identification).
high positive Analysis of Digital Services Trade and Export Competitivenes... export performance of digital services (via transaction costs, service quality, ...
Digital services trade is shifting from traditional cross‑border delivery toward online, platform‑based models, with cross‑border data flows a core input and determinant of competitiveness.
Integrative literature and policy review synthesizing domestic and international studies; theoretical/conceptual synthesis and cited case examples (no new econometric analysis or primary microdata).
high positive Analysis of Digital Services Trade and Export Competitivenes... mode of digital services delivery and export competitiveness (role of platforms ...
Policy recommendations include standards on explainability, audit trails, certification for finance/tax AI systems, stronger data governance, and public–private coordination to update regulatory guidance.
Paper's policy and governance recommendations drawn from case findings and literature synthesis; prescriptive content rather than evaluated interventions.
high positive Explore the Impact of Generative AI on Finance and Taxation existence/adoption of standards, improvements in regulatory clarity and complian...
Deployments should build governance, explainability, and auditability into systems and start with pilots on high-volume, well-structured tasks before scaling.
Paper recommendations based on case experience and analytic framing; advocated strategy rather than empirically validated at scale within the paper.
high positive Explore the Impact of Generative AI on Finance and Taxation deployment success rate, governance completeness, pilot-to-scale learning outcom...
To mitigate risks and realize benefits, AI systems in finance/tax should combine AI with human-in-the-loop controls and clear escalation paths.
Prescriptive recommendation grounded in case lessons and literature on safe AI deployment; presented as a best-practice guideline rather than tested intervention.
high positive Explore the Impact of Generative AI on Finance and Taxation safety/accuracy of outputs, reduction in erroneous autonomous actions