The Commonplace

Evidence (13870 claims)

Adoption: 8467 claims
Productivity: 7558 claims
Governance: 6805 claims
Human-AI Collaboration: 6363 claims
Org Design: 4132 claims
Innovation: 4065 claims
Labor Markets: 3526 claims
Skills & Training: 2945 claims
Inequality: 2066 claims

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome Positive Negative Mixed Null Total
Other 749 196 98 892 1984
Governance & Regulation 817 394 188 121 1544
Organizational Efficiency 771 189 124 83 1177
Technology Adoption Rate 627 233 123 96 1088
Research Productivity 411 123 56 332 933
Output Quality 467 178 59 47 751
Decision Quality 320 174 75 42 618
Firm Productivity 435 55 88 20 604
AI Safety & Ethics 214 276 65 33 593
Market Structure 178 167 122 24 496
Task Allocation 207 64 71 32 379
Skill Acquisition 165 59 60 17 301
Innovation Output 203 27 43 18 292
Employment Level 105 52 107 13 279
Fiscal & Macroeconomic 131 69 43 26 276
Consumer Welfare 116 63 42 11 232
Firm Revenue 150 48 26 3 227
Inequality Measures 44 122 49 6 221
Task Completion Time 169 29 8 12 219
Worker Satisfaction 89 63 20 12 184
Error Rate 69 92 10 2 173
Regulatory Compliance 76 68 14 5 163
Training Effectiveness 93 21 13 19 148
Wages & Compensation 77 36 25 6 144
Automation Exposure 51 54 22 12 142
Team Performance 86 17 27 9 140
Developer Productivity 94 17 14 6 132
Job Displacement 12 80 20 1 113
Hiring & Recruitment 51 7 8 3 69
Creative Output 31 17 7 3 59
Skill Obsolescence 5 46 6 1 58
Social Protection 27 16 8 2 53
Labor Share of Income 17 17 17 51
Worker Turnover 11 12 3 26
Industry 1 1
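A minimal sketch of how a row of the matrix above can be read programmatically. Outcome names contain spaces, so the row is split from the right. The names `MatrixRow`, `parse_row`, and `direction_shares` are hypothetical helpers, not part of the site. Shares are computed against the four listed direction counts rather than the Total column, because in several rows the Total exceeds their sum (for example 214 + 276 + 65 + 33 = 588 against a listed total of 593).

```python
from typing import Dict, NamedTuple


class MatrixRow(NamedTuple):
    outcome: str
    positive: int
    negative: int
    mixed: int
    null: int
    total: int


def parse_row(line: str) -> MatrixRow:
    """Parse one flattened row; the last five tokens must be integer counts."""
    parts = line.rsplit(maxsplit=5)
    if len(parts) != 6 or not all(p.isdigit() for p in parts[1:]):
        # Rows with blank cells (fewer than five counts) cannot be
        # assigned to columns unambiguously, so they are rejected.
        raise ValueError(f"expected an outcome name and five counts: {line!r}")
    return MatrixRow(parts[0], *map(int, parts[1:]))


def direction_shares(row: MatrixRow) -> Dict[str, float]:
    """Share of each direction among the four listed direction counts."""
    counted = row.positive + row.negative + row.mixed + row.null
    names = ("positive", "negative", "mixed", "null")
    return {name: getattr(row, name) / counted for name in names}


row = parse_row("AI Safety & Ethics 214 276 65 33 593")
shares = direction_shares(row)
```

Rows with fewer than five counts, such as the last three in the matrix, raise `ValueError` here instead of being guessed into columns.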
OneSearch-V2 effectively mitigates common search system issues such as information bubbles and long-tail sparsity, without incurring additional inference costs or serving latency.
Author claim in the paper stating mitigation of these issues and no added inference/latency costs; no quantitative measures, benchmarks, or latency numbers provided in the excerpt.
high positive OneSearch-V2: The Latent Reasoning Enhanced Self-distillatio... information bubbles and long-tail sparsity (and inference/serving latency)
Manual evaluation confirms a +1.37% gain in query-item relevance.
Reported manual evaluation metric in the paper; no sample size or annotation protocol provided in the excerpt.
Manual evaluation confirms gains in search experience quality, with a +1.65% increase in page good rate.
Reported manual evaluation metric in the paper; no sample size or annotation protocol provided in the excerpt.
OneSearch-V2 increases order volume by +2.11% in online A/B tests.
Reported online A/B test result in the paper; no sample size, test duration, or statistical significance reported in the excerpt.
OneSearch-V2 increases buyer conversion rate by +3.05% in online A/B tests.
Reported online A/B test result in the paper; no sample size, test duration, or statistical significance reported in the excerpt.
OneSearch-V2 increases item CTR by +3.98% in online A/B tests.
Reported online A/B test result in the paper; no sample size, test duration, or statistical significance reported in the excerpt.
OneSearch, as a representative industrial-scale deployed generative search framework, has brought significant commercial and operational benefits.
Author assertion describing OneSearch as industrial-scale and commercially/operationally beneficial; no supporting numerical evidence or sample size reported in the excerpt.
high positive OneSearch-V2: The Latent Reasoning Enhanced Self-distillatio... commercial and operational benefits
Generative Retrieval (GR) offers advantages over multi-stage cascaded architectures such as end-to-end joint optimization and high computational efficiency.
Statement in paper positioning GR as a promising paradigm and listing these advantages; no quantitative study or sample size reported in the excerpt.
high positive OneSearch-V2: The Latent Reasoning Enhanced Self-distillatio... computational efficiency and ability to perform end-to-end joint optimization
The framework aims to support more comparable benchmarks and cumulative research on human-AI readiness, advancing safer and more accountable human-AI collaboration.
Stated aims and intended impact in paper; aspirational/conceptual rather than empirically demonstrated in excerpt.
high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... benchmarks, cumulative research, safety and accountability in human-AI collabora...
Operationalizing evaluation through interaction traces rather than model properties or self-reported trust enables deployment-relevant assessment of calibration, error recovery, and governance.
Methodological claim/proposed approach in paper; presented as enabling assessment but no empirical evaluation reported in excerpt.
high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... assessment of calibration, error recovery, governance via interaction traces
The taxonomy and metrics are connected to the Understand-Control-Improve (U-C-I) lifecycle of human-AI onboarding and collaboration.
Conceptual mapping described in paper; no empirical tests or sample reported in excerpt.
high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... linking metrics to U-C-I onboarding lifecycle
We introduce a four-part taxonomy of evaluation metrics spanning outcomes, reliance behavior, safety signals, and learning over time.
Explicit methodological claim in paper announcing a taxonomy; described as a contribution rather than empirically tested in excerpt.
high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... evaluation metrics taxonomy (outcomes, reliance behavior, safety signals, learni...
This paper proposes a measurement framework for evaluating human-AI decision-making centered on team readiness.
Methodological contribution presented in paper; conceptual framework proposed (no empirical validation reported in excerpt).
high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... team readiness evaluation
Artificial intelligence (AI) systems are deployed as collaborators in human decision-making.
Statement in paper (conceptual/observational claim); no empirical sample or method provided in excerpt.
high positive From Accuracy to Readiness: Metrics and Benchmarks for Human... deployment of AI as collaborators
Late disclosure of AI involvement improved affective engagement for AI-enhanced content.
Reported experimental result in the abstract from the two online studies (study 1: n = 325; study 2: n = 371) manipulating disclosure timing (early vs. late).
high positive AI content labeling and user engagement on social media: The... affective engagement for AI-enhanced content under late disclosure
Automation in Japanese manufacturing increased even during periods of slow productivity growth.
Empirical finding from applying the framework to industry-level data in Japanese manufacturing; comparison of inferred automation trends with observed productivity growth periods (exact sample/time not provided in the summary).
high positive The macroeconomics of automation trend in automation versus productivity growth (automation increased despite slo...
Applying the framework to Japanese manufacturing industries shows that automation increased through capital deepening.
Empirical application of the theoretical framework to Japanese manufacturing industries (industry-level analysis); estimation/inference using industry macro observables. (Paper states result; exact sample size/time span not provided in the summary.)
high positive The macroeconomics of automation increase in automation (share of tasks by capital) attributable to capital deepe...
The model provides a transparent mapping from standard macroeconomic observables (capital-labor ratio, output per worker, elasticity of substitution) into the degree of automation, allowing automation to be measured without relying on technology-specific indicators.
Theoretical mapping derived from the CES structure that links observable macro variables to the endogenous degree of automation; methodological claim about inference procedure.
high positive The macroeconomics of automation degree of automation inferred from macro observables
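The kind of mapping described in this claim can be illustrated with the textbook CES production function. This is a sketch under stated assumptions, not the paper's derivation: it assumes the per-worker form y = (alpha * k^rho + 1 - alpha)^(1/rho) with rho = (sigma - 1) / sigma, and it treats the capital weight alpha as a stand-in for the degree of automation. The function names are hypothetical, and the paper's actual mapping may differ.

```python
def ces_output_per_worker(k: float, alpha: float, sigma: float) -> float:
    """Textbook CES in per-worker form (assumed, not taken from the paper)."""
    rho = (sigma - 1.0) / sigma  # substitution parameter; undefined at sigma = 1
    return (alpha * k**rho + (1.0 - alpha)) ** (1.0 / rho)


def implied_capital_weight(k: float, y: float, sigma: float) -> float:
    """Invert the CES form: recover alpha from k = K/L, y = Y/L, and sigma."""
    if sigma <= 0.0 or sigma == 1.0:
        raise ValueError("sigma must be positive and different from 1")
    if k <= 0.0 or y <= 0.0 or k == 1.0:
        raise ValueError("k and y must be positive, and k must differ from 1")
    rho = (sigma - 1.0) / sigma
    # From y^rho = alpha * k^rho + (1 - alpha), solve for alpha.
    return (y**rho - 1.0) / (k**rho - 1.0)


# Round trip with made-up inputs: pick alpha, generate y, recover alpha.
alpha_true = 0.35
k_example, sigma_example = 4.0, 0.6
y_example = ces_output_per_worker(k_example, alpha_true, sigma_example)
alpha_recovered = implied_capital_weight(k_example, y_example, sigma_example)
```

The round trip only shows that the three observables pin down the weight once a functional form is fixed. The input values are illustrative and are not drawn from the Japanese manufacturing data.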
Aggregating task-level decisions generates a CES production function in which the economy-wide degree of automation emerges endogenously.
Analytical derivation in the paper: aggregation of task-level adoption decisions yields a CES aggregate production function with endogenous automation parameter.
high positive The macroeconomics of automation form of aggregate production function / emergence of economy-wide automation par...
The degree of automation is defined as the share of tasks performed by capital rather than labor.
Explicit model definition provided in the paper (conceptual/theoretical definition).
high positive The macroeconomics of automation share of tasks performed by capital
The degree of automation in the aggregate economy emerges endogenously as an equilibrium outcome and can be inferred from standard macroeconomic data.
Theoretical development in a task-based production framework with endogenous technology adoption; mapping from model to observable macro variables (capital-labor ratio, output per worker, elasticity of substitution).
high positive The macroeconomics of automation degree of automation (economy-wide share of tasks performed by capital)
The results of this regional research outline a multi-dimensional policy roadmap that examines the region’s current capabilities and the hurdles it faces in catching up with the AI revolution from a governance and policy perspective, and presents them in a practical framework for public sector leaders.
Report summary claiming that the study's results produce a comprehensive roadmap and practical framework (content description).
high positive Charting AI Governance Future in the Arab Region: A Policy R... comprehensiveness and practicality of the policy roadmap produced by the study
This executive report provides a roadmap for establishing an AI governance infrastructure through a set of strategic policy recommendations across seven key pillars.
Document assertion describing the content and structure of the report (authors' deliverable).
high positive Charting AI Governance Future in the Arab Region: A Policy R... existence of a multi-pillar policy roadmap in the report
The reality of limited AI governance capacity calls for a series of policy interventions at both local and regional levels to empower the AI ecosystem in the Arab region.
Authors' policy recommendation derived from the regional study and synthesis of findings.
high positive Charting AI Governance Future in the Arab Region: A Policy R... adoption of policy interventions to strengthen AI governance and ecosystem
A governance model linking 'trustworthy AI' practices to competitive advantage yields reduced uncertainty, faster deployment cycles, and higher stakeholder trust.
Central claim of the paper tying the proposed AIGSF to business benefits; supported by conceptual linkage and illustrative examples rather than quantified empirical evidence or controlled evaluation.
Case illustrations across hiring, credit, consumer services, and generative AI draw lessons on controls such as model documentation, algorithmic audits, impact assessments, and human-in-the-loop oversight.
Paper includes qualitative case illustrations in the listed domains to demonstrate governance controls; these are presented as examples and lessons rather than as systematic empirical studies (no sample sizes reported).
The paper develops an AI Governance Strategic Framework (AIGSF) and an implementation roadmap that connect ethical accountability, regulatory readiness, cybersecurity resilience, and performance outcomes.
Paper contribution described as an integrative conceptual framework and roadmap; supported by theoretical grounding and illustrative cases rather than empirical validation; no sample size provided.
high positive Artificial Intelligence Governance In Corporate Strategy: Et... organizational efficiency
AI governance should be treated as a strategic governance function—anchored in board oversight and enterprise risk management—rather than a narrow technical or compliance task.
Central normative recommendation and thesis of the paper; derived from an integrative conceptual framework grounded in corporate governance theory, ERM, and emerging regulation. No empirical testing or sample reported.
high positive Artificial Intelligence Governance In Corporate Strategy: Et... governance and regulation
AI has moved from a peripheral digital capability to a central driver of corporate strategy, reshaping decision-making, customer engagement, operations, and risk exposure.
Statement presented in the paper's introduction and motivation; supported by integrative conceptual design and literature grounding (theory and descriptive citations). No empirical sample or quantitative analysis reported.
high positive Artificial Intelligence Governance In Corporate Strategy: Et... organizational efficiency
A policy of 20% mandatory practice preserves 92% more capability than the simulation baseline (baseline includes a 5% background AI-failure rate).
Simulation comparing baseline (5% background AI-failure rate) to a counterfactual with 20% mandatory practice; reported 92% relative preservation of capability.
high positive The enrichment paradox: critical capability thresholds and i... preserved human capability under mandatory practice policy vs baseline
The model predicts that periodic AI failures improve human capability 2.7-fold (relative improvement reported in simulations).
Simulation experiments comparing scenarios with/without periodic AI failures; reported fold-change in capability of 2.7×.
high positive The enrichment paradox: critical capability thresholds and i... human capability (H) under periodic AI-failure regime
Validated against 15 countries' PISA data (102 points), the model achieves R^2 = 0.946 with 3 parameters and attains the lowest BIC among compared specifications.
Empirical validation using PISA dataset covering 15 countries and 102 data points; reported fit statistics (R^2, number of parameters, BIC).
high positive The enrichment paradox: critical capability thresholds and i... fit of model to PISA data (explained variance, model selection via BIC)
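For context on the fit statistics in this claim, a minimal sketch of how BIC trades fit against parameter count for least-squares models with Gaussian errors. Since R^2 = 1 - RSS/TSS, models fitted to the same data can be ranked by n * ln(1 - R^2) + k * ln(n), which drops a constant shared by all of them. The alternative specification below is hypothetical (the excerpt does not list the compared models), and whether the paper's parameter count includes the error variance is not stated.

```python
import math


def bic_comparable(r2: float, n: int, k: int) -> float:
    """BIC up to an additive constant shared by models fit to the same data.

    Assumes least squares with Gaussian errors: BIC = n*ln(RSS/n) + k*ln(n).
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return n * math.log(1.0 - r2) + k * math.log(n)


# Reported values from the claim: R^2 = 0.946, 102 data points, 3 parameters.
reported = bic_comparable(r2=0.946, n=102, k=3)

# Hypothetical richer alternative: slightly better fit, two extra parameters.
alternative = bic_comparable(r2=0.950, n=102, k=5)

# Lower BIC is preferred.
reported_is_preferred = reported < alternative
```

In this made-up comparison the small gain in R^2 does not offset the penalty for two extra parameters, which is the mechanism by which a three-parameter model can attain the lowest BIC.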
The model was calibrated to four domains: education, medicine, navigation, and aviation.
Model calibration procedures applied separately to four named domains reported in the paper.
high positive The enrichment paradox: critical capability thresholds and i... model parameter fits across domains
We present a two-variable dynamical systems model coupling capability (H) and delegation (D), grounded in three axioms: learning requires capability, learning requires practice, and disuse causes forgetting.
Model specification and theoretical construction described in the paper (two-variable dynamical system; three axioms).
high positive The enrichment paradox: critical capability thresholds and i... human capability as a dynamical variable (H) and delegation level (D)
These results demonstrate a practical path toward high-precision, low-latency text-to-SQL applications using domain-specialized, self-hosted language models in large-scale production environments.
Conclusion drawn by the authors based on their implementation, token reduction, and reported accuracy/latency-related claims; generalization to large-scale production is asserted but not supported by detailed production deployment metrics in the excerpt.
high positive Schema on the Inside: A Two-Phase Fine-Tuning Method for Hig... feasibility of production-grade text-to-SQL (precision and latency)
The resulting system achieves 98.4% execution success and 92.5% semantic accuracy, substantially outperforming a prompt-engineered baseline using Google's Gemini Flash 2.0 (95.6% execution, 89.4% semantic accuracy).
Reported empirical evaluation comparing the authors' system to a prompt-engineered baseline (Gemini Flash 2.0) with explicit performance percentages for execution success and semantic accuracy; no sample size, test set composition, statistical significance, or evaluation protocol provided in the excerpt.
high positive Schema on the Inside: A Two-Phase Fine-Tuning Method for Hig... execution success rate; semantic accuracy
The approach replaces costly external API calls with efficient local inference.
System design claim: the model is self-hosted and performs local inference instead of using external API-based LLM calls; no cost accounting or latency benchmarks provided in the excerpt.
high positive Schema on the Inside: A Two-Phase Fine-Tuning Method for Hig... use of external API calls vs local inference (cost/efficiency implication)
This reduces input tokens by over 99%, from a 17k-token baseline to fewer than 100.
Reported measurement comparing input token counts before and after applying their approach (explicit numerical baseline and resulting counts provided); no sample size or distribution of token counts reported.
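A quick arithmetic check of the "over 99%" figure, assuming "17k" means 17,000 input tokens and using 100 as the upper bound on the reduced prompt. The function name is a hypothetical helper.

```python
def reduction_fraction(before: int, after: int) -> float:
    """Fraction of input tokens removed when going from `before` to `after`."""
    if before <= 0 or after < 0:
        raise ValueError("before must be positive and after non-negative")
    return (before - after) / before


# 17,000 tokens is an assumed reading of "17k"; 100 is the stated upper bound.
lower_bound = reduction_fraction(17_000, 100)
```

Because the reduced prompt is stated to be fewer than 100 tokens, the value computed here is a lower bound on the reduction, and it is consistent with the "over 99%" claim.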
A novel two-phase supervised fine-tuning approach enables the model to internalize the entire database schema, eliminating the need for long-context prompts.
Methodological description (two-phase supervised fine-tuning) and claim that this internalization removes reliance on long-context prompts; no detailed experimental protocol or sample size provided in the excerpt.
high positive Schema on the Inside: A Two-Phase Fine-Tuning Method for Hig... need for long-context prompts / model internalization of schema
We present a specialized, self-hosted 8B-parameter model designed for a conversational bot in CriQ, a sister app to Dream11 that answers user queries about cricket statistics.
Stated implementation detail in the paper describing the model architecture and deployment target (CriQ conversational bot). No experimental sample size reported for this statement.
high positive Schema on the Inside: A Two-Phase Fine-Tuning Method for Hig... model specification and deployment
Legal professionals, courts, and regulators should replace the outdated 'black box' mental model with verification protocols based on how these systems actually fail.
Policy recommendation stated in the abstract based on the paper's analysis; no trial or deployment evidence of such protocols provided in the excerpt.
high positive When AI output tips to bad but nobody notices: Legal implica... adoption of verification protocols / change in mental model
The adoption of generative AI across commercial and legal professions offers dramatic efficiency gains.
Asserted in the paper's introduction/abstract; no empirical data, sample, or quantitative study reported in the excerpt.
Those extended-model equilibria also show increasing concentration consistent with power-law-like distributions (i.e., winner-take-most / superstar effects).
Theoretical model combining quality heterogeneity and reinforcement dynamics that yields equilibrium distributions with heavy tails; argument and formalization presented in the paper; no empirical testing reported.
high positive The Economics of Builder Saturation in Digital Markets market concentration / distribution of returns (power-law-like)
Even as the number of producers increases and average attention per producer falls, total output expands (production scales elastically).
Same formal theoretical model (analytical result): production scales elastically in the model despite finite attention; no empirical validation provided.
high positive The Economics of Builder Saturation in Digital Markets total market output
Mechanisms identified — network structure evolution and increased relational embeddedness — contribute to a broader understanding of how digital transformation shapes innovation dynamics across geographical boundaries in a globalized knowledge economy.
Synthesis of empirical network evolution results and mediation/structural analyses from the 2011–2021 dataset of digital transformation indicators and patent collaboration networks among cities and firms.
high positive How Does Digital Transformation Affect Cross-Regional Collab... role of network structure evolution and relational embeddedness as mechanisms li...
These results provide empirical evidence from a major emerging economy (China) that can offer insights to inform policies and strategies in other regions undergoing digital transition.
Generalization claim based on empirical findings from the 2011–2021 analysis of A-share listed companies' digital transformation and patent collaboration patterns in China.
high positive How Does Digital Transformation Affect Cross-Regional Collab... policy relevance / generalizability of findings to other regions
When the volume of digital patent applications surpasses a certain threshold, the positive effect of digital transformation on the quality of cross-regional collaborative innovation accelerates (nonlinear threshold effect).
Threshold regression / nonlinear analysis relating counts of digital patent applications to the marginal effect of digital transformation on collaborative innovation quality, using 2011–2021 patent and digitalization data from A-share listed firms.
high positive How Does Digital Transformation Affect Cross-Regional Collab... quality of cross-regional collaborative innovation (and its change above a paten...
Advancement of digital transformation positively contributes to both the quality and the quantity of cross-regional cooperative innovation.
Empirical econometric analysis (panel regressions) linking measures of corporate/urban digital transformation to indicators of cross-regional cooperative innovation quality and counts, using A-share listed companies' digital transformation indicators and patent collaboration data, 2011–2021.
high positive How Does Digital Transformation Affect Cross-Regional Collab... quality and quantity (counts) of cross-regional cooperative innovation
China’s urban collaborative innovation network demonstrates a notable quadrilateral spatial structure and has evolved toward a multicenter pattern over time.
Spatio-temporal network analysis based on the same 2011–2021 dataset of digital transformation indicators and patent/co-patent links among cities inferred from A-share listed companies' patent data.
high positive How Does Digital Transformation Affect Cross-Regional Collab... spatio-temporal structure of urban collaborative innovation network (quadrilater...
The cooperative innovation network exhibits pronounced small-world characteristics.
Network analysis of cross-regional collaborative innovation using digital transformation and patent data from A-share listed companies on the Shanghai and Shenzhen stock exchanges (2011–2021).
high positive How Does Digital Transformation Affect Cross-Regional Collab... presence of small-world characteristics in the cooperative innovation network