Evidence (8066 claims)

Claims by topic:
- Adoption: 5586 claims
- Productivity: 4857 claims
- Governance: 4381 claims
- Human-AI Collaboration: 3417 claims
- Labor Markets: 2685 claims
- Innovation: 2581 claims
- Org Design: 2499 claims
- Skills & Training: 2031 claims
- Inequality: 1382 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 417 | 113 | 67 | 480 | 1091 |
| Governance & Regulation | 419 | 202 | 124 | 64 | 823 |
| Research Productivity | 261 | 100 | 34 | 303 | 703 |
| Organizational Efficiency | 406 | 96 | 71 | 40 | 616 |
| Technology Adoption Rate | 323 | 128 | 74 | 38 | 568 |
| Firm Productivity | 307 | 38 | 70 | 12 | 432 |
| Output Quality | 260 | 71 | 27 | 29 | 387 |
| AI Safety & Ethics | 118 | 179 | 45 | 24 | 368 |
| Market Structure | 107 | 128 | 85 | 14 | 339 |
| Decision Quality | 177 | 75 | 37 | 19 | 312 |
| Fiscal & Macroeconomic | 89 | 58 | 33 | 22 | 209 |
| Employment Level | 74 | 34 | 78 | 9 | 197 |
| Skill Acquisition | 98 | 36 | 40 | 9 | 183 |
| Innovation Output | 121 | 12 | 24 | 13 | 171 |
| Firm Revenue | 98 | 35 | 24 | — | 157 |
| Consumer Welfare | 73 | 31 | 37 | 7 | 148 |
| Task Allocation | 87 | 16 | 34 | 7 | 144 |
| Inequality Measures | 25 | 76 | 32 | 5 | 138 |
| Regulatory Compliance | 54 | 61 | 13 | 3 | 131 |
| Task Completion Time | 89 | 7 | 4 | 3 | 103 |
| Error Rate | 44 | 51 | 6 | — | 101 |
| Training Effectiveness | 58 | 12 | 12 | 16 | 99 |
| Worker Satisfaction | 47 | 33 | 11 | 7 | 98 |
| Wages & Compensation | 54 | 15 | 20 | 5 | 94 |
| Team Performance | 47 | 12 | 15 | 7 | 82 |
| Automation Exposure | 27 | 26 | 10 | 6 | 72 |
| Job Displacement | 6 | 39 | 13 | — | 58 |
| Hiring & Recruitment | 40 | 4 | 6 | 3 | 53 |
| Developer Productivity | 34 | 4 | 3 | 1 | 42 |
| Social Protection | 22 | 11 | 6 | 2 | 41 |
| Creative Output | 16 | 7 | 5 | 1 | 29 |
| Labor Share of Income | 12 | 6 | 9 | — | 27 |
| Skill Obsolescence | 3 | 20 | 2 | — | 25 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
AI has moved from a peripheral digital capability to a central driver of corporate strategy, reshaping decision-making, customer engagement, operations, and risk exposure.
Statement presented in the paper's introduction and motivation; supported by integrative conceptual design and literature grounding (theory and descriptive citations). No empirical sample or quantitative analysis reported.
A policy of 20% mandatory practice preserves 92% more capability than the simulation baseline (baseline includes a 5% background AI-failure rate).
Simulation comparing baseline (5% background AI-failure rate) to a counterfactual with 20% mandatory practice; reported 92% relative preservation of capability.
The model predicts that periodic AI failures improve human capability 2.7-fold (relative improvement reported in simulations).
Simulation experiments comparing scenarios with/without periodic AI failures; reported fold-change in capability of 2.7×.
Validated against 15 countries' PISA data (102 points), the model achieves R^2 = 0.946 with 3 parameters and attains the lowest BIC among compared specifications.
Empirical validation using PISA dataset covering 15 countries and 102 data points; reported fit statistics (R^2, number of parameters, BIC).
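The reported fit statistics can be reproduced with standard formulas. A minimal sketch, assuming Gaussian errors (so BIC = n·ln(RSS/n) + k·ln(n)); the function and variable names are illustrative, not from the paper:

```python
import numpy as np

def fit_stats(y, y_hat, k):
    """R^2 and BIC (Gaussian-likelihood form) for a model with k free parameters."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)             # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)          # total sum of squares
    r2 = 1.0 - rss / tss
    bic = n * np.log(rss / n) + k * np.log(n)  # lower BIC = preferred specification
    return r2, bic
```

Selecting the specification with the lowest BIC, as the claim describes, then reduces to evaluating `fit_stats` on each candidate (with its own k) and ranking by the second return value.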
The model was calibrated to four domains: education, medicine, navigation, and aviation.
Model calibration procedures applied separately to four named domains reported in the paper.
We present a two-variable dynamical systems model coupling capability (H) and delegation (D), grounded in three axioms: learning requires capability, learning requires practice, and disuse causes forgetting.
Model specification and theoretical construction described in the paper (two-variable dynamical system; three axioms).
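The excerpt does not give the model's equations, so the following is only an illustrative two-variable system consistent with the three axioms: capability grows only when both capability and practice are present, practice is the non-delegated share of work, and delegation-induced disuse erodes capability. All functional forms and parameter values here are assumptions, not the paper's.

```python
def simulate(h0=0.5, d0=0.2, alpha=0.3, beta=0.1, gamma=0.2, dt=0.01, steps=5000):
    """Euler integration of an illustrative capability (H) / delegation (D) system.

    Assumed forms (NOT the paper's equations):
      dH/dt = alpha * (1 - D) * H * (1 - H) - beta * D * H  # practice builds skill; disuse erodes it
      dD/dt = gamma * D * (1 - D) * (1 - H)                 # delegation grows when capability lags
    """
    h, d = h0, d0
    for _ in range(steps):
        dh = alpha * (1 - d) * h * (1 - h) - beta * d * h
        dd = gamma * d * (1 - d) * (1 - h)
        h += dt * dh
        d += dt * dd
    return h, d
```

Even this toy form exhibits the coupling the axioms imply: as delegation D rises, the practice term (1 - D) shrinks and the forgetting term grows, so capability can decay unless practice is preserved, which is the mechanism behind the mandatory-practice counterfactual above.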
These results demonstrate a practical path toward high-precision, low-latency text-to-SQL applications using domain-specialized, self-hosted language models in large-scale production environments.
Conclusion drawn by the authors based on their implementation, token reduction, and reported accuracy/latency-related claims; generalization to large-scale production is asserted but not supported by detailed production deployment metrics in the excerpt.
The resulting system achieves 98.4% execution success and 92.5% semantic accuracy, substantially outperforming a prompt-engineered baseline using Google's Gemini Flash 2.0 (95.6% execution, 89.4% semantic accuracy).
Reported empirical evaluation comparing the authors' system to a prompt-engineered baseline (Gemini Flash 2.0) with explicit performance percentages for execution success and semantic accuracy; no sample size, test set composition, statistical significance, or evaluation protocol provided in the excerpt.
The approach replaces costly external API calls with efficient local inference.
System design claim: the model is self-hosted and performs local inference instead of using external API-based LLM calls; no cost accounting or latency benchmarks provided in the excerpt.
This reduces input tokens by over 99%, from a 17k-token baseline to fewer than 100.
Reported measurement comparing input token counts before and after applying their approach (explicit numerical baseline and resulting counts provided); no sample size or distribution of token counts reported.
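The stated reduction can be checked directly from the two figures given, using the upper bound of 100 tokens:

```python
baseline_tokens = 17_000   # reported long-context baseline
reduced_tokens = 100       # upper bound on the post-fine-tuning prompt
reduction = 1 - reduced_tokens / baseline_tokens
print(f"{reduction:.1%}")  # 99.4% — consistent with the ">99%" claim
```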
A novel two-phase supervised fine-tuning approach enables the model to internalize the entire database schema, eliminating the need for long-context prompts.
Methodological description (two-phase supervised fine-tuning) and claim that this internalization removes reliance on long-context prompts; no detailed experimental protocol or sample size provided in the excerpt.
We present a specialized, self-hosted 8B-parameter model designed for a conversational bot in CriQ, a sister app to Dream11 that answers user queries about cricket statistics.
Stated implementation detail in the paper describing the model architecture and deployment target (CriQ conversational bot). No experimental sample size reported for this statement.
Legal professionals, courts, and regulators should replace the outdated 'black box' mental model with verification protocols based on how these systems actually fail.
Policy recommendation stated in the abstract based on the paper's analysis; no trial or deployment evidence of such protocols provided in the excerpt.
The adoption of generative AI across commercial and legal professions offers dramatic efficiency gains.
Asserted in the paper's introduction/abstract; no empirical data, sample, or quantitative study reported in the excerpt.
Those extended-model equilibria also show increasing concentration consistent with power-law-like distributions (i.e., winner-take-most / superstar effects).
Theoretical model combining quality heterogeneity and reinforcement dynamics that yields equilibrium distributions with heavy tails; argument and formalization presented in the paper; no empirical testing reported.
Even as the number of producers increases and average attention per producer falls, total output expands (production scales elastically).
Same formal theoretical model (analytical result): production scales elastically in the model despite finite attention; no empirical validation provided.
Mechanisms identified — network structure evolution and increased relational embeddedness — contribute to a broader understanding of how digital transformation shapes innovation dynamics across geographical boundaries in a globalized knowledge economy.
Synthesis of empirical network evolution results and mediation/structural analyses from the 2011–2021 dataset of digital transformation indicators and patent collaboration networks among cities and firms.
These results provide empirical evidence from a major emerging economy (China) that can offer insights to inform policies and strategies in other regions undergoing digital transition.
Generalization claim based on empirical findings from the 2011–2021 analysis of A-share listed companies' digital transformation and patent collaboration patterns in China.
When the volume of digital patent applications surpasses a certain threshold, the positive effect of digital transformation on the quality of cross-regional collaborative innovation accelerates (nonlinear threshold effect).
Threshold regression / nonlinear analysis relating counts of digital patent applications to the marginal effect of digital transformation on collaborative innovation quality, using 2011–2021 patent and digitalization data from A-share listed firms.
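The excerpt does not give the estimation details; a common way to fit a single-threshold specification is a grid search over candidate thresholds with OLS in each regime (Hansen-style threshold regression). A minimal sketch with illustrative variable names (z stands for the threshold variable, e.g. digital patent application volume):

```python
import numpy as np

def threshold_fit(x, z, y, candidates):
    """Grid-search single-threshold regression: y = a + b1*x + b2*x*1{z > tau} + e.
    Returns the tau from `candidates` minimizing the residual sum of squares."""
    x, z, y = map(np.asarray, (x, z, y))
    best = (np.inf, None)
    for tau in candidates:
        above = (z > tau).astype(float)
        X = np.column_stack([np.ones_like(x), x, x * above])  # regime-shifted slope
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        ssr = np.sum((y - X @ coef) ** 2)
        if ssr < best[0]:
            best = (ssr, tau)
    return best[1]
```

The selected tau is the point where the estimated marginal effect of x on y changes, matching the reported acceleration of the digital-transformation effect above a threshold.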
Advancement of digital transformation positively contributes to both the quality and the quantity of cross-regional cooperative innovation.
Empirical econometric analysis (panel regressions) linking measures of corporate/urban digital transformation to indicators of cross-regional cooperative innovation quality and counts, using A-share listed companies' digital transformation indicators and patent collaboration data, 2011–2021.
China’s urban collaborative innovation network demonstrates a notable quadrilateral spatial structure and has evolved toward a multicenter pattern over time.
Spatio-temporal network analysis based on the same 2011–2021 dataset of digital transformation indicators and patent/co-patent links among cities inferred from A-share listed companies' patent data.
The cooperative innovation network exhibits pronounced small-world characteristics.
Network analysis of cross-regional collaborative innovation using digital transformation and patent data from A-share listed companies on the Shanghai and Shenzhen stock exchanges (2011–2021).
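Small-world structure means clustering stays high (as in a lattice) while average path lengths are short (as in a random graph). A self-contained sketch of the standard diagnostic on a Watts-Strogatz style network; the parameters are illustrative and unrelated to the paper's data:

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Ring of n nodes, each linked to k//2 neighbors on either side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    return adj

def rewire(adj, p, rng):
    """Watts-Strogatz style rewiring: each edge moves to a random endpoint with prob p."""
    n = len(adj)
    for u in range(n):
        for v in sorted(adj[u]):
            if v > u and rng.random() < p:
                w = rng.randrange(n)
                if w != u and w not in adj[u]:
                    adj[u].remove(v); adj[v].remove(u)
                    adj[u].add(w); adj[w].add(u)
    return adj

def clustering(adj):
    """Mean local clustering coefficient."""
    cs = []
    for u, nb in adj.items():
        nb = list(nb)
        if len(nb) < 2:
            cs.append(0.0)
            continue
        links = sum(1 for i in range(len(nb)) for j in range(i + 1, len(nb))
                    if nb[j] in adj[nb[i]])
        cs.append(2 * links / (len(nb) * (len(nb) - 1)))
    return sum(cs) / len(cs)

def avg_path_length(adj):
    """Mean BFS distance over reachable ordered pairs."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```

On a 200-node ring lattice of degree 6 (clustering exactly 0.6), rewiring about 10% of edges typically leaves clustering near the lattice value while sharply shortening average paths: the small-world signature the network analysis reports.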
This work offers a cost-effective, scientifically grounded blueprint for ubiquitous AI education.
Authors' concluding statement based on the SOP, low labor/hardware claims, and the pilot exam results showing high accuracy with the Shadow Agent in newer 32B models.
This suggests that structured reasoning guidance (as implemented by the Shadow Agent) is the key to unlocking the latent power of modern small language models.
Interpretive claim based on the pilot study's observed large gains for newer 32B models when using Shadow Agent guidance versus smaller gains for older models and stagnation in baselines.
In contrast, older models see only modest gains (~10%) from the Shadow Agent guidance.
Same pilot study reporting that older (unspecified) model generations showed only about a ~10% improvement when using the Shadow Agent versus baseline. No exact accuracy numbers, sample size, or model names provided.
The Shadow Agent, which provides structured reasoning guidance, triggers a massive capability surge in newer 32B models, boosting performance from 74% (Naive RAG) to mastery level (90%).
Pilot study on a full graduate-level final exam reported comparisons between Naive RAG (74% accuracy) and the Shadow Agent (90% accuracy) for newer 32B models. Specific number of exam items or statistical testing not stated.
We used a Vision-Language Model data cleaning strategy and a novel Shadow-RAG architecture as core technical components of the localization pipeline.
Methodological description in the practitioner report; the paper explicitly names these two techniques as the data-cleaning and architectural contributions used to create the tutor.
Using a Vision-Language Model data cleaning strategy and a novel Shadow-RAG architecture, we localized a graduate-level Applied Mathematics tutor using only 3 person-days of non-expert labor and open-weights 32B models deployable on a single consumer-grade GPU.
Practitioner report describing a replicable Standard Operating Procedure (SOP); method claims include Vision-Language Model data cleaning and Shadow-RAG; deployment described as using open-weight 32B models on a single consumer GPU; labor reported as '3 person-days of non-expert labor'. No sample size or independent replication reported in text.
If you can prove the value of the agent memory produced by API token spending, and the effort behind it, you can resell that memory.
Normative/operational claim within the paper's proposal; presented as an implication of verifiable provenance and market layering, with no empirical proof or transactional data.
Enabling timely memory transfer reduces repeated exploration.
Argument in the paper asserting that shared/tradable memory decreases redundant exploration; no experimental or observational data provided.
Together, clawgang and meowtrade transform one-shot API token spending into reusable and tradable assets.
High-level systems argument in the paper; no empirical measurements of reuse or tradability presented.
Meowtrade is a market layer for listing, transferring, and governing certified memory artifacts.
Design proposal described in the paper; no pilot deployment, user adoption metrics, or experimental data provided.
Clawgang binds memory to verifiable computational provenance.
System/design claim describing the proposed mechanism (clawgang) in the paper; no implementation results or empirical validation reported.
Agent memory can serve as an economic commodity in the agent economy, if buyers can verify that it is authentic, effort-backed, and produced in a compatible execution context.
Conceptual argument in the paper's proposal; no empirical evaluation, sample size, or experiments reported.
Economic theory can be used to generate structured synthetic data that improves foundation-model predictions when the theory implies observable patterns in the data.
General conclusion drawn from the paper's experimental findings: improvement in model predictions after fine-tuning on theory-derived synthetic data.
Fine-tuning on GARP-consistent synthetic data substantially improves prediction relative to zero-shot Chronos-2 at all forecast horizons we study.
Empirical results comparing fine-tuned Chronos-2 to zero-shot Chronos-2 across multiple forecast horizons on the authors' experimental panel (no numeric metrics or sample sizes given in the excerpt).
The fine-tuned model serves as a rationality-constrained forecasting prior: it learns price-quantity relations from GARP-consistent synthetic histories and then uses those relations to predict the choices of real consumers.
Empirical approach described in paper: model fine-tuned on synthetic GARP-consistent histories and then evaluated on real consumer choice data (supports claim that model transfers learned relations to predicting real choices).
GARP is a simple condition to check that allows us to generate time series from a large class of utilities efficiently.
Methodological argument in the paper: authors use GARP as a constructive condition to generate synthetic time series from many utility functions (no numeric efficiency metrics provided in the excerpt).
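GARP is indeed a simple, constructive condition to check (following Varian's revealed-preference analysis): bundle x_s is directly revealed preferred to x_t when p_s·x_s ≥ p_s·x_t; taking the transitive closure R, GARP requires that x_s R x_t never coexist with p_t·x_t > p_t·x_s. A minimal sketch of the checker (variable names are illustrative); a checker like this can screen candidate synthetic price-quantity histories for consistency:

```python
import numpy as np

def satisfies_garp(prices, bundles):
    """GARP check. prices, bundles: (T, G) arrays of observed price vectors
    and chosen quantity bundles over T observations and G goods."""
    P, X = np.asarray(prices, float), np.asarray(bundles, float)
    T = len(P)
    exp = np.einsum("tg,sg->ts", P, X)  # exp[t, s] = p_t . x_s
    own = np.diag(exp)                  # p_t . x_t
    # Direct weak revealed preference: x_t R0 x_s iff p_t.x_t >= p_t.x_s
    R = own[:, None] >= exp - 1e-9
    # Transitive closure (Warshall over booleans)
    for k in range(T):
        R |= R[:, k:k + 1] & R[k:k + 1, :]
    # Violation: x_s R x_t while x_t is strictly directly revealed preferred to x_s
    strict = own[:, None] > exp + 1e-9  # strict[t, s]: p_t.x_t > p_t.x_s
    return not np.any(R & strict.T)
```

Because the check is a single matrix closure, it is cheap to run inside a rejection-sampling loop, which is what makes efficient generation of GARP-consistent series from a large class of utilities plausible.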
Teaching foundation models basic economic logic improves how they predict demand, as evaluated on an experimental panel.
Reported experimental results in the paper: fine-tuning models on synthetic, economics-consistent data and evaluating on an experimental panel of consumer demand (no numeric sample size or metrics provided in the excerpt).
AI adoption and the associated improved governance lead to higher total factor productivity (TFP).
Empirical analysis showing a positive association between firm-level AI application index and measures of total factor productivity in the 2010–2023 Chinese A-share panel.
AI adoption and the associated improved governance lead to a lower cost of debt financing for firms.
Empirical tests linking firm-level AI application and governance improvements to measures of debt financing costs (e.g., interest rates on debt, financing spreads) in the Chinese A-share firm sample.
The governance risk-mitigation effects of AI operate through enhancing external monitoring.
Mechanism analyses showing that AI adoption is associated with measures of stronger external monitoring (e.g., analyst coverage, media scrutiny, regulator activity) in the firm-year panel, linking that channel to reduced misconduct.
The governance risk-mitigation effects of AI operate through strengthening internal control capacity.
Mechanism analyses showing that higher AI application is associated with improved internal control measures (as reported by firms or regulatory/financial-control indicators) in the dataset of Chinese A-share firms.
The governance risk-mitigation effects of AI operate through lowering agency costs.
Mechanism analyses reported by authors linking AI adoption to reductions in measures interpreted as agency costs (e.g., agency-cost proxies, corporate governance metrics) in the same firm-year panel.
AI application significantly reduces the monetary amount of penalties associated with executive misconduct.
Regression analyses on monetary penalty data for Chinese A-share firms (2010–2023) showing a statistically significant negative relationship between firm AI application index and penalty amounts.
AI application significantly reduces the frequency (number) of violations by executives.
Empirical frequency/regression analyses on the firm-year panel of Chinese A-share firms using the AI application index; authors report robust reductions in the number/frequency of violations conditional on AI adoption.
AI application significantly reduces the incidence of executive misconduct.
Empirical analysis on Chinese A-share listed firms (2010–2023) using the constructed firm-level AI application index; reported significant negative association between AI application and whether a firm experiences executive misconduct (incidence).
Using Chinese A-share firms listed in Shanghai and Shenzhen from 2010 to 2023, we construct a firm-level AI application index and examine whether and how AI adoption mitigates executive misconduct.
Authors report building a firm-level AI application index and applying it to Chinese A-share listed firms (Shanghai and Shenzhen) over 2010–2023 to study links between AI adoption and executive misconduct (method: panel analysis using firm-year observations).
Applying our framework to product listings on Etsy, we find that following ChatGPT's release, listings have significantly more machine-usable information about product selection, consistent with systematic mechanudging.
Empirical analysis of Etsy product listings comparing measures of 'machine-usable information about product selection' before and after ChatGPT's release. (The abstract states a significant increase; full paper presumably contains dataset details and statistical tests, but sample size and exact estimates are not provided in the excerpt.)
Adoption of AI can reduce procurement costs by 15.7%.
Field survey data (n=326) and regression analysis; authors report a 15.7% reduction in procurement costs associated with AI adoption.