Evidence (5267 claims)
| Topic | Claims |
|---|---|
| Adoption | 5267 |
| Productivity | 4560 |
| Governance | 4137 |
| Human-AI Collaboration | 3103 |
| Labor Markets | 2506 |
| Innovation | 2354 |
| Org Design | 2340 |
| Skills & Training | 1945 |
| Inequality | 1322 |
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 378 | 106 | 59 | 455 | 1007 |
| Governance & Regulation | 379 | 176 | 116 | 58 | 739 |
| Research Productivity | 240 | 96 | 34 | 294 | 668 |
| Organizational Efficiency | 370 | 82 | 63 | 35 | 553 |
| Technology Adoption Rate | 296 | 118 | 66 | 29 | 513 |
| Firm Productivity | 277 | 34 | 68 | 10 | 394 |
| AI Safety & Ethics | 117 | 177 | 44 | 24 | 364 |
| Output Quality | 244 | 61 | 23 | 26 | 354 |
| Market Structure | 107 | 123 | 85 | 14 | 334 |
| Decision Quality | 168 | 74 | 37 | 19 | 301 |
| Fiscal & Macroeconomic | 75 | 52 | 32 | 21 | 187 |
| Employment Level | 70 | 32 | 74 | 8 | 186 |
| Skill Acquisition | 89 | 32 | 39 | 9 | 169 |
| Firm Revenue | 96 | 34 | 22 | — | 152 |
| Innovation Output | 106 | 12 | 21 | 11 | 151 |
| Consumer Welfare | 70 | 30 | 37 | 7 | 144 |
| Regulatory Compliance | 52 | 61 | 13 | 3 | 129 |
| Inequality Measures | 24 | 68 | 31 | 4 | 127 |
| Task Allocation | 75 | 11 | 29 | 6 | 121 |
| Training Effectiveness | 55 | 12 | 12 | 16 | 96 |
| Error Rate | 42 | 48 | 6 | — | 96 |
| Worker Satisfaction | 45 | 32 | 11 | 6 | 94 |
| Task Completion Time | 78 | 5 | 4 | 2 | 89 |
| Wages & Compensation | 46 | 13 | 19 | 5 | 83 |
| Team Performance | 44 | 9 | 15 | 7 | 76 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 18 | 17 | 9 | 5 | 50 |
| Job Displacement | 5 | 31 | 12 | — | 48 |
| Social Protection | 21 | 10 | 6 | 2 | 39 |
| Developer Productivity | 29 | 3 | 3 | 1 | 36 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Skill Obsolescence | 3 | 19 | 2 | — | 24 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Labor Share of Income | 10 | 4 | 9 | — | 23 |
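Because the outcome categories differ widely in size, raw counts are hard to compare across rows. The following is a minimal sketch of one way to normalize them, using four rows copied from the matrix above. The helper names `positive_share` and `rank_by_positive_share` are hypothetical, and dividing by the listed Total (which in some rows is larger than the sum of the four direction columns) is an assumption of this illustration rather than something the dashboard specifies.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
# A dash in the table is entered here as 0 (an assumption of this sketch).
MATRIX = {
    "Firm Productivity": (277, 34, 68, 10, 394),
    "Task Completion Time": (78, 5, 4, 2, 89),
    "Inequality Measures": (24, 68, 31, 4, 127),
    "Job Displacement": (5, 31, 12, 0, 48),
}


def positive_share(row):
    """Fraction of a row's listed total that is coded Positive."""
    positive, _negative, _mixed, _null, total = row
    return positive / total


def rank_by_positive_share(matrix):
    """Return (outcome, share) pairs sorted from most to least positive."""
    shares = {name: positive_share(row) for name, row in matrix.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


for outcome, share in rank_by_positive_share(MATRIX):
    print(f"{outcome}: {share:.1%} of listed claims coded Positive")
```

A share computed this way says only how claims were coded, not how strong the underlying evidence is; a row with few claims can move a long way with one additional paper.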
Adoption
These results demonstrate a practical path toward high-precision, low-latency text-to-SQL applications using domain-specialized, self-hosted language models in large-scale production environments.
Conclusion drawn by the authors based on their implementation, token reduction, and reported accuracy/latency-related claims; generalization to large-scale production is asserted but not supported by detailed production deployment metrics in the excerpt.
The resulting system achieves 98.4% execution success and 92.5% semantic accuracy, substantially outperforming a prompt-engineered baseline using Google's Gemini Flash 2.0 (95.6% execution, 89.4% semantic accuracy).
Reported empirical evaluation comparing the authors' system to a prompt-engineered baseline (Gemini Flash 2.0) with explicit performance percentages for execution success and semantic accuracy; no sample size, test set composition, statistical significance, or evaluation protocol provided in the excerpt.
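The gap between the two systems can be read either in percentage points or as a relative reduction in failures. The sketch below does only that arithmetic on the four reported figures; the function names are hypothetical, and because no sample size is given in the excerpt, nothing here speaks to statistical significance.

```python
# Figures as reported in the claim above, in percent.
REPORTED = {
    "execution success": {"fine_tuned": 98.4, "baseline": 95.6},
    "semantic accuracy": {"fine_tuned": 92.5, "baseline": 89.4},
}


def point_gain(fine_tuned, baseline):
    """Absolute improvement in percentage points."""
    return fine_tuned - baseline


def relative_error_reduction(fine_tuned, baseline):
    """Share of the baseline's failures that the fine-tuned model removes."""
    baseline_error = 100.0 - baseline
    new_error = 100.0 - fine_tuned
    return (baseline_error - new_error) / baseline_error


for metric, scores in REPORTED.items():
    gain = point_gain(scores["fine_tuned"], scores["baseline"])
    cut = relative_error_reduction(scores["fine_tuned"], scores["baseline"])
    print(f"{metric}: +{gain:.1f} points, {cut:.0%} fewer failures")
```

Reading the result as a cut in the failure rate is a design choice: when both systems score above 89%, a gain of about three points corresponds to a much larger proportional change in the remaining errors.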
The approach replaces costly external API calls with efficient local inference.
System design claim: the model is self-hosted and performs local inference instead of using external API-based LLM calls; no cost accounting or latency benchmarks provided in the excerpt.
This reduces input tokens by over 99%, from a 17k-token baseline to fewer than 100.
Reported measurement comparing input token counts before and after applying their approach (explicit numerical baseline and resulting counts provided); no sample size or distribution of token counts reported.
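The "over 99%" figure follows directly from the two token counts. A minimal check, assuming "17k" means 17,000 tokens and treating "fewer than 100" as an upper bound of 100 (both are readings of the claim, not numbers from the paper's data):

```python
def token_reduction(baseline_tokens, reduced_tokens):
    """Fractional reduction in input tokens relative to the baseline."""
    return 1.0 - reduced_tokens / baseline_tokens


BASELINE_TOKENS = 17_000    # "17k-token baseline", taken as 17,000 (assumption)
REDUCED_UPPER_BOUND = 100   # "fewer than 100", so the true count is lower

# Using the upper bound gives a lower bound on the reduction.
lower_bound = token_reduction(BASELINE_TOKENS, REDUCED_UPPER_BOUND)
print(f"reduction is at least {lower_bound:.2%}")
```

Under these assumptions the reduction is a little over 99.4%, which is consistent with the claim as stated.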
A novel two-phase supervised fine-tuning approach enables the model to internalize the entire database schema, eliminating the need for long-context prompts.
Methodological description (two-phase supervised fine-tuning) and claim that this internalization removes reliance on long-context prompts; no detailed experimental protocol or sample size provided in the excerpt.
We present a specialized, self-hosted 8B-parameter model designed for a conversational bot in CriQ, a sister app to Dream11 that answers user queries about cricket statistics.
Stated implementation detail in the paper describing the model architecture and deployment target (CriQ conversational bot). No experimental sample size reported for this statement.
Those extended-model equilibria also show increasing concentration consistent with power-law-like distributions (i.e., winner-take-most / superstar effects).
Theoretical model combining quality heterogeneity and reinforcement dynamics that yields equilibrium distributions with heavy tails; argument and formalization presented in the paper; no empirical testing reported.
Even as the number of producers increases and average attention per producer falls, total output expands (production scales elastically).
Same formal theoretical model (analytical result): production scales elastically in the model despite finite attention; no empirical validation provided.
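To make the concentration mechanism concrete, here is a toy simulation in which each new unit of attention goes to a producer with probability proportional to quality times attention already received. This is a generic reinforcement process chosen for illustration, not the paper's model: the uniform quality distribution, the starting allocation, and every parameter value are assumptions, and the sketch covers concentration only, not the claim about total output.

```python
import random


def simulate_attention(n_producers, n_steps, seed=0):
    """Allocate attention one unit at a time with quality-weighted reinforcement."""
    rng = random.Random(seed)
    # Heterogeneous quality; the uniform range is an arbitrary assumption.
    quality = [rng.uniform(0.5, 1.5) for _ in range(n_producers)]
    attention = [1] * n_producers  # every producer starts with one unit
    for _ in range(n_steps):
        # Reinforcement: weight grows with attention already accumulated.
        weights = [q * a for q, a in zip(quality, attention)]
        winner = rng.choices(range(n_producers), weights=weights, k=1)[0]
        attention[winner] += 1
    return attention


def top_share(attention, fraction=0.1):
    """Share of all attention held by the top `fraction` of producers."""
    ranked = sorted(attention, reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


attention = simulate_attention(n_producers=100, n_steps=10_000)
print(f"top 10% of producers hold {top_share(attention):.0%} of attention")
```

Under an even split the top 10% would hold 10% of attention; the size of the gap above that is what "winner-take-most" refers to in this toy setting.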
Mechanisms identified — network structure evolution and increased relational embeddedness — contribute to a broader understanding of how digital transformation shapes innovation dynamics across geographical boundaries in a globalized knowledge economy.
Synthesis of empirical network evolution results and mediation/structural analyses from the 2011–2021 dataset of digital transformation indicators and patent collaboration networks among cities and firms.
These results provide empirical evidence from a major emerging economy (China) that can offer insights to inform policies and strategies in other regions undergoing digital transition.
Generalization claim based on empirical findings from the 2011–2021 analysis of A-share listed companies' digital transformation and patent collaboration patterns in China.
When the volume of digital patent applications surpasses a certain threshold, the positive effect of digital transformation on the quality of cross-regional collaborative innovation accelerates (nonlinear threshold effect).
Threshold regression / nonlinear analysis relating counts of digital patent applications to the marginal effect of digital transformation on collaborative innovation quality, using 2011–2021 patent and digitalization data from A-share listed firms.
Advancement of digital transformation positively contributes to both the quality and the quantity of cross-regional cooperative innovation.
Empirical econometric analysis (panel regressions) linking measures of corporate/urban digital transformation to indicators of cross-regional cooperative innovation quality and counts, using A-share listed companies' digital transformation indicators and patent collaboration data, 2011–2021.
China’s urban collaborative innovation network demonstrates a notable quadrilateral spatial structure and has evolved toward a multicenter pattern over time.
Spatio-temporal network analysis based on the same 2011–2021 dataset of digital transformation indicators and patent/co-patent links among cities inferred from A-share listed companies' patent data.
The cooperative innovation network exhibits pronounced small-world characteristics.
Network analysis of cross-regional collaborative innovation using digital transformation and patent data from A-share listed companies on the Shanghai and Shenzhen stock exchanges (2011–2021).
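"Small-world" conventionally means a network that keeps high local clustering while having short average path lengths. The excerpt does not report the paper's own network metrics, so the sketch below only illustrates the two diagnostics on a toy graph: a ring lattice with a few long-range chords added. The graph, its size, and all function names are assumptions of this illustration and have nothing to do with the patent data.

```python
from collections import deque


def ring_lattice(n, k):
    """Undirected ring where each node links to its k nearest neighbours per side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def add_shortcuts(adj, stride):
    """Return a copy with a chord from every stride-th node to the opposite node."""
    n = len(adj)
    new = {node: set(neigh) for node, neigh in adj.items()}
    for i in range(0, n, stride):
        j = (i + n // 2) % n
        new[i].add(j)
        new[j].add(i)
    return new


def average_clustering(adj):
    """Mean over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for neigh in adj.values():
        d = len(neigh)
        if d < 2:
            continue
        links = sum(1 for u in neigh for v in neigh if u < v and v in adj[u])
        total += 2.0 * links / (d * (d - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over reachable pairs, via breadth-first search."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


ring = ring_lattice(n=60, k=2)
shortcut = add_shortcuts(ring, stride=10)
for name, graph in (("ring lattice", ring), ("with shortcuts", shortcut)):
    print(name, round(average_clustering(graph), 3),
          round(average_path_length(graph), 3))
```

In this toy case the three added chords shorten average paths while clustering stays close to the lattice value, which is the pattern the small-world label describes.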
This work offers a cost-effective, scientifically grounded blueprint for ubiquitous AI education.
Authors' concluding statement based on the SOP, low labor/hardware claims, and the pilot exam results showing high accuracy with the Shadow Agent in newer 32B models.
This suggests that structured reasoning guidance (as implemented by the Shadow Agent) is the key to unlocking the latent power of modern small language models.
Interpretive claim based on the pilot study's observed large gains for newer 32B models when using Shadow Agent guidance versus smaller gains for older models and stagnation in baselines.
In contrast, older models see only modest gains (~10%) from the Shadow Agent guidance.
Same pilot study, which reports that older (unspecified) model generations improved by only about 10% when using the Shadow Agent versus baseline. No exact accuracy numbers, sample size, or model names provided.
The Shadow Agent, which provides structured reasoning guidance, triggers a massive capability surge in newer 32B models, boosting performance from 74% (Naive RAG) to mastery level (90%).
Pilot study on a full graduate-level final exam reported comparisons between Naive RAG (74% accuracy) and the Shadow Agent (90% accuracy) for newer 32B models. Specific number of exam items or statistical testing not stated.
We used a Vision-Language Model data cleaning strategy and a novel Shadow-RAG architecture as core technical components of the localization pipeline.
Methodological description in the practitioner report; the paper explicitly names these two techniques as the data-cleaning and architectural contributions used to create the tutor.
Using a Vision-Language Model data cleaning strategy and a novel Shadow-RAG architecture, we localized a graduate-level Applied Mathematics tutor using only 3 person-days of non-expert labor and open-weights 32B models deployable on a single consumer-grade GPU.
Practitioner report describing a replicable Standard Operating Procedure (SOP); method claims include Vision-Language Model data cleaning and Shadow-RAG; deployment described as using open-weight 32B models on a single consumer GPU; labor reported as '3 person-days of non-expert labor'. No sample size or independent replication reported in text.
AI adoption and the associated improved governance lead to higher total factor productivity (TFP).
Empirical analysis showing a positive association between firm-level AI application index and measures of total factor productivity in the 2010–2023 Chinese A-share panel.
AI adoption and the associated improved governance lead to a lower cost of debt financing for firms.
Empirical tests linking firm-level AI application and governance improvements to measures of debt financing costs (e.g., interest rates on debt, financing spreads) in the Chinese A-share firm sample.
The governance risk-mitigation effects of AI operate through enhancing external monitoring.
Mechanism analyses showing that AI adoption is associated with measures of stronger external monitoring (e.g., analyst coverage, media scrutiny, regulator activity) in the firm-year panel, linking that channel to reduced misconduct.
The governance risk-mitigation effects of AI operate through strengthening internal control capacity.
Mechanism analyses showing that higher AI application is associated with improved internal control measures (as reported by firms or regulatory/financial-control indicators) in the dataset of Chinese A-share firms.
The governance risk-mitigation effects of AI operate through lowering agency costs.
Mechanism analyses reported by authors linking AI adoption to reductions in measures interpreted as agency costs (e.g., agency-cost proxies, corporate governance metrics) in the same firm-year panel.
AI application significantly reduces the monetary amount of penalties associated with executive misconduct.
Regression analyses on monetary penalty data for Chinese A-share firms (2010–2023) showing a statistically significant negative relationship between firm AI application index and penalty amounts.
AI application significantly reduces the frequency (number) of violations by executives.
Empirical frequency/regression analyses on the firm-year panel of Chinese A-share firms using the AI application index; authors report robust reductions in the number/frequency of violations conditional on AI adoption.
AI application significantly reduces the incidence of executive misconduct.
Empirical analysis on Chinese A-share listed firms (2010–2023) using the constructed firm-level AI application index; reported significant negative association between AI application and whether a firm experiences executive misconduct (incidence).
Using Chinese A-share firms listed in Shanghai and Shenzhen from 2010 to 2023, we construct a firm-level AI application index and examine whether and how AI adoption mitigates executive misconduct.
Authors report building a firm-level AI application index and applying it to Chinese A-share listed firms (Shanghai and Shenzhen) over 2010–2023 to study links between AI adoption and executive misconduct (method: panel analysis using firm-year observations).
Applying our framework to product listings on Etsy, we find that following ChatGPT's release, listings have significantly more machine-usable information about product selection, consistent with systematic mecha-nudging.
Empirical analysis of Etsy product listings comparing measures of 'machine-usable information about product selection' before and after ChatGPT's release. (The abstract states a significant increase; full paper presumably contains dataset details and statistical tests, but sample size and exact estimates are not provided in the excerpt.)
Adoption of AI can reduce procurement costs by 15.7%.
Field survey data (n=326) and regression analysis; authors report a 15.7% reduction in procurement costs associated with AI adoption.
Adoption of AI can shorten the procurement decision-making cycle by 21.3%.
Field survey data (n=326) collected by questionnaire and analyzed with multiple linear regression; authors report a 21.3% reduction in the procurement decision-making cycle associated with AI adoption.
Supplier AI capability positively drives AI adoption in procurement (β = 0.28, p < 0.01).
Same questionnaire survey (n=326) and multiple linear regression analysis; reported coefficient β=0.28 with p<0.01.
Perceived usefulness positively drives AI adoption in procurement (β = 0.32, p < 0.01).
Questionnaire survey of 326 procurement managers/supply chain managers in SMEs (Yangtze River Delta and Pearl River Delta) analyzed using multiple linear regression; reported coefficient β=0.32 with p<0.01.
The paper provides recommendations for designing strategic indicators to drive adoption, foster innovation, and objectively assess whether digital tools are delivering top-line impact.
Descriptive claim about the content of the perspective article (the authors state they provide these recommendations); the excerpt itself summarizes this contribution.
The shift from expert-driven computer-aided drug design (CADD) to semiautonomous AI necessitates a new framework of impact-oriented KPIs.
Stated by the EFMC2 community authors as a normative conclusion in the perspective piece; based on the characterisation of a technological shift rather than on presented empirical tests in the excerpt.
Harnessing AI's potential requires moving beyond measuring technical model performance (e.g., predictive accuracy) to measuring strategic impact.
Authors argue this as a conceptual requirement for realizing AI's benefits in R&D; presented as a recommendation rather than supported by quantified empirical evidence in the excerpt.
Preliminary analyses suggest that 'AI-native' companies may be outpacing traditional peers.
Explicitly stated in the paper as based on preliminary analyses; the excerpt provides no details on the analyses, metrics, or sample sizes.
The broad introduction of AI into the R&D landscape in recent years holds the promise to lift pharmaceutical R&D out of its productivity problem.
Framed as an expectation/promise in the paper; based on recent broad adoption trends of AI in R&D (no specific empirical evaluation or sample size reported in the excerpt).
In this verifiable domain, simple arbitrage strategies generate net profit margins of up to 40%.
Empirical result from the SWE-bench case study comparing arbitrage strategy returns using GPT-5 mini and DeepSeek v3.2 (reported maximum net profit margin = 40%).
Generative AI can autonomously produce novel content, including text, images, models, and scenarios.
General technical/descriptive claim stated in the paper's background/introduction; not an empirically tested claim within the provided excerpt.
Generative AI facilitates the synthesis of structured and unstructured information from diverse sources, enabling managers to explore multiple decision pathways, identify potential risks, and optimize strategic choices.
Descriptive/functional claim made in the paper's introduction and conceptual framing; the empirical component (survey + SEM) is described generally but no specific measures or effect sizes for information synthesis or these capabilities are provided in the excerpt.
Generative AI augments human creativity by producing innovative solutions and scenario-planning alternatives that may not emerge through conventional analytical approaches.
Stated in the conceptual/argumentative portion of the paper; may be supported by survey items, but no explicit empirical measure or effect size for creativity appears in the excerpt.
Decision quality and strategic agility positively influence organizational performance.
Reported SEM results from the paper linking the constructs (decision quality and strategic agility) to organizational performance using survey data from senior managers and AI adoption specialists; estimated with SmartPLS.
Generative AI adoption significantly enhances strategic agility.
Same empirical source as above: survey of senior managers/decision-makers/AI adoption specialists; tested via Structural Equation Modeling (SmartPLS) as reported in the paper.
Generative AI adoption significantly enhances decision quality.
Empirical analysis reported in the paper: survey data collected from senior managers, decision-makers, and AI adoption specialists across multiple industries; relationships assessed using Structural Equation Modeling (SmartPLS). No numeric sample size or effect estimate reported in the provided text.
By enabling developers without initial capital to participate in the digital economy, RSI could unlock the 'latent jobs dividend' in low-income countries and help address local challenges in health, agriculture, and services.
Societal-impact argument in the paper linking the RSI model to potential employment gains and localized solutions; speculative extrapolation, no empirical employment estimates or pilot studies reported.
The RSI model could stimulate innovation in the ecosystem.
Argument based on lowered financial barriers and incentive structures from the paper's theoretical comparative analysis; no empirical measures of innovation provided.
The RSI model aligns stakeholder interests (platforms and developers).
Theoretical argument and incentive-alignment reasoning in the paper's comparative framework; no empirical validation presented.
A comparative analysis in the paper shows that the RSI model lowers entry barriers for developers.
Detailed comparative (theoretical) analysis within the paper contrasting existing models and RSI; no empirical trial, sample, or randomized test reported.