Evidence (2215 claims)

Claim counts by category:

- Adoption: 5126 claims
- Productivity: 4409 claims
- Governance: 4049 claims
- Human-AI Collaboration: 2954 claims
- Labor Markets: 2432 claims
- Org Design: 2273 claims
- Innovation: 2215 claims
- Skills & Training: 1902 claims
- Inequality: 1286 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 369 | 105 | 58 | 432 | 972 |
| Governance & Regulation | 365 | 171 | 113 | 54 | 713 |
| Research Productivity | 229 | 95 | 33 | 294 | 655 |
| Organizational Efficiency | 354 | 82 | 58 | 34 | 531 |
| Technology Adoption Rate | 277 | 115 | 63 | 27 | 486 |
| Firm Productivity | 273 | 33 | 68 | 10 | 389 |
| AI Safety & Ethics | 112 | 177 | 43 | 24 | 358 |
| Output Quality | 228 | 61 | 23 | 25 | 337 |
| Market Structure | 105 | 118 | 81 | 14 | 323 |
| Decision Quality | 154 | 68 | 33 | 17 | 275 |
| Employment Level | 68 | 32 | 74 | 8 | 184 |
| Fiscal & Macroeconomic | 74 | 52 | 32 | 21 | 183 |
| Skill Acquisition | 85 | 31 | 38 | 9 | 163 |
| Firm Revenue | 96 | 30 | 22 | — | 148 |
| Innovation Output | 100 | 11 | 20 | 11 | 143 |
| Consumer Welfare | 66 | 29 | 35 | 7 | 137 |
| Regulatory Compliance | 51 | 61 | 13 | 3 | 128 |
| Inequality Measures | 24 | 66 | 31 | 4 | 125 |
| Task Allocation | 64 | 6 | 28 | 6 | 104 |
| Error Rate | 42 | 47 | 6 | — | 95 |
| Training Effectiveness | 55 | 12 | 10 | 16 | 93 |
| Worker Satisfaction | 42 | 32 | 11 | 6 | 91 |
| Task Completion Time | 71 | 5 | 3 | 1 | 80 |
| Wages & Compensation | 38 | 13 | 19 | 4 | 74 |
| Team Performance | 41 | 8 | 15 | 7 | 72 |
| Hiring & Recruitment | 39 | 4 | 6 | 3 | 52 |
| Automation Exposure | 17 | 15 | 9 | 5 | 46 |
| Job Displacement | 5 | 28 | 12 | — | 45 |
| Social Protection | 18 | 8 | 6 | 1 | 33 |
| Developer Productivity | 25 | 1 | 2 | 1 | 29 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
| Creative Output | 15 | 5 | 3 | 1 | 24 |
| Skill Obsolescence | 3 | 18 | 2 | — | 23 |
| Labor Share of Income | 7 | 4 | 9 | — | 20 |
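For readers who want to slice the matrix programmatically, each row can be treated as a simple tuple. The sketch below (plain Python; the values are copied from three of the table's rows, with "—" read as 0) computes each outcome's share of positive findings.

```python
# Compute each outcome's share of positive findings from the Evidence Matrix.
# Tuples are (outcome, positive, negative, mixed, null, total); values copied
# from three rows of the table above.
rows = [
    ("Firm Productivity", 273, 33, 68, 10, 389),
    ("AI Safety & Ethics", 112, 177, 43, 24, 358),
    ("Job Displacement", 5, 28, 12, 0, 45),
]

for name, pos, neg, mixed, null, total in rows:
    share = pos / total
    print(f"{name}: {share:.0%} positive ({pos}/{total})")
```

The same pattern extends to negative or mixed shares by swapping the numerator.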
Innovation
Per-expert sizes and the overall MoE design are positioned between the GPT-OSS and Qwen3 MoE designs.
Architectural comparison asserted in the paper; claim is based on relative model-design choices (expert count/size) compared to public descriptions of GPT-OSS and Qwen3. The summary provides the positioning but not detailed layer-by-layer comparisons.
An orchestrator coordinates components with intent-aware routing and layered safety checks, enabling multi-step workflows and productized services.
Paper describes an agentic tool-calling framework and multi-layer orchestrator used for intent-aware routing, defense-in-depth safety validation, and multi-step workflows.
Aura is a long-form ASR system capable of handling hours-long audio.
Paper lists Aura in the product stack as 'long-form ASR handling hours-long audio.' Specific evaluation metrics or training data for ASR are not provided in the summary.
Arabic content comprises only about 0.5% of web data despite roughly 400 million native speakers.
Paper cites this data-point to motivate intentional data strategies for Arabic underrepresentation on the web; exact source of the web-proportion not specified in the summary.
Methods among the surveyed systems span token-level code generation to circuit-structure generation, and evaluation metrics are often task- and artifact-specific.
Surveyed system descriptions show diversity in generative approaches (token-level language models, graph/diffusion-based circuit generators, agentic optimizers) and corresponding tailored metrics noted in the review.
Many early-stage AI advances have not translated into higher Phase II/III success rates.
Synthesis of reported outcomes and failures from industry experience; no new systematic statistical analysis provided.
After roughly a decade of adoption in large biopharma, AI has not yet changed late-stage (Phase II/III) clinical success rates.
Qualitative assessment of industrywide experience and reported outcomes; statement based on narrative review rather than systematic, long-run quantitative analysis or causal estimates.
Three primary adoption archetypes in large pharma are (1) partnership-driven acceleration, (2) culture-centric transformation, and (3) production-first democratization.
Conceptual classification in the editorial derived from trends and illustrative examples rather than empirical survey or sampling; no quantitative validation provided.
This paper systematically studies the impact mechanism of artificial intelligence on the globalized division of labor, using literature review and theoretical analysis to reveal a structural transformation driven jointly by technology substitution and data as a production factor (a "dual-wheel drive").
Methodological claim: supported by the paper's literature review and theoretical analysis; no quantitative sample or empirical design indicated for this specific conclusion in the excerpt.
AI adoption is not associated with significant changes in operating costs.
Analysis of operating costs in firm financials showing no significant post-adoption change for adopters relative to nonadopters.
The innovation effects of AI adoption are not concentrated among larger firms, financially unconstrained firms, or high-tech firms.
Heterogeneity tests across firm size, financial constraint status, and industry technology intensity showing no concentration of effects in these groups (as reported in the paper).
There is a gap in the existing literature regarding empirical evidence about the relationship between AI/Big Data use and market uncertainty during economic downturns.
Paper motivates the study by citing this gap based on its literature review (the summary does not list the reviewed works or systematic review method).
Unemployment does not exert a statistically significant impact on GDP growth in the employed model.
Unemployment included among the macroeconomic determinants in the panel regressions but reported as statistically insignificant (no effect) in the provided summary; methods cited include OLS, FE, Difference and System GMM (sample details not included).
Robust methodology (panel VAR and DID) was used to assess the impact of technology and public policy interventions on emissions reductions.
Methods stated in the paper (panel VAR and difference-in-differences); robustness is claimed by the authors based on using these established econometric approaches, though formal robustness checks are not detailed in the summary.
The short‑term effect of AI on labor‑intensive industries is weak.
Short‑run/dynamic subgroup analysis in the China 2003–2017 panel indicating minimal or weak immediate growth effects for labor‑intensive sectors.
The article clarifies theoretical relationships and gaps between Material Passports, Digital Product Passports, and Digital Building Logbooks.
Theoretical analysis and synthesis section of the SLR where the authors compare concepts and identify overlaps and gaps among MPs, DPPs, and DBLs.
Top management support does not have a direct influence on AI Adoption in the sampled firms.
PLS-SEM results from the 207-firm survey showing a non-significant direct path from top management support to AI Adoption (as reported in the paper).
Effort expectancy does not have a direct influence on AI Adoption in the sampled firms.
PLS-SEM results from the 207-firm survey showing a non-significant direct path from effort expectancy to AI Adoption (as reported in the paper).
The paper introduces a novel taxonomy that separates patenting into three domains: core AI, traditional robotics, and AI-enhanced robotics.
Methodological contribution of the paper: construction and application of a classification scheme that assigns patent filings (1980–2019) into three domains (core AI, traditional robotics, AI-enhanced robotics). Data source: patent filings 1980–2019 (aggregate counts by domain and country). Exact number of patents not provided in the summary.
When green-technology innovation is low (below the threshold), the main measurable effect of DE is on improving carbon emission efficiency (CEE), but DE does not yet reduce per capita emissions (PCE).
Results from the threshold-regression models on the 278-city panel (2011–2022) show that in the low-green-innovation regime DE coefficients are significant for CEE but not for PCE; mediating-effect models corroborate the efficiency channel in low-innovation contexts.
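The threshold-regression logic behind this result can be illustrated with synthetic data: the slope of the outcome on DE is estimated separately below and above a threshold in the moderator. This is a minimal sketch with made-up data and a fixed threshold, not the paper's 278-city specification (which grid-searches the threshold and adds controls).

```python
import numpy as np

# Illustrative single-threshold regression: the effect of a regressor (DE) on
# an outcome is allowed to differ below vs. above a threshold in a moderator
# (green-technology innovation). All data here are synthetic.
rng = np.random.default_rng(0)
n = 400
de = rng.normal(size=n)                # digital-economy proxy
green = rng.uniform(0, 1, size=n)      # green-innovation level
tau = 0.5                              # threshold (grid-searched in practice)
# Synthetic data-generating process: DE matters only above the threshold.
y = np.where(green > tau, 0.8 * de, 0.0) + rng.normal(scale=0.5, size=n)

def ols_slope(x, y):
    """Slope from a simple OLS fit with intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

low, high = green <= tau, green > tau
print(f"slope below threshold: {ols_slope(de[low], y[low]):.2f}")
print(f"slope above threshold: {ols_slope(de[high], y[high]):.2f}")
```

The estimated slope is near zero in the low regime and near 0.8 above it, mirroring the regime-dependent DE coefficients reported for CEE.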
Realising DT value requires upfront investment in sensors, integration, standards, and skills; economic viability depends on contract structures and how gains are allocated between investors, owners, contractors, and operators.
Synthesis of cost/benefit discussions and case descriptions in the reviewed literature; policy and procurement examples referenced.
HCI has explored usable consent, but there is no systematic framework for consent in the AI era.
Literature synthesis and gap identification from workshop participants and solicited position papers; no systematic review or meta-analysis with counted studies reported in the summary.
Research priorities include causal studies on productivity gains from AI, firm‑level adoption dynamics, sectoral labor reallocation, long‑run general equilibrium effects, and heterogeneous impacts across regions and demographic groups.
Set of empirical research recommendations drawn from gaps identified in the literature review and limitations section; not an empirical claim but a prioritized research agenda based on secondary evidence.
Growth‑accounting frameworks and measurement approaches must be updated to capture AI/robotics as intangible and embodied capital, including quality improvements and spillovers.
Methodological argument grounded in literature on measurement challenges and examples of intangible capital; no new measurement exercise or empirical re‑estimation is provided in the paper.
Recommendation for research and modeling: economic models of AI markets should incorporate institutional regime types (centralized vs decentralized), enforcement uncertainty, and legitimacy effects as parameters affecting data access costs, R&D productivity, and market concentration.
Normative recommendation based on the comparative typology and inferred mechanisms from the document analysis; not empirically validated within the study.
Theoretical contribution: the paper extends modular coordination theory by treating openness–security trade‑offs as layered, adaptive institutional processes embedded in political regimes and 'legitimacy economies.'
Argumentative/theoretical development in the paper grounded in document analysis and literature on coordination and legitimacy.
The framework offers a replicable model for governments and institutions seeking to proactively support high-potential innovations across sectors.
Paper asserts replicability and applicability to governments/institutions based on the described methods and outputs; no deployment case studies or empirical replication evidence reported in text provided.
A data-driven, foresight-based approach to policy design significantly enhances responsiveness, precision, and resource efficiency in science and technology governance.
Paper concludes this benefit based on its integrated framework, triangulation, Delphi/AHP validation and illustrative mapping; no quantified comparative metrics or experimental evaluation reported in text provided.
Fostering digital transformation alongside workforce reskilling and innovation-ecosystem development is essential for sustainable industrial growth and strengthening Kazakhstan’s global economic position.
Policy and strategic recommendations based on the study's empirical results, case studies, and macro-level index comparisons.
Digital transformation combined with workforce retraining optimizes labor costs and enhances productivity.
Synthesis of enterprise-level case examples and aggregated regression/correlation findings at industry and national levels that link digitalization and retraining programs to labor-cost and productivity indicators.
These findings provide an early empirical baseline and point toward competitive plurality rather than winner-take-all consolidation among engaged users.
Interpretation synthesized from survey results (multi-platform usage, indistinguishable satisfaction among top platforms, differing adoption reasons); overall sample N=388.
Switching costs between platforms are negligible (users treat these tools as interchangeable utilities rather than sticky ecosystems).
Survey responses indicating platform-switching behavior and perceived costs; inference based on reported multi-platform usage and responses about platform loyalty/switching (overall N=388).
These results establish agent scaling as a practical and effective axis for HLS optimization.
Synthesis/interpretation of empirical results (including mean 8.27× speedup and per-benchmark gains) reported in the paper.
Across benchmarks, agents consistently rediscover known hardware optimization patterns without domain-specific training.
Qualitative and empirical observations across the evaluated benchmarks (12) reporting that agents found recognized hardware optimization patterns despite no hardware-specific training.
This work demonstrates the technical feasibility of scalable, AI-augmented quality assessment for early childhood education, and lays a foundation for continuous, inclusive AI-assisted evaluation that enables systemic improvement and equitable growth.
Overall results of dataset release, Interaction2Eval performance (agreement), and deployment efficiency reported in the paper; used by the authors to argue broader feasibility and potential systemic impact.
AI assistance could shift assessment practice from annual expert audits to monthly monitoring with targeted human oversight.
Authors' synthesis combining dataset-scale results, Interaction2Eval performance (agreement), and deployment efficiency gains to argue feasibility of more frequent monitoring.
Digital transformation enhances the relational embeddedness among cities, and this enhanced relational embeddedness facilitates improved outcomes in collaborative innovation (mediating mechanism).
Mediation analysis / network metric analysis using city-level relational embeddedness measures computed from patent collaboration networks and digital transformation indicators from A-share listed companies (2011–2021).
Robust arbitrage strategies remain profitable even when generalized across different domains (claim reiteration emphasizing cross-domain profitability and robustness).
Repeated/strengthened claim in the paper referencing multiple experiments and robustness checks across domains.
An arbitrageur can efficiently allocate inference budget across providers to undercut the market, creating a competitive offering with no model-development risk.
Methodological description and empirical demonstration in the paper showing arbitrageur strategies that allocate inference budget across multiple providers to create a competitive service without incurring model-development risk.
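The mechanism described here can be sketched with entirely hypothetical provider names, per-call costs, and success rates: route each request to the cheapest provider that clears a quality floor, then price the aggregate service just below an incumbent's rate. This is an illustration of the idea, not the paper's strategy.

```python
# Hypothetical sketch of inference-budget arbitrage: route each request to the
# cheapest provider whose measured success rate clears a quality floor, then
# price the aggregate service just below the incumbent's rate.
providers = {  # name: (cost_per_call_usd, success_rate) -- illustrative numbers
    "provider_a": (0.020, 0.62),
    "provider_b": (0.008, 0.55),
    "provider_c": (0.015, 0.58),
}

def route(quality_floor):
    """Pick the cheapest provider meeting the quality floor, or None."""
    eligible = {k: v for k, v in providers.items() if v[1] >= quality_floor}
    return min(eligible, key=lambda k: eligible[k][0]) if eligible else None

incumbent_price = 0.030
choice = route(quality_floor=0.55)
cost = providers[choice][0]
margin = incumbent_price * 0.95 - cost  # undercut the incumbent by 5%
print(choice, round(margin, 4))
```

A positive margin at a price below the incumbent's is what makes the arbitrageur's offering competitive with no model-development risk.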
Arbitrage reduces market segmentation and facilitates market entry for smaller model providers by enabling earlier revenue capture.
Reported analysis and/or experiments suggesting arbitrage homogenizes offerings (reduces segmentation) and allows smaller providers to capture revenue earlier through arbitrage-enabled routes.
Robust arbitrage strategies that generalize across different domains remain profitable.
Reported experiments indicating that arbitrage strategies generalized beyond the primary SWE-bench domain and still yielded profit (authors state robust strategies remain profitable across domains).
Arbitrage is viable in AI model markets (we empirically demonstrate the viability of arbitrage and illustrate its economic consequences).
Empirical experiments and analyses presented in the paper (case study on SWE-bench and additional experiments on arbitrage strategies).
This systematic framework can help predict at a detailed level where today's AI systems can and cannot be used and how future AI capabilities may change this.
Interpretive/utility claim: authors argue that the ontology plus classification results serve as rough predictive tools for AI applicability across work activities.
The results contribute to literature arguing that cloud-based GenAI is a source of enterprise value creation rather than merely an experimental technology.
Paper's stated addition to the existing literature based on the combined empirical and theoretical findings.
Compared with baseline approaches, the ARL-based model's accuracy in revenue and price optimization decreased by less than 20%, indicating that it can adapt and optimize pricing strategies in complex, highly competitive markets.
Reported experimental comparison versus baselines (fixed/rule-based and cost-plus); specific metrics, dataset size, and whether 'decrease' refers to error or accuracy are not clarified in the excerpt.
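As a generic illustration of learned pricing (not the paper's ARL model), a simple epsilon-greedy bandit over discrete price points can learn the revenue-maximizing price under an illustrative noisy demand curve:

```python
# Generic sketch of learned pricing via an epsilon-greedy bandit: the agent
# tries discrete price points and learns which maximizes expected revenue
# under a noisy linear demand curve. All numbers are illustrative.
import random

random.seed(1)
prices = [5, 10, 15, 20]
q = {p: 0.0 for p in prices}      # running revenue estimates per price
counts = {p: 0 for p in prices}

def demand(price):
    """Illustrative linear demand with Gaussian noise, floored at zero."""
    return max(0.0, 30 - 1.5 * price + random.gauss(0, 2))

for step in range(5000):
    # Explore 10% of the time; otherwise exploit the best current estimate.
    p = random.choice(prices) if random.random() < 0.1 else max(q, key=q.get)
    revenue = p * demand(p)
    counts[p] += 1
    q[p] += (revenue - q[p]) / counts[p]  # incremental mean update

best = max(q, key=q.get)
print("learned price:", best)
```

Under this demand curve expected revenue peaks at a price of 10, and the bandit converges to it; a full reinforcement-learning pricing agent would additionally condition on market state.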
Our results substantiate the potential of large language models as a foundational pillar for high-fidelity, scalable decision simulation and downstream analysis of the real economy, grounded in a foundational database.
High-level conclusion drawn from the paper's experiments and methodological contributions; generalization claim asserting LLMs' potential as foundational tools for scalable, high-fidelity decision simulation.
Experiments demonstrate that our framework achieves improved simulation stability compared to existing economic and financial LLM simulation baselines.
Empirical claim: experiments vs. baselines showing improved simulation stability (paper statement that framework improved simulation stability, without quantitative details in the excerpt).
Experiments demonstrate that our framework achieves significant improvements in purchase quantity prediction compared to existing economic and financial LLM simulation baselines.
Empirical claim: experiments comparing MALLES against existing baselines; paper reports 'significant improvements' in purchase quantity prediction (no numerical values provided in the excerpt).
Experiments demonstrate that our framework achieves significant improvements in product selection accuracy compared to existing economic and financial LLM simulation baselines.
Empirical claim: experiments comparing MALLES against existing economic and financial LLM simulation baselines; paper reports 'significant improvements' in product selection accuracy (no numerical values provided in the excerpt).
This preference-learning approach enables the models to internalize and transfer latent consumer preference patterns, thereby mitigating the data sparsity issues prevalent in individual categories.
Claim based on the paper's reported approach: cross-category post-training and transfer of latent preferences; supported by experiments (paper states mitigation of data sparsity).