Evidence (5539 claims)

Claim counts by topic:

- Adoption: 5539
- Productivity: 4793
- Governance: 4333
- Human-AI Collaboration: 3326
- Labor Markets: 2657
- Innovation: 2510
- Org Design: 2469
- Skills & Training: 2017
- Inequality: 1378
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 402 | 112 | 67 | 480 | 1076 |
| Governance & Regulation | 402 | 192 | 122 | 62 | 790 |
| Research Productivity | 249 | 98 | 34 | 311 | 697 |
| Organizational Efficiency | 395 | 95 | 70 | 40 | 603 |
| Technology Adoption Rate | 321 | 126 | 73 | 39 | 564 |
| Firm Productivity | 306 | 39 | 70 | 12 | 432 |
| Output Quality | 256 | 66 | 25 | 28 | 375 |
| AI Safety & Ethics | 116 | 177 | 44 | 24 | 363 |
| Market Structure | 107 | 128 | 85 | 14 | 339 |
| Decision Quality | 177 | 76 | 38 | 20 | 315 |
| Fiscal & Macroeconomic | 89 | 58 | 33 | 22 | 209 |
| Employment Level | 77 | 34 | 80 | 9 | 202 |
| Skill Acquisition | 92 | 33 | 40 | 9 | 174 |
| Innovation Output | 120 | 12 | 23 | 12 | 168 |
| Firm Revenue | 98 | 34 | 22 | — | 154 |
| Consumer Welfare | 73 | 31 | 37 | 7 | 148 |
| Task Allocation | 84 | 16 | 33 | 7 | 140 |
| Inequality Measures | 25 | 77 | 32 | 5 | 139 |
| Regulatory Compliance | 54 | 63 | 13 | 3 | 133 |
| Error Rate | 44 | 51 | 6 | — | 101 |
| Task Completion Time | 88 | 5 | 4 | 3 | 100 |
| Training Effectiveness | 58 | 12 | 12 | 16 | 99 |
| Worker Satisfaction | 47 | 32 | 11 | 7 | 97 |
| Wages & Compensation | 53 | 15 | 20 | 5 | 93 |
| Team Performance | 47 | 12 | 15 | 7 | 82 |
| Automation Exposure | 24 | 22 | 9 | 6 | 62 |
| Job Displacement | 6 | 38 | 13 | — | 57 |
| Hiring & Recruitment | 41 | 4 | 6 | 3 | 54 |
| Developer Productivity | 34 | 4 | 3 | 1 | 42 |
| Social Protection | 22 | 10 | 6 | 2 | 40 |
| Creative Output | 16 | 7 | 5 | 1 | 29 |
| Labor Share of Income | 12 | 5 | 9 | — | 26 |
| Skill Obsolescence | 3 | 20 | 2 | — | 25 |
| Worker Turnover | 10 | 12 | — | 3 | 25 |
Claims filtered to topic: Adoption
Wages for workers in K_T‑intensive firms/industries fall or grow more slowly relative to less-exposed counterparts, compressing wage contributions to income.
Panel regressions estimating wage outcomes conditional on K_T intensity measures, with controls and robustness specifications; supported by matched employer‑employee microdata in case studies and industry-level decompositions.
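The within (fixed-effects) estimator underlying such panel regressions can be sketched as follows. Everything here is a simulated illustration: the data, the uniform K_T intensity measure, and the -0.4 coefficient are assumptions, not the study's estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
n_firms, n_years = 200, 8

# Simulated firm-year panel (illustrative only): firm fixed effects plus
# an assumed negative wage effect of K_T intensity (-0.4).
firm_fe = rng.normal(size=(n_firms, 1))
kt_intensity = rng.uniform(0.0, 1.0, size=(n_firms, n_years))
wages = firm_fe - 0.4 * kt_intensity + 0.1 * rng.normal(size=(n_firms, n_years))

# Within transformation: demeaning each firm's series removes the fixed
# effect, so OLS on the demeaned data recovers the K_T coefficient.
w = wages - wages.mean(axis=1, keepdims=True)
k = kt_intensity - kt_intensity.mean(axis=1, keepdims=True)
beta_kt = (k * w).sum() / (k ** 2).sum()
```

The same within estimator generalizes to multiple controls by stacking the demeaned regressors, which is what the robustness specifications described above would add.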
Strict oversight requirements for GLAI could raise fixed compliance costs (audit, certification, human-in-the-loop processes), benefiting incumbent firms and potentially reducing competition by raising barriers to entry.
Regulatory economics argument drawing on compliance-cost logic and market structure effects; no empirical entry-cost analysis or case studies.
Perception of increased legal risk and regulatory uncertainty may slow adoption of GLAI and redirect investment toward safer subfields (verification tools, retrieval-augmented systems, formal-reasoning hybrids).
Economic reasoning and market-design argumentation based on risk/uncertainty dynamics; no econometric or survey data presented.
Divergent regulatory regimes (e.g., strict EU rules vs. looser regimes elsewhere) may produce regulatory arbitrage, influencing where GLAI companies locate, invest, and trade internationally.
Cross-jurisdictional regulatory analysis and economic inference about firm behavior under differential regulation; no firm-level relocation data provided.
Evaluations that measure outcomes only via official-language channels risk underestimating impacts where vernacular mediation is central.
Argument based on the discrepancy between vernacular-mediated comprehension/adoption observed in the sample and the likely invisibility of those effects in official-language measurement channels; supported by questionnaire and qualitative data.
DPPs raise privacy and surveillance risks if personal data are linked to product use; economic regulation should incentivize privacy-preserving analytics (e.g., federated learning, differential privacy) and data minimality to maintain trust.
Risk assessment and governance recommendation grounded in stakeholder concerns and standard privacy literature; not empirically measured in the surveys.
Automated benchmarks dominate the evaluation of large language models, yet no systematic study has compared user satisfaction, adoption motivations, and frustrations across competing platforms using a consistent instrument.
Statement of the paper's motivation/background; implied literature review and identification of an empirical gap (no systematic, cross-platform user survey reported prior).
Interpretive, ad-hoc human-centered evaluation practices (e.g., “vibe checks”, team sense-making) are rational adaptations to LLM behavior rather than merely sloppy or inferior methodological choices.
Authors' interpretive argument based on interview evidence where practitioners explained why such practices persist and how they serve sense-making for unpredictable model behavior.
The possibility of strategic argument construction (gaming) motivates governance needs: standards for provenance, certification, and liability rules.
Policy recommendation based on anticipated incentive problems; no empirical governance evaluations.
Standard GDP statistics can mask AI-driven demand shortfalls; central banks and statistical agencies should therefore monitor labor-share–velocity links, distributional income measures, and consumption by income quantile in addition to headline GDP.
Theoretical Ghost GDP channel and calibration results showing divergence between measured GDP and consumption-relevant income; policy recommendation follows from those model results.
Health technology assessment (HTA) frameworks should be adapted to evaluate models trained on synthetic or hybrid data, incorporating metrics for fidelity, domain generalization, and economic impact (cost-effectiveness, budget impact, distributional effects).
Recommendation from the review synthesizing HTA literature and gaps identified when applying existing HTA to AI models trained on non-traditional data sources; based on policy analysis rather than empirical HTA trials of synthetic-data models.
Technical fixes alone are insufficient: governance, validation pipelines (e.g., health technology assessment), and capacity building are needed for safe, effective uptake of synthetic-data–trained AI.
Cross-disciplinary synthesis of governance analyses, health technology assessment literature, and implementation studies in the review arguing for combined technical and institutional interventions; recommendation-based evidence rather than new empirical trials.
AI changes the nature of capital (digital/algorithmic assets) and complicates productivity accounting; researchers should decompose firm-level productivity gains into AI technology, complementary organizational capital, and human capital effects.
Theoretical proposal grounded in productivity accounting literature and conceptual discussion; no single decomposition empirical result presented.
Time-series metrics (e.g., derivatives like d/dt(student enrollment)) are useful monitoring signals for validation and system oversight.
Methodological suggestion in the paper proposing time-series analysis of enrollment and other administrative data; no empirical demonstration or threshold criteria provided.
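A minimal sketch of such a derivative-based monitoring signal, using finite differences; the enrollment figures below are invented for illustration, since the paper provides no empirical demonstration.

```python
import numpy as np

# Hypothetical annual enrollment counts (illustrative, not from the paper).
years = np.array([2018, 2019, 2020, 2021, 2022])
enrollment = np.array([1000.0, 1050.0, 1150.0, 1100.0, 1080.0])

# Finite-difference estimate of d/dt(enrollment): central differences in
# the interior, one-sided differences at the endpoints.
d_enrollment = np.gradient(enrollment, years)

# A sustained negative derivative could serve as an oversight trigger;
# the paper proposes no threshold criteria, so none is encoded here.
declining = d_enrollment[-2:] < 0
```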
Rehabilitation was the most common research area with 336 publications (~18.35%), followed by Pediatrics (reported as 1,387 publications in the text).
Results: WoS research area counts as listed in the paper; note that the reported Pediatrics figure (1,387) is inconsistent with Rehabilitation (336) ranking first.
Most action tools support medium-stakes tasks like editing files.
Classification of action tools by task consequentiality using O*NET mapping and inspection of tool functions (paper states majority are medium-stakes, e.g., file editing).
Mobile penetration reaches 84% (in the context of low-income countries), a statistic used to motivate RSI's potential reach.
Single numeric statistic reported in the paper as background context; source or empirical basis for the statistic not provided within the supplied text.
The authors assess system performance on JobSearch-XS across retrieval tasks.
Paper states that system performance is assessed on JobSearch-XS across retrieval tasks. The excerpt does not provide the tasks, metrics, sample sizes, or numerical results.
Output quality saturates at approximately seven governed memories per entity.
Empirical analysis reported in the controlled experiments showing output quality vs. number of governed memories per entity, with saturation near seven memories.
Endogeneity was addressed by using an instrumental-variables approach to obtain causal estimates of the impact of technological diffusion on market opportunities.
Paper reports use of an instrumental variables approach to address endogeneity (instruments and diagnostics not described in the excerpt).
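The two-stage least squares (2SLS) logic of such an instrumental-variables design can be sketched as follows. The instrument, data-generating process, and coefficients are illustrative assumptions, since the excerpt describes neither the paper's instruments nor its diagnostics.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated data (illustrative only): z is an instrument, u an unobserved
# confounder, x the endogenous diffusion measure, y the outcome.
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + 0.5 * u + rng.normal(size=n)
y = 1.5 * x + 0.7 * u + rng.normal(size=n)   # assumed causal effect: 1.5

def two_stage_least_squares(y, x, z):
    """2SLS: first stage projects x on z, second stage regresses y on x-hat."""
    Z = np.column_stack([np.ones_like(z), z])
    gamma, *_ = np.linalg.lstsq(Z, x, rcond=None)      # first stage
    X_hat = np.column_stack([np.ones_like(x), Z @ gamma])
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)   # second stage
    return beta[1]

ols_slope = np.polyfit(x, y, 1)[0]           # biased upward by the confounder u
iv_slope = two_stage_least_squares(y, x, z)  # consistent for the causal effect
```

The gap between `ols_slope` and `iv_slope` illustrates the endogeneity bias the approach is meant to remove.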
CAFTA spillovers stabilized import volumes from third countries (reduced volatility) for Chinese agricultural imports.
Analysis of import volume volatility metrics over 2000–2014 using customs data within DID framework; volatility/variance decline identified as an outcome in the mechanisms/secondary channel tests.
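A stylized version of this volatility outcome in a 2x2 difference-in-differences comparison might look like the sketch below; the series are simulated with assumed volatilities, not the 2000-2014 customs data.

```python
import numpy as np

rng = np.random.default_rng(2)

def volatility(series):
    """Volatility as the standard deviation of log growth rates."""
    return np.std(np.diff(np.log(series)))

# Simulated monthly import-volume series (illustrative only). Treated
# (CAFTA-exposed) imports are assumed to become calmer after the agreement.
treated_pre  = 100 * np.exp(np.cumsum(rng.normal(0, 0.10, 60)))
treated_post = 100 * np.exp(np.cumsum(rng.normal(0, 0.03, 60)))
control_pre  = 100 * np.exp(np.cumsum(rng.normal(0, 0.10, 60)))
control_post = 100 * np.exp(np.cumsum(rng.normal(0, 0.10, 60)))

# DID on volatility: change for treated minus change for controls.
# A negative estimate indicates stabilization of treated imports.
did = (volatility(treated_post) - volatility(treated_pre)) - \
      (volatility(control_post) - volatility(control_pre))
```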
A Sankey diagram of thematic evolution shows lexical convergence over time and indicates that a small set of authors has disproportionate influence in structuring the discourse.
Thematic evolution analysis visualized with a Sankey diagram; author influence inferred from performance trends (citations/publication counts) in the bibliometric data.
CID does not significantly mediate the relationship between SCD and strategic green innovation.
Mediation tests showing that while CID is related to substantive innovation, the indirect effect via CID on strategic green innovation was statistically insignificant.
This paper is one of the first systematic reviews focused specifically on NLP in bank marketing, organizing findings along the customer journey and the marketing mix to provide a practical taxonomy.
Authors' stated novelty claim based on the scoped literature search (2014–2024) and topical focus; novelty inferred from the small number of prior papers identified at the intersection.
There is a need to develop new trade statistics that capture AI‑enabled services and platform‑mediated cross‑border transactions.
Methodological gap identified across reviewed literature and statistical analyses; recommendation based on descriptive assessment (no development of such statistics in the paper).
Productivity gains from AI may be under- or mis-measured if national accounts and tax systems do not adjust for AI-driven quality changes in services.
Analytic observation in the paper's measurement and externalities discussion; not empirically tested within the study.
The paper documents production failure vignettes and operational lessons drawn from a real enterprise deployment integrated with a major cloud provider's MCP servers (client redacted).
Paper states empirical context is field lessons from an enterprise agent platform; failure vignettes are enumerated as deliverables.
ToM alignment matters less (i.e., misalignment has smaller effect) in settings with explicit coordination protocols, strong signaling, or standardized conventions.
Analyses and experiments described in the paper showing smaller performance differences between matched and mismatched ToM orders when explicit conventions or reliable signals are available; reported as part of robustness/conditional analyses.
Manipulating costs and benefits of observation versus action in experiments can probe the switching behavior driven by System M.
Proposed experimental manipulation; no empirical data presented.
Ablation studies disabling System M or decoupling Systems A and B will help test whether meta-control provides empirical benefits.
Suggested experimental design (ablation study) in the methods section; no results provided.
Expert (per-expert) sizes and overall design are positioned between the GPT-OSS and Qwen3 MoE designs.
Architectural comparison asserted in the paper; claim is based on relative model-design choices (expert count/size) compared to public descriptions of GPT-OSS and Qwen3. The summary provides the positioning but not detailed layer-by-layer comparisons.
An orchestrator coordinates components with intent-aware routing and layered safety checks, enabling multi-step workflows and productized services.
Paper describes an agentic tool-calling framework and multi-layer orchestrator used for intent-aware routing, defense-in-depth safety validation, and multi-step workflows.
Aura is a long-form ASR system capable of handling hours-long audio.
Paper lists Aura in the product stack as 'long-form ASR handling hours-long audio.' Specific evaluation metrics or training data for ASR are not provided in the summary.
Arabic content comprises only about 0.5% of web data despite roughly 400 million native speakers.
Paper cites this data-point to motivate intentional data strategies for Arabic underrepresentation on the web; exact source of the web-proportion not specified in the summary.
Methods among the surveyed systems span token-level code generation to circuit-structure generation, and evaluation metrics are often task- and artifact-specific.
Surveyed system descriptions show diversity in generative approaches (token-level language models, graph/diffusion-based circuit generators, agentic optimizers) and corresponding tailored metrics noted in the review.
Measuring AI's contribution to productivity and coordination effects will be challenging; new metrics (e.g., coordination time per task, error/rework rates attributable to communication lapses) are required.
Conceptual argument and recommended measurement agenda in the paper; no empirical testing of proposed metrics provided.
Many early-stage AI advances have not translated into higher Phase II/III success rates.
Synthesis of reported outcomes and failures from industry experience; no new systematic statistical analysis provided.
After roughly a decade of adoption in large biopharma, AI has not yet changed late-stage (Phase II/III) clinical success rates.
Qualitative assessment of industrywide experience and reported outcomes; statement based on narrative review rather than systematic, long-run quantitative analysis or causal estimates.
Three primary adoption archetypes in large pharma are (1) partnership-driven acceleration, (2) culture-centric transformation, and (3) production-first democratization.
Conceptual classification in the editorial derived from trends and illustrative examples rather than empirical survey or sampling; no quantitative validation provided.
This paper systematically studies the impact mechanism of artificial intelligence on the globalized division of labor and, through literature review and theoretical analysis, reveals a structural transformation driven jointly by technology substitution and data as a production factor (a "dual-wheel drive").
Methodological claim: supported by the paper's literature review and theoretical analysis; no quantitative sample or empirical design indicated for this specific conclusion in the excerpt.
Existing research largely focuses on general computer literacy and lacks precise measurement of the economic returns to specific vocational digital skills.
Paper's literature review and motivating statements (qualitative assessment of prior studies; no quantitative meta-analysis reported in the excerpt).
AI adoption is not associated with significant changes in operating costs.
Analysis of operating costs in firm financials showing no significant post-adoption change for adopters relative to nonadopters.
The innovation effects of AI adoption are not concentrated among larger firms, financially unconstrained firms, or high-tech firms.
Heterogeneity tests across firm size, financial constraint status, and industry technology intensity showing no concentration of effects in these groups (as reported in the paper).
A complexity-aware routing mechanism selectively activates planning for complex queries, improving resource allocation during online serving.
Method description in the paper explaining adaptive online serving and complexity-aware routing; evaluated in serving experiments.
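The general shape of complexity-aware routing can be sketched as below. The scoring heuristic and threshold are assumptions for illustration; the paper's actual complexity scorer is not described in the excerpt.

```python
# Hypothetical sketch of complexity-aware routing; the word-count heuristic
# and threshold are assumptions, not the paper's method.
def route_query(query: str, threshold: int = 12) -> str:
    """Send long or multi-part queries to the planner; keep the rest on the fast path."""
    complexity = len(query.split()) + 5 * query.count("?")
    return "planner" if complexity > threshold else "direct"
```

In a real system the scorer would typically be a learned classifier rather than a lexical heuristic, but the routing decision has the same two-way shape.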
There is a gap in the existing literature regarding empirical evidence about the relationship between AI/Big Data use and market uncertainty during economic downturns.
Paper motivates the study by citing this gap based on its literature review (the summary does not list the reviewed works or systematic review method).
AI has not yet significantly promoted university–industry collaborative R&D capabilities.
Mechanism analysis in the paper testing the university–industry collaborative R&D channel and reporting no statistically significant effect of AI adoption on that capability in the sample.
The studied construction supply chain network exhibits moderate density, reported as 0.591.
Network-level metric (density = 0.591) reported in the results; derived from the constructed network based on coded interview interactions (network size and sampling details not provided in abstract).
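For reference, undirected network density is the fraction of possible ties that are present, density = 2E / (N(N-1)). A small sketch, with hypothetical node and edge counts, since the abstract does not report the network's size:

```python
def network_density(n_nodes: int, n_edges: int) -> float:
    """Density of a simple undirected graph: observed edges over possible edges."""
    possible_edges = n_nodes * (n_nodes - 1) / 2
    return n_edges / possible_edges

# Hypothetical example: 20 actors with 112 coded ties gives density ~0.589,
# in the neighborhood of the 0.591 the study reports.
density = round(network_density(20, 112), 3)
```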
Purposive and snowball sampling produced semi-structured interview data that span all major construction supply chain roles.
Sampling approach stated in the paper: purposive and snowball sampling for interviews; claim that interviews 'span all major supply chain roles' (number of interviews and role breakdown not reported in the abstract).
LLMs can be understood as condensates of human symbolic behavior—compressed, generative representations that render patterns of collective discourse computationally accessible.
Theoretical framing and conceptual argument provided by the authors; presented as an interpretive model rather than an empirically tested assertion in the excerpt.
This study empirically tests a theoretically acknowledged but rarely tested relationship (AI adoption → performance conditional on structural constraints) in an emerging-economy setting.
Literature gap claim supported by the authors' review and execution of an empirical test using survey data from 280 Tunisian SMEs and PLS-SEM.