Evidence (5267 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	378	106	59	455	1007
Governance & Regulation	379	176	116	58	739
Research Productivity	240	96	34	294	668
Organizational Efficiency	370	82	63	35	553
Technology Adoption Rate	296	118	66	29	513
Firm Productivity	277	34	68	10	394
AI Safety & Ethics	117	177	44	24	364
Output Quality	244	61	23	26	354
Market Structure	107	123	85	14	334
Decision Quality	168	74	37	19	301
Fiscal & Macroeconomic	75	52	32	21	187
Employment Level	70	32	74	8	186
Skill Acquisition	89	32	39	9	169
Firm Revenue	96	34	22	—	152
Innovation Output	106	12	21	11	151
Consumer Welfare	70	30	37	7	144
Regulatory Compliance	52	61	13	3	129
Inequality Measures	24	68	31	4	127
Task Allocation	75	11	29	6	121
Training Effectiveness	55	12	12	16	96
Error Rate	42	48	6	—	96
Worker Satisfaction	45	32	11	6	94
Task Completion Time	78	5	4	2	89
Wages & Compensation	46	13	19	5	83
Team Performance	44	9	15	7	76
Hiring & Recruitment	39	4	6	3	52
Automation Exposure	18	17	9	5	50
Job Displacement	5	31	12	—	48
Social Protection	21	10	6	2	39
Developer Productivity	29	3	3	1	36
Worker Turnover	10	12	—	3	25
Skill Obsolescence	3	19	2	—	24
Creative Output	15	5	3	1	24
Labor Share of Income	10	4	9	—	23

Adoption

Technological innovation was assessed via adoption of new systems, integration of digital channels, and use of Artificial Intelligence and data analytics.

Measurement description provided in the paper listing the components used to operationalize technological innovation.

high null result Technology Innovation Strategy and the Competitiveness of Ke... measurement/operationalization of technological innovation

Competitiveness in the study was measured through market share, return on equity and customer satisfaction.

Measurement description provided in the paper describing dependent variable operationalization (explicit list of three indicators).

high null result Technology Innovation Strategy and the Competitiveness of Ke... measurement/operationalization of competitiveness

The research method is normative legal research using statutory, conceptual, and comparative approaches, supported by an analysis of literature from SINTA-indexed national journals and reputable international journals.

Method statement given explicitly in the paper's abstract/methodology.

high null result Reformasi Hukum Ketenagakerjaan di Era Artificial Intelligen... research methodology (normative legal research and literature review)

The study assesses the adequacy of the legal protection available to workers affected by termination of employment (PHK) resulting from AI adoption.

Statement of the research objective and analytical approach (normative, comparative), supported by a literature review of selected journals.

high null result Reformasi Hukum Ketenagakerjaan di Era Artificial Intelligen... adequacy of legal protection for AI-affected workers

This study aims to analyze how the Job Creation Law (Undang-Undang Cipta Kerja) and its implementing regulations classify and justify termination of employment (Pemutusan Hubungan Kerja, PHK) resulting from AI adoption.

Statement of the research objective given in the methodology/introduction section; statutory approach within normative legal research.

high null result Reformasi Hukum Ketenagakerjaan di Era Artificial Intelligen... classification and justification of termination of employment (PHK) under the Job Creation Law (UU Cipta Kerja)

The user study had N=50 participants.

Reported user study sample size (N=50) used to evaluate AI-assisted intent expansion in ecologically valid settings.

high null result Structured Intent as a Protocol-Like Communication Layer: Cr... user study sample size

Under the current evaluation resolution, 5W3H, CO-STAR, and RISEN achieve similarly high goal-alignment scores, suggesting that dimensional decomposition itself is an important active ingredient.

Controlled comparison between three structured frameworks (5W3H, CO-STAR, RISEN) across the evaluated outputs, with no meaningful differences reported between them.

high null result Structured Intent as a Protocol-Like Communication Layer: Cr... goal-alignment scores

The study evaluated 3,240 model outputs (3 languages x 6 conditions x 3 models x 3 domains x 20 tasks) using an independent judge (DeepSeek-V3).

Reported experimental design and evaluation: 3 languages, 6 conditions, 3 models, 3 domains, 20 tasks; judged by DeepSeek-V3.

high null result Structured Intent as a Protocol-Like Communication Layer: Cr... number of model outputs evaluated / evaluation procedure
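
The output count in the entry above follows from a full factorial design. A minimal sketch of that arithmetic, assuming every combination of factor levels yields exactly one output; the level labels below are hypothetical placeholders, not names from the paper:

```python
from itertools import product

# Factor sizes are taken from the claim; the labels are placeholders (assumption).
languages = [f"lang_{i}" for i in range(3)]
conditions = [f"cond_{i}" for i in range(6)]
models = [f"model_{i}" for i in range(3)]
domains = [f"domain_{i}" for i in range(3)]
tasks = [f"task_{i}" for i in range(20)]

# Enumerate every cell of the fully crossed design, one output per cell.
cells = list(product(languages, conditions, models, domains, tasks))
n_outputs = len(cells)
print(n_outputs)
```

The product 3 x 6 x 3 x 3 x 20 matches the 3,240 outputs reported in the claim.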

Data construction: The authors treat Wikipedia technology pages as distinct technologies and trace them across patents and job postings from 1976 to 2007, using technical bigrams to identify technologies in texts.

Description of dataset construction building on Kalyani et al. (2025) in Section 2; methodological description of linking Wikipedia pages, patent text, and job postings.

high null result THE SKILL PREMIUM IN TIMES OF RAPID TECHNOLOGICAL CHANGE coverage and method of technology identification in data

Proposition 1: With a constant pace of technology creation (m(b)=m), the model admits a unique balanced growth path (BGP) along which real wages and output grow at rate g and the skill premium remains constant and independent of m.

Analytical result (proposition) proved in the paper's model appendix under model assumptions.

high null result THE SKILL PREMIUM IN TIMES OF RAPID TECHNOLOGICAL CHANGE skill premium dependence on pace parameter m along BGP

The modal technology in the top 1% densest locations (e.g., New York, San Francisco) is 34 years old, while the modal technology in the bottom 50% lowest-density locations is 48 years old, indicating sizable diffusion gaps.

Empirical measurement from the text-based technology dataset tracking vintage of technologies across locations; reported modal ages by location density percentile.

high null result THE SKILL PREMIUM IN TIMES OF RAPID TECHNOLOGICAL CHANGE modal technology age by location density

Limitations: the Comscore data observe household internet activity on home (non-mobile) devices and do not capture offline or mobile device activities, so extrapolation to total at-home activities should be done with caution.

Authors' explicit limitation discussion in paper stating data do not include mobile devices or offline activities.

high null result https://arxiv.org/pdf/2603.03144 data coverage (mobile/offline activities not observed)

ChatGPT adoption leaves the total time spent on productive online activities (including any time spent using ChatGPT) unchanged.

Same IV long-difference estimates as above; authors state 'leaving time spent on productive digital tasks unchanged' and that total productive activity time does not decline significantly.

high null result https://arxiv.org/pdf/2603.03144 total time spent on productive online activities

The analysis uses detailed Internet browsing microdata from over 200,000 U.S. households' home devices from 2021 to 2024.

Comscore web browsing panel described in paper; authors state dataset covers 'over 200,000 U.S. households' across 2021-2024; data provides timestamps, visit durations, URLs, demographic bins, etc.

high null result https://arxiv.org/pdf/2603.03144 size and coverage of browsing panel

The present review examined the intersection of artificial intelligence, sustainable finance, ESG performance, FinTech, climate risk analytics, algorithmic governance, and responsible investing.

Statement of the paper's scope and aims (description of the review content and topics covered).

high null result Artificial intelligence in sustainable finance and Environme... topics covered by the review

The literature on AI-based ESG scoring, green finance, and data-driven sustainability reporting is disjointed across finance, management, and technology fields and requires application of the PRISMA framework to provide transparency and methodological rigor in systematic reviews.

Paper's methodological assessment and recommendation based on the authors' systematic review process and literature mapping (statement about the state of the literature and methodological needs). No numeric evidence provided in the excerpt.

high null result Artificial intelligence in sustainable finance and Environme... transparency and methodological rigor of literature reviews in the field

The analysis draws on data from 170 countries for 2020–2024 for the Government AI Readiness Index (GAIRI)–EGDI comparison.

Data description in abstract explicitly reporting the GAIRI–EGDI sample coverage as 170 countries for 2020–2024.

high null result E-government development: Artificial intelligence vibrancy a... E-Government Development Index (EGDI)

The analysis draws on data from 36 countries for 2018–2022 for the AI Vibrancy Score (AIVS)–EGDI comparison.

Data description in abstract explicitly reporting the AIVS–EGDI sample coverage as 36 countries for 2018–2022.

high null result E-government development: Artificial intelligence vibrancy a... E-Government Development Index (EGDI)

Methods combine targeted literature synthesis, comparative conceptual analysis, and framework building (with recent scholarly and institutional sources reviewed).

Explicit methodological statement in the paper describing the review and analytic approach; no primary-data methods used.

high null result Behavioral Factors as Determinants of Successful Scaling of ... methodological approach (literature synthesis and conceptual framework developme...

AI coding assistants are a high-visibility class of corporate AI and are given special attention as an illustrative case in the paper.

Paper specifically calls out AI coding assistants as a focal example in the conceptual analysis and discussion; based on literature review rather than original measurement.

high null result Behavioral Factors as Determinants of Successful Scaling of ... role of coding assistants as illustrative case for scaling and behavioral dynami...

AI’s societal integration in India is gradual, and therefore its impact on economic variables (like wages and inequality) is also gradual.

Synthesis in the paper based on empirical adoption figures (e.g., <0.7% adoption for AI ride services) and the observed weak changes in inequality measures in the transportation sector.

high null result Artificial Intelligence, Demand Switching and Sectoral Wage ... pace of AI integration and consequent economic impact

Despite AI’s introduction, wage inequality in the transportation sector (measured by the Gini coefficient) has not significantly worsened.

Empirical investigation reported in the paper analyzing transportation-sector wage disparities over time using the Gini coefficient; the paper reports no significant worsening post-introduction.

high null result Artificial Intelligence, Demand Switching and Sectoral Wage ... Gini coefficient of wages in the transportation sector
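
For readers unfamiliar with the inequality measure named in the entry above, here is a minimal sketch of one standard way to compute a Gini coefficient from a list of wages. The paper's own estimator and data are not given in the excerpt, so the function name `gini` and the sample inputs are illustrative assumptions only:

```python
def gini(wages):
    """Gini coefficient of non-negative wages via the sorted-rank formula.

    Illustrative sketch; not the estimator used in the paper (assumption).
    """
    xs = sorted(wages)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one wage and a positive wage total")
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


equal = gini([1, 1, 1, 1])         # everyone earns the same wage
concentrated = gini([0, 0, 0, 1])  # one worker earns everything
```

With this formula, perfectly equal wages give 0, and a fully concentrated distribution over n workers gives (n - 1) / n.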

The Article translates these insights into risk-sensitive guideposts for modernizing governance of AI-enabled tools and emerging modalities, from agentic systems to blockchain-deployed smart contracts.

Prescriptive/conceptual policy guidance presented in the Article (normative recommendations; governance framework).

high null result Rewired: Reconceptualizing Legal Services for the AI Age provision of governance guideposts for AI-enabled legal technologies

The Innovation Frontier traces LegalTech’s evolution from 2000s-vintage e-discovery to generative AI.

Historical/chronological analysis in the Article (literature review/history of LegalTech provided by authors).

high null result Rewired: Reconceptualizing Legal Services for the AI Age narrative/historical scope of LegalTech evolution covered in the Article

The Legal Services Value Chain disaggregates the lifecycle of a legal matter into five distinct nodes of activity.

Model description in the Article (conceptual architecture; decomposition of legal work).

high null result Rewired: Reconceptualizing Legal Services for the AI Age number and structure of nodes in the proposed value-chain model

The Article develops two core organizing models: the Legal Services Value Chain and the Innovation Frontier.

Explicit claim in the Article describing conceptual/model contributions (theoretical/model-building).

high null result Rewired: Reconceptualizing Legal Services for the AI Age presence of two organizing conceptual models in the Article

This Article provides a practical framework for navigating the shifting terrain of legal innovation and AI.

Statement of purpose in the Article (conceptual contribution; framework development). No empirical validation reported in the excerpt.

high null result Rewired: Reconceptualizing Legal Services for the AI Age existence of a practical framework for legal-AI governance and strategy

There are action tools for higher-stakes tasks like financial transactions.

Observed examples of action tools in the monitored MCP repositories that perform higher-stakes functions, with financial transactions given as an explicit example in the paper.

high null result How are AI agents used? Evidence from 177,000 MCP tools presence of action tools enabling high-stakes tasks (e.g., financial transaction...

We use O*NET mapping to identify each tool's task domain and consequentiality.

Method described in paper: mapping each tool to O*NET task domains and consequentiality using the monitored tool metadata and descriptions.

high null result How are AI agents used? Evidence from 177,000 MCP tools method for assigning task domain and consequentiality

We categorise tools according to their direct impact: perception tools to access and read data, reasoning tools to analyse data or concepts, and action tools to directly modify external environments.

Methodological classification described in paper (taxonomy of tools into perception, reasoning, action); applied to monitored MCP server dataset.

high null result How are AI agents used? Evidence from 177,000 MCP tools tool category / taxonomy

AI transparency alone did not significantly increase data-sharing.

Result reported from the randomized experiment (N=240) comparing actual data-sharing rates across human, white-box AI, and black-box AI conditions; authors state that transparency alone did not produce a significant increase in sharing.

high null result Understanding Data-Sharing with AI Systems: The Roles of Tra... actual data-sharing (behavioral sharing decisions)

These energy reductions are achieved without statistically significant performance loss.

Paper states that performance loss is not statistically significant across the evaluated benchmarks (as reported in the abstract).

high null result EcoThink: A Green Adaptive Inference Framework for Sustainab... model performance / benchmark accuracy (no statistically significant degradation...

The research surveys current methodologies and empirical evidence related to regulatory early-warning systems and synthesizes the empirical findings.

Paper states it examines existing methodologies and empirical findings (literature review / synthesis); no scope (e.g., number of studies reviewed) given in the excerpt.

high null result Research on the Construction of an AI-Driven Financial Regul... state of evidence on methodologies for regulatory early-warning of fiscal risk

The study uses a mixed-methods approach combining qualitative insights from 1,500 semi-structured customer interviews with quantitative analysis of transaction records, loan repayment histories, and account activity.

Paper states methods explicitly in abstract: 1,500 semi-structured interviews plus quantitative analysis of transaction records, loan repayment histories, and account activity (case-study approach across three platforms).

high null result Artificial Intelligence, Climate Resilience, and Financial I... research_methodology

Three interlocking threads characterize AI for science: (1) AI as research instrument, (2) AI for research infrastructure, and (3) the reshaping of scholarly profiles and incentives by machine-readable metrics.

Conceptual framework presented in the paper; organization of topics rather than empirical measurement. The paper indicates these threads are followed through historical and contemporary examples.

high null result A Brief History of AI for Scientific Discovery: Open Researc... conceptual decomposition of AI-for-science developments

The history of artificial intelligence for scientific discovery is not a two year story about chatbots learning to write papers; it is a sixty year story beginning with DENDRAL (1965).

Historical narrative / literature review citing early systems such as DENDRAL (1965) and subsequent developments in scholarly infrastructure (arXiv, Google Scholar, ORCID). No empirical sample or statistical test reported.

high null result A Brief History of AI for Scientific Discovery: Open Researc... historical scope and timeline of AI for scientific discovery

At the macroeconomic level, Kazakhstan's state programs (e.g., 'Digital Kazakhstan' and the Industrial and Innovation Development Program) and international indices (WIPO Global Innovation Index, OECD digital assessments, IMF data) are used to evaluate and position Kazakhstan within the global digital economy.

Macro-level analysis using national programs and international indices described in the article to assess Kazakhstan's digital economy standing.

high null result Digitalization and labor costs: efficiency of industrial ent... Kazakhstan's position in global digital economy (evaluative metric)

This paper uses panel data of China's Shanghai and Shenzhen A-share non-financial listed companies from 2010 to 2022 to study AI's effects.

Explicit data description in the paper (sample frame and period stated).

high null result THE IMPACT OF ARTIFICIAL INTELLIGENCE ON ENTERPRISE INCOME D... n/a (methodological/data claim)

Deep Reinforcement Learning (DRL) has shown strong microscopic performance in car-following conditions, but its macroscopic traffic flow characteristics remain underexplored.

Literature synthesis / motivation in the paper (review of existing DRL work focused on microscopic performance). No empirical sample size.

high null result Macroscopic Characteristics of Mixed Traffic Flow with Deep ... extent of prior research on macroscopic traffic flow characteristics for DRL mod...

The paper is intentionally public-safe: it omits proprietary implementation details, training recipes, thresholds, hidden-state instrumentation, deployment procedures, and confidential system design choices, and therefore the contribution is theoretical rather than operational.

Statement about the paper's scope and publication choices; directly asserted by the authors regarding omitted content and the theoretical nature of the contribution.

high null result A Public Theory of Distillation Resistance via Constraint-Co... scope_and_nature_of_contribution (theoretical vs operational)

The paper introduces a constraint-coupled reasoning framework with four elements: bounded transition burden, path-load accumulation, dynamically evolving feasible regions, and a capability-stability coupling condition.

Descriptive/theoretical: the paper explicitly defines and enumerates these four framework elements. This is a claim about the paper's content rather than an empirical finding.

high null result A Public Theory of Distillation Resistance via Constraint-Co... presence_and_definition_of_framework_components

The analysis uses data on 31 million users of Ctrip, China's largest online travel platform, to study "Wendao," an LLM-based AI assistant integrated into the platform.

Descriptive statement in the paper about data source: platform logs/usage data for Ctrip covering 31 million users and the Wendao assistant.

high null result Shopping with a Platform AI Assistant: Who Adopts, When in t... other

The top three platforms (Claude, ChatGPT, and DeepSeek) receive statistically indistinguishable satisfaction ratings despite vast differences in funding, team size, and benchmark performance.

Statistical comparison of self-reported satisfaction ratings collected via the paper's survey (overall N=388); statistical tests reported in paper (specific test and per-platform n not provided in abstract).

high null result Beyond Benchmarks: How Users Evaluate AI Chat Assistants user satisfaction ratings

We ran a behavioral experiment (N = 200) in which participants predicted the AI's correctness across four AI calibration conditions: standard, overconfidence, underconfidence, and a counterintuitive "reverse confidence" mapping.

Reported experimental design and sample size in the paper (behavioral experiment with N = 200; four experimental conditions).

high null result Learning to Trust: How Humans Mentally Recalibrate AI Confid... experimental conditions / task setup (participants predicting AI correctness)

Study methodology: Two online experiments were conducted via the crowdsourcing platform Prolific, with sample sizes of n = 325 (study 1) and n = 371 (study 2); participant mean age = 35 years; 55% female.

Methodological and sample description provided in the abstract.

high null result AI content labeling and user engagement on social media: The... study design and sample characteristics

Late disclosure of AI involvement did not improve affective engagement for AI-generated content.

Reported experimental result in the abstract from the two online studies manipulating disclosure timing (early vs. late).

high null result AI content labeling and user engagement on social media: The... affective engagement for AI-generated content under late disclosure

The study was conducted by the Mohammed bin Rashid School of Government’s Future of Government Center, in collaboration with global AI pioneers.

Authorship and collaboration statement in the report.

high null result Charting AI Governance Future in the Arab Region: A Policy R... institutional authorship and collaboration on the study

The report highlights the key findings of a field study covering ten Arab countries to explore the realities and challenges of AI governance.

Report statement describing the geographic scope of the field study (explicitly: ten Arab countries).

high null result Charting AI Governance Future in the Arab Region: A Policy R... geographic coverage of the field study (number of countries)

The recommendations are based on regional research that included hundreds of leaders active in the AI domains, from the public and private sectors.

Report statement claiming participant base of the underlying research (described as 'hundreds of leaders').

high null result Charting AI Governance Future in the Arab Region: A Policy R... scope and participant coverage of the underlying research

Zero-shot baselines and standard retrieval stagnate around 50-60% accuracy across model generations on the graduate-level final exam.

Pilot study reported on a full graduate-level final exam comparing zero-shot and standard retrieval baselines across model generations; reported accuracy range given as ~50-60%. Exact number of exam questions or models compared not stated.

high null result From 50% to Mastery in 3 Days: A Low-Resource SOP for Locali... exam accuracy (percentage correct)
