Evidence (8625 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	761	200	101	904	2020
Governance & Regulation	829	400	191	122	1566
Organizational Efficiency	784	193	125	84	1197
Technology Adoption Rate	637	236	124	97	1103
Research Productivity	431	131	58	340	972
Output Quality	481	183	59	47	770
Decision Quality	332	177	82	49	647
Firm Productivity	439	57	88	20	610
AI Safety & Ethics	218	279	66	33	602
Market Structure	181	170	123	24	503
Task Allocation	214	64	72	33	388
Skill Acquisition	174	62	62	17	315
Innovation Output	204	27	45	18	295
Employment Level	105	54	108	13	282
Fiscal & Macroeconomic	132	69	43	26	277
Consumer Welfare	117	63	42	11	233
Firm Revenue	154	48	26	3	231
Task Completion Time	173	31	8	12	225
Inequality Measures	44	123	50	6	223
Worker Satisfaction	89	65	22	12	188
Error Rate	71	92	10	2	175
Regulatory Compliance	77	69	14	5	165
Automation Exposure	58	56	26	13	156
Training Effectiveness	96	21	14	19	152
Wages & Compensation	77	37	25	6	145
Team Performance	86	17	27	10	141
Developer Productivity	95	17	14	6	133
Job Displacement	12	81	21	1	115
Hiring & Recruitment	52	7	8	3	70
Creative Output	32	20	8	3	64
Skill Obsolescence	5	47	6	1	59
Social Protection	28	16	8	2	54
Labor Share of Income	17	19	17	—	53
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1


The study uses 5 million job postings from Beijing covering 2018–2024 as its primary data source.

Stated dataset scope and size in the paper's description of data.

high null result Generative AI impacts on intra-urban inequality and skill pr... dataset size and temporal coverage

We construct a neighborhood-level GenAI Exposure Index by aggregating task-level assessments from five leading large language models.

Methodological construction described in the paper: task-level GenAI suitability assessments from five LLMs applied to tasks in 5 million Beijing job postings (2018–2024), aggregated to the neighborhood level.

high null result Generative AI impacts on intra-urban inequality and skill pr... GenAI Exposure Index (measurement / adoption proxy)

Decision-makers (DMs) are similarly ambiguity-seeking and show similar ambiguity-generated insensitivity (a-insensitivity) regardless of whether the analyst is a human or a machine learning (ML) model.

Incentivized laboratory experiment in which participants' ambiguity attitudes were measured for forecasts attributed to human and ML analysts; comparison of ambiguity-seeking and a-insensitivity across analyst type reported in the paper (sample size not reported in abstract).

high null result Trusting human versus machine predictions as a decision unde... ambiguity attitude (ambiguity-seeking and a-insensitivity)

The same bias was not observed when participants imagined receiving help from another human participant.

Empirical comparison reported in the abstract: predictions about receiving help from another human did not show the same faster-than-reality bias as predictions about AI assistance (from the same preregistered study, N = 1237).

high null result Cognitive offloading and the speedup illusion in human-AI in... predicted completion time when imagining help from another human

Actual completion times between independent completion and AI-assisted completion did not differ.

Empirical result reported in the abstract comparing measured completion times for independent vs. AI-assisted task completion in the preregistered study (N = 1237).

high null result Cognitive offloading and the speedup illusion in human-AI in... actual completion time

We conducted a preregistered large-scale behavioral study (N = 1237) to characterize mismatches between expectations and reality, with a focus on simple cognitive tasks.

Authors report study design and sample size in the abstract: preregistered behavioral experiment with N = 1237 participants.

high null result Cognitive offloading and the speedup illusion in human-AI in... study design / sample size (methodological claim)

Identification strategy exploits import lumpiness in product categories linked to automation technologies (including robots) to disentangle adoption effects from selection into adoption.

Methodological claim: use of import 'lumpiness' in automation-related product categories as a plausibly exogenous source of adoption variation within a difference-in-differences framework.

high null result Firm size and the automation wage premium identification strategy (exogeneity of adoption variation)

We integrate datasets on trade activities, firm, and worker characteristics for the population of Italian importing firms from 2011 to 2019.

Data integration described in abstract; population-level administrative datasets on trade, firm, and worker characteristics for Italian importing firms covering years 2011–2019.

high null result Firm size and the automation wage premium coverage of datasets (population of Italian importing firms 2011–2019)

The study examines the impact of AI technologies on Uzbekistan's labor market transformation in the context of implementing the national strategy 'Digital Uzbekistan - 2030' and the Strategy for the Development of AI Technologies until 2030.

Framing and scope statement in the paper; analysis based on national strategy documents, statistical data, industry reviews, and regulatory legal documents.

high null result The Impact of Artificial Intelligence During the Transformat... impact of AI in the context of national digital/AI strategies

The system was evaluated on a real 64-GPU A100 testbed emulating three wind-powered sites with Azure production traces.

Experimental evaluation described in abstract: 64-GPU A100 testbed, emulation of three sites, use of Azure production traces.

high null result XWind: A Cross-site Router for Large Language Model Inferenc... experimental evaluation setup

The paper includes comparisons against accelerated baselines (reported experimental comparisons).

Statement in experimental section that comparisons to accelerated baselines were performed; specific baselines and results are in the paper.

high null result CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolv... comparative performance vs. baselines

The paper examines the legal implications of overusing export controls.

Statement of the paper's analytic scope and structure (description of content).

high null result Strategic Stalemates: The Paradox of Export Controls in the ... legal implications of export control overuse

We sample 50 benchmark games from a 2,000-game generated pool and evaluate nine frontier and open-weight LLMs in a head-to-head tournament with over 36,000 matches.

Empirical setup reported in the paper's abstract: 50 sampled games, 2,000-game pool, nine LLMs, >36,000 head-to-head matches.

high null result GENSTRAT: Toward a Science of Strategic Reasoning in Large L... evaluation sample size / tournament scale (matches run)

We interviewed 24 product-focused individuals at a large technology firm about how AI has impacted their own work, their work within their product team, and their professional interactions.

Qualitative semi-structured interviews with 24 product-focused employees at a single large technology firm; sample size = 24.

high null result Beyond the Org Chart: AI and the Transformation of Invisible... description of sample and data collection

This scoping review adhered to the PRISMA-ScR guidelines and encompassed 29 peer-reviewed empirical studies published from 2020 to 2025.

Methods statement in the paper (explicit methodological description).

high null result The influence of AI-Driven Employee Performance Management (... scope and methodological adherence of the review (PRISMA-ScR; n=29 studies)

The paper identifies five major research gaps and proposes future research directions in intelligent international marketing.

Author-reported outcome of the paper's systematic review and content analysis (2010–2025); descriptive claim about the paper's contributions.

high null result Research on International Marketing in the Context of Intell... identification of research gaps and proposed directions

Prior productivity does not predict AI use.

Analysis linking prior productivity measures to reported AI adoption in the Census Bureau survey data; finding of no predictive relationship reported.

high null result The Adoption of Industrial AI in America predictive relationship between prior productivity and AI adoption

The analysis uses a mandatory, purpose-designed Census Bureau survey of approximately 28,500 establishments.

Census Bureau mandatory survey specifically designed for this study; sample size stated as approximately 28,500 establishments.

high null result The Adoption of Industrial AI in America survey_sample_size / data source

Identification of effects uses within-firm variation with firm and city-by-year fixed effects.

Identification strategy reported in abstract: within-firm variation under firm and city-by-year fixed effects.

high null result Toward Sustainable Workforce Development: How AI Reshapes Sk... identification approach / econometric controls

The study measures four skill-category demand shares and their within-category importance from job-description text.

Methodological statement in abstract: measurement of four skill-category demand shares and within-category importance via job-description text.

high null result Toward Sustainable Workforce Development: How AI Reshapes Sk... skill-category demand shares and within-category importance

AI exposure is decomposed into displacement and augmentation components based on task routineness.

Methodological claim in abstract: decomposition of exposure into displacement and augmentation using a routineness criterion for tasks.

high null result Toward Sustainable Workforce Development: How AI Reshapes Sk... decomposed AI exposure measures (displacement vs augmentation)

The authors construct firm-by-year potential AI exposure via semantic matching between AI patent texts and detailed occupation task descriptions.

Method description in abstract: semantic matching of AI patent texts to occupation task descriptions to build firm-by-year exposure.

high null result Toward Sustainable Workforce Development: How AI Reshapes Sk... firm-by-year potential AI exposure (constructed measure)

The study uses approximately 67 million online job postings from two major Chinese recruitment platforms (2019–2024).

Statement in paper abstract describing dataset size and source (job postings from two major Chinese recruitment platforms over 2019–2024).

high null result Toward Sustainable Workforce Development: How AI Reshapes Sk... dataset size and coverage (number of job postings, platforms, years)

The study extends the Technology Acceptance Model (TAM), Dynamic Capabilities Theory, and the Technology-Organisation-Environment (TOE) framework into the qualitative, emerging-economy entrepreneurial context.

Authors' stated theoretical contribution based on mapping thematic results to TAM, Dynamic Capabilities, and TOE frameworks within analysis and discussion sections.

high null result Navigating the Intelligence Frontier: AI Adoption as a Succe... theoretical contribution / framework extension

This study employed an interpretivist, qualitative research design using sixteen in-depth semi-structured interviews with entrepreneurs across fintech, edtech, health-tech, logistics, retail, and SaaS in Delhi/NCR, India, and used Braun & Clarke's (2006) six-phase thematic analysis framework.

Explicit methodological description in the paper: interpretivist qualitative design; n=16 in-depth semi-structured interviews across specified sectors in Delhi/NCR; thematic analysis following Braun & Clarke (2006).

high null result Navigating the Intelligence Frontier: AI Adoption as a Succe... research design / data collection (qualitative interviews)

The study uses a qualitative approach based on 17 expert interviews with employees at startups.

Methods statement in paper specifying qualitative study design and sample size of 17 interviews.

high null result From Prompt To Process: Qualitative Insights On How Genai Us... study methodology and sample

Process-related insights into how GenAI transforms startups are limited.

Authors' literature positioning / gap statement in paper (no empirical metric provided).

high null result From Prompt To Process: Qualitative Insights On How Genai Us... availability of process-related insights in literature

The paper's findings are based on three pre-registered user studies with a combined sample size of N = 2691.

Statement in the paper's abstract reporting three pre-registered user studies and combined N = 2691.

high null result The efficiency-gain illusion: People underestimate the rate ... study_sample_description

Agentic payments are distinct from traditional automated systems because they emphasise autonomy, contextual reasoning and adaptability.

Conceptual distinction asserted in the abstract (comparative analysis between agentic payments and traditional automated systems).

high null result AI Agents in Payments: Applications, Risks and Regulations system characteristics (autonomy, contextual reasoning, adaptability)

The paper examines operational logic, defining features and emerging use cases of agentic payments across retail, e-commerce and decentralised finance.

Stated scope in the abstract; analysis and case-study-driven review across specified sectors (retail, e-commerce, DeFi). No sample sizes reported.

high null result AI Agents in Payments: Applications, Risks and Regulations emerging use cases / sector-level application

Agentic payments refer to transactions initiated and completed by AI agents without direct human intervention.

Explicit definitional statement in the abstract (conceptual definition provided by the authors).

high null result AI Agents in Payments: Applications, Risks and Regulations definition/characterisation of a payment modality

All [the listed orchestration frameworks] follow the same pattern: an external orchestrator above the LLM, injecting instructions and routing decisions every turn.

Author assertion based on architectural analysis of the listed frameworks (observation of orchestration pattern in the named projects).

high null result Compiling Agentic Workflows into LLM Weights: Near-Frontier ... architectural pattern (external orchestrator behavior)

The paper draws on empirical studies from 2024–2026.

Methodological statement in the paper specifying the time window of empirical studies used in the analysis.

high null result The Algorithmic Mirror: Can Artificial Intelligence Truly Mi... temporal scope of literature reviewed

This inverse scaling does not appear on single-threshold metrics common in LLM forecasting benchmarks.

Comparative evaluation reported in the paper showing that single-threshold (binary) scoring metrics do not exhibit the inverse-scaling pattern observed with tail-inclusive distributional metrics (specific metrics and calculations not given in excerpt).

high null result Is Capability a Liability? More Capable Language Models Make... relationship between model capability and accuracy under single-threshold metric...

Domain knowledge does not reliably rescue calibration.

Experiments reported in the paper where domain-knowledge interventions (procedures or prompts incorporating domain knowledge) were applied and did not consistently improve forecast calibration (details not provided in excerpt).

high null result Is Capability a Liability? More Capable Language Models Make... forecast calibration after incorporating domain knowledge

Using large language models, we measure the AIO level of Chinese listed companies from 2010 to 2023.

Authors report constructing firm-level measures of artificial intelligence orientation (AIO) by applying large language models to corporate texts/disclosures for Chinese listed companies over the 2010–2023 period.

high null result Artificial intelligence orientation and decarbonization spil... artificial intelligence orientation (AIO) measurement

This study provides the first cross-class synthesis covering raw materials, work-in-process, and finished goods within a unified evaluative framework, positioning machine learning and deep reinforcement learning methods alongside classical policy families and quantifying the boundary conditions for each approach.

Author-stated theoretical contribution and scope of the review (coverage of raw materials, WIP, finished goods and methods).

high null result Equitable railway corridor investment under demand uncertain... breadth and novelty of synthesis across inventory classes and methods

A random-effects model estimated by restricted maximum likelihood was applied to pool percentage cost-reduction effect sizes across 18 studies admissible to quantitative synthesis.

Methods reported in the paper: random-effects meta-analysis using REML across 18 studies eligible for quantitative pooling.

high null result Equitable railway corridor investment under demand uncertain... pooled percentage cost-reduction effect sizes
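
The pooling step described in this entry can be illustrated with a minimal sketch. It assumes each study contributes an effect size and a known sampling variance, and it estimates the between-study variance with the standard REML fixed-point iteration. The function name `reml_random_effects` and the numbers in the usage example are hypothetical and are not taken from the paper.

```python
import math


def reml_random_effects(effects, variances, tol=1e-10, max_iter=1000):
    """Pool effect sizes under a random-effects model.

    The between-study variance tau^2 is estimated by iterating the REML
    fixed-point update and truncating at zero. Returns (pooled mean,
    standard error of the pooled mean, tau^2).
    """
    tau2 = 0.0
    for _ in range(max_iter):
        weights = [1.0 / (v + tau2) for v in variances]
        w_sum = sum(weights)
        mu = sum(w * y for w, y in zip(weights, effects)) / w_sum
        w2_sum = sum(w * w for w in weights)
        num = sum(
            w * w * ((y - mu) ** 2 - v)
            for w, y, v in zip(weights, effects, variances)
        )
        new_tau2 = max(0.0, num / w2_sum + 1.0 / w_sum)
        converged = abs(new_tau2 - tau2) < tol
        tau2 = new_tau2
        if converged:
            break

    # Recompute the pooled estimate with the final tau^2.
    weights = [1.0 / (v + tau2) for v in variances]
    w_sum = sum(weights)
    mu = sum(w * y for w, y in zip(weights, effects)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    return mu, se, tau2


# Hypothetical percentage cost reductions and their sampling variances.
pooled, se, tau2 = reml_random_effects([10.0, 20.0, 30.0, 40.0], [25.0] * 4)
```

When all sampling variances are equal, the REML estimate of tau^2 reduces to the sample variance of the effects minus the common sampling variance, which gives a quick sanity check for the iteration.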

A systematic review and meta-analytic synthesis of 31 peer-reviewed studies published between 2004 and 2025 was conducted following the PRISMA 2020 protocol.

Study methods reported in the paper: systematic review following PRISMA 2020; sample of 31 peer-reviewed studies dated 2004–2025.

high null result Equitable railway corridor investment under demand uncertain... number and coverage of studies included in the review

Across 660 trials with Claude Code, code cleanliness does not change the agent's pass rate.

Empirical evaluation: 660 trials run using Claude Code on the minimal-pair repos with hidden tests; reported comparison of pass rates between clean and messy repo variants showing no change.

high null result Does Code Cleanliness Affect Coding Agents? A Controlled Min... pass rate (task success on hidden tests)

We conduct extensive experiments on public datasets, in simulated auction environments, and through large-scale online deployment on Taobao.

Statement of experimental methodology describing the types of evaluations performed (public datasets, simulated auctions, and online deployment).

high null result Generative Auto-Bidding with Unified Modeling and Exploratio... scope and environments of experiments (public datasets, simulations, live deploy...

Reported empirical values are transformed through transparent indicators such as relative growth, CAGR, growth multipliers, stock-flow ratios, concentration ratios, and HHI.

Methodological description and application in the paper listing these specific indicators used to summarize public data on AI investment, adoption, robots, compute, and labour-market reallocation.

high null result The Agentic Economy: Humans, AI Agents, Robots, and the Meas... data transformation / indicator usage
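
Several of the indicators named in this entry have standard textbook definitions, sketched below. The function names and the input numbers are illustrative assumptions and do not come from the paper; the paper's exact conventions (for example, whether HHI is on a 0 to 1 or a 0 to 10,000 scale) are not stated in this summary.

```python
def relative_growth(start_value, end_value):
    """Relative growth between two observations, as a fraction."""
    return (end_value - start_value) / start_value


def growth_multiplier(start_value, end_value):
    """Ratio of the end value to the start value."""
    return end_value / start_value


def cagr(start_value, end_value, years):
    """Compound annual growth rate over `years` periods."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def market_shares(values):
    """Convert raw values (e.g. units per country) into shares summing to 1."""
    total = sum(values)
    return [v / total for v in values]


def concentration_ratio(shares, k):
    """Combined share of the k largest units (CR_k)."""
    return sum(sorted(shares, reverse=True)[:k])


def hhi(shares):
    """Herfindahl-Hirschman Index on the 0-1 scale (sum of squared shares)."""
    return sum(s * s for s in shares)


# Hypothetical example: a stock that grows from 100 to 121 over two years,
# and three units holding 50, 30 and 20 of some total.
shares = market_shares([50.0, 30.0, 20.0])
summary = {
    "cagr": cagr(100.0, 121.0, 2),
    "multiplier": growth_multiplier(100.0, 121.0),
    "cr2": concentration_ratio(shares, 2),
    "hhi": hhi(shares),
}
```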

The study uses a conceptual-empirical quantitative diagnostic design rather than a causal econometric model.

Explicit methodological statement in the paper describing the design choice and rejecting causal econometric modeling in favor of diagnostics using public institutional data and transparent indicators.

high null result The Agentic Economy: Humans, AI Agents, Robots, and the Meas... study methodology (diagnostic vs causal modeling)

The agentic economy is not yet a completed global order, but its transition pressure is measurable enough to require a distinct economic vocabulary, reproducible diagnostics, and future sector-level measurement.

Synthesis of diagnostic indicators (AI investment/adoption trends, robot stock, compute-energy coupling, labour reallocation measures) showing measurable transition pressures; conclusion drawn from the conceptual-empirical diagnostic.

high null result The Agentic Economy: Humans, AI Agents, Robots, and the Meas... degree of completion of 'agentic economy' transition / measurability of transiti...

Following PRISMA 2020 guidelines, searches across Google Scholar, Web of Science, Scopus, ScienceDirect, and CNKI yielded 1,562 initial records, of which 21 studies published between 2019 and 2026 met inclusion criteria.

Methodological description of the systematic literature review reported in the paper: initial records = 1,562; included studies = 21; publication years 2019–2026.

high null result Application of Artificial Intelligence in Human Resource Man... number of records screened and studies included

Small and medium-sized enterprises (SMEs) constitute over 98.5% of businesses in many economies including China.

Descriptive statistic reported in the paper's background/intro; source of the statistic not specified within the summary provided.

high null result Application of Artificial Intelligence in Human Resource Man... share of businesses that are SMEs

This study analyzes developments through April 2026.

Explicit timeframe statement in the paper's summary/introduction.

high null result AI for Auto-Research: Roadmap & User Guide temporal coverage of the review/analysis

The core findings remain robust across the robustness checks the authors report.

Robustness checks reported by the authors (unspecified in abstract) that do not overturn the main findings.

high null result Dissipation of Debt Financing Privilege on Corporate AI Wash... robustness of core findings (debt financing cost increase for AI washing firms)

China's 14th Five Year Plan (FYP) is used as a quasi-natural experiment / strategic policy shock to study effects of AI washing.

Research design leverages the FYP announcement as an exogenous policy shock in a difference-in-differences framework (design claim; no sample size in abstract).

high null result Dissipation of Debt Financing Privilege on Corporate AI Wash... policy shock (use of FYP as quasi-experiment)
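
The difference-in-differences logic behind this design can be sketched in its simplest two-group, two-period form. This is only an illustration: the paper's actual specification (controls, fixed effects, treatment definition) is not given in this summary, and the function name `did_estimate` and the data are hypothetical.

```python
from statistics import mean


def did_estimate(rows):
    """Two-by-two difference-in-differences estimate.

    `rows` is an iterable of (treated, post, outcome) tuples, where
    `treated` and `post` are booleans. The estimate is the change in the
    treated group's mean outcome minus the change in the control group's.
    """
    cells = {}
    for treated, post, outcome in rows:
        cells.setdefault((treated, post), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change


# Hypothetical debt-cost observations (percent) before and after the shock.
rows = [
    (True, False, 5.0), (True, False, 5.2),
    (True, True, 6.0), (True, True, 6.2),
    (False, False, 4.0), (False, False, 4.2),
    (False, True, 4.5), (False, True, 4.7),
]
effect = did_estimate(rows)
```

The estimate is only interpretable as a causal effect under the usual parallel-trends assumption, which this sketch does not test.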

AI washing is identified as the residual between AI narrative intensity and patent output.

Constructed a firm-level AI washing proxy by regressing AI narrative intensity on patent output and using the residual; described as the study's measurement approach (no sample size reported in the abstract).

high null result Dissipation of Debt Financing Privilege on Corporate AI Wash... AI washing measure (residual between narrative intensity and patent output)
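
A residual-based proxy of this kind can be sketched with a one-regressor ordinary least squares fit: narrative intensity is regressed on patent output, and the part of the narrative not explained by patents is kept as the proxy. The variable names and numbers are hypothetical, and the paper's regression may include further controls that this summary does not describe.

```python
from statistics import mean


def ols_residuals(x, y):
    """Residuals from a simple OLS regression of y on x with an intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical firm-level data: patent output and AI narrative intensity.
patent_output = [0.0, 1.0, 2.0, 3.0]
narrative_intensity = [1.0, 2.0, 3.0, 8.0]

# A positive residual means more AI talk than the firm's patents predict.
ai_washing_proxy = ols_residuals(patent_output, narrative_intensity)
```

By construction the residuals average to zero across firms, so the proxy measures AI talk relative to the sample norm rather than an absolute level.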
