Evidence (13870 claims)

Evidence Matrix

Claim counts by outcome category and direction of finding.

Outcome	Positive	Negative	Mixed	Null	Total
Other	749	196	98	892	1984
Governance & Regulation	817	394	188	121	1544
Organizational Efficiency	771	189	124	83	1177
Technology Adoption Rate	627	233	123	96	1088
Research Productivity	411	123	56	332	933
Output Quality	467	178	59	47	751
Decision Quality	320	174	75	42	618
Firm Productivity	435	55	88	20	604
AI Safety & Ethics	214	276	65	33	593
Market Structure	178	167	122	24	496
Task Allocation	207	64	71	32	379
Skill Acquisition	165	59	60	17	301
Innovation Output	203	27	43	18	292
Employment Level	105	52	107	13	279
Fiscal & Macroeconomic	131	69	43	26	276
Consumer Welfare	116	63	42	11	232
Firm Revenue	150	48	26	3	227
Inequality Measures	44	122	49	6	221
Task Completion Time	169	29	8	12	219
Worker Satisfaction	89	63	20	12	184
Error Rate	69	92	10	2	173
Regulatory Compliance	76	68	14	5	163
Training Effectiveness	93	21	13	19	148
Wages & Compensation	77	36	25	6	144
Automation Exposure	51	54	22	12	142
Team Performance	86	17	27	9	140
Developer Productivity	94	17	14	6	132
Job Displacement	12	80	20	1	113
Hiring & Recruitment	51	7	8	3	69
Creative Output	31	17	7	3	59
Skill Obsolescence	5	46	6	1	58
Social Protection	27	16	8	2	53
Labor Share of Income	17	17	17	—	51
Worker Turnover	11	12	—	3	26
Industry	—	—	—	1	1

The study employed an interpretivist philosophy and a case study design.

Methods section: explicitly states interpretivist philosophy and case study design.

high null result The augmented recruiter: examining AI integration and decisi... research_design

Collaboration among content creators can be modeled as a multi-agent stochastic linear bandit problem with a transferable utility (TU) cooperative game formulation, where a coalition's value equals the negative sum of its members' cumulative regrets.

Methodological/modeling claim: the paper defines a multi-agent stochastic linear bandit and maps coalition value to negative sum of cumulative regrets as the TU game payoff function.

high null result Creator Incentives in Recommender Systems: A Cooperative Gam... coalition value defined as negative sum of members' cumulative regrets
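
The coalition value described in this claim can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `coalition_value`, the creator labels, and the regret numbers are all hypothetical, and the sketch assumes each member's cumulative regret within the coalition has already been computed by some bandit algorithm.

```python
from typing import Dict, Iterable


def coalition_value(cumulative_regret: Dict[str, float],
                    coalition: Iterable[str]) -> float:
    """TU-game value of a coalition: the negative sum of its members'
    cumulative regrets (lower total regret means a higher value)."""
    return -sum(cumulative_regret[member] for member in coalition)


# Hypothetical cumulative regrets for three creators (illustrative numbers only).
regrets = {"a": 4.0, "b": 2.5, "c": 1.0}

value_ab = coalition_value(regrets, ["a", "b"])
print(value_ab)
```

Because the value is a negated regret, every coalition value is at most zero, and a coalition that learns faster (accumulating less regret) is worth more.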

All participants had access to the same AI tool; the experiment varied only the structure surrounding its use (behavioral vs cognitive scaffolding vs unstructured).

Experimental design description in the paper: common AI tool provided to all participants; randomization/assignment varied only the scaffolding around AI use.

high null result Scaffolding Human-AI Collaboration: A Field Experiment on Be... experimental manipulation fidelity (same AI tool across conditions)

Both country and domain rankings are stable from 2021 to 2024.

Temporal analysis reported in paper comparing GCI and ETGCI rankings across 2021-2024, concluding stability of rankings over that period.

high null result The Geoeconomics of Venture Capital An Economic Complexity A... stability of GCI and ETGCI rankings over time (2021–2024)

We found no evidence that information provision drove effects on our behavioural outcomes.

Analysis from the preregistered experiments showing that manipulations of information provision did not produce corresponding changes in measured behaviours (e.g., petition signing, donations).

high null result Artificial intelligence can persuade people to take politica... behavioural outcomes (petition signing, donations) in response to information pr...

We observed no evidence of a correlation between AI persuasion effects on attitudes and behaviour.

Analysis reported in the two preregistered experiments comparing AI-induced changes in attitudes with corresponding behavioural outcomes across participants (sample reported in paper).

high null result Artificial intelligence can persuade people to take politica... correlation between attitude changes and behavioural changes induced by AI persu...

The study uses the 2015 Green Data Center Pilot Policy as a quasi-natural experiment and employs the difference-in-differences (DID) method to identify the policy's impact on urban inclusive green growth.

Author-stated research design: quasi-natural experiment leveraging the 2015 policy and DID estimation (methodological claim in the paper).

high null result How does green digital economy policy enable inclusive green... Research design / identification strategy (DID using 2015 policy)
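
The difference-in-differences logic named in this claim can be illustrated with the simplest two-group, two-period case. This is a sketch under stated assumptions, not the authors' specification (published DID studies typically use a regression with fixed effects and controls): the function name `did_estimate` and all outcome values below are made up for illustration.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two DID: the change in the treated group's mean outcome
    minus the change in the control group's mean outcome."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical outcome values for pilot (treated) and non-pilot (control) cities.
treated_pre = [1.0, 1.2, 0.8]
treated_post = [1.6, 1.8, 1.4]
control_pre = [0.9, 1.1, 1.0]
control_post = [1.2, 1.3, 1.1]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(round(effect, 3))
```

The estimate is only interpretable as a policy effect if treated and control units would have followed parallel trends without the policy, which is the identifying assumption of the method.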

These results are observational and reflect a single-operator dataset without controlled comparison.

Author statement in the paper describing study limitations.

high null result Context Engineering: A Practitioner Methodology for Structur... study design and limitations

A digital–intelligent integration index was constructed using entropy weighting and a coupling coordination model.

Methodological description in the paper: index construction via entropy weighting combined with a coupling coordination model.

high null result Research on the Pathways and Spatial Effects of Digital–Inte... construction of digital–intelligent integration index
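
For readers unfamiliar with the two techniques named in this claim, the sketch below shows one common textbook form of each. It is an assumption that the paper uses these exact formulas: the function names, the equal default weights `alpha` and `beta`, and the sample data are hypothetical. The entropy step assumes non-negative, already-normalized indicators, more than one observation, a positive sum in every column, and at least one indicator that varies across observations.

```python
import math


def entropy_weights(rows):
    """Entropy weighting: indicators whose values vary more across
    observations carry more information and receive larger weights."""
    n = len(rows)       # number of observations (must be > 1)
    m = len(rows[0])    # number of indicators
    divergence = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        proportions = [value / total for value in column]
        # Shannon entropy scaled to [0, 1]; 0 * ln(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in proportions if p > 0) / math.log(n)
        divergence.append(1.0 - entropy)
    scale = sum(divergence)
    return [d / scale for d in divergence]


def coupling_coordination(u1, u2, alpha=0.5, beta=0.5):
    """Two-subsystem coupling coordination degree D = sqrt(C * T)."""
    coupling = 2 * math.sqrt(u1 * u2) / (u1 + u2)   # coupling degree C
    composite = alpha * u1 + beta * u2              # comprehensive index T
    return math.sqrt(coupling * composite)


# Hypothetical data: two observations, two indicators.
weights = entropy_weights([[1.0, 1.0], [1.0, 3.0]])
degree = coupling_coordination(0.5, 0.5)
print(weights, degree)
```

In this toy data the first indicator is identical across observations, so it receives essentially no weight, while two equally developed subsystems yield a coupling degree of 1.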

The study uses panel data from 30 provincial-level regions in China covering 2014–2023 to analyze the relationship between digital–intelligent integration and carbon intensity.

Panel dataset described as 30 provincial-level regions, years 2014–2023; index construction and empirical analysis reported in the paper.

high null result Research on the Pathways and Spatial Effects of Digital–Inte... dataset coverage (30 provinces, 2014–2023)

The paper provides statistics on the agreement rates between different measures of AI exposure.

Descriptive/statistical comparison of multiple AI-exposure measures (e.g., different O*NET-based metrics) reporting agreement rates.

high null result AI and Coder Employment: Compiling the Evidence agreement rates between AI exposure measures

The authors validate their industry-level control variable by examining historical examples of occupations that experienced either occupation-specific or industry-level shocks.

Validation exercise using historical case studies/examples comparing known occupation-specific and industry-level shocks to assess the control variable's performance.

high null result AI and Coder Employment: Compiling the Evidence performance/validity of the industry-level control variable in distinguishing sh...

There is a significant research gap in comparative understanding of generative AI's impact across developed and developing economies; differences in infrastructure, labour markets, and skill distributions may lead to uneven outcomes.

Review observation that the included literature lacks sufficient comparative studies across country-development contexts (explicitly noted as a gap in the paper).

high null result Generative AI in the Workplace: A Systematic Review of Produ... comparative evidence on generative AI impacts across developed vs. developing ec...

This systematic literature review synthesised findings from 40 empirical and conceptual studies published between 2020 and 2025 using the PRISMA framework (search across Google Scholar and Dimensions.ai), yielding 3,252 database records plus 8 hand-searched studies, of which 40 met the inclusion criteria.

PRISMA-style structured literature search reported in the paper: database search (Google Scholar, Dimensions.ai) returning 3,252 records, 8 hand-searched records, 40 studies meeting inclusion.

high null result Generative AI in the Workplace: A Systematic Review of Produ... systematic review sample and search yield (records screened/included)

This study uses Partial Least Squares Structural Equation Modeling (PLS-SEM) on 350 survey responses to examine the effects of AI adoption, regulatory clarity, digital infrastructure readiness, and cross-border data governance quality on international trade performance, with compliance effectiveness as a mediating mechanism.

Methodological description in the paper: PLS-SEM analysis on a survey sample of 350 responses (sample size explicitly reported).

high null result Artificial Intelligence and International Business Law: Tran... effects of the four antecedent factors on compliance effectiveness and trade per...

Empirical evidence remains limited on how AI deployment and institutional conditions jointly influence compliance effectiveness and international trade performance.

Statement of research gap based on the paper's literature review and motivation for the study.

high null result Artificial Intelligence and International Business Law: Tran... availability/extent of empirical evidence on joint influence of AI deployment an...

The selected studies originated mainly from Peru, Colombia, Chile, and Ecuador.

Geographic provenance reported for the 27 included studies (country distribution summarized in results).

high null result Artificial Intelligence for Business Decision-Making in Lati... geographic origin of research studies

After screening, 27 studies were selected for inclusion in the review.

PRISMA-style screening and eligibility process reported in the methods/results, yielding 27 included studies.

high null result Artificial Intelligence for Business Decision-Making in Lati... number of studies included

The initial search returned 276,302 records.

Reported search yield from the Scopus query described in the methods.

high null result Artificial Intelligence for Business Decision-Making in Lati... number of records retrieved

A systematic search was conducted in the Scopus database following PRISMA 2020 guidelines for articles published between 2021 and 2025 using Boolean operators related to AI and decision-making.

Methodological description in the paper stating adherence to PRISMA 2020 and the search strategy (Scopus, 2021–2025).

high null result Artificial Intelligence for Business Decision-Making in Lati... systematic review methodology / search procedure

External environmental pressures did not show a significant role in the adoption process.

PLS-SEM results from the survey (n=110) reportedly found no significant effect of environmental/external pressures on AI adoption.

high null result Drivers of AI Adoption: The Role of Innovation Attributes, O... AI adoption (in relation to external/environmental pressure)

Data analysis used PLS-SEM in SmartPLS, which supported reliability and validity assessment along with hypothesis testing.

Paper reports using SmartPLS for Partial Least Squares Structural Equation Modeling to assess reliability, validity, and test hypotheses.

high null result Drivers of AI Adoption: The Role of Innovation Attributes, O... analytical method used

The investigation was guided by the Technology-Organization-Environment (TOE) framework combined with innovation characteristics from Diffusion of Innovation theory.

Paper states theoretical frameworks used to design variables and hypotheses: TOE plus DOI innovation characteristics.

high null result Drivers of AI Adoption: The Role of Innovation Attributes, O... theoretical framing / constructs selection

A total of 110 valid responses were collected through an organized online survey using purposive sampling.

Reported sample description in the paper: online survey, purposive sampling, resulting in 110 valid responses.

high null result Drivers of AI Adoption: The Role of Innovation Attributes, O... sample_size / data_collection

The explanatory interface has no significant impact on situational trust.

Trust measured in different forms (situational, learned, cognitive, emotional) in the RCT; authors report no significant effect of explanatory interface on situational trust (N=120).

high null result How AI-Assisted Decision-Making Paradigms and Explainability... situational trust

Under the sequential AI-assisted decision-making paradigm, the explanatory interface has no significant effect on immediate task performance.

Same randomized controlled experiment; authors report no significant effect of explanatory interface on immediate task performance in the sequential paradigm (N=120 total).

high null result How AI-Assisted Decision-Making Paradigms and Explainability... immediate task performance (task execution stage)

The study was a randomized controlled experiment with 120 pre-service teachers.

Randomized controlled experiment reported in the paper; sample described as 120 pre-service teachers.

high null result How AI-Assisted Decision-Making Paradigms and Explainability... study_design/sample

The study uses a panel of 283 prefecture-level and above cities from 2012 to 2023 and a difference-in-differences (DID) identification strategy exploiting the establishment of national big data comprehensive pilot zones as a natural experiment.

Methodological description in the paper: sample composition (283 cities at prefecture level and above), time span (2012–2023), and the DID/natural experiment design to estimate policy effects.

high null result Study on the Impact of Establishing Big Data Comprehensive P... study design / methodological setup

We instantiate this vision in a controlled study (n=36) comparing the gaze-aware AI assistant to a text-only LLM assistant.

The paper reports running a controlled user study with sample size n=36 directly comparing the gaze-aware assistant against a text-only LLM assistant.

high null result From Gaze to Guidance: Interpreting and Adapting to Users' C... study design / experimental comparison

Exploratory innovation does not show a significant direct association with long-term competitive performance.

PLS-SEM results from the survey of 104 Portuguese B2B managers reporting a non-significant direct path from exploratory innovation to performance.

high null result Generative AI Adoption in B2B Firms: Ethical Governance, Inn... long-term competitive performance

Data were analyzed using partial least squares structural equation modeling (PLS-SEM) implemented in SmartPLS 4.

Methods section statement in paper indicating use of PLS-SEM and SmartPLS 4 for data analysis.

high null result Decision-Making in Complex Systems Using AI-Based Decision S... analysis method

The empirical analysis is based on a questionnaire survey administered to 324 respondents from Romanian organizations operating in IT, services, industry, and public administration.

Questionnaire survey described in paper; sample size explicitly stated as 324 respondents from Romanian organizations across IT, services, industry, and public administration.

high null result Decision-Making in Complex Systems Using AI-Based Decision S... sample description / data source

ARS's implementation can be found at https://github.com/t54-labs/AgenticRiskStandard.

Link to code repository provided in the abstract (factual statement pointing to implementation).

high null result Quantifying Trust: Financial Risk Management for Trustworthy... availability of ARS implementation in a public GitHub repository

As AI systems evolve into autonomous agents deployed in open environments and increasingly connected to payments or assets, the operational meaning of trust shifts to end-to-end outcomes: whether an agent completes tasks, follows user intent, and avoids failures that cause material or psychological harm.

Conceptual/argumentative claim presented in the paper (no empirical sample reported in the abstract).

high null result Quantifying Trust: Financial Risk Management for Trustworthy... agent task completion, alignment with user intent, avoidance of material or psyc...

Prior work on trustworthy AI emphasizes model-internal properties such as bias mitigation, adversarial robustness, and interpretability.

Summary statement about existing literature (no empirical data or sample reported in the abstract; asserted by authors as background).

high null result Quantifying Trust: Financial Risk Management for Trustworthy... research emphasis on model-internal properties (bias mitigation, adversarial rob...

On document intelligence (DocILE), our Code Factory variant matches Direct LLM on key field extraction (KILE: 80.0%).

Empirical evaluation reported on DocILE dataset of 5,680 invoices; KILE metric reported at 80.0%.

high null result Compiled AI: Deterministic Code Generation for LLM-Based Wor... key field extraction accuracy (KILE)

We evaluate on two task types: function-calling (BFCL, n=400) and document intelligence (DocILE, n=5,680 invoices).

Statement in paper specifying dataset/task types and sample sizes used in evaluation.

high null result Compiled AI: Deterministic Code Generation for LLM-Based Wor... evaluation datasets and sample sizes

All four models converge to similar skill profiles (3.6-point spread), suggesting that text-based automation feasibility may be more skill-dependent than model-dependent.

Comparison across 4 LLMs (LLaMA 3.3 70B, Mistral Large, Qwen 2.5 72B, Gemini 2.5 Flash) with reported 3.6-point spread in skill-profile SAFI scores.

high null result The AI Skills Shift: Mapping Skill Obsolescence, Emergence, ... variation (spread) in SAFI skill profiles across models

We validate this principle through a controlled experiment on log format token economy across four conditions (human-readable, structured, compressed, and tool-assisted compressed).

Controlled experiment described in the paper comparing four log-format conditions (human-readable, structured, compressed, tool-assisted compressed); exact sample size not reported in the abstract.

high null result Beyond Human-Readable: Rethinking Software Engineering Conve... performance on log-format token economy under different formatting conditions

For six decades, software engineering principles have been optimized for a single consumer: the human developer.

Historical/position claim asserted in the paper (conceptual/literature-based argument), no empirical sample or quantitative test reported.

high null result Beyond Human-Readable: Rethinking Software Engineering Conve... orientation of software engineering design towards human developers

A series of robustness checks were conducted to ensure the reliability of the conclusions.

Paper statement that multiple robustness checks were performed in support of the main DiD findings (checks such as alternative specifications and placebo tests are implied rather than listed).

high null result Unlocking Green Growth: How Artificial Intelligence Policies... green economic efficiency (GEE)

The study uses China's National New-Generation Artificial Intelligence Innovation Development Pilot Zone (NAIDPZ) as a quasi-natural experiment and applies a staggered difference-in-differences (DiD) model on panel data of 267 Chinese prefecture-level cities from 2007 to 2023.

Paper statement of research design: staggered DiD model applied to panel data covering 267 prefecture-level Chinese cities over 2007–2023, treating NAIDPZ as quasi-natural experiment.

high null result Unlocking Green Growth: How Artificial Intelligence Policies... green economic efficiency (GEE)

Explicit 'Sponsored' labels do not significantly reduce persuasion.

Experimental comparison including conditions with explicit 'Sponsored' labels; authors report no significant reduction in persuasion when labels were present (from the preregistered experiments).

high null result Commercial Persuasion in AI-Mediated Conversations effect of 'Sponsored' labels on sponsored product selection

A fifth of all products were randomly designated as sponsored and promoted in different ways.

Paper description of experimental manipulation: 20% of products (a fifth) were randomly designated as sponsored in the catalog.

high null result Commercial Persuasion in AI-Mediated Conversations sponsorship assignment (experimental manipulation)

We conducted two preregistered experiments with N = 2,012 participants.

Statement of experimental design in the paper (two preregistered experiments) with total sample size reported as N = 2,012.

high null result Commercial Persuasion in AI-Mediated Conversations study_design / sample_size

Today's LLMs are trained to align with user preferences through methods such as reinforcement learning.

Statement of standard practice referenced in the paper, drawing on existing literature about alignment methods (e.g., reinforcement learning from human feedback). This is a descriptive claim about common training techniques rather than an experimental result in this paper.

high null result Ads in AI Chatbots? An Analysis of How Large Language Models... training and alignment methodology (use of RL-based methods)

Through a causal decomposition that automates one side of agent communication, we separate cooperation failures from competence failures, tracing their origins through agent reasoning analysis.

Method described in the paper: causal decomposition approach that automates one side of communication and analyzes agent reasoning to attribute failures (methodological claim; abstract mentions the approach but gives no sample size or quantitative metrics there).

high null result More Capable, Less Cooperative? When LLMs Fail At Zero-Cost ... ability to distinguish cooperation failures from competence failures

Capability does not predict cooperation.

Comparative experimental results reported in the paper showing different models with different capability levels achieving substantially different collective cooperation outcomes (specifically comparing OpenAI o3 and o3-mini).

high null result More Capable, Less Cooperative? When LLMs Fail At Zero-Cost ... degree of cooperation / collective performance

We build a multi-agent setup designed to study cooperative behavior in a frictionless environment, removing all strategic complexity from cooperation.

Methodological description in the paper: design and implementation of a multi-agent experimental setup intended to remove strategic complexity (no sample size or quantitative detail reported in the abstract).

high null result More Capable, Less Cooperative? When LLMs Fail At Zero-Cost ... ability to study cooperation in a frictionless environment (methodological capab...

A pre-registered experiment evaluates this thesis in a commons production economy -- where agents share a finite resource pool and collaboratively produce value -- at 50-1,000 agent scale.

Paper states that a pre-registered experiment is planned/described; the experiment context (commons production economy) and planned scale (50-1,000 agents) are specified in the excerpt. No experimental outcomes or effect estimates are reported here.

high null result AgentCity: Constitutional Governance for Autonomous Agent Ec... alignment-through-accountability in a commons production economy (collective pro...
