Evidence (13870 claims)
Adoption: 8467 claims
Productivity: 7558 claims
Governance: 6805 claims
Human-AI Collaboration: 6363 claims
Org Design: 4132 claims
Innovation: 4065 claims
Labor Markets: 3526 claims
Skills & Training: 2945 claims
Inequality: 2066 claims
Evidence Matrix
Claim counts by outcome category and direction of finding.
| Outcome | Positive | Negative | Mixed | Null | Total |
|---|---|---|---|---|---|
| Other | 749 | 196 | 98 | 892 | 1984 |
| Governance & Regulation | 817 | 394 | 188 | 121 | 1544 |
| Organizational Efficiency | 771 | 189 | 124 | 83 | 1177 |
| Technology Adoption Rate | 627 | 233 | 123 | 96 | 1088 |
| Research Productivity | 411 | 123 | 56 | 332 | 933 |
| Output Quality | 467 | 178 | 59 | 47 | 751 |
| Decision Quality | 320 | 174 | 75 | 42 | 618 |
| Firm Productivity | 435 | 55 | 88 | 20 | 604 |
| AI Safety & Ethics | 214 | 276 | 65 | 33 | 593 |
| Market Structure | 178 | 167 | 122 | 24 | 496 |
| Task Allocation | 207 | 64 | 71 | 32 | 379 |
| Skill Acquisition | 165 | 59 | 60 | 17 | 301 |
| Innovation Output | 203 | 27 | 43 | 18 | 292 |
| Employment Level | 105 | 52 | 107 | 13 | 279 |
| Fiscal & Macroeconomic | 131 | 69 | 43 | 26 | 276 |
| Consumer Welfare | 116 | 63 | 42 | 11 | 232 |
| Firm Revenue | 150 | 48 | 26 | 3 | 227 |
| Inequality Measures | 44 | 122 | 49 | 6 | 221 |
| Task Completion Time | 169 | 29 | 8 | 12 | 219 |
| Worker Satisfaction | 89 | 63 | 20 | 12 | 184 |
| Error Rate | 69 | 92 | 10 | 2 | 173 |
| Regulatory Compliance | 76 | 68 | 14 | 5 | 163 |
| Training Effectiveness | 93 | 21 | 13 | 19 | 148 |
| Wages & Compensation | 77 | 36 | 25 | 6 | 144 |
| Automation Exposure | 51 | 54 | 22 | 12 | 142 |
| Team Performance | 86 | 17 | 27 | 9 | 140 |
| Developer Productivity | 94 | 17 | 14 | 6 | 132 |
| Job Displacement | 12 | 80 | 20 | 1 | 113 |
| Hiring & Recruitment | 51 | 7 | 8 | 3 | 69 |
| Creative Output | 31 | 17 | 7 | 3 | 59 |
| Skill Obsolescence | 5 | 46 | 6 | 1 | 58 |
| Social Protection | 27 | 16 | 8 | 2 | 53 |
| Labor Share of Income | 17 | 17 | 17 | — | 51 |
| Worker Turnover | 11 | 12 | — | 3 | 26 |
| Industry | — | — | — | 1 | 1 |
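Direction shares can be read off any row of the matrix. The sketch below copies two rows from the table above and computes the positive share against the stated Total. Note that the four direction columns do not always sum to Total (for Governance & Regulation they sum to 1520 against a stated 1544); the page does not explain the gap, so the code only reports it. The variable and function names are illustrative.

```python
# Rows copied from the Evidence Matrix: (positive, negative, mixed, null, total).
rows = {
    "Governance & Regulation": (817, 394, 188, 121, 1544),
    "Firm Productivity": (435, 55, 88, 20, 604),
}

def positive_share(row):
    """Share of claims with a positive finding, relative to the stated Total."""
    positive, _negative, _mixed, _null, total = row
    return positive / total

def unexplained_gap(row):
    """Stated Total minus the sum of the four direction columns."""
    *directions, total = row
    return total - sum(directions)

for name, row in rows.items():
    print(f"{name}: positive share {positive_share(row):.1%}, gap {unexplained_gap(row)}")
```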
We instantiate SoP in AgentCity on an EVM-compatible layer-2 blockchain (L2) with a three-tier contract hierarchy (foundational, meta, and operational).
System implementation claim: the paper names the platform (AgentCity) and gives technical details (EVM-compatible L2, three-tier contract hierarchy).
In this architecture, smart contracts are the law itself -- the actual legislative output that agents produce and that governs their behavior.
Architectural/design claim in the paper describing the conceptual role of smart contracts within SoP; presented as an intended property of the system.
Agents discover, transact with, and delegate to agents owned by other parties without centralized oversight.
Asserted behavior pattern of autonomous agents in the paper's motivation; presented as a descriptive claim rather than supported by a reported experiment or dataset in the excerpt.
Autonomous AI agents are beginning to operate across organizational boundaries on the open internet.
Stated as an empirical observation in the paper's introductory motivation; no specific dataset or sample described in the text excerpt.
The divergence in collective outputs is not driven by participants abandoning AI, but by how participants use it.
Behavioral/usage data from the RCT indicating continued AI use across incentive conditions and differing usage patterns (no sample size or quantitative metrics provided in excerpt).
The empirical basis of the study is industry data from the Bureau of National Statistics of the Republic of Kazakhstan for 2020–2024.
Statement in the paper specifying the data source and years used for calibration of the model.
The study's methodological framework integrates the Bass model of innovation diffusion, an expanded production function with endogenous technological progress and the task-oriented Acemoglu–Restrepo approach, plus a multi-criteria system of industry prioritisation.
Description of the paper's modelling approach in the methods section; model components identified explicitly in the paper.
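Of the components listed above, the Bass diffusion model has a standard closed form for cumulative adoption, F(t) = (1 - e^{-(p+q)t}) / (1 + (q/p) e^{-(p+q)t}), with innovation coefficient p and imitation coefficient q. A minimal sketch follows; the coefficient values are illustrative assumptions, not the paper's calibration, and how the paper couples this curve to the production function is not described in the excerpt.

```python
import math

def bass_cumulative_adoption(t: float, p: float, q: float) -> float:
    """Cumulative adopter fraction F(t) in the standard Bass diffusion model.

    p: coefficient of innovation (external influence), p > 0
    q: coefficient of imitation (internal influence), q >= 0
    """
    if t <= 0:
        return 0.0
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)

def bass_adoption_path(p: float, q: float, periods: int) -> list:
    """F(0), F(1), ..., F(periods) for integer time steps."""
    return [bass_cumulative_adoption(t, p, q) for t in range(periods + 1)]

# Illustrative coefficients only (assumed, not taken from the paper).
path = bass_adoption_path(p=0.03, q=0.38, periods=10)
```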
We evaluated EcoAssist through benchmarks of 500 websites and a controlled study with 20 developers.
Explicit methodological statement in paper: benchmark sample size = 500 websites; user study sample size = 20 developers.
Functional correctness (test-based correctness) exhibits negligible statistical association with design satisfaction.
Statistical analysis reported in the experiments comparing test-pass outcomes (functional correctness) with design-satisfaction labels produced by the verifier; the paper states the association is negligible.
The review covers publications between 2019 and 2025.
Explicit scope of the literature search reported by the authors (time window of included/considered publications).
The survey synthesizes methodological trends across data-, feature-, and decision-level fusion strategies.
Synthesis and categorization reported in the paper based on analysis of the included studies (n=18).
The review examines 18 multimodal GeoAI studies identified through a PRISMA-ScR screening process from 57 candidate publications between 2019 and 2025.
Explicit methodological reporting in the paper: PRISMA-ScR screening yielded 18 included studies out of 57 candidates over the 2019–2025 period.
This paper presents a systematic survey of recent GeoAI studies that fuse multiple geospatial data modalities for key urban mobility tasks.
Authors report conducting a systematic literature survey using a PRISMA-ScR screening process described in the paper.
Inclusive urban mobility examines whether transport systems equitably support the everyday movements and accessibility needs of historically marginalized and underserved populations.
Definition/interpretive claim presented in the paper as conceptual framing (no empirical measurement reported).
Without further assumptions, fitness need not increase over time.
Theoretical result from the model: analysis shows no guaranteed monotonic increase in fitness absent additional assumptions (proof/derivation in paper).
Humans retain partial control through a 'fitness function' that allocates limited computational resources across lineages.
Model assumption: the mathematical model formalizes a fitness function that allocates computational resources across AI lineages (theoretical model specification).
Biological DNA mutations are random and approximately reversible, but descendant design in AIs will be strongly directed (so standard biological evolutionary models are not appropriate).
Analytic comparison and conceptual argument in the paper; replacement of random-mutation assumptions with a directed tree model of possible AI programs (theoretical discussion and model construction).
The authors build a dynamic model of public good provision in which agents contribute by solving problems posted on a public platform and accumulated solutions form a depreciating public archive.
Methodological claim: the paper states that a dynamic theoretical model is constructed; this describes the paper's method.
We conducted a systematic review and bibliometric analysis of 627 articles.
Statement in abstract reporting a systematic review and bibliometric analysis; sample size explicitly given as 627 articles.
Injecting generic green language into prompts has no reliable effect.
Controlled prompting experiments reported in the benchmark comparing prompts with 'generic green language' to other prompt types; claim of no reliable effect on measured footprint (no numerical statistics given in abstract).
This study uses data from 743 listed enterprises in China’s strategic emerging industries from 2014 to 2023 and employs mediation and moderation (interaction) tests to examine mechanisms (digital-green synergy, information asymmetry, financing constraints) and the moderating role of AI applications.
Statement of data and methods in the paper: panel of 743 listed firms (2014–2023); empirical strategy includes mediation analyses and moderation (interaction) tests.
This paper has been accepted at PEARC 2026.
Statement in the paper indicating conference acceptance.
The University's GIS Center Ecological Archive (849 curated datasets) serves as a single-agent baseline deployment of EnviSmart.
Reported deployment dataset count provided in the paper: 849 curated datasets used as a single-agent baseline.
The study employed a mixed-methods approach: a quantitative survey of 150 leading Nigerian firms across finance, tech, and manufacturing, complemented by qualitative analysis of government policy and workforce interviews.
Methodological statement in the paper explicitly describing sample and methods (quantitative survey n=150; qualitative policy and interviews).
The governance calibration problem — balancing control with the autonomy that gives agentic AI its value — emerges as the STS joint optimization challenge: governance must simultaneously enable and constrain autonomous operation.
Authors' synthesis and theoretical claim based on STS analysis and identified tensions between autonomy benefits and control needs in the literature.
Agentic AI transformation barriers constitute an interdependent sociotechnical system rather than isolated obstacles.
Interpretive conclusion drawn from STS mapping and cross-barrier interaction analysis across the reviewed literature.
Governance serves as the social subsystem's primary mechanism for managing the technical subsystem.
Interpretation from STS analysis in the review: authors identify governance as the key social mechanism constraining/enabling technical subsystem behavior.
STS mapping based on root-cause analysis revealed that 12 barriers originate in the technical subsystem and 17 in the social subsystem.
Authors' STS mapping of the 29 barriers to subsystem origins (technical vs. social) as derived from their root-cause analysis of the coded literature.
Twenty-nine barriers were identified and classified into five dimensions: technological (7), organizational (7), human (6), governance and regulatory (4), and economic (5).
Results of inductive coding of the 30-source literature corpus yielding 29 distinct barriers and reported counts per dimension.
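The per-dimension counts above and the subsystem split reported for the same review (12 technical, 17 social) should both total 29. A quick consistency check, with the numbers copied from the claims and the dictionary names chosen for illustration:

```python
# Counts as reported in the claims above.
barriers_by_dimension = {
    "technological": 7,
    "organizational": 7,
    "human": 6,
    "governance and regulatory": 4,
    "economic": 5,
}
barriers_by_subsystem = {"technical": 12, "social": 17}

total_by_dimension = sum(barriers_by_dimension.values())
total_by_subsystem = sum(barriers_by_subsystem.values())

# Both breakdowns describe the same 29 barriers.
print(total_by_dimension, total_by_subsystem)
```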
Sociotechnical Systems (STS) theory was applied as an interpretive lens to map dimensions onto social and technical subsystems and analyze cross-subsystem interactions.
Self-reported analytic approach: application of STS theory to the coded barriers to map origins and interactions across subsystems.
Barriers were identified inductively through open and axial coding.
Self-reported qualitative method: inductive thematic analysis using open and axial coding on the literature corpus.
A critical narrative literature review of 30 sources (2019–2026) was conducted.
Self-reported study method: critical narrative literature review; sample size = 30 sources published between 2019 and 2026.
Acemoglu (2025) argues that near-term aggregate productivity gains from AI may be quite modest.
Citation to Acemoglu (2025) viewpoint noted in the introduction.
The experiment used stratified randomization across 32 strata with 255 treatment firms and 260 control firms; baseline characteristics are well balanced across groups.
Experimental design description: stratification by geography, traction score, and baseline AI use; reporting of allocation counts and balance tests in Table 2.
Attrition from the accelerator was low (1.6%, eight ventures) and balanced across treatment and control.
Program enrollment and retention records for the 515 firms in the randomized accelerator; 8 firms attrited.
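The design and attrition figures above can be illustrated with a short sketch. The assignment routine is a generic stratified randomization (shuffle within each stratum, then split as evenly as possible); it is an assumption about how such a design is typically implemented, not the authors' procedure, and `stratified_assign` is a hypothetical name. The attrition arithmetic uses only the reported counts.

```python
import random
from collections import defaultdict

def stratified_assign(units, seed=0):
    """Assign units to arms within strata.

    units: iterable of (unit_id, stratum) pairs; strata must be sortable.
    Returns a dict mapping unit_id to "treatment" or "control".
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for unit_id, stratum in units:
        by_stratum[stratum].append(unit_id)

    assignment = {}
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)
        n_treat = len(members) // 2
        # In odd-sized strata, a coin flip decides which arm gets the extra unit.
        if len(members) % 2 == 1 and rng.random() < 0.5:
            n_treat += 1
        for i, unit_id in enumerate(members):
            assignment[unit_id] = "treatment" if i < n_treat else "control"
    return assignment

# Reported counts: 255 treatment + 260 control firms, 8 of which attrited.
n_firms = 255 + 260
attrition_rate = 8 / n_firms
print(f"{n_firms} firms, attrition {attrition_rate:.1%}")
```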
The gains from treatment are broad-based: there are no significant differential effects by baseline firm performance or founder technical background.
Heterogeneity/subgroup analyses in the randomized sample (515 firms) comparing treatment effects across strata defined by baseline traction and founder technical background.
Treated firms' demand for labor remains unchanged.
RCT with 515 firms; firms reported labor demand, and the comparison between treatment and control groups showed no significant change.
The authors measure the set of tasks that are automated in a given year via queries to ChatGPT's Deep Research (their 'heroic measurement').
Methodological statement in the introduction describing the measurement approach for identifying automated tasks.
When the automation process is continuous, firms switch from labor to capital at exactly the point where costs are equal; the switching process itself generates no productivity growth.
Theoretical result (Proposition in the model) derived in the task-based framework under competitive equilibrium and the no-de-automation assumption.
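One way to read this result, as a sketch in hypothetical notation (the symbols below are assumptions, not the paper's): let c_L(x,t) and c_K(x,t) be the unit costs of performing task x at time t with labor and with capital, and let t* be the date at which the firm switches.

```latex
% Hypothetical notation: c_L, c_K are unit costs of task x using labor or capital.
c(x,t) = \min\{c_L(x,t),\; c_K(x,t)\}, \qquad
c_K(x,t^*) = c_L(x,t^*) \;\Longrightarrow\;
\lim_{t \uparrow t^*} c(x,t) = \lim_{t \downarrow t^*} c(x,t).
```

Under this reading, the task's unit cost is continuous through the switch date, so the switch alone does not lower costs or raise measured productivity.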
Despite substantial expected AI progress, most respondents do not forecast major departures from recent macroeconomic baselines, citing factors like historical base rates, adoption lags, demographic headwinds, policy responses, and infrastructure bottlenecks.
Qualitative summary of respondents' reasoning accompanying their unconditional forecasts (Key Findings and the Section 1.2 description of the survey elicitation).
AIGC and HGC exhibit distinct creation behaviors and consumption behaviors.
Descriptive comparisons in the longitudinal dataset showing differences in production rates, content volumes, and consumption patterns between AIGC and HGC.
The paper uses a comprehensive longitudinal dataset comprising tens of millions of users from a leading Chinese video-sharing platform.
Statement in paper summarizing data source: a longitudinal dataset covering 'tens of millions of users' from a major Chinese video-sharing platform; used for descriptive and comparative analyses of creation and consumption behavior.
Increasing reasoning effort (low, medium, high) provides no consistent benefit to estimation performance.
Controlled variation of each model's reasoning effort (low/medium/high) while asking them to produce 95% credible intervals for population statistics.
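The excerpt does not say how estimation performance was scored. One common check for 95% intervals is empirical coverage: the fraction of true values that fall inside the stated intervals, which should be close to 0.95 for a well-calibrated estimator. A minimal sketch under that assumption, with made-up numbers and a hypothetical function name:

```python
def interval_coverage(intervals, truths):
    """Fraction of true values that lie inside their (lower, upper) interval."""
    if len(intervals) != len(truths):
        raise ValueError("intervals and truths must have the same length")
    if not truths:
        raise ValueError("need at least one interval")
    hits = sum(1 for (lower, upper), truth in zip(intervals, truths)
               if lower <= truth <= upper)
    return hits / len(truths)

# Illustrative data only: three intervals, two of which contain the truth.
example_intervals = [(0.0, 10.0), (5.0, 6.0), (1.0, 2.0)]
example_truths = [5.0, 7.0, 1.5]
coverage = interval_coverage(example_intervals, example_truths)
```

Comparing coverage across the low, medium, and high effort settings is one way such a "no consistent benefit" pattern could be checked.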
These chats were committed to public repositories as part of routine development, capturing in-the-wild behavior.
Data collection method: analysis of chat transcripts committed to public repositories (the authors state the chats were collected from repositories and describe them as routine commits).
We analyze 74,998 developer messages from 11,579 chat sessions across 1,300 repositories and 899 developers using Cursor and GitHub Copilot.
Reported dataset counts in the paper (message, session, repository, developer counts) drawn from public commit histories of chats.
As advanced artificial systems become more autonomous participants in these processes, the resulting interaction space begins to resemble a new kind of ecosystem in which diverse agents exchange information, cooperate, compete, and jointly explore complex adaptive landscapes.
Conceptual argument presented in the paper drawing on theories of adaptive systems and collective intelligence; no empirical test or dataset reported.
Human and artificial agents are increasingly interacting within a shared informational environment that shapes economic activity, scientific discovery, governance, and collective decision making.
Statement in the paper's introduction: an observational claim made without citation (a conceptual assertion rather than empirical analysis). No sample or empirical method reported.
Conventional microeconomic models often treat interactions between algorithmic platforms and workers as static principal-agent problems.
Literature statement in paper (conceptual framing / literature review); no empirical sample reported.
We evaluate APEX across three baselines and six scenarios using sample sizes 2–4x larger than initial experiments (N=20–40 per scenario).
Experimental design statement in the paper (three baselines, six scenarios, reported N range of 20–40 per scenario).
The HTTP 402 protocol treats payment as a first-class protocol event, but most implementations rely on cryptocurrency rails.
Descriptive claim in the paper about the state of HTTP 402 and common implementations (literature/implementation survey-style claim in paper).
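For context, 402 is the HTTP "Payment Required" status code. A minimal sketch of treating payment as a protocol-level event might look like the following; the header name `X-Payment-Token` and the function `gate_request` are hypothetical, chosen only for illustration, and do not come from the paper or from any payment standard.

```python
from http import HTTPStatus

def gate_request(headers: dict) -> tuple:
    """Return (status_code, body) for a request to a paid resource.

    A request without proof of payment is answered with 402 instead of
    being rejected as unauthorized, so the client can pay and retry.
    """
    token = headers.get("X-Payment-Token")  # hypothetical header name
    if not token:
        return HTTPStatus.PAYMENT_REQUIRED.value, "payment required"
    return HTTPStatus.OK.value, "paid content"

status_unpaid, _ = gate_request({})
status_paid, _ = gate_request({"X-Payment-Token": "demo"})
```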