Does Artificial Intelligence Advance Science?

This paper examines whether and how artificial intelligence (AI) advances scientific creativity. Drawing on scientific publications, the primary output of researchers, we analyze over one million publications from OpenAlex to investigate the relationship between AI adoption and multiple dimensions of scientific creativity, including novelty (recombinant novelty and object novelty) and impact (3-year short-run citation impact and 10-year long-run citation impact). We find that AI publications are significantly more likely to achieve top-decile creativity relative to non-AI publications, with 5.5 to 10.2 percentage point higher likelihood to rank in the top creativity decile. Critically, we uncover substantial heterogeneity across AI research modes. Tool-oriented AI research, which applies existing AI models to domain tasks, is associated with the largest gains in recombinant-based creativity, while Adaptation-oriented AI research, modifying AI models for domain-specific problems, is associated with relatively higher object-based creativity. These findings reveal that AI does not advance science through a single mechanism but through structurally distinct creative pathways that depend on how AI is incorporated into the research process. Our results contribute to ongoing debates about AI's role in science and carry direct implications for research evaluation and science policy, highlighting the need for assessment frameworks that can distinguish between recombinant and conceptual forms of creativity and that recognize how different modes of AI adoption produce fundamentally different types of scientific contribution.

Summary

Main Finding

AI-related publications are substantially more likely than non-AI publications to rank among highly creative scientific outputs: AI papers have a 5.5 to 10.2 percentage point higher likelihood of ranking in the top decile of measured creativity. The association varies with how AI is used: Tool-oriented AI work is associated with the largest gains in recombinant novelty, while Adaptation-oriented AI work (modifying or developing models for domain problems) is more strongly associated with object-based novelty and higher long-run impact.

Key Points

Scope and headline result
- Analysis of 1,127,716 peer-reviewed papers (OpenAlex snapshot 30 May 2025): 697,551 AI-adoption papers vs 430,165 non-AI papers.
- AI adoption is associated with a roughly 5.5–10.2 percentage point higher probability of top-decile creativity.
Heterogeneous creative pathways by AI mode
- Tool-oriented mode (applying existing AI models to domain tasks): largest gains in recombinant novelty/creativity (i.e., novel recombinations of existing knowledge).
- Adaptation-oriented mode (adapting/developing AI for domain-specific problems): stronger association with object-based novelty (new phrases/concepts) and higher citation impact (including long-run 10-year impact).
- Foundational mode (development of new AI methodologies) and Discussion-mode papers were treated separately; the main comparative findings focus on the Adaptation vs Tool contrast (Discussion-mode papers were excluded from the AI-adoption group).
Conceptual framing
- Distinguishes novelty (departure from prior knowledge) from creativity (novelty + usefulness/impact).
- Distinguishes two empirical novelty measures: recombinant novelty (combinatorial novelty based on references) and object novelty (introduction of new phrases/objects).
Robust classification
- Two-stage AI identification: curated AI keyword set followed by a GPT–SciBERT semantic-embedding filter to reduce false positives.
- AI-mode classification follows an established framework (Adaptation, Tool, Foundational, Discussion); only Adaptation/Tool/Foundational are treated as AI adoption in analysis.
Data filters and sampling
- Restricts to English-language, peer-reviewed journal articles and conference proceedings, with DOIs, 2004–2024.
- Excludes preprints, retractions, editorials/commentaries; deduplicated.
- Non-AI comparison sample drawn by stratified random sampling (2% of non-AI publications, stratified by OpenAlex field), then filtered to match AI sample constraints.
Outcome measures
- Novelty: recombinant novelty (reference-combination based) and object novelty (new terms/phrases).
- Impact: 3-year (short-run) and 10-year (long-run) citation impact.
- Creativity operationalized as high novelty combined with meaningful impact (approximated via top-decile thresholds).
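
The top-decile operationalization of creativity can be illustrated with a minimal sketch. It assumes a paper counts as "creative" when it sits strictly above the 90th-percentile cutoff on both a novelty score and an impact score; the function names, the nearest-rank percentile rule, and the toy scores are illustrative assumptions, not the authors' implementation.

```python
from typing import List


def top_decile_cutoff(values: List[float]) -> float:
    """Nearest-rank 90th percentile: smallest value with >= 90% of observations at or below it."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = (9 * len(ordered) + 9) // 10  # integer ceil(0.9 * n), avoids float rounding
    return ordered[rank - 1]


def creative_flags(novelty: List[float], impact: List[float]) -> List[bool]:
    """Flag papers strictly above the top-decile cutoff on BOTH measures (assumed rule)."""
    if len(novelty) != len(impact):
        raise ValueError("novelty and impact must have the same length")
    novelty_cut = top_decile_cutoff(novelty)
    impact_cut = top_decile_cutoff(impact)
    return [n > novelty_cut and i > impact_cut for n, i in zip(novelty, impact)]


# Toy scores for ten hypothetical papers (not data from the paper).
toy_novelty = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
toy_impact = [3.0, 0.0, 1.0, 12.0, 5.0, 2.0, 8.0, 4.0, 40.0, 55.0]
toy_flags = creative_flags(toy_novelty, toy_impact)
print(toy_flags)
```

Because the comparison is strict, papers tied exactly at the cutoff are not flagged; a different tie rule would change the flagged share slightly.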
Limitations noted by authors
- AI mention ≠ AI use: classification attempts to filter discussion-only papers but measurement error remains possible.
- Exclusion of preprints (to avoid bias and incomplete citation data) may undercount some AI work, particularly in fast-moving subfields.
- OpenAlex coverage and gaps in citation/reference data can introduce attrition; the authors restrict to observations with non-missing key measures.

Data & Methods

Data source: OpenAlex bibliometric records (snapshot 30 May 2025), covering journals and conference proceedings (2004–2024).
Sample construction:
- Candidate AI papers: keyword matching on titles+abstracts → GPT–SciBERT embedding filter to remove false positives → classify into AI modes (Adaptation, Tool, Foundational, Discussion).
- Non-AI sample: stratified random 2% draw across 26 OpenAlex fields, then same preprocessing filters applied.
- Final analytical dataset: 1,127,716 papers (697,551 AI-adoption; 430,165 non-AI).
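
The stratified 2% draw for the non-AI comparison sample can be sketched as follows. This is a minimal illustration using only the standard library: the record layout, the `field` key, the seed, and the choice to keep at least one paper per field are assumptions, not details reported in the paper.

```python
import random
from collections import defaultdict
from typing import Dict, List


def stratified_sample(records: List[Dict], field_key: str,
                      fraction: float, seed: int = 0) -> List[Dict]:
    """Draw the same fraction of records from every stratum (here: field)."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_field = defaultdict(list)
    for rec in records:
        by_field[rec[field_key]].append(rec)

    sample = []
    for field in sorted(by_field):  # sorted for a deterministic stratum order
        group = by_field[field]
        # Assumption: keep at least one record so small fields are not dropped.
        k = max(1, round(fraction * len(group)))
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical population: 500 Medicine papers and 1000 Physics papers.
toy_records = (
    [{"id": i, "field": "Medicine"} for i in range(500)]
    + [{"id": 500 + i, "field": "Physics"} for i in range(1000)]
)
toy_sample = stratified_sample(toy_records, "field", 0.02, seed=42)
counts = defaultdict(int)
for rec in toy_sample:
    counts[rec["field"]] += 1
print(dict(counts))
```

Sampling within each field keeps the field mix of the comparison group proportional to the population, which a simple random 2% draw would only achieve approximately.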
Classification pipeline:
- Curated AI keyword set (from prior work) to create candidate set.
- Semantic filtering using a GPT–SciBERT embedding pipeline to capture topical relevance beyond keywords and reduce false positives.
- AI-mode labels assigned per an established taxonomy (Ding et al., 2025b); Discussion-mode papers excluded from the AI-adoption group.
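
The logic of the two-stage identification (keyword match, then semantic filter) can be sketched with a toy stand-in. The example below swaps the GPT–SciBERT embeddings for a plain bag-of-words cosine similarity so that it runs without any model; the keyword list, the prototype sentence, and the 0.3 threshold are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical keyword set and prototype text; the paper's curated list is not reproduced here.
AI_KEYWORDS = ("neural network", "machine learning", "deep learning")
AI_PROTOTYPE = ("we train a neural network model on data "
                "to predict outcomes with machine learning")


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A stand-in for a real semantic embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_candidate(text: str) -> bool:
    """Stage 1: keyword match on title plus abstract."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in AI_KEYWORDS)


def is_ai_paper(text: str, threshold: float = 0.3) -> bool:
    """Stage 2: keep candidates whose similarity to the prototype clears the threshold."""
    return is_candidate(text) and cosine(embed(text), embed(AI_PROTOTYPE)) >= threshold


ai_text = "We train a deep learning model to predict protein structure from sequence data."
false_positive = ("The biological neural network of the zebrafish retina "
                  "was imaged with confocal microscopy.")
unrelated = "We survey soil moisture in alpine meadows."

for text in (ai_text, false_positive, unrelated):
    print(is_candidate(text), is_ai_paper(text))
```

The second text matches a keyword but is about biology, so the similarity filter rejects it; that is the kind of false positive the second stage is meant to remove.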
Dependent variables and measurement:
- Recombinant novelty: inferred from reference combinations (atypical co-citation/reference pairings).
- Object novelty: inferred from the appearance of new phrases/objects in the publication text.
- Citation impact: short-run (3-year) and long-run (10-year) citation counts; creativity proxied by joint top-decile performance on novelty and impact metrics.
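
To make the two novelty notions concrete, here is a deliberately simplified sketch. It assumes recombinant novelty is the share of a paper's reference-source pairs never seen together in prior work, and object novelty is the number of two-word phrases absent from a prior corpus. Both definitions are simplifications chosen for illustration; the paper's actual measures may differ.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple


def recombinant_novelty(ref_sources: Iterable[str],
                        prior_pairs: Set[Tuple[str, str]]) -> float:
    """Share of reference-source pairs that are new relative to prior_pairs.

    prior_pairs must hold alphabetically sorted tuples, e.g. ("Cell", "Nature").
    """
    pairs = {tuple(sorted(p)) for p in combinations(set(ref_sources), 2)}
    if not pairs:
        return 0.0
    return len(pairs - prior_pairs) / len(pairs)


def object_novelty(text: str, prior_phrases: Set[str]) -> int:
    """Count of two-word phrases in the text that do not appear in prior_phrases."""
    words = text.lower().split()
    bigrams = {" ".join(words[i:i + 2]) for i in range(len(words) - 1)}
    return len(bigrams - prior_phrases)


# Hypothetical inputs, not data from the paper.
toy_recombinant = recombinant_novelty(
    ["Nature", "Cell", "NeurIPS"],
    prior_pairs={("Cell", "Nature")},
)
toy_object = object_novelty(
    "graph neural operator for turbulence",
    prior_phrases={"graph neural", "operator for"},
)
print(toy_recombinant, toy_object)
```

In this toy case two of the three source pairs are new, and two of the four phrases are new, which shows how a paper can score on one dimension independently of the other.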
Empirical approach (overview as reported):
- Comparative analysis of probabilities of top-decile creativity between AI and non-AI publications, and across AI modes.
- Controls and robustness checks are described (disciplinary stratification, exclusion criteria, handling of missing values); full model specifications are not covered in this summary.
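
The headline numbers are percentage-point gaps, which are easy to confuse with relative increases. The sketch below computes both from raw counts. It is an unadjusted comparison with made-up counts and a hypothetical function name; it does not reproduce the paper's controlled estimates.

```python
from typing import Tuple


def top_decile_gap(ai_top: int, ai_total: int,
                   non_ai_top: int, non_ai_total: int) -> Tuple[float, float]:
    """Return (percentage-point gap, relative increase in percent) in top-decile shares."""
    if ai_total <= 0 or non_ai_total <= 0 or non_ai_top <= 0:
        raise ValueError("totals and the non-AI top-decile count must be positive")
    ai_share = ai_top / ai_total
    non_ai_share = non_ai_top / non_ai_total
    gap_pp = 100 * (ai_share - non_ai_share)
    relative_pct = 100 * (ai_share - non_ai_share) / non_ai_share
    return gap_pp, relative_pct


# Illustrative counts only: 18% of AI papers vs 10% of non-AI papers in the top decile.
gap_pp, relative_pct = top_decile_gap(180, 1000, 100, 1000)
print(round(gap_pp, 1), round(relative_pct, 1))
```

With these toy counts the gap is about 8 percentage points, while the relative increase is about 80 percent, so the two framings should not be used interchangeably.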

Implications for AI Economics

Re-evaluate AI as a heterogeneous input in models of research production
- Economic models should not treat AI as a single productivity multiplier. AI functions both as a toolkit that expands combinatorial search (Tool mode) and as a source of conceptual transformation when adapted/developed for domain problems (Adaptation mode). Policy and theoretical models must allow for these distinct pathways and elasticities of substitution.
Funding and policy design
- Policies that solely subsidize off-the-shelf tool adoption may boost short-run recombinant productivity but are less likely to generate the object-level conceptual innovations associated with long-run scientific impact. Targeted support for adaptation and methodological development (training, domain-AI fellowships, incentives for method development) can foster deeper conceptual novelty with larger long-run returns.
Research evaluation and metrics
- Evaluation frameworks (grant review, hiring, promotion, bibliometrics) should distinguish recombinant versus object-based creativity. Relying only on short-term citation or publication counts risks overstating the creative contribution of tool-based AI use and understating long-run gains from adaptation/foundational modes.
Labor-market and organizational consequences
- Differential returns to modes imply heterogeneous rewards for skills: expertise in adapting/developing AI for domain problems is likely to yield higher long-run impact premiums than routine application of off-the-shelf models. Training and workforce development programs should prioritize skills that support adaptation and methodological innovation.
Systemic effects and diversity risks
- Consistent with related literature, AI tool adoption may concentrate attention and reduce topic diversity. From a welfare perspective, policymakers should balance measures that increase productivity with interventions that preserve exploratory diversity (e.g., seed grants for high-risk/high-novelty projects).
Measurement and future research needs
- Economists studying innovation should incorporate multiple novelty metrics (recombinant vs object-based) and consider time horizons for impact (short vs long run). Further work is needed to causally identify mechanisms (e.g., randomized funding experiments, researcher-level panel analyses) and to quantify social returns across AI modes.

Assessment

Paper Type: correlational
Evidence Strength: medium. Large-scale, multi-measure analysis shows consistent and sizable associations between AI use and multiple dimensions of creativity, and reports heterogeneous patterns by AI research mode; however, the observational design lacks clear exogenous variation to rule out selection, confounding, reverse causation, or measurement biases, limiting causal interpretation.
Methods Rigor: high. Uses a very large bibliometric dataset, multiple complementary creativity and impact metrics (short- and long-run citations, recombinant vs object novelty), and disaggregates AI into meaningful modes (tool vs adaptation), with robustness checks reported; potential issues remain around AI-paper classification, unobserved confounders, and citation as an imperfect proxy for scientific value.
Sample: Over one million scientific publications drawn from the OpenAlex bibliometric database, classified into AI vs non-AI papers and further into AI research modes (tool-oriented, adaptation-oriented, etc.), with measures of recombinant and object novelty and 3-year and 10-year citation impact; sample spans multiple fields and years (exact years and field coverage as in the paper).
Themes: innovation productivity
Identification: Comparative observational analysis of publication-level data from OpenAlex: AI-labeled vs non-AI papers contrasted using multivariate regression and stratified analyses, with controls (e.g., field, year, author/institution proxies) and robustness checks; no exogenous variation or randomized assignment is reported, so causal claims rely on statistical adjustment and heterogeneous associations across AI research modes.
Generalizability:
- Relies on OpenAlex-indexed publications: may underrepresent books, conference artifacts in some fields, or non-indexed regional journals.
- AI-paper labeling and AI-mode classification could misclassify interdisciplinary or ambiguous papers.
- Citation-based impact measures vary across disciplines and time and reflect visibility as well as quality.
- Findings may not generalize to unpublished work, industry R&D, or non-academic innovation settings.
- Temporal generalizability is limited if AI capabilities and adoption patterns change rapidly.

Claims (8)

Claim	Category	Direction	Confidence	Outcome	Details
The analysis draws on over one million publications from OpenAlex.	Other	null_result	high	sample of publications (dataset size)	n=1000000 0.5
AI publications are significantly more likely to achieve top-decile creativity relative to non-AI publications.	Creativity	positive	high	likelihood of ranking in top creativity decile	n=1000000 5.5 to 10.2 percentage point higher likelihood 0.3
AI publications have a 5.5 to 10.2 percentage point higher likelihood to rank in the top creativity decile.	Creativity	positive	high	increase in probability of being top-decile creative	n=1000000 5.5 to 10.2 percentage point higher likelihood 0.3
Tool-oriented AI research (applying existing AI models to domain tasks) is associated with the largest gains in recombinant-based creativity.	Creativity	positive	high	recombinant-based novelty/creativity	n=1000000 0.3
Adaptation-oriented AI research (modifying AI models for domain-specific problems) is associated with relatively higher object-based creativity.	Creativity	positive	high	object-based novelty/creativity	n=1000000 0.3
AI advances science through structurally distinct creative pathways rather than a single mechanism; the creative pathway depends on how AI is incorporated into the research process.	Creativity	mixed	high	mechanism/pathway of scientific creativity (qualitative synthesis from heterogeneous quantitative results)	n=1000000 0.3
The paper analyzes multiple dimensions of scientific creativity and impact, specifically recombinant novelty, object novelty, 3-year short-run citation impact, and 10-year long-run citation impact.	Other	null_result	high	measures used (recombinant novelty, object novelty, 3-year citations, 10-year citations)	n=1000000 0.5
The findings imply that research evaluation and science policy should adopt assessment frameworks that distinguish between recombinant and conceptual forms of creativity and recognize that different modes of AI adoption produce different types of scientific contribution.	Governance And Regulation	positive	high	policy recommendation for research evaluation frameworks	n=1000000 0.05