Firms that better harness data produce more AI patents, and data access especially helps low‑productivity Chinese digital firms close the innovation gap; the positive link holds across alternate productivity measures, though causality is not fully pinned down.
Using a sample of Chinese A-share listed companies in core digital economy industries from 2015 to 2024, this study examines how data element utilization drives AI technological innovation. Employing a panel fixed‑effects regression model, we find that the level of data factor utilization has a significant positive impact on AI patent output. This effect is more pronounced in firms with low total factor productivity (TFP), exhibiting a "contrarian" catch‑up characteristic. The conclusions remain robust after substituting different TFP measurement methods. This study reveals the unique mechanism through which data elements enable late‑entrant firms to catch up technologically, providing empirical evidence for deepening data element market reforms.
Summary
Main Finding
Digital transformation (measured by firm annual‑report keyword intensity for digital/AI-related terms) significantly increases corporate green technological innovation among Chinese A‑share manufacturing firms (2019–2024). The effect is robust to multiple checks and operates mainly through three channels: easing financing constraints (resource effect), reducing agency problems (governance effect), and strengthening firms’ growth/absorptive capacity (multiplier effect). The positive effect is stronger for non‑high‑tech firms and firms in heavily polluting industries, and applies to both green invention patents and green utility model patents.
Key Points
- Effect size (economic meaning): a one standard‑deviation rise in the digitalization measure is associated with a ~5.5% increase in independently obtained green patents (GPatent1) and a ~12.8% increase in jointly obtained green patents (GPatent2), as reported in the paper.
- Mechanisms identified:
- Resource effect: digitalization reduces information asymmetry and search costs, improving access to external finance and public support for green R&D.
- Governance effect: digital management and disclosure reduce managerial discretion and agency conflicts, improving implementation of green projects.
- Multiplier effect: digital technologies improve supply‑chain integration, knowledge sharing and market responsiveness, raising firms’ capacity to undertake sustained green innovation.
- Heterogeneity: the positive digital → green innovation effect is more pronounced in non‑high‑tech enterprises and in heavily polluting sectors.
- Robustness: results hold under propensity score matching (PSM), instrumental variable estimation, alternative variable constructions, extended observation windows, multiple fixed effects, and sample scope adjustments.
- Policy recommendations (from paper): promote firm digital transformation—especially for non‑high‑tech and polluting firms—build digital infrastructure, improve information disclosure, and subsidize digitalized green R&D.
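The effect-size bullet above converts a regression coefficient into a percentage change for a one standard-deviation rise in the regressor. This summary does not state which conversion the paper uses, nor the standard deviation of the digitalization measure, so the sketch below shows two common conventions. The function names and all inputs are hypothetical.

```python
import math


def one_sd_effect_log_outcome(beta, sd_x):
    """Percent change in (1 + patents) implied by a one-SD rise in the regressor,
    assuming the outcome is ln(1 + patents). Assumed convention, not confirmed."""
    return (math.exp(beta * sd_x) - 1.0) * 100.0


def one_sd_effect_relative_to_mean(beta, sd_x, mean_y):
    """Alternative convention: beta * SD expressed as a percent of the
    sample mean of the dependent variable."""
    if mean_y == 0:
        raise ValueError("mean of the dependent variable must be non-zero")
    return beta * sd_x / mean_y * 100.0
```

The two conventions can give quite different magnitudes for the same coefficient, so a reported figure such as ~5.5% can only be reproduced once the paper's descriptive statistics and chosen convention are known.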
Data & Methods
- Sample: A‑share listed manufacturing firms on Shanghai and Shenzhen exchanges, initial period 2019–2024; green patents measured one period ahead (effectively using patents in the subsequent year). Final sample: 5,810 firm‑year observations after screening and winsorization.
- Data sources: digital keyword frequencies from WinGo Financial Text Data Platform; green patent counts from CNRDS; firm financials, governance, industry classification and high‑tech status from CSMAR.
- Dependent variables:
- GPatent1 = ln(1 + number of green patents independently obtained next period) (invention + utility models).
- GPatent2 = ln(1 + number of green patents jointly obtained next period).
- Core independent variable:
- Digword = ln(1 + total frequency of 94 digital‑related keywords in firm annual reports). Keywords built from policy seed words expanded via Word2Vec.
- Empirical strategy:
- Baseline: OLS regressions with year fixed effects and a battery of firm controls (size, leverage, age, ROA, SOE indicator, top shareholder concentration, institutional ownership, cash ratio, Tobin’s Q, R&D intensity, etc.).
- Robustness/causal checks: propensity score matching (1:1 nearest neighbor), instrumental variable estimation (reported), alternative variable definitions, extended observation window, multiple fixed effects, and adjusted sample scopes.
- Main baseline estimates: Digword coefficients are positive and statistically significant (e.g., Digword ≈ 0.020 with t ≈ 3.49 in some specifications); the reported R‑squared values are modest, as is typical for firm patent regressions.
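The variable construction and baseline described above can be sketched roughly as follows. This is a simplified illustration, not the paper's code. It assumes keyword totals and patent counts are already available per firm-year, pairs year-t digitalization with year-t+1 patents, and estimates a single-regressor OLS slope with year fixed effects by within-year demeaning. The firm controls listed above are omitted, and the field names (`firm`, `year`, `keyword_total`, `green_patents`) are hypothetical.

```python
import math
from collections import defaultdict


def build_panel(rows):
    """Pair Digword in year t with GPatent in year t+1 for each firm.

    rows: list of dicts with hypothetical keys
          'firm', 'year', 'keyword_total', 'green_patents'.
    Returns a list of (year, digword, gpatent_next) tuples.
    """
    by_key = {(r["firm"], r["year"]): r for r in rows}
    panel = []
    for (firm, year), r in by_key.items():
        nxt = by_key.get((firm, year + 1))
        if nxt is None:
            continue  # no next-period patent count, so the firm-year is dropped
        digword = math.log1p(r["keyword_total"])    # ln(1 + keyword frequency)
        gpatent = math.log1p(nxt["green_patents"])  # ln(1 + patents next period)
        panel.append((year, digword, gpatent))
    return panel


def year_fe_slope(panel):
    """OLS slope of gpatent on digword with year fixed effects.

    Demeaning both variables within each year and regressing the residuals
    gives the same slope as including year dummies (single regressor only).
    """
    groups = defaultdict(list)
    for year, x, y in panel:
        groups[year].append((x, y))
    sxy = sxx = 0.0
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0:
        raise ValueError("no within-year variation in the regressor")
    return sxy / sxx
```

Adding the controls would require a multivariate regression, for which a statistics library is the more practical choice; the demeaning step above only illustrates what the year fixed effects absorb.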
Implications for AI Economics
- Microeconomic channel: AI and related digital technologies embedded in firm operations (captured by textual indicators) act as tangible inputs that lower frictions (information asymmetry, coordination costs) and raise firms’ capacity to adopt and generate green innovations. This provides micro‑level empirical support for theories that AI/digital adoption can induce green transition via both efficiency and organizational channels.
- Financial markets & investment: improved disclosure and data flows associated with digitalization can mobilize capital toward green projects. For AI economics, this highlights a measurable feedback loop—AI/digital adoption improves financing conditions, which in turn funds more R&D and innovation.
- Policy design: targeted digitalization subsidies or infrastructure investments for less digitalized, polluting, or non‑high‑tech firms may yield outsized green innovation returns. Policies that couple AI/digital vouchers with green R&D incentives could be particularly effective.
- Measurement & methods: the paper illustrates a replicable approach for quantifying firm digitalization using annual‑report text analysis (seed keywords + Word2Vec expansion). Researchers in AI economics can reuse/extend this textual approach to isolate AI‑specific adoption signals and link them to economic or environmental outcomes.
- Research avenues:
- Disentangle AI‑specific channels from broader “digitalization” (e.g., isolate mentions of machine learning/AI versus cloud/big data/IoT).
- Study longer‑term causal dynamics (e.g., event studies on major AI investments) and firm‑level heterogeneity (size, market power, international exposure).
- Explore complementary market effects (labor reallocation, product market competition) and welfare implications of digital‑enabled green innovation.
- Link granular measures of AI adoption (software/hardware spending, AI patenting, deployment cases) with patent quality (citations), not only counts.
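One way to start on the first research avenue above, separating AI-specific mentions from broader digitalization terms, is to compute one text index per keyword group. The sketch below is a minimal illustration. The keyword groups are hypothetical English placeholders, not the paper's 94-term list, and plain substring counting is assumed in place of whatever tokenization the original pipeline uses.

```python
import math
import re

# Hypothetical keyword groups for illustration only (NOT the paper's list).
KEYWORD_GROUPS = {
    "ai": ["machine learning", "artificial intelligence", "deep learning"],
    "other_digital": ["cloud computing", "big data", "internet of things"],
}


def count_terms(text, terms):
    """Count case-insensitive, non-overlapping occurrences of each term in text."""
    lowered = text.lower()
    return sum(len(re.findall(re.escape(term), lowered)) for term in terms)


def digital_indices(text, groups=KEYWORD_GROUPS):
    """Return ln(1 + frequency) per keyword group, mirroring the Digword transform."""
    return {
        name: math.log1p(count_terms(text, terms))
        for name, terms in groups.items()
    }
```

Each group index could then enter a regression as a separate regressor, which would let the AI-specific association be compared with that of the remaining digital terms.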
Assessment
Claims (5)
| Claim | Category | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|---|
| The level of data factor utilization has a significant positive impact on AI patent output. | Innovation Output | positive | high | AI patent output | 0.48 |
| The positive effect of data factor utilization on AI patent output is more pronounced in firms with low total factor productivity (TFP), exhibiting a 'contrarian' catch-up characteristic. | Innovation Output | positive | high | AI patent output (differential effect by firm TFP level) | 0.48 |
| The conclusions remain robust after substituting different methods for measuring total factor productivity (TFP). | Innovation Output | positive | high | AI patent output (robustness to TFP measurement method) | 0.48 |
| Data elements provide a unique mechanism that enables late‑entrant firms to catch up technologically. | Innovation Output | positive | medium | technological catch‑up (proxied by AI patent output increases among late entrants) | 0.05 |
| The study analyzes Chinese A-share listed companies in core digital economy industries from 2015 to 2024 using a panel fixed‑effects regression model. | Other | null_result | high | not applicable (methodological/sample description) | 0.48 |