The Commonplace

Firms using big-data applications earn higher markups: big-data adopters exhibit materially larger price markups, chiefly by boosting product innovation and production efficiency, but the payoff depends on complementary organizational and technological resources.

Big data application and firm markups: evidence from China
Dong Wang · April 08, 2026 · Scientific Reports
OpenAlex · correlational · medium evidence · 7/10 relevance
Combining a heterogeneous-firm model with firm-level data, the paper finds that adoption of big-data applications is associated with significantly higher price markups, largely mediated by increased product innovation and improved production efficiency, with effects varying by organizational, technological and environmental complements.

This study investigates the relationship between big data applications and firms' price markups. By constructing a heterogeneous firm model with variable markups, we analyze the mechanisms through which big data applications influence firms' price markups and conduct empirical tests using micro-level firm data. The results indicate that big data applications significantly enhance firms' price markups. Mechanism analysis reveals that promoting product innovation and improving production efficiency are two key channels through which big data applications contribute to higher markups. Furthermore, the positive effect of big data applications on firms' markups exhibits heterogeneity across organizational, technological, and environmental dimensions. These findings suggest that while big data applications positively influence firms' markups, the realization of this effect depends on the synergistic support of various complementary resources. The research uncovers the intrinsic mechanisms through which big data applications shape firms' competitive advantages and market power, providing valuable insights for policy formulation.

Summary

Main Finding

Big data applications significantly raise firm price markups in Chinese A‑share listed firms (2002–2023). The effect operates primarily through two channels—enhanced product innovation and improved production efficiency—and depends on complementary organizational, technological, and environmental resources. Measurement innovation: the paper uses large language models (LLMs) to extract firm-level big data application indicators from annual reports, improving objectivity and coverage.

Key Points

  • Core result: Higher levels of firm big data application are associated with larger price markups (price minus marginal cost). Theoretical derivations and micro‑level regressions both point to a positive effect, which the paper reads as causal on the basis of its instrumental‑variable estimates.
  • Mechanisms:
    • Product innovation: Big data increases innovation efficiency (f(a) rising in a), leading firms to introduce more or higher‑value differentiated products, which raises willingness to pay and pricing power.
    • Production efficiency: Big data lowers per‑unit variable and innovation costs via better monitoring, demand prediction, and process optimization, expanding margins and markups.
  • Theory: Builds a heterogeneous‑firm variable‑markup model (extension of Antoniades) where big data a affects innovation efficiency and fixed costs; model shows ∂markup/∂a > 0 and provides closed‑form expressions for optimal innovation, price, quantity, and profit.
  • Measurement innovation: Uses LLMs to mine unstructured annual report text to construct a nuanced, multi‑input indicator of firm big data application, addressing limitations of simple proxies (e.g., headcount of data analysts) and biased surveys.
  • Identification & endogeneity: Main empirical specification is a two‑way (firm and year) fixed effects panel regression of markup on log big‑data application. To address reverse causality and omitted variables, the paper instruments big‑data application with each city's 1984 density of post offices per million people, which is argued to shape modern information infrastructure (and thus adoption) while being plausibly exogenous to current firm markups.
  • Heterogeneity: The positive effect varies across firm organizational structures, technological capabilities, and external environments—i.e., complementary resources matter for translating data investments into market power.
  • Policy relevance: Results imply data-driven capabilities can shift market structure toward greater firm market power (winner‑takes‑most dynamics) unless complemented by countervailing competition or regulation.
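The instrumental-variable logic in the identification point above can be illustrated with a minimal sketch on synthetic data. It is not the paper's estimation: the data are simulated, the setup is a plain cross-section with one regressor and one instrument, and every name and coefficient (`simulate`, `iv_slope`, the 0.7 first-stage strength, the confounder) is a hypothetical choice made for illustration. The summary does not say how the time-invariant 1984 measure is combined with firm fixed effects, so that step is left out.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(y, x):
    """Ordinary least-squares slope of y on x (with intercept)."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(y, x, z):
    """Just-identified IV estimate: cov(z, y) / cov(z, x).

    With one endogenous regressor and one instrument this equals 2SLS.
    """
    return covariance(z, y) / covariance(z, x)


def simulate(n=20000, beta=0.5, seed=0):
    """Synthetic cross-section; all variables are illustrative stand-ins."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)  # stand-in for historical post office density
        ui = rng.gauss(0, 1)  # unobserved confounder that raises both x and y
        xi = 0.7 * zi + ui + rng.gauss(0, 1)           # big-data application
        yi = beta * xi + 1.5 * ui + rng.gauss(0, 0.5)  # markup
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return y, x, z


if __name__ == "__main__":
    y, x, z = simulate()
    print("true beta: 0.5")
    print("OLS slope:", round(ols_slope(y, x), 3))    # biased by the confounder
    print("IV slope: ", round(iv_slope(y, x, z), 3))  # uses only variation from z
```

In this simulation the confounder pushes the OLS slope above the true coefficient, while the IV slope stays close to it because the simulated instrument is unrelated to the confounder by construction. In real data that exclusion restriction cannot be verified directly.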

Data & Methods

  • Sample: Chinese A‑share listed firms, 2002–2023.
  • Big data measure: Novel LLM‑based text mining of annual reports to identify firm big data application intensity (captures multi‑source/multi‑input nature, reduces survey/proxy bias).
  • Markup measure: Firm price markup (price minus marginal cost) constructed at micro level (consistent with heterogeneous‑firm markup literature).
  • Econometric approach:
    • Main model: Two‑way fixed effects panel regression: Markup_it = α + β ln(BigData_it) + γX_it + μ_i + λ_t + ε_it.
    • Controls: Time‑varying firm covariates (not fully listed in excerpt).
    • Endogeneity: Instrumental variable using 1984 post office density per million people at city level; argued to satisfy relevance (historical information infrastructure predicts later tech adoption) and exogeneity (unlikely to affect modern markups directly).
    • Mechanism tests: Mediation/stepwise tests linking big data to innovation outcomes (z*) and to production cost measures, and then to markups.
  • Theoretical model highlights:
    • Firms incur variable production cost c and innovation costs (including a unit cost coefficient δ and innovation efficiency κ=f(a) with f′(a)>0, f″(a)<0).
    • Product innovation level z* increases with a.
    • Price and markup expressions derived show markup increases with a due to both higher z* and lower marginal cost per unit.
  • Robustness & heterogeneity: Paper reports heterogeneity across organizational/technological/environmental dimensions (details not reproduced here).
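The two-way fixed effects specification listed under the econometric approach can be sketched with the within transformation on a balanced synthetic panel. This is only an illustration under stated assumptions: the panel is simulated, the controls X_it are omitted, and the function names and parameter values (`simulate_panel`, `twfe_slope`, the 0.8 and 0.5 loadings) are hypothetical rather than taken from the paper.

```python
import random


def double_demean(panel):
    """Within transformation for a balanced panel stored as panel[firm][year]:
    subtract firm means and year means, then add back the grand mean."""
    n_firms, n_years = len(panel), len(panel[0])
    firm_means = [sum(row) / n_years for row in panel]
    year_means = [sum(panel[i][t] for i in range(n_firms)) / n_firms
                  for t in range(n_years)]
    grand_mean = sum(firm_means) / n_firms
    return [[panel[i][t] - firm_means[i] - year_means[t] + grand_mean
             for t in range(n_years)] for i in range(n_firms)]


def center(panel):
    """Subtract only the grand mean (equivalent to fitting an intercept)."""
    n_obs = sum(len(row) for row in panel)
    grand_mean = sum(sum(row) for row in panel) / n_obs
    return [[v - grand_mean for v in row] for row in panel]


def slope(y, x):
    """Least-squares slope through the origin for already-centered data."""
    sxy = sum(xv * yv for xr, yr in zip(x, y) for xv, yv in zip(xr, yr))
    sxx = sum(xv * xv for xr in x for xv in xr)
    return sxy / sxx


def twfe_slope(markup, log_bigdata):
    """Estimate beta in Markup_it = a + beta * ln(BigData_it) + mu_i + lambda_t + e_it."""
    return slope(double_demean(markup), double_demean(log_bigdata))


def pooled_slope(markup, log_bigdata):
    """Pooled OLS slope without fixed effects, shown for contrast."""
    return slope(center(markup), center(log_bigdata))


def simulate_panel(n_firms=200, n_years=10, beta=0.5, seed=0):
    """Synthetic balanced panel in which adoption is correlated with mu_i and lambda_t."""
    rng = random.Random(seed)
    mu = [rng.gauss(0, 1) for _ in range(n_firms)]   # firm effects
    lam = [rng.gauss(0, 1) for _ in range(n_years)]  # year effects
    x = [[0.8 * mu[i] + 0.5 * lam[t] + rng.gauss(0, 1) for t in range(n_years)]
         for i in range(n_firms)]
    y = [[beta * x[i][t] + mu[i] + lam[t] + rng.gauss(0, 0.5) for t in range(n_years)]
         for i in range(n_firms)]
    return y, x


if __name__ == "__main__":
    markup, log_bigdata = simulate_panel()
    print("true beta:   0.5")
    print("pooled slope:", round(pooled_slope(markup, log_bigdata), 3))
    print("TWFE slope:  ", round(twfe_slope(markup, log_bigdata), 3))
```

The double-demeaning formula is exact only for balanced panels. An unbalanced firm-year sample such as listed-firm data would more likely be handled with a dedicated panel estimator.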

Implications for AI Economics

  • Data/AI as a general‑purpose technology that raises markups: Like other general‑purpose technologies (AI, software), firm adoption of big data can increase market power by enabling differentiation and cost advantages. Empirical evidence at the firm level strengthens the theory that digital investments can shift firm pricing power independent of measured productivity gains.
  • Complementarities matter: The effect is conditional on complementary intangible assets (skills, organizational changes, business processes). Policies aiming to spread AI/big‑data benefits should emphasize complementary investments (training, process redesign, data governance).
  • Measurement and empirical strategy: LLMs provide a scalable method to measure granular, unstructured indicators of AI/big‑data adoption from corporate texts—useful for future research on AI economics (adoption, diffusion, impacts).
  • Competition and regulation: Findings suggest possible upward pressure on industry concentration and consumer prices as data‑driven firms extract more surplus. This motivates scrutiny of data monopolies, interoperability mandates, and competition policy that considers data/network externalities.
  • Research directions:
    • Distributional impact: How do markup gains translate into wages, rents, and consumer surplus across sectors?
    • Market structure dynamics: Interaction between data‑driven markups and entry/exit, platform competition, and vertical integration.
    • AI vs. other intangible investments: Comparative magnitude of markup effects across AI, software, R&D, and other intangibles.
    • External validity: Replicating similar LLM‑based measurement and estimation in other countries and non‑listed firms to test robustness.

Limitations to note (explicit or implicit from the paper): reliance on listed firms may bias toward larger firms; instrument validity depends on the exclusion restriction (historical post office density affecting modern markups only through adoption); full robustness checks and effect sizes are not reported in this summary.

Assessment

  • Paper Type: correlational
  • Evidence Strength: medium. The study triangulates theory and micro-level empirical evidence and conducts mechanism and heterogeneity tests, which strengthen plausibility; however, absent credible exogenous variation the estimated relationships remain vulnerable to endogeneity (more profitable or innovative firms may be more likely to adopt big-data applications), so causal claims are suggestive rather than definitive.
  • Methods Rigor: medium. Strengths include a structural heterogeneous-firm model, firm-level markup estimation, and exploration of mechanisms and heterogeneous effects; weaknesses are lack of a clearly exogenous identification strategy, potential measurement issues for 'big data applications' and markups, and limited discussion (in the summary) of robustness checks that would address endogeneity and measurement error.
  • Sample: Micro-level firm data across multiple industries (firm-year observations) with measures of big-data application intensity, firm-level controls, and estimated price markups (derived from revenue/cost or production-function based methods); exact country, time period, and sample size are not specified in the summary.
  • Themes: productivity, innovation
  • Identification: Associational analysis. The paper combines a structural heterogeneous-firm model with micro-level firm regressions relating measures of big-data application intensity to estimated firm markups, and tests mechanisms via mediation/heterogeneity analyses; no clearly exogenous source of variation (randomization, credible instrumental variable, or natural experiment) is reported, leaving causal identification vulnerable to selection, reverse causality, and omitted-variable bias.
  • Generalizability:
    • Unclear country/context: results may not generalize beyond the sample economy (institutional/regulatory differences).
    • Industry composition: effects may differ in data-intensive vs. traditional sectors.
    • Firm-size and selection bias: adopters may be larger or more productive firms, limiting applicability to small firms.
    • Measurement limitations: 'big data applications' is a broad concept and may be measured with error or heterogeneity across firms.
    • Time period dependence: effects could change as big-data technologies diffuse further.

Claims (5)

  • Big data applications significantly enhance firms' price markups.
    • Market Structure · Direction: positive · Confidence: high · Outcome: price markups · Details: 0.3
  • Promotion of product innovation is a key channel through which big data applications contribute to higher price markups.
    • Innovation Output · Direction: positive · Confidence: high · Outcome: product innovation (as mediator of markup increases) · Details: 0.3
  • Improving production efficiency is a key channel through which big data applications contribute to higher price markups.
    • Firm Productivity · Direction: positive · Confidence: high · Outcome: production efficiency (as mediator of markup increases) · Details: 0.3
  • The positive effect of big data applications on firms' markups exhibits heterogeneity across organizational, technological, and environmental dimensions.
    • Market Structure · Direction: mixed · Confidence: high · Outcome: heterogeneity of the big-data → markup effect across organizational, technological, and environmental dimensions · Details: 0.3
  • The realization of the positive effect of big data applications on markups depends on the synergistic support of various complementary resources.
    • Market Structure · Direction: positive · Confidence: high · Outcome: realization of the markup-increasing effect conditional on presence of complementary resources · Details: 0.3

Notes