https://arxiv.org/pdf/2603.03144

Summary

Main Finding

Adoption of ChatGPT by U.S. households (measured on home devices) causally increases households’ online leisure time while leaving time spent on productive digital tasks roughly unchanged. Households predominantly use ChatGPT in the context of productive non-market activities (education, job search, informational research), implying that generative AI raises the efficiency of productive online tasks and enables a reallocation of freed-up time toward leisure. Mapping the time reallocation into a time-allocation model implies large efficiency gains for adopters (preferred calibration: ~76%–176%, roughly a doubling).

Key Points

Data and scope
- Uses Comscore panel browsing data for >200,000 U.S. household home devices, 2021–2024 (pre- and post-ChatGPT release).
- Focuses on household-level changes in browsing behavior before/after ChatGPT (released Nov 30, 2022).
Adoption patterns and inequality
- Rapid private adoption but with a widening “generative AI divide”: higher and faster adoption among higher-income and younger households; little sign of convergence.
Predicting adoption
- Constructs an ex-ante household “exposure” measure from Jan–Dec 2021 browsing that captures how much a household’s prior browsing overlapped with website tasks a chatbot can perform. A 1 SD higher exposure predicts a 2.5 percentage-point higher probability of ChatGPT use by Dec 2024.
Causal effects (IV long-difference estimates)
- Instrument: pre-ChatGPT exposure (2021) → plausibly exogenous predictor of post-release adoption.
- Main IV result: ChatGPT adoption substantially raises leisure browsing time (paper reports an increase of roughly "150 log points") and increases the leisure share of browsing duration by ~30 percentage points, while total time on productive online activities remains statistically unchanged.
- Interpretation: ChatGPT substitutes for multi-site productive browsing (and/or shortens task durations), freeing time for leisure.
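For readers unfamiliar with log points, the conversion to a percent change can be sketched as follows. This assumes that "log points" means 100 times the change in the natural log of leisure browsing time; the helper name `log_points_to_percent` is illustrative and does not come from the paper.

```python
import math

def log_points_to_percent(log_points: float) -> float:
    """Convert a change in log points (100 x change in natural log) to a percent change."""
    return (math.exp(log_points / 100.0) - 1.0) * 100.0

# Small changes are close to their percent equivalents; large ones are not.
for lp in (10, 50, 150):
    print(f"{lp} log points ~ {log_points_to_percent(lp):.1f}% change")
```

Under that reading, 150 log points corresponds to leisure browsing time rising by a factor of about 4.5 (roughly +348%), which is why large estimates are reported in log points rather than percent.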
Mechanisms / browsing context
- High-frequency analysis (30-minute windows around ChatGPT visits) shows ChatGPT use is concentrated around productive websites (search, education, job search, informational sites) rather than leisure sites.
- Declines in other browsing after adoption are concentrated in categories that ChatGPT can substitute for (search, news, Stack Overflow), consistent with productivity substitution.
Productivity mapping
- Uses a time-allocation model (adapted from Aguiar et al., 2021) and estimated Engel-curve elasticities (identified via an IV using local precipitation shocks) to convert time savings into implied productivity gains.
- Preferred calibration implies efficiency gains in productive digital tasks of about 76%–176% for adopters (authors characterize this as approximately doubling efficiency).
Limitations
- Observed browsing is limited to household-owned desktop/laptop devices (no mobile device or offline activity observed).
- No direct measure of task output or welfare beyond browsing-time reallocation; mapping to productivity depends on model assumptions and elasticity estimates.

Data & Methods

Core data: Comscore web browsing panel capturing all qualified browsing on household-owned machines (timestamps, visit duration, URL, session id) plus demographic bins (income, age, location, household size).
Exposure measure construction:
- Web-scraped website content and used LLM-based classification to assess the overlap between each website’s activities and ChatGPT capabilities.
- Aggregated site-level overlap across each household’s 2021 browsing to form an ex-ante “exposure” score (higher if a larger share of pre-ChatGPT browsing was on sites amenable to chatbot substitution).
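The aggregation step can be illustrated with a minimal sketch. It assumes duration-weighted averaging of site-level overlap scores in [0, 1] and drops unclassified sites; the summary does not state the paper's exact weighting, so the function name, the input layout, and the example domains are all hypothetical.

```python
from collections import defaultdict

def household_exposure(visits, site_overlap):
    """Duration-weighted mean of site overlap scores per household.

    visits: iterable of (household_id, domain, seconds) from the pre-period.
    site_overlap: dict mapping domain -> overlap score in [0, 1].
    """
    total_seconds = defaultdict(float)
    weighted_overlap = defaultdict(float)
    for household_id, domain, seconds in visits:
        if domain not in site_overlap:
            continue  # assumption: sites without a classification are ignored
        total_seconds[household_id] += seconds
        weighted_overlap[household_id] += seconds * site_overlap[domain]
    return {
        hh: weighted_overlap[hh] / total_seconds[hh]
        for hh in total_seconds
        if total_seconds[hh] > 0
    }

# Hypothetical example data.
overlap = {"search.example": 0.8, "video.example": 0.0}
visits_2021 = [
    ("hh_a", "search.example", 60),
    ("hh_a", "video.example", 60),
    ("hh_b", "video.example", 30),
]
print(household_exposure(visits_2021, overlap))
```

A household that spent half its classified time on a high-overlap site ends up with a higher score than one that only visited low-overlap sites, matching the "larger share of pre-ChatGPT browsing" description above.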
Identification strategy:
- Main causal design: long-difference regression instrumenting household ChatGPT adoption with pre-ChatGPT exposure, controlling for demographic-by-region fixed effects and browsing composition covariates.
- Rationale: pre-release exposure predicts later adoption but (plausibly) is exogenous to post-release shocks that would independently change leisure/productive browsing, conditional on controls.
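The logic of instrumenting adoption with pre-period exposure can be sketched on simulated data. This is a toy just-identified IV (Wald) estimator without the paper's fixed effects or controls; the data-generating process, variable names, and the true effect of 1.5 are assumptions made purely for illustration.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def ols_slope(outcome, treatment):
    """Naive regression slope of outcome on treatment."""
    return cov(treatment, outcome) / cov(treatment, treatment)

def iv_slope(outcome, treatment, instrument):
    """Just-identified IV estimate: reduced form divided by first stage."""
    return cov(instrument, outcome) / cov(instrument, treatment)

def simulate(n=20000, true_effect=1.5, seed=0):
    """Simulate households where an unobserved shock drives adoption and leisure."""
    rng = random.Random(seed)
    exposure, adopted, d_log_leisure = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)             # pre-period exposure (instrument)
        u = rng.gauss(0, 1)             # unobserved confounder
        d = 1.0 if z + u > 0 else 0.0   # adoption depends on both
        y = true_effect * d + u + rng.gauss(0, 0.5)  # long difference in log leisure
        exposure.append(z)
        adopted.append(d)
        d_log_leisure.append(y)
    return exposure, adopted, d_log_leisure

exposure, adopted, d_log_leisure = simulate()
ols = ols_slope(d_log_leisure, adopted)
iv = iv_slope(d_log_leisure, adopted, exposure)
print(f"OLS: {ols:.2f}  IV: {iv:.2f}  (true effect in the simulation: 1.50)")
```

In this simulation the naive slope is biased because the confounder moves both adoption and leisure, while the IV estimate recovers the simulated effect only because exposure is independent of the confounder by construction. That independence is exactly the assumption the paper's design needs and the Assessment section questions.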
Mechanism tests:
- LLM-based classification to label browsing sessions as “leisure” vs. “productive” (non-market) activities.
- High-frequency windowed comparisons (30 minutes before/after ChatGPT visits) contrasting users’ context to demographically similar non-users.
- Category-level analyses showing declines concentrated in AI-substitutable categories.
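The windowed context comparison can be sketched for a single household as below. The tuple layout, the `chatbot_domain` default, and the category labels are hypothetical; the paper's actual windowing and its matching to demographically similar non-users are not reproduced here.

```python
from collections import Counter

WINDOW_SECONDS = 30 * 60  # 30 minutes on either side of a ChatGPT visit

def context_shares(visits, chatbot_domain="chatgpt.com", window=WINDOW_SECONDS):
    """Share of nearby browsing time by category around chatbot visits.

    visits: list of (timestamp_seconds, domain, category, duration_seconds).
    Each non-chatbot visit is counted once if it starts within `window`
    seconds of any chatbot visit.
    """
    anchors = [t for t, domain, _, _ in visits if domain == chatbot_domain]
    seconds_by_category = Counter()
    for t, domain, category, duration in visits:
        if domain == chatbot_domain:
            continue  # measure the surrounding context, not the chatbot itself
        if any(abs(t - anchor) <= window for anchor in anchors):
            seconds_by_category[category] += duration
    total = sum(seconds_by_category.values())
    if total == 0:
        return {}
    return {category: secs / total for category, secs in seconds_by_category.items()}

# Hypothetical browsing log for one household.
log = [
    (0, "search.example", "productive", 120),
    (600, "chatgpt.com", "chatbot", 300),
    (1200, "video.example", "leisure", 60),
    (10000, "video.example", "leisure", 600),  # outside the window
]
print(context_shares(log))
```

Comparing these shares with the same statistic for matched non-users at comparable times is what would indicate whether ChatGPT use clusters around productive rather than leisure browsing.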
Productivity mapping:
- Quantitative time-allocation model translating time reallocations into implied productivity changes.
- The required Engel-curve elasticities for leisure vs. productive digital activities are estimated via an IV that exploits local precipitation shocks (used to identify how leisure and productive browsing respond to shocks in total browsing time).

Implications for AI Economics

Non-market productivity matters: Generative AI’s welfare and economic impacts extend beyond workplaces. Household-level efficiency gains and time reallocations are potentially large and should be included in aggregate assessments of GenAI’s economic value.
Distributional concerns: The documented “generative AI divide” (by income and age) implies uneven distribution of household-level productivity and welfare gains, which could amplify broader inequality unless adoption barriers for older and lower-income households are addressed.
Complementarity/substitution across digital platforms: Adoption reduces visits to websites and services that GenAI can substitute (search, news, Q&A sites), implying platform-level demand shifts and potential implications for online ad markets and content providers.
Policy and intervention relevance:
- Interventions (training, digital literacy programs, subsidies) targeted at low-adoption groups could equalize access to non-market productivity benefits.
- Regulatory and welfare assessments of GenAI should account for private household gains (not just workplace productivity), as these may be quantitatively substantial.
Research agenda:
- Extend measurement to mobile devices and offline household activities to quantify total-time and cross-device reallocation.
- Directly measure output/quality of household tasks (learning outcomes, job search success, financial decisions) to complement time-based productivity inferences.
- Study interactions between home and workplace GenAI use (complementarities, spillovers for human capital and labor-market outcomes).
- Investigate long-run dynamics (skill acquisition, habit formation, platform responses) and heterogeneous long-term welfare consequences.

Summary takeaway: Household adoption of ChatGPT appears to increase household non-market productivity by making productive online tasks markedly more efficient, producing substantial reallocation of freed time toward online leisure — with large implied productivity gains but rising adoption inequality that merits policy attention.

Assessment

Paper Type: quasi_experimental
Evidence Strength: medium. The paper uses rich panel browsing data and a plausible IV (pre-period exposure) with long-difference estimation and extensive context checks, which provide credible quasi-experimental variation; however, the instrument may proxy for unobserved tech-savviness or time trends correlated with both exposure and later leisure changes, the analysis omits mobile and offline activities, and inferred task-purpose relies on LLM classification, all of which limit causal conclusiveness relative to an experiment.
Methods Rigor: medium. Methods combine high-frequency panel data, systematic website-level NLP/LLM classification, an IV long-difference framework, and mechanistic checks plus a structural time-allocation mapping; these are appropriate and sophisticated. Remaining concerns include instrument validity threats (pre-trends/selection), potential measurement error from LLM-based labeling, sample representativeness of the Comscore panel, and extrapolation from browsing-time changes to productivity gains.
Sample: Comscore home-device web-browsing panel of over ~200,000 U.S. household machines (baseline filtered sample: households with income and age info, ≥6 months of 2021 browsing and ≥1 month post-ChatGPT), 2021–2024; data includes timestamped URLs, visit duration, session ids and binned demographics (income, age, city, household size); excludes work-owned machines and does not include mobile-device or offline activity.
Themes: productivity, adoption, inequality, human_ai_collab
Identification: Instrumental-variables long-difference design: household ChatGPT adoption is instrumented by an ex-ante 'exposure' score constructed from each household's pre-ChatGPT (2021) browsing composition (websites whose content overlaps with chatbot capabilities). Controls include demographic-by-region fixed effects and browsing-composition controls; supplementary checks use high-frequency browsing context around ChatGPT visits and an IV (local precipitation shocks) to estimate Engel elasticities for the time-allocation model.
Generalizability:
- Only captures browsing on household-owned (non-mobile) computers; excludes mobile and many in-person/offline household activities.
- U.S.-only Comscore panel with some income/age representation biases (over-represents low and high incomes; under-represents middle incomes and younger adults).
- Findings reflect early/adoption-period effects (2021–2024) and may not generalize to later GenAI versions or broader ecosystem changes.
- Does not directly measure non-digital household output or workplace interactions; limited to digital non-market productivity.
- Instrument (pre-period exposure) may reflect broader unobserved heterogeneity (tech-savviness, preference trends) that varies across contexts/countries.

Claims (13)

Claim	Category	Direction	Outcome	Confidence & Evidence	Details
The analysis uses detailed Internet browsing microdata from over 200,000 U.S. households' home devices from 2021 to 2024.	Other	null_result	size and coverage of browsing panel	Reading fidelity: high; Study strength: high	n=200000 0.8
ChatGPT adoption among private households has been rapid following release, but adoption is far from uniform.	Adoption Rate	positive	ChatGPT adoption rate over time	Reading fidelity: high; Study strength: medium	n=200000 0.48
High-income and younger households adopt generative AI substantially faster than low-income and older counterparts, and this gap is widening over time ('generative AI divide').	Inequality	negative	heterogeneity in adoption rates by income and age (inequality in adoption)	Reading fidelity: high; Study strength: medium	n=200000 0.48
A household's pre-ChatGPT ex-ante exposure (based on 2021 browsing composition) strongly predicts subsequent ChatGPT adoption: a 1 SD higher exposure predicts a 2.5 percentage point higher rate of having used ChatGPT by December 2024.	Adoption Rate	positive	probability / rate of ChatGPT adoption by Dec 2024	Reading fidelity: high; Study strength: medium	n=200000 1 SD higher exposure predicts a 2.5pp higher rate of having used ChatGPT by December 2024 0.48
Using pre-existing exposure as an instrument for ChatGPT adoption in a long-difference IV design, ChatGPT adoption causes households to spend more time on digital leisure activities while leaving total time spent on productive online activities unchanged.	Task Allocation	mixed	change in time spent on digital leisure activities and total time on productive online activities	Reading fidelity: high; Study strength: medium	n=200000 0.48
In long-difference IV estimates, ChatGPT adoption raises total leisure browsing time by roughly 150 log points.	Task Completion Time	positive	total leisure browsing time (log change)	Reading fidelity: high; Study strength: medium	n=200000 roughly 150 log points 0.48
ChatGPT adoption increases the leisure share of browsing duration by about 30 percentage points.	Task Allocation	positive	leisure share of total browsing duration	Reading fidelity: high; Study strength: medium	n=200000 about 30 percentage points 0.48
ChatGPT adoption leaves the total time spent on productive online activities (including any time spent using ChatGPT) unchanged.	Task Completion Time	null_result	total time spent on productive online activities	Reading fidelity: high; Study strength: medium	n=200000 0.48
Households predominantly utilize ChatGPT in the context of productive online activities (education, job search, informational research) rather than during leisure browsing, as inferred from the browsing context around ChatGPT use.	Task Allocation	positive	context/purpose of ChatGPT use (productive vs leisure)	Reading fidelity: high; Study strength: medium	not reported 0.48
Observed declines in browsing time due to ChatGPT adoption are concentrated in website categories such as search and news, which are highly exposed to substitution by generative AI.	Task Allocation	negative	browsing time on search and news website categories	Reading fidelity: high; Study strength: medium	not reported 0.48
Mapping the empirical time-reallocation into a quantitative household time-allocation model implies generative AI approximately doubles the efficiency of productive online tasks for adopters; preferred calibration implies efficiency gains of 76%–176%.	Task Completion Time	positive	efficiency (productivity) of productive digital tasks	Reading fidelity: high; Study strength: speculative	76%-176% 0.08
These household-level non-market productivity gains (ChatGPT making productive online tasks more efficient and freeing time for leisure) are economically large and likely constitute a substantial share of the overall economic impact of generative AI.	Consumer Welfare	positive	household non-market productivity and welfare (implied aggregate economic impact)	Reading fidelity: high; Study strength: speculative	not reported 0.08
Limitations: the Comscore data observe household internet activity on home (non-mobile) devices and do not capture offline or mobile device activities, so extrapolation to total at-home activities should be done with caution.	Other	null_result	data coverage (mobile/offline activities not observed)	Reading fidelity: high; Study strength: high	not reported 0.8