Because AI performance improves with diminishing returns to data, compute, and model size, near-perfect accuracy is disproportionately costly, so firms often choose partial human-AI collaboration rather than full automation; calibrated to computer vision, cost-effective automation covers roughly 11% of exposed wages at the firm level but can scale far higher when AI services spread fixed costs across users.
This paper develops a unified framework for evaluating the optimal degree of task automation. Moving beyond binary automate-or-not assessments, we model automation intensity as a continuous choice in which firms minimize costs by selecting an AI accuracy level, from no automation through partial human-AI collaboration to full automation. On the supply side, we estimate an AI production function via scaling-law experiments linking performance to data, compute, and model size. Because AI systems exhibit predictable but diminishing returns to these inputs, the cost of higher accuracy is convex: good performance may be inexpensive, but near-perfect accuracy is disproportionately costly. Full automation is therefore often not cost-minimizing; partial automation, where firms retain human workers for residual tasks, frequently emerges as the equilibrium. On the demand side, we introduce an entropy-based measure of task complexity that maps model accuracy into a labor substitution ratio, quantifying human labor displacement at each accuracy level. We calibrate the framework with O*NET task data, a survey of 3,778 domain experts, and GPT-4o-derived task decompositions, implementing it in computer vision. Task complexity shapes substitution: low-complexity tasks see high substitution, while high-complexity tasks favor limited partial automation. Scale of deployment is a key determinant: AI-as-a-Service and AI agents spread fixed costs across users, sharply expanding economically viable tasks. At the firm level, cost-effective automation captures approximately 11% of computer-vision-exposed labor compensation; under economy-wide deployment, this share rises sharply. Since other AI systems exhibit similar scaling-law economics, our mechanisms extend beyond computer vision, reinforcing that partial automation is often the economically rational long-run outcome, not merely a transitional phase.
Summary
Main Finding
Partial automation, in which firms choose an intermediate AI accuracy and humans handle the residual uncertainty, is frequently the cost‑minimizing long‑run outcome. Because model performance follows scaling laws with diminishing returns, the cost of accuracy is convex: the marginal cost of pushing AI from “good” toward near‑perfect accuracy rises steeply and often exceeds the marginal labor savings. As a result, in many tasks firms optimally stop at an interior solution (human–AI collaboration) rather than pursue full automation. Scale of deployment (e.g., AI-as-a-Service or shared agents) materially expands the set of economically viable automations by spreading fixed development costs across users.
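The trade-off can be written compactly. The notation here is an assumption of this summary, not taken from the paper: $a$ is the chosen AI accuracy, $C(a)$ the convex development cost, $N$ the number of users sharing that cost, $w$ the wage bill of the task, and $s(a)$ the labor substitution ratio.

```latex
\[
  \min_{a \in [0,1]} \; \frac{C(a)}{N} + w\,\bigl(1 - s(a)\bigr)
  \qquad\Longrightarrow\qquad
  \frac{C'(a^{*})}{N} = w\, s'(a^{*})
  \quad \text{for an interior optimum } 0 < a^{*} < 1 .
\]
```

Under these assumptions, a steeply rising $C'(a)$ near $a = 1$ keeps $a^{*}$ below full automation, while a larger $N$ lowers the left-hand side and moves $a^{*}$ upward.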
Key Points
- Framework: Automation is modeled as a continuous choice of AI accuracy (not a binary decision). Firms minimize costs by choosing an accuracy level that trades off AI development costs against labor savings.
- Supply side (costs): Fine‑tuning scaling‑law experiments show that performance increases predictably with data, training steps, and model size, but with diminishing returns. This produces a convex cost function: good accuracy can be cheap, but near‑perfect accuracy becomes disproportionately expensive.
- Demand side (labor substitution): An entropy/information‑theoretic mapping translates model accuracy into a labor‑substitution ratio: higher accuracy reduces residual uncertainty and thus human processing time. This gives a quantitative mapping from accuracy to how much human work AI displaces.
- Three possible optima per task: no automation, partial automation (interior solution), or full automation. Partial automation occupies a large share of task space because of the convex cost structure.
- Task complexity matters: tasks with few subtasks and low entropy (low complexity) have high substitution rates and are more likely to be highly automated; tasks with many subtasks/high complexity favor limited partial automation.
- Scale effects: Sharing fixed costs across many users (AI-as-a-Service, economy‑wide agents) lowers per‑user cost, increases optimal model quality, and raises automation rates. Under shared/economy‑wide deployment the economically viable share of automation rises sharply.
- Quantitative result (computer vision calibration): At typical firm‑level deployment, roughly 11% of labor compensation tied to computer‑vision‑exposed tasks is economically attractive to automate; most of that saving comes from partial rather than full automation. The share would be larger when including other modalities (LLMs, multimodal models).
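The points above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than the paper's calibration: the power-law cost form, the linear substitution ratio, the parameter values, and the function names `development_cost`, `total_cost`, and `optimal_accuracy`.

```python
def development_cost(accuracy, cost_scale, exponent=0.5):
    """Hypothetical convex cost of reaching `accuracy` in [0, 1).

    Assumes error falls as a power law in spend, so cost grows like
    error ** (-1 / exponent); shifted so that zero accuracy costs nothing.
    """
    error = 1.0 - accuracy
    return cost_scale * (error ** (-1.0 / exponent) - 1.0)


def total_cost(accuracy, wage_bill, cost_scale, n_users=1):
    """Per-user cost: shared development cost plus residual human labor.

    Assumes (for illustration only) a linear substitution ratio s(a) = a.
    """
    residual_labor = wage_bill * (1.0 - accuracy)
    return development_cost(accuracy, cost_scale) / n_users + residual_labor


def optimal_accuracy(wage_bill, cost_scale, n_users=1, grid_size=1000):
    """Grid search over accuracy levels 0.000, 0.001, ..., 0.999."""
    grid = [i / grid_size for i in range(grid_size)]
    return min(grid, key=lambda a: total_cost(a, wage_bill, cost_scale, n_users))


# Illustrative numbers only: one firm versus a service shared by 100 users.
for n_users in (1, 100):
    print(n_users, optimal_accuracy(wage_bill=100_000, cost_scale=1_000, n_users=n_users))
```

With these made-up numbers, the shared deployment selects a higher accuracy than the single firm, and both optima are interior. A sufficiently large `cost_scale` pushes the optimum to the no-automation corner. Because this toy cost diverges as accuracy approaches 1, it never selects full automation, which is a limitation of the sketch rather than a claim about the paper.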
Data & Methods
- Theoretical model: Microeconomic, task‑level cost minimization where firms choose AI accuracy. Supply side modeled via an estimated AI production function; demand side modeled by an entropy‑based accuracy→labor substitution mapping. Optimality determined by comparing marginal cost of accuracy improvements with marginal labor savings.
- AI production function: Estimated from fine‑tuning scaling‑law experiments linking performance to (i) additional task data, (ii) training steps, and (iii) model size. Results document performance elasticities and substitutability across inputs and confirm convex cost structure at high performance.
- Entropy mapping: Uses information theory to map remaining uncertainty (entropy) at a given model accuracy to human processing time required to resolve residual tasks, yielding a formal labor substitution ratio.
- Calibration and empirical implementation (computer vision domain):
  - O*NET: Identified 420 computer‑vision‑exposed tasks across 263 occupations and used task-level time allocations.
  - Expert survey: A survey of 3,778 domain experts elicited task‑specific required accuracies and validated task characteristics.
  - GPT‑4o decompositions: Automated extraction of the number of vision subtasks, classes per subtask, and visual share per task; outputs were manually validated by human coders.
  - Administrative data: Wages, employment, and firm‑size distributions from U.S. agencies were used to scale task-level decisions to the occupation, firm, industry, and economy levels.
- Empirical findings: Using the estimated cost function and entropy mapping, the authors compute per‑task optimal automation levels and aggregate outcomes under different deployment scales (firm‑level, AI‑as‑a‑Service, economy‑wide).
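The entropy mapping described above can be sketched as follows. The specific formula is an assumption made for illustration, not the paper's estimator: it treats a subtask as a choice among `n_classes` equally likely labels and assumes model errors are spread uniformly over the wrong classes, which yields the Fano-style residual entropy. The function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def residual_entropy(accuracy, n_classes):
    """Bits of uncertainty left after seeing the model's prediction.

    Assumes errors are uniform over the n_classes - 1 wrong labels.
    """
    error = 1.0 - accuracy
    bits = binary_entropy(error)
    if n_classes > 2:
        bits += error * math.log2(n_classes - 1)
    return bits


def substitution_ratio(accuracy, n_classes):
    """Share of the subtask's information the model resolves.

    Assumes human processing time is proportional to the bits left to resolve
    and that accuracy is at or above chance (1 / n_classes).
    """
    task_entropy = math.log2(n_classes)  # uniform prior over classes
    return 1.0 - residual_entropy(accuracy, n_classes) / task_entropy


for acc in (0.5, 0.8, 0.9, 0.99):
    print(acc, round(substitution_ratio(acc, n_classes=10), 3))
```

Under these assumptions the ratio is 0 at chance-level accuracy and 1 at perfect accuracy, and it rises with accuracy in between. Feeding such a ratio into a cost-minimization step in place of a linear one is one way to connect the demand side to the supply-side cost function.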
Implications for AI Economics
- Reconceptualize exposure metrics: Technical feasibility alone is insufficient — economic viability requires modeling convex scaling costs and the accuracy→labor mapping. Measures of “exposure” should incorporate the cost of attaining required accuracy and task complexity.
- Human–AI collaboration is likely durable: Partial automation is not merely transitional; for many tasks the cost structure makes human oversight or residual human work optimal long term. Policies and models should anticipate extensive hybrid workflows.
- Scale and market structure matter: Large firms, platforms, and shared AI providers can broaden the automation frontier by amortizing fixed development costs. This helps explain concentrated early adoption and incentivizes centralized AI services.
- Distributional and labor implications: Partial automation implies task redesign rather than wholesale job elimination. Effects on wages and employment depend on which subtasks are automated (expert vs. inexpert work) and on within‑firm task heterogeneity. Models of labor market adjustment should incorporate task‑level partial substitution and complementarities.
- Modeling guidance for macro and policy work: Aggregate projections of automation impacts must account for (i) scaling‑law convexities in AI development, (ii) task complexity/entropy, and (iii) deployment scale. Ignoring these factors will overstate the pace and extent of full automation.
- Research directions: Extend empirical calibration beyond computer vision (LLMs, multimodal models), incorporate organizational/implementation costs (beyond model development), study dynamic evolution as model/data/compute costs change, and analyze distributional impacts across firm sizes and worker skill groups.
Assessment
Claims (10)
| Claim | Category | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|---|
| We model automation intensity as a continuous choice in which firms minimize costs by selecting an AI accuracy level, from no automation through partial human-AI collaboration to full automation. | Task Allocation | positive | high | degree of automation (accuracy level chosen by firms) | 0.12 |
| AI systems exhibit predictable but diminishing returns to data, compute, and model size (scaling-law experiments), implying the cost of higher accuracy is convex: good performance may be inexpensive, but near-perfect accuracy is disproportionately costly. | Firm Productivity | negative | high | marginal returns to inputs (data, compute, model size) and marginal cost of accuracy | 0.2 |
| Because higher accuracy is disproportionately costly (convex cost), full automation is often not cost-minimizing; partial automation, where firms retain human workers for residual tasks, frequently emerges as the equilibrium. | Task Allocation | positive | high | prevalence of partial automation vs full automation as cost-minimizing choices | 0.12 |
| We introduce an entropy-based measure of task complexity that maps model accuracy into a labor substitution ratio, quantifying human labor displacement at each accuracy level. | Automation Exposure | neutral | high | labor substitution ratio (human labor displaced per unit accuracy) | 0.12 |
| The framework is calibrated with O*NET task data, a survey of 3,778 domain experts, and GPT-4o-derived task decompositions, and implemented in computer vision. | Other | neutral | high | validity of calibration / empirical grounding of the framework | n=3778; 0.2 |
| Task complexity shapes substitution: low-complexity tasks see high substitution, while high-complexity tasks favor limited partial automation. | Automation Exposure | negative | high | degree of labor substitution as a function of task complexity | n=3778; 0.12 |
| Scale of deployment is a key determinant: AI-as-a-Service and AI agents spread fixed costs across users, sharply expanding economically viable tasks. | Adoption Rate | positive | high | number/coverage of economically viable tasks (adoption potential) as a function of deployment scale | 0.12 |
| At the firm level, cost-effective automation captures approximately 11% of computer-vision-exposed labor compensation. | Labor Share | positive | high | share of computer-vision-exposed labor compensation captured by cost-effective automation | approximately 11%; 0.12 |
| Under economy-wide deployment, the share of computer-vision-exposed labor compensation that is cost-effectively automatable rises sharply (relative to the firm-level 11% estimate). | Labor Share | positive | high | share of labor compensation automatable under economy-wide deployment | 0.12 |
| Because other AI systems exhibit similar scaling-law economics, the mechanisms identified extend beyond computer vision, reinforcing that partial automation is often the economically rational long-run outcome, not merely a transitional phase. | Task Allocation | positive | medium | prevalence of partial automation across AI application domains | 0.01 |