Unobserved counterfactuals in sequential decision systems systematically under-expose marginalized groups and can amplify exclusion; modelling model and feedback uncertainty and using uncertainty-aware exploration can reduce outcome variance for disadvantaged groups without sacrificing expected institutional utility.
Fair machine learning (ML) methods help identify and mitigate the risk that algorithms encode or automate social injustices. Algorithmic approaches alone cannot resolve structural inequalities, but they can support socio-technical decision systems by surfacing discriminatory biases, clarifying trade-offs, and enabling governance. Although fairness is well studied in supervised learning, many real ML applications are online and sequential, with prior decisions informing future ones. Each decision is taken under uncertainty due to unobserved counterfactuals and finite samples. The consequences are most severe for under-represented groups, which are systematically under-observed because of historical exclusion and selective feedback. For example, a bank cannot know whether a denied loan would have been repaid, and it may have less data on marginalized populations. This paper introduces a taxonomy of uncertainty in sequential decision-making (model, feedback, and prediction uncertainty), providing a shared vocabulary for assessing systems where uncertainty is unevenly distributed across groups. We formalize model and feedback uncertainty via counterfactual logic and reinforcement learning, and illustrate how policies that ignore the unobserved space harm decision makers (unrealized gains/losses) and subjects (compounding exclusion, reduced access). Algorithmic examples show it is possible to reduce outcome variance for disadvantaged groups while preserving institutional objectives (e.g. expected utility). Experiments on data simulated with varying bias show how unequal uncertainty and selective feedback produce disparities, and how uncertainty-aware exploration alters fairness metrics. The framework equips practitioners to diagnose, audit, and govern fairness risks. Where unfairness is driven by uncertainty rather than by incidental noise, accounting for it is essential to fair and effective decision-making.
Summary
Main Finding
Unequal uncertainty in sequential (online/reinforcement) decision systems — especially uneven epistemic and selective-feedback uncertainty across groups — systematically compounds disparities. Explicitly accounting for uncertainty (e.g., uncertainty-aware exploration that targets high-uncertainty subgroups) can reduce variance in outcomes for historically disadvantaged groups and improve observed fairness metrics without necessarily sacrificing the decision-maker’s expected utility. The paper provides a taxonomy and formal framework to diagnose where uncertainty arises and how it drives fairness harms in sequential settings.
Key Points
- Taxonomy: The authors introduce a lifecycle-based taxonomy of uncertainty in sequential decision systems, distinguishing global uncertainties (systemic, e.g., model and data generation processes) from local uncertainties (individual or subgroup-level, e.g., prediction and feedback uncertainty). They highlight three broad categories emphasized throughout the paper: model uncertainty, feedback uncertainty, and prediction uncertainty.
- Unequal uncertainty matters: Marginalized or historically excluded groups often suffer higher epistemic uncertainty (less data, selective observation), which raises effective risk and can lead to systematically worse decisions (e.g., more loan denials).
- Selective feedback / selective labels: When outcomes are only observed conditional on past decisions (e.g., only seeing repayment behavior for approved loans), feedback uncertainty interacts with representation gaps to create self-reinforcing exclusionary dynamics.
- Formalization: The paper formalizes model and feedback uncertainty using counterfactual logic and reinforcement-learning (bandit/RL) techniques to show how naively ignoring unobserved counterfactuals biases policies and outcomes.
- Illustrative mechanism: The authors propose and simulate targeted, uncertainty-proportional exploration (increasing the probability of favorable actions when prediction uncertainty is high) as a principled alternative to naive policies or group-based preference rules. Because it is driven by uncertainty rather than by group membership as such, the approach is presented as distinct from affirmative action.
- Empirical/experimental results: Simulations on synthetic datasets with controlled degrees of bias demonstrate that uncertainty-aware exploration can (i) reduce outcome variance for disadvantaged groups, (ii) improve fairness metrics, and (iii) retain institutional objectives like expected utility.
- Governance & legal considerations: RL-style exploration introduces stochasticity that raises legal, reputational, and operational concerns (e.g., individual fairness, non-discrimination law, and acceptability of randomness). The taxonomy is positioned as a diagnostic tool for auditing and governance.
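The uncertainty-proportional exploration mentioned in the key points can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' algorithm: the Beta-Bernoulli belief, the function names, the `threshold` of 0.5, and the `explore_scale` factor are all hypothetical choices made here for clarity.

```python
import math


def beta_mean_std(successes, failures):
    """Mean and standard deviation of a Beta(1 + s, 1 + f) posterior.

    Assumption: repayment is modelled as a Bernoulli outcome with a uniform prior.
    """
    a, b = 1 + successes, 1 + failures
    mean = a / (a + b)
    var = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)


def approval_probability(successes, failures, threshold=0.5, explore_scale=2.0):
    """Probability of a favorable decision.

    Exploit: approve with certainty when the estimated success rate clears the threshold.
    Explore: otherwise approve with probability proportional to the posterior
    standard deviation, so poorly observed cases get more exploration.
    """
    mean, std = beta_mean_std(successes, failures)
    if mean >= threshold:
        return 1.0
    return min(1.0, explore_scale * std)


def decide(successes, failures, rng, **kwargs):
    """Draw a stochastic decision; rng is a random.Random instance."""
    return rng.random() < approval_probability(successes, failures, **kwargs)
```

With similar observed repayment rates, a profile backed by 5 observations (`approval_probability(2, 3)`) receives a higher exploration probability than one backed by 100 (`approval_probability(40, 60)`), because its posterior is wider. The randomness in `decide` is exactly the stochasticity that the governance point above flags as a legal and operational concern.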
Data & Methods
- Conceptual / theoretical tools:
  - Counterfactual logic to formalize unobserved outcomes and feedback uncertainty (what would have happened under alternate decisions).
  - Reinforcement learning / online learning (including bandit-style setups) to model sequential decision-making with exploration–exploitation trade-offs.
- Taxonomy construction:
  - Survey of ML uncertainty literature mapped to stages of the ML lifecycle; six uncertainty types are organized into global (systemic/model-level) and local (individual/subgroup-level) categories.
- Experiments:
  - Synthetic/simulated datasets generated to include varying degrees of historical bias and selective observation (selective labels).
  - Implementation of simple algorithmic policies including baseline (naïve) policies and uncertainty-aware exploration policies that allocate exploration proportional to estimated uncertainty.
  - Evaluation metrics include decision-maker utility (expected reward), fairness metrics, and variance in group outcomes; outcomes compared across policies and bias regimes.
- Key methodological claim: the paper is not primarily an algorithmic innovation paper; it offers a diagnostic/analytic framework plus illustrative, simple exploration strategies to demonstrate the effects of unequal uncertainty.
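To make the experimental setup concrete, the toy simulation below compares a greedy baseline with an uncertainty-aware policy under selective labels. It is a sketch under assumptions and does not reproduce the paper's experiments: the two groups, their starting histories, the shared `true_rate` of 0.7, and the exploration rule are all hypothetical, and only approval rates are reported (not utility or variance).

```python
import math
import random


def posterior(successes, failures):
    """Mean and standard deviation of a Beta(1 + s, 1 + f) belief (assumed model)."""
    a, b = 1 + successes, 1 + failures
    mean = a / (a + b)
    std = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, std


def simulate(explore, rounds=500, true_rate=0.7, threshold=0.5, seed=0):
    """Return the approval rate per group after `rounds` sequential decisions."""
    rng = random.Random(seed)
    # Hypothetical histories as [repaid, defaulted]: group B is under-observed
    # and its few recorded outcomes happen to look unfavorable.
    counts = {"A": [70, 30], "B": [1, 4]}
    approvals = {"A": 0, "B": 0}
    for _ in range(rounds):
        for group in ("A", "B"):
            mean, std = posterior(*counts[group])
            if mean >= threshold:
                p_approve = 1.0
            elif explore:
                p_approve = min(1.0, 2.0 * std)  # exploration scaled by uncertainty
            else:
                p_approve = 0.0  # greedy baseline never explores
            if rng.random() < p_approve:
                approvals[group] += 1
                # Selective feedback: the outcome is observed only after approval.
                repaid = rng.random() < true_rate
                counts[group][0 if repaid else 1] += 1
    return {g: n / rounds for g, n in approvals.items()}


greedy = simulate(explore=False)
aware = simulate(explore=True)
```

Under the greedy baseline, group B starts below the threshold and is never approved, so no new outcomes are ever observed and its estimate never changes, even though both groups share the same true repayment rate. With exploration, group B receives some approvals, which generate the data needed to correct the estimate.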
Implications for AI Economics
- Distributional dynamics and market access:
  - Uneven uncertainty produces dynamic exclusion: under-observed groups get fewer positive decisions, reducing future data and reinforcing higher uncertainty. This is analogous to persistent adverse selection and market segmentation that lowers labor/credit access for disadvantaged groups.
  - This can create long-run aggregate inefficiencies by under-allocating productive opportunities and underestimating demand in excluded segments.
- Firm incentives and strategic behavior:
  - Firms maximizing short-term expected utility may rationally avoid exploring high-uncertainty submarkets, producing socially suboptimal “data poverty traps.” Understanding the exploration cost-benefit is crucial for incentive design.
  - Uncertainty-aware exploration can, in some settings, be a Pareto-improving strategy (reduce group disparities while preserving firm utility), suggesting private incentives for its adoption may exist, but friction, legal risk, and reputational costs complicate adoption.
- Policy and regulation:
  - Regulation should account for dynamic, uncertainty-driven harms (not just static fairness metrics). Possible approaches:
    - Mandates or incentives for firms to deploy uncertainty-aware exploration (or subsidize exploration) where public-interest services are at stake (e.g., lending, credit scoring).
    - Requirements for auditing pipelines to document sources of uncertainty (taxonomy-based documentation) and to monitor selective feedback effects.
    - Time-limited safe exploration regimes, human-in-the-loop guardrails, or evidence-based pilot programs to limit legal/reputational exposures while collecting data.
- Welfare, cost-benefit, and measurement:
  - Cost–benefit analyses of ML deployment must include the long-run welfare effects of perpetuated data gaps and reduced access to markets for marginalized groups.
  - Measuring firm-level utility alone can be misleading; social welfare accounting should include effects of compounded exclusion and reduced downstream opportunities.
- Research and market design questions for economics:
  - Optimal exploration policy design under regulatory constraints and private cost structures (how much exploration should firms do, who should pay).
  - Market-level effects when multiple firms compete: do competitive pressures induce exploration that reduces exclusion, or do firms free-ride on others’ exploration?
  - Mechanism design for corrective subsidies, data-pooling, or public-data provision to mitigate representation gaps.
  - Empirical estimation of the magnitude of selective-feedback externalities in real markets (credit, hiring, health) and the welfare gains from uncertainty-aware policies.
- Auditing and governance implications:
  - Regulators and auditors should track uncertainty heterogeneity across groups and require counterfactual analyses and monitoring of selective feedback loops.
  - Documentation and disclosure (e.g., uncertainty maps, exploration policies) can help align firm incentives, consumer expectations, and legal standards.
Suggested next steps for economists interested in this area:
- Model the firm’s optimization including exploration costs, legal/reputational risk, and long-run market access externalities.
- Empirically estimate feedback uncertainty and the impact of targeted exploration on access and default/repayment rates in real-world lending or hiring datasets.
- Design and evaluate policy instruments (subsidies, data trusts, mandated audits) to correct market failures arising from unequal uncertainty.
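The "data poverty trap" argument made under firm incentives can be written out as a short derivation. This is an illustrative sketch, not a formalization taken from the paper. All notation is assumed here: $G$ and $L$ are the firm's gain and loss per loan, $p_g$ is the true repayment probability of group $g$, $\hat{p}_{g,t}$ is the firm's estimate at time $t$, and the estimate for group $g$ is assumed to update only from that group's observed outcomes.

```latex
% Assumed notation (not from the paper): G > 0 gain from a repaid loan,
% L > 0 loss from a default, D_t in {0,1} the decision at time t.
\begin{align*}
\text{Myopic rule:}\quad
  & D_t = 1 \iff \hat{p}_{g,t}\,G - (1-\hat{p}_{g,t})\,L \ge 0
    \iff \hat{p}_{g,t} \ge \tau := \frac{L}{G+L},\\
\text{Selective feedback:}\quad
  & D_t = 0 \implies \hat{p}_{g,t+1} = \hat{p}_{g,t},\\
\text{Trap:}\quad
  & \hat{p}_{g,t} < \tau \implies \hat{p}_{g,s} = \hat{p}_{g,t} < \tau
    \quad \text{for all } s \ge t,\\
\text{Forgone gain per applicant:}\quad
  & p_g\,G - (1-p_g)\,L > 0 \quad \text{whenever } p_g > \tau .
\end{align*}
```

Under these assumptions, a purely myopic firm never revisits a group whose estimate starts below $\tau$, regardless of the true $p_g$. The last line is the unrealized gain that exploration could recover, which is why the cost of exploration has to be weighed against long-run access and revenue rather than against one-period expected utility alone.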
Assessment
Claims (10)
| Claim | Category | Direction | Confidence | Outcome | Details |
|---|---|---|---|---|---|
| This paper introduces a taxonomy of uncertainty in sequential decision-making consisting of three types: model uncertainty, feedback uncertainty, and prediction uncertainty. | AI Safety and Ethics | positive | high | categories of uncertainty in sequential decision-making | 0.12 |
| The authors formalize model and feedback uncertainty using counterfactual logic and reinforcement learning. | AI Safety and Ethics | positive | high | formalization of uncertainty types | 0.12 |
| Algorithmic examples in the paper demonstrate it is possible to reduce outcome variance for disadvantaged groups while preserving institutional objectives such as expected utility. | Inequality | positive | high | outcome variance for disadvantaged groups; expected utility (institutional objective) | 0.12 |
| Experiments on simulated data with varying bias show that unequal uncertainty and selective feedback produce disparities across groups. | Inequality | negative | high | group disparities (fairness metrics) | 0.12 |
| Uncertainty-aware exploration (in algorithms) alters fairness metrics compared to policies that ignore uncertainty. | AI Safety and Ethics | mixed | high | fairness metrics | 0.12 |
| Policies that ignore the unobserved (counterfactual) space can harm decision makers (via unrealized gains or losses) and subjects (via compounding exclusion and reduced access). | Inequality | negative | high | unrealized gains/losses for decision makers; compounding exclusion and reduced access for subjects | 0.12 |
| Many practical machine learning applications are online and sequential, meaning prior decisions inform future ones, a setting in which fairness challenges differ from standard supervised learning. | AI Safety and Ethics | neutral | high | characterization of ML application setting (online/sequential) | 0.12 |
| Under-represented groups tend to be systematically under-observed because of historical exclusion and selective feedback, which exacerbates uncertainty for those groups. | Inequality | negative | high | observation frequency/data availability for under-represented groups; resulting uncertainty | 0.12 |
| The proposed framework can help practitioners diagnose, audit, and govern fairness risks in socio-technical decision systems. | Governance and Regulation | positive | high | practitioner ability to diagnose/audit/govern fairness risks | 0.06 |
| When unfairness is driven by uncertainty (rather than incidental noise), accounting for uncertainty is essential to achieving fair and effective decision-making. | AI Safety and Ethics | positive | high | fairness and effectiveness of decision-making when uncertainty is accounted for | 0.12 |