Fairness–utility trade-offs in binary decisions collapse to simple group-specific probability thresholds: the entire Pareto frontier is formed by deterministic threshold rules (sometimes favoring lower-risk applicants), driven by population characteristics and objective weights rather than how the algorithm is engineered.

Fairness vs Performance: Characterizing the Pareto Frontier of Algorithmic Decision Systems

Mieke Wilms, Christoph Heitz · May 11, 2026

arxiv theoretical n/a evidence 7/10 relevance Source PDF

Under a broad class of utility functions, population distributions, and group fairness metrics, the Pareto frontier of trade-offs between decision-maker utility and group fairness is realized by deterministic, group-specific threshold rules — including both lower- and, for some metrics, upper-bound thresholds — and is determined by population and objective parameters rather than algorithmic design choices.

Designing fair algorithmic decision systems requires balancing model performance with fairness toward affected individuals: More fairness might require sacrificing some performance and vice versa, yet the space of possible trade-offs is still poorly understood. We investigate fairness in binary prediction-based decision problems by conceptualizing decision making as a multi-objective optimization problem that simultaneously considers decision-maker utility and group fairness. We investigate the set of Pareto-optimal decision rules for arbitrary utility functions for decision maker, arbitrary population distributions, and a wide range of group fairness metrics. We find that the Pareto frontier consists of deterministic, group-specific threshold rules applied to individuals' success probability. This complements existing optimality theorems from literature which, for specific fairness constraints, posit lower-bound threshold rules only. However we also show that, depending on the used fairness metric, the Pareto frontier may include upper-bound threshold rules, thus preferring individuals with lower success probabilities. We show that the location of the Pareto frontier depends only on population characteristics, utility functions and fairness score, but not on the technical design of the algorithm - our findings hold for pre-, in-, and post-processing approaches alike. Our results generalize existing optimality theorems for fairness-constrained classification and extend them to generalized fairness metrics and fairness principles, and to partial fairness regimes. This paper connects formal fairness research with legal and ethical requirements to search for less discriminatory alternatives, offering a principled foundation for evaluating and comparing algorithmic decision systems.

Summary

Main Finding

The Pareto frontier of binary prediction-based decision systems (trade-off between decision-maker utility and group fairness) — taken over the full space of possible decision rules, including group-dependent rules — consists of group-specific deterministic threshold rules applied to individuals’ success probability p(x) = P[Y=1 | x]. These thresholds can be lower-bound, upper-bound, or combinations thereof. The frontier is determined solely by (i) the population distributions of p conditional on group (g(p | a)), (ii) the decision-maker (DM) and decision-subject (DS) utility functions, and (iii) the chosen fairness score (FS) derived from group-wise expected DS utilities. It does not depend on the technical design (pre-, in- or post-processing) of the algorithm. The paper generalizes prior optimality results (which focused on specific confusion-matrix parity constraints and lower-threshold rules) to a wide class of fairness metrics (including egalitarian, prioritarian, Rawlsian, sufficientarian principles) and to partial fairness regimes, and highlights that Pareto-optimal solutions can include “upper-threshold” rules (cherry-picking) when DS utility structures make it optimal.

Key Points

Problem setup:
- Binary decision D ∈ {0,1} based on features x (which may include protected attribute a), with binary true outcome Y ∈ {0,1}.
- DM utility U specified by 2×2 payoff matrix (u_ij) for (D=i, Y=j).
- DS impact/fairness captured by DS utility matrix V (v_ij) and aggregated into group-wise expected DS utilities; fairness score FS is a function of those expected utilities across groups. FS can represent equality measures (differences) or other justice-inspired objectives (maximin, prioritarian, sufficientarian).
Main theorem:
- Over all decision rules (including those conditional on a), any Pareto-optimal rule is deterministic and takes the form: for each group a, accept iff p(x) is inside some interval (possibly a half-line) — i.e., thresholding on p with group-specific lower and/or upper cutoffs.
- Depending on FS and DS utility matrix, Pareto-optimal thresholds can be:
  - Lower-bound only (classical: accept above threshold),
  - Upper-bound only (prefer low-p individuals — “cherry-picking”),
  - Mixed (accept only if p in [t_low, t_high]).
Relation to prior work:
- Recovers classical results that parity constraints like statistical parity, TPR/FPR parity produce group-wise lower-bound thresholds as special cases.
- Extends to: (a) generalized DS-utility-based fairness metrics (not only confusion-matrix metrics), (b) other justice principles (Rawls, prioritarianism), and (c) partial enforcement of fairness (not only strict parity constraints).
Technology-agnostic benchmark:
- The Pareto frontier gives an outer bound for achievable utility-fairness trade-offs independent of model architecture. Thus different algorithmic approaches that optimally trade off the same objectives will converge to these threshold rules.
Practical observation:
- For locating the frontier it suffices to estimate the distributions g(p | a) of p = P[Y=1 | x] within groups, not a perfect Bayes predictor for each x.
Empirical / implementation note:
- The authors provide code (GitHub) and an example showing that an in-processing method can produce group-specific thresholds even without explicit access to the protected attribute.

Data & Methods

Nature of work: theoretical/analytical with illustrative examples and code (no original field dataset).
Formal model:
- Binary decision framework, joint distribution of (Y, x); p(x) = P[Y=1 | x] and feature density h(x).
- DM utility matrix (u_ij) and DS utility matrix (v_ij). Expected utilities per group computed via integrals over g(p | a).
- Fairness score FS = F( E[V | group] ), where F can encode difference, min, weighted sum, thresholds, etc.
Optimization viewpoint:
- Multi-objective optimization in two dimensions: maximize DM expected utility and optimize (minimize or maximize depending on principle) FS across all measurable decision rules f: x ↦ {0,1} (including those that use a).
- Characterization of Pareto-optimal set via variational/ordering arguments on p — show pointwise optimality implies thresholding on p within each group.
Generality:
- Results proven for arbitrary DM/DS payoff matrices and arbitrary group distributions of p; hold for pre-, in- and post-processing approaches because proof is over the space of all decision rules.
Reproducibility:
- Python code implementing constructions and examples is shared: https://github.com/mcwilms/Pareto-Frontier-of-Binary-Decision-Systems
Limitations of method:
- Framework limited to binary outcomes & binary decisions; focuses on group fairness (not individual, causal, or counterfactual fairness); assumes access to/estimability of g(p | a) and specified utility matrices.

Implications for AI Economics

Benchmarking and compliance:
- Provides a technology-agnostic benchmark for regulators, auditors, and firms to judge whether a deployed system is (near) Pareto-optimal in utility-fairness trade-offs. This can be used in “Less Discriminatory Alternative” (LDA) assessments required by law: if a model lies strictly inside the Pareto frontier, an LDA exists.
Cost-of-fairness quantification:
- The frontier quantifies the minimal DM utility loss needed to achieve any desired fairness level (or, conversely, maximal fairness achievable at given utility). Economists can translate these trade-offs into monetary costs, prices, or firm profit impacts.
Policy and regulation design:
- Regulators can use the frontier to set feasible fairness targets or to require disclosures (e.g., estimated g(p | a), assumed utility functions) so stakeholders can assess achievable trade-offs. Because the frontier is independent of algorithmic architecture, policy can focus on outcomes rather than specific technical constraints.
Market behavior and incentives:
- Firms deciding on adoption of fairness constraints can use the frontier to decide governance/investment: e.g., whether to accept profit loss for fairness gains, or to redesign products to shift g(p | a) (e.g., feature collection strategies) to obtain better frontier points.
- The possibility that Pareto-optimal solutions include upper-bound thresholds (cherry-picking) implies regulation should consider within-group impacts and specify which DS utility notions are socially acceptable. Simple parity constraints may be insufficient to prevent types of discriminatory selection that nonetheless satisfy some fairness metrics.
Empirical strategy for economists:
- Estimation focus: estimate distributions g(p | a) (density of score p within groups) and elicit or model DM/DS utility matrices to compute the frontier. This reduces data demands compared with trying to learn full optimal classifiers.
- Welfare analysis: combine DM profits and group DS utilities into social welfare functions (e.g., weighted sum, Rawlsian) to select socially preferred points on the frontier; use counterfactual experiments to evaluate market-wide effects (entry, pricing).
Mechanism and mechanism-design implications:
- When protected-attribute–conditional decision rules are legally prohibited, the frontier under such constraints can be compared to the unconstrained frontier to quantify efficiency loss from prohibition. This informs trade-offs between anti-discrimination legal norms and economic efficiency.
Cautions for applied work:
- Translating theoretical payoffs into monetary utilities requires careful modeling of DS impacts (v_ij). Different DS utility specifications can reshuffle the frontier and the nature of optimal rules (e.g., creating upper-threshold optima). Policymakers and analysts must make normative choices explicit (egalitarian vs. prioritarian vs. maximin).
Research directions for AI economics:
- Empirical estimation of g(p | a) in real markets (credit, health triage, hiring) to map practical frontiers.
- Integrate frontier analysis into firm-level cost-benefit and compliance models.
- Study multi-period/dynamic extensions (learning, strategic behavior) and non-binary settings (multi-class, continuous decisions).
- Investigate implications of prohibiting group-dependent rules vs. allowing them under transparency/oversight.

References / provenance - Wilms, M. & Heitz, C. (2026). Fairness vs Performance: Characterizing the Pareto Frontier of Algorithmic Decision Systems. FAccT ’26; arXiv:2605.10604. Code: https://github.com/mcwilms/Pareto-Frontier-of-Binary-Decision-Systems

If you want, I can: - Sketch the constructive procedure the authors use to compute the group-wise thresholds from g(p | a) and utilities, or - Outline an empirical protocol (data + steps) for estimating g(p | a) and producing an empirical Pareto frontier for a particular application (credit, hiring, health).

Assessment

Paper Typetheoretical Evidence Strengthn/a — Paper is purely theoretical/analytical; it provides formal proofs rather than empirical or causal evidence, so evidence strength for causal claims is not applicable. Methods Rigorhigh — The contribution is a formal, general mathematical derivation characterizing Pareto-optimal rules across arbitrary utility functions/populations and many fairness metrics; methods appear rigorous and general within the model assumptions, though the work does not empirically validate or test robustness to estimation error or strategic behavior. SampleNo empirical sample — a theoretical model of binary decision/prediction problems assuming individuals characterized by true success probabilities, arbitrary population distributions across groups, arbitrary decision-maker utility functions, and a wide class of group fairness metrics; analysis covers pre-, in-, and post-processing rules. Themesgovernance inequality IdentificationAnalytical characterization: derive the Pareto frontier of a multi-objective optimization (decision-maker utility versus group fairness) using mathematical proofs that identify the form of Pareto-optimal decision rules for arbitrary utility functions, population distributions, and a broad class of fairness metrics. GeneralizabilityApplies only to binary decision problems with well-defined individual success probabilities; not directly applicable to multi-class or continuous outcomes., Relies on access to or knowledge of true success probabilities — ignores estimation error, finite-sample uncertainty, and learning dynamics., Assumes static one-shot decisions; may not hold in dynamic, sequential, or strategic settings (e.g., behavior change, gaming)., Focuses on group fairness metrics and group-specific thresholds — may not capture intersectional groups or individual fairness notions., Doesn't address operational constraints (e.g., legal restrictions, information costs, implementation complexity) or downstream economic impacts (employment, productivity) without further modeling.

Claims (6)

Claim	Direction	Confidence	Outcome	Details
The Pareto frontier consists of deterministic, group-specific threshold rules applied to individuals' success probability. Decision Quality	positive	high	form of Pareto-optimal decision rules (deterministic, group-specific thresholds on success probability)	0.2
This result complements existing optimality theorems from the literature which, for specific fairness constraints, posit lower-bound threshold rules only. Decision Quality	positive	high	relation to and extension of existing fairness-constrained optimality theorems	0.12
Depending on the used fairness metric, the Pareto frontier may include upper-bound threshold rules, thus preferring individuals with lower success probabilities. Decision Quality	mixed	high	presence of upper-bound threshold rules on Pareto frontier (preference toward lower-success-probability individuals)	0.12
The location of the Pareto frontier depends only on population characteristics, utility functions and the fairness score, but not on the technical design of the algorithm — the findings hold for pre-processing, in-processing, and post-processing approaches alike. Decision Quality	null_result	high	dependence of Pareto frontier location on algorithmic design	0.2
The results generalize existing optimality theorems for fairness-constrained classification and extend them to generalized fairness metrics and partial fairness regimes. Decision Quality	positive	high	generality/extent of optimality theorems (coverage of fairness metrics and partial regimes)	0.12
This paper connects formal fairness research with legal and ethical requirements to search for less discriminatory alternatives, offering a principled foundation for evaluating and comparing algorithmic decision systems. Governance And Regulation	positive	medium	utility of the theoretical results for legal/ethical evaluation and comparison of algorithmic decision systems	0.01