Removing manipulable features from high-stakes predictors can backfire; jointly choosing which features to keep and how strongly to regularize them delivers better performance and reduces gaming, according to a formal analysis and a healthcare benchmark.
When algorithmic predictors inform resource allocation in high-stakes domains such as healthcare, these predictors must account for strategic manipulation of input features. The typical solution is to redesign the predictor itself to explicitly account for strategic interactions. In practice, however, decision makers are often constrained to adjusting coarser levers within existing prediction pipelines. For example, healthcare organizations often select which features to exclude based on perceived manipulability, while using standard regularization procedures to shrink the coefficients of retained features. In this work, we initiate a formal study of strategic classification through feature selection and its interaction with ridge regularization. Our main finding is that excluding individual features based on their manipulability alone is generally suboptimal. We provide a fine-grained characterization of the performance of a feature subset under optimal regularization, yielding new insights for policy design. Motivated by this characterization, we develop a practical algorithm for jointly choosing the feature set and the level of ridge regularization. Through a real-world case study on a healthcare payments benchmark, we illustrate how our algorithm can guide the design of coarse policy levers in practice. Our results provide a principled, practical framework for mitigating the effects of strategic behavior in algorithmic decision-making systems.
Summary
Main Finding
Excluding features based only on perceived manipulability is generally suboptimal. Instead, the feature set must be chosen jointly with the ridge penalty, and the choice should account for each feature’s predictive value, its manipulability relative to other features, and the covariance structure among features. Under a linear predictor and quadratic manipulation costs, the authors characterize an irreducible loss that coarse interventions cannot overcome and give a tight decomposition of the extra loss incurred by support-restricted ridge estimators into (i) a manipulability gain, (ii) a predictive loss, and (iii) a heterogeneity gap. They develop a practical two-stage algorithm to jointly select features and the ridge penalty, and demonstrate in a calibrated Medicare Advantage case study that their method improves robustness to strategic manipulation while retaining most predictive accuracy.
Key Points
- Setting and assumptions
  - Linear predictor f_{θ,b}(x) = θ⊤x + b; the population distribution of (X, Y) has finite second moments, and the data are centered.
  - Organizations observe X and can manipulate reported features by adding a ∈ R^d at quadratic cost C(a) = (1/2) a⊤Ha (H ≻ 0).
  - The decision maker has access to unmanipulated training data (for example, collected before deployment or under an alternative regime).
  - Policy levers considered: (i) feature selection (support restriction S) and (ii) ridge regularization (λ ≥ 0), including their joint use (support-restricted ridge).
- Fundamental limit (Theorem 1)
  - There is an irreducible gap between the strategically optimal unconstrained predictor and any predictor fit under the support-restriction + ridge family. This gap can be bounded (e.g., by quantities like (θ⊤H^{-1}θ)^2), so coarse levers cannot always match a fully redesigned strategic-optimal predictor.
- Fine-grained upper bound (Theorem 2)
  - The excess strategic MSE (over the best zero-intercept strategic fit on the full feature set) for an optimally regularized support S decomposes into three interpretable components:
    - Manipulability gain: how much restricting to S reduces the strategic burden relative to using all features.
    - Predictive loss: the loss from dropping predictive information (captured by the conditional covariance Σ_{R|S} and θ*_R).
    - Heterogeneity gap: a term capturing how heterogeneity or misalignment between the cost matrix H and the feature covariance Σ limits the ability of ridge regularization to neutralize manipulability.
  - This decomposition shows that a highly manipulable feature can still be worth retaining when it is predictive and when the other retained features have comparable manipulability (i.e., they are redundant proxies that carry similar incentives to manipulate).
- Algorithm and empirical work
  - A practical two-stage algorithm: (i) continuous relaxation of the combinatorial support selection + ridge tuning problem, (ii) local discrete support refinement.
  - Case study: simulated upcoding in a Medicare Advantage payments benchmark (calibrated to realistic coding patterns). The proposed joint-selection algorithm substantially improves strategic robustness relative to standard full-support models and naive "drop-most-manipulable" heuristics, while largely preserving predictive accuracy.
- Policy takeaways
  - Feature selection and coefficient regularization must be tuned jointly: the best support without regularization may not be the best once ridge is applied.
  - Dropping a manipulable feature is warranted only if there exists a less-manipulable proxy with similar predictive power; otherwise, regularizing or simply retaining the feature may be better.
  - Coarse levers (selection + ridge) provide practical robustness benefits, including under uncertainty about manipulation costs.
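To make the setting above concrete, here is a short worked derivation of how manipulation enters the loss. It is a sketch under an assumption that the summary does not state explicitly: each organization picks a to maximize its predicted score minus the quadratic cost C(a). The symbol a* is introduced only for this illustration.

```latex
\begin{align*}
a^\star(\theta) &= \arg\max_{a \in \mathbb{R}^d}
  \Big\{ \theta^\top (x + a) + b - \tfrac{1}{2}\, a^\top H a \Big\} = H^{-1}\theta, \\
f_{\theta,b}\big(x + a^\star(\theta)\big) &= \theta^\top x + b + \theta^\top H^{-1}\theta, \\
\mathbb{E}\big[(Y - \theta^\top X - \theta^\top H^{-1}\theta)^2\big]
  &= \mathbb{E}\big[(Y - \theta^\top X)^2\big] + \big(\theta^\top H^{-1}\theta\big)^2
  \qquad (b = 0,\ \mathbb{E}[X] = 0,\ \mathbb{E}[Y] = 0).
\end{align*}
```

The cross term vanishes because the data are centered. Under this assumed response model, the first term is the usual predictive error and the second is a manipulation penalty of the same form as the (θ⊤H^{-1}θ)^2 quantity mentioned under Theorem 1. Shrinking θ with ridge or zeroing coordinates through support restriction lowers the second term at the price of raising the first, which is the trade-off the two levers are tuned against.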
Data & Methods
- Theoretical analysis
  - Closed-form characterizations for least-squares, ordinary ridge, and support-restricted ridge estimators under centering assumptions.
  - Proof of a lower bound (irreducible gap) and an upper bound decomposing excess strategic MSE (Theorems 1 and 2). The upper bound depends on Σ (feature covariance), θ* (best linear projection), and H (manipulation cost matrix).
  - Heterogeneity gap formalized via operator-norm distances comparing H^{-1}_{SS} to a scaled identity in the retained-support subspace; constants capture sensitivity to the alignment and magnitude of the coefficients.
- Algorithm
  - Continuous relaxation of support selection (which makes the combinatorial problem amenable to optimization), followed by local greedy/discrete refinement to produce a final support and tuned λ.
- Empirical evaluation
  - Realistic case study in healthcare payments: simulated upcoding calibrated to Medicare Advantage coding behavior drawn from prior policy work.
  - Evaluation metric: strategic MSE after agents best-respond (i.e., measured after equilibrium manipulations).
  - Baselines: full-support models, naive heuristics that drop the most-manipulable features, and other standard regularization choices.
  - Results show that joint selection + tuning yields lower strategic MSE while retaining predictive performance.
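The evaluation idea above can be illustrated with a small population-level sketch. It is not the authors' two-stage algorithm: it swaps the continuous relaxation and local refinement for an exhaustive search over supports and a λ grid, which is only feasible for a handful of features. It also assumes agents respond with the shift H^{-1}θ (score minus quadratic cost) and that the intercept is zero. All function names and the toy numbers are hypothetical.

```python
import itertools
import numpy as np


def restricted_ridge(Sigma, c, support, lam):
    """Population ridge restricted to `support`: solve
    (Sigma_SS + lam * I) theta_S = c_S and set all other coefficients to zero."""
    S = list(support)
    theta = np.zeros(len(c))
    if S:
        A = Sigma[np.ix_(S, S)] + lam * np.eye(len(S))
        theta[S] = np.linalg.solve(A, c[S])
    return theta


def strategic_mse(theta, Sigma, c, var_y, H):
    """Zero-intercept MSE after the ASSUMED best response a = H^{-1} theta.
    For centered data: E[(Y - theta'X)^2] + (theta' H^{-1} theta)^2."""
    clean = var_y - 2.0 * theta @ c + theta @ Sigma @ theta
    shift = theta @ np.linalg.solve(H, theta)
    return clean + shift ** 2


def joint_search(Sigma, c, var_y, H, lambdas):
    """Brute-force stand-in for joint selection: try every support and every
    lambda on the grid, and keep the pair with the lowest strategic MSE."""
    d = len(c)
    best = None
    for k in range(d + 1):
        for S in itertools.combinations(range(d), k):
            for lam in lambdas:
                theta = restricted_ridge(Sigma, c, S, lam)
                val = strategic_mse(theta, Sigma, c, var_y, H)
                if best is None or val < best[0]:
                    best = (val, S, lam, theta)
    return best


# Toy population (hypothetical numbers): features 0 and 1 are correlated,
# and feature 0 is the cheapest to manipulate (smallest diagonal entry of H).
Sigma = np.array([[1.0, 0.8, 0.0],
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
theta_star = np.array([1.0, 0.5, 0.5])
c = Sigma @ theta_star                         # E[X Y] under a linear model
var_y = theta_star @ Sigma @ theta_star + 0.25  # signal variance + noise
H = np.diag([0.5, 5.0, 5.0])
lambdas = np.concatenate(([0.0], np.logspace(-2, 2, 25)))

# Baseline 1: all features, no regularization.
full_val = strategic_mse(restricted_ridge(Sigma, c, (0, 1, 2), 0.0),
                         Sigma, c, var_y, H)
# Baseline 2: drop the single most manipulable feature, no regularization.
drop = int(np.argmin(np.diag(H)))
naive_support = tuple(j for j in range(3) if j != drop)
naive_val = strategic_mse(restricted_ridge(Sigma, c, naive_support, 0.0),
                          Sigma, c, var_y, H)

best_val, best_support, best_lam, best_theta = joint_search(
    Sigma, c, var_y, H, lambdas)
print("full support:", full_val)
print("drop most manipulable:", naive_val)
print("joint search:", best_val, best_support, best_lam)
```

Because the grid contains λ = 0 and the search visits every support, the joint result can never be worse than either baseline on this objective. How large the improvement is depends entirely on Σ, θ*, and H, which is the dependence the decomposition in Theorem 2 describes.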
Implications for AI Economics
- For regulators and policy-makers (e.g., CMS and similar institutions)
  - Naively excluding features because they appear manipulable may produce worse strategic outcomes than a considered joint policy of selective retention plus coefficient shrinkage. Policies should evaluate each feature by its predictive value, its manipulability, and its covariance with the other features, considered together.
  - When a less-manipulable proxy exists and is sufficiently predictive, dropping a manipulable feature can reduce incentives to manipulate at limited predictive cost; this clarifies when a common policy choice (feature removal) is warranted.
  - Regularization can be a low-friction, legally and operationally feasible lever to reduce gaming incentives; however, its effect depends on the heterogeneity and alignment of manipulation costs across features.
- For economic design and welfare analysis
  - The paper formalizes trade-offs that arise when decision rules are constrained by institutional inertia (coarse levers only). This makes explicit the welfare losses that cannot be eliminated without redesigning the predictor class.
  - The irreducible gap quantifies the efficiency cost of restricting attention to legacy pipelines; this can guide decisions about whether to invest in a full redesign or to tune coarse levers.
  - The results bridge the proxy-means-testing literature (which emphasizes verifiability and predictive power) and incentive-aware ML: selection must guard against both manipulation and predictive loss.
- Practical research and policy priorities suggested
  - Estimating the manipulation-cost structure H, and making policies robust to errors in it, is crucial: the paper’s prescriptions depend on knowledge of H (or on an uncertainty model for it).
  - Extensions to non-linear models, heterogeneous agents, and non-quadratic costs would improve applicability in settings where linearity or quadratic costs are poor approximations.
  - Empirical validation in deployed settings (beyond simulated upcoding) would help quantify real-world gains (e.g., potential reduction in overpayments).
- Overall: this work gives a principled, implementable framework for regulators and organizations that must operate within legacy prediction pipelines and need to mitigate strategic manipulation without a full model redesign.
Assessment
Claims (7)
| Claim | Direction | Outcome | Confidence & Evidence | Details |
|---|---|---|---|---|
| Excluding individual features based on their manipulability alone is generally suboptimal. [Output Quality] | negative | predictive performance / classifier performance after strategic manipulation | Reading fidelity: high; Study strength: high | |
| The paper provides a fine-grained characterization of the performance of a feature subset under optimal (ridge) regularization. [Output Quality] | positive | performance of feature subsets under optimal ridge regularization | Reading fidelity: high; Study strength: medium | |
| The interaction between feature selection and ridge regularization yields new insights for policy design. [Governance And Regulation] | positive | implications for policy design regarding choice of coarse levers (feature exclusion and regularization) | Reading fidelity: high; Study strength: medium | |
| We develop a practical algorithm for jointly choosing the feature set and the level of ridge regularization. [Task Allocation] | positive | ability to select feature subset and regularization parameter to mitigate strategic manipulation | Reading fidelity: high; Study strength: medium | |
| Through a real-world case study on a healthcare payments benchmark, the algorithm can guide the design of coarse policy levers in practice. [Decision Quality] | positive | practical guidance for feature exclusion and regularization choice in a healthcare payments prediction task | Reading fidelity: high; Study strength: medium | |
| The results provide a principled, practical framework for mitigating the effects of strategic behavior in algorithmic decision-making systems. [AI Safety And Ethics] | positive | mitigation of effects of strategic behavior on algorithmic decision outcomes | Reading fidelity: high; Study strength: medium | |
| In practice, decision makers are often constrained to adjusting coarser levers within existing prediction pipelines (e.g., excluding perceived-manipulable features and using standard regularization). [Adoption Rate] | null_result | typical operational constraints on predictor adjustment (feature exclusion, regularization) | Reading fidelity: high; Study strength: low | |