Removing manipulable features from high-stakes predictors can backfire; jointly choosing which features to keep and how strongly to regularize them delivers better performance and reduces gaming, according to a formal analysis and a healthcare benchmark.

Strategic Feature Selection

Jivat Neet Kaur, Pratik Patil, Divya Shanmugam, Emma Pierson, Michael I. Jordan, Nika Haghtalab, Meena Jagadeesan, Ahmed Alaa, Serena Wang · June 17, 2026

Source: arXiv · Paper type: theoretical · Evidence strength: medium · Relevance: 7/10

Excluding features judged manipulable is often suboptimal; jointly selecting features and tuning ridge regularization — guided by a formal model of strategic manipulation — yields superior predictive performance and mitigates gaming, as shown analytically and in a healthcare payments case study.

When algorithmic predictors inform resource allocation in high-stakes domains such as healthcare, these predictors must account for strategic manipulation of input features. The typical solution is to redesign the predictor itself to explicitly account for strategic interactions. In practice, however, decision makers are often constrained to adjusting coarser levers within existing prediction pipelines. For example, healthcare organizations often select which features to exclude based on perceived manipulability, while using standard regularization procedures to shrink the coefficients of retained features. In this work, we initiate a formal study of strategic classification through feature selection and its interaction with ridge regularization. Our main finding is that excluding individual features based on their manipulability alone is generally suboptimal. We provide a fine-grained characterization of the performance of a feature subset under optimal regularization, yielding new insights for policy design. Motivated by this characterization, we develop a practical algorithm for jointly choosing the feature set and the level of ridge regularization. Through a real-world case study on a healthcare payments benchmark, we illustrate how our algorithm can guide the design of coarse policy levers in practice. Our results provide a principled, practical framework for mitigating the effects of strategic behavior in algorithmic decision-making systems.

Summary

Main Finding

Excluding features based only on perceived manipulability is generally suboptimal. Instead, the feature set must be chosen jointly with the ridge penalty, accounting for each feature’s predictive value, its manipulability relative to other features, and the covariance structure among features. Under a linear predictor and quadratic manipulation costs, the authors characterize an irreducible loss that coarse interventions cannot overcome and give a tight decomposition of the extra loss incurred by support-restricted ridge estimators into (i) a manipulability gain, (ii) a predictive loss, and (iii) a heterogeneity gap. They develop a practical two-stage algorithm that jointly selects the features and the ridge penalty, and show in a calibrated Medicare-style case study that it improves robustness to strategic manipulation while retaining most predictive accuracy.

Key Points

Setting and assumptions
- Linear predictor f_{θ,b}(x) = θ⊤x + b, population distribution of (X, Y) with finite second moments and centered data.
- Organizations observe X and can manipulate reported features by adding a ∈ R^d at quadratic cost C(a) = (1/2) a⊤Ha (H ≻ 0).
- Decision maker has access to unmanipulated training data (pre-deployment or alternative regime).
- Policy levers considered: (i) feature selection (support restriction S) and (ii) ridge regularization (λ ≥ 0), including their joint use (support-restricted ridge).
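The manipulation model above can be made concrete with a short sketch. It assumes that each agent picks a to maximize the predicted score θ⊤(x + a) minus the cost C(a); the summary states only the cost, so this objective is an assumption. Under it, setting the gradient θ − Ha to zero gives a = H^{-1}θ, and the predicted score rises by θ⊤H^{-1}θ. The function names and numbers are hypothetical.

```python
import numpy as np

def best_response(theta, H):
    """Manipulation a maximizing theta @ (x + a) - 0.5 * a @ H @ a.

    Assumption: agents maximize predicted score minus quadratic cost.
    The first-order condition theta - H a = 0 gives a = H^{-1} theta,
    which does not depend on the agent's true features x.
    """
    return np.linalg.solve(H, theta)

def score_shift(theta, H):
    """Increase in the predicted score caused by the best response."""
    return float(theta @ best_response(theta, H))

# Hypothetical numbers: feature 0 is cheap to manipulate, feature 1 is costly.
theta = np.array([1.0, 0.5])
H = np.array([[0.5, 0.0],
              [0.0, 4.0]])

a_star = best_response(theta, H)   # [2.0, 0.125]
shift = score_shift(theta, H)      # 1.0 * 2.0 + 0.5 * 0.125 = 2.0625
```

In this sketch the shift is the same for every agent, so with centered data and a zero intercept it adds (θ⊤H^{-1}θ)^2 to the mean squared error, which is the kind of quantity that appears in the bounds discussed next.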
Fundamental limit (Theorem 1)
- There is an irreducible gap between the strategically optimal unconstrained predictor and any predictor fit within the support-restriction + ridge family. This gap can be bounded (e.g., by quantities like (θ⊤H^{-1}θ)^2), so coarse levers cannot always match a fully redesigned, strategically optimal predictor.
Fine-grained upper bound (Theorem 2)
- The excess strategic MSE (over the best zero-intercept strategic fit on the full feature set) for an optimally regularized support S decomposes into three interpretable components:
  - Manipulability gain: how much restricting to S reduces the strategic burden relative to using all features.
  - Predictive loss: the loss from dropping predictive information (captured by the conditional covariance Σ_{R|S} and θ*_R).
  - Heterogeneity gap: a term capturing how heterogeneity / misalignment between the cost matrix H and feature covariance Σ limits the ability of ridge regularization to neutralize manipulability.
- This decomposition shows that a highly manipulable feature can still be worth retaining when it is predictive and when the other retained features are comparably manipulable, so that any proxy for it would carry similar incentives to manipulate.
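To illustrate the predictive-loss component, the sketch below computes the conditional covariance of the dropped features R given the retained features S as a Schur complement, and the standard (non-strategic) increase in MSE from dropping R, θ*_R⊤ Σ_{R|S} θ*_R. This is the textbook quantity, not necessarily the exact term in the paper's Theorem 2; the function names and the covariance matrix are hypothetical.

```python
import numpy as np

def conditional_cov(Sigma, S):
    """Covariance of dropped features R given retained features S (Schur complement)."""
    d = Sigma.shape[0]
    S = list(S)
    R = [j for j in range(d) if j not in S]
    S_RR = Sigma[np.ix_(R, R)]
    S_RS = Sigma[np.ix_(R, S)]
    S_SS = Sigma[np.ix_(S, S)]
    return S_RR - S_RS @ np.linalg.solve(S_SS, S_RS.T), R

def predictive_loss(Sigma, theta_star, S):
    """Extra clean MSE of the best linear predictor on S versus on all features."""
    Sigma_R_given_S, R = conditional_cov(Sigma, S)
    theta_R = theta_star[R]
    return float(theta_R @ Sigma_R_given_S @ theta_R)

# Hypothetical covariance: features 0 and 1 are highly correlated, feature 2 is independent.
Sigma = np.array([[1.0, 0.9, 0.0],
                  [0.9, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
theta_star = np.array([1.0, 1.0, 1.0])

loss_drop_0 = predictive_loss(Sigma, theta_star, S=[1, 2])  # feature 1 proxies for feature 0
loss_drop_2 = predictive_loss(Sigma, theta_star, S=[0, 1])  # feature 2 has no proxy
```

With these numbers, dropping feature 0 costs 0.19 in clean MSE because feature 1 stands in for it, while dropping feature 2 costs 1.0. Whether either drop is worthwhile still depends on how manipulable the remaining features are.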
Algorithm and empirical work
- A practical two-stage algorithm: (i) continuous relaxation of the combinatorial support selection + ridge tuning problem, (ii) local discrete support refinement.
- Case study: simulated upcoding in a Medicare Advantage payments benchmark (calibrated to realistic coding patterns). The proposed joint-selection algorithm substantially improves strategic robustness relative to standard full-support models and naive "drop-most-manipulable" heuristics, while largely preserving predictive accuracy.
Policy takeaways
- Feature selection and coefficient regularization must be tuned jointly: the best support without regularization may not be best once ridge is applied.
- Dropping a manipulable feature is warranted only if there exists a less-manipulable proxy with similar predictive power; otherwise regularization or retaining the feature may be better.
- Coarse levers (selection + ridge) provide practical robustness benefits, including under uncertainty about manipulation costs.

Data & Methods

Theoretical analysis
- Closed-form characterizations for least-squares, ordinary ridge, and support-restricted ridge estimators under centering assumptions.
- Proof of a lower bound (irreducible gap) and an upper bound decomposing excess strategic MSE (Theorems 1 and 2). The upper bound depends on Σ (feature covariance), θ* (best linear projection), and H (manipulation cost matrix).
- Heterogeneity gap formalized via operator-norm distances comparing H^{-1}_{SS} to a scaled identity in the retained-support subspace; constants capture sensitivity to alignment and magnitude of coefficients.
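A sample-based sketch of a support-restricted ridge fit and its strategic MSE may help fix ideas. It assumes the model is fit on centered, unmanipulated data, that the penalty is λ‖θ_S‖² added to the mean squared error, and that at deployment every agent shifts its features by a = H^{-1}θ (an assumed best response to a score-maximizing objective). The function names, the synthetic data, and H are all hypothetical.

```python
import numpy as np

def fit_support_restricted_ridge(X, y, S, lam):
    """Ridge fit on centered clean data using only the columns in S; other coefficients are 0."""
    n, d = X.shape
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    XS = Xc[:, S]
    theta = np.zeros(d)
    theta[S] = np.linalg.solve(XS.T @ XS / n + lam * np.eye(len(S)), XS.T @ yc / n)
    return theta

def evaluate(X, y, theta, H=None):
    """MSE on centered data; if H is given, agents first shift features by a = H^{-1} theta."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    a = np.zeros_like(theta) if H is None else np.linalg.solve(H, theta)
    return float(np.mean(((Xc + a) @ theta - yc) ** 2))

# Synthetic data (hypothetical): feature 1 is a noisy proxy for feature 0.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
X[:, 1] = 0.9 * X[:, 0] + np.sqrt(1 - 0.81) * X[:, 1]
y = X @ np.array([1.0, 1.0, 1.0]) + rng.normal(scale=0.5, size=n)
H = np.diag([0.2, 5.0, 5.0])  # feature 0 is cheap to manipulate

theta_full = fit_support_restricted_ridge(X, y, [0, 1, 2], lam=0.0)
theta_drop = fit_support_restricted_ridge(X, y, [1, 2], lam=0.1)

clean_full = evaluate(X, y, theta_full)
strategic_full = evaluate(X, y, theta_full, H)
strategic_drop = evaluate(X, y, theta_drop, H)
```

Because the data are centered and the shift is constant across agents, the strategic MSE in this sketch equals the clean MSE plus (θ⊤H^{-1}θ)², so both shrinking coefficients and dropping cheaply manipulated features reduce the second term at some cost to the first.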
Algorithm
- Continuous relaxation of support-selection (makes the combinatorial problem amenable to optimization), followed by local greedy/discrete refinement to produce a final support and tuned λ.
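The joint search over supports and λ can be sketched as follows. This is a simplified stand-in, not the authors' algorithm: the continuous-relaxation stage is not reproduced and is replaced by starting from the full support, and the refinement stage is approximated by single-feature flips with a grid search over λ at each step. The objective assumes population moments, zero intercept, and the best response a = H^{-1}θ; all names and numbers are hypothetical.

```python
import numpy as np

def ridge_on_support(Sigma, cov_xy, S, lam):
    """Population ridge coefficients on support S (zeros elsewhere)."""
    theta = np.zeros(len(cov_xy))
    if S:
        idx = list(S)
        A = Sigma[np.ix_(idx, idx)] + lam * np.eye(len(idx))
        theta[idx] = np.linalg.solve(A, cov_xy[idx])
    return theta

def strategic_excess(Sigma, theta_star, H_inv, theta):
    """Strategic MSE above the noise floor: clean excess plus squared score shift."""
    diff = theta - theta_star
    return float(diff @ Sigma @ diff + (theta @ H_inv @ theta) ** 2)

def best_lambda(Sigma, theta_star, H_inv, S, lambdas):
    """Grid search over lambda for a fixed support; returns (value, lambda)."""
    cov_xy = Sigma @ theta_star
    scored = [(strategic_excess(Sigma, theta_star, H_inv,
                                ridge_on_support(Sigma, cov_xy, S, lam)), lam)
              for lam in lambdas]
    return min(scored)

def local_search(Sigma, theta_star, H, lambdas):
    """Flip one feature in or out at a time while the tuned objective improves."""
    d = Sigma.shape[0]
    H_inv = np.linalg.inv(H)
    S = frozenset(range(d))  # stand-in for the relaxation stage: start from full support
    best, lam = best_lambda(Sigma, theta_star, H_inv, sorted(S), lambdas)
    improved = True
    while improved:
        improved = False
        for j in range(d):
            cand = S ^ {j}
            val, cand_lam = best_lambda(Sigma, theta_star, H_inv, sorted(cand), lambdas)
            if val < best - 1e-12:
                S, best, lam, improved = cand, val, cand_lam, True
    return sorted(S), lam, best

# Hypothetical instance: feature 0 is cheap to manipulate and has a close proxy (feature 1).
Sigma = np.array([[1.0, 0.9, 0.0],
                  [0.9, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
theta_star = np.array([1.0, 1.0, 1.0])
H = np.diag([0.2, 5.0, 5.0])
lambdas = [0.0] + list(np.logspace(-3, 2, 30))

support, lam, best_val = local_search(Sigma, theta_star, H, lambdas)
```

Tuning λ inside every candidate support, rather than once at the end, is the point of the exercise: a support is compared at its own best penalty, which is how the ranking of supports can change once regularization is allowed.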
Empirical evaluation
- Realistic case study in healthcare payments: simulated upcoding calibrated to Medicare Advantage coding behavior drawn from prior policy work.
- Evaluation metric: strategic MSE after agents best-respond (i.e., measured after equilibrium manipulations).
- Baselines: full-support models, naive heuristics that drop the most-manipulable features, and other standard regularization choices.
- Results show joint selection + tuning yields better strategic MSE while retaining predictive performance.

Implications for AI Economics

For regulators and policy-makers (e.g., CMS and similar institutions)
- Naively excluding features because they appear manipulable may produce worse strategic outcomes than a considered joint policy of selective retention plus coefficient shrinkage. Policies should evaluate each feature by its predictive value, its manipulability, and its covariance with other features, considered together.
- When a less-manipulable proxy exists and is sufficiently predictive, dropping a manipulable feature can reduce incentives to manipulate with limited predictive cost—this clarifies when common policy choices (feature removal) are warranted.
- Regularization can be a low-friction, legally and operationally feasible lever to reduce gaming incentives; however, its effect depends on the heterogeneity and alignment of manipulation costs across features.
For economic design and welfare analysis
- The paper formalizes trade-offs that arise when decision rules are constrained by institutional inertia (coarse levers only). This makes explicit the welfare losses that cannot be eliminated without redesigning the predictor class.
- The irreducible gap quantifies the efficiency cost of restricting attention to legacy pipelines; this can guide decisions about whether to invest in full redesigns versus tuning coarse levers.
- The results bridge the proxy-means-test literature (which emphasizes verifiability and predictive power) and incentive-aware ML: selection must guard against both manipulation and predictive loss.
Practical research and policy priorities suggested
- Estimation and robustification of the manipulation-cost structure H is crucial: the paper’s prescriptions depend on knowledge (or uncertainty models) for H.
- Extensions to non-linear models, heterogeneous agents, and non-quadratic costs would improve applicability in settings where linearity or quadratic costs are poor approximations.
- Empirical validation in deployed settings (beyond simulated upcoding) would help quantify real-world gains (e.g., potential reduction in overpayments).
Overall: this work gives a principled, implementable framework for regulators and organizations who must operate within legacy prediction pipelines and need to mitigate strategic manipulation without full model redesign.

Assessment

Paper Type: theoretical
Evidence Strength: medium. The paper provides strong theoretical characterizations and an algorithm with simulation-based and one empirical case-study demonstration, which together give coherent and useful evidence; however, there is no experimental or quasi-experimental validation of agent behavior under the proposed interventions, and results hinge on modeling assumptions (costs, information, form of manipulation) that may limit external validity in real-world deployments.
Methods Rigor: high. Rigorous formal analysis of strategic classification with proofs characterizing optimal regularization for any feature subset, development of an algorithm for joint feature selection and ridge tuning, and systematic evaluation with simulations and a real-world benchmark demonstrate methodological thoroughness and careful theoretical work.
Sample: Theoretical model plus simulations; applied to a single real-world 'healthcare payments' benchmark dataset (used to illustrate and simulate strategic manipulation and to compare performance of feature-selection + ridge strategies versus naïve feature exclusion), but no randomized or longitudinal field experiment.
Themes: governance, org_design
Identification: Formal game-theoretic / strategic classification model: agents may manipulate observable features at a cost; the planner chooses a linear predictor with a subset of features and ridge regularization. Causal claims are derived analytically from the model and validated via simulations and a single real-world healthcare payments benchmark (simulation of manipulation on observed data), rather than from randomized or quasi-experimental variation in field data.
Generalizability:
- Relies on specific modeling assumptions about agent manipulation costs, information, and one-shot strategic responses; real behaviors may differ.
- Validated on one healthcare payments dataset; results may not generalize across other sectors, outcome types, or feature distributions.
- Focuses on linear predictors with ridge regularization; conclusions may not hold for non-linear models, other regularizers, or more complex pipelines.
- Does not account for institutional dynamics, repeated interactions, or equilibrium effects across multiple decision makers and agents.

Claims (7)

Claim	Category	Direction	Outcome	Confidence & Evidence	Details
Excluding individual features based on their manipulability alone is generally suboptimal.	Output Quality	negative	predictive performance / classifier performance after strategic manipulation	Reading fidelity: high; Study strength: high	0.2
The paper provides a fine-grained characterization of the performance of a feature subset under optimal (ridge) regularization.	Output Quality	positive	performance of feature subsets under optimal ridge regularization	Reading fidelity: high; Study strength: medium	0.12
The interaction between feature selection and ridge regularization yields new insights for policy design.	Governance And Regulation	positive	implications for policy design regarding choice of coarse levers (feature exclusion and regularization)	Reading fidelity: high; Study strength: medium	0.12
We develop a practical algorithm for jointly choosing the feature set and the level of ridge regularization.	Task Allocation	positive	ability to select feature subset and regularization parameter to mitigate strategic manipulation	Reading fidelity: high; Study strength: medium	0.12
Through a real-world case study on a healthcare payments benchmark, the algorithm can guide the design of coarse policy levers in practice.	Decision Quality	positive	practical guidance for feature exclusion and regularization choice in a healthcare payments prediction task	Reading fidelity: high; Study strength: medium	0.12
The results provide a principled, practical framework for mitigating the effects of strategic behavior in algorithmic decision-making systems.	AI Safety And Ethics	positive	mitigation of effects of strategic behavior on algorithmic decision outcomes	Reading fidelity: high; Study strength: medium	0.12
In practice, decision makers are often constrained to adjusting coarser levers within existing prediction pipelines (e.g., excluding perceived-manipulable features and using standard regularization).	Adoption Rate	null_result	typical operational constraints on predictor adjustment (feature exclusion, regularization)	Reading fidelity: high; Study strength: low	0.06