The Commonplace

Profit-aware classifiers tailored to operational limits improve detection of high-risk cold-start auto-insurance customers and raise expected portfolio profit relative to a standard baseline (p < 0.001). A balanced bagging ensemble gives the best practical trade-off for regulated deployment, while a lightweight Transformer delivers stronger robustness and generalization under data perturbations.

Advanced Insurance Risk Modeling for Pseudo-New Customers Using Balanced Ensembles and Transformer Architectures
Finn L. Solly, Raquel Soriano-Gonzalez, Angel A. Juan, Antoni Guerrero · April 17, 2026 · Risks
OpenAlex · correlational · medium evidence · 7/10 relevance · DOI · Source · PDF
Profit-aware balanced bagging ensembles and a lightweight Transformer both outperform a baseline in identifying high-risk cold-start auto insurance customers on a 51,618-customer dataset (p < 0.001), with the ensemble offering the best trade-off between profit, interpretability, and efficiency and the Transformer showing superior robustness under perturbations.

In insurance portfolios, classifying customers without a prior history at a given company is particularly challenging due to the absence of historical behavior, extreme class imbalance, heavy-tailed loss distributions, and strict operational constraints. Traditional machine learning approaches, including the baseline methodology proposed in previous studies, typically optimize global predictive accuracy and therefore fail to capture business-critical outcomes, especially the identification of high-risk clients. This study extends the existing approach by evaluating two complementary business-aware classification strategies: (i) a balanced bagging ensemble specifically designed to handle class imbalance and maximize expected profit under explicit customer-omission constraints, and (ii) a lightweight Transformer-based architecture capable of learning richer feature representations. Both approaches incorporate the asymmetric financial cost structure of insurance and operate under operational selection limits. The empirical analysis is conducted on a proprietary large-scale auto insurance dataset comprising 51,618 customers and is complemented by validation on nine synthetic datasets to assess robustness. Model performance is evaluated using statistical tests (ANOVA, Friedman, and pair-wise comparisons) together with business-oriented metrics. The results show that both proposed approaches consistently outperform the baseline methodology (p < 0.001) in terms of profit. The balanced ensemble provides the most favourable trade-off between predictive performance, robustness, interpretability, and computational efficiency, making it suitable for deployment in regulated insurance environments, while the Transformer achieves competitive results and exhibits stronger robustness and generalization under data perturbations.
The proposed approach aligns machine learning with actuarial portfolio optimization by explicitly integrating profit-driven objectives and operational constraints, offering two practical and scalable solutions for risk-based decision-making in real-world insurance settings.
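To make the asymmetric cost structure concrete, consider a toy expected-profit calculation (all premium, claim-probability, and severity figures below are hypothetical illustrations, not numbers from the paper): accepting a customer earns the premium but exposes the insurer to the claim probability times the expected claim severity, so one misclassified high-risk customer can erase the margin from many low-risk ones.

```python
# Toy illustration of the asymmetric cost structure (all figures hypothetical):
# accepting a high-risk customer (a false negative) costs far more than
# rejecting a profitable low-risk one (a false positive) forgoes.
def expected_profit(premium: float, p_claim: float, expected_severity: float) -> float:
    """Expected profit from accepting one customer."""
    return premium - p_claim * expected_severity

# Low-risk customer: small claim probability, modest severity.
low_risk = expected_profit(premium=600.0, p_claim=0.03, expected_severity=4000.0)   # 480.0
# High-risk customer: same premium, but heavy-tailed losses dominate.
high_risk = expected_profit(premium=600.0, p_claim=0.20, expected_severity=9000.0)  # -1200.0

print(low_risk, high_risk)
```

This asymmetry is why a model tuned for global accuracy can look good while losing money: the rare positives carry almost all of the financial risk.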

Summary

Main Finding

In a no-history insurance setting with extreme class imbalance and heavy-tailed losses, two business-aware classification strategies — a balanced bagging ensemble optimized for expected profit under customer-omission constraints, and a lightweight Transformer that learns richer feature representations — both substantially outperform the prior baseline in profit (p < 0.001). The balanced ensemble delivers the best practical trade-off (predictive performance, robustness, interpretability, computational efficiency) for regulated deployment, while the Transformer shows stronger robustness and generalization under data perturbations.

Key Points

  • Problem context: classifying new customers at an insurer with no prior company-specific history, under severe class imbalance, heavy-tailed claim costs, and strict operational selection limits.
  • Business-aware design: both methods explicitly incorporate the asymmetric financial cost structure of insurance and operational selection (omission) constraints rather than optimizing only global accuracy.
  • Proposed approaches:
    • Balanced bagging ensemble: handles class imbalance and is tuned to maximize expected profit while respecting omission constraints.
    • Lightweight Transformer: learns richer feature representations to improve generalization and robustness.
  • Evaluation:
    • Primary dataset: proprietary auto-insurance portfolio of 51,618 customers.
    • Robustness checks: nine synthetic datasets with controlled perturbations.
    • Metrics: business-oriented profitability metrics (expected profit under selection limits) and statistical tests (ANOVA, Friedman, pair-wise comparisons).
  • Results summary:
    • Both approaches outperform the baseline in profit (statistically significant, p < 0.001).
    • Balanced ensemble: better balance of performance, interpretability, and computational efficiency — well-suited for regulated environments.
    • Transformer: stronger robustness to data perturbations and better generalization in stressed scenarios.
  • Practical recommendation: use the balanced ensemble when interpretability, runtime, and regulatory constraints dominate; consider the Transformer when expecting significant distributional shifts and needing stronger generalization.
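One plausible reading of the "operational selection limits" above is a cap on the share of applicants the insurer may omit; a profit-aware classifier then ranks customers by predicted risk and omits only the riskiest, up to that cap. A minimal numpy sketch under that assumption (function and parameter names are mine, not the paper's):

```python
import numpy as np

def select_customers(risk_scores: np.ndarray, max_omission_rate: float = 0.10) -> np.ndarray:
    """Accept all customers except the k riskiest, where k is capped by the
    operational omission limit (a hypothetical reading of the constraint)."""
    n = len(risk_scores)
    k = int(np.floor(max_omission_rate * n))      # at most k customers may be omitted
    accept = np.ones(n, dtype=bool)
    if k > 0:
        riskiest = np.argsort(risk_scores)[-k:]   # indices of the k highest risk scores
        accept[riskiest] = False
    return accept

scores = np.array([0.02, 0.91, 0.10, 0.75, 0.05, 0.30, 0.08, 0.60, 0.01, 0.12])
mask = select_customers(scores, max_omission_rate=0.20)  # omit the 2 riskiest of 10
print(mask.sum())  # 8 customers accepted
```

Under this framing the classifier's job is not to label everyone correctly but to get the ranking right at the top of the risk distribution, which is exactly where profit-aware objectives differ from global accuracy.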

Data & Methods

  • Real data: proprietary autos portfolio, n = 51,618 customers (no prior company-specific behavior histories).
  • Synthetic validation: nine datasets designed to test robustness and generalization under various perturbations (e.g., distributional shifts, label noise, class imbalance severity).
  • Models:
    • Baseline: previously published methodology optimized for global predictive accuracy.
    • Balanced bagging ensemble: ensemble of classifiers trained on balanced resamples, objective aligned to expected profit under explicit omission/selection constraints.
    • Lightweight Transformer: compact attention-based architecture engineered to learn richer feature representations from available covariates.
  • Cost integration: asymmetric financial cost structure of claims and operational limits were encoded into the training/selection process to prioritize business-relevant outcomes (high-risk identification).
  • Evaluation framework:
    • Business metrics focused on expected profit under operational selection limits.
    • Statistical significance and ranking via ANOVA, Friedman tests, and pairwise comparisons to confirm consistent outperformance.
    • Computational and interpretability assessments for deployment suitability.
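The balanced resampling at the core of a balanced bagging ensemble can be sketched in a few lines of numpy: each bag bootstraps the minority (claim) class and undersamples the majority (no-claim) class to the same size, so every base learner trains on balanced data. This is a simplified illustration, not the authors' implementation; in practice one fits a base classifier per bag and averages their predictions (imbalanced-learn's `BalancedBaggingClassifier` packages this pattern).

```python
import numpy as np

rng = np.random.default_rng(42)

def balanced_bags(y: np.ndarray, n_bags: int = 5):
    """Yield index arrays for balanced bags: each bag bootstraps the minority
    (claim) class and undersamples the majority (no-claim) class to match,
    countering the extreme class imbalance described in the paper."""
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    m = len(minority)
    for _ in range(n_bags):
        bag = np.concatenate([
            rng.choice(minority, size=m, replace=True),   # bootstrap minority
            rng.choice(majority, size=m, replace=False),  # undersample majority
        ])
        yield bag

# Imbalanced toy labels: 2% claims, echoing the paper's extreme imbalance.
y = np.zeros(1000, dtype=int)
y[:20] = 1
for bag in balanced_bags(y, n_bags=3):
    print(len(bag), y[bag].mean())  # each bag: 40 customers, exactly 50% claims
```

The paper's profit-aware objective and omission constraints would then sit on top of this resampling step, shaping how the per-bag predictions are aggregated and thresholded.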

Implications for AI Economics

  • Aligning objectives: explicitly optimizing ML models for profit and operational constraints (rather than global accuracy) materially improves economically relevant outcomes in insurance — demonstrating the value of objective design that reflects firm incentives.
  • Decision-theoretic ML in markets: integrating asymmetric costs and selection limits provides a template for using ML to operationalize actuarial portfolio optimization and risk-based customer selection.
  • Adoption trade-offs:
    • Simpler ensemble methods can be preferable in regulated markets because they balance performance, interpretability, and efficiency — lowering compliance and audit friction.
    • More expressive models (Transformers) can improve robustness and generalization, which matters for firms facing frequent data shifts or using richer feature sets.
  • Market and policy considerations:
    • Better identification of high-risk clients affects adverse selection dynamics, pricing, and risk pooling; insurers should weigh effects on market segmentation and fairness.
    • Explicit profit-driven selection raises questions about transparency and regulatory scrutiny — methods that maintain interpretability will ease oversight.
  • Research directions:
    • Broader evaluation on multiple real portfolios and longitudinal deployment studies to measure long-run effects on portfolio risk and market behavior.
    • Integration of fairness constraints, causal analyses, and distributional robustness methods to balance profitability, equity, and regulatory compliance.

Assessment

Paper Type: correlational
Evidence Strength: medium. The paper provides strong predictive-evaluation evidence (large proprietary dataset, nine synthetic datasets, multiple statistical tests with p < 0.001) that its business-aware models improve profit-oriented classification for cold-start insurance customers; however, it does not establish causal effects on real-world economic outcomes (e.g., realized portfolio profit after deployment), relies on a single insurer's proprietary data and synthetic validation, and lacks field experimentation or external replication.
Methods Rigor: high. The authors use a substantial proprietary dataset (51,618 customers), complementary synthetic datasets to probe robustness, explicit incorporation of asymmetric cost structures and operational constraints, multiple model families (balanced bagging ensemble and Transformer), and appropriate statistical tests (ANOVA, Friedman, pairwise comparisons) plus business-oriented metrics; limitations include opacity from proprietary data, unspecified hyperparameter search/validation procedures in the summary, and no live A/B deployment.
Sample: Proprietary large-scale auto-insurance dataset of 51,618 cold-start customers (no prior history at the insurer) with highly imbalanced claim/no-claim labels and heavy-tailed loss amounts; models trained/evaluated under explicit operational selection limits and asymmetric financial cost structure; validation complemented by nine synthetic datasets designed to test robustness under data perturbations.
Themes: adoption, innovation
Generalizability:
  • Single proprietary dataset from one insurer: may reflect firm-specific underwriting, pricing, and population mix.
  • Product-specific (auto insurance): findings may not generalize to other insurance lines (life, health, commercial) or non-insurance risk decisions.
  • Cold-start customer focus: results pertain to new-customer classification and may not apply when historical policyholder data is available.
  • Regulatory and market differences (jurisdiction-specific pricing rules, claim definitions) may limit transferability.
  • Synthetic datasets may not capture all complexities of real-world data-generating processes or adversarial behavior.
  • Model performance depends on class imbalance severity, loss-tail behavior, and operational selection limits that vary across firms.

Claims (11)

Fields per claim: Outcome (category, measured variable) · Direction · Confidence · Details.

1. Classifying customers without a prior history at a given company is particularly challenging due to the absence of historical behavior, extreme class imbalance, heavy-tailed loss distributions, and strict operational constraints.
   Outcome: Other (classification_difficulty) · Direction: negative · Confidence: high · Details: 0.15
2. Traditional machine learning approaches, including the baseline methodology proposed in previous studies, typically optimize global predictive accuracy and therefore fail to capture business-critical outcomes, especially the identification of high-risk clients.
   Outcome: Output Quality (identification_of_high-risk_clients) · Direction: negative · Confidence: high · Details: 0.3
3. This study evaluates a balanced bagging ensemble specifically designed to handle class imbalance and maximize expected profit under explicit customer-omission constraints.
   Outcome: Firm Revenue (expected_profit_under_selection_constraints) · Direction: positive · Confidence: high · Details: 0.15
4. This study evaluates a lightweight Transformer-based architecture capable of learning richer feature representations for the cold-start insurance classification problem.
   Outcome: Output Quality (feature_representation_quality / predictive_performance) · Direction: positive · Confidence: high · Details: 0.15
5. Both proposed approaches incorporate the asymmetric financial cost structure of insurance and operate under operational selection limits.
   Outcome: Other (alignment_with_business_constraints) · Direction: positive · Confidence: high · Details: 0.15
6. The empirical analysis is conducted on a proprietary large-scale auto insurance dataset comprising 51,618 customers and is complemented by validation on nine synthetic datasets to assess robustness.
   Outcome: Other (dataset_size / robustness_assessment) · Direction: positive · Confidence: high · Details: n = 51,618; 0.5
7. Both proposed approaches consistently outperform the baseline methodology (p < 0.001) in terms of profit.
   Outcome: Firm Revenue (profit) · Direction: positive · Confidence: high · Details: n = 51,618; p < 0.001; 0.5
8. The balanced bagging ensemble offers a better balance of performance and efficiency compared to the Transformer and the baseline.
   Outcome: Organizational Efficiency (predictive_performance_and_computational_efficiency) · Direction: positive · Confidence: high · Details: n = 51,618; 0.3
9. The Transformer shows stronger robustness and generalization under data perturbations and achieves competitive results.
   Outcome: Output Quality (robustness / generalization (model performance under perturbation)) · Direction: positive · Confidence: high · Details: 0.3
10. The balanced ensemble provides the most favourable trade-off between predictive performance, robustness, interpretability, and computational efficiency, making it suitable for deployment in regulated insurance environments.
    Outcome: Adoption Rate (suitability_for_deployment / trade-off_between_metrics) · Direction: positive · Confidence: medium · Details: n = 51,618; 0.18
11. The proposed approach aligns machine learning with actuarial portfolio optimization by explicitly integrating profit-driven objectives and operational constraints, offering two practical and scalable solutions for risk-based decision-making in real-world insurance settings.
    Outcome: Decision Quality (risk-based_decision-making_effectiveness) · Direction: positive · Confidence: medium · Details: n = 51,618; 0.18

Notes