The Commonplace

A lightweight 'FutureBoosting' pipeline uses a frozen time-series foundation model to generate forecasted features and feeds them into an interpretable regressor, cutting electricity price forecasting MAE by more than 30% in the best cases on real-world market data. The approach is plug-and-play, computationally modest, and retains model interpretability, making it attractive for market participants seeking better bids and risk management.

Regression Models Meet Foundation Models: A Hybrid-AI Approach to Practical Electricity Price Forecasting
Yunzhong Qiu, Binzhu Li, Hao Wei, Shenglin Weng, Chen Wang, Zhongyi Pei, Mingsheng Long, Jianmin Wang · March 06, 2026
arXiv · descriptive · medium evidence · 7/10 relevance
FutureBoosting feeds a frozen time-series foundation model's forecasts of historical drivers into a downstream regression, reducing electricity price forecast MAE by more than 30% in the best cases while preserving regression interpretability.

Electricity market prices exhibit extreme volatility, nonlinearity, and non-stationarity, making accurate forecasting a significant challenge. While cutting-edge time series foundation models (TSFMs) effectively capture temporal dependencies, they typically underutilize cross-variate correlations and non-periodic patterns that are essential for price forecasting. Conversely, regression models excel at capturing feature interactions but are limited to future-available inputs, ignoring crucial historical drivers that are unavailable at forecast time. To bridge this gap, we propose FutureBoosting, a novel paradigm that enhances regression-based forecasts by integrating forecasted features generated from a frozen TSFM. This approach leverages the TSFM's ability to model historical patterns and injects these insights as enriched inputs into a downstream regression model. We instantiate this paradigm into a lightweight, plug-and-play framework for electricity price forecasting. Extensive evaluations on real-world electricity market data demonstrate that our framework consistently outperforms state-of-the-art TSFMs and regression baselines, achieving reductions in Mean Absolute Error (MAE) of more than 30% at most. Through ablation studies and explainable AI (XAI) techniques, we validate the contribution of forecasted features and elucidate the model's decision-making process. FutureBoosting establishes a robust, interpretable, and effective solution for practical market participation, offering a general framework for enhancing regression models with temporal context.

Summary

Main Finding

FutureBoosting, a paradigm that augments conventional regression forecasts with forecasted features produced by a frozen time-series foundation model (TSFM), substantially improves electricity price forecasting. Instantiated as a lightweight, plug-and-play framework, it consistently outperforms state-of-the-art TSFMs and regression baselines on real-world electricity market data, with the largest reductions in Mean Absolute Error (MAE) exceeding 30%. Ablations and XAI analyses validate the contribution of the forecasted features and show that the resulting model remains interpretable for practical market use.

Key Points

  • Problem: Electricity prices are highly volatile, nonlinear, and non-stationary. TSFMs capture temporal dependencies well but underutilize cross-variate and non-periodic patterns; regression models capture feature interactions but are restricted to inputs that are available at forecast time and therefore miss historical drivers.
  • Core idea: Use a frozen TSFM to produce forecasted versions of historical drivers (forecasted features), then feed those forecasts as additional inputs into a downstream regression model. This brings temporal context from the TSFM into the regression without requiring joint training.
  • Architecture: Two-stage pipeline
    • Stage 1: A frozen TSFM produces multi-step forecasts for selected variables (the forecasted features).
    • Stage 2: A downstream regression model (e.g., gradient-boosted trees) is trained using both regular future-available covariates and the forecasted features.
  • Practical design: Lightweight and plug-and-play — the TSFM is frozen (no end-to-end retraining), so the framework can leverage pretrained TSFMs and impose low additional computational cost.
  • Empirical results: Across multiple real-world electricity markets, FutureBoosting consistently outperforms TSFM-only and regression-only baselines; the largest MAE reductions exceed 30%.
  • Validation: Ablation studies show performance degrades when forecasted features are removed. Explainable AI (XAI) techniques (e.g., feature importance/SHAP analyses) indicate the forecasted features are among the top contributors to predictions.
  • Interpretability & deployment: The approach maintains interpretability of regression models while injecting temporal context; suitable for practical market participation (bidding, forecasting, risk management).
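
The two-stage pipeline described above can be sketched in a few lines. This is a minimal illustration under loud assumptions, not the paper's implementation: the data are synthetic, `frozen_forecaster` is a hypothetical seasonal-naive stand-in for a real frozen TSFM, and ordinary least squares stands in for the gradient-boosted regressor. All names (`load_known`, `driver`, `fit_linear`, and so on) are invented for this example.

```python
import numpy as np

def frozen_forecaster(history, horizon, season=24):
    """Hypothetical stand-in for a frozen TSFM: a seasonal-naive forecast.
    It has no trainable parameters, so it is 'frozen' by construction."""
    last_cycle = history[-season:]
    reps = int(np.ceil(horizon / season))
    return np.tile(last_cycle, reps)[:horizon]

def fit_linear(X, y):
    """Least squares with intercept (stand-in for the downstream regressor)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

# Synthetic hourly data (assumption): the price depends on a covariate that is
# known ahead of time and on a driver that is only observed historically.
rng = np.random.default_rng(0)
n_days, season = 30, 24
t = np.arange(n_days * season)
load_known = 50 + 10 * rng.standard_normal(t.size)
driver = 10 + 5 * np.sin(2 * np.pi * t / season) + 0.2 * rng.standard_normal(t.size)
price = 2 * load_known + 3 * driver + 0.5 * rng.standard_normal(t.size)

# Stage 1: for each day, forecast the driver using only its past values.
known_rows, forecast_rows, targets = [], [], []
for d in range(1, n_days):
    day = slice(d * season, (d + 1) * season)
    known_rows.append(load_known[day])
    forecast_rows.append(frozen_forecaster(driver[: d * season], season, season))
    targets.append(price[day])

X_known = np.concatenate(known_rows)[:, None]
X_forecast = np.concatenate(forecast_rows)[:, None]
X_full = np.hstack([X_known, X_forecast])
y = np.concatenate(targets)

# Stage 2: train the regressor with and without the forecasted feature,
# holding out the final day for evaluation.
split = -season
coef_base = fit_linear(X_known[:split], y[:split])
coef_fb = fit_linear(X_full[:split], y[:split])
mae_base = mae(y[split:], predict_linear(coef_base, X_known[split:]))
mae_fb = mae(y[split:], predict_linear(coef_fb, X_full[split:]))
print(f"MAE, known covariates only: {mae_base:.2f}")
print(f"MAE, with forecasted feature: {mae_fb:.2f}")
```

In a real deployment the stand-in forecaster would be replaced by calls to a pretrained TSFM and the linear fit by a tree-based model; the structure (forecast first, then regress on the enriched feature set) is the part this sketch is meant to convey.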

Data & Methods

  • Data: Real-world electricity market datasets (day-ahead / real-time prices and associated covariates). The paper evaluates across multiple markets and horizons to capture diverse volatility and regime behavior (exact markets/time ranges reported in the original experiments).
  • TSFM component:
    • A pretrained, state-of-the-art time-series foundation model captures historical temporal dependencies and produces multi-step forecasts for selected input variables.
    • The TSFM is kept frozen; there is no joint fine-tuning with the downstream model.
  • Forecasted features:
    • The TSFM is used to generate forecasts of historical drivers (variables that are historically informative but not observed at forecast time).
    • These forecasted variables become additional regressors available to the downstream model.
  • Downstream regression:
    • A flexible regression model (typically gradient-boosted decision trees or similar) is trained on (a) features actually available at forecast time and (b) the TSFM-generated forecasted features.
    • Training uses standard loss functions (MAE, RMSE) and cross-validation schemes appropriate for time series (rolling-origin evaluation).
  • Baselines and evaluation:
    • Baselines include standalone TSFMs, common regression baselines (e.g., tree-based models using only available features), and prior hybrid approaches where applicable.
    • Metrics: primarily MAE, with the largest reported reductions exceeding 30%; RMSE and calibration diagnostics are likely also reported.
  • Analysis:
    • Ablation studies remove or vary the forecasted features to measure marginal contribution.
    • XAI: feature attribution methods (e.g., SHAP or feature importance) to identify which forecasted features drive predictions and to improve interpretability.
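
The rolling-origin evaluation and the ablation comparison mentioned above can be sketched as follows. This is an illustrative outline rather than the paper's protocol: the function names, window sizes, and the example MAE values are assumptions chosen for the demonstration and are not taken from the paper.

```python
import numpy as np

def rolling_origin_splits(n_samples, initial_train, horizon, step=None):
    """Yield (train_idx, test_idx) pairs with an expanding training window.
    Each test block starts strictly after its training data ends."""
    step = step or horizon
    origin = initial_train
    while origin + horizon <= n_samples:
        yield np.arange(0, origin), np.arange(origin, origin + horizon)
        origin += step

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def relative_mae_reduction(mae_baseline, mae_model):
    """Percentage reduction in MAE relative to a baseline."""
    return 100.0 * (mae_baseline - mae_model) / mae_baseline

# Example: 10 samples, 4 initial training points, 2-step test blocks.
splits = list(rolling_origin_splits(n_samples=10, initial_train=4, horizon=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx.tolist()}")

# Ablation-style comparison with made-up MAE values (not from the paper).
print(relative_mae_reduction(mae_baseline=10.0, mae_model=6.5))
```

An ablation in this setup would fit the same regressor twice on every split, once with and once without the forecasted features, and average the per-split MAE before computing the relative reduction.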

Implications for AI Economics

  • Better market forecasts: More accurate, interpretable price forecasts improve bidding strategies, hedging, and operational decisions for generators, retailers, and system operators — reducing financial risk and improving market efficiency.
  • Hybrid modeling value: Demonstrates a practical, generalizable way to combine strengths of TSFMs (temporal pattern extraction) with regression models (feature interaction and interpretability). This hybridization can be applied to other economic time-series tasks (macro indicators, commodity prices, demand forecasting).
  • Low-cost deployment: Freezing the TSFM and using it as a feature generator lowers computational and data requirements compared with full joint training, easing adoption by market participants with limited ML resources.
  • Interpretability & regulation: Retaining regression interpretability helps satisfy transparency and regulatory requirements in electricity markets and other economic domains.
  • Risks & caveats:
    • Dependence on TSFM quality: If the TSFM produces biased or poor forecasts for certain regimes, those errors propagate into the regression model.
    • Distributional shifts/regime changes: The framework still requires robust evaluation under market regime shifts; periodic revalidation or TSFM updates may be necessary.
    • Potential for over-reliance on forecasted features: Need to monitor and regularize to avoid undue sensitivity to imperfect forecasts.
  • Directions for economic research:
    • Explore joint or adversarial training of TSFM + regressor, robustification to regime shifts, and extensions to multi-agent market simulations where agents use FutureBoosting-informed strategies.
    • Assess welfare impacts (e.g., efficiency, volatility) when many market participants adopt improved forecasting.
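
The caveat about dependence on TSFM quality can be made concrete for the simplest case. The derivation below assumes a linear downstream regressor, which is a simplification (the summary mentions gradient-boosted trees, for which no such closed form holds). The symbols are introduced here for illustration: $x_a$ are the covariates available at forecast time, $x_f$ the true values of the historical drivers, $\hat{x}_f$ their TSFM forecasts, and $\varepsilon$ the TSFM forecast error.

```latex
\begin{align}
\hat{y}(\hat{x}_f) &= \beta_0 + \beta_a^\top x_a + \beta_f^\top \hat{x}_f,
  \qquad \hat{x}_f = x_f + \varepsilon \\
\hat{y}(\hat{x}_f) - \hat{y}(x_f) &= \beta_f^\top \varepsilon \\
\bigl|\beta_f^\top \varepsilon\bigr| &\le \lVert \beta_f \rVert_2 \, \lVert \varepsilon \rVert_2
\end{align}
```

Under this assumption, with the coefficients held fixed, the TSFM's error passes into the price forecast scaled by the weights on the forecasted features, so the more the regressor relies on those features, the more sensitive it is to a poor TSFM forecast. This is one reason the monitoring and regularization mentioned above matter.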

Assessment

Paper Type: descriptive

Evidence Strength: medium. The paper provides strong empirical evidence that the proposed FutureBoosting pipeline improves out-of-sample electricity price forecasts (large MAE reductions reported across multiple markets, ablation tests show forecasted features drive gains, and XAI attributes importance to those features). However, the claims are limited to forecasting performance (not causal effects on market outcomes), depend on the quality and representativeness of the TSFM and datasets, and robustness to regime shifts and broader deployment scenarios is not fully established.

Methods Rigor: medium. The evaluation uses state-of-the-art components, appropriate rolling-origin cross-validation, multiple baselines, ablation studies, and feature-attribution analyses, which together constitute a solid empirical pipeline. Remaining concerns include potential hyperparameter tuning and selection details, unspecified exact market/time ranges in the summary, limited stress testing under structural breaks, and lack of external validation in production settings.

Sample: Multiple real-world electricity market datasets (day-ahead and real-time prices plus associated covariates) spanning several markets and forecasting horizons; a frozen TSFM applied to historical series to produce multi-step forecasts used as additional regressors for downstream regression models; evaluation via rolling-origin splits and MAE/RMSE metrics across markets and horizons.

Themes: productivity, adoption

Generalizability:
  • Evaluated only on electricity markets; performance may differ for other economic time series (commodities, macro aggregates, demand).
  • Depends on the quality and training data of the TSFM; poor or biased TSFM forecasts will propagate errors.
  • Sensitivity to market regime shifts (structural breaks, rare events) is unclear; may require frequent TSFM updates or revalidation.
  • Relies on availability of the same covariates and data frequency; settings with missing or delayed covariates may limit applicability.
  • Performance may vary with forecasting horizon, market structure (e.g., high renewable penetration), and geographic institutional differences.

Claims (13)

Claim | Direction | Confidence | Outcome | Details
FutureBoosting substantially improves electricity price forecasting. Other positive medium Mean Absolute Error (MAE) of electricity price forecasts
FutureBoosting substantially improves electricity price forecasting (MAE reported as primary metric)
0.11
FutureBoosting consistently outperforms state-of-the-art TSFMs and regression baselines. Other positive medium MAE (and other forecasting error metrics vs. baselines)
FutureBoosting consistently outperforms TSFM and regression baselines on MAE and other error metrics
0.11
The largest MAE reductions from using FutureBoosting exceed 30%. Other positive medium Relative reduction in Mean Absolute Error (percent)
Largest MAE reductions exceed 30%
0.11
The forecasted features produced by a frozen TSFM drive most of the predictive gains. Other positive high Attributable change in MAE when forecasted features are included vs. removed; feature importance ranks
Ablation: forecasted features produced by frozen TSFM drive most predictive gains (feature importance/MAE attributable change)
0.18
Performance degrades when forecasted features are removed from the downstream regression model. Other negative high Increase in MAE (worse forecast error) after removing forecasted features
Performance degrades (MAE increases) when forecasted features are removed (ablation result)
0.18
XAI analyses (e.g., SHAP / feature importance) indicate that forecasted features are among the top contributors to model predictions. Other positive high Feature attribution / importance ranking
XAI analyses (SHAP) indicate forecasted features are top contributors to predictions (feature attributions)
0.18
Freezing the TSFM (no joint fine-tuning) makes the framework lightweight and plug-and-play, lowering computational cost relative to joint training. Other positive medium Computational/deployment cost (qualitative claim about lower cost and ease of integration)
Freezing the TSFM (no joint fine-tuning) makes framework lightweight and plug-and-play (qualitative computational cost claim)
0.11
The approach preserves the interpretability of downstream regression models while injecting temporal context. Other positive medium Model interpretability (qualitative; feature-level explanations via XAI)
Approach preserves interpretability of downstream regression models while adding temporal context (qualitative interpretability claim)
0.11
FutureBoosting generalizes across multiple real-world electricity markets and forecast horizons. Error Rate positive medium MAE (and other error metrics) across different market datasets and horizons
0.11
If the TSFM produces biased or poor forecasts in certain regimes, those errors can propagate into the downstream regression and harm performance. Error Rate negative medium Downstream forecast error sensitivity to TSFM forecast quality
0.11
Distributional shifts and regime changes require periodic revalidation or TSFM updates to maintain reliable performance. Error Rate negative medium Robustness of forecasting performance under distributional/regime shifts
0.11
There is potential for over-reliance on forecasted features; monitoring and regularization are necessary to avoid undue sensitivity to imperfect forecasts. Error Rate mixed low Model sensitivity / stability with respect to forecasted-feature errors
0.05
The FutureBoosting hybridization approach can be generalized to other economic time-series forecasting tasks (e.g., macro indicators, commodity prices, demand forecasting). Output Quality positive speculative Forecast accuracy in other economic time-series domains (proposed/generalization)
0.02
