How OctoDash forecasts Agile electricity prices using ensemble modelling and independent data sources
Octopus Energy publishes tomorrow's Agile tariff rates each day around 4–5pm. Before that, you're flying blind. OctoDash fills this gap by predicting half-hourly prices up to 7 days ahead, so you can plan when to run appliances, charge your EV, or schedule high-energy tasks—even before official rates are published.
Rather than relying on a single forecast, OctoDash combines multiple independent data sources. Each uses a fundamentally different methodology, so their errors are largely uncorrelated: when one gets it wrong, the others often get it right. This is the core principle behind ensemble forecasting.
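A standard statistical result shows why uncorrelated errors matter. As an idealised sketch, assume n forecasts whose errors e_i are unbiased, mutually uncorrelated, and share the same variance σ² (these symbols are illustrative and not taken from OctoDash). The error of their simple average then has variance:

```latex
\[
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} e_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(e_i)
  = \frac{\sigma^{2}}{n}
\]
```

With three such forecasts, the typical error shrinks by a factor of √3, or about 1.7. Real forecasts share some inputs, such as weather, so their errors are partly correlated and the practical gain is smaller.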
Guy Lipman's supply-curve model is an independent forecast built on BMRS (Balancing Mechanism Reporting Service) data from National Grid ESO. It models the electricity supply curve using real generation data, fuel costs, interconnector flows, and demand forecasts to estimate the wholesale day-ahead price that drives Agile rates.
This model excels during daytime hours when supply-demand fundamentals are the dominant price driver. It updates daily with the latest grid data.
AgilePredict is a machine-learning forecast trained on years of historical Agile pricing data combined with weather, demand patterns, and market signals. It provides not just a central price estimate but also a confidence interval (low/high bounds) for each half-hour slot.
The confidence bounds are particularly valuable: tight bounds mean the model is confident, while wide bounds signal genuine uncertainty. AgilePredict updates multiple times daily, tightening its bounds as more data becomes available.
Our own model analyses historical pricing patterns, incorporating time-of-day profiles, seasonal adjustments, weather conditions, and recent price trends. It acts as a stabilising anchor when external forecasts disagree or are unavailable.
The historical model is continuously validated against actual prices using rolling backtests, so its weight in the ensemble is always grounded in real-world accuracy.
Combining forecasts isn't as simple as averaging them. OctoDash uses an inverse-variance weighted ensemble, which means more accurate models receive more influence over the final prediction.
Each data source is queried independently. We retrieve Guy Lipman's supply-curve forecast, AgilePredict's ML estimates with confidence bounds, current weather data, and recent historical pricing patterns.
For each model, we continuously track how accurate its past predictions have been using rolling backtests against actual published Agile rates. This gives each model a measured mean absolute error (MAE) that reflects its real performance, not theoretical claims.
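A rolling backtest of this kind can be sketched as follows. This is a minimal illustration, not OctoDash's actual code: the `RollingMAE` class and the 7-day window of 336 half-hours are assumptions chosen for the example.

```python
from collections import deque
from typing import Optional


class RollingMAE:
    """Mean absolute error over the most recent forecast/actual pairs."""

    def __init__(self, window: int = 336):  # 336 half-hours = 7 days (assumed)
        # deque with maxlen drops the oldest error once the window is full.
        self.errors = deque(maxlen=window)

    def record(self, predicted_p: float, actual_p: float) -> None:
        # Store the absolute error in pence per kWh.
        self.errors.append(abs(predicted_p - actual_p))

    def mae(self) -> Optional[float]:
        # None until at least one published rate has been compared.
        if not self.errors:
            return None
        return sum(self.errors) / len(self.errors)
```

One tracker per model would be updated each time Octopus publishes actual rates, so older errors age out of the window automatically.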
We also track accuracy by half-hour slot. That means the model choice at 04:00 can differ from 17:00 if recent evidence says one source performs better in that specific time bucket.
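Per-slot tracking could look something like the sketch below, which groups backtest errors into the 48 half-hour slots of the day. The function names and the record layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def slot_index(ts: datetime) -> int:
    """Map a timestamp to its half-hour slot of the day (0 to 47)."""
    return ts.hour * 2 + (1 if ts.minute >= 30 else 0)


def mae_by_slot(records):
    """Compute MAE per slot from (timestamp, predicted_p, actual_p) tuples."""
    buckets = defaultdict(list)
    for ts, predicted_p, actual_p in records:
        buckets[slot_index(ts)].append(abs(predicted_p - actual_p))
    # Only slots that have at least one record appear in the result.
    return {slot: sum(errs) / len(errs) for slot, errs in buckets.items()}
```

Running this once per model gives a separate error figure for, say, slot 8 (04:00) and slot 34 (17:00), which is what allows the weighting to differ between them.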
Models with lower historical error receive more weight through inverse-variance weighting: each model's weight is proportional to 1/MAE², with the squared MAE standing in for the error variance. If one model's MAE is half another's, it gets roughly four times the influence. This automatically adapts as model performance changes over time.
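The weighting rule can be sketched in a few lines. This is an illustration under stated assumptions: the function names, the model keys, and the small `eps` guard against a zero MAE are not taken from OctoDash.

```python
from typing import Dict


def inverse_variance_weights(mae_by_model: Dict[str, float]) -> Dict[str, float]:
    """Normalised weights proportional to 1 / MAE^2."""
    eps = 1e-6  # assumed guard so a perfect backtest cannot divide by zero
    raw = {name: 1.0 / max(mae, eps) ** 2 for name, mae in mae_by_model.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


def blend(predictions_p: Dict[str, float], mae_by_model: Dict[str, float]) -> float:
    """Weighted average over the models that returned a prediction."""
    available = {m: mae_by_model[m] for m in predictions_p if m in mae_by_model}
    if not available:
        raise ValueError("no model has both a prediction and a backtest MAE")
    # Renormalising over available models handles a source being offline.
    weights = inverse_variance_weights(available)
    return sum(weights[m] * predictions_p[m] for m in weights)
```

For two models with MAEs of 1.0p and 2.0p, the weights come out at about 0.8 and 0.2, a four-to-one ratio. Blending predictions of 20p and 25p with those weights gives about 21p.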
Different models perform better at different times. OctoDash applies an explicit day/night regime: daytime and peak periods favour Guy Lipman's fundamentals-led model, while evening and overnight periods favour AgilePredict's ML signal. The blend applies this weighting per half-hour slot.
On top of this, OctoDash applies regime guardrails: Guy's share is capped overnight unless corroborated by tight ML bounds, while daytime/peak slots can enforce a minimum Guy share when recent error data supports it.
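One way such guardrails might work is sketched below. Every threshold here is an assumption for illustration (a 40% overnight cap, a 50% daytime floor, and a 3p width counting as "tight"), as are the function name, the `"guy"` key, and the reading of "corroborated" as the supply-curve price sitting inside tight ML bounds. The sketch also assumes the input weights sum to 1.

```python
from typing import Dict


def apply_guardrails(
    weights: Dict[str, float],
    is_overnight: bool,
    guy_price_p: float,
    ml_low_p: float,
    ml_high_p: float,
    guy_recent_mae: float,
    ml_recent_mae: float,
    night_cap: float = 0.4,      # assumed overnight cap on Guy's share
    day_floor: float = 0.5,      # assumed daytime minimum for Guy's share
    tight_width_p: float = 3.0,  # assumed width that counts as tight bounds
) -> Dict[str, float]:
    guy = weights["guy"]
    target = guy
    if is_overnight:
        tight = (ml_high_p - ml_low_p) <= tight_width_p
        corroborated = tight and ml_low_p <= guy_price_p <= ml_high_p
        if not corroborated:
            target = min(guy, night_cap)
    elif guy_recent_mae <= ml_recent_mae:
        # Recent error data supports enforcing the daytime floor.
        target = max(guy, day_floor)

    others = 1.0 - guy
    if target == guy or others <= 0.0:
        return dict(weights)
    # Rescale the remaining models so all weights still sum to 1.
    scale = (1.0 - target) / others
    adjusted = {name: w * scale for name, w in weights.items() if name != "guy"}
    adjusted["guy"] = target
    return adjusted
```

Rescaling the other models proportionally keeps their relative ranking intact, so the guardrail only changes how much the supply-curve model contributes.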
When multiple independent models agree on a price, confidence increases significantly. When they disagree, the final prediction leans toward the model currently weighted highest for that time of day. AgilePredict's confidence bounds provide an additional check—if the supply-curve price falls within the ML model's predicted range, it's a strong signal of consensus.
Final predictions are sanity-checked against historical price ranges and the ML model's confidence interval to catch outliers. Extreme values are tempered while still allowing genuinely unusual prices (like negative rates) to come through when the evidence supports them.
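As a rough sketch of what "tempering" could mean, the function below pulls a prediction part of the way back when it falls outside both the historical range and the ML bounds. The function name, the 50% damping factor, and the choice to take the widest combined band are assumptions, not OctoDash's documented behaviour.

```python
def temper(
    price_p: float,
    hist_low_p: float,
    hist_high_p: float,
    ml_low_p: float,
    ml_high_p: float,
    damping: float = 0.5,  # assumed: keep half of any excess beyond the band
) -> float:
    # The allowed band is the widest range either source considers plausible.
    low = min(hist_low_p, ml_low_p)
    high = max(hist_high_p, ml_high_p)
    if price_p > high:
        return high + (price_p - high) * damping
    if price_p < low:
        return low - (low - price_p) * damping
    # Inside the band: pass through untouched, including negative prices.
    return price_p
```

Because the band uses the lower of the two lower bounds, a negative prediction passes through unchanged whenever the ML interval extends below zero, which matches the intent of letting unusual prices through when the evidence supports them.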
Every prediction comes with a confidence percentage. This isn't a vague guess—it's calculated from multiple measurable factors:
High confidence: Multiple models agree closely, ML confidence bounds are tight, backtest accuracy is strong, and weather data is available.
Medium confidence: Models show some disagreement, or ML bounds are wider than usual. Common for predictions 2–3 days ahead or during volatile weather.
Low confidence: Significant model disagreement, wide ML bounds, limited data sources, or extended forecast horizon (4–7 days out).
Model agreement: When Guy Lipman's supply-curve price falls within AgilePredict's ML bounds, it means two completely independent approaches are confirming the same price range.
Tight ML bounds: When AgilePredict's low/high range narrows, the ML model itself is more certain. A 3p range carries far more weight than a 15p range.
Strong backtests: When recent predictions for this time of day have been accurate, confidence in the current prediction is boosted.
Model disagreement: When the supply-curve and ML models predict significantly different prices, confidence drops. The size of that drop is scaled by how much weight each model actually carried in the blend for that slot.
Wide ML bounds: If the ML model itself is uncertain (wide confidence interval), our confidence should reflect that honestly.
Forecast horizon: Predictions further in the future are naturally less certain. Tomorrow's forecast carries higher base confidence than next week's.
Historical volatility: Some half-hour slots are inherently more volatile than others (e.g. the 4–7pm peak). Confidence accounts for this.
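The factors above could be combined into a single percentage along the following lines. This is purely a sketch: the `SlotEvidence` fields, every coefficient, and the choice to scale the disagreement penalty by the smaller of the two blend weights are invented for illustration and do not reflect OctoDash's real scoring.

```python
from dataclasses import dataclass


@dataclass
class SlotEvidence:
    supply_price_p: float     # supply-curve prediction
    ml_price_p: float         # ML central estimate
    ml_low_p: float           # ML lower bound
    ml_high_p: float          # ML upper bound
    slot_mae_p: float         # recent backtest MAE for this slot
    slot_volatility_p: float  # historical price volatility for this slot
    days_ahead: int           # 1 = tomorrow
    supply_weight: float      # effective blend weight of the supply-curve model
    ml_weight: float          # effective blend weight of the ML model


def confidence_percent(e: SlotEvidence) -> float:
    # Base confidence falls with forecast horizon (assumed 8 points per day).
    score = 90.0 - 8.0 * max(e.days_ahead - 1, 0)
    # Boost when the supply-curve price sits inside the ML bounds.
    if e.ml_low_p <= e.supply_price_p <= e.ml_high_p:
        score += 5.0
    # Penalise ML bounds wider than an assumed 3p tight range.
    score -= 1.5 * max((e.ml_high_p - e.ml_low_p) - 3.0, 0.0)
    # Penalise weak backtests and inherently volatile slots.
    score -= 2.0 * e.slot_mae_p
    score -= 1.0 * e.slot_volatility_p
    # Disagreement matters less when one model carries little blend weight.
    gap = abs(e.supply_price_p - e.ml_price_p)
    score -= 2.0 * gap * min(e.supply_weight, e.ml_weight)
    return max(0.0, min(100.0, score))
```

An additive score clamped to the 0–100 range keeps each factor's contribution easy to inspect, though a real system might calibrate the coefficients against observed hit rates.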
No single forecast model is best all the time. The supply-curve model can miss demand spikes that the ML model catches from pattern recognition. The ML model can be thrown off by unprecedented market events that the supply-curve model handles through fundamental analysis. Our historical model provides stability when external sources are temporarily unavailable or publishing unusual values.
By blending all three with accuracy-weighted averaging, the ensemble consistently outperforms any individual source. This is a well-established principle in forecasting known as the "wisdom of crowds"—combining independent estimates produces better results than relying on a single expert.
OctoDash takes this further by continuously measuring each model's accuracy through rolling backtests and adjusting weights in real time. If one model starts performing better in a particular season or time of day, the ensemble automatically shifts to give it more influence.
We believe prediction confidence should be honest. If we're uncertain about a price, we'd rather show you a lower confidence score than pretend we know something we don't. Every confidence percentage in OctoDash is grounded in measurable data—model agreement, effective blend weights, backtest accuracy, and ML confidence bounds—not arbitrary estimates.
Download OctoDash to get half-hourly price predictions up to 7 days ahead with confidence scores for every slot.