Source - Lind et al Baseline Methods (2023)


“Baseline methods in the context of modern distributed flexibility: an evaluation considering multi-DER types, markets, and product characteristics” (Lind et al., 2023, Utilities Policy). A systematic evaluation of nine baseline methodologies for DER flexibility market participation, with a proposed decision framework for method selection.

Document metadata

| Field | Value |
|---|---|
| Authors | Leandro Lind, José P. Chaves, Orlando Valarezo, Anibal Sanjab, Luis Olmos |
| Institutions | IIT-ICAI School of Engineering, Universidad Pontificia Comillas (Madrid); VITO/EnergyVille (Belgium) |
| Published | Utilities Policy, 2023 |
| DOI | 10.1016/j.jup.2023.101688 |
| Funding | CoordiNet project (Horizon 2020, No. 824414); BeFlexible project (No. 101075438) |
| Type | Peer-reviewed academic article (preprint available) |

Summary

The problem addressed: when a DER is activated for explicit flexibility, how do you determine how much flexibility was actually delivered? For large scheduled generators, the answer is easy — compare metered output to the committed schedule. For DERs, no individual schedule exists, so a counterfactual “what would this resource have done without activation?” must be estimated. This counterfactual is the baseline.
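As a minimal sketch of this definition: delivered flexibility is the per-interval difference between what was metered and what the baseline says would have happened. The function name `delivered_flexibility`, the sign convention (net injection positive) and the numbers are assumptions made for illustration, not taken from the paper.

```python
def delivered_flexibility(metered_kw, baseline_kw):
    """Per-interval delivered flexibility: metered minus baseline.

    Assumed sign convention: net injection is positive and consumption is
    negative, so an upward activation (more injection or less consumption)
    yields a positive value.
    """
    if len(metered_kw) != len(baseline_kw):
        raise ValueError("metered and baseline series must have equal length")
    return [m - b for m, b in zip(metered_kw, baseline_kw)]


# Illustrative numbers: a load that would have drawn 100 kW reduces its draw.
metered = [-70.0, -70.0, -80.0]
baseline = [-100.0, -100.0, -100.0]
flex = delivered_flexibility(metered, baseline)
print(flex)
```

Everything that follows in the paper is about how to obtain a credible `baseline` series, since the metered side is directly observable.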

The paper evaluates nine established methods against three criteria (accuracy, simplicity, integrity) across four DER types (Load-DR, Controllable DG, Non-controllable DG, Energy Storage Systems), multi-DER aggregation, and two product dimensions (direction: up/down; timing: real-time to weeks ahead). The central finding is that no one-size-fits-all baseline method exists.

Nine baseline methods

| Method | How it works | Best for | Key weakness |
|---|---|---|---|
| XofY | Average of X highest/mid/lowest days from the last Y eligible days | Load-DR (upward) | Upward bias (HighXofY); fails for weather-dependent DG/ESS |
| Rolling average | Average of last X same-type days (weekday/weekend), recency-weighted | Load-DR | Same as XofY; doesn’t capture DG/ESS variability |
| Comparable day | FSP selects an ex-post non-activation reference day | Non-controllable DG | Low integrity; FSP chooses own baseline |
| Regression | Statistical model (consumption = f(weather, season, past data)) | Load-DR, PV/wind with weather data | Complex; high simplicity cost |
| Machine learning | Neural network / ML techniques | Non-controllable DG, Load-DR | Very low simplicity; black-box risk |
| MBMA | Meter reading immediately before activation = baseline | Balancing services (short-duration) | Integrity risk for ESS (see below); inaccurate for long activations |
| Zero baseline | Baseline = 0; all production during activation = flexibility delivered | Backup generators, batteries providing upward production flexibility | Fails for consumption-side DR |
| Control group | Average of similar non-activating customers during activation | Multi-DER aggregation | Requires a valid comparison group; low integrity |
| Capacity limitation | Product defined as a power cap; no energy-delta baseline needed | DSO congestion management | Requires different clearing algorithms; primarily upward only |
| Self-reported | FSP reports its own baseline | Large industrial FSPs | Low integrity without verification |
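The two historical methods in the table can be sketched as follows. This is an illustration under stated assumptions, not the paper's specification: values are consumption magnitudes in kW for one hour of the day, the function names are hypothetical, and the `decay` parameter is an assumed way of expressing recency weighting.

```python
def x_of_y_baseline(daily_values, x, y, mode="high"):
    """Average of X days chosen from the last Y eligible days.

    daily_values: one value per eligible (non-activation) day for the same
    hour, oldest first. mode selects the highest, middle, or lowest X days.
    """
    window = sorted(daily_values[-y:])
    if not 1 <= x <= len(window):
        raise ValueError("x must be between 1 and the number of days available")
    if mode == "high":
        chosen = window[-x:]
    elif mode == "low":
        chosen = window[:x]
    elif mode == "mid":
        start = (len(window) - x) // 2
        chosen = window[start:start + x]
    else:
        raise ValueError("mode must be 'high', 'mid' or 'low'")
    return sum(chosen) / x


def rolling_average_baseline(daily_values, x, decay=1.0):
    """Recency-weighted average of the last X same-type days.

    decay is a hypothetical weighting parameter: the most recent day has
    weight 1, the day before has weight decay, then decay**2, and so on.
    decay=1.0 gives a plain average.
    """
    recent = daily_values[-x:]
    weights = [decay ** (len(recent) - 1 - i) for i in range(len(recent))]
    return sum(w * v for w, v in zip(weights, recent)) / sum(weights)


# Consumption (kW) in the same hour on the last 10 eligible days, oldest first.
history = [90, 95, 100, 105, 110, 98, 102, 97, 103, 100]
high_5_of_10 = x_of_y_baseline(history, x=5, y=10, mode="high")
plain_5_day = rolling_average_baseline(history, x=5)
print(high_5_of_10, plain_5_day)
```

On this made-up history the High 5 of 10 baseline sits above the plain 5-day average, which is the upward bias listed as the key weakness of HighXofY.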

Key analytical findings

MBMA integrity risk for batteries

MBMA (Meter-Before-Meter-After) takes the meter reading immediately before activation as the baseline. For batteries that provide upward flexibility through increased production (injection), this creates a manipulation opportunity: the operator can put the battery into charging just before the pre-activation reading, pushing the baseline to zero or below, and then switch to discharging during activation. The swing from the depressed baseline to the discharge level is measured as delivered flexibility, so this “gaming” inflates the result beyond what the battery actually contributed.

The recommended method for batteries providing production-side flexibility is the zero baseline: any injection during activation counts as delivered flexibility, which leaves no pre-activation reading to manipulate. SWITCH uses this approach for batteries providing increased production. (Source - SWITCH User Documentation (2026))
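A small numerical sketch makes the difference between the two methods concrete. It assumes net injection is positive (charging is negative) and uses made-up power levels; the function names are hypothetical and this is not SWITCH's actual settlement code.

```python
def mbma_flexibility(pre_activation_kw, during_activation_kw):
    """MBMA: the last reading before activation is the baseline."""
    return [p - pre_activation_kw for p in during_activation_kw]


def zero_baseline_flexibility(during_activation_kw):
    """Zero baseline: only injection during activation counts."""
    return [max(p, 0.0) for p in during_activation_kw]


# Assumed sign convention: net injection positive, charging negative.
discharge = [50.0, 50.0]                     # battery injects 50 kW when activated

honest = mbma_flexibility(0.0, discharge)    # battery was idle before activation
gamed = mbma_flexibility(-50.0, discharge)   # battery was charging at 50 kW at the reading
zero = zero_baseline_flexibility(discharge)

print(honest, gamed, zero)
```

Under MBMA the same 50 kW discharge is credited as 100 kW when the battery happened to be charging at the reading, while the zero baseline credits 50 kW regardless of what the battery did beforehand.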

Capacity limitation products and the baseline question

Capacity limitation products (where the DSO sets a power cap and the FSP must stay below it) appear to eliminate the baseline problem: the product is defined by the cap, not by an energy delta. However, the authors note that even capacity-cleared products often still require energy delivery validation post-activation — whether the FSP delivered the agreed energy volume. The Swedish context confirms this: SWITCH’s TO (Tillgänglighetsordrar) and DO (Direktordrar) products use capacity logic for market clearing (MW-based bids), but FSPs are still expected to deliver the awarded energy volume, requiring a baseline for validation. Capacity clearing and energy settlement are separable steps.
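The separation between the two steps can be sketched as two independent checks. This is a generic illustration with invented numbers, assuming consumption is recorded as positive kW in 15-minute intervals; it does not reproduce SWITCH's validation rules, and both function names are hypothetical.

```python
def cap_respected(metered_kw, cap_kw):
    """Capacity check: needs only the meter data and the agreed cap."""
    return all(p <= cap_kw for p in metered_kw)


def delivered_energy_kwh(metered_kw, baseline_kw, interval_hours):
    """Energy check: needs a baseline to compute the reduction.

    Assumes consumption is positive, so reduction = baseline - metered.
    """
    return sum((b - m) * interval_hours for m, b in zip(metered_kw, baseline_kw))


metered = [88.0, 90.0, 89.0, 90.0]     # kW, four 15-minute intervals
baseline = [92.0, 92.0, 92.0, 92.0]    # kW, estimated counterfactual
cap_kw = 90.0
awarded_kwh = 5.0                      # hypothetical awarded energy volume

cap_ok = cap_respected(metered, cap_kw)
energy = delivered_energy_kwh(metered, baseline, interval_hours=0.25)
energy_ok = energy >= awarded_kwh
print(cap_ok, energy, energy_ok)
```

In this example the FSP stays under the cap for the whole period yet delivers less energy than awarded, which is only visible once a baseline is available.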

Harmonisation across sequential markets

When a DSO local flexibility market (LFM) and a TSO balancing market operate sequentially, both drawing from the same portfolio, differing baseline methods create distortions. An FSP managing upward activations in both markets may face conflicting incentives if, for example, the TSO uses MBMA while the DSO uses XofY. The paper recommends harmonising baselines across interacting markets, which is relevant to future TSO-DSO coordination as the Network Code on Demand Response (NC DR) matures.
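To see why differing baselines distort incentives, consider one and the same load reduction measured by two markets. The numbers are invented and the baseline values are simply assumed to come from an MBMA reading and an XofY calculation respectively.

```python
def measured_reduction(baseline_kw, metered_kw):
    """Upward flexibility from a load: baseline consumption minus metered."""
    return baseline_kw - metered_kw


metered_during_activation = 70.0   # kW actually consumed during activation
tso_mbma_baseline = 95.0           # kW, assumed reading just before activation
dso_xofy_baseline = 104.0          # kW, assumed XofY result for the same hour

tso_view = measured_reduction(tso_mbma_baseline, metered_during_activation)
dso_view = measured_reduction(dso_xofy_baseline, metered_during_activation)
gap = dso_view - tso_view
print(tso_view, dso_view, gap)
```

The same physical action is worth 9 kW more in one market than in the other, so the FSP has a reason to steer its portfolio toward whichever market measures it more generously.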

Market timing

  • Real-time / balancing services: MBMA is the international standard (used in FCR, mFRR); no time for ex-ante calculation.
  • Day-ahead cleared products: XofY or rolling average, but the baseline calculation must exclude the hours between gate closure time (GCT) and activation to prevent gaming.
  • Long-term contracted products (ST/LongFlex): ex-ante calculation methods; regression or ML feasible.
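The exclusion rule for day-ahead products can be sketched as a timestamp filter applied before any historical method is run. The timestamps, readings and the function name `eligible_readings` are hypothetical, and the plain mean below only illustrates the effect of the filter; it is not a proper XofY calculation.

```python
from datetime import datetime


def eligible_readings(readings, gate_closure):
    """Keep only readings timestamped before gate closure (GCT).

    readings: list of (timestamp, kw) tuples.
    """
    return [(t, kw) for t, kw in readings if t < gate_closure]


gate_closure = datetime(2023, 1, 10, 12, 0)   # hypothetical GCT, day before delivery
readings = [
    (datetime(2023, 1, 9, 17, 0), 100.0),
    (datetime(2023, 1, 10, 11, 0), 98.0),
    (datetime(2023, 1, 10, 17, 0), 140.0),    # after GCT: FSP knows it was cleared
    (datetime(2023, 1, 11, 16, 0), 150.0),    # after GCT, shortly before activation
]

usable = eligible_readings(readings, gate_closure)
filtered_mean = sum(kw for _, kw in usable) / len(usable)
unfiltered_mean = sum(kw for _, kw in readings) / len(readings)
print(filtered_mean, unfiltered_mean)
```

Readings taken after gate closure are the ones an FSP could inflate once it knows it has been cleared, which is why they are dropped from the baseline input.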

DER-type matrix (summary)

  • Load-DR: historical methods (XofY, rolling average) are adequate — medium accuracy, high simplicity.
  • Non-controllable DG (wind, solar): regression or ML needed for accuracy (weather-driven output); XofY only works with same-day adjustment.
  • Controllable DG (backup generators): zero baseline is the most accurate (baseline IS zero when idle).
  • ESS: MBMA for consumption-side; zero baseline for production-side flexibility.
  • Multi-DER aggregation: no single method covers mixed portfolios well; submetering per technology type is the most accurate but costly; comparable day or control group are pragmatic alternatives.
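The matrix above can be captured as a simple lookup. The key names and the `recommend` function are hypothetical, and the table only encodes the summary bullets, not the paper's full decision framework.

```python
# Recommended baseline methods per DER type, as summarised in the bullets above.
RECOMMENDED_METHODS = {
    "load_dr": ["XofY", "rolling average"],
    "non_controllable_dg": ["regression", "machine learning"],
    "controllable_dg": ["zero baseline"],
    "ess_consumption": ["MBMA"],
    "ess_production": ["zero baseline"],
    "multi_der": ["submetering per technology", "comparable day", "control group"],
}


def recommend(der_type, side=None):
    """Return the summarised recommendation for a DER type.

    For energy storage ("ess"), side must be "consumption" or "production".
    """
    key = der_type
    if der_type == "ess":
        if side not in ("consumption", "production"):
            raise ValueError("ess requires side='consumption' or 'production'")
        key = f"ess_{side}"
    if key not in RECOMMENDED_METHODS:
        raise ValueError(f"unknown DER type: {der_type}")
    return RECOMMENDED_METHODS[key]


print(recommend("ess", side="production"))
print(recommend("load_dr"))
```

Storage is the only type where the recommendation depends on whether the flexibility is delivered on the consumption or the production side, so it is the only entry that needs a second argument.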

Connections to Swedish context

  • SWITCH MBMA — the default automatic baseline in SWITCH for consumption-side resources. The paper confirms this is the correct approach for short-duration balancing-type products, though accuracy degrades for activations longer than 1–2 hours.
  • SWITCH zero baseline (noll-referens) — used for battery resources providing production-side upward flexibility. The paper endorses this approach on both accuracy and integrity grounds.
  • NODES rolling average — sthlmflex used a 5-day rolling average as the standard NODES baseline (Source - sthlmflex säsong 3 (2022-2023)). The paper categorises this as a rolling average variant with medium accuracy and medium integrity for Load-DR.
  • NC DR Art. 43–44 settlement — future flexibility markets under NC DR will require standardised baseline methods. This paper provides the academic grounding for what ACER/Ei should require.
  • BeFlexible project — this paper was produced under the same BeFlexible project that funds SWITCH market demonstrations. The academic framework and the operational SWITCH design are therefore closely related.

Relevance to wiki topics

| Topic | Relevance |
|---|---|
| Baseline Methods | Primary source for the concept page |
| SWITCH | MBMA and zero-baseline methods explained; capacity/energy distinction clarified |
| NODES | NODES uses rolling average in sthlmflex; no MBMA |
| Aggregation | Multi-DER baseline challenge is the core aggregation settlement problem |
| Energy Storage | Battery-specific baseline recommendations; MBMA integrity risk |
| Flexibility Market | Baseline design is central to market settlement and FSP participation |
| Network Code on Demand Response | Future regulation will standardise baseline methods; NC DR Art. 43–44 |
| CoordiNet | Paper funded by CoordiNet; SWITCH’s MBMA baseline traces to CoordiNet design |