Microsoft GridSFM: Revolutionizing Power Grids
Microsoft has recently introduced GridSFM, a lightweight foundation model for managing electrical grids. The model predicts the solution of the alternating current optimal power flow (AC-OPF) problem in a few milliseconds, which promises gains in operational efficiency and significant cost savings. The stakes are large: the decisions that AC-OPF informs govern up to $20 billion per year in congestion costs and 3.4 TWh of curtailed renewable energy.
Beyond estimating generator dispatch and costs, GridSFM produces the complete state of the AC system, giving operators direct visibility into congestion, stability, and the overall health of the system. It also serves as a foundation on which the community can build advanced grid simulators and planning tools without having to recreate data or models from scratch.
A Model for the Future of Electrical Grids
Microsoft's model arrives at a time when electrical grids are under increasing pressure. With rising demand, the integration of renewable energy, the electrification of transportation, and extreme weather events, determining optimal operating points has become crucial. Doing so requires solving the alternating current optimal power flow (AC-OPF) problem, a complex, non-convex optimization problem that computes the least-cost generator dispatch (how much each generator produces) needed to meet demand while respecting the physics of power flow, voltage limits, thermal constraints, and stability requirements. This problem underpins essential operations of the electrical system, including reliability assessment, real-time dispatch, market clearing, and contingency analysis.
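For reference, a standard textbook statement of AC-OPF in polar form is sketched below. The notation is an assumption chosen for illustration and is not taken from the GridSFM release: quadratic cost coefficients $c_{2,g}, c_{1,g}, c_{0,g}$, bus admittance entries $G_{ij} + jB_{ij}$, angle differences $\theta_{ij} = \theta_i - \theta_j$, the set $\mathcal{G}_i$ of generators at bus $i$, and apparent branch flow $S_{ij}$.

```latex
\begin{aligned}
\min_{P^G,\,Q^G,\,|V|,\,\theta} \quad
  & \sum_{g \in \mathcal{G}} \left( c_{2,g}\,(P^G_g)^2 + c_{1,g}\,P^G_g + c_{0,g} \right) \\
\text{s.t.} \quad
  & \sum_{g \in \mathcal{G}_i} P^G_g - P^D_i
    = \sum_{j \in \mathcal{N}} |V_i|\,|V_j| \left( G_{ij}\cos\theta_{ij} + B_{ij}\sin\theta_{ij} \right)
    && \forall i \in \mathcal{N} \\
  & \sum_{g \in \mathcal{G}_i} Q^G_g - Q^D_i
    = \sum_{j \in \mathcal{N}} |V_i|\,|V_j| \left( G_{ij}\sin\theta_{ij} - B_{ij}\cos\theta_{ij} \right)
    && \forall i \in \mathcal{N} \\
  & V^{\min}_i \le |V_i| \le V^{\max}_i
    && \forall i \in \mathcal{N} \\
  & P^{\min}_g \le P^G_g \le P^{\max}_g, \quad Q^{\min}_g \le Q^G_g \le Q^{\max}_g
    && \forall g \in \mathcal{G} \\
  & |S_{ij}| \le S^{\max}_{ij}, \quad \theta^{\min}_{ij} \le \theta_{ij} \le \theta^{\max}_{ij}
    && \forall (i,j) \in \mathcal{E}
\end{aligned}
```

The non-convexity comes from the two power balance equations, which multiply voltage magnitudes together and pass the angle differences through sine and cosine.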
These decisions directly govern outcomes amounting to up to $20 billion per year in congestion costs and multiple terawatt-hours of renewable curtailment, so both economic efficiency and grid reliability are highly sensitive to how well these operating points are determined. However, AC-OPF is computationally expensive: a utility-scale grid can take hours to solve. Operators must therefore either solve a small number of carefully selected scenarios or rely on approximations that omit critical physics. Such approximations can misestimate power flows and binding constraints, leading to suboptimal dispatch and degraded reliability under stressed conditions.
Impressive Performance
To address this limitation, Microsoft introduces GridSFM, a single neural network that approximates the AC-OPF solution in milliseconds on networks ranging from 500 to 80,000 buses. It takes standard AC-OPF data (network topology, generator and load specifications, transmission line constraints) as input and produces an operating point together with a feasibility verdict (whether the system meets all physical and operational constraints). By removing the computational bottleneck, GridSFM makes it possible to evaluate orders of magnitude more scenarios in real time, supporting better-informed decisions and shifting network operations from reactive response to proactive optimization.
In this initial version, Microsoft offers two tiers: GridSFM-Open for research networks up to 4,000 buses, and GridSFM-Premier for production systems up to 80,000 buses. The model is built as a block-structured discrete neural operator that represents each network as a directed graph, with buses (connection points in the network) and generators as vertices, and transmission lines and AC lines as edges. It is trained with two signals: solver supervision, in which reference solutions are generated with an AC-OPF solver (IPOPT in PowerModels.jl), and physics-based penalties on violations of fundamental physical laws, such as Kirchhoff's voltage and current laws, and of operational constraints, such as thermal limits. This allows the model to learn from both feasible and infeasible regimes.
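A physics-based penalty of this kind can be illustrated with a minimal sketch. The source does not describe the actual form of GridSFM's loss, so the squared-hinge penalty, the `Bus` and `Branch` classes, and all numeric values here are hypothetical. The sketch covers only operational limits (voltage and thermal), not the Kirchhoff residuals.

```python
from dataclasses import dataclass


@dataclass
class Bus:
    v_min: float  # lower voltage magnitude limit (p.u.), hypothetical field
    v_max: float  # upper voltage magnitude limit (p.u.), hypothetical field


@dataclass
class Branch:
    from_bus: int
    to_bus: int
    s_max: float  # thermal limit on apparent power flow (p.u.), hypothetical field


def hinge(x: float) -> float:
    """Return x when it is positive, else 0, so only violations are penalized."""
    return max(0.0, x)


def constraint_penalty(buses, branches, v_pred, s_pred) -> float:
    """Sum of squared violations of voltage and thermal limits (assumed loss form)."""
    penalty = 0.0
    for bus, v in zip(buses, v_pred):
        penalty += hinge(bus.v_min - v) ** 2 + hinge(v - bus.v_max) ** 2
    for branch, s in zip(branches, s_pred):
        penalty += hinge(abs(s) - branch.s_max) ** 2
    return penalty


# Two-bus toy network with one line; all values are made up for illustration.
buses = [Bus(0.95, 1.05), Bus(0.95, 1.05)]
branches = [Branch(0, 1, s_max=1.0)]

# A prediction inside all limits incurs no penalty.
feasible = constraint_penalty(buses, branches, v_pred=[1.00, 0.98], s_pred=[0.8])
# A prediction with an undervoltage at bus 1 and an overloaded line is penalized.
violating = constraint_penalty(buses, branches, v_pred=[1.00, 0.90], s_pred=[1.2])
```

Because the penalty is zero inside the limits and grows smoothly outside them, a term like this can be added to a supervised loss without requiring a solver label, which is one way a model could learn from infeasible scenarios.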
Most learning-based substitutes for AC-OPF train one model per network on a narrow distribution. GridSFM takes the opposite approach: in this version, a single model is trained on over 150 base network topologies and approximately 500,000 scenarios covering varied load profiles, multi-element outages, line capacity reductions, tightened voltage limits, and different generator cost coefficients, so that the model is forced to generalize rather than memorize. On the 54 test scenarios for GridSFM-Open, the model achieves a median cost gap of 2.23% relative to the solver's ground-truth labels (mean of 3.41%; gap below 5% on 83% of scenarios). When more precision is needed, GridSFM's prediction can also serve as the starting point for a traditional numerical solver: a GridSFM-seeded warm start outperforms the cold-started solver by a factor of 1.66 in geometric mean across the same test scenarios, and beats the industry-standard DC-OPF by a factor of 1.59 in geometric mean. The geometric mean, also known as the multiplicative mean, is used here because it is more robust to outliers. The model can also adapt to new networks with only a few fine-tuning scenarios.
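The two summary statistics used above can be made concrete with a short sketch. The numbers below are made up for illustration and are not GridSFM results. The definition of the cost gap as a relative absolute difference is also an assumption, since the source does not spell it out.

```python
import math
import statistics


def cost_gap_percent(predicted_cost: float, solver_cost: float) -> float:
    """Relative absolute gap between predicted and solver cost, in percent (assumed definition)."""
    return 100.0 * abs(predicted_cost - solver_cost) / solver_cost


def geometric_mean(values) -> float:
    """Multiplicative mean: exp of the average of the logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical (predicted cost, solver cost) pairs for three scenarios.
costs = [(102.0, 100.0), (206.0, 200.0), (55.0, 50.0)]
gaps = [cost_gap_percent(p, s) for p, s in costs]
median_gap = statistics.median(gaps)
mean_gap = statistics.mean(gaps)

# Hypothetical per-scenario speedups (cold solve time / warm-start solve time).
speedups = [1.0, 2.0, 4.0]
geo_speedup = geometric_mean(speedups)
arith_speedup = statistics.mean(speedups)
```

In this toy data the single large gap pulls the mean above the median, and the largest speedup pulls the arithmetic mean above the geometric mean. This is why ratio-like quantities such as speedups are commonly summarized with the geometric mean.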
An Alternative to Traditional Methods
A common dilemma in network operations and planning is the choice between solving a small set of carefully selected scenarios accurately with the full AC-OPF and running thousands of scenarios through a faster approximation that neglects parts of the physics. A widely used example is the DC-OPF approximation, a linearized version that assumes constant voltage magnitudes and small angle differences while ignoring reactive power and losses. The DC approximation solves in seconds while the full AC problem takes minutes to hours, which is why most contingency screens, pre-market clearing stages, and planning sweeps today run on the DC approximation. The cost is real: the DC approximation entirely ignores voltage and reactive power constraints, and its dispatch cost can deviate from the AC optimum by more than 10% in stressed scenarios (worst-case networks exceed 20% in Microsoft's test benchmark).
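The linearization behind the DC approximation can be illustrated on a three-bus network. The sketch below solves a DC power flow, not a full DC-OPF: it computes angles and line flows for fixed injections under the assumptions of unit voltage magnitudes, small angle differences, and no losses or reactive power. The network, reactances, and loads are hypothetical.

```python
def dc_power_flow_3bus(x01, x02, x12, p1, p2):
    """DC power flow on a 3-bus triangle with bus 0 as slack (theta0 = 0).

    x01, x02, x12 are line reactances (p.u.); p1, p2 are net active power
    injections at buses 1 and 2 (p.u., negative for load).
    """
    # Line susceptances: under the DC assumptions, flow = b * (theta_i - theta_j).
    b01, b02, b12 = 1.0 / x01, 1.0 / x02, 1.0 / x12

    # Reduced susceptance matrix for the non-slack buses 1 and 2.
    b11 = b01 + b12
    b22 = b02 + b12
    b_off = -b12

    # Solve the 2x2 linear system [b11 b_off; b_off b22] [t1; t2] = [p1; p2]
    # with Cramer's rule, which keeps the example dependency-free.
    det = b11 * b22 - b_off * b_off
    theta1 = (p1 * b22 - b_off * p2) / det
    theta2 = (b11 * p2 - b_off * p1) / det

    flows = {
        (0, 1): b01 * (0.0 - theta1),
        (0, 2): b02 * (0.0 - theta2),
        (1, 2): b12 * (theta1 - theta2),
    }
    return (0.0, theta1, theta2), flows


# Hypothetical case: equal reactances, loads of 1.0 and 0.5 p.u. at buses 1 and 2.
angles, flows = dc_power_flow_3bus(0.1, 0.1, 0.1, p1=-1.0, p2=-0.5)
slack_generation = flows[(0, 1)] + flows[(0, 2)]
```

The slack bus supplies exactly the total load because the model has no losses, and nothing in it depends on voltage magnitude or reactive power. That is what makes it a fast linear solve, and also why it cannot detect voltage or reactive power violations.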
GridSFM is designed as a direct alternative to the DC approximation in this fast-approximation niche. Unlike most existing neural AC-OPF substitutes, which must be retrained for each new topology, GridSFM generalizes across networks in its supported size range without per-topology retraining, so it can be dropped in as universally as the DC approximation. Relative to the DC approximation and the full AC solver, GridSFM offers three concrete advantages:
- Same class of accuracy as the DC approximation on standalone dispatch cost. GridSFM and DC fall within the same distribution of per-scenario cost gaps, with complementary failure modes: DC fails on networks where its lossless linearization without reactive power is structurally incorrect, while GridSFM fails on networks outside its training distribution. The two limitations lie along orthogonal axes: the ceiling of DC is set by its linearization, while the tail of GridSFM shrinks with more training data.
- 1,000 times faster than a full AC solver and about 100 times faster than the DC approximation at inference, fast enough to sweep thousands of contingencies (e.g., line or generator outages) in minutes on a standard GPU.
- A true AC operating point, not a linear approximation. GridSFM produces voltages and reactive power, so the same prediction can be passed to a traditional numerical solver as an AC starting point, opening a workflow that the DC approximation cannot offer.
Feasibility Assessment: Stress Score Screening
A scenario is considered infeasible when no dispatch satisfies all constraints simultaneously: the requested load cannot be served within voltage limits, thermal limits, or generator capacities. Operationally, infeasibility is the most consequential failure signal: the requested operating condition cannot be served at all, and the response is an intervention (load shedding, redispatch, relaxing thermal limits). It is also the most expensive class of scenario to evaluate, because the solver only learns that a scenario is infeasible after iterating to non-convergence: each infeasible case costs a full solver run, often longer than a feasible case. Sweeping thousands of contingencies or stress cases to identify the infeasible ones is therefore among the most expensive tasks in any planning workflow.
GridSFM addresses this with a per-scenario stress score trained jointly with the dispatch head. The score is evaluated across three classes of scenarios on each network:
- real-feasible: scenarios on which the AC-OPF solver successfully converged (i.e., truly feasible operating points),
- real-infeasible: scenarios on which the solver failed to converge (truly infeasible operating points),
- synth-infeasible: feasible base points that were deliberately perturbed to violate a specific constraint (voltage compression, thermal bottleneck, angle tightening, or DC thermal congestion).
Across the 54 test scenarios, the per-network binary accuracy of the stress score is broadly uniform across classes: 94.5% on average for real-feasible, 96.1% for real-infeasible, and 90.4% for synth-infeasible. Most networks cluster within a few points of these averages; outliers below 80% are the same difficult networks that appear in the cost gap analysis below.
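Per-class binary accuracy of this kind could be computed as in the sketch below. The score range of 0 to 1, the 0.5 threshold, and the sample records are assumptions for illustration; the source does not specify how the stress score is thresholded.

```python
from collections import defaultdict


def per_class_accuracy(records, threshold=0.5):
    """Binary accuracy of a stress score, split by scenario class.

    records: iterable of (scenario_class, stress_score) pairs.
    A scenario is predicted infeasible when its score is at or above the
    threshold (assumed convention). Only "real-feasible" scenarios are
    truly feasible; both other classes are truly infeasible.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for scenario_class, score in records:
        predicted_infeasible = score >= threshold
        truly_infeasible = scenario_class != "real-feasible"
        total[scenario_class] += 1
        correct[scenario_class] += int(predicted_infeasible == truly_infeasible)
    return {name: correct[name] / total[name] for name in total}


# Hypothetical scores for one network; not GridSFM outputs.
records = [
    ("real-feasible", 0.1), ("real-feasible", 0.7),
    ("real-infeasible", 0.9), ("real-infeasible", 0.8),
    ("synth-infeasible", 0.6), ("synth-infeasible", 0.3),
]
accuracy = per_class_accuracy(records)
```

Reporting accuracy per class rather than in aggregate keeps a large feasible majority from hiding poor detection of the rarer infeasible cases.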