Machine Learning-Based Power Flow Solver

GridLAB-D TAC Meeting, September 6, 2019

Hi, my name is Lily Buechler. I’m a PhD student at Stanford, studying mechanical engineering. I’m going to talk about our preliminary work on developing a machine learning-based power flow solver that will be integrated into GridLAB-D. A lot of this work was done by another PhD student in our group, Siobhan Powell, who has put a lot of work into this over the last few months.

The Power Flow Problem
At the core of any grid simulation engine, like GridLAB-D, is a power flow solver, whose purpose is to solve the so-called power flow equations. This system of equations relates the real and reactive power injections at each bus in the network to the magnitude and phase of the voltage at each bus. Generally, in grid simulations, the real and reactive power are the inputs to the problem, as well as the topology and characteristics of the network (the admittance matrix), and the voltage magnitude and phase are the variables that you are trying to solve for.

These equations implicitly assume that the voltages at any given time instant depend only on the power injections at that time instant, and not on any previous history. That assumption, which is called quasi-static power flow, is a key assumption used in many simulation engines. It simplifies the power flow solution by decoupling the solver solutions in time.

Now, these equations are highly nonlinear, which means that in general there is no explicit expression for the outputs, which are the voltages, in terms of the inputs.
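For reference, the relationship described above can be written in its standard single-phase, polar form. The notation is an assumption, since the talk does not define symbols: P_i and Q_i are the net real and reactive power injections at bus i, |V_i| and \theta_i are the voltage magnitude and phase at bus i, and G_{ik} + jB_{ik} is the (i, k) entry of the admittance matrix for an N-bus network. GridLAB-D solves a 3-phase unbalanced formulation, so this is only the textbook special case.

```latex
\begin{aligned}
P_i &= \sum_{k=1}^{N} |V_i|\,|V_k| \left( G_{ik}\cos(\theta_i - \theta_k) + B_{ik}\sin(\theta_i - \theta_k) \right) \\
Q_i &= \sum_{k=1}^{N} |V_i|\,|V_k| \left( G_{ik}\sin(\theta_i - \theta_k) - B_{ik}\cos(\theta_i - \theta_k) \right)
\end{aligned}
```

The products of voltage magnitudes and the trigonometric terms in the phase differences are what make the system nonlinear in the unknowns.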

Standard Approach: Newton-Raphson Iteration
Newton-Raphson method: an iterative root-finding method. Steps: initial guess, evaluate function, linearize model, update guess. Seed the initial guess with the last solver solution. New approach: replace the Newton-Raphson solver with a data-driven approximation.

So in order to solve the equations, you have to take an iterative approach. There are various iterative methods for solving the power flow equations. The standard one is called Newton-Raphson iteration, which is an iterative root-finding method. The basic idea is that you want to find the solution to an equation. So you first guess a solution. Then you evaluate the function at that guess. Then you linearize the model at that point, as an approximation, which in this visualization is just the tangent line to the curve. And then you use that linearization to make a better guess. You repeat that until your guess is close enough to the actual solution.

Newton-Raphson has been used extensively for power flow, and it is very efficient. Each iteration might only take a couple of milliseconds, and the algorithm can typically converge in only a few iterations. The issue is that generally an extremely large number of solutions are needed each time you use the grid simulator, and those add up. Say you are using GridLAB-D to do a grid integration study, you have tens or hundreds of scenarios you want to evaluate, and you want to run each simulation over, say, several months or years, with a timestep on the order of seconds to minutes to hours. That is going to require tens to hundreds of millions of NR solver solutions, and that can be very computationally expensive.

So the idea that we are proposing here is that, instead of iteratively solving these equations, we can use methods from statistical learning and data science to learn the relationship between these variables, and then solve for the solution explicitly (requiring no iteration), which is much more efficient.
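The guess, evaluate, linearize, update loop described above can be sketched for a single scalar equation. This is only an illustration under assumptions: the function name newton_raphson, the tolerance, and the test equation x^2 - 2 = 0 are hypothetical, and a real power flow solver applies the same idea to a vector of unknowns with a Jacobian matrix in place of the slope.

```python
def newton_raphson(f, dfdx, x0, tol=1e-10, max_iter=50):
    """Find a root of f starting from x0; returns (root, iterations used)."""
    x = x0
    for iteration in range(1, max_iter + 1):
        fx = f(x)                # evaluate the function at the current guess
        slope = dfdx(x)          # linearize: slope of the tangent line
        if slope == 0.0:
            raise ZeroDivisionError("tangent is flat; cannot update the guess")
        x_new = x - fx / slope   # the root of the tangent line is the next guess
        if abs(x_new - x) < tol:
            return x_new, iteration
        x = x_new
    raise RuntimeError("Newton-Raphson did not converge")


def f(x):
    return x * x - 2.0


def dfdx(x):
    return 2.0 * x


# Cold start from a poor guess versus a warm start near the solution,
# mimicking "seed the initial guess with the last solver solution".
root, iters_cold = newton_raphson(f, dfdx, x0=10.0)
_, iters_warm = newton_raphson(f, dfdx, x0=1.41)
print(root, iters_cold, iters_warm)
```

The warm start needs fewer iterations than the cold start, which is why seeding with the previous solution helps. For a sense of scale, 100 scenarios simulated for one year at a 1-minute timestep already amount to 100 x 525,600 = 52.56 million solves.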

Literature on Data-Driven Power Flow Approximation
Yu, Jiafan, Yang Weng, and Ram Rajagopal. "Robust mapping rule estimation for power flow analysis in distribution grids." 2017 North American Power Symposium (NAPS). IEEE, 2017. Development of an SVM-based power flow approximation. Forward and backward power flow mappings. Comparison with regression-based methods.

Our group at Stanford has done some past work in this area. PhD student Jiafan Yu, who recently graduated, did some work a couple of years ago on developing a support vector machine-based approximation to the power flow equations, and compared that to regression-based methods. The study generally found that SVMs are more accurate than linear regression, but are more computationally expensive. If you use the right kernel functions, you can recover the actual power flow equations with a high degree of accuracy. This study was a great proof of concept.

Challenges for Implementation in GridLAB-D
Model formulation: 3-phase unbalanced power flow; current injection method; ZIP load representation. Scalability: Which modeling methods work well in high dimension? How much training data is needed? Accuracy: How often should the model be re-trained? Can model accuracy be predicted? Computation speed: Can the number of Newton-Raphson iterations be predicted? How does an ML approach compare to existing methods for speeding up power flow simulations? ZIP load model.

But there are a number of challenges that have to be addressed to actually implement this type of methodology in a simulation engine like GridLAB-D. Some of these we have analyzed already, but there are still many more we will be investigating in the future.

First, GridLAB-D simulates 3-phase unbalanced power flow, as opposed to single phase, which is what has been looked at in the past for this data-driven approximation. That makes the model formulation more complicated. Also, in the original study, the loads were assumed to be constant power loads, which means they are not dependent on voltage. But in GridLAB-D, many load models use a ZIP load representation, where the power consumption is a polynomial function of the voltage. So our algorithm has to be adapted for that as well.

Scalability is a key requirement. Certain methods, like SVM, may have high accuracy but don’t scale well to high dimension. We want these methods to be useful for large networks. Different methods also require different amounts of training data to learn a good model.

Another key requirement is accuracy. The general idea is that you would use the outputs from the standard NR solver to train the model. What training data is optimal for learning a model of a system that we know is going to be in a certain operating regime most of the time? The standard NR method has a guaranteed degree of accuracy. The question is, can we do the same for a data-driven model? Can we estimate the model accuracy for certain model inputs, without having to use NR to verify? Can we design methods to decide when the model should be retrained, for example when the topology of the network changes?

And finally, computational speed. Can we estimate the number of Newton-Raphson iterations that it is going to take to reach a solution, before doing so, and use that information to decide whether or not to use a data-driven model? In terms of computation time, how does this method compare to other methods in the literature on speeding up power flow solvers, such as variable timestepping, parallelization, and vector quantization, which is where you reuse solutions that have similar inputs?
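As a rough illustration of the ZIP load representation mentioned above, the sketch below uses the common textbook polynomial form, in which the real power drawn is a mix of constant-impedance (Z), constant-current (I), and constant-power (P) fractions. The function name, parameter names, and example numbers are assumptions for illustration and are not GridLAB-D's API.

```python
def zip_load_power(v, v_nominal, p_nominal, z_frac, i_frac, p_frac):
    """Real power drawn by a ZIP load at voltage magnitude v.

    z_frac, i_frac and p_frac are the constant-impedance, constant-current
    and constant-power fractions; they are expected to sum to 1.
    """
    if abs(z_frac + i_frac + p_frac - 1.0) > 1e-9:
        raise ValueError("ZIP fractions must sum to 1")
    ratio = v / v_nominal
    # Quadratic term: constant impedance. Linear term: constant current.
    # Constant term: constant power (independent of voltage).
    return p_nominal * (z_frac * ratio ** 2 + i_frac * ratio + p_frac)


# A 1000 W load (at nominal voltage) seeing a 5% undervoltage,
# for the three pure load types.
const_z = zip_load_power(0.95, 1.0, 1000.0, 1.0, 0.0, 0.0)
const_i = zip_load_power(0.95, 1.0, 1000.0, 0.0, 1.0, 0.0)
const_p = zip_load_power(0.95, 1.0, 1000.0, 0.0, 0.0, 1.0)
print(const_z, const_i, const_p)
```

Only the pure constant-power case is independent of voltage, which is the assumption made in the original study. Any Z or I fraction couples the power injections back to the voltages being solved for.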

Cluster-Based Linear Regression Model
Learn a separate model for each bus. Regularized least squares regression. Cluster the training data into different operating modes and train a different model for each. Clustering methods: Gaussian mixture model, K-means clustering. Figure: K-means clustering of residential load profiles.

We started out by looking at the simplest and fastest methods, to form a performance baseline. We started with a very simple linear regression and looked at ways to improve accuracy. Here the features are the real and reactive powers at nominal voltage, the samples are different timesteps, and the variables we are trying to predict are the voltage magnitude and phase. We learned a different model for each bus in the network, and estimated the parameters using regularized least squares regression.

We made a couple of observations. One is that for typical operation of the network, the power flow equations are actually fairly linear, and linear regression does a pretty good job. The second is that load profiles generally have different operating modes, driven by typical load profile behavior that varies throughout the day. So perhaps we can learn a better model by training different models for different operating regimes.

We developed this cluster-based linear regression model, where we cluster the training data into different operating regimes and then train a different model for each. We tried 3 different methods: GMM, K-means clustering, and a heuristic where we clustered by day of the week. This is an example of clustering the data from about 15 residential load profiles into 4 different clusters, with respect to time, using K-means clustering. Generally those operating regimes end up corresponding with different times of day.
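The cluster-then-regress idea can be sketched on a toy problem. This is a minimal illustration under heavy assumptions, not the implementation from the talk: it uses one scalar feature (a load value) instead of the full vector of real and reactive powers, a synthetic mildly nonlinear voltage curve, a deterministic 1-D K-means, and ridge regression with an unpenalized intercept. All function names and numbers are hypothetical.

```python
def nearest(centers, x):
    """Index of the cluster center closest to x."""
    return min(range(len(centers)), key=lambda j: abs(x - centers[j]))


def kmeans_1d(values, k, iters=20):
    """Plain K-means on scalars, initialized at evenly spaced quantiles."""
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(centers, v)].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def ridge_fit(xs, ys, lam):
    """Fit y ~ a*x + b by least squares with penalty lam * a**2."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / (sxx + lam)
    return a, my - a * mx


def fit_clustered(xs, ys, k, lam):
    """One ridge model per cluster (assumes no cluster is empty)."""
    centers = kmeans_1d(xs, k)
    models = []
    for j in range(k):
        members = [(x, y) for x, y in zip(xs, ys) if nearest(centers, x) == j]
        models.append(ridge_fit([m[0] for m in members],
                                [m[1] for m in members], lam))
    return centers, models


def predict_clustered(centers, models, x):
    a, b = models[nearest(centers, x)]
    return a * x + b


# Synthetic data: a low-load and a high-load operating regime, with a
# voltage that drops slightly nonlinearly as load increases.
loads = [0.2 + 0.002 * i for i in range(100)] + [1.4 + 0.004 * i for i in range(100)]
volts = [1.0 - 0.01 * p - 0.02 * p * p for p in loads]

a, b = ridge_fit(loads, volts, lam=1e-3)
global_err = max(abs(a * p + b - v) for p, v in zip(loads, volts))

centers, models = fit_clustered(loads, volts, k=2, lam=1e-3)
cluster_err = max(abs(predict_clustered(centers, models, p) - v)
                  for p, v in zip(loads, volts))
print(global_err, cluster_err)
```

On this toy curve the per-cluster models have a smaller worst-case error than the single global line, because each one only has to be linear over a narrow operating range.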

Error Checking and Solver Integration
Voltage vector. Error metric: normalized vector error. Interaction with NR solver: error check, distance check, step change check.

We have also looked at the best way to decide when to use the data-driven model and when to use the NR solver, based on the observed prediction error. The error metric that we used was the normalized vector error, expressing the voltage in rectangular coordinates and taking the maximum over all buses.

The first step is to split the data into a training and test set, and train the model on the training set. For each test sample, we use the trained model to make a prediction. Then, using a set of tests, we decide whether or not to use the estimate. We tried 3 different tests.

Error check: every time you use the NR solver, also test the model approximation. To use the model approximation again, the last computed error needs to be below a certain threshold. This method assumes that the inputs to the model move somewhat slowly through different operating regimes.

Distance check: we observed that high prediction error tends to occur for points that are at the edge of a cluster. So we require that the model inputs are within a certain percentile distance of the cluster center to use the model.

Step change check: this checks if there is a significant jump between the last simulation solution and the current model estimate, and only uses the model if that change in time is below some threshold. It also assumes that the solution won’t change much from one solution to another.
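The error metric and the three checks can be sketched as follows. The exact formula from the slide is not in the transcript, so the per-bus normalization below is one plausible reading, and the function names, argument names, and threshold values are all hypothetical. Voltages are represented as complex numbers, which corresponds to rectangular coordinates.

```python
def normalized_vector_error(v_est, v_ref):
    """Largest per-bus |v_est - v_ref| / |v_ref| over complex bus voltages."""
    return max(abs(e - r) / abs(r) for e, r in zip(v_est, v_ref))


def use_model_estimate(last_error, dist_to_center, dist_limit,
                       v_est, v_last, error_tol=0.01, step_tol=0.02):
    """Return True if the model estimate passes all three checks."""
    # Error check: the most recent comparison against NR must have been good.
    error_ok = last_error is not None and last_error < error_tol
    # Distance check: the inputs must not sit at the edge of their cluster.
    distance_ok = dist_to_center <= dist_limit
    # Step change check: the estimate must not jump far from the last solution.
    step_ok = normalized_vector_error(v_est, v_last) < step_tol
    return error_ok and distance_ok and step_ok


v_true = [1.0 + 0.0j, 0.98 - 0.02j]       # reference (NR) solution, per unit
v_pred = [1.001 + 0.0j, 0.98 - 0.021j]    # model estimate, per unit
err = normalized_vector_error(v_pred, v_true)

accepted = use_model_estimate(err, 0.5, 1.0, v_pred, v_true)
rejected_error = use_model_estimate(0.05, 0.5, 1.0, v_pred, v_true)
rejected_step = use_model_estimate(err, 0.5, 1.0, [0.9 + 0.0j, 0.98 - 0.02j], v_true)
print(err, accepted, rejected_error, rejected_step)
```

When any check fails, the simulator would fall back to the NR solver for that timestep, which also refreshes the last computed error used by the error check.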

Test Case: IEEE 123 Bus Network
IEEE 123 bus model. Replace spot loads with time-varying real and reactive power profiles for 344 residential homes. 4.2 kV nominal voltage. 4 voltage regulators. GridLAB-D simulation: 7-day training, 21-day testing, 1-minute simulation timestep. Figure: training and testing data breakdown.

We tested these methods using the IEEE 123 bus network model in GridLAB-D. We replaced the spot loads with time-varying real and reactive power profiles for 344 homes. The network operates at 4.2 kV and has 4 voltage regulators. We tested these methods offline, by running a 31-day simulation and breaking it down into a training set and a test set. We discarded the first 3 days, because it generally takes a little while for the system to equilibrate, because of the simulation initial conditions. That leaves 7 days of training and 21 days of testing. We used a 1-minute timestep.
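The data breakdown above works out as follows. This sketch assumes the split is chronological (discarded days first, then training, then testing), which the talk does not state explicitly; the function and constant names are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60  # one sample per 1-minute timestep


def split_simulation(samples, discard_days=3, train_days=7, test_days=21):
    """Drop the warm-up period, then split the rest chronologically."""
    start = discard_days * MINUTES_PER_DAY
    mid = start + train_days * MINUTES_PER_DAY
    end = mid + test_days * MINUTES_PER_DAY
    if len(samples) < end:
        raise ValueError("not enough samples for the requested split")
    return samples[start:mid], samples[mid:end]


timesteps = list(range(31 * MINUTES_PER_DAY))   # 31-day simulation
train, test = split_simulation(timesteps)
print(len(timesteps), len(train), len(test))
```

So the 31-day run produces 44,640 timesteps, of which 10,080 are used for training and 30,240 for testing.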

Results: Cluster-Based Modeling
Base case (no clustering): training error, testing error. Mean prediction error: 0.2%. Cluster-based models (k=7): K-means test error, GMM test error.

Here are some of the results for prediction accuracy, using the model approximation 100% of the time. The top 2 plots here are for the base case with no clustering. The training error and testing error distributions look relatively similar. The mean test error is about 0.2%. That’s not as low as the NR solutions, but it is reasonable. The cluster-based models did reduce the mean error even further, most significantly for K-means clustering, although there still is a long tail to the distribution.

Results: Error Checking
Final model performance: avoided solves, 86.7%; maximum error, 0.0168; error above 0.01, 0.05%. Parameters of error checks were tuned using sensitivity analysis. Best performance results from a combination of error checks. Significant reduction in number of Newton-Raphson solves. Figure: final error distribution using error checks.

For the error checking algorithm, we used a sensitivity analysis to tune the parameters and thresholds of the error checks, and found that the best performance results from a combination of the error check, distance check, and step change check. Using the optimized model, we saw an 86.7% reduction in the number of NR solves. The median test error was 0.04%. The maximum test error was about 1.7%, but with only a very small number of samples (0.05%) having errors above 1%.

Results: Load Composition and Loading Level
ZIP load model.

We also did a sensitivity analysis to analyze how the prediction error depends on the loading level, as well as the load composition. One observation of the previous study is that model accuracy is significantly dependent on loading level, for both SVMs and regression-based methods. To evaluate loading level, we proportionally scaled the load profiles over a wide range. To evaluate load composition, we tested the model for 3 cases: constant current, constant power, and constant impedance loads, which are the 3 extremes in terms of load behavior. We found that voltage magnitude error is highly correlated with loading level. The impact of load composition is less significant, but the error does tend to be larger for constant power loads.

Example: Voltage Magnitude Prediction
Low loading level. High loading level. Constant impedance loads.

This is just an example of the predicted and actual voltage profiles for a specific bus, with a low loading level and a high loading level, the only difference being the scaling factor applied to the real and reactive power profiles. For the low loading level, you can’t even distinguish the difference between the two. The difference starts to become apparent for the high loading level, but the approximation still emulates the general behavior.

Results: Effects of Training Set Size

Next Steps
Comparison of SVM and other ML-based methods with linear regression. Scalability: test accuracy on larger networks (IEEE 8500 node network). Optimal training set design. Algorithm implementation with online interaction with the GridLAB-D Newton-Raphson solver.

Questions?

Extra Slides
