
Multilayer Perceptron (MLP) Regression

Presentation on theme: "Multilayer Perceptron (MLP) Regression"— Presentation transcript:

1 Multilayer Perceptron (MLP) Regression
Andrew Marshall

2 Context
In my project, microstructures are generated, and their 2-point statistics are correlated.
Structure-property linkages are developed via regression. Examples: linear regression, multivariate polynomial regression, GPR (Gaussian process regression).
Artificial neural networks (ANNs) allow a computer to "learn" a problem by changing the importance assigned to each input as it obtains more information; ANNs may be used to work on regression problems.
Dr. Kalidindi is skeptical that GPR can work and suggested neural networks.

3 Theory: Perceptron
A perceptron loosely models a neuron in the brain.
It takes weighted inputs and applies an activation function: for inputs X1, X2 with weights W1, W2, the perceptron outputs F(X1W1 + X2W2).
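The single perceptron described above can be sketched in a few lines; the input values, weights, and tanh activation here are illustrative choices, not values from the slides:

```python
import numpy as np

def perceptron(x, w, activation=np.tanh):
    """Apply an activation function F to the weighted sum of inputs."""
    return activation(np.dot(x, w))

# Two inputs X1, X2 with weights W1, W2, as in the diagram
out = perceptron(np.array([1.0, 2.0]), np.array([0.5, -0.25]))
print(out)  # tanh(1.0*0.5 + 2.0*(-0.25)) = tanh(0) = 0.0
```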

4 Theory: Neural network
A neural network is composed of many perceptrons and many connections.
Three layer types: input, output, and hidden (there can be multiple hidden layers).
This allows more complex problems to be solved.
Learning is supervised; the ultimate goal is to optimize the weights.

5 Theory: Learning
Since MLP learning is supervised, an error is produced with respect to the target value.
Learning determines how much the weights are adjusted in response to the error:
e_j(n) = d_j(n) - y_j(n)
where d is the target value, j indexes the output node, and n is the data point.
The cost function (error energy) sums the squared errors over all output nodes at the nth data point:
ε(n) = (1/2) Σ_j e_j²(n)
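As a quick numerical check of the error and error-energy definitions (the target and output values here are made up for illustration):

```python
import numpy as np

d = np.array([1.0, 0.0, 2.0])      # target values d_j(n)
y = np.array([0.8, 0.1, 1.5])      # observed outputs y_j(n)

e = d - y                          # per-node error e_j(n)
error_energy = 0.5 * np.sum(e**2)  # error energy: (1/2) * sum of squared errors
print(error_energy)                # 0.5 * (0.2^2 + 0.1^2 + 0.5^2) = 0.15
```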

6 Theory: Gradient descent
The goal of gradient descent is to improve the weight values throughout the network by minimizing the error energy function (backpropagation).
Induced local field: v_j(n) = Σ_i w_ji(n) y_i(n)
Local gradient, output nodes: δ_j(n) = -∂ε(n)/∂v_j(n) = e_j(n) φ'(v_j(n))
Local gradient, hidden nodes: δ_j(n) = φ'(v_j(n)) Σ_k δ_k(n) w_kj(n)
Weight update: Δw_ji(n) = α Δw_ji(n-1) + η δ_j(n) y_i(n)
Here η (eta) is the learning rate, φ' (phi prime) is the derivative of the activation function, and α (alpha) is the momentum (smoothing) parameter.
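A minimal sketch of one backpropagation step for a network with one tanh hidden layer and a linear output node, following the update rules above; the network size, data, and hyperparameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # inputs y_i(n)
d = np.array([1.0])               # target value d(n)

W1 = rng.normal(size=(4, 3))      # input -> hidden weights
W2 = rng.normal(size=(1, 4))      # hidden -> output weights
eta, alpha = 0.1, 0.9             # learning rate, momentum parameter
dW1_prev = np.zeros_like(W1)      # previous updates, for the momentum term
dW2_prev = np.zeros_like(W2)

# Forward pass: induced local fields v_j(n) and activations
v1 = W1 @ x
h = np.tanh(v1)
y = W2 @ h                        # linear output node, so phi'(v) = 1

# Local gradients
e = d - y
delta_out = e * 1.0                          # output nodes: e_j * phi'(v_j)
delta_hid = (1 - h**2) * (W2.T @ delta_out)  # hidden: phi'(v_j) * sum_k delta_k w_kj

# Weight updates with momentum: dw(n) = alpha*dw(n-1) + eta*delta_j(n)*y_i(n)
dW2 = alpha * dW2_prev + eta * np.outer(delta_out, h)
dW1 = alpha * dW1_prev + eta * np.outer(delta_hid, x)
W2 += dW2
W1 += dW1
```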

7 Function: sklearn.neural_network.MLPRegressor
A Python-based MLP tool used to solve regression problems.
Parameters: activation function, solver, learning rate, regularization parameter, and many others, some of which apply only to certain solvers.
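A minimal usage sketch of MLPRegressor; the synthetic data and the specific parameter values here are illustrative assumptions, not the settings used on the slides:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic regression data standing in for real inputs: 7 predictors, 1 target
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 7))
y = X @ rng.normal(size=7) + 0.1 * rng.normal(size=200)

# 'lbfgs' is the solver suggested for smaller datasets
model = MLPRegressor(hidden_layer_sizes=(100,), activation='relu',
                     solver='lbfgs', alpha=1e-4, max_iter=2000,
                     random_state=1)
model.fit(X, y)
r2 = model.score(X, y)  # coefficient of determination R^2 on the training data
```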

8 Example problem: abalone shells
A dataset of 4177 abalone shells is given.
Seven numeric predictors are used to try to predict a single output (number of rings).
The data was split into 3500 shells (calibration set) and 677 shells (validation set).
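The split described above can be reproduced with scikit-learn's train_test_split; the arrays here are random stand-ins for the abalone table:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for the 4177-row abalone dataset: 7 predictors, 1 target (rings)
rng = np.random.default_rng(0)
X = rng.normal(size=(4177, 7))
y = rng.integers(1, 30, size=4177).astype(float)

# 3500 shells for calibration; the remaining 677 for validation
X_cal, X_val, y_cal, y_val = train_test_split(X, y, train_size=3500,
                                              random_state=1)
```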

9 All defaults
R2 = 0.409. (Ignore the half values on the axes; ring counts are obviously integers only.)

10 Learning rate = 'adaptive', all else defaults
Default is 'constant'. This made no difference whatsoever.

11 Hidden layer size = 500, all else defaults
Default is 100 (hidden_layer_sizes sets the number of neurons in the hidden layer).

12 All defaults, seed = 10
R2 = 0.420. (Seed = 1 is used as the default in the other runs.)

13 Activation function = ‘tanh’, all else defaults, seed = 1
R2 = 0.444. Default is 'relu', the rectified linear unit function max(0, x).

14 Solver = lbfgs, all else defaults, seed = 1
Default is ‘adam’. ‘lbfgs’ is supposedly better for smaller datasets.

15 My project
Create linkages between microstructure and elastic property.
In this case, the first 10 PC scores are the inputs, and χ₁ is the output.
χ₁ = γ₁ / γ*, the strain rate ratio: the numerator is the average equivalent strain rate in phase 1, and the denominator is the composite equivalent strain rate.
SVD: M = U Σ V*
f_t^(k) - f̄_t ≈ Σ_{j=1}^{R*} α_j^(k) φ_jt, with α = U and φ = Σ V*
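The SVD-based dimensionality reduction above can be sketched with NumPy on a random stand-in for the statistics matrix M; the matrix sizes and the number of retained components are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(20, 50))     # rows: samples f_t^(k), columns: features t

# Center each column, then decompose: M = U Sigma V*
M_c = M - M.mean(axis=0)
U, s, Vt = np.linalg.svd(M_c, full_matrices=False)

R = 10                            # number of retained components (R* above)
alpha = U[:, :R]                  # sample weights alpha_j^(k)
phi = s[:R, None] * Vt[:R]        # basis phi_jt = (Sigma V*)_jt

# Truncated reconstruction: f_t^(k) - mean ~= sum_j alpha_j^(k) phi_jt
M_approx = alpha @ phi
rel_err = np.linalg.norm(M_c - M_approx) / np.linalg.norm(M_c)
```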

16 Application to my data
Increased the number of hidden layer neurons and the maximum number of iterations; the lbfgs solver was used.
R2 (linear) = 0.892; R2 (MLP) = 0.901.
The fit is relatively stable (it varies a little each run, but not much). Using the lbfgs solver is crucial for small datasets.
Eta is the shear strength ratio.

17 Next Steps
Investigate other learning methods, such as support vector machines.
Improve model fits through hyperparameter tuning.
Apply the machine learning model to elastic constant data from generated microstructures.
Determine a method to calculate an uncertainty value for each prediction; this value will be used to determine whether or not the generated microstructure will be submitted for simulation.

