Stat 6601 Project: Neural Networks (V&R 6.3)


1 Stat 6601 Project: Neural Networks (V&R 6.3)
Chapter 8.11 Neural Networks
Group Members: Xu Yang, Haiou Wang, Jing Wu

2 Definition
Neural networks are a broad class of models that mimic the functioning of the human brain. There are various classes of NN models, which differ from each other in:
(1) problem type: prediction, classification, or clustering
(2) structure of the model
(3) model-building algorithm
We will focus on the feed-forward neural network.

3 A bit of biology . . .
The most important functional unit in the human brain is a class of cells called NEURONS.
Dendrites – receive information
Cell body – processes information
Axon – carries processed information to other neurons
Synapse – junction between an axon end and the dendrites of other neurons
[Figure: a biological neuron (dendrites, cell body, axon, synapse) side by side with a neural network diagram]

4 An Artificial Neuron
Receives inputs X1, X2, …, Xp from other neurons or the environment.
Inputs are fed in through connections with 'weights' w1, w2, …, wp.
Total input = weighted sum of inputs from all sources:
I = w1·X1 + w2·X2 + w3·X3 + … + wp·Xp
A transfer function (activation function) converts the total input to the output:
V = f(I)
The output goes to other neurons or to the environment.
[Figure: artificial neuron; inputs X1…Xp flow through weighted connections (dendrites) into the cell body, which computes I and V = f(I), and out along the axon]
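A single artificial neuron is easy to express directly in R. The sketch below is illustrative only: the function name artificial_neuron and the choice of a logistic activation are our own, not part of any package.

logistic <- function(x) exp(x) / (1 + exp(x))   # one common activation f
artificial_neuron <- function(x, w, f = logistic) {
  I <- sum(w * x)   # total input: weighted sum of the inputs
  f(I)              # output V = f(I)
}
artificial_neuron(x = c(0.5, -1, 2), w = c(0.1, 0.4, 0.3))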

5 Simplest but most common form (one hidden layer)
[Figure: feed-forward network with a single hidden layer]

6 Choice of activation function
Threshold: f(x) = 0 if x < 0, 1 if x >= 0
Logistic: f(x) = e^x / (1 + e^x)
Tanh (hyperbolic tangent): f(x) = (e^x − e^−x) / (e^x + e^−x)
[Figure: the three activation curves; tanh ranges over (−1, 1), the logistic over (0, 1)]
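For reference, the three activation functions are one-liners in R; this is a sketch of ours, not code from the slides, and tanh() is built into R already.

threshold <- function(x) ifelse(x < 0, 0, 1)
logistic  <- function(x) exp(x) / (1 + exp(x))   # equivalently plogis(x)
curve(tanh, -4, 4, ylim = c(-1, 1), ylab = "f(x)")   # tanh ranges over (-1, 1)
curve(logistic, -4, 4, add = TRUE, lty = 2)          # logistic ranges over (0, 1)
curve(threshold, -4, 4, add = TRUE, lty = 3)         # step at x = 0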

7 A collection of neurons forms a layer
Input layer – each neuron gets only ONE input, directly from outside
Hidden layer(s) – connect the input and output layers
Output layer – the output of each neuron goes directly to the outside
[Figure: inputs x1…x4 feeding, via weights wij, through the hidden layer(s) to the outputs]

8 More general format: skip-layer connections
Inputs may also be connected directly to the outputs, bypassing the hidden layer(s); these are called skip-layer connections.
[Figure: network with weights wij from the input layer through the hidden layer(s) to the outputs, plus direct input-to-output links]

9 Fitting criteria
Least squares
Maximum likelihood (equivalently, maximum log-likelihood)
One way to ensure f is smooth: minimize E + λC(f), where E is the fitting criterion (e.g., the residual sum of squares), C(f) penalizes roughness, and λ controls the trade-off. With weight decay, C(f) is the sum of squared weights, and λ is the decay argument of nnet.
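A minimal sketch of the penalized criterion for a fitted net, assuming only documented nnet components (wts, the weight vector, and value, the minimized criterion including the decay term); the toy iris fit here is our own example, not from the slides.

library(nnet)
set.seed(1)   # initial weights are random
fit <- nnet(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris,
            size = 2, linout = TRUE, decay = 0.01, trace = FALSE)
0.01 * sum(fit$wts^2)   # the weight-decay term lambda * C(f)
fit$value               # minimized criterion: E plus the weight-decay term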

10 Usage of nnet in R
nnet.formula(formula, data=NULL, weights, ..., subset, na.action=na.fail, contrasts=NULL)
formula: a formula of the form 'class ~ x1 + x2 + …'
weights: (case) weights for each example; if missing, defaults to 1
size: number of units in the hidden layer. Can be zero if there are skip-layer units
Wts: initial parameter vector. If missing, chosen at random
rang: if Wts is missing, use random weights from runif(n, -rang, rang)
linout: switch for linear output units. Default: logistic output units
entropy: switch for entropy (= maximum conditional likelihood) fitting. Default: least squares
softmax: switch for softmax (log-linear model) and maximum conditional likelihood fitting
skip: switch to add skip-layer connections from inputs to outputs
decay: parameter λ for weight decay
maxit: maximum number of iterations for the optimizer
Hess: should the Hessian matrix at the solution be returned?
trace: switch for output from the optimizer
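The running example on the next slides is a regression fit. As a quick hedged illustration of the classification side (our own toy example, not from the slides): when the response in the formula is a factor with more than two levels, nnet builds a classification network with a softmax output stage automatically.

library(nnet)
set.seed(1)   # initial weights are random
ir.nn <- nnet(Species ~ ., data = iris, size = 2, decay = 1e-3,
              maxit = 200, trace = FALSE)
# confusion table of fitted classes vs. the truth
table(predicted = predict(ir.nn, type = "class"), actual = iris$Species)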

11 An Example
Code:
library(MASS)
library(nnet)
attach(rock)
area1 <- area/10000; peri1 <- peri/10000
rock1 <- data.frame(perm, area = area1, peri = peri1, shape)
rock.nn <- nnet(log(perm) ~ area + peri + shape, rock1, size = 3,
                decay = 1e-3, linout = T, skip = T, maxit = 1000, Hess = T)
summary(rock.nn)
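Because the initial weights are chosen at random (see Wts and rang above), repeated runs can converge to different local minima. Fixing the seed beforehand makes the fit reproducible; this is our suggestion, not part of the original code.

set.seed(123)   # arbitrary seed, chosen only for illustration
# ... then fit rock.nn exactly as above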

12 Output
> summary(rock.nn)
a 3-3-1 network with 19 weights
options were - skip-layer connections  linear output units  decay=0.001
b->h1 i1->h1 i2->h1 i3->h1
b->h2 i1->h2 i2->h2 i3->h2
b->h3 i1->h3 i2->h3 i3->h3
b->o  h1->o  h2->o  h3->o  i1->o  i2->o  i3->o
[numeric weight estimates did not survive the transcript]
> sum((log(perm) - predict(rock.nn))^2)
[1] …
Optimizer trace printed while fitting:
# weights: 19
initial value …
iter 10 value …
iter 20 value …
iter 30 value …
iter 40 value …
…
iter 200 value …
iter 210 value …
final value …
converged
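Since the fit requested Hess = T, the returned object carries the Hessian of the criterion at the solution (component Hessian). Checking that all its eigenvalues are positive confirms the optimizer stopped at a local minimum; this follow-up is ours, not shown on the slide.

eigen(rock.nn$Hessian, symmetric = TRUE, only.values = TRUE)$values
# all eigenvalues positive => a genuine local minimum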

13 Use the same method as in the previous section to view the fitted surface
Code:
library(lattice)   # provides trellis.device() and wireframe(); in S this is a Trellis 3D plot
Xp <- expand.grid(area = seq(0.1, 1.2, 0.05),
                  peri = seq(0, 0.5, 0.02), shape = 0.2)
trellis.device()
rock.grid <- cbind(Xp, fit = predict(rock.nn, Xp))
wireframe(fit ~ area + peri, rock.grid,
          screen = list(z = 160, x = -60),
          aspect = c(1, 0.5), drape = T)

14 Output
[Figure: wireframe plot of the fitted surface, fit ~ area + peri]

15 Experiment to show the key factors that affect the degree of fit
attach(cpus)
cpus3 <- data.frame(syct = syct-2, mmin = mmin-3, mmax = mmax-4,
                    cach = cach/256, chmin = chmin/100, chmax = chmax/100,
                    perf = perf)
detach()
test.cpus <- function(fit)
  sqrt(sum((log10(cpus3$perf) - predict(fit, cpus3))^2)/109)
cpus.nn1 <- nnet(log10(perf) ~ ., cpus3, linout = T, skip = T, size = 0)
test.cpus(cpus.nn1)
[1] …
cpus.nn2 <- nnet(log10(perf) ~ ., cpus3, linout = T, skip = T, size = 4,
                 decay = 0.01, maxit = 1000)
test.cpus(cpus.nn2)
[1] …
cpus.nn3 <- nnet(log10(perf) ~ ., cpus3, linout = T, skip = T, size = 10,
                 decay = 0.01, maxit = 1000)
test.cpus(cpus.nn3)
[1] …
cpus.nn4 <- nnet(log10(perf) ~ ., cpus3, linout = T, skip = T, size = 25,
                 decay = 0.01, maxit = 1000)
test.cpus(cpus.nn4)
[1] …
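The four fits differ only in the hidden-layer size, so the experiment is easy to vary with a loop. This compact rewrite is our own restructuring of the same models (note it applies decay = 0.01 to the size = 0 fit as well, unlike cpus.nn1 above), and results will differ from run to run because the initial weights are random.

sizes <- c(0, 4, 10, 25)
errs <- sapply(sizes, function(s)
  test.cpus(nnet(log10(perf) ~ ., cpus3, linout = TRUE, skip = TRUE,
                 size = s, decay = 0.01, maxit = 1000, trace = FALSE)))
data.frame(size = sizes, rmse = errs)   # fit quality vs. hidden-layer size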

