Published by Joaquin Davidson. Modified over 6 years ago.

1
Learning in Neural and Belief Networks - Feed Forward Neural Network
March 28, 2001
20013329 안순길

2
Contents
How the Brain Works
Neural Networks
Perceptrons

3
Introduction
Two viewpoints in this chapter:
Computational viewpoint: representing functions using networks.
Biological viewpoint: a mathematical model of the brain.
Neuron: the computing element.
Neural network: a collection of interconnected neurons.

4
How the Brain Works
Cell body (soma): provides the support functions and structure of the cell.
Axon: a branching fiber that carries signals away from the neuron.
Synapse: converts an electrical signal into a chemical signal.
Dendrites: branching fibers that receive signals from other nerve cells.
Action potential: an electrical pulse.
Synapses are either excitatory (increasing the potential) or inhibitory (decreasing the potential); synaptic connections exhibit plasticity.
A collection of simple cells can lead to thought, action, and consciousness.

5
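Comparing brains with digital computers
Brains and computers perform quite different tasks and have different properties.
Speed: in raw switching speed a computer is about a million times faster than a neuron, yet through massive parallelism the brain ends up about a billion times faster at what it does.
The brain performs complex tasks.
The brain is more fault-tolerant: it shows graceful degradation.
Neural networks, modeled on the brain, are designed to be trained using an inductive learning algorithm.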

6
Neural Networks
A neural network (NN) consists of nodes (units) connected by links; each link has a numeric weight.
Learning: updating the weights.
Each unit has two computational components:
a linear component, the input function;
a nonlinear component, the activation function.

7
Notation

8
Simple computing elements
Total weighted input: $in_i = \sum_j W_{j,i} \, a_j$
Applying the activation function g gives the unit's output: $a_i = g(in_i)$
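The two steps named on this slide, a total weighted input followed by an activation function g, can be sketched in a few lines of Python. The function names `unit_output` and `step0` and the sample numbers are illustrative assumptions, not taken from the slides.

```python
def unit_output(weights, inputs, g):
    """Return g(in), where in is the total weighted input sum_j W_j * a_j."""
    total_input = sum(w * a for w, a in zip(weights, inputs))  # linear component
    return g(total_input)  # nonlinear component


def step0(x):
    # Illustrative activation: the unit fires (1) when its input exceeds 0.
    return 1 if x > 0 else 0


# Total input is 0.5*1.0 + (-1.0)*1.0 + 2.0*0.5 = 0.5, so the unit fires.
activation = unit_output([0.5, -1.0, 2.0], [1.0, 1.0, 0.5], step0)
```

Passing g as a parameter keeps the linear and nonlinear components separate, which mirrors the split described on the previous slide.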

9
Three activation functions
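The slide's figure is not reproduced in this transcript. Assuming the three functions are the step, sign, and sigmoid functions commonly shown together in textbook treatments, a minimal sketch looks like this; the parameter name `t` for the threshold is also an assumption.

```python
import math


def step(x, t=0.0):
    # Outputs 1 when the input exceeds the threshold t, otherwise 0.
    return 1 if x > t else 0


def sign(x):
    # Outputs +1 for non-negative input, otherwise -1.
    return 1 if x >= 0 else -1


def sigmoid(x):
    # Smooth, differentiable squashing function with values in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))
```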

10
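Threshold
The threshold t is the total weighted input needed to cause the neuron to fire.
If the input is greater than the threshold, the output is 1; otherwise it is 0.
The threshold can be replaced with an extra input weight: add a constant input $a_0 = -1$ with weight $W_0 = t$ and compare the total against 0.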
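A small sketch of why the threshold can be folded into an extra input weight: a unit that compares its weighted sum against t behaves the same as a unit with one more input fixed at -1, weighted by t, compared against 0. The function names and the sample weights are hypothetical.

```python
def fires_with_threshold(weights, inputs, t):
    # Explicit threshold: fire when the weighted sum exceeds t.
    return 1 if sum(w * a for w, a in zip(weights, inputs)) > t else 0


def fires_with_extra_weight(weights, inputs, t):
    # Same unit with the threshold folded in: constant input -1, weight t.
    ext_weights = [t] + list(weights)
    ext_inputs = [-1.0] + list(inputs)
    return 1 if sum(w * a for w, a in zip(ext_weights, ext_inputs)) > 0 else 0


# Compare both formulations on every binary input pair.
cases = [(a, b) for a in (0, 1) for b in (0, 1)]
same = all(
    fires_with_threshold([1, 1], c, 1.5) == fires_with_extra_weight([1, 1], c, 1.5)
    for c in cases
)
```

Treating the threshold as a weight means a learning rule only has to update weights, with no special case for t.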

11
Applying neural networks to logic gates
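The figure for this slide is missing from the transcript. As a hedged illustration, single threshold units can implement AND, OR, and NOT with the weights and thresholds below; these particular values are one workable choice, not necessarily the ones on the original slide, and `threshold_unit` is a hypothetical helper.

```python
def threshold_unit(weights, t):
    """Build a unit that outputs 1 when its weighted input exceeds t."""
    def unit(*inputs):
        return 1 if sum(w * a for w, a in zip(weights, inputs)) > t else 0
    return unit


AND = threshold_unit([1, 1], t=1.5)   # fires only when both inputs are 1
OR = threshold_unit([1, 1], t=0.5)    # fires when at least one input is 1
NOT = threshold_unit([-1], t=-0.5)    # fires only when the input is 0

and_table = [AND(a, b) for a in (0, 1) for b in (0, 1)]
or_table = [OR(a, b) for a in (0, 1) for b in (0, 1)]
not_table = [NOT(a) for a in (0, 1)]
```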

12
Network Structures (I)
Feed-forward networks
Unidirectional links, no cycles: a DAG (directed acyclic graph).
No links between units in the same layer, no links backward to a previous layer, and no links that skip a layer.
Processing proceeds uniformly from input units to output units.
No internal state.

13
Input units, output units, and hidden units
Perceptron: no hidden units.
Multilayer networks: one or more layers of hidden units.
With a fixed structure and fixed activation functions, the network is a specific parameterized function of its inputs.
Because g is a nonlinear function, learning the weights amounts to nonlinear regression.
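To make the layered structure concrete, here is a minimal sketch of a forward pass through a network with one hidden layer. It assumes sigmoid activations and made-up weights; the names `layer` and `feed_forward` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(weights, inputs):
    # weights[i][j] is the weight on the link from input j to unit i.
    return [sigmoid(sum(w * a for w, a in zip(row, inputs))) for row in weights]


def feed_forward(hidden_weights, output_weights, inputs):
    # Activations flow strictly forward: inputs -> hidden units -> output units.
    hidden = layer(hidden_weights, inputs)
    return layer(output_weights, hidden)


hidden_weights = [[1.0, -1.0], [0.5, 0.5]]  # two hidden units, two inputs
output_weights = [[1.0, 1.0]]               # one output unit
outputs = feed_forward(hidden_weights, output_weights, [1.0, 0.0])
```

With the structure and activation function fixed, the only free parameters are the entries of the two weight lists, which is what makes the network a parameterized function.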

14
Network Structures (II)
Recurrent networks
The brain resembles a recurrent network: it has backward links.
Recurrent networks have internal state, stored in the activation levels of the units.
They can become unstable, oscillate, or exhibit chaotic behavior.
They can take a long time to compute a stable output.
They require advanced mathematical methods to analyze.

15
Network Structures (III)
Examples
Hopfield networks: bidirectional connections with symmetric weights. They act as an associative memory: given a new stimulus, the network settles into the stored pattern that most closely resembles it.
Boltzmann machines: use a stochastic (probabilistic) activation function.
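A rough sketch of the associative-memory behavior of a Hopfield network. The slides do not say how the weights are set, so this example assumes a simple Hebbian rule (each weight is the sum of products of the two units' values over the stored patterns) and units taking values +1 or -1; `train_hopfield` and `recall` are hypothetical names.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix with a zero diagonal (assumed Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, sweeps=5):
    """Update units one at a time so the state settles toward a stored pattern."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            total = sum(weights[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if total >= 0 else -1
    return state


stored = [1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])
noisy = [1, -1, 1, -1, 1, 1]       # the stored pattern with its last unit flipped
recovered = recall(weights, noisy)
```

Here the noisy stimulus is pulled back to the single stored pattern; with several stored patterns, recall is only reliable when the patterns are few and sufficiently different.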

16
Optimal Network Structure (I)
A network that is too small is incapable of representing the desired function.
A network that is too big does not generalize well: overfitting occurs when there are too many parameters.
A feed-forward NN with one sufficiently large hidden layer can approximate any continuous function.
A feed-forward NN with two hidden layers can approximate any function, including discontinuous ones.

17
Optimal Network Structures (II)
NERFs (Network Efficiently Representable Functions): functions that can be approximated with a small number of units.
Searching for a good structure:
Genetic algorithm: costly, because evaluating each candidate means running the whole NN training protocol.
Hill-climbing search: modifying an existing network structure.
Start with a big network (optimal brain damage): remove weights from a fully connected model.
Start with a small network (tiling algorithm): start with a single unit and add subsequent units.
Cross-validation techniques help decide when the network is the right size.

18
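Perceptrons
Perceptron: a single-layer, feed-forward network.
Each output unit is independent of the others, because each weight affects only one of the outputs.
The output of a unit is $O = 1$ if $\sum_j W_j I_j > t$ and $O = 0$ otherwise, where $I_j$ is the j-th input, $W_j$ is its weight, and $t$ is the threshold.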

19
What perceptrons can represent
Boolean functions: AND, OR, and NOT.
Majority function: with $W_j = 1$ and $t = n/2$, it needs only 1 unit and n weights, whereas a decision tree needs $O(2^n)$ nodes.
Perceptrons can represent only linearly separable functions; they cannot represent XOR.
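The majority function from this slide makes a compact example: one threshold unit with every weight set to 1 and the threshold set to n/2. The sketch below assumes the unit fires only when the weighted sum strictly exceeds the threshold, so a tie counts as no majority; the function name is hypothetical.

```python
def majority(inputs):
    """Single threshold unit: all weights are 1 and the threshold is n/2."""
    n = len(inputs)
    weights = [1] * n
    t = n / 2
    return 1 if sum(w * x for w, x in zip(weights, inputs)) > t else 0


votes = [majority([1, 1, 0]), majority([1, 0, 0]), majority([1, 1, 0, 0])]
```

The unit stores just n weights regardless of how many of the 2^n input combinations exist, which is the contrast with decision trees drawn on the slide.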

20
Examples of Perceptrons
The entire input space is divided in two along a boundary defined by $\sum_j W_j I_j = t$.
In Figure 19.9(a), n=2 and the boundary is a line.
In Figure 19.10(a), n=3 and the boundary is a plane.

21
Learning linearly separable functions (I)
Bad news: not many problems fall in this set.
Good news: given enough training examples, there is a perceptron algorithm that will learn them.
Neural network learning algorithm:
It follows the current-best-hypothesis (CBH) scheme.
Hypothesis: a network defined by the current values of the weights.
Initial network: weights assigned randomly in [-0.5, 0.5].
Repeat the update phase until convergence.
Each epoch updates all the weights for all the examples.

22
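Learning linearly separable functions (II)
The error is Err = T - O, where T is the correct (target) output and O is the network's output.
Perceptron learning rule (Rosenblatt, 1960): $W_j \leftarrow W_j + \alpha \times I_j \times Err$, where $\alpha$ is the learning rate.
If the error is positive, O needs to increase.
If the error is negative, O needs to decrease.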

23
Algorithm
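The algorithm figure for this slide is not reproduced in the transcript. Based on the two preceding slides (random initial weights in [-0.5, 0.5], repeated epochs, and the error Err = T - O scaled by a learning rate), a minimal sketch of perceptron learning might look like the following. The function names, the default learning rate, the epoch limit, and the treatment of the threshold as an extra weight on a fixed input of -1 are assumptions made for this example.

```python
import random


def predict(weights, inputs):
    # weights[0] acts as the threshold, paired with a constant input of -1.
    x = [-1.0] + list(inputs)
    return 1 if sum(w * a for w, a in zip(weights, x)) > 0 else 0


def train_perceptron(examples, n_inputs, alpha=0.1, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in examples:
            err = target - predict(weights, inputs)  # Err = T - O
            if err != 0:
                mistakes += 1
                x = [-1.0] + list(inputs)
                # Move each weight in the direction that reduces the error.
                weights = [w + alpha * a * err for w, a in zip(weights, x)]
        if mistakes == 0:  # a full epoch with no errors: converged
            break
    return weights


and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
learned = train_perceptron(and_examples, n_inputs=2)
predictions = [predict(learned, inputs) for inputs, _ in and_examples]
```

AND is linearly separable, so the loop stops once an epoch passes without errors. On a function such as XOR the same loop would simply run until `max_epochs` without reaching zero mistakes.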

24
Perceptrons (Minsky and Papert, 1969)
Showed the limits imposed by linear separability.
Perceptron learning is a gradient descent search through weight space.
For perceptrons, the weight space has no local minima.
Difference between NNs and other attribute-based methods such as decision trees: NN inputs are real numbers in some fixed range, whereas attributes take values from a discrete set.
Dealing with discrete sets:
Local encoding: a single input unit whose value encodes the discrete attribute value, e.g. None=0.0, Some=0.5, Full=1.0 (WillWait).
Distributed encoding: one input unit for each value of the attribute.
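The two encodings can be illustrated with the None/Some/Full attribute from the slide. In this sketch the local encoding spreads the values evenly over [0, 1], matching None=0.0, Some=0.5, Full=1.0, and the distributed encoding uses one input per value; the function names are hypothetical.

```python
VALUES = ["None", "Some", "Full"]


def local_encoding(value):
    # One input unit; attribute values are spaced evenly over [0, 1].
    return [VALUES.index(value) / (len(VALUES) - 1)]


def distributed_encoding(value):
    # One input unit per attribute value; only the matching unit is on.
    return [1.0 if v == value else 0.0 for v in VALUES]


local = [local_encoding(v) for v in VALUES]
distributed = [distributed_encoding(v) for v in VALUES]
```

Local encoding keeps the input layer small but imposes an ordering on the values, while distributed encoding adds units and makes no such assumption.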

25
Example

26
Summary (I)
Neural networks are modeled on the human brain.
Although computers have the faster switching speed, the brain is still superior at what it does, and it is more fault-tolerant.
A neural network consists of nodes (units) connected by links; each link has a numeric weight.
Learning: updating the weights.
Each unit has two computational components:
a linear component, the input function;
a nonlinear component, the activation function.

27
Summary (II)
In this text, we consider only feed-forward networks.
Unidirectional links, no cycles: a DAG (directed acyclic graph).
No links between units in the same layer, no links backward to a previous layer, and no links that skip a layer.
Processing proceeds uniformly from input units to output units.
No internal state.

28
Summary (III)
Network size determines representational power.
Overfitting occurs when there are too many parameters.
A feed-forward NN with one sufficiently large hidden layer can approximate any continuous function.
A feed-forward NN with two hidden layers can approximate any function, including discontinuous ones.

29
Summary (IV)
Perceptron: a single-layer, feed-forward network.
Each output unit is independent of the others, because each weight affects only one of the outputs.
Perceptrons can represent only linearly separable functions.
If the problem space is flat (the weight space has no local minima), a neural network works very well.
In other words, if the problem is easy from an algorithmic perspective, it is easy for a neural network too.
In general, back-propagation guarantees only local optimality in a neural network.
