
1 Neural Network, by Danny Manongga

2 Overview
Basics of Neural Network
Advanced Features of Neural Network
Applications I-II
Summary

3 Basics of Neural Network
What is a Neural Network
Neural Network Classifier
Data Normalization
Neuron and bias of a neuron
Single Layer Feed Forward
Limitation
Multi Layer Feed Forward
Back propagation

4 What is a Neural Network?
A biologically motivated approach to machine learning, with similarities to biological networks. The fundamental processing element of a neural network is a neuron, which:
1. Receives inputs from other sources
2. Combines them in some way
3. Performs a generally nonlinear operation on the result
4. Outputs the final result
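A minimal sketch of those four steps in code (the tanh nonlinearity and the input/weight values here are illustrative assumptions, not values from the slides):

```python
import math

def neuron(inputs, weights, activation=math.tanh):
    """One artificial neuron: combine weighted inputs, apply a nonlinearity, output."""
    combined = sum(w * x for w, x in zip(weights, inputs))  # steps 1-2: receive and combine
    return activation(combined)                             # steps 3-4: nonlinear op, output

print(neuron([0.5, -1.0, 2.0], [0.1, 0.4, -0.2]))  # made-up inputs and weights
```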

5 Similarity with Biological Network
The fundamental processing element of a neural network is a neuron. A human brain has 100 billion neurons; an ant brain has 250,000 neurons.

8 Synapses, the basis of learning and memory

9 Real Neural Networks
The business end of the brain is made up of lots of neurons joined together in networks. Our own computations are performed in and by this network. This type of computer is fabulous at pattern recognition.

13 Neural Network
A Neural Network is a set of connected INPUT/OUTPUT UNITS, where each connection has a WEIGHT associated with it. Neural Network learning is also called CONNECTIONIST learning due to the connections between units. It is a case of SUPERVISED, INDUCTIVE or CLASSIFICATION learning.

14 Neural Network
A Neural Network learns by adjusting the weights so as to correctly classify the training data and hence, after the testing phase, to classify unknown data. A Neural Network needs a long time for training, but has a high tolerance to noisy and incomplete data.

16 Artificial Neural Networks
[Diagrams: an artificial neuron (node); an ANN (neural network).]
Nodes do very simple number crunching. Numbers flow from left to right: the numbers arriving at the input layer get "transformed" to a new set of numbers at the output layer. There are many kinds of nodes, and many ways of combining them into a network, but we need only be concerned with the types described here, which turn out to be sufficient for any (consistent) pattern classification task.

20 Different types of node: A Threshold Linear Unit (TLU)
[Diagram: a TLU with threshold T=1; inputs 2.6, 2.0 and 1.0 arrive on connections with weights -0.5, 0.1 and -1.0.]
Output: if the weighted sum >= threshold, output 1, else 0.
Here, the weighted sum is: 2.6 x (-0.5) + 2.0 x 0.1 + 1.0 x (-1.0) = -2.1, which is below the threshold, so the output is 0.
TLUs are useful in teaching the basics of NNs; networks of them can do anything, so TLUs have all the required power. But they are difficult to use in practice (it is hard to design the right network of TLUs for the job in hand), which is why we normally use ... (next slide)
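A minimal sketch of this TLU in code (the input/weight pairing follows the reconstruction above):

```python
def tlu(inputs, weights, threshold=1.0):
    """Threshold Linear Unit: output 1 if the weighted sum reaches the threshold, else 0."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s >= threshold else 0

# The slide's example: weighted sum = -2.1, below T=1, so the output is 0.
print(tlu([2.6, 2.0, 1.0], [-0.5, 0.1, -1.0]))  # -> 0
```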

21 A Logistic Unit
[Diagram: the same inputs 2.6, 2.0, 1.0 and weights -0.5, 0.1, -1.0, but the node now outputs f(weighted sum) = 0.109.]
It calculates the weighted sum of inputs, as before, and then squashes that sum between 0 and 1 using the logistic function:
f(x) = 1 / (1 + e^(-x))

22 The Logistic Map
[Plot: the logistic function f against the weighted_sum, an S-shaped curve rising from 0 to 1.]
Doing this makes it easier to train NNs to do what we want them to. (Sometimes f is slightly different, e.g. so that the output is between -1 and 1.)
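A quick check of the logistic unit's arithmetic (a sketch; the 1/(1+e^-x) form is the standard logistic function, which the slide's output value implies):

```python
import math

def logistic(x):
    """Standard logistic (sigmoid): squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

s = 2.6 * -0.5 + 2.0 * 0.1 + 1.0 * -1.0   # weighted sum = -2.1
print(logistic(s))                         # -> 0.1091..., the 0.109 on the slide
```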

23 Bias Units
[Diagram: the same logistic unit, with an extra input fixed at 1 arriving on a connection with weight -3.1.]
In practice, ANNs tend to have logistic units, with one of the inputs fixed to 1 (this is treated as input arriving from a fixed 'bias' unit). The connection from the bias unit has a weight, which can be varied, and this amounts to a kind of threshold which is gradually learned.
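The same trick as a sketch in code, folding the bias in as an extra fixed input of 1 (the -3.1 weight is the one shown in the diagram):

```python
import math

def logistic_unit(inputs, weights, bias_weight):
    """Logistic unit with a bias input fixed at 1: the bias weight acts as a learned threshold."""
    s = sum(x * w for x, w in zip(inputs, weights)) + 1.0 * bias_weight
    return 1.0 / (1.0 + math.exp(-s))

print(logistic_unit([2.6, 2.0, 1.0], [-0.5, 0.1, -1.0], -3.1))  # bias pulls the sum down to -5.2
```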

24 Simple ANN Example (with TLUs)
[Diagram, reconstructed from the listed weights: inputs A and B feed two hidden units; both reach the first hidden unit with weight 1, and the second with weight 0.5; the first hidden unit feeds the output with weight 1, the second with weight -1.]
This one calculates XOR of the inputs A and B. Each non-input node is an LTU (linear threshold unit) with a threshold of 1, which means: if the weighted sum of its inputs is >= 1, it fires out a 1; otherwise it fires out a zero.
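A sketch verifying that this wiring (as reconstructed above) really computes XOR:

```python
step = lambda s, t=1.0: 1 if s >= t else 0   # the TLU firing rule with threshold 1

def xor_net(a, b):
    """OR and AND hidden TLUs feeding an output TLU: fires for OR-and-not-AND, i.e. XOR."""
    h1 = step(1.0 * a + 1.0 * b)        # OR: fires if at least one input is 1
    h2 = step(0.5 * a + 0.5 * b)        # AND: 0.5 + 0.5 = 1, so fires only if both are 1
    return step(1.0 * h1 - 1.0 * h2)    # fires if h1 and not h2

for a in (0, 1):
    for b in (0, 1):
        print(a, b, '->', xor_net(a, b))   # 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```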

25 Computing AND with an NN
[Diagram: inputs A and B feed the blue output node, each with weight 0.5.]
The blue node is the output node. It adds the weighted inputs, and outputs 1 if the result is >= 1, otherwise 0.

26 Computing OR with an NN
[Diagram: inputs A and B feed the green output node, each with weight 1.]
The green node is the output node. It adds the weighted inputs, and outputs 1 if the result is >= 1, otherwise 0. With these weights, only one of the inputs needs to be a 1, and the output will be 1. The output will be 0 only if both inputs are zero.

27 Computing NOT with an NN
[Diagram: input A feeds the blue output node with weight -1; a bias unit, which always sends a fixed signal of 1, feeds it with weight 1.]
This NN computes the NOT of input A. The blue unit is a threshold unit with a threshold of 1, as before. So if A is 1, the weighted sum at the output unit is 0, hence the output is 0; if A is 0, the weighted sum is 1, so the output is 1.
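The three gates as code, using the weights given on the three slides above (a sketch; the step helper is the same TLU firing rule as before):

```python
step = lambda s, t=1.0: 1 if s >= t else 0

AND = lambda a, b: step(0.5 * a + 0.5 * b)   # both inputs needed to reach the threshold
OR  = lambda a, b: step(1.0 * a + 1.0 * b)   # either input alone reaches it
NOT = lambda a:    step(-1.0 * a + 1.0 * 1)  # the fixed bias signal of 1 carries weight 1

print(AND(1, 1), OR(0, 1), NOT(1), NOT(0))   # -> 1 1 0 1
```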

28 So, an NN can compute AND, OR and NOT – so what?
It is straightforward to combine ANNs together, with outputs from some becoming the inputs of others, etc. That is, we can combine them just like logic gates on a microchip. Since we can compute AND, OR and NOT, we can therefore compute any logical function at all with a network of TLUs. This is a powerful result, because it means a network of TLUs can do anything.
The catch is that, for most interesting problems (precisely the ones where we want to use ANNs), we don't know what the logical function is. E.g. given gene expression patterns (or blood test results, etc.) from the cells of each of a number of patients, we have no idea what the function is which would indicate which of those patients is in the early stages of cancer. But, given some data where we know the required outputs, there are algorithms which can find appropriate sets of weights for an ANN, so that:
The ANN learns an appropriate function by itself
The ANN can therefore provide the correct outputs for each of the patterns it was trained with
The ANN can therefore make predictions of the correct output for patterns it has never seen before

29 To put it another way
Imagine this: an image of a handwritten character is converted into an array of grey levels (the inputs), which feed a network with 26 outputs, one for each character (a, b, c, d, e, f, ...).
Weights on the links are chosen such that the output corresponding to the correct letter emits a 1, and all the others emit a 0. This sort of thing is not only possible, but routine: medical diagnosis, wine-tasting, lift-control, sales prediction, ...

30 Getting the Right Weights
Clearly, an application will only be accurate if the weights are right. An ANN starts with randomised weights, and with a database of known examples for training. If an input pattern corresponds to a "c", we want the "c" output to emit 1 and the others 0. If the network gets it wrong, the weights are adjusted in a simple way which makes it more likely that the ANN will be correct for this input next time.
The training loop: send a training pattern in; crunch it through to the outputs; if some outputs are wrong, adjust the weights and repeat; if all are correct, STOP.
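That loop as a minimal sketch in code. The slide does not name the adjustment rule, so the classic perceptron update on a single TLU is an assumption here:

```python
import random

def step(s, t=1.0):
    return 1 if s >= t else 0

def train(examples, n_inputs, rate=0.1, epochs=100):
    """examples: list of (inputs, target) pairs, with targets 0 or 1."""
    w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs)]   # start with randomised weights
    for _ in range(epochs):
        all_correct = True
        for x, target in examples:                             # send training pattern in
            out = step(sum(wi * xi for wi, xi in zip(w, x)))   # crunch to output
            if out != target:                                  # some wrong: adjust weights
                w = [wi + rate * (target - out) * xi for wi, xi in zip(w, x)]
                all_correct = False
        if all_correct:                                        # all correct: STOP
            break
    return w

w = train([([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)], 2)  # learn AND from examples
```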

31 Training Algorithms
Classic: Backpropagation (BP)
Others: variants of BP, evolutionary algorithms, etc.
Details: not needed for this specific course; you will learn BP in the Connectionism module, if you take that.
Here are some good tutorials (there are plenty more):
Here's some excellent free software:
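For the curious, a minimal backpropagation sketch for one logistic hidden layer and one logistic output. This is an illustration of the classic algorithm, not code from the course; numpy, the squared-error loss, and the hyperparameters are assumptions:

```python
import numpy as np

sig = lambda z: 1 / (1 + np.exp(-z))
add_bias = lambda a: np.hstack([a, np.ones((len(a), 1))])   # append the fixed bias input of 1

def forward(X, W1, W2):
    h = sig(add_bias(X) @ W1)          # hidden layer activations
    return h, sig(add_bias(h) @ W2)    # network output

def train_bp(X, y, n_hidden=4, rate=0.5, epochs=10000, seed=0):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1] + 1, n_hidden))   # +1 row for the bias weight
    W2 = rng.normal(0, 0.5, (n_hidden + 1, 1))
    for _ in range(epochs):
        h, out = forward(X, W1, W2)
        d_out = (out - y) * out * (1 - out)           # error signal at the output
        d_h = (d_out @ W2[:-1].T) * h * (1 - h)       # error propagated back to the hidden layer
        W2 -= rate * add_bias(h).T @ d_out            # gradient steps on both weight matrices
        W1 -= rate * add_bias(X).T @ d_h
    return W1, W2

# XOR, which no single unit can represent, is the classic test:
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1, W2 = train_bp(X, y)
print(forward(X, W1, W2)[1].round(2))   # should be close to [[0], [1], [1], [0]]
```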

32 Generalisation
The ANN is learning during its training phase. When it is in use, providing decisions/classifications for live cases it hasn't seen before, we expect a reasonable decision from it, i.e. we want it to generalise well.
[Figure: three decision boundaries trained on the same black As and Bs, with a white A as an unseen test case; the panels are labelled "Well learned", "Poor generalisation" and "Diabolical". In the third example, the network classifies the unseen A as a B.]
Coverage and extent of training data helps to avoid poor generalisation.
Main point: when an NN generalises well, its results seem sensible, intuitive, and generally more accurate than people.

33 ANN points
One main issue is deciding how to present the data as input patterns to the network, and how to encode the output (next slide).
Another issue is the topology. A network needs three layers of nodes (i.e. one layer between the inputs and outputs, called the hidden layer) in order to be capable of learning any function, but how many nodes? Often this is trial and error; a rule of thumb is to use 40% more nodes than in the input layer.
Commonly employed in control applications (lift, environment, chemical process). Also credit rating: NNs and similar are used by banks to make decisions about accepting or declining loans. Also prediction (sales, sunspots, stocks, horse races).
Can be slow (hours, days) to train, but decision time is always instant.
Very useful in bioinformatics: currently, a standard NN provides better accuracy than anything else in predicting secondary structure (regions of helix, sheet, loop) from protein primary sequence. Also great at classifying expression patterns.

34 Encoding
Suppose we have gene expression patterns from each of 10 cancerous skin cells, one from each patient:
1: 3.2, 4.6, 0.2, 1.7, 1.1 .... (20 numbers)
2: , 2.9, 1.2, 1.3, 1.8 ....
...
10: 6.4, 1.1, 1.0, 1.7, 1.0 ....
And we also have gene expression patterns from each of 10 healthy skin cells from different patients:
1: , 4.7, 3.1, 0.8, 1.2 .... (20 numbers)
2: , 3.9, 4.2, 0.3, 1.8 ....
...
10: 5.4, 2.1, 2.0, 1.9, 1.0 ....

35 We might set up the ANN like this
[Diagram: 20 input nodes (receiving e.g. 3.2, 4.6, 0.2, ...), all of which are connected to all of 30 hidden nodes, which are in turn all connected to one output node.]
With the rule that: if output > 0.5, the prediction is cancerous, else the prediction is healthy.
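A sketch of that 20-30-1 setup and its decision rule (random weights stand in for trained ones, the padded expression values are made up, and numpy is assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
sig = lambda z: 1 / (1 + np.exp(-z))
W1 = rng.normal(size=(20, 30))    # 20 input nodes, all connected to 30 hidden nodes
W2 = rng.normal(size=(30, 1))     # 30 hidden nodes, all connected to one output node

expression = np.array([3.2, 4.6, 0.2] + [1.0] * 17)   # one cell's 20 expression values
output = sig(sig(expression @ W1) @ W2).item()
print("cancerous" if output > 0.5 else "healthy")     # the slide's 0.5 decision rule
```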

36 Or like this ...
[Diagram: the same inputs, scaled to lie between -1 and 1, feeding two output nodes O1 and O2 (emitting e.g. 0.9 and -0.04); all of the hidden nodes are connected to both outputs.]
With the rule that: if O1's output is higher than O2's, the prediction is cancer; if O2's output is higher than O1's, the prediction is healthy.
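The alternative decision rule is just a comparison of the two outputs (a sketch; o1 and o2 stand for the trained network's two output values):

```python
def predict(o1, o2):
    """Two-output encoding: whichever output is higher wins."""
    return "cancer" if o1 > o2 else "healthy"

print(predict(0.9, -0.04))   # the diagram's example outputs -> cancer
```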

37 About encoding
There are various other ways (you will see how things are usually set up for protein secondary structure prediction in a later lecture). The choice depends on things we won't go into here (out of scope), but they are essentially commonsense reasons, based on what would seem to give the ANN the best chance of learning the required function.

38 A Final Note
NNs can provide very accurate predictions, but it is difficult to figure out how they work out their predictions. However, we can inspect the weights of a trained NN to gain some insight into what it is about the inputs that leads to certain outputs.
E.g. consider a network taking data on racehorse past form and current conditions as input, and predicting whether the horse will win the race. If we train this network and find that the weights on the links from the "number of previous wins" unit are all very low, we can infer that this input contributes little or nothing to the prediction. The "length of race" input may have high weights, indicating this is a key factor, etc.
