Creating Data Representations

On the other hand, sets of orthogonal vectors (such as 100, 010, 001) can be processed by the network more easily. This becomes clear when we consider that a neuron's net input signal is computed as the inner product of the input and weight vectors. The geometric interpretation of these vectors shows that orthogonal vectors are especially easy to discriminate for a single neuron.
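As a minimal sketch of this point in NumPy (the vectors and the "tuned" weight vector below are illustrative choices, not part of the lecture):

```python
import numpy as np

# Orthogonal (one-hot) input vectors: all pairwise inner products are 0.
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
c = np.array([0.0, 0.0, 1.0])

w = a.copy()  # a neuron whose weight vector is "tuned" to pattern a
for name, x in [("a", a), ("b", b), ("c", c)]:
    print(name, np.dot(w, x))   # net input: 1.0 for a, 0.0 for b and c

# Non-orthogonal codes overlap, so one neuron responds to several patterns:
print(np.dot(np.array([1.0, 1.0, 0.0]), np.array([0.0, 1.0, 1.0])))  # 1.0
```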
Another way of representing n-ary data in a neural network is using one neuron per feature, but scaling the (analog) value to indicate the degree to which a feature is present.

Good examples:
- the brightness of a pixel in an input image
- the distance between a robot and an obstacle

Poor examples:
- the letter (1 – 26) of a word
- the type (1 – 6) of a chess piece
This can be explained as follows: The way NNs work (both biological and artificial ones) is that each neuron represents the presence/absence of a particular feature. Activations 0 and 1 indicate absence or presence of that feature, respectively, and in analog networks, intermediate values indicate the extent to which a feature is present. Consequently, a small change in one input value leads to only a small change in the network's activation pattern.
Therefore, it is appropriate to represent a non-binary feature by a single analog input value only if this value is scaled, i.e., it represents the degree to which a feature is present. This is the case for the brightness of a pixel or the output of a distance sensor (feature = obstacle proximity). It is not the case for letters or chess pieces. For example, assigning values to individual letters (a = 0, b = 0.04, c = 0.08, …, z = 1) implies that a and b are in some way more similar to each other than are a and z. Obviously, in most contexts, this is not a reasonable assumption.
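A small sketch makes the contrast visible (the helper one_hot and the toy printouts are illustrative, not part of the lecture):

```python
import numpy as np

# Single scaled input (a = 0, b = 0.04, ..., z = 1): numeric distance
# falsely suggests that some letters are more similar than others.
scalar = {ch: i / 25 for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}
print(abs(scalar["a"] - scalar["b"]))   # 0.04 -> "a is almost b"
print(abs(scalar["a"] - scalar["z"]))   # 1.0  -> "a is maximally far from z"

# One-hot (orthogonal) encoding treats all letters as equally dissimilar:
def one_hot(ch):
    v = np.zeros(26)
    v[ord(ch) - ord("a")] = 1.0
    return v

print(np.dot(one_hot("a"), one_hot("b")))   # 0.0
print(np.dot(one_hot("a"), one_hot("z")))   # 0.0
```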
It is also important to notice that in artificial (not natural!) fully connected networks, the order of features that you specify for your input vectors does not influence the outcome. For the network's performance, it is not necessary to represent, for example, similar features in neighboring input units. All units are treated equally; the neighborhood of two neurons does not suggest to the network that they represent similar features. Of course, once you have specified a particular order, you cannot change it any more during training or testing.
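A minimal sketch of this invariance, assuming NumPy (all values are random stand-ins): permuting the inputs and the weight columns consistently leaves every net input unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(5)             # inputs in some fixed feature order
W = rng.random((3, 5))        # weights of a fully connected layer

perm = rng.permutation(5)     # any other fixed ordering of the features
y_original = W @ x
y_permuted = W[:, perm] @ x[perm]   # reorder inputs and weights consistently

print(np.allclose(y_original, y_permuted))   # True
```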
If you wanted to represent the state of each square on the tic-tac-toe board by one analog value, which would be the better way to do this?

Option 1: empty = 0, X = 0.5, O = 1
Not a good scale! It goes from "neutral" to "friendly" and then "hostile".

Option 2: X = 0, empty = 0.5, O = 1
A more natural scale! It goes from "friendly" to "neutral" and then "hostile".
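As a small sketch, assuming the second scale (the board position is made up):

```python
# Hypothetical helper using the more natural scale from above.
encode = {"X": 0.0, " ": 0.5, "O": 1.0}

board = ["X", "O", " ",
         " ", "X", " ",
         "O", " ", "X"]

inputs = [encode[square] for square in board]   # nine analog input values
print(inputs)   # [0.0, 1.0, 0.5, 0.5, 0.0, 0.5, 1.0, 0.5, 0.0]
```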
Representing Time

So far we have only considered static data, that is, data that do not change over time. How can we format temporal data to feed them into an ANN in order to detect spatiotemporal patterns or even predict future states of a system? The basic idea is to treat time as another input dimension. Instead of just feeding the current data (time t0) into our network, we expand the input vectors to contain n data vectors measured at t0, t0 − Δt, t0 − 2Δt, t0 − 3Δt, …, t0 − (n − 1)Δt.
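As a minimal sketch, assuming discrete measurements stored in a Python list (the function name time_windows is an illustrative choice):

```python
# Slide a window of length n over a series so that each input vector
# holds the values at t0 - (n-1)*dt, ..., t0 - dt, t0.
def time_windows(series, n):
    return [series[i - n + 1 : i + 1] for i in range(n - 1, len(series))]

measurements = [3, 5, 4, 6, 7, 6, 8, 9]
for vec in time_windows(measurements, 4):
    print(vec)   # [3, 5, 4, 6], [5, 4, 6, 7], ...
```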
For example, if we want to predict stock prices based on their past values (although other factors also play a role):

[Figure: a stock price (between $0 and $1,000) plotted over time at t0 − 6Δt, t0 − 5Δt, …, t0 − Δt, t0, with the unknown value at t0 + Δt to be predicted.]
In this case, our input vector would include seven components, each of them indicating the stock value at a particular point in time. These stock values have to be normalized, i.e., divided by $1,000, if that is the estimated maximum value that could occur. Then there would be a hidden layer, whose size depends on the complexity of the task. And there could be exactly one output neuron, indicating the stock price after the following time interval (to be multiplied by $1,000).
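A minimal sketch of such a forward pass, assuming a sigmoid network with an arbitrarily chosen hidden layer of 4 units (all prices are made-up values):

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.uniform(-0.5, 0.5, (4, 7))   # 7 inputs -> 4 hidden units
W2 = rng.uniform(-0.5, 0.5, (1, 4))   # 4 hidden units -> 1 output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

past_prices = np.array([310, 320, 305, 330, 340, 335, 350])  # dollars
x = past_prices / 1000.0                   # normalize into [0, 1]
y = sigmoid(W2 @ sigmoid(W1 @ x))          # network output in [0, 1]
print(y * 1000.0)                          # predicted price in dollars
```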
For example, a backpropagation network could do this task. It would be trained with many stock price samples that were recorded in the past, so that the price for time t0 + Δt is already known. This price at time t0 + Δt would be the desired output value of the network and be used to apply the BPN learning rule. Afterwards, if past stock prices indeed allow the prediction of future ones, the network will be able to give some reasonable stock price predictions.
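A self-contained sketch of such training, assuming plain gradient descent with the standard sigmoid backpropagation rule (the toy price series, hidden size, and learning rate are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.uniform(-0.5, 0.5, (4, 7))
W2 = rng.uniform(-0.5, 0.5, (1, 4))
eta = 0.5                                  # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Normalized toy history; each sample pairs 7 past values with the next one.
series = (500.0 + 100.0 * np.sin(np.arange(60) / 5.0)) / 1000.0
samples = [(series[i:i + 7], series[i + 7]) for i in range(len(series) - 7)]

for epoch in range(1000):
    for x, target in samples:
        h = sigmoid(W1 @ x)                             # hidden activations
        y = sigmoid(W2 @ h)                             # output activation
        delta_out = (target - y) * y * (1.0 - y)        # output error term
        delta_hid = (W2.T @ delta_out) * h * (1.0 - h)  # hidden error terms
        W2 += eta * np.outer(delta_out, h)              # BPN weight updates
        W1 += eta * np.outer(delta_hid, x)

mse = np.mean([(t - sigmoid(W2 @ sigmoid(W1 @ x)))**2 for x, t in samples])
print(mse)   # should be small after training
```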
Another example: Let us assume that we want to build a very simple surveillance system. We receive bitmap images at constant time intervals and want to determine for each quadrant of the image whether there is any motion visible in it, and what the direction of this motion is. Let us assume that each image consists of 10 by 10 grayscale pixels with values from 0 to 255. Let us further assume that we only want to determine one of the four directions N, E, S, and W.
As said before, it makes sense to represent the brightness of each pixel by an individual analog value. We normalize these values by dividing them by 255. Consequently, if we were only interested in individual images, we would feed the network with input vectors of size 100. Let us assume that two successive images are sufficient to detect motion. Then at each point in time, we would like to feed the network with the current image and the previous image that we received from the camera.
We can simply concatenate the vectors representing these two images, resulting in a 200-dimensional input vector. Therefore, our network would have 200 input neurons, and a certain number of hidden units. With regard to the output, would it be a good idea to represent the direction (N, E, S, or W) by a single analog value? No, these values do not represent a scale, so this would make the network computations unnecessarily complicated.
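As a minimal sketch of the input construction (the function name and the random stand-in frames are illustrative):

```python
import numpy as np

# Build the 200-dimensional input vector from two successive
# 10x10 grayscale frames with pixel values 0..255.
def make_input(prev_frame, curr_frame):
    prev = np.asarray(prev_frame, dtype=float).ravel() / 255.0  # 100 values
    curr = np.asarray(curr_frame, dtype=float).ravel() / 255.0  # 100 values
    return np.concatenate([prev, curr])                         # 200 values

rng = np.random.default_rng(2)
frame_t0 = rng.integers(0, 256, (10, 10))   # stand-ins for camera images
frame_t1 = rng.integers(0, 256, (10, 10))
print(make_input(frame_t0, frame_t1).shape)   # (200,)
```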
Better solution: 16 output neurons with the following interpretation: one output neuron for each combination of a quadrant (Q1, Q2, Q3, Q4) and a direction (N, E, S, W), i.e., a 4 × 4 grid of outputs. This way, the network can, in a straightforward way, indicate the direction of motion in each quadrant. Each output value could specify the amount (or speed?) of the corresponding type of motion.
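One hypothetical way to read those outputs (the ordering Q1..Q4, each with four direction values N, E, S, W, is an assumption for illustration):

```python
import numpy as np

directions = ["N", "E", "S", "W"]
outputs = np.random.random(16)              # stand-in for network outputs

for q in range(4):
    row = outputs[4 * q : 4 * q + 4]        # the four values for quadrant q+1
    strongest = directions[int(np.argmax(row))]
    print(f"Q{q + 1}: strongest motion {strongest} (value {row.max():.2f})")
```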