
1 Statistical Classification Methods 1.Introduction 2.k-nearest neighbor 3.Neural networks 4.Decision trees 5.Support Vector Machine

2 What is classification? Assign new observations to known classes using a previously trained model. The model is trained on existing data with known labels. Introduction

3 Machine Learning is the study of computer algorithms that improve automatically through experience. [Machine Learning, Tom Mitchell, McGraw Hill, 1997] (Diagram: training data of example input/output pairs feeds a learning machine, which produces a classifier that maps inputs to outputs.) Machine Learning for Classification

4 Classification Steps Introduction

5 Examples of Classification Tasks: Classifying cells as cancerous or non-cancerous. Classifying credit card transactions as legitimate or fraudulent. Classifying secondary structures of protein as alpha-helix, beta-sheet, or random coil. Categorizing news stories as finance, weather, entertainment, sports, etc. Introduction

6 Prototype-based methods: K-Nearest Neighbour (KNN), Weighted KNN, Fuzzy KNN, etc. Boundary-based methods: Neural Networks, such as the Multi-Layer Perceptron (MLP), Back Propagation (BP), and the Support Vector Machine (SVM). Rule-based methods: Decision Trees. Classification Methods used in DM

7 Classification Method 1: kNN - Basic Information Training method: – Save the training examples. At prediction time: – Find the k training examples (x_1, y_1), …, (x_k, y_k) that are nearest to the test example x. – Predict the most frequent class among those y_i's. kNN

8 kNN Steps Store all input data in the training set. For each sample in the test set: search for the K nearest samples to the input sample using a Euclidean distance measure. For classification, compute the confidence for each class as C_i/K (where C_i is the number of samples among the K nearest samples belonging to class i). The classification for the input sample is the class with the highest confidence. kNN

9 An arbitrary instance is represented by (a_1(x), a_2(x), a_3(x), …, a_n(x)), where a_i(x) denotes the i-th feature. Euclidean distance between two instances: d(x_i, x_j) = sqrt( Σ_{r=1..n} (a_r(x_i) − a_r(x_j))^2 ). For a continuous-valued target function, predict the mean value of the k nearest training examples. kNN Calculation kNN
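The following is a minimal kNN sketch in Python/NumPy that follows the steps and distance formula above; the function name knn_predict, the value of k, and the toy data are illustrative assumptions rather than part of the original slides.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify one test sample x by majority vote among its k nearest neighbours."""
    # Euclidean distance d(x_i, x_j) = sqrt(sum_r (a_r(x_i) - a_r(x_j))^2)
    dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]                 # indices of the k closest training samples
    votes = Counter(y_train[i] for i in nearest)
    label, count = votes.most_common(1)[0]
    confidence = count / k                          # C_i / K from the slide
    return label, confidence

# Illustrative toy data: two 2-D classes
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9]), k=3))
```

For a continuous-valued target, the same neighbour search applies but the prediction becomes the mean of the k nearest y values.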

10 kNN Calculation - 1-Nearest Neighbor kNN

11 kNN Calculation - 3-Nearest Neighbor kNN

12 In-Class Practice 1 Data – Iris.arff and your own data (if applicable) Method – k-NN – Parameters (select by yourself) Software – wekaclassalgos1.7 Step – Explorer -> Classify -> Classifier (Lazy - IBk)

13 Classification Method 2: Neural Networks – Biological inspiration Animals are able to react adaptively to changes in their external and internal environment, and they use their nervous system to produce these behaviours. An appropriate model/simulation of the nervous system should be able to produce similar responses and behaviours in artificial systems. The nervous system is built from relatively simple units, the neurons, so copying their behaviour and functionality should be the solution. Neural networks

14 A neural network is an interconnected group of nodes Neural Networks – Basic Structure Neural networks

15 Neural Networks – Biological inspiration Neural networks (Diagram: a biological neuron with its dendrites, soma (cell body), and axon.)

16 Neural Networks – Biological inspiration Neural networks (Diagram: synapses connecting the axon of one neuron to the dendrites of the next.) Information transmission happens at the synapses.

17 Neural Networks – Biological inspiration Neural networks The spikes (signal) travelling along the axon of the pre-synaptic neuron trigger the release of neurotransmitter substances at the synapse. The neurotransmitters cause excitation (+) or inhibition (-) in the dendrite of the post-synaptic neuron. The integration of the excitatory and inhibitory signals may produce spikes in the post-synaptic neuron. The contribution of the signals depends on the strength of the synaptic connection.

18 Neural Networks – Artificial neurons Neural networks Neurons work by processing information. They receive and provide information in the form of spikes. The McCulloch-Pitts model: (Diagram: inputs x_1, …, x_n with synaptic weights w_1, …, w_n feeding a single unit that produces output y.)

19 Neural Networks – Artificial neurons Neural networks The McCulloch-Pitts model: spikes are interpreted as spike rates; synaptic strengths are translated into synaptic weights; excitation means a positive product between the incoming spike rate and the corresponding synaptic weight; inhibition means a negative product between the incoming spike rate and the corresponding synaptic weight.

20 Neural Networks – Artificial neurons Neural networks Nonlinear generalization of the McCulloch-Pitts neuron: y = f(x, w), where y is the neuron's output, x is the vector of inputs, and w is the vector of synaptic weights. Examples: the sigmoidal neuron and the Gaussian neuron.
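A small sketch of the two example neuron types; the formulas used are the standard sigmoidal and Gaussian forms, assumed here because the slide's equations are not reproduced in the transcript.

```python
import numpy as np

def sigmoidal_neuron(x, w, a=0.0):
    """Sigmoidal neuron: y = 1 / (1 + exp(-(w.x + a))), the standard form (assumed)."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + a)))

def gaussian_neuron(x, w, a=1.0):
    """Gaussian neuron: y = exp(-||x - w||^2 / (2 a^2)), the standard form (assumed)."""
    return np.exp(-np.linalg.norm(x - w) ** 2 / (2.0 * a ** 2))

x = np.array([0.5, 1.0])
w = np.array([0.3, -0.2])
print(sigmoidal_neuron(x, w), gaussian_neuron(x, w))
```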

21 Neural Networks – Artificial neural networks Neural networks (Diagram: inputs feeding an interconnected network that produces an output.) An artificial neural network is composed of many artificial neurons that are linked together according to a specific network architecture. The objective of the neural network is to transform the inputs into meaningful outputs.

22 Neural Networks Neural networks Learning in biological systems Learning = learning by adaptation The young animal learns that the green fruits are sour, while the yellowish/reddish ones are sweet. The learning happens by adapting the fruit picking behaviour. At the neural level the learning happens by changing of the synaptic strengths, eliminating some synapses, and building new ones.

23 Neural Networks Neural networks Learning as optimisation The objective of adapting the responses on the basis of the information received from the environment is to achieve a better state. E.g., the animal likes to eat many energy rich, juicy fruits that make its stomach full, and makes it feel happy. In other words, the objective of learning in biological organisms is to optimise the amount of available resources, happiness, or in general to achieve a closer to optimal state.

24 Neural Networks Neural networks Learning in biological neural networks The learning rules of Hebb: synchronous activation increases the synaptic strength; asynchronous activation decreases the synaptic strength. These rules fit with energy minimization principles: maintaining synaptic strength costs energy, so strength should be maintained where it is needed and not where it is not.

25 Neural Networks Neural networks Learning principle for artificial neural networks ENERGY MINIMIZATION We need an appropriate definition of energy for artificial neural networks, and having that we can use mathematical optimisation techniques to find how to change the weights of the synaptic connections between neurons. ENERGY = measure of task performance error

26 Neural Networks – mathematics Neural networks (Diagram: inputs propagated through the network layers to the output.)

27 Neural Networks – mathematics Neural networks Input/output transformation y = F(x, W), where W is the matrix of all weight vectors. F is effectively two functions: a weighted sum of the inputs followed by an activation function.

28 Neural Networks – Perceptron Neural networks ● Basic unit in a neural network ● Linear separator ● Parts: – N inputs, x_1 … x_n – Weights for each input, w_1 … w_n – A bias input x_0 (constant) and associated weight w_0 – Weighted sum of inputs, z = w_0 x_0 + w_1 x_1 + … + w_n x_n – A threshold function, i.e. y = 1 if z > 0, y = −1 if z ≤ 0

29 Neural Networks – Perceptron Neural networks (Diagram: inputs x_0, x_1, …, x_n with weights w_0, w_1, …, w_n feed a summation unit computing z = Σ w_i x_i, followed by a threshold that outputs 1 if z > 0 and −1 otherwise.)

30 Neural Networks – Perceptron Neural networks Learning in the Perceptron: Start with random weights. Select an input pair (x, F(x)). If the perceptron's output y differs from the target F(x), modify the weights; the standard rule is w ← w + η (F(x) − y) x. Note that the weights are not modified if the network gives the correct answer.
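A minimal sketch of this learning procedure, assuming the standard perceptron update w ← w + η (target − y) x; the toy OR-style data, learning rate, and epoch count are illustrative.

```python
import numpy as np

def train_perceptron(X, t, eta=0.1, epochs=20):
    """Train a perceptron with targets t in {-1, +1}; a constant bias input x0 = 1 is appended."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])   # bias column x0
    w = np.random.uniform(-0.5, 0.5, X.shape[1])   # start with random weights
    for _ in range(epochs):
        for x, target in zip(X, t):
            y = 1 if np.dot(w, x) > 0 else -1      # threshold output
            if y != target:                        # weights change only on a wrong answer
                w += eta * (target - y) * x
    return w

# Linearly separable toy data (OR-like problem)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
t = np.array([-1, 1, 1, 1])
print(train_perceptron(X, t))
```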

31 Neural Networks – Perceptron Neural networks A learning rate can be added to speed up the learning process; it is simply multiplied into the delta computation. The perceptron is essentially a linear discriminant. Perceptron theorem: if a linear discriminant exists that can separate the classes without error, the training procedure is guaranteed to find that line or plane. With only one layer, however, the perceptron has trouble with complex (non-linearly-separable) problems.

32 Neural Networks Neural networks MLP Backpropagation networks Popularized by Rumelhart and McClelland in the mid 1980s. Can construct multilayer networks; typically we have fully connected, feedforward networks. (Diagram: inputs feeding a multilayer feedforward network that produces an output.)

33 Neural Networks – MLP BP Neural networks Learning Procedure: Randomly assign weights (between 0 and 1). Present inputs from training data and propagate them to the outputs. Compute the outputs O and adjust the weights according to the delta rule, backpropagating the errors; the weights are nudged closer to values that produce the desired output. Repeat; stop when there are no errors, or when enough epochs have been completed.
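A sketch of this procedure for a one-hidden-layer network trained with plain batch backpropagation; the XOR data, layer size, learning rate, and epoch count are illustrative assumptions, and the weights are initialised in a small symmetric range (rather than the 0-1 range mentioned above) because that tends to train more reliably.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(X, T, n_hidden=8, eta=0.5, epochs=10000, seed=0):
    """One-hidden-layer MLP trained by backpropagating the output errors."""
    rng = np.random.default_rng(seed)
    W1 = rng.uniform(-0.5, 0.5, (X.shape[1], n_hidden))   # small symmetric initial weights
    W2 = rng.uniform(-0.5, 0.5, (n_hidden, T.shape[1]))
    for _ in range(epochs):
        # forward pass: present inputs and propagate to the outputs
        H = sigmoid(X @ W1)
        O = sigmoid(H @ W2)
        # backward pass: delta rule, errors backpropagated from output to hidden layer
        delta_out = (O - T) * O * (1 - O)
        delta_hid = (delta_out @ W2.T) * H * (1 - H)
        W2 -= eta * H.T @ delta_out
        W1 -= eta * X.T @ delta_hid
    return W1, W2

# XOR: a problem a single-layer perceptron cannot solve
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])
W1, W2 = train_mlp(X, T)
# after successful training the outputs should approach the XOR targets 0, 1, 1, 0
print(sigmoid(sigmoid(X @ W1) @ W2).round(2))
```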

34 Neural Networks – MLP BP Neural networks (Diagram: when an error is found at the output, a perceptron can only change the weights of its single layer, whereas MLP BP changes the weights of the earlier hidden layers as well.)

35 Neural Networks – MLP BP Neural networks Very powerful: given enough hidden units, such a network can learn any function. It has the usual trade-off between generalization and memorization: with too many units, the network tends to memorize the input and not generalize well. Some schemes exist to “prune” the neural network. Networks require extensive training, have many parameters to fiddle with, can be extremely slow to train, and may fall into local minima. The algorithm is inherently parallel and thus ideal for multiprocessor hardware. Despite the cons, it is a very powerful algorithm that has seen widespread successful deployment.

36 Neural Networks – MLP BP Neural networks Parameters: number of layers, number of neurons per layer, transfer function (activation function), number of iterations (cycles).

37 In-Class Practice 2 Data – Iris.arff (Weka format) and your own data (if applicable) – Iris.txt (Neucom format) Method – Back-Propagation and Multi-Layer Perceptron – Parameters (select by yourself) Software – wekaclassalgos1.7 Steps: Explorer -> Classify -> Classifier (Neural – MultilayerPerceptron – BackPropagation) – Neucom Steps: Modeling Discovery -> Classification -> Neural Networks -> Multi-Layer Perceptron

38 Self Organizing Map (SOM) Neural networks Characteristics: 1. Uses a neighbourhood function. 2. Maps high-dimensional input data onto a low-dimensional grid. Two concepts: 1. Training builds the map using input examples. 2. Mapping classifies a new input vector. Components: nodes (neurons), each with a weight vector of the same dimension as the input data vectors and a position in the map space.

39 – Gaussian neighborhood function: h_ji = exp(−d_ji^2 / (2σ^2)). – d_ji: lattice distance between neurons i and j: |j − i| in a 1-dimensional lattice, ||r_j − r_i|| in a 2-dimensional lattice, where r_j is the position of neuron j in the lattice. Neighborhood Function Neural networks

40 (Diagram: the neighbourhood of winning neuron 13 at two successive iterations, N_13(1) and N_13(2).) Neural networks

41 – σ measures the degree to which excited neurons in the vicinity of the winning neuron cooperate in the learning process. – In the learning algorithm, σ is updated at each iteration during the ordering phase using an exponential decay rule, σ(n) = σ_0 exp(−n/τ_1), with parameters σ_0 (initial width) and τ_1 (time constant). Neighborhood Function Neural networks
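A small helper sketch of the Gaussian neighbourhood and its exponential width decay; the parameter values sigma0 and tau1 follow the standard SOM formulation and are illustrative assumptions, since the slide's equations are not reproduced in the transcript.

```python
import numpy as np

def neighborhood(r_j, r_i, sigma):
    """Gaussian neighbourhood h_ji = exp(-d_ji^2 / (2 sigma^2)) between lattice positions."""
    d = np.linalg.norm(np.asarray(r_j, float) - np.asarray(r_i, float))
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))

def sigma_at(n, sigma0=3.0, tau1=1000.0):
    """Exponential decay of the neighbourhood width: sigma(n) = sigma0 * exp(-n / tau1)."""
    return sigma0 * np.exp(-n / tau1)

print(neighborhood((0, 0), (1, 2), sigma_at(0)))      # wide neighbourhood early in training
print(neighborhood((0, 0), (1, 2), sigma_at(5000)))   # much narrower later on
```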

42 Neighborhood Function Neural networks (Plots: degree of neighbourhood versus distance from the winner, shown at successive points in time.)

43 SOM – Algorithm Steps 1. Randomly initialise all weights. 2. Select an input vector x = [x_1, x_2, x_3, …, x_n] from the training set. 3. Compare x with the weight vector w_j of each neuron j. 4. Determine the winner: find the unit j with the minimum distance to x. 5. Update the winner so that it becomes more like x, together with the winner's neighbours within the radius, according to the standard SOM rule w_j ← w_j + η(n) h_j(n) (x − w_j) (see the sketch after this list). 6. Adjust parameters: learning rate and neighbourhood function. 7. Repeat from (2) until convergence. Note: the learning rate generally decreases with time. Neural networks
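A compact SOM training sketch following the steps above, using the standard update w_j ← w_j + η(n) h_j(n) (x − w_j); the grid size, decay schedules, and toy data are illustrative assumptions.

```python
import numpy as np

def train_som(X, grid=(5, 5), epochs=50, eta0=0.5, sigma0=2.0, tau=1000.0, seed=0):
    """Train a 2-D SOM: for each sample, find the winning unit and pull it (and its
    lattice neighbours) towards the sample, shrinking learning rate and radius over time."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    W = rng.uniform(X.min(), X.max(), (rows * cols, X.shape[1]))   # step 1: random weights
    # lattice positions r_j of each neuron
    pos = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n = 0
    for _ in range(epochs):
        for x in rng.permutation(X):                               # step 2: select input vector
            dists = np.linalg.norm(W - x, axis=1)                  # steps 3-4: compare, find winner
            winner = np.argmin(dists)
            eta = eta0 * np.exp(-n / tau)                          # learning rate decreases with time
            sigma = sigma0 * np.exp(-n / tau)                      # neighbourhood radius shrinks
            lat_d2 = ((pos - pos[winner]) ** 2).sum(axis=1)
            h = np.exp(-lat_d2 / (2.0 * sigma ** 2))               # Gaussian neighbourhood
            W += eta * h[:, None] * (x - W)                        # step 5: update winner and neighbours
            n += 1
    return W, pos

# Illustrative 2-D data drawn from two clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
W, pos = train_som(X)
print(W[:5])
```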

44 SOM - Architecture A lattice of neurons ('nodes') accepts and responds to a set of input signals. Responses are compared and the 'winning' neuron is selected from the lattice. The selected neuron is activated together with its 'neighbourhood' neurons. An adaptive process changes the weights to more closely resemble the inputs. (Diagram: a 2-D array of neurons; each neuron j receives the full set of input signals x_1 … x_n through weights w_j1 … w_jn.) Neural networks

45 In-Class Practice 3 Data – Iris.arff and your own data (if applicable) Method – SOM – Parameters (select by yourself) Software – wekaclassalgos1.7 Step – Explorer -> Classify -> Classifier (Functions - SOM)

