Zip Codes and Neural Networks: Machine Learning for Handwritten Number Recognition


Taylor Harbold, Michelle Page, Courtney Rasmussen
Supervisor: Dr. Cuixian Chen
UNCW Department of Mathematics and Statistics

Introduction

The neural network is an idea from neuroscience that dates back to the 1940s. It began with electrical circuits used to model how neurons work and has since led to major advances in artificial intelligence. Neural networks use a deliberately simplified picture of the synaptic processes in the brain to interpret information: raw input is taken in, organized and combined in a particular way, and a conclusion is reached. In statistics, neural networks mimic these processes through methods such as projection pursuit regression and back-propagation: linear combinations of the data are passed through non-linear functions whose weights improve with training, much as the brain improves with practice. The result is a comparatively simple method for solving complex problems. Although neural networks have a wide array of uses, such as facial recognition and stock prediction, we applied them to predict the true value of handwritten digits from ZIP code data. Using MATLAB and the RSNNS package in RStudio, we built prediction models that can "read" human handwriting.

Statistical Techniques

A neural network consists of three layers: an input layer, a hidden layer, and an output layer. The input layer is the vector $x = (x_1, x_2, \ldots, x_p)^T$, where $p$ is the number of input dimensions. The next layer is the hidden layer, made up of hidden neurons $Z_m$ computed as $Z_m = \sigma(\alpha_{0m} + \alpha_m^T X)$ for $m = 1, 2, \ldots, M$, where $M$ is the number of hidden neurons in the layer. The most widely used activation function is the sigmoid, $\sigma(v) = \frac{1}{1 + e^{-v}}$. The output layer is the response $y = (y_1, y_2, \ldots, y_K)$, computed as $y_k = g_k(T_k)$ with $T_k = \beta_{0k} + \beta_k^T Z$. The initial weights (slopes) and biases (intercepts) of $Z$ and $Y$ are generated at random from the interval $[-1, 1]$. In most cases the sigmoid function also serves as the non-linear output function $g_k$. If $K = 1$ the network performs regression; if $K > 1$ a classification network is needed.

Model Refinement

To find the optimal model, we first needed to find the number of hidden neurons that gives the best accuracy. We tested 5, 50, 75, and 100 hidden neurons. The first series of analyses showed that 100 hidden neurons worked best: more hidden neurons generally give a more flexible model, which was useful here because each image supplies 256 inputs. The learning rate is the length of the step taken when the weights and biases are updated at each iteration of back-propagation. Holding it constant at 0.1, the test accuracy with 100 hidden neurons was 0.9387145, compared with 0.937718 for 75 hidden neurons at the same learning rate. Because these two models were so close, we went on to compare models at different learning rates to find the most accurate one.
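The poster does not reproduce the fitting script itself, so the following is only a minimal sketch of how a single-hidden-layer back-propagation classifier like the one described above could be fit with the RSNNS package in R. The data-frame names zip_train and zip_test, the assumed layout (first column the digit label, remaining 256 columns the pixel values), and the iteration count are illustrative assumptions, not details taken from the poster.

    # Sketch only: fit a single-hidden-layer back-propagation classifier with RSNNS.
    library(RSNNS)

    train_x <- as.matrix(zip_train[, -1])         # 256 pixel inputs per digit (assumed layout)
    train_y <- decodeClassLabels(zip_train[, 1])  # ten binary response columns, one per digit 0-9
    test_x  <- as.matrix(zip_test[, -1])
    test_y  <- decodeClassLabels(zip_test[, 1])

    # 100 hidden neurons and learning rate 0.5, the configuration the
    # refinement study ultimately selected; maxit is an assumed value.
    fit <- mlp(train_x, train_y,
               size            = 100,
               learnFunc       = "Std_Backpropagation",
               learnFuncParams = c(0.5),
               maxit           = 300)

    # Test-set accuracy: fraction of digits whose predicted class matches the true class.
    pred <- predict(fit, test_x)
    mean(encodeClassLabels(pred) == encodeClassLabels(test_y))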
Table of Model Accuracy: Single Hidden Layer Back-Propagation Network for Classification

Hidden neurons    Learning rate    Min. weighted SSE    Training accuracy    Test accuracy
5 (base model)    0.1              2700                 0.8777945            0.7927255
50                0.1               900                 0.9950624            0.9317389
75                0.1               850                 0.9954739            0.937718
100               0.1               825                 0.9962968            0.9387145
100               0.3               800                 0.995611             0.935725
100               0.5               780                 0.996434             --
100               0.5               750                 0.9969826            0.9431988
100               0.6               --                  0.9965711            0.9342302
100               0.6               --                  0.9968454            0.9402093
("--" marks values not legible in the transcript.)

The second stage of model refinement was needed to find the best learning rate. Using 100 hidden neurons, we tested the model at learning rates of 0.1, 0.3, 0.5, and 0.6. Accuracy increased as the learning rate rose from 0.1 to 0.5, but began to decline once the learning rate passed 0.5. The learning rate that produced the best model at 100 hidden neurons was 0.5, giving a test accuracy of 0.9431988; in other words, we were able to predict the ZIP code digits with 94.31988% accuracy and a 5.68% error rate.

[Figure: graphical summary of the weighted SSE for the base model and the final model.]

In the United States, postal codes are a vital part of the postal service. ZIP codes consist of five digits, each ranging from zero to nine, and usually follow a pattern tied to geographical location within each region of the country. For example, North Carolina ZIP codes start with 27 or 28, whereas Massachusetts ZIP codes start with 010 through 027. Some states are restricted to a single two-digit prefix; in Utah, for instance, all ZIP codes start with 84.

[Figures: geographical location of the first two digits of the ZIP code; examples of handwritten ZIP codes.]

Data

The data used to analyze handwritten ZIP codes were taken from envelopes automatically scanned by the U.S. Postal Service. Each digit of the ZIP code was isolated and converted to an eight-bit grey-scale map by overlaying a 16-by-16 grid on the image; each of the resulting 256 pixels takes a value from 0 to 255, depending on the size and clarity of the handwriting. Neural networks made it possible to predict handwritten ZIP codes from a training data set. The training and test data each have ten response columns, one for each possible digit (0, 1, 2, ..., 9), coded in binary form with 0 for false and 1 for true. The training set contains 7,291 rows; the test set is organized the same way and contains 2,007 rows.

Sources: http://www.whereig.com/usa/zipcodes/ and https://web.stanford.edu/~hastie/local.ftp/Springer/OLD/ESLII_print4.pdf (Hastie, Tibshirani & Friedman, The Elements of Statistical Learning).

Worked example. [Diagram: a small network with an INPUT layer, a HIDDEN LAYER, and a RESPONSE.] For inputs $X_1 = 1$, $X_2 = 0$, $X_3 = 1$:

$Z_1 = \sigma(\alpha_{01} + \alpha_{11}X_1 + \alpha_{21}X_2 + \alpha_{31}X_3) = \sigma(-0.4 + 0.2 \cdot 1 + 0.4 \cdot 0 + (-0.5) \cdot 1) = \sigma(-0.7) = \frac{1}{1 + e^{0.7}} = 0.332$, therefore $Z_1 = 0.332$.

$Y_1 = g_1(0.1 + (-0.3 \cdot 0.332) + (-0.2 \cdot 0.525)) = g_1(-0.105)$, where 0.525 is the output $Z_2$ of the second hidden neuron, and $\sigma(-0.105) = \frac{1}{1 + e^{0.105}} = 0.474$, therefore $Y_1 = 0.474$.
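The short R check below (not part of the original poster) reproduces the worked forward-pass numbers; the value $Z_2 = 0.525$ is taken as given, since the transcript does not list the weights feeding the second hidden neuron.

    sigmoid <- function(v) 1 / (1 + exp(-v))

    x  <- c(1, 0, 1)                                      # inputs X1, X2, X3
    z1 <- sigmoid(-0.4 + sum(c(0.2, 0.4, -0.5) * x))      # sigmoid(-0.7) ~ 0.332
    z2 <- 0.525                                           # quoted in the example
    y1 <- sigmoid(0.1 + sum(c(-0.3, -0.2) * c(z1, z2)))   # sigmoid(-0.105) ~ 0.474
    round(c(Z1 = z1, Y1 = y1), 3)                         # 0.332 0.474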
After running the model, the error of each perceptron is calculated with a formula that depends on the layer. For the output layer we use $Err_k = O_k(1 - O_k)(T_k - O_k)$, where $O_k$ is the output value for $y_k$ and $T_k$ is the true value. For the hidden layer, the formula $Err_m = z_m(1 - z_m)\sum_{k=1}^{K} Err_k \, \beta_{mk}$ is needed. These errors are then used to update the weights and biases between each pair of perceptrons, with $\ell$ denoting the learning rate (a small numerical sketch of one such update appears at the end of this transcript):

For weights: $w_{ij}^{new} = w_{ij}^{old} + \ell \, Err_j \, O_i$
For biases: $\theta_j^{new} = \theta_j^{old} + \ell \, Err_j$

Running the model again with the new weights and biases brings the output values closer to the true values of $Y$ with each iteration.

Base model: 5 hidden neurons, learning rate 0.1, test accuracy 0.7927255.
Final model: 100 hidden neurons, learning rate 0.5, test accuracy 0.9431988.

At this point it was clear that, to predict handwritten ZIP code digits most accurately, the model needs 100 hidden neurons and a learning rate of 0.5. With this model, handwritten ZIP code digits can be predicted with 94.31988% accuracy and a 5.68% error rate.

Future Studies

In future research, it would be worthwhile to vary the number of hidden layers in the model. Changing the number of hidden layers would influence the model and help determine whether a regression or a clustering approach is more suitable. The hidden layers could later be used to build a hierarchical system that analyzes the images at different levels of resolution.
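To make the update formulas concrete, here is a minimal numerical sketch (assumed, not the authors' code) of one back-propagation step for the output layer of the small worked-example network; the learning rate $\ell = 0.5$ and the true value $T_1 = 1$ are illustrative assumptions.

    sigmoid <- function(v) 1 / (1 + exp(-v))

    l  <- 0.5                          # learning rate (assumed)
    z  <- c(0.332, 0.525)              # hidden outputs Z1, Z2 from the worked example
    b0 <- 0.1                          # output-layer bias (theta)
    b  <- c(-0.3, -0.2)                # output-layer weights (beta)
    o1 <- sigmoid(b0 + sum(b * z))     # network output, ~0.474
    t1 <- 1                            # assumed true value

    err_out    <- o1 * (1 - o1) * (t1 - o1)    # Err_k = O_k(1 - O_k)(T_k - O_k)
    err_hidden <- z * (1 - z) * (err_out * b)  # Err_m = z_m(1 - z_m) * sum_k Err_k * beta_mk
    b  <- b  + l * err_out * z                 # w_new = w_old + l * Err_j * O_i
    b0 <- b0 + l * err_out                     # theta_new = theta_old + l * Err_j

    # The same update rule, applied with err_hidden and the inputs X,
    # adjusts the hidden-layer weights alpha before the next iteration.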