Parallel Neural Networks
Joe Bradish

Background
- Deep Neural Networks (DNNs) have become one of the leading technologies in artificial intelligence and machine learning
- Used extensively by major corporations – Google, Facebook
- Very expensive to train
  - Large datasets – often in the terabytes
  - The larger the dataset, the more accurately the network can model the underlying classification function
- The limiting factor has almost always been computational power, but we are starting to reach levels that can solve previously impossible problems

Quick Neural Net Basics
- A set of layers of neurons connected by weighted links (synapses)
- Each neuron has a set of inputs
  - Either inputs into the entire network
  - Or outputs from previous neurons, usually from the previous layer
- The underlying algorithm of the network can vary greatly
  - Multilayer feed-forward
  - Feedback network
  - Self-organizing maps – map from a high dimension to a lower one in one layer
  - Sparse Distributed Memory – two-layer feedforward, associative memory
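To make the layered feed-forward idea concrete, here is a minimal NumPy sketch of a forward pass; the layer sizes, sigmoid activation, and random (untrained) weights are illustrative assumptions, not details taken from the slides.

```python
import numpy as np

def sigmoid(z):
    # Squashing activation applied by each neuron to its weighted input sum.
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Propagate an input vector through a feed-forward network.
    weights[i] has shape (n_out, n_in) for layer i; biases[i] has shape (n_out,)."""
    activation = x
    for W, b in zip(weights, biases):
        activation = sigmoid(W @ activation + b)
    return activation

# Illustrative, untrained 3-2-1 network with random weights.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 3)), rng.standard_normal((1, 2))]
biases = [np.zeros(2), np.zeros(1)]
print(forward(np.array([0.5, -1.0, 2.0]), weights, biases))
```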

Learning / Training
- Everything is reliant on the weights
  - They determine the importance of each signal, which is essential to the network output
- The training cycle adjusts the weights
  - By far the most critical step of a successful neural network
  - No training = useless network
- Network topology is also key to training, and especially to parallelization
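"The training cycle adjusts the weights" boils down to repeatedly nudging each weight against the error gradient. Below is a hedged single-neuron example; the sigmoid activation, squared-error loss, and learning rate are assumptions for illustration, not the method used in the presentation.

```python
import numpy as np

def sgd_step(w, b, x, target, lr=0.1):
    """One stochastic-gradient weight update for a single sigmoid neuron.
    This is the kind of adjustment a training cycle repeats over many examples."""
    y = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))   # forward pass
    delta = (y - target) * y * (1.0 - y)            # error signal for squared error
    w = w - lr * delta * x                          # weights move against the gradient
    b = b - lr * delta
    return w, b

# Example: nudge a 3-input neuron toward outputting 1.0 for this input.
w, b = sgd_step(np.zeros(3), 0.0, np.array([1.0, 0.5, -0.5]), target=1.0)
```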

Where can we use parallelization?
- Typical structure of a neural network training run (see the loop skeleton below):
  - For each training session
    - For each training example in the session
      - For each layer in the network
        - For each neuron in the layer
          - For all the weights of the neuron
            - For all the bits of the weight value
- This structure implies these levels of parallelism:
  - Training-session parallelism
  - Training-example parallelism
  - Layer parallelism
  - Neuron parallelism
  - Weight parallelism
  - Bit parallelism
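A skeleton of those nested loops in Python, with a comment marking the level of parallelism each loop exposes; the object names (sessions, examples, layers, neurons) are placeholders for illustration, not an actual implementation.

```python
def train(sessions):
    # Skeleton only: each loop level is a candidate for parallelization.
    for session in sessions:                          # training-session parallelism
        for example in session.examples:              # training-example (exemplar) parallelism
            for layer in session.network.layers:      # layer parallelism
                for neuron in layer.neurons:          # neuron parallelism
                    for weight in neuron.weights:     # weight parallelism
                        # Bit-level parallelism happens inside the arithmetic on each weight.
                        pass
```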

Example – Network-Level Parallelism
- Notice that there are many different neural networks, some of which feed into each other
- The outputs are sent to different machines and then aggregated once more

Example – Neuron-Level Parallelism
- Each neuron is assigned a specific controlling entity on the communication network
- Each computer is responsible for forwarding its weights to the hub so that the computer controlling the next layer can feed them into the neural network
- Uses a broadcast system (see the sketch below)
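A hedged sketch of that idea using mpi4py; treating each MPI rank as the controlling entity for a single neuron, with rank 0 acting as the hub, is purely an illustrative assumption and not the exact scheme from the slides.

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD

def fire_neuron(inputs, weights, bias):
    """Each rank controls one neuron: it computes its own activation, the hub
    (rank 0) gathers every neuron's output, and the assembled vector is
    broadcast back so the ranks handling the next layer can consume it."""
    inputs = comm.bcast(inputs, root=0)                  # hub broadcasts the layer's inputs
    my_output = np.tanh(np.dot(weights, inputs) + bias)  # this rank's neuron fires
    outputs = comm.gather(my_output, root=0)             # hub collects all neuron outputs
    return comm.bcast(outputs, root=0)                   # ...and rebroadcasts them
```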

Parallelism by Node
- Used to parallelize serial backpropagation
  - Usually implemented as a series of matrix-vector operations
  - Achieved using all-to-all broadcasts
- Each node (in a cluster) is responsible for a subset of the network
- Uses a master broadcaster

Parallelization by Node (cont.)
- Forward propagation is straightforward (see the mpi4py sketch below):
  1. Master broadcasts the previous layer's output vector
  2. Each process computes its subset of the current layer's output vector
  3. Master gathers from all processes and prepares the vector for the next broadcast
- Backward propagation is more complicated:
  1. Master scatters the error vector for the current layer
  2. Each process computes the weight changes for its subset
  3. Each process computes its contribution to the previous layer's error vector
  4. Each process sends its contribution to the error vector to the master
  5. Master sums the contributions and prepares the previous layer's error vector for broadcast
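A hedged sketch of the forward-propagation steps with mpi4py; the slides say MPI was used, but this particular layout, where each rank owns a horizontal slice of the layer's weight matrix, is an assumption made for illustration.

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

def forward_layer(prev_output, my_weight_rows, my_bias):
    """Node-parallel forward pass for one layer.
    my_weight_rows holds only the rows of the weight matrix this rank owns."""
    # 1. Master broadcasts the previous layer's output vector to every process.
    prev_output = comm.bcast(prev_output, root=0)
    # 2. Each process computes its subset of the current layer's output vector.
    partial = 1.0 / (1.0 + np.exp(-(my_weight_rows @ prev_output + my_bias)))
    # 3. Master gathers the pieces and concatenates them for the next broadcast.
    pieces = comm.gather(partial, root=0)
    return np.concatenate(pieces) if rank == 0 else None
```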

Results of Node Parallelization
- MPI used for communication between nodes
- 32-machine cluster of Intel Pentium IIs
- Up to 16.36x speedup with 32 processes

Results of Node Parallelization (cont.)

Parallelism by Training Example
- Each process determines the weight changes on a disjoint subset of the training population
- Changes are aggregated and then applied to the neural network after each epoch (one pass over the training set)
- Low levels of synchronization needed
  - Only requires two additional steps
- Very simple to implement
- Uses a master-slave style topology (see the sketch below)
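A hedged sketch of exemplar parallelism with mpi4py; compute_gradient is a hypothetical helper standing in for backpropagation on one example, and the round-robin data split and learning rate are illustrative choices rather than details from the presentation.

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

def train_epoch(weights, examples, compute_gradient, lr=0.01):
    """Each process accumulates weight changes over its own disjoint slice of
    the training set; the changes are summed and applied once per epoch."""
    my_examples = examples[rank::size]                 # disjoint subset of the population
    local_change = np.zeros_like(weights)
    for x, target in my_examples:
        local_change += compute_gradient(weights, x, target)
    # The only synchronization point: sum every process's contribution...
    total_change = np.zeros_like(weights)
    comm.Allreduce(local_change, total_change, op=MPI.SUM)
    # ...and apply the aggregated change identically on every node.
    return weights - lr * total_change
```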

Speedups Using Exemplar Parallelization
- Maximum speedup with 32 processes: 16.66x

Conclusion
- Many different strategies for parallelization
  - The best strategy depends on the shape, size, and type of the training data
  - Node parallelism excels on small datasets and on-line learning
  - Exemplar parallelism gives the best performance on large training datasets
- Different topologies will perform radically differently when using the same parallelization strategy
- On-going research
  - GPUs have become very prevalent, due to their ability to perform matrix operations in parallel
    - Sometimes it is harder to link multiple GPUs
  - Large clusters of weaker machines have also become prevalent, due to reduced cost
  - Amazon, Google, and Microsoft offer commercial products for scalable neural networks on their clouds

Questions?