TRTRL: A Localized Resource-Efficient Learning Algorithm for Recurrent Neural Networks
August 8, 2006
Danny Budik, Itamar Elhanany
Machine Intelligence Lab, Department of Electrical & Computer Engineering, University of Tennessee

Outline
Background and Motivation
Overview of RTRL
The Truncated-RTRL (TRTRL) algorithm
Simulation results
Conclusions and future work

Recurrent Neural Networks (RNNs)
Feedforward neural networks are stateless (memory-less) and statically map from the input space to the output space
RNNs contain internal feedback loops that allow the network to capture temporal dependencies
Many tasks require learning of temporal sequences, including:
Sequential pattern recognition (e.g. speech recognition)
Time series prediction
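To make the stateless-versus-stateful distinction concrete, here is a minimal sketch contrasting a feedforward map with a recurrent step that carries a hidden state between time steps; the function and variable names are illustrative and not taken from the slides.

```python
import numpy as np

def feedforward_step(x, W):
    # Stateless: the output depends only on the current input x.
    return np.tanh(W @ x)

def recurrent_step(x, h_prev, W_in, W_rec):
    # Stateful: the output also depends on the previous hidden state,
    # so the network can capture temporal dependencies across a sequence.
    return np.tanh(W_in @ x + W_rec @ h_prev)
```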

RNNs (cont.)
Numerous RNN architectures have been proposed:
Elman networks – simplistic (Markovian-type) RNNs
Back-propagation through time (BPTT)
Time-Delay Neural Networks (TDNNs)
The above are not designed for online (real-time) applications
Real-Time Recurrent Learning (RTRL):
Targets online training of RNNs
Calculates the gradient without the need for explicit history
However, it requires extensive storage and computational resources that limit its scalability

Real-Time Recurrent Learning (RTRL)
Originally described by Williams and Zipser in 1989
The activation of neuron k is defined as y_k(t+1) = f_k(s_k(t)), where s_k denotes the weighted sum of all activations leading to neuron k
The network error at time t is defined by E(t) = ½ Σ_k [d_k(t) − y_k(t)]², where d_k(t) denotes the desired target value for output neuron k
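A small numerical sketch of these two definitions, assuming a logistic activation function; the array names and shapes are my own choices, not taken from the slides.

```python
import numpy as np

def activations(z, W):
    # s_k: weighted sum of all activations (and inputs) feeding neuron k
    # y_k = f_k(s_k), here with a logistic squashing function
    s = W @ z
    return 1.0 / (1.0 + np.exp(-s))

def network_error(y, d, output_idx):
    # E(t) = 1/2 * sum_k (d_k(t) - y_k(t))^2 over the output neurons
    e = d - y[output_idx]
    return 0.5 * np.sum(e ** 2)
```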

RTRL: Updating Weights
The error is minimized by adjusting the weights along a positive multiple of the gradient of the performance measure (i.e. gradient descent on the error), such that Δw_ij(t) = −α ∂E(t)/∂w_ij
The partial derivatives of the activations with respect to the weights are identified as sensitivity elements and are denoted by p^k_ij(t) = ∂y_k(t)/∂w_ij
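A hedged sketch of the resulting weight update expressed in terms of the sensitivity elements; the learning-rate name and the array layout p[k, i, j] are assumptions made here for illustration.

```python
import numpy as np

def weight_update(e, p, lr):
    # e[k]      : error d_k - y_k (zero for non-output neurons)
    # p[k, i, j]: sensitivity dy_k / dw_ij
    # dw[i, j]  = lr * sum_k e[k] * p[k, i, j], i.e. gradient descent on E
    return lr * np.einsum('k,kij->ij', e, p)
```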

Recursive Calculation of the Sensitivity Elements in RTRL
The sensitivities of node k with respect to a change in weight w_ij can be obtained recursively as p^k_ij(t+1) = f_k'(s_k(t)) [ Σ_l w_kl p^l_ij(t) + δ_ik z_j(t) ]
These key computations require each neuron to:
Perform O(N³) multiplications, resulting in a total computational complexity of O(N⁴)
Access O(N²) storage, yielding a total storage requirement of O(N³)
Example: for N = 100, this means 100 million multiplications and over 2 MB of storage!
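To make the cost argument concrete, here is a deliberately naive sketch of the recursion above, written so the nested loops expose the complexity; the loop structure and names are mine, and the logistic derivative y(1 − y) is an assumed activation choice.

```python
import numpy as np

def rtrl_sensitivity_step(p, W, z, y):
    # p[k, i, j] = dy_k / dw_ij from the previous step
    # W: N x N recurrent weights, z: activations/inputs feeding the neurons,
    # y: current logistic activations (so f'(s_k) = y_k * (1 - y_k))
    N = W.shape[0]
    fprime = y * (1.0 - y)
    p_next = np.zeros_like(p)
    for k in range(N):
        for i in range(N):
            for j in range(N):
                # The inner dot product is O(N), so each neuron performs O(N^3)
                # multiplications and the full update costs O(N^4) per time step.
                acc = np.dot(W[k], p[:, i, j])
                if i == k:
                    acc += z[j]
                p_next[k, i, j] = fprime[k] * acc
    return p_next
```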

Truncated Real-Time Recurrent Learning (TRTRL)
Motivation: obtain a scalable version of the RTRL algorithm while minimizing performance degradation
How? Limit the sensitivities of each neuron to its ingress (incoming) and egress (outgoing) links

Performing Sensitivity Calculations in TRTRL
For all nodes that are not in the output set, the egress sensitivity values for node j are calculated by imposing j = k in the original RTRL sensitivity equation
The ingress sensitivity values for node j are obtained in a similar manner
For output neurons, a nonzero sensitivity element must exist in order to update the weights
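The exact truncated update equations appear as figures on the original slide and are not reproduced here; the sketch below only illustrates the storage consequence of the truncation described above, namely that each neuron keeps sensitivities only for its ingress and egress weights. The dictionary layout is an illustrative assumption.

```python
def trtrl_sensitivity_storage(N):
    # Illustrative storage layout only (not the TRTRL update rule itself).
    # Neuron k keeps sensitivities only for its ingress weights w_kj and
    # egress weights w_jk, i.e. about 2N entries per neuron and O(N^2)
    # entries overall instead of the O(N^3) required by full RTRL.
    p = {}
    for k in range(N):
        for j in range(N):
            p[(k, k, j)] = 0.0   # ingress: dy_k / dw_kj
            p[(k, j, k)] = 0.0   # egress:  dy_k / dw_jk
    return p
```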

Resource Requirements of TRTRL
The network structure remains the same with TRTRL; only the calculation of the sensitivities is reduced
Significant reduction in resource requirements:
The computational load of each neuron drops from O(N³) to O(2KN), where K denotes the number of output neurons
The total computational complexity is now O(2KN²)
Storage requirements drop from O(N³) to O(N²)
Example revisited: for N = 100 and 10 outputs, 100k multiplications and only 20 KB of storage!

Evaluating TRTRL Performance
Testing criteria:
Tests must involve nonlinear functions that exploit the characteristics of RNNs
Temporal dependency: two identical inputs could have different outputs depending on the time step

Test case #1: Frequency Doubler
Input: sin(x); target output: sin(2x)
Both networks had 15 neurons
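A minimal sketch of how the frequency-doubler training data could be generated; the sampling step and range are illustrative choices, not taken from the slides.

```python
import numpy as np

# Frequency-doubler task data: input sin(x), target sin(2x).
x = np.arange(0.0, 20.0 * np.pi, 0.1)   # sampling grid (assumed)
inputs = np.sin(x)
targets = np.sin(2.0 * x)
```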

Test case #2: Mackey-Glass Chaotic Series Predictor
The series is based on a time-delayed differential equation
The task is much more difficult than the frequency doubler
Both networks had 25 neurons, predicting the output with a 30-time-step dependency
Similar performance was observed for both learning algorithms
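For reference, a minimal sketch that generates a Mackey-Glass series by Euler integration of the standard delay differential equation dx/dt = βx(t−τ)/(1 + x(t−τ)ⁿ) − γx(t); the parameter values (τ = 17, β = 0.2, γ = 0.1, n = 10) and step size are the commonly used ones and are an assumption here, not taken from the slides.

```python
import numpy as np

def mackey_glass(n_steps, tau=17, beta=0.2, gamma=0.1, n=10, dt=1.0, x0=1.2):
    # Euler integration of dx/dt = beta*x(t-tau)/(1 + x(t-tau)^n) - gamma*x(t)
    history = int(tau / dt)
    x = np.full(n_steps + history, x0)
    for t in range(history, n_steps + history - 1):
        x_tau = x[t - history]
        x[t + 1] = x[t] + dt * (beta * x_tau / (1.0 + x_tau ** n) - gamma * x[t])
    return x[history:]
```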

RTRL vs. TRTRL Training Time
As expected, the speedup gain of TRTRL increases with network size
Timings are based on non-optimized code written in MATLAB®

Conclusions and Future Work
A high-performance, resource-efficient learning algorithm for training RNNs has been introduced
Computational complexity reduced from O(N⁴) to O(N²)
Storage complexity reduced from O(N³) to O(N²)
Only minor performance degradation observed
Future work:
The algorithm is now more localized, lending itself to hardware realization
Future work will focus on an FPGA implementation of TRTRL