Soongsil University, Department of Electrical Engineering (숭실대 전기공학과), Control Information Process Lab, 김경진

• Variety of Neural Networks
• Feedforward Network - Perceptron
• Recurrent Network - Hopfield Network
▪ A network that makes its output identical to its input
▪ Used for optimization problems
• Competitive Network - Hamming Network
▪ Feedforward + Recurrent Network
▪ A network that minimizes the Hamming distance to the input
▪ No target is required
• Recurrent Layer
▪ Layer with feedback
▪ Requires an initial condition

• W = [w11 w12 w13; w21 w22 w23; w31 w32 w33]
• b = [b1 b2 b3]^T
• P1 = [ ]^T (banana)
• P2 = [ ]^T (pineapple)
• T1 = [ ]^T, T2 = [ ]^T

• Feedforward layer: W1 = [P1^T; P2^T], b1 = [R R]^T (R = number of inputs)
• Recurrent layer: W2 = [1 -ε; -ε 1], with 0 < ε < 1/(s-1) (s = number of neurons in the recurrent layer)
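A minimal sketch of this design, assuming hypothetical ±1 prototype patterns (the slide's actual banana/pineapple vectors are not shown) and a positive-linear recurrent layer as in Hagan's formulation; all names and values below are illustrative:

```python
import numpy as np

# Hypothetical +/-1 prototype patterns; the slide's banana/pineapple
# vectors are not shown, so these values are placeholders.
p1 = np.array([1, -1, -1])    # "banana"
p2 = np.array([1,  1, -1])    # "pineapple"

R = p1.size                   # R = number of inputs
W1 = np.vstack([p1, p2])      # feedforward layer: W1 = [p1^T; p2^T]
b1 = np.full(2, R)            # b1 = [R; R]

s = 2                         # neurons in the recurrent layer
eps = 0.5                     # must satisfy 0 < eps < 1/(s - 1)
W2 = np.array([[1.0, -eps],
               [-eps, 1.0]])  # recurrent (competitive) layer weights

def hamming_classify(p, n_iter=20):
    """Feedforward pass, then let the recurrent layer compete until one neuron wins."""
    a = W1 @ p + b1                    # score is highest for the closest prototype
    for _ in range(n_iter):
        a = np.maximum(0, W2 @ a)      # positive-linear recurrent iteration
    return int(np.argmax(a))           # index of the winning prototype

print(hamming_classify(np.array([1, -1, 1])))   # closest to p1 -> prints 0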

• The perceptron is a binary classifier.
• Single-neuron perceptron

• Learning Rule - Perceptron
▪ e = t - o (t = target, o = output, e = error)
▪ W = W + eX = W + (t - o)X
▪ b = b + e = b + (t - o)
▪ The resulting weight and bias values depend on the initial values
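A minimal sketch of this rule, assuming a hard-limit output and the AND patterns as training data; the names hardlim and train_perceptron are illustrative, not from the slides:

```python
import numpy as np

def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, else 0."""
    return (n >= 0).astype(float)

def train_perceptron(X, T, w, b, max_epochs=100):
    """Perceptron rule from the slide: e = t - o, W <- W + e*x, b <- b + e.
    X: (n_samples, n_inputs), T: (n_samples,) with 0/1 targets."""
    for _ in range(max_epochs):
        errors = 0
        for x, t in zip(X, T):
            o = hardlim(w @ x + b)
            e = t - o
            if e != 0:
                w = w + e * x          # W = W + (t - o) * X
                b = b + e              # b = b + (t - o)
                errors += 1
        if errors == 0:                # converged: every pattern classified correctly
            break
    return w, b

# Example: the AND problem (linearly separable), starting from zero weights and bias.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 0, 0, 1], dtype=float)
w, b = train_perceptron(X, T, w=np.zeros(2), b=0.0)
print(w, b)   # a separating weight/bias pair; exact values depend on the initial values and update order
```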

 X = [  ]  O = [ ]  Simulation Result1  Initial Weight : [0 0]  Initial Bias : 0  Iteration Number : 3  Weight : [2 2]  Bias = -2  Simulation Result2  Initial Weight : [ ]  Initial Bias : -10  Iteration Number : 4  Weight : [ ]  Bias =

• ADALINE: ADAptive LInear NEuron
• Difference from the perceptron
▪ Transfer function: hard limit (perceptron) vs. linear (ADALINE)
• Algorithm: LMS (Least Mean Square)
▪ W(k+1) = W(k) + 2αe(k)p^T(k)
▪ b(k+1) = b(k) + 2αe(k)
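A minimal per-sample LMS sketch in the same spirit, using hypothetical AND-style patterns with ±1 targets (all names and data are illustrative):

```python
import numpy as np

def train_adaline(P, T, w, b, alpha=0.1, max_iter=100, tol=1e-6):
    """LMS rule from the slide:
       W(k+1) = W(k) + 2*alpha*e(k)*p(k)^T,  b(k+1) = b(k) + 2*alpha*e(k).
    The output is linear: o = w.p + b (no hard limit). P: (n_samples, n_inputs)."""
    for _ in range(max_iter):
        mse = 0.0
        for p, t in zip(P, T):
            o = w @ p + b              # linear transfer function
            e = t - o
            w = w + 2 * alpha * e * p
            b = b + 2 * alpha * e
            mse += e ** 2
        mse /= len(P)
        if mse < tol:                  # stop early if the error is already negligible
            break
    return w, b

# Hypothetical usage on the AND patterns with +/-1 targets (illustrative only;
# a large alpha can make the iteration diverge, as the next slide warns).
P = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([-1, -1, -1, 1], dtype=float)
w, b = train_adaline(P, T, w=np.zeros(2), b=0.0, alpha=0.1)
print(w, b)
```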

 X = [  ]  O = [ ]  Simulation Result1  Initial Weight : [0 0]  Initial Bias : 0 α α : 0.5  Iteration Number : 2  Weight : [ ]  Bias = -0.5  Simulation Result2  Initial Weight : [ ]  Initial Bias : -10 α α : 0.5  Iteration Number : 2  Weight : [ ]  Bias =

• Cautions for the simulation
• Find an appropriate value of α
▪ If α is too large, the iteration diverges
▪ If α is too small, the number of iterations grows
• Stop when the error no longer decreases
▪ ADALINE is a linear system
• Simulation Result 3
▪ Initial Weight: [0 0], Initial Bias: 0, α: 1.2
▪ Weight: [ ]·e153, Bias = 5.2e153 (diverged)
• Simulation Result 4
▪ Initial Weight: [0 0], Initial Bias: 0, α: 0.1
▪ Iteration Number: 162
▪ Weight: [ ], Bias =
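For context, the standard LMS convergence analysis (from Hagan's text, not shown on the slide) makes the "α too large diverges" remark precise:

```latex
0 < \alpha < \frac{1}{\lambda_{\max}}, \qquad
\lambda_{\max} = \text{largest eigenvalue of } R = E\!\left[\mathbf{z}\mathbf{z}^{T}\right]
```

where z is the (bias-augmented) input vector, so the usable step size shrinks as the inputs become larger or more strongly correlated.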

• Linearly Separable
▪ Classes that can be separated by a straight line
▪ AND problem
• Not Linearly Separable
▪ Classes that cannot be separated by a straight line
▪ XOR problem
▪ Cannot be classified by an ADALINE network
• Solution 1 - use multiple neurons
• Solution 2 - use multiple layers

• Solution 1. Use multiple neurons
• Increase the dimension of the target
▪ Ex) 1, 0 -> [0;0], [0;1], [1;0], [1;1]
• Simulation result
▪ Initial Weight: [1 2; -1 -5]
▪ Initial Bias: [3; -2], α: 0.5
▪ Iteration Number: 2
▪ W = [0 0; 0 0], b = [0; 0]
• Limitations - ① the target dimension must be increased ② the classification is still linear
• ∴ Solution 2. Use a Multi-Layer Perceptron

• Advantages and disadvantages of the MLP
▪ Solves problems that are not linearly separable; can approximate functions
▪ Complex structure and algorithm; may converge to a local minimum

• Back Propagation (BP)
1. Forward propagation
2. Backward propagation (sensitivity)
3. Weight and bias update
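For reference, the three steps can be written out in the notation of Hagan's Neural Network Design (one of the references at the end); this is a sketch of that standard formulation, not the equations from the slide itself:

```latex
% 1. Forward propagation through M layers
a^{0} = p, \qquad a^{m+1} = f^{m+1}\!\left(W^{m+1} a^{m} + b^{m+1}\right), \quad m = 0,\dots,M-1
% 2. Backward propagation of sensitivities
s^{M} = -2\,\dot{F}^{M}(n^{M})\,(t - a^{M}), \qquad
s^{m} = \dot{F}^{m}(n^{m})\,(W^{m+1})^{T} s^{m+1}
% 3. Weight and bias update (approximate steepest descent)
W^{m}(k+1) = W^{m}(k) - \alpha\, s^{m} \left(a^{m-1}\right)^{T}, \qquad
b^{m}(k+1) = b^{m}(k) - \alpha\, s^{m}
```

Here the sensitivities s^m play the role of the backpropagated error, and Ḟ^m(n^m) is the diagonal matrix of transfer-function derivatives at layer m.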

• Weight, Bias
▪ Initialized to random values with the rand function
• Hidden Layer Neurons
▪ Number of neurons in the hidden layer (HDNEU)
▪ The more hidden neurons, the more complex the problems that can be solved
• Alpha
▪ Same concept as the step size in the Steepest Descent Method
• Stop Criteria
▪ BP is a numerical algorithm, so a criterion for stopping the training is needed
▪ Judged by the Mean Square Error (MSE)
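Putting the three BP steps and these settings together, a minimal one-hidden-layer sketch, assuming a tanh hidden layer, a linear output, and a made-up 1-D function-approximation task (all names, data, and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: approximate a 1-D function.
P = np.linspace(-2, 2, 100).reshape(1, -1)       # inputs, shape (1, N)
T = np.sin(np.pi * P / 2)                        # targets, shape (1, N)

HDNEU = 20            # hidden-layer neurons
alpha = 0.1           # learning rate (steepest-descent step size)
stop_criteria = 1e-3  # stop when the MSE falls below this
max_iter = 5000

# Random initial weights and biases (the "rand" initialization from the slide).
W1 = rng.uniform(-0.5, 0.5, (HDNEU, 1)); b1 = rng.uniform(-0.5, 0.5, (HDNEU, 1))
W2 = rng.uniform(-0.5, 0.5, (1, HDNEU)); b2 = rng.uniform(-0.5, 0.5, (1, 1))

for it in range(1, max_iter + 1):
    # 1. Forward propagation (tanh hidden layer, linear output)
    a1 = np.tanh(W1 @ P + b1)
    a2 = W2 @ a1 + b2

    e = T - a2
    mse = np.mean(e ** 2)
    if mse < stop_criteria:
        break

    # 2. Backward propagation of sensitivities
    s2 = -2 * e                                  # linear output layer
    s1 = (1 - a1 ** 2) * (W2.T @ s2)             # tanh derivative is 1 - a1^2

    # 3. Weight and bias update (batch steepest descent, averaged over samples)
    N = P.shape[1]
    W2 -= alpha * (s2 @ a1.T) / N;  b2 -= alpha * s2.mean(axis=1, keepdims=True)
    W1 -= alpha * (s1 @ P.T)  / N;  b1 -= alpha * s1.mean(axis=1, keepdims=True)

print(it, mse)   # iteration count reached and final MSE
```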

• HDNEU = 20
• α = 0.1
• Stop Criteria =
• Iteration Number: 480
• MSE: 4.85e-3
• Elapsed Time: [sec]

• BP Algorithm
• HDNEU = 20
• α = 0.2
• Stop Criteria =
• Iteration Number: 3000
▪ Did not converge within 3000 iterations
▪ Converged at 4710 iterations
• MSE:
• Elapsed Time: 739 [sec]
▪ 7.76 [sec] when the figure is not displayed

• MOBP: Momentum Back Propagation
• Backpropagation algorithm + low-pass filter
• Weight, bias update
• Variable
▪ Gamma (γ) - the pole of the filter's transfer function
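The slide's own update equations are not reproduced here, but in Hagan's standard formulation the momentum modification low-pass filters the weight changes as follows (a sketch under that assumption):

```latex
\Delta W^{m}(k) = \gamma\, \Delta W^{m}(k-1) - (1-\gamma)\,\alpha\, s^{m} \left(a^{m-1}\right)^{T}
\Delta b^{m}(k) = \gamma\, \Delta b^{m}(k-1) - (1-\gamma)\,\alpha\, s^{m}
```

Here γ is the pole of the first-order low-pass filter applied to the raw weight changes; setting γ = 0 recovers plain BP.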

• MOBP Algorithm
• HDNEU = 20
• α = 1
• γ = 0.9
• Stop Criteria =
• Iteration Number: 625
• MSE:
• Elapsed Time: 150 [sec]

• CGBP: Conjugate Gradient Back Propagation
• Adopts the Conjugate Gradient Method from optimization theory
• A more complex algorithm, but its convergence is fast
• Variables
▪ α, γ are not needed
▪ HDNEU, Stop Criteria
• Algorithm
▪ Step 1. Search direction ( )
▪ Step 2. Line search ( )
▪ Step 3. Next search direction ( )
▪ Step 4. If not converged, continue from Step 2
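The four steps correspond to the standard conjugate gradient recursion. A sketch, using the Fletcher-Reeves choice of β since the slide does not show which variant its formulas used (g_k is the gradient of the MSE with respect to all weights and biases at iteration k):

```latex
p_{0} = -g_{0}
\alpha_{k} = \arg\min_{\alpha} F\!\left(x_{k} + \alpha\, p_{k}\right)          % line search
x_{k+1} = x_{k} + \alpha_{k}\, p_{k}
\beta_{k} = \frac{g_{k}^{T} g_{k}}{g_{k-1}^{T} g_{k-1}}, \qquad
p_{k} = -g_{k} + \beta_{k}\, p_{k-1}                                           % next search direction
```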

• CGBP Algorithm
• HDNEU = 20
• Stop Criteria =
• Iteration Number: 69
• MSE:
• Elapsed Time: 22 [sec]

• HDNEU = 20
• Stop Criteria =
• Iteration Number: 125
• MSE:
• Elapsed Time: 37 [sec]

• HDNEU = 10
• Stop Criteria =
• Iteration Number: 3000
• MSE:
• Elapsed Time: 900 [sec]
• Global Minimum
▪ The LMS algorithm always reaches the global minimum
• Local Minimum
▪ The BP algorithm cannot guarantee the global minimum
▪ Multiple simulation runs are needed

• Over-Parameterization
▪ When the hidden layer has more neurons than necessary, the network learns the training data well but produces errors on other data
• Generalization Performance
▪ Testing the network's performance on inputs other than the training data

• Map every input attribute to a value between 0 and 1
• When scaling with the training data, store the minimum and maximum values and use them to scale the validation data
• Similar to the normalization idea in Nearest Neighbor
• Optional - whether to scale the target values
▪ If the target values are in the hundreds or larger, scaling is needed
• Ex) Power demand forecasting: [yesterday's demand; day of week; max temperature; min temperature] - one day of data
▪ Origin Data [ ], Modification Data [ ]
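A minimal sketch of this train-then-apply scaling, with hypothetical daily records in the [yesterday's demand; day of week; max temp; min temp] layout (all numbers and function names are illustrative):

```python
import numpy as np

def fit_minmax(train):
    """Store per-attribute min and max from the training data only."""
    return train.min(axis=0), train.max(axis=0)

def apply_minmax(data, lo, hi):
    """Scale every attribute into [0, 1] using the stored training min/max."""
    return (data - lo) / (hi - lo)

# Hypothetical rows of [yesterday's demand; day of week; max temp; min temp].
train = np.array([[61000., 1., 30.1, 22.4],
                  [58500., 2., 28.7, 21.0],
                  [63200., 3., 31.5, 23.8]])
test  = np.array([[60200., 4., 29.9, 22.0]])

lo, hi = fit_minmax(train)
train_s = apply_minmax(train, lo, hi)
test_s  = apply_minmax(test, lo, hi)     # validation/test data reuse the training min/max
print(test_s)
```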

• Power demand forecasting simulation
• HDNEU = 20, Stop Criteria = 0.01, Max Iteration Number = 1000
• Case 1. No Scaling
▪ Iteration Number: 1000, Train Set MSE: 11,124,663, Test Set MSE: 20,425,686
• Case 2. Scaling
▪ Iteration Number: 1000, Train Set MSE: 11,124,663, Test Set MSE: 20,425,686
• Case 3. Target Scaling
▪ Iteration Number: 6, Train Set MSE: , Test Set MSE:

• Overfitting
▪ Choosing the stop criterion too small trains the weights and biases until the training-data error becomes very small, while the validation-data error actually grows, so generalization performance drops.
• Stop Criteria: 0.01 /
• Test Set MSE: /
☆ Issues
1. How small should the stop criterion be?
2. How many hidden neurons (HDNEU) are appropriate?
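One common way to handle Issue 1 is early stopping: rather than fixing an MSE threshold in advance, training stops when the error on a held-out validation set stops improving. A minimal sketch; train_one_epoch and mse_on are passed in as placeholders for whichever BP / MOBP / CGBP implementation is being used:

```python
def train_with_early_stopping(net, train_one_epoch, mse_on,
                              train_set, val_set, max_iter=3000, patience=20):
    """Stop when the validation MSE has not improved for `patience` iterations.
    train_one_epoch(net, train_set) runs one training pass and returns the net;
    mse_on(net, val_set) returns the MSE on the held-out data."""
    best_val, best_net, wait = float("inf"), net, 0
    for _ in range(max_iter):
        net = train_one_epoch(net, train_set)
        val_mse = mse_on(net, val_set)
        if val_mse < best_val:
            best_val, best_net, wait = val_mse, net, 0
        else:
            wait += 1
            if wait >= patience:        # validation error stopped improving
                break
    return best_net
```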

• Machine Learning, Tom Mitchell, McGraw-Hill.
• Introduction to Machine Learning, Ethem Alpaydin, MIT Press.
• Neural Network Design, Martin T. Hagan, Howard B. Demuth, Mark Beale, PWS Publishing Company.
• Neural Networks and Learning Machines, Simon Haykin, Prentice Hall.