Web-Mining Agents Prof. Dr. Ralf Möller Universität zu Lübeck Institut für Informationssysteme Tanya Braun (Exercises)

Slides from: Doug Gray, David Poole

Classification: Artificial Neural Networks, SVMs. R. Moeller, Institute of Information Systems, University of Luebeck

Agenda
- Neural Networks
  - Single-layer networks (perceptrons)
    - Perceptron learning rule
    - Easy to train: fast convergence, little data required
    - Cannot learn "complex" functions
- Support Vector Machines
- Multi-layer networks
  - Backpropagation learning
  - Hard to train: slow convergence, much data required
- Deep Learning

XOR problem

Perceptron learning rule: w_i ← w_i + η · (t − o) · x_i, where η is the learning rate, t the target output, and o the perceptron's output. Convergence proof omitted since neural networks are not the focus of this lecture.
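To make the rule concrete, here is a minimal NumPy sketch (not from the original slides; the data, learning rate, and epoch count are illustrative). On linearly separable data such as AND the rule converges quickly, while on XOR (see the previous slide) it cannot, because no linear boundary exists.

```python
import numpy as np

def train_perceptron(X, t, eta=0.1, epochs=20):
    """Perceptron learning rule: w <- w + eta * (t - o) * x, with a bias input."""
    Xb = np.hstack([X, np.ones((len(X), 1))])    # append constant bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for x, target in zip(Xb, t):
            o = 1 if x @ w > 0 else 0            # threshold unit output
            w += eta * (target - o) * x          # only changes w when the output is wrong
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t_and = np.array([0, 0, 0, 1])                   # linearly separable: rule converges
t_xor = np.array([0, 1, 1, 0])                   # not linearly separable: rule cannot converge

for name, t in [("AND", t_and), ("XOR", t_xor)]:
    w = train_perceptron(X, t)
    preds = [1 if np.append(x, 1) @ w > 0 else 0 for x in X]
    print(name, "predictions:", preds, "targets:", list(t))
```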

Support Vector Machine Classifier
Basic idea: map the instances of the two classes into a space where they become linearly separable. The mapping is achieved implicitly via a kernel function; the resulting decision boundary is determined by the instances closest to the margin of separation (the support vectors). Parameter: kernel type.
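A small illustration of this idea (not part of the slides), assuming scikit-learn is available: a linear SVM fails on a "circle inside a circle" data set, while an RBF-kernel SVM separates it; the data set and parameters are chosen purely for demonstration.

```python
# Sketch: kernel SVM on a toy problem that is not linearly separable (assumes scikit-learn).
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear").fit(X_tr, y_tr)           # straight-line boundary only
rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)  # implicit nonlinear mapping

print("linear kernel accuracy:", linear_svm.score(X_te, y_te))  # close to chance
print("RBF kernel accuracy:", rbf_svm.score(X_te, y_te))        # close to 1.0
print("support vectors per class:", rbf_svm.n_support_)
```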

Nonlinear Separation (figure: examples of the two classes, labelled y = +1 and y = -1, which no straight line can separate)

Support Vectors (figure: the separator, the margin around it, and the support vectors lying on the margin)

Literature
- Mitchell (1997). Machine Learning.
- Duda, Hart, & Stork (2000). Pattern Classification.
- Hastie, Tibshirani, & Friedman (2001). The Elements of Statistical Learning.

Literature (cont.)
- Russell & Norvig (2004). Artificial Intelligence: A Modern Approach.
- Shawe-Taylor & Cristianini (2004). Kernel Methods for Pattern Analysis.

z = y1 AND NOT y2 = (x1 OR x2) AND NOT (x1 AND x2) = x1 XOR x2
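This construction can be wired up directly as a two-layer network of threshold units; the sketch below is not from the slides, and the particular weights and thresholds are one hand-picked choice among many.

```python
# Two-layer threshold network computing z = (x1 OR x2) AND NOT (x1 AND x2) = x1 XOR x2.
def step(s, threshold):
    return 1 if s >= threshold else 0

def xor_net(x1, x2):
    y1 = step(x1 + x2, 1)    # hidden unit 1: OR  (fires if at least one input is 1)
    y2 = step(x1 + x2, 2)    # hidden unit 2: AND (fires only if both inputs are 1)
    return step(y1 - y2, 1)  # output unit: y1 AND NOT y2

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_net(x1, x2))   # prints the XOR truth table
```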

(Figure: a single unit whose inputs are weighted by w1, w2, w3; the unit applies an activation function f(x) to their weighted sum.) David Corne: Open Courseware

(Figure: a worked numeric example of the same unit, computing x as the weighted sum w1·a1 + w2·a2 + w3·a3 of its inputs and then outputting f(x).) David Corne: Open Courseware

A dataset. (Figure: a table of training examples, each a row of numeric fields plus a class label.) David Corne: Open Courseware

Training the neural network. (Same training table as above.) David Corne: Open Courseware

Training data: initialise the network with random weights. David Corne: Open Courseware

Training data: present a training pattern. David Corne: Open Courseware

Training data: feed it through to get an output. David Corne: Open Courseware

Training data: compare with the target output (error 0.8). David Corne: Open Courseware

Training data: adjust the weights based on the error (error 0.8). David Corne: Open Courseware

Training data: present a training pattern. David Corne: Open Courseware

Training data: feed it through to get an output. David Corne: Open Courseware

Training data: compare with the target output (error -0.1). David Corne: Open Courseware

Training data: adjust the weights based on the error (error -0.1). David Corne: Open Courseware

Training data: and so on… Repeat this thousands, maybe millions of times, each time taking a random training instance and making slight weight adjustments. Algorithms for weight adjustment are designed to make changes that will reduce the error. David Corne: Open Courseware
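A minimal sketch of this loop (again not the lecture's code) for a single sigmoid unit: take a random training instance, feed it through, compare with the target, and nudge the weights so the error shrinks; the toy data and hyper-parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: four numeric fields per example and a 0/1 class label.
X = rng.random((100, 4))
t = (X[:, 0] + X[:, 1] > X[:, 2] + X[:, 3]).astype(float)

w = rng.normal(scale=0.1, size=4)   # initialise with random weights
b = 0.0
eta = 0.5                           # learning rate

for _ in range(20000):              # thousands of tiny adjustments
    i = rng.integers(len(X))        # take a random training instance
    o = 1.0 / (1.0 + np.exp(-(X[i] @ w + b)))   # feed it through to get an output
    error = t[i] - o                # compare with the target output
    # adjust the weights slightly in the direction that reduces the error
    w += eta * error * o * (1 - o) * X[i]
    b += eta * error * o * (1 - o)

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
print("training accuracy:", (preds == t).mean())
```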

The decision boundary perspective… Initial random weights David Corne: Open Courseware

The decision boundary perspective… Present a training instance / adjust the weights David Corne: Open Courseware

The decision boundary perspective… Present a training instance / adjust the weights David Corne: Open Courseware

The decision boundary perspective… Present a training instance / adjust the weights David Corne: Open Courseware

The decision boundary perspective… Present a training instance / adjust the weights David Corne: Open Courseware

The decision boundary perspective… Eventually …. David Corne: Open Courseware

The point I am trying to make: weight-learning algorithms for NNs are dumb. They work by making thousands and thousands of tiny adjustments, each making the network do better on the most recent pattern, but perhaps a little worse on many others. But, by dumb luck, this eventually tends to be good enough to learn effective classifiers for many real applications. David Corne: Open Courseware

Some other points: if f(x) is non-linear, a network with one hidden layer can, in theory, learn any classification problem perfectly. A set of weights exists that can produce the targets from the inputs; the problem is finding them. David Corne: Open Courseware
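For instance, a sketch assuming scikit-learn is installed: a single-hidden-layer MLP with a nonlinear activation fits a curved two-class boundary that no purely linear model can draw; the data set and settings are illustrative.

```python
# Sketch: one hidden layer with nonlinear activation vs. a purely linear classifier.
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=500, noise=0.1, random_state=0)

linear = LogisticRegression().fit(X, y)
mlp = MLPClassifier(hidden_layer_sizes=(10,), activation="tanh",
                    max_iter=5000, random_state=0).fit(X, y)

print("linear model accuracy:", linear.score(X, y))      # limited by a straight boundary
print("1-hidden-layer MLP accuracy:", mlp.score(X, y))   # can fit the curved boundary
```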

Some other ‘by the way’ points: if f(x) is linear, the NN can only draw straight decision boundaries (even if there are many layers of units). David Corne: Open Courseware

Some other ‘by the way’ points: NNs use nonlinear f(x) so they can draw complex boundaries, but keep the data unchanged. David Corne: Open Courseware

Some other ‘by the way’ points: NNs use nonlinear f(x) so they can draw complex boundaries, but keep the data unchanged. SVMs only draw straight lines, but they transform the data first in a way that makes that OK. David Corne: Open Courseware

Deep Learning, a.k.a. or related to: Deep Neural Networks, Deep Structural Learning, Deep Belief Networks, etc.

The new way to train multi-layer NNs… David Corne: Open Courseware

The new way to train multi-layer NNs… Train this layer first David Corne: Open Courseware

The new way to train multi-layer NNs… Train this layer first then this layer David Corne: Open Courseware

The new way to train multi-layer NNs… Train this layer first then this layer David Corne: Open Courseware

The new way to train multi-layer NNs… Train this layer first then this layer David Corne: Open Courseware

The new way to train multi-layer NNs… Train this layer first then this layer finally this layer David Corne: Open Courseware

The new way to train multi-layer NNs… EACH of the (non-output) layers is trained to be an auto-encoder. Basically, it is forced to learn good features that describe what comes from the previous layer. David Corne: Open Courseware

An auto-encoder is trained, with an absolutely standard weight-adjustment algorithm, to reproduce the input. David Corne: Open Courseware

An auto-encoder is trained, with an absolutely standard weight-adjustment algorithm, to reproduce the input. Making this happen with (many) fewer units than inputs forces the ‘hidden layer’ units to become good feature detectors. David Corne: Open Courseware
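A minimal NumPy sketch of that idea (not the slides' code): a single auto-encoder with a 3-unit bottleneck is trained by an ordinary gradient-descent weight adjustment to reproduce 8-dimensional inputs; the toy data are generated from 3 hidden factors so the bottleneck has something to find.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy data with hidden low-dimensional structure: 8 visible features driven by 3 factors.
factors = rng.random((500, 3))
X = sigmoid(factors @ rng.normal(size=(3, 8)))

n_in, n_hidden = 8, 3                 # bottleneck: fewer hidden units than inputs
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_in)); b2 = np.zeros(n_in)

eta = 1.0
for epoch in range(3000):
    H = sigmoid(X @ W1 + b1)          # encode: the hidden "feature" layer
    Y = sigmoid(H @ W2 + b2)          # decode: reconstruction of the input
    dY = (Y - X) * Y * (1 - Y)        # gradient of the squared reconstruction error
    dH = (dY @ W2.T) * H * (1 - H)
    W2 -= eta * H.T @ dY / len(X); b2 -= eta * dY.mean(axis=0)
    W1 -= eta * X.T @ dH / len(X); b1 -= eta * dH.mean(axis=0)

print("mean absolute reconstruction error:", np.abs(Y - X).mean())
```

In the layer-wise scheme of the previous slides, the hidden activations H would then serve as the input for training the next layer.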

Intermediate layers are each trained to be auto-encoders (or similar). David Corne: Open Courseware

The final layer is trained to predict the class based on the outputs from the previous layers. David Corne: Open Courseware
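To make the whole recipe concrete, here is a hedged sketch of greedy layer-wise pre-training using scikit-learn pieces (one possible realisation, not the lecture's code): each MLPRegressor is trained to reproduce its own input, its hidden activations become the next layer's input, and a simple classifier is trained last on the top-level features; fine-tuning of the whole stack is omitted.

```python
# Sketch of greedy layer-wise pre-training (assumes scikit-learn; no fine-tuning step).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPRegressor

X, y = make_classification(n_samples=1000, n_features=20, n_informative=5, random_state=0)

def pretrain_layer(inputs, n_hidden):
    """Train one auto-encoder layer (target = its own input) and return its hidden code."""
    ae = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="relu",
                      max_iter=2000, random_state=0).fit(inputs, inputs)
    # Hidden activations = relu(inputs @ W1 + b1), read off the fitted weights.
    return np.maximum(0.0, inputs @ ae.coefs_[0] + ae.intercepts_[0])

H1 = pretrain_layer(X, 10)     # train this layer first ...
H2 = pretrain_layer(H1, 5)     # ... then this layer ...
clf = LogisticRegression(max_iter=1000).fit(H2, y)   # ... finally the output layer
print("training accuracy on pre-trained features:", clf.score(H2, y))
```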