Comparing between machine learning methods for a remote monitoring system. Ronit Zrahia Final Project Tel-Aviv University.

Presentation transcript:

1 Comparing between machine learning methods for a remote monitoring system. Ronit Zrahia Final Project Tel-Aviv University

2 Overview
- The remote monitoring system
- The project database
- Machine learning methods:
  - Discovery of Association Rules
  - Inductive Logic Programming
  - Decision Trees
- Applying the methods to the project database and comparing the results

3 Remote Monitoring System - Description
- The Support Center has ongoing information on the customer's equipment
- The Support Center can, in some situations, know that a customer is about to run into trouble
- The Support Center initiates a call to the customer
- A specialist connects to the site remotely and tries to eliminate the problem before it has an impact

4 Remote Monitoring System - Description
[Diagram: Products (AIX/NT) at the customer site connect over TCP/IP (FTP) to a Gateway, which connects by modem over TCP/IP (Mail/FTP) to the Support Server (AIX/NT/95)]

5 Remote Monitoring System - Technique
- One of the machines on site, the Gateway, is able to initiate a PPP connection to the support server or to an ISP
- All the Products on site have a TCP/IP connection to the Gateway
- Background tasks on each Product collect relevant information
- The data collected from all Products is transferred to the Gateway via FTP
- The Gateway automatically dials the support server or ISP and sends the data to the subsidiary
- The received data is then imported into a database

6 Project Database
- 12 columns, 300 records
- Each record includes failure information for one product at a specific customer site
- The columns are: record no., date, IP address, operating system, customer ID, product, release, product ID, category of application, application, severity, and type of service contract

7 Project Goals
- Discover valuable information in the database
- Improve the company's product marketing and customer support
- Learn different learning methods and apply them to the project database
- Compare the different methods, based on the results

8 The Learning Methods
- Discovery of Association Rules
- Inductive Logic Programming
- Decision Trees

9 Discovery of Association Rules - Goals
- Finding relations between products bought by the customers
  - Impacts product marketing
- Finding relations between failures in a specific product
  - Impacts customer support (failures can be predicted and handled before they have an effect)

10 Discovery of Association Rules - Definition
- A technique developed specifically for data mining
  - Given: a dataset of customer transactions, where a transaction is a collection of items
  - Find: correlations between items, expressed as rules
- Example: supermarket baskets

11 Determining Interesting Association Rules
- Rules have confidence and support
  - IF x and y THEN z with confidence c: if x and y are in the basket, then z is too in c% of cases
  - IF x and y THEN z with support s: the rule holds in s% of all transactions

12 Discovery of Association Rules - Example
Input parameters: confidence = 50%; support = 50%

Transaction  Items
12345        A B C
12346        A C
12347        A D
12348        B E F

If A then C: c = 66.6%, s = 50%
If C then A: c = 100%, s = 50%
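The support and confidence figures above can be checked directly; a minimal sketch in Python over the slide's four transactions (function names are illustrative):

```python
# The four transactions from the example, as sets of items.
transactions = [
    {"A", "B", "C"},  # 12345
    {"A", "C"},       # 12346
    {"A", "D"},       # 12347
    {"B", "E", "F"},  # 12348
]

def support(itemset):
    """Fraction of transactions containing every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """s(lhs ∪ rhs) / s(lhs): how often rhs appears when lhs does."""
    return support(lhs | rhs) / support(lhs)
```

Here confidence({A} => {C}) = s(A, C) / s(A) = 50% / 75% ≈ 66.6%, matching the slide.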

13 Itemsets are the Basis of the Algorithm
Rule A => C:
s = s(A, C) = 50%
c = s(A, C) / s(A) = 66.6%

Itemset  Support
A        75%
B        50%
C        50%
A, C     50%

14 Algorithm Outline
- Find all large itemsets
  - Sets of items with at least minimum support
  - Apriori algorithm
- Generate rules from the large itemsets
  - For ABCD and AB in the large itemsets, the rule AB => CD holds if the ratio s(ABCD)/s(AB) is large enough
  - This ratio is the confidence of the rule

15 Pseudo Algorithm
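The pseudo-code on this slide appeared as an image and is not captured in the transcript. A minimal sketch of the itemset phase outlined on the previous slide (Apriori: grow candidate itemsets level by level, keeping only those with at least minimum support), with illustrative names:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise search for all itemsets with support >= min_support."""
    n = len(transactions)
    def sup(itemset):
        return sum(itemset <= t for t in transactions) / n
    items = {i for t in transactions for i in t}
    # L1: frequent 1-itemsets
    level = {frozenset([i]) for i in items if sup(frozenset([i])) >= min_support}
    frequent = []
    k = 1
    while level:
        frequent.extend(level)
        # Join step: combine frequent k-itemsets into (k+1)-candidates,
        # then prune candidates having an infrequent k-subset
        # (the Apriori property: subsets of frequent sets are frequent).
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k))}
        level = {c for c in candidates if sup(c) >= min_support}
        k += 1
    return frequent
```

On the example transactions with support = 50%, this yields the itemsets A, B, C and {A, C}; rules are then generated from these itemsets via the confidence ratio.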

16 Relations Between Products

17 Relations Between Failures

Item Set (L)  Association Rule  Confidence (CF)
4-6           4 => 6            14 / 16 = 0.875
4-6           6 => 4            14 / 15 = 0.93
5-10          5 => 10           15 / 18 = 0.83
5-10          10 => 5           15 / 15 = 1

18 Inductive Logic Programming - Goals
- Finding the preferred customers, based on:
  - The number of products bought by the customer
  - The types of failures (i.e., severity levels) that occurred in the products

19 Inductive Logic Programming - Definition
- Inductive construction of first-order clausal theories from examples and background knowledge
- The aim is to discover, from a given set of pre-classified examples, a set of classification rules with high predictive power
- Example:
  - IF Outlook=Sunny AND Humidity=High THEN PlayTennis=No

20 Horn Clause Induction
Given:
- P: ground facts to be entailed (positive examples)
- N: ground facts not to be entailed (negative examples)
- B: a set of predicate definitions (background theory)
- L: the hypothesis language
Find a predicate definition (hypothesis) H in L such that:
1. for every p in P: B ∧ H ⊨ p (completeness)
2. for every n in N: B ∧ H ⊭ n (consistency)
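At a propositional level the two conditions reduce to a simple check; a sketch in which the hypothesis (together with the background theory) is folded into one Python predicate — an illustrative simplification, since real ILP works with first-order clauses and logical entailment:

```python
def complete_and_consistent(hypothesis, positives, negatives):
    """hypothesis: predicate standing in for 'B and H entail this example'."""
    # Completeness: every positive example is entailed.
    covers_all_pos = all(hypothesis(p) for p in positives)
    # Consistency: no negative example is entailed.
    covers_no_neg = not any(hypothesis(n) for n in negatives)
    return covers_all_pos and covers_no_neg
```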

21 Inductive Logic Programming - Example
- Learning the relationships between people in a family circle

22 Algorithm Outline
- A space of candidate solutions and an acceptance criterion characterize solutions to an ILP problem
- The search space is typically structured by the dual notions of generalization (induction) and specialization (deduction)
  - A deductive inference rule maps a conjunction of clauses G onto a conjunction of clauses S such that G is more general than S
  - An inductive inference rule maps a conjunction of clauses S onto a conjunction of clauses G such that G is more general than S
- Pruning principle:
  - When B ∧ H does not entail a positive example, specializations of H can be pruned from the search (they will not entail it either)
  - When B ∧ H entails a negative example, generalizations of H can be pruned from the search (they entail it as well)

23 Pseudo Algorithm
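This pseudo-code was likewise an image in the original deck. A minimal top-down (specialize-and-cover) sketch consistent with the outline on the previous slide, over attribute-value examples; the dict-shaped examples and (name, predicate) literals are illustrative assumptions, not the project's actual representation:

```python
def learn_rules(positives, negatives, literals):
    """Greedy covering: specialize a rule until it excludes all negatives,
    then drop the positives it covers and repeat."""
    rules, uncovered = [], list(positives)
    while uncovered:
        body, remaining_neg = [], list(negatives)
        while remaining_neg:
            unused = [l for l in literals if l not in body]
            if not unused:
                break  # cannot specialize further
            # Pick the literal that excludes the most remaining negatives.
            best = max(unused,
                       key=lambda lit: sum(not lit[1](n) for n in remaining_neg))
            body.append(best)
            remaining_neg = [n for n in remaining_neg
                             if all(lit[1](n) for lit in body)]
        rules.append([name for name, _ in body])
        uncovered = [p for p in uncovered if not all(lit[1](p) for lit in body)]
    return rules
```

With examples shaped like the project's customer records, this recovers a conjunctive rule of the kind shown on slide 24.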

24 The Preferred Customers
If (Total_Products_Types(Customer) > 5) and (All_Severity(Customer) < 3) then Preferred_Customer
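Read as code, the learned rule is a single conjunction; a sketch in which the two arguments stand in for the deck's Total_Products_Types and All_Severity values (the parameter names are illustrative):

```python
def preferred_customer(total_product_types, worst_severity):
    # The rule on this slide: more than 5 product types bought,
    # and every failure severity below 3.
    return total_product_types > 5 and worst_severity < 3
```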

25 Decision Trees - Goals
- Finding the preferred customers
- Finding relations between products bought by the customers
- Finding relations between failures in a specific product
- Comparing the Decision Tree results to the results of the previous algorithms

26 Decision Trees - Definition
- Decision tree representation:
  - Each internal node tests an attribute
  - Each branch corresponds to an attribute value
  - Each leaf node assigns a classification
- Occam's razor: prefer the shortest hypothesis that fits the data
- Examples:
  - Equipment or medical diagnosis
  - Credit risk analysis

27 Algorithm Outline
- A <- the "best" decision attribute for the next node
- Assign A as the decision attribute for the node
- For each value of A, create a new descendant of the node
- Sort the training examples to the leaf nodes
- If the training examples are perfectly classified, then STOP; else iterate over the new leaf nodes

28 Pseudo algorithm
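The pseudo-code here was an image in the original deck. A compact ID3-style sketch in Python, run on the PlayTennis table from slide 31 (the dict-per-row representation is an illustrative choice):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def id3(rows, labels, attrs):
    """Return a leaf label, or (best_attribute, {value: subtree})."""
    if len(set(labels)) == 1:
        return labels[0]                          # perfectly classified
    if not attrs:
        return Counter(labels).most_common(1)[0][0]
    def info_gain(a):
        g = entropy(labels)
        for v in {r[a] for r in rows}:
            sub = [l for r, l in zip(rows, labels) if r[a] == v]
            g -= len(sub) / len(labels) * entropy(sub)
        return g
    best = max(attrs, key=info_gain)              # the "best" decision attribute
    branches = {}
    for v in {r[best] for r in rows}:
        keep = [i for i, r in enumerate(rows) if r[best] == v]
        branches[v] = id3([rows[i] for i in keep],
                          [labels[i] for i in keep],
                          [a for a in attrs if a != best])
    return (best, branches)

# PlayTennis training data (slide 31)
COLS = ["Outlook", "Temperature", "Humidity", "Wind"]
DATA = [
    ("sunny", "hot", "high", "weak", "No"),
    ("sunny", "hot", "high", "strong", "No"),
    ("overcast", "hot", "high", "weak", "Yes"),
    ("rain", "mild", "high", "weak", "Yes"),
    ("rain", "cool", "normal", "weak", "Yes"),
    ("rain", "cool", "normal", "strong", "No"),
    ("overcast", "cool", "normal", "strong", "Yes"),
    ("sunny", "mild", "high", "weak", "No"),
    ("sunny", "cool", "normal", "weak", "Yes"),
    ("rain", "mild", "normal", "weak", "Yes"),
    ("sunny", "mild", "normal", "strong", "Yes"),
    ("overcast", "mild", "high", "strong", "Yes"),
    ("overcast", "hot", "normal", "weak", "Yes"),
    ("rain", "mild", "high", "strong", "No"),
]
rows = [dict(zip(COLS, d[:4])) for d in DATA]
labels = [d[4] for d in DATA]
tree = id3(rows, labels, COLS)
```

On this data the root comes out as Outlook, with overcast a pure Yes leaf, matching the worked example on slides 32-34.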

29 Information Measure
- Entropy measures the impurity of the sample of training examples S:
  Entropy(S) = -sum_{i=1..c} p_i * log2(p_i)
  - p_i is the probability of making a particular decision
  - There are c possible decisions
- The entropy is the amount of information needed to identify the class of an object in S
  - Maximized when all p_i are equal
  - Minimized (0) when all but one p_i are 0 (the remaining one is 1)

30 Information Measure
- Estimate the gain in information from a particular partitioning of the dataset
- Gain(S, A) = expected reduction in entropy due to sorting on attribute A:
  Gain(S, A) = Entropy(S) - sum_{v in Values(A)} (|S_v| / |S|) * Entropy(S_v)
- The gain criterion can then be used to select the partition which maximizes information gain
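The two formulas can be checked against the worked numbers on the next slides; a small sketch with count-based helper functions (names are illustrative):

```python
import math

def entropy(counts):
    """Entropy of a sample from its class counts, e.g. [9, 5] for [9+, 5-]."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c > 0)

def gain(parent_counts, child_counts):
    """Expected reduction in entropy from partitioning the parent sample."""
    n = sum(parent_counts)
    return entropy(parent_counts) - sum(
        sum(ch) / n * entropy(ch) for ch in child_counts)
```

entropy([9, 5]) ≈ 0.940, gain([9, 5], [[6, 2], [3, 3]]) ≈ 0.048 for Wind, and gain([9, 5], [[3, 4], [6, 1]]) ≈ 0.151 for Humidity, matching the example.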

31 Decision Tree - Example

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   sunny     hot          high      weak    No
D2   sunny     hot          high      strong  No
D3   overcast  hot          high      weak    Yes
D4   rain      mild         high      weak    Yes
D5   rain      cool         normal    weak    Yes
D6   rain      cool         normal    strong  No
D7   overcast  cool         normal    strong  Yes
D8   sunny     mild         high      weak    No
D9   sunny     cool         normal    weak    Yes
D10  rain      mild         normal    weak    Yes
D11  sunny     mild         normal    strong  Yes
D12  overcast  mild         high      strong  Yes
D13  overcast  hot          normal    weak    Yes
D14  rain      mild         high      strong  No

32 Decision Tree - Example (Continued)
Which attribute is the best classifier?

S: [9+, 5-], E = 0.940

Humidity: high -> [3+, 4-], E = 0.985; normal -> [6+, 1-], E = 0.592
Gain(S, Humidity) = 0.940 - (7/14) * 0.985 - (7/14) * 0.592 = 0.151

Wind: weak -> [6+, 2-], E = 0.811; strong -> [3+, 3-], E = 1.00
Gain(S, Wind) = 0.940 - (8/14) * 0.811 - (6/14) * 1.0 = 0.048

Gain(S, Outlook) = 0.246
Gain(S, Temperature) = 0.029

33 Decision Tree - Example (Continued)
Outlook is chosen as the root over {D1, D2, ..., D14} [9+, 5-]:
sunny -> {D1, D2, D8, D9, D11} [2+, 3-]; overcast -> {D3, D7, D12, D13} [4+, 0-] = Yes; rain -> {D4, D5, D6, D10, D14} [3+, 2-]

Which attribute should split the sunny branch?
S_sunny = {D1, D2, D8, D9, D11}
Gain(S_sunny, Humidity) = 0.970 - (3/5) * 0.0 - (2/5) * 0.0 = 0.970
Gain(S_sunny, Temperature) = 0.970 - (2/5) * 0.0 - (2/5) * 1.0 - (1/5) * 0.0 = 0.570
Gain(S_sunny, Wind) = 0.970 - (2/5) * 1.0 - (3/5) * 0.918 = 0.019

34 Decision Tree - Example (Continued)
The final tree: Outlook at the root; sunny -> Humidity (high = No, normal = Yes); overcast -> Yes; rain -> Wind (strong = No, weak = Yes)

35 Overfitting
- A tree that fits the training data too closely may not be generally applicable; this is called overfitting
- How can we avoid overfitting?
  - Stop growing the tree when a data split is not statistically significant
  - Grow the full tree, then post-prune
- The post-pruning approach is more common
- How to select the "best" tree:
  - Measure performance over the training data
  - Measure performance over a separate validation data set

36 Reduced-Error Pruning
- Split the data into a training set and a validation set
- Do until further pruning is harmful:
  1. Evaluate the impact on the validation set of pruning each possible node (plus those below it)
  2. Greedily remove the node whose removal most improves validation set accuracy
- Produces the smallest version of the most accurate subtree

37 The Preferred Customer
Target attribute is TypeOfServiceContract

[Tree: the root splits on NoOfProducts at 2.5; the NoOfProducts < 2.5 branch is a pure leaf (NO: 7, YES: 0), and the >= 2.5 branch splits on MaxSev at 4.5, with leaf counts NO: 0, YES: 3 and NO: 3, YES: 8]

38 Relations Between Products
Target attribute is Product3

[Tree over Product2, Product9 and Product6 with 0/1 branches; leaf counts: NO: 0, YES: 1; NO: 4, YES: 0; NO: 0, YES: 15; NO: 0, YES: 1]

39 Relations Between Failures
Target attribute is Application5

[Tree over Application2, Application8 and Application10 with 0/1 branches; leaf counts: NO: 5, YES: 1; NO: 1, YES: 0; NO: 0, YES: 11; NO: 2, YES: 2]

