Nima Hazar Amin Dehesh.  Several induction algorithms have been developed that vary in the methods employed to build the decision tree or set of rules.


1 Nima Hazar Amin Dehesh

2  Several induction algorithms have been developed that vary in the methods employed to build the decision tree or set of rules.  ID3 is a general-purpose rule induction algorithm developed by Quinlan (1979), and today it is used in most expert system shells.

3  Predicting weather: temperature, wind direction, condition of sky, barometric pressure.

4  Choose most important issue first.  No-data result.  Excludes irrelevant factors.

5

6  The central choice in ID3 is selecting which attribute to test at each node in the tree.  Define a statistical property, called information gain, that measures how well a given attribute separates the training examples according to their target classification.

7  Entropy characterizes the (im)purity of an arbitrary collection of examples.  Suppose S is a collection of 14 examples: 9 positive and 5 negative ([9+,5-]).
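
The entropy of this collection can be worked out by hand. The formula itself is not preserved in this transcript, so the lines below assume the standard two-class definition with base-2 logarithms, where $p_{+}$ and $p_{-}$ are the proportions of positive and negative examples in S:

```latex
\begin{aligned}
\mathrm{Entropy}(S) &= -p_{+}\log_2 p_{+} \;-\; p_{-}\log_2 p_{-} \\
\mathrm{Entropy}([9+,5-]) &= -\tfrac{9}{14}\log_2\tfrac{9}{14} \;-\; \tfrac{5}{14}\log_2\tfrac{5}{14} \;\approx\; 0.940
\end{aligned}
```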

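8  0log0 = 0  Entropy is 0 if all members of S belong to the same class.  Entropy is 1 when the collection contains an equal number of positive and negative examples.  If the target attribute can take on C different values: $\mathrm{Entropy}(S) = \sum_{i=1}^{C} -p_i \log_2 p_i$, where $p_i$ is the proportion of S belonging to class i.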
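
These properties are easy to check numerically. The following is a minimal sketch, assuming base-2 logarithms and class labels given as a plain Python list; the function name `entropy` is chosen here for illustration and does not come from the slides.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy (in bits) of a collection of class labels.

    Sums -p_i * log2(p_i) over the classes that actually occur, so a class
    with no members contributes nothing (the 0 log 0 = 0 convention).
    Works for any number of classes, not only two.
    """
    total = len(labels)
    counts = Counter(labels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

all_same = entropy(["+"] * 14)                 # every example in one class
even_split = entropy(["+"] * 7 + ["-"] * 7)    # equal positives and negatives
nine_five = entropy(["+"] * 9 + ["-"] * 5)     # the [9+,5-] collection

print(all_same, even_split, round(nine_five, 3))
```

Counting only the classes that occur avoids ever evaluating log2(0), which is how the sketch honors the 0log0 = 0 convention.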

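9  The information gain Gain(S, A) of an attribute A, relative to a collection of examples S, is defined as: $\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$, where Values(A) is the set of all possible values for attribute A, and $S_v$ is the subset of S for which attribute A has value v.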

10  An example :

11  Playing Tennis:

12  The best is “outlook”
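
The example table from the slide is not preserved in this transcript. The sketch below assumes it is the widely used 14-example PlayTennis table from Mitchell's Machine Learning textbook, which matches the [9+,5-] collection mentioned earlier; the data rows and the names `ROWS`, `ATTRIBUTES`, `entropy`, and `gain` are therefore assumptions made for illustration. It computes the information gain of each attribute and picks the largest.

```python
import math
from collections import Counter

# ASSUMED data: the standard PlayTennis table (not taken from the slides).
# Columns: Outlook, Temperature, Humidity, Wind, PlayTennis.
ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Wind"]
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())

def gain(rows, attr_index):
    """Information gain from splitting rows on the attribute at attr_index."""
    labels = [row[-1] for row in rows]
    # Group the target labels by the value of the chosen attribute.
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attr_index], []).append(row[-1])
    # Weighted average entropy of the subsets after the split.
    remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

gains = {name: gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)

for name, value in gains.items():
    print(f"{name}: {value:.3f}")
print("best attribute:", best)
```

Under this assumed table, Outlook has the highest gain, which agrees with the slide's conclusion.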

13

14  This process continues for each new leaf until either of two conditions is met: 1. Every attribute has already been included along this path through the tree. 2. The training examples associated with this leaf node all have the same target attribute value.
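
The recursion and its two stopping conditions can be sketched as follows. This is an illustrative outline rather than Quinlan's original implementation: the toy examples are invented, the names `id3`, `gain`, and `entropy` are chosen here, and returning the majority label when the attributes run out is an assumption the slides do not state.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())

def gain(rows, attr, target):
    """Information gain from splitting rows (a list of dicts) on attr."""
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder

def id3(rows, attributes, target):
    """Return a leaf label, or a nested dict {attribute: {value: subtree}}."""
    labels = [r[target] for r in rows]
    # Stop: all examples at this node share the same target value.
    if len(set(labels)) == 1:
        return labels[0]
    # Stop: every attribute has been used on this path.
    # ASSUMPTION: fall back to the most common label.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Otherwise test the attribute with the highest information gain.
    best = max(attributes, key=lambda a: gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree

# Hypothetical toy examples, invented purely for illustration.
examples = [
    {"sky": "clear", "wind": "calm", "storm": "no"},
    {"sky": "clear", "wind": "gusty", "storm": "no"},
    {"sky": "clear", "wind": "gusty", "storm": "no"},
    {"sky": "cloudy", "wind": "calm", "storm": "no"},
    {"sky": "cloudy", "wind": "gusty", "storm": "yes"},
]
tree = id3(examples, ["sky", "wind"], "storm")
print(tree)
```

Representing the tree as nested dictionaries keeps the sketch short; each path from the root to a leaf reads directly as one induced rule.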

15

16  Determine objective  Determine decision factors: attribute nodes of the decision tree  Determine decision factor values : attribute values of the decision tree  Determine solution : list of final decisions that the system can make

17  Form example set: problem knowledge  Create decision tree  Test the system  Revise the system

18 Predict the outcome of a football game.  Objective: a football game prediction system that can predict if our team will win or lose its next game  Decision factors : location of the game, weather, our own team’s record, and the opponent’s record

19  Decision factor values :  Solution: a simple binary decision

20  Examples: go back over the first 8 games.

21  Decision Tree : ID3 Algorithm

22  Testing : Next 8 games

23  Revising: after discussion, the team’s health is added as another decision factor, with the values {poor, average, good}.

24

25  Attempts to determine the influence of various system elements on the system’s behavior.  Some shells, like 1STCLASS, permit you to deactivate decision factors or examples without removing them.  Advantages: ◦ Impact of decision factors. ◦ Examples from different sources.

26

27  Discovers rules from examples.  Avoids knowledge elicitation problem.  Can produce new knowledge.  Can uncover critical decision factors.  Can eliminate irrelevant decision factors.  Can uncover contradictions.

28  Example: Pump diagnosis system:  Sometimes contradiction is acceptable.  Indicates a problem: ◦ A bad example. ◦ Inadequate decision factors or values.

29  Often difficult to choose good decision factors.  Difficult to understand rules.  Applicable only for classification problems.

30  AQ11  WILLARD  Transformer Fault Diagnosis  Customer Support  VAX-VMS Operating System Tuning  Predicting Stock Market Behavior

31  Developed in 1980 for diagnosing soybean diseases.  Capable of identifying 15 diseases.  630 examples of diseased soybean plants.  Uses 35 decision factors.  A special example selection program, ESEL, was used to select 290 considerably different examples.

32  AQ11 formulated a set of rules for classifying a new example into one of the 15 categories.  They also developed a rule-based system from the knowledge of a plant pathologist.  The remaining 340 examples were used to compare the two.  The rule-based system scored 71.8%.  AQ11 scored 97.6%.

33  To aid in forecasting thunderstorms in the United States (NSSFC, 1984).  Uses 140 examples of thunderstorm weather data.  30 modules, each with a single decision tree.  Characterizes the chance of a thunderstorm as none, approaching, slight, moderate, or high, each with a probability range.

34  Was developed using the RULEMASTER inductive tool.  Was tested for one week in the spring of 1984, in Texas, Oklahoma, and Colorado.  5 thunderstorms passed through this region.  WILLARD’s forecasts were found to compare favorably with those made by an expert meteorologist.

35  Designed to detect early signs of transformer faults in order to avoid potential failure.  An expert system for evaluating the gas-in-oil test results.  Built from knowledge contained in past examples, using the RULEMASTER rule induction tool.  Contains 27 modules, each with its own induced rule.  Tested using data from 900 tests.  90% agreed with the expert’s analysis.

36  SCREENIO, a software product by NORCOM, allows the user to design IBM PC screens for Realia COBOL programs.  NORCOM developed an expert system to aid their support personnel.  Based on 9 months of data on typical customer problems.  Uses 9 decision factors and was developed in 1 day using 1STCLASS.

37  Is used by support personnel who ask the customer for values for each of the decision factors.  Made a major improvement in customer support responsiveness and efficiency.

38  An expert system to help tune the VAX-VMS operating system (1984).  Tuning is a complex and dynamic task. Over 150 parameters must be set by the system manager.  Adjustments and modifications are required in response to changes in configuration and loading.  The developed system collects data on present system performance and generates a summary report.

39  Interacts with the system manager by asking questions that lead to a recommended action.  Recommendations: ◦ Adjusting system parameters ◦ Adjusting user authorization values ◦ Redistributing or reducing user demand ◦ Purchasing new hardware ◦ …

40  Was developed using the induction tool TIMM.  The effectiveness is measured by comparing the performance before and after the recommended actions are taken.  Has made the management of the operating system a more efficient task.

41  Predicting stock market behavior is a difficult challenge. Analysts use historical data analysis techniques, which carry a degree of uncertainty.  Braun developed an expert system based on ID3 to improve the reliability of stock market prediction (1987).  20 decision factors; values were determined over the period from March 1981 to April 1983 from the Wall Street Journal.

42  3 different results: ◦ Bullish: upward trend ◦ Bearish: downward trend ◦ Neutral: either call is too risky  The system predicted correctly 64.4% of the time.  The expert analyst predicted correctly 60.2% of the time.  The analyst was impressed with the general structure of the decision tree and with the system’s predictions.

