Data Mining (Student Presentation) Samira Roshan, Asma Akbari. Mehr 87-88
There is often information hidden in the data that is not readily evident. Human analysts may take weeks to discover useful information. Much of the data is never analyzed at all.
[Figure: The Data Gap. Total new disk (TB) since 1995 versus number of analysts.]
Data is collected and stored at enormous speeds (GB/hour). Traditional techniques are infeasible for raw data. Data mining may help scientists.
Database -> Target Data -> Transformed Data -> Patterns and Rules
Classification, Regression, Collaborative Filtering, Clustering, Association rules, Deviation detection
Classifier: decision rules (Salary > 5 L, Prof. = Exec) applied to new applicants' data. Many approaches: Statistics, Decision Trees, Neural Networks, ...
Unsupervised learning is used when old data with class labels is not available, e.g. when introducing a new product.
Given a set T of groups of items. Example: set of item sets purchased: {Milk, Cereal, Rice}, {Tea, Rice, Bread}, {Chips, Bread, Cheese}, ...
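A minimal sketch of the first step behind association rules: counting how often items and item pairs occur together across the purchased item sets. The grouping of the example into three baskets is an assumption (the slide text is flattened), and the helper name support_counts is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Item sets from the slide's example. Splitting the flattened text into
# these three baskets is an assumption, not something the slide confirms.
transactions = [
    {"Milk", "Cereal", "Rice"},
    {"Tea", "Rice", "Bread"},
    {"Chips", "Bread", "Cheese"},
]


def support_counts(transactions, size):
    """Count how many baskets contain each combination of `size` items."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each combination always appears in the same order.
        for combo in combinations(sorted(basket), size):
            counts[combo] += 1
    return counts


single = support_counts(transactions, 1)
pairs = support_counts(transactions, 2)

for combo, count in single.most_common():
    print(combo, count)
```

Items or pairs with a high count are the candidates from which rules of the form "customers who buy X also buy Y" would be derived.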
The use of data, particularly about people, for data mining has serious ethical implications. When applied to people, data mining may be used to discriminate.
Data mining (or simple analysis) on people may come up with a profile that would raise controversial issues of:
– Discrimination
– Privacy
– Security
Examples:
– Should males between 18 and 35 from countries that produced terrorists be singled out for search before flight?
– Can people be denied a mortgage based on age, sex, race?
– Women live longer. Should they pay less for life insurance?
Instances: the individual, independent examples of a concept.
Attributes: measuring aspects of an instance. We will focus on nominal and numeric ones.
Attributes: number of nuclei (values: 1, 2); number of tails (values: 1, 2); color (values: light, dark); wall (values: thin, thick).
Classes: Lethargia, Burpoma, Healthy.
# Color      Light  Dark
Lethargia    3      2
Burpoma      1      2
Healthy      2      2

# Tails      1      2
Lethargia    5      0
Burpoma      0      3
Healthy      2      2

# Nucleus    1      2
Lethargia    4      1
Burpoma      0      3
Healthy      2      2

# Membrane   Thin   Thick
Lethargia    3      2
Burpoma      2      1
Healthy      3      1
Tails
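One way to see why Tails is a reasonable choice for the first split is to compare the attributes by the weighted entropy of the subsets they produce. The slides do not state which selection criterion was used, so entropy (equivalently, information gain) is an assumption here; the counts are copied from the count tables, and the function names are hypothetical.

```python
from math import log2

# Class counts per attribute value, taken from the count tables.
# Each list is [Lethargia, Burpoma, Healthy].
counts = {
    "Color":    {"light": [3, 1, 2], "dark": [2, 2, 2]},
    "Tails":    {"1": [5, 0, 2], "2": [0, 3, 2]},
    "Nucleus":  {"1": [4, 0, 2], "2": [1, 3, 2]},
    "Membrane": {"thin": [3, 2, 3], "thick": [2, 1, 1]},
}


def entropy(class_counts):
    """Entropy in bits of a class distribution given as raw counts."""
    total = sum(class_counts)
    return -sum(c / total * log2(c / total) for c in class_counts if c > 0)


def split_entropy(value_table):
    """Weighted average entropy of the subsets created by one attribute."""
    n = sum(sum(row) for row in value_table.values())
    return sum(sum(row) / n * entropy(row) for row in value_table.values())


# Lower weighted entropy means purer subsets, i.e. higher information gain.
scores = {attr: split_entropy(table) for attr, table in counts.items()}
best = min(scores, key=scores.get)
print(best)  # Tails
```

Under this criterion Tails gives the purest subsets, followed by Nucleus; Color and Membrane separate the classes much less.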
Counts within the # Tails = 1 branch:

# Color      Light  Dark
Lethargia    3      2
Burpoma      0      0
Healthy      0      2

# Nucleus    1      2
Lethargia    4      1
Burpoma      0      0
Healthy      0      2

# Membrane   Thin   Thick
Lethargia    3      2
Burpoma      0      0
Healthy      0      2
Tails
  1: Nucleus
       1: Lethargia
Tails
  1: Nucleus
       1: Lethargia
       2: Color
            light: Lethargia
            dark: Healthy
  2: Nucleus
       1: Healthy
       2: Burpoma
If # Tails = 1 then
    If # Nucleus = 1 then class = Lethargia
    else
        If color = light then class = Lethargia
        else class = Healthy
else
    If # Nucleus = 1 then class = Healthy
    else class = Burpoma
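The rule set above translates almost directly into a small function. This is only a sketch: the function name classify and the parameter names are hypothetical, and the attribute values are encoded as in the attribute list (1 or 2 for tails and nuclei, "light" or "dark" for color).

```python
def classify(tails, nuclei, color):
    """Return the class predicted by the decision tree's rules."""
    if tails == 1:
        if nuclei == 1:
            return "Lethargia"
        # One tail, two nuclei: color decides.
        if color == "light":
            return "Lethargia"
        return "Healthy"
    # Two tails: only the number of nuclei matters.
    if nuclei == 1:
        return "Healthy"
    return "Burpoma"


print(classify(tails=1, nuclei=2, color="dark"))   # Healthy
print(classify(tails=2, nuclei=2, color="light"))  # Burpoma
```

Note that the wall (membrane) attribute never appears in the rules, so it is not needed as an input.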