Presentation is loading. Please wait.

Presentation is loading. Please wait.

Page Number: 1 Prof. Dr. Veljko Milutinovic: Coarchitect of the World's first 200MHz RISC microprocessor,

Similar presentations


Presentation on theme: "Page Number: 1 Prof. Dr. Veljko Milutinovic: Coarchitect of the World's first 200MHz RISC microprocessor,"— Presentation transcript:

1 Page Number: 1 vm@etf.bg.ac.yu http://galeb.etf.bg.ac.yu/~vm Prof. Dr. Veljko Milutinovic: Coarchitect of the World's first 200MHz RISC microprocessor, for DARPA, about a decade before Intel. Responsible for several successful datamining-oriented e-business on the Internet products, developed in cooperation with leading industry in the USA and Europe. Consulted for a number of high-tech companies (TechnologyConnect, BioPop, IBM, AT&T, NCR, RCA, Honeywell, Fairchild, etc...) Ph.D. from Belgrade. After that, for about a decade, on various positions (professor) at one of the top 5 (out of about 2000) US universities in computer engineering (Purdue). Author and coauthor of about 50 IEEE journal papers (plus many more in other journals). According to some, a European record for his research field. Guest editor for a number of special issues of: Proceedings of the IEEE, IEEE Transactions on Computers, IEEE Concurrency, IEEE Computer, etc… Over 20 books published by the leading USA publishers (Wiley, Prentice-Hall, North-Holland, Kluwer, IEEE CS Press, etc...). Forewords for 7 of his books written by 7 different Nobel Laureates, in cooperation with Telecom Italia learning services (you are welcome to visit http://www.ssgrr.it). Datamining in e-Business: Veni, Vidi, Vici!

2 Page Number: 2 Issues in Data Mining Infrastructure Issues in Data Mining Infrastructure Authors:Nemanja Jovanovic, nemko@acm.orgnemko@acm.org Valentina Milenkovic, tina@eunet.yutina@eunet.yu Prof. Dr. Veljko Milutinovic, vm@etf.bg.ac.yuvm@etf.bg.ac.yu http://galeb.etf.bg.ac.yu/~vm

3 Page Number: 3 Data Mining in the Nutshell Data Mining in the Nutshell  Uncovering the hidden knowledge  Huge n-p complete search space  Multidimensional interface NOTICE: All trademarks and service marks mentioned in this document are marks of their respective owners. Furthermore CRISP-DM consortium (NCR Systems Engineering Copenhagen (USA and Denmark), DaimlerChrysler AG (Germany), SPSS Inc. (USA) and OHRA Verzekeringen en Bank Groep B.V (The Netherlands)) permitted presentation of their process model.

4 Page Number: 4 A Problem … A Problem … You are a marketing manager for a cellular phone company  Problem: Churn is too high  Bringing back a customer after quitting is both difficult and expensive  Giving a new telephone to everyone whose contract is expiring is expensive  You pay a sales commission of 250$ per contract  Customers receive free phone (cost 125$)  Turnover (after contract expires) is 40%

5 Page Number: 5 … A Solution … A Solution  Three months before a contract expires, predict which customers will leave  If you want to keep a customer that is predicted to churn, offer them a new phone  The ones that are not predicted to churn need no attention  If you don’t want to keep the customer, do nothing  How can you predict future behavior?  Tarot Cards?  Magic Ball?  Data Mining?

6 Page Number: 6 Still Skeptical? Still Skeptical?

7 Page Number: 7 The Definition The Definition  Automated The automated extraction of predictive information from (large) databases  Extraction  Predictive  Databases

8 Page Number: 8 History of Data Mining History of Data Mining

9 Page Number: 9 Repetition in Solar Activity Repetition in Solar Activity  1613 – Galileo Galilei  1859 – Heinrich Schwabe

10 Page Number: 10 The Return of the Halley Comet The Return of the Halley Comet 191019862061 ??? 1531 1607 1682 239 BC Edmund Halley (1656 - 1742)

11 Page Number: 11 Data Mining is Not Data Mining is Not  Data warehousing  Ad-hoc query/reporting  Online Analytical Processing (OLAP)  Data visualization

12 Page Number: 12 Data Mining is Data Mining is  Automated extraction of predictive information from various data sources  Powerful technology with great potential to help users focus on the most important information stored in data warehouses or streamed through communication lines

13 Page Number: 13 Data Mining can Data Mining can  Answer question that were too time consuming to resolve in the past  Predict future trends and behaviors, allowing us to make proactive, knowledge driven decision

14 Page Number: 14 Focus of this Presentation Focus of this Presentation  Data Mining problem types  Data Mining models and algorithms  Efficient Data Mining  Available software

15 Page Number: 15 Data Mining Problem Types Data Mining Problem Types

16 Page Number: 16 Data Mining Problem Types Data Mining Problem Types  6 types  Often a combination solves the problem

17 Page Number: 17 Data Description and Summarization Data Description and Summarization  Aims at concise description of data characteristics  Lower end of scale of problem types  Provides the user an overview of the data structure  Typically a sub goal

18 Page Number: 18 Segmentation Segmentation  Separates the data into interesting and meaningful subgroups or classes  Manual or (semi)automatic  A problem for itself or just a step in solving a problem

19 Page Number: 19 Classification Classification  Assumption: existence of objects with characteristics that belong to different classes  Building classification models which assign correct labels in advance  Exists in wide range of various applications  Segmentation can provide labels or restrict data sets

20 Page Number: 20 Concept Description Concept Description  Understandable description of concepts or classes  Close connection to both segmentation and classification  Similarity and differences to classification

21 Page Number: 21 Prediction (Regression) Prediction (Regression)  Similar to classification- difference: discrete becomes continuous  Finds the numerical value of the target attribute for unseen objects

22 Page Number: 22 Dependency Analysis Dependency Analysis  Finding the model that describes significant dependences between data items or events  Prediction of value of a data item  Special case: associations

23 Page Number: 23 Data Mining Models Data Mining Models

24 Page Number: 24 Neural Networks Neural Networks  Characterizes processed data with single numeric value  Efficient modeling of large and complex problems  Based on biological structures - Neurons  Network consists of neurons grouped into layers

25 Page Number: 25 Neuron Functionality Neuron Functionality I1 I2 I3 In Output W1 W2 W3 Wn f Output = f (W1*I1, W2*I1, …, Wn*In)

26 Page Number: 26 Training Neural Networks Training Neural Networks

27 Page Number: 27 Neural Networks Neural Networks  Once trained, Neural Networks can efficiently estimate the value of an output variable for given input  Neurons and network topology are essentials  Usually used for prediction or regression problem types  Difficult to understand  Data pre-processing often required

28 Page Number: 28 Decision Trees Decision Trees  A way of representing a series of rules that lead to a class or value  Iterative splitting of data into discrete groups maximizing distance between them at each split  CHAID, CHART, Quest, C5.0  Classification trees and regression trees  Unlimited growth and stopping rules  Univariate splits and multivariate splits

29 Page Number: 29 Decision Trees Balance>10Balance<=10 Age<=32Age>32 Married=NOMarried=YES

30 Page Number: 30 Decision Trees Decision Trees

31 Page Number: 31 Rule Induction Rule Induction  Method of deriving a set of rules to classify cases  Creates independent rules that are unlikely to form a tree  Rules may not cover all possible situations  Rules may sometimes conflict in a prediction

32 Page Number: 32 Rule Induction Rule Induction If balance>100.000 then confidence=HIGH & weight=1.7 If balance>25.000 and status=married then confidence=HIGH & weight=2.3 If balance<40.000 then confidence=LOW & weight=1.9

33 Page Number: 33 K-nearest Neighbor and Memory-Based Reasoning (MBR) K-nearest Neighbor and Memory-Based Reasoning (MBR)  Usage of knowledge of previously solved similar problems in solving the new problem  Assigning the class to the group where most of the k-”neighbors” belong  First step – finding the suitable measure for distance between attributes in the data  + Easy handling of non-standard data types  - Huge models

34 Page Number: 34 K-nearest Neighbor and Memory-Based Reasoning (MBR) K-nearest Neighbor and Memory-Based Reasoning (MBR)

35 Page Number: 35 Data Mining Models and Algorithms Data Mining Models and Algorithms  Logistic regression  Discriminant analysis  Generalized Adaptive Models (GAM)  Genetic algorithms  Etc…  Many other available models and algorithms  Many application specific variations of known models  Final implementation usually involves several techniques  Selection of solution that match best results

36 Page Number: 36 Efficient Data Mining Efficient Data Mining

37 Page Number: 37 Don’t Mess With It! YES NO YES You Shouldn’t Have! NO Will it Explode In Your Hands? NO Look The Other Way Anyone Else Knows? You’re in TROUBLE! YES NO Hide It Can You Blame Someone Else? NO NO PROBLEM! YES Is It Working? Did You Mess With It?

38 Page Number: 38 DM Process Model DM Process Model  CRISP – tends to become a standard  5A – used by SPSS Clementine (Assess, Access, Analyze, Act and Automate )  SEMMA – used by SAS Enterprise Miner (Sample, Explore, Modify, Model and Assess)

39 Page Number: 39 CRISP - DM CRISP - DM  CRoss-Industry Standard for DM  Conceived in 1996 by three companies:

40 Page Number: 40 CRISP – DM methodology CRISP – DM methodology Four level breakdown of the CRISP-DM methodology: Phases Generic Tasks Process Instances Specialized Tasks

41 Page Number: 41 Mapping generic models to specialized models Mapping generic models to specialized models  Analyze the specific context  Remove any details not applicable to the context  Add any details specific to the context  Specialize generic context according to concrete characteristic of the context  Possibly rename generic contents to provide more explicit meanings

42 Page Number: 42 Generalized and Specialized Cooking Generalized and Specialized Cooking  Preparing food on your own  Find out what you want to eat  Find the recipe for that meal  Gather the ingredients  Prepare the meal  Enjoy your food  Clean up everything (or leave it for later)  Raw stake with vegetables?  Check the Cookbook or call mom  Defrost the meat (if you had it in the fridge)  Buy missing ingredients or borrow the from the neighbors  Cook the vegetables and fry the meat  Enjoy your food or even more  You were cooking so convince someone else to do the dishes

43 Page Number: 43 CRISP – DM model CRISP – DM model  Business understanding  Data understanding  Data preparation  Modeling  Evaluation  Deployment Business understanding Data understanding Data preparation ModelingEvaluation Deployment

44 Page Number: 44 Business Understanding Business Understanding  Determine business objectives  Assess situation  Determine data mining goals  Produce project plan

45 Page Number: 45 Data Understanding Data Understanding  Collect initial data  Describe data  Explore data  Verify data quality

46 Page Number: 46 Data Preparation Data Preparation  Select data  Clean data  Construct data  Integrate data  Format data

47 Page Number: 47 Modeling Modeling  Select modeling technique  Generate test design  Build model  Assess model

48 Page Number: 48 Evaluation Evaluation  Evaluate results  Review process  Determine next steps results = models + findings

49 Page Number: 49 Deployment Deployment  Plan deployment  Plan monitoring and maintenance  Produce final report  Review project

50 Page Number: 50 At Last… At Last…

51 Page Number: 51 WWW.NBA. COM WWW.NBA.COM

52 Page Number: 52 Se7en Se7en

53 Page Number: 53 CD – ROM CD – ROM CD – ROM

54 Page Number: 54 Evolution of Data Mining Prospective, proactive information delivery Lockheed, IBM, SGI, numerous startups Advanced algorithms, multiprocessors, massive databases What’s likely to happen to Boston unit sales next month? Why? Data Mining (2000) Retrospective, dynamic data delivery at multiple levels Pilot, IRI, Arbor, Redbrick, Evolutionary Technologies OLAP, Multidimensional databases, data warehouses What were unit sales in New England last March? Drill down to Boston. Data Navigation (1990s) Retrospective, dynamic data delivery at record level Oracle, Sybase Informix, IBM, Microsoft RDBMS, SQL, ODBC What were unit sales in New England last March? Data Access (1980s) Retrospective, static data delivery IBM, CDC Computers, tapes, disks What was my average total revenue over the last 5 years? Data Collection (1960s) CharacteristicsProduct ProvidersEnabling Technologies Business Question Evolutionary Step

55 Page Number: 55 Examples of DM projects to stimulate your imagination  Here are six examples of how data mining is helping corporations to operate more efficiently and profitably in today's business environment –Targeting a set of consumers who are most likely to respond to a direct mail campaign –Predicting the probability of default for consumer loan applications –Reducing fabrication flaws in VLSI chips –Predicting audience share for television programs –Predicting the probability that a cancer patient will respond to radiation therapy –Predicting the probability that an offshore oil well is actually going to produce oil

56 Page Number: 56 Comparison of fourteen DM tools Evaluated by four undergraduates inexperienced at data mining, a relatively experienced graduate student, and a professional data mining consultant Evaluated by four undergraduates inexperienced at data mining, a relatively experienced graduate student, and a professional data mining consultant Run under the MS Windows 95, MS Windows NT, Macintosh System 7.5 Run under the MS Windows 95, MS Windows NT, Macintosh System 7.5 Use one of the four technologies: Decision Trees, Rule Inductions, Neural, or Polynomial Networks Use one of the four technologies: Decision Trees, Rule Inductions, Neural, or Polynomial Networks Solve two binary classification problems: multi-class classification and noiseless estimation problem Solve two binary classification problems: multi-class classification and noiseless estimation problem Price from 75$ to 25.000$ Price from 75$ to 25.000$

57 Page Number: 57 Comparison of fourteen DM tools The Decision Tree products were - CART - Scenario - See5 - S-Plus The Decision Tree products were - CART - Scenario - See5 - S-Plus The Rule Induction tools were - WizWhy - DataMind - DMSK The Rule Induction tools were - WizWhy - DataMind - DMSK Neural Networks were built from three programs - NeuroShell2 - PcOLPARS - PRW Neural Networks were built from three programs - NeuroShell2 - PcOLPARS - PRW The Polynomial Network tools were - ModelQuest Expert - Gnosis - a module of NeuroShell2 - KnowledgeMiner The Polynomial Network tools were - ModelQuest Expert - Gnosis - a module of NeuroShell2 - KnowledgeMiner

58 Page Number: 58 Criteria for evaluating DM tools A list of 20 criteria for evaluating DM tools, put into 4 categories: Capability measures what a desktop tool can do, and how well it does it - Handles missing data- - Considers misclassification costs - Allows data transformations - Includes quality of tesing options - Has a programming language - Provides useful output reports - Provides visualisation Capability measures what a desktop tool can do, and how well it does it - Handles missing data- - Considers misclassification costs - Allows data transformations - Includes quality of tesing options - Has a programming language - Provides useful output reports - Provides visualisation

59 Page Number: 59 Visualisation excellent capability good capabilitysome capability “blank” no capability + excellent capability good capability - some capability “blank” no capability

60 Page Number: 60 Criteria for evaluating DM tools Learnability/Usability shows how easy a tool is to learn and use - Tutorials - Wizards - Easy to learn - User’s manual - Online help - Interface Learnability/Usability shows how easy a tool is to learn and use - Tutorials - Wizards - Easy to learn - User’s manual - Online help - Interface

61 Page Number: 61 Criteria for evaluating DM tools Interoperability shows a tool’s ability to interface with other computer applications - Importing data - Exporting data - Links to other applications Interoperability shows a tool’s ability to interface with other computer applications - Importing data - Exporting data - Links to other applications Flexibility - Model adjustment flexibility - Customizable work enviroment - Ability to write or change code Flexibility - Model adjustment flexibility - Customizable work enviroment - Ability to write or change code

62 Page Number: 62 Data Input & Output Model excellent capability + excellent capability good capability some capability - some capability “blank” no capability “blank” no capability

63 Page Number: 63 A classification of data sets Pima Indians Diabetes data set Pima Indians Diabetes data set –768 cases of Native American women from the Pima tribe some of whom are diabetic, most of whom are not –8 attributes plus the binary class variable for diabetes per instance Wisconsin Breast Cancer data set Wisconsin Breast Cancer data set –699 instances of breast tumors some of which are malignant, most of which are benign –10 attributes plus the binary malignancy variable per case The Forensic Glass Identification data set The Forensic Glass Identification data set –214 instances of glass collected during crime investigations –10 attributes plus the multi-class output variable per instance Moon Cannon data set Moon Cannon data set –300 solutions to the equation: x = 2v 2 sin(g)cos(g)/g –the data were generated without adding noise

64 Page Number: 64 Evaluation of forteen DM tools

65 Page Number: 65 Strenghts and Weaknesses Strengths: Ease of use (Scenario, WizWhy..) Ease of use (Scenario, WizWhy..) Data visualisation (S-plus,MineSet...) Data visualisation (S-plus,MineSet...) Depth of algorithms (tree options) (CART,See5,S-plus..) Depth of algorithms (tree options) (CART,See5,S-plus..) Multiple nn architectures (NeuroShell) Multiple nn architectures (NeuroShell) Weaknesses: Difficult file I/O (OLPARS,CART) Difficult file I/O (OLPARS,CART) Limited visualisation (PRW,See5,WizWhy) Limited visualisation (PRW,See5,WizWhy) Narrow analyses path (Scenario) Narrow analyses path (Scenario)

66 Page Number: 66 How to improve existing DM applications The top ten points: Database integration Database integration –no more flat files –use the millions $ spent on data warehousing Automated model scoring Automated model scoring –without scoring DM is pretty useless –should be integrated with the driving applications Exporting models to other applications Exporting models to other applications –close the loop between DM and applications that need to use the results (scores)

67 Page Number: 67 How to improve existing DM applications Business templates Business templates –cross-selling specific application is more valuable than a general modeling tool Effort knob Effort knob –it is relevant in a way that tuning parametars are not Incorporate financial information Incorporate financial information –the financial information is very important and often available and should be provided as input to the DM application

68 Page Number: 68 How to improve existing DM applications Computed target columns Computed target columns –allow the user to interactively create a new target variable Time-series data Time-series data –a year’s worth of monthly balance information is qualitatively different than twelve distinct non-time-series variables Use versus View Use versus View –do not present visually to user the full model, only the most important levels Wizards Wizards –not necessarily but desirable –prevent human error by keeping the user on track

69 Page Number: 69 Potential Applications Data mining has many varied fields of application, some of which are listed below: Retail/Marketing Retail/Marketing  Identify buying patterns from customers  Find associations among customer demographic characteristics  Predict response to mailing campaigns  Market basket analysis

70 Page Number: 70 Potential Applications BankingBanking  Detect patterns of fraudulent credit card use  Identify `loyal' customers  Determine credit card spending by customer groups  Find hidden correlations between different financial indicators  Identify stock trading rules from historical market data

71 Page Number: 71 Potential Applications Insurance and Health CareInsurance and Health Care  Claims analysis - i.e., which medical procedures are claimed together  Predict which customers will buy new policies  Identify behaviour patterns of risky customers  Identify fraudulent behaviour

72 Page Number: 72 Potential Applications TransportationTransportation  Determine the distribution schedules among outlets  Analyse loading patterns MedicineMedicine  Characterise patient behaviour to predict office visits  Identify successful medical therapies for different illnesses  To predict the effectiveness of surgical procedures or medical tests

73 Page Number: 73 Potential Applications SportSport  To make the best choice about players in different circumstance  To predict the results of relevance match  Do a better list of seed players in groups or tournament  DM report from an NBA game When Price was Point-Guard, J.Williams missed 0% (0) of his jump field-goal attempts, and made 100% (4) of his jump field-goal-attempts. The total number of such field-goal-attempts was 4.

74 Page Number: 74 DM and Customer Relationship Management CRM is a process that manages the interactions between a company and its customers CRM is a process that manages the interactions between a company and its customers Users of CRM software applications are database marketers Users of CRM software applications are database marketers Goals of database marketers are: Goals of database marketers are:  identifying market segments, which requires significant data about prospective customers and their buying behaviors  build and execute campaigns Tightly integrating the two disciplines presents an opportunity for companies to gain competetive advantage Tightly integrating the two disciplines presents an opportunity for companies to gain competetive advantage

75 Page Number: 75 DM and Customer Relationship Management How Data Mining helps Database Marketing How Data Mining helps Database Marketing Scoring Scoring The role of Campaign Management Software The role of Campaign Management Software Increasing the customer lifetime value Increasing the customer lifetime value Combining Data Mining and Campaign Management Combining Data Mining and Campaign Management

76 Page Number: 76 DM and Customer Relationship Management Evaluating the benefits of a Data Mining model Evaluating the benefits of a Data Mining model Gains chart Profability chart

77 Page Number: 77 Data Mining Examples Bass Brewers “We’ve been brewing beer since 1777. With increased competition comes a demand to make faster/better decisions” Bass Brewers “We’ve been brewing beer since 1777. With increased competition comes a demand to make faster/better decisions” Northern Bank Northern Bank “The information is now more accessible, paperless, and timely.” TSB Group Plc TSB Group Plc “We are using Holos because of its flexibility and its excellent multidimensional database”

78 Page Number: 78 Data Mining Examples Delphic University “Real value is added to data by multidimensional manipulation (being able to easily compare many different views of the avaible information in one report) and by modeling.” Delphic University “Real value is added to data by multidimensional manipulation (being able to easily compare many different views of the avaible information in one report) and by modeling.” Harvard - Holden “Sybase technology has allowed us to develop an information system that will preserve this legacy into the twenty-first century.” Harvard - Holden “Sybase technology has allowed us to develop an information system that will preserve this legacy into the twenty-first century.” J.P.Morgan “The promise of data mining tools like Information Harvester is that they are able to quickly wade through massive amounts of data to identify relationships or trending information that would not have been available without the tool.” J.P.Morgan “The promise of data mining tools like Information Harvester is that they are able to quickly wade through massive amounts of data to identify relationships or trending information that would not have been available without the tool.”

79 Page Number: 79 Securities Brokerage Case Study The following four pages are derived from a copyrighted case study originally created by SmartDrill Data Mining (Marlborough, MA, U.S.A.). The following four pages are derived from a copyrighted case study originally created by SmartDrill Data Mining (Marlborough, MA, U.S.A.). Their website is: http://smartdrill.com Their website is: http://smartdrill.com And the original case study appears in its entirety here: http://smartdrill.com/CHAID.html And the original case study appears in its entirety here: http://smartdrill.com/CHAID.html

80 Page Number: 80 Securities Brokerage Case Study Predictive market segmentation model designed to identify and profile high-value brokerage customer segments as targets for special marketing communications efforts. Predictive market segmentation model designed to identify and profile high-value brokerage customer segments as targets for special marketing communications efforts. The dependent variable for this ordinal CHAID model is brokerage account commission dollars during the past 12 months The dependent variable for this ordinal CHAID model is brokerage account commission dollars during the past 12 months We begin by splitting the client's entire customer file into a modeling sample and a validation sample. (Once the model is built using the modeling sample, we apply it to the validation sample to see how well it works on a sample other than the one on which it was built). We begin by splitting the client's entire customer file into a modeling sample and a validation sample. (Once the model is built using the modeling sample, we apply it to the validation sample to see how well it works on a sample other than the one on which it was built).

81 Page Number: 81 Securities Brokerage Case Study The resulting CHAID model has 55 segments. The resulting CHAID model has 55 segments. However, the results are summarized in the following comb chart, showing the segment indexes (indexes of average dollar value) However, the results are summarized in the following comb chart, showing the segment indexes (indexes of average dollar value)

82 Page Number: 82 Securities Brokerage Case Study The part of Gains Chart: Average Annual Brokerage Commission Dollars … … … … … … … …... Gains chart provides quantitative detail useful for financial and marketing planning. Gains chart provides quantitative detail useful for financial and marketing planning. We have highlighted the top 20% of the file in blue We have highlighted the top 20% of the file in blue The top 20% of the file is worth an average of about $334 per account, which is nearly three times the average account value for the entire sample. The top 20% of the file is worth an average of about $334 per account, which is nearly three times the average account value for the entire sample.

83 Page Number: 83 Securities Brokerage Case Study Using the data in the gains chart, we can better plan our communications/promotion budget. Using the data in the gains chart, we can better plan our communications/promotion budget. In general, the best segments represent customers who are experienced, aggressive, self-directed traders. In general, the best segments represent customers who are experienced, aggressive, self-directed traders. The other decisions that can help us: The other decisions that can help us:  We might wish to conduct some market research among customers in under-performing segments, or among under-performing customers in the better segments  We can use the segment definitions to help us identify possible issues and question areas to include in the survey Before we try to apply such a model, we perform a validation against a holdout sample, to confirm that it is a good model. Before we try to apply such a model, we perform a validation against a holdout sample, to confirm that it is a good model.

84 Page Number: 84 References References  Bruha, I., ‘Data Mining, KDD and Knowledge Integration: Methodology and A case Study”, SSGRR 2000  Fayyad, U., Shapiro, P., Smyth, P., Uthurusamy, R., “Advances in Knowledge Discovery and Data Mining”, MIT Press, 1996  Glumour, C., Maddigan, D., Pregibon, D., Smyth, P., “Statistical Themes nad Lessons for Data Mining”, Data Mining And Knowledge Discovery 1, 11-28, 1997  Hecht-Nilsen, R., “Neurocomputing”, Addison-Wesley, 1990  Pyle, D., “Data Preparation for Data Mining”, Morgan Kaufman, 1999  www.thearling.com www.thearling.com  www.crisp-dm.com www.crisp-dm.com  www.twocrows.com www.twocrows.com  www.sas.com/products/miner www.sas.com/products/miner  www.spss.com/clementine www.spss.com/clementine  galeb.etf.bg.ac.yu/~vm

85 Potentials of R&D in Cooperation with U. of Belgrade Potentials of R&D in Cooperation with U. of Belgrade An Overview of Advanced Datamining Projects for High-Tech Computer Industry in the USA and EU

86 Page Number: 86 VLSI Detection for Internet/Telephony Interfaces Goran Davidović, Miljan Vuletić, Veljko Milutinović, Tom Chen, and Tom Brunett * eT 

87 Page Number: 87 INTERNET SERVICE PROVIDER REMOTE SITE Superposition/DETECTION... USERS... HOME/OFFICE/FACTORY AUTOMATION ON THE INTERNET SPECIALIZED

88 Page Number: 88 Reconfigurable FPGA for EBI Božidar Radunović, Predrag Knežević, Veljko Milutinović, Steve Casselman, and John Schewel *  * Virtual

89 Page Number: 89 INTERNET PROVIDER... USERS VCC CUSTOMER SATISFACTION vs CUSTOMER PROFILE SPECIALIZED SERVICE

90 Page Number: 90 BioPoP Veljko Milutinovic, Vladimir Jovicic, Milan Simic, Bratislav Milic, Milan Savic, Veljko Jovanovic, Stevo Ilic, Djordje Veljkovic, Stojan Omorac, Nebojsa Uskokovic, and Fred Darnell isItWorking.com

91 Page Number: 91 Testing the Infrastructure for EBI Phones Phones Faxes Faxes Email Email Web links Web links Servers Servers Routers Routers Software Software Statistics Correlation Innovation

92 Page Number: 92 CNUCE Integration and Datamining on Ad-Hoc Networks and the Internet Veljko Milutinović, Luca Simoncini, and Enrico Gregory  *University of Pisa, Santanna, CNUCE

93 Page Number: 93 GSM DM Ad-Hoc Internet

94 Page Number: 94 Genetic Search with Spatial/Temporal Mutations Jelena Mirković, Dragana Cvetković, and Veljko Milutinović  *Comshare

95 Page Number: 95 Drawbacks of INDEX-BASED: Time to index + ranking Advantages of LINKS-BASED: Mission critical applications + customer tuned ranking Provider Well organized markets: Best first search If elements of disorder: G w DB mutations Chaotic markets: G w S/T mutations

96 Page Number: 96 e-Banking on the Internet Miloš Kovačević, Bratislav Milic, Veljko Milutinović, Marco Gori, and Roberto Giorgi Miloš Kovačević, Bratislav Milic, Veljko Milutinović, Marco Gori, and Roberto Giorgi  *University of Siena

97 Page Number: 97 Bottleneck#1: Searching for Clients and Investments 1472++ *University of Siena + Banco di Monte dei Paschi

98 Page Number: 98 SSGRR Organizing Conferences via the Internet Zoran Horvat, Nataša Kukulj, Vlada Stojanović, Dušan Dingarac, Marjan Mihanović, Miodrag Stefanović, Veljko Milutinović, and Frederic Patricelli  *SSGRR, L’Aquila

99 Page Number: 99 2000: Arno Penzias 2001: Bob Richardson 2002: Jerry Friedman 2003: Harry Kroto http://www.ssgrr.it

100 Page Number: 100 Summary Books with Nobel Laureates: Kenneth Wilson, Ohio (North-Holland) Kenneth Wilson, Ohio (North-Holland) Leon Cooper, Brown (Prentice-Hall) Leon Cooper, Brown (Prentice-Hall) Robert Richardson, Cornell (Kluwer-Academics) Robert Richardson, Cornell (Kluwer-Academics) Herb Simon (Kluwer-Academics) Herb Simon (Kluwer-Academics) Jerome Friedman, MIT (IOS Press) Jerome Friedman, MIT (IOS Press) Harold Kroto (IOS Press) Harold Kroto (IOS Press) Arno Penzias (IOS Press) Arno Penzias (IOS Press)

101 Page Number: 101 http://galeb.etf.bg.ac.yu/~vm/ e-mail: vm@etf.bg.ac.yu


Download ppt "Page Number: 1 Prof. Dr. Veljko Milutinovic: Coarchitect of the World's first 200MHz RISC microprocessor,"

Similar presentations


Ads by Google