www.monash.edu.au: Advanced Topics in Data Mining and Research Directions. CSE5610 Intelligent Software Systems, Semester 1, 2006.

1 Advanced Topics in Data Mining and Research Directions CSE5610 Intelligent Software Systems Semester 1, 2006

2 Outline
Mining Different Data Types
– Spatial, Temporal, Time Series, Data Streams, Multimedia, XML, Web, Text etc.
Distributed Data Mining (DDM)
Mobile & Ubiquitous Data Mining (UDM)
Data Mining E-Services
Anytime, Anywhere Data Mining E-Services

3 Generations of Data Mining
Four Generations of Data Mining Systems – Robert Grossman
First Generation – Stand-alone, centralised, single algorithm
Second Generation – Integration with databases, support for high-dimensionality and complex data types
Third Generation – Distribution and heterogeneity
Fourth Generation – Support for mining embedded, mobile and ubiquitous data sources

4 Distributed Data Mining

5 Distributed Data Mining
Inherently distributed data
Multinational corporations (MNCs) + global markets => physical/geographical separation of users from the data sources
The traditional data mining model, which assumes the co-location of users, data and computational resources, is inadequate

6 Distributed Data Mining (DDM)
The inherent distribution of data and other resources as a result of organisations being distributed.
The large volumes of data, the transfer of which results in exorbitant communication costs.
The need to mine heterogeneous data, the integration of which is both non-trivial and expensive.
The performance and scalability bottlenecks of data mining.

7 Distributed Data Mining (DDM)
DDM = Data Mining (DM) + Knowledge Integration (KI)
DM – Performing traditional knowledge discovery at each distributed data site.
KI – Merging the results generated from the individual sites into a body of cohesive and unified knowledge.
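The DM + KI split on this slide can be sketched in a few lines of Python. This is an illustrative sketch only, not taken from the slides: the function names `mine_local` and `integrate`, the toy transactions, and the choice of itemset counting as the mining task are all assumptions. Each site counts itemsets over its own data (DM); a coordinator then merges the counts and applies a global support threshold (KI). Only counts travel between sites, never raw transactions.

```python
from collections import Counter
from itertools import combinations


def mine_local(transactions, max_size=2):
    """DM step (hypothetical helper): count itemsets of up to max_size items at one site."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts, len(transactions)


def integrate(local_results, min_support):
    """KI step (hypothetical helper): merge per-site counts and keep globally frequent itemsets."""
    total = Counter()
    n_transactions = 0
    for counts, size in local_results:
        total.update(counts)
        n_transactions += size
    return {
        itemset: count / n_transactions
        for itemset, count in total.items()
        if count / n_transactions >= min_support
    }


# Two toy sites; the data is made up for illustration.
site_a = [["bread", "milk"], ["bread", "butter"], ["milk"]]
site_b = [["bread", "milk"], ["bread", "milk", "butter"]]

local_results = [mine_local(site_a), mine_local(site_b)]
global_itemsets = integrate(local_results, min_support=0.6)
```

Because each site ships its full counts without local pruning, the merged supports are exact. A real system would usually prune locally to reduce communication, and knowledge integration then has to account for itemsets that some sites did not report.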

8 Parallel Data Mining (PDM)
Principal distinction between DDM & Parallel DM
– parallel mining involves parallel processors with or without shared memory
Parallel data mining also includes development of parallel versions of traditional data mining techniques.
The two can be integrated, e.g. DecisionCentre

9 DDM – Algorithms & Architectures
Research in distributed data mining can be divided into two broad categories [Fu01]:
Data Mining Algorithms.
– focus on efficient techniques for knowledge integration.
Distributed Data Mining Architectures.
– focus on development of distributed data mining architectures
– emphasizes the processes and technologies that support construction of software systems to perform distributed data mining

10 Taxonomy of DDM Architectures

11 Classification – DDM Systems
DDM Architectural Model | DDM Systems
Client-server | DecisionCentre [CDG99], IntelliMiner [PaS99, PaS01], InterAct [PaD02]
Agents (Mobile Agent, Stationary Agent) | JAM [SPT97], Infosleuth [UMG98, MUU99], BODHI [KPH99], Papyrus [Ram98], PADMA [KHS97a, KHS97b]

12 Client-Server DDM

13 Mobile Agent Model for DDM

14 Hybrid Model for DDM

15 Ubiquitous Data Mining

16 Ubiquitous Data Mining (UDM)
Mining data in a resource-constrained environment to support the time-critical information needs of mobile users
Typical Characteristics
– Mobile User – frequent disconnections
– Handheld Device
> Resource constraints – memory, battery, processor, screen real-estate
– Time critical
– Real-time & On-line
– Data Streams
Example Scenarios
Many Challenges
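To make the resource constraints concrete, here is a minimal one-pass clustering sketch for a numeric data stream under a fixed memory budget. It is a generic illustration and not any of the algorithms named in this deck; `stream_cluster`, `threshold` and `max_clusters` are hypothetical names. Each reading is seen once, only a running mean and a count are kept per cluster, and once the budget is used up new readings are absorbed into the nearest existing cluster instead of allocating more memory.

```python
def stream_cluster(stream, threshold, max_clusters):
    """One-pass clustering with bounded memory (hypothetical helper).

    Keeps at most max_clusters entries, each a [mean, count] pair.
    """
    centers = []
    for x in stream:
        # Find the closest existing cluster, if there is one.
        nearest = min(centers, key=lambda c: abs(c[0] - x), default=None)
        close_enough = nearest is not None and abs(nearest[0] - x) <= threshold
        budget_spent = len(centers) >= max_clusters
        if nearest is not None and (close_enough or budget_spent):
            # Update the running mean incrementally; the raw reading is discarded.
            nearest[1] += 1
            nearest[0] += (x - nearest[0]) / nearest[1]
        else:
            centers.append([x, 1])
    return centers


# Toy sensor readings, made up for illustration.
readings = [1.0, 1.2, 5.0, 5.2, 0.8]
centers = stream_cluster(readings, threshold=1.0, max_clusters=2)
```

Memory use depends on `max_clusters` rather than on the length of the stream, which is the property a handheld device needs. The trade-off is accuracy: readings forced into a distant cluster once the budget is spent distort that cluster's mean.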

17 Current Research
Kargupta’s Group
Monash Univ.
– AgentUDM
– Adaptive, Cost-efficient & Light-weight data mining techniques for data streams
> Mohamed Medhat
> LWC, LWF & LWClass
> Watch this space!!!

18 Data Mining E-Services

19 Data Mining E-Services
“…data analysis and mining functions themselves will be offered as business intelligence e-services that accept operational data from clients and return models or rules” (Umesh Dayal, 2001)
Why?
– Knowledge is a key resource
– Cost of data mining infrastructure

20 Data Mining E-Services
Current Commercial Landscape
– Several ASPs
> DigiMine, Information Discovery, WhiteCross Systems, ListAnalyst.com etc.
– Mode of Operation
Hybrid Model & Data Mining ASPs
– Optimise Response Time
> Leads to improved throughput
– QoS Estimation
– Location Preferences of Clients

22 Anytime, Anywhere Data Mining E-Services

23 My Thoughts
Data is a commodity, Analysis is a service
Access anytime, anywhere
By anyone…
– From large corporations to small business to individuals
– From home buyers to mobile salespersons to grocery shoppers…

24 My Thoughts
A preliminary model for delivery
– Datacentric Grids

25 References

26 References
http://www.csse.monash.edu.au/projects/MobileComponents/projects/dame/
http://www.csse.monash.edu.au/~shonali/research.html
http://www.csee.umbc.edu/~hillol/DDMBIB/
http://www.csee.umbc.edu/~hillol/diadic.html
http://www.csse.monash.edu.au/~mgaber/main.html

